Given a string s
, return the lexicographically smallest subsequence of s
that contains all the distinct characters of s
exactly once.
Example 1:
Input: s = "bcabc" Output: "abc"
Example 2:
Input: s = "cbacdcbc" Output: "acdb"
Constraints:
1 <= s.length <= 1000
s
consists of lowercase English letters.Note: This question is the same as 316: https://leetcode.com/problems/remove-duplicate-letters/
This problem asks to find the lexicographically smallest subsequence of a given string that contains all distinct characters exactly once. The optimal approach uses a monotonic stack and a greedy strategy.
Core Idea:
The solution iterates through the string. For each character, it checks if it's already in the result (using a set or bitmask). If not, it compares it with the characters currently on the stack. If the character is smaller than the top of the stack, and there are more occurrences of the top of the stack later in the string, the top of the stack is popped. This process continues until either the stack is empty, or the current character is not smaller than the top of the stack, or there are no more occurrences of the top of the stack later. The current character is then pushed onto the stack. This ensures that the resulting subsequence is lexicographically smallest.
Algorithm:
Record Last Occurrences: Create a map (or array) to store the last index of each character in the input string. This helps determine if a character needs to be kept on the stack or can be removed.
Initialize Stack and Visited Set: Use a stack to build the subsequence. Use a set (or bitmask) to keep track of characters already included in the subsequence.
Iterate and Process: Iterate through the input string.
visited
set, skip it.last
map/array):
visited
set.visited
set.Construct Result: After iterating through the entire string, the stack contains the lexicographically smallest subsequence. Convert the stack into a string and return it.
Time Complexity Analysis:
The algorithm iterates through the string once. The while
loop inside the main loop performs at most O(n)
operations in the worst case, where n
is the length of the string (e.g., if the string is sorted in descending order). Therefore, the overall time complexity is O(n).
Space Complexity Analysis:
The space complexity is determined by the stack and the last
map/array (or visited
set/bitmask). In the worst case, the stack can hold up to 26 characters (all unique lowercase English letters), and the last
map/array uses O(26)
space. The visited set uses O(26)
space. Therefore, the space complexity is O(1) (constant). It does not scale with the input string length.
Code Examples (with detailed comments):
Python:
def smallestSubsequence(s: str) -> str:
last_occurrence = {c: i for i, c in enumerate(s)} # Dictionary to store last indices
stack = []
visited = set() # Set to track already included characters
for i, c in enumerate(s):
if c in visited: # Skip if already in subsequence
continue
while stack and stack[-1] > c and last_occurrence[stack[-1]] > i: #Greedy step: remove larger chars that appear later
visited.remove(stack.pop())
stack.append(c)
visited.add(c)
return "".join(stack)
Java:
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
class Solution {
public String smallestSubsequence(String s) {
int[] lastOccurrence = new int[26]; // Array to store last indices
for (int i = 0; i < s.length(); i++) {
lastOccurrence[s.charAt(i) - 'a'] = i;
}
Deque<Character> stack = new ArrayDeque<>();
Set<Character> visited = new HashSet<>(); //Set to track characters
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
if (visited.contains(c)) {
continue;
}
while (!stack.isEmpty() && stack.peek() > c && lastOccurrence[stack.peek() - 'a'] > i) {
visited.remove(stack.pop());
}
stack.push(c);
visited.add(c);
}
StringBuilder sb = new StringBuilder();
while (!stack.isEmpty()) {
sb.append(stack.pop()); //Pop in reverse order to get the correct order
}
return sb.reverse().toString();
}
}
The other languages (C++, Go, TypeScript) follow a similar structure, leveraging their respective data structures (stacks and sets/bitmasks). The core logic of the greedy approach remains the same across all implementations.