Given the string s
, return the size of the longest substring containing each vowel an even number of times. That is, 'a', 'e', 'i', 'o', and 'u' must appear an even number of times.
Example 1:
Input: s = "eleetminicoworoep" Output: 13 Explanation: The longest substring is "leetminicowor" which contains two each of the vowels: e, i and o and zero of the vowels: a and u.
Example 2:
Input: s = "leetcodeisgreat" Output: 5 Explanation: The longest substring is "leetc" which contains two e's.
Example 3:
Input: s = "bcbcbc" Output: 6 Explanation: In this case, the given string "bcbcbc" is the longest because all vowels: a, e, i, o and u appear zero times.
Constraints:
1 <= s.length <= 5 x 10^5
s
contains only lowercase English letters.This problem asks to find the longest substring within a given string where each vowel (a, e, i, o, u) appears an even number of times. The optimal solution leverages bit manipulation and a hash table (or array) for efficient tracking.
Bitmask Representation: We use a 5-bit integer (mask) to represent the parity of each vowel. Each bit corresponds to a vowel (e.g., bit 0 for 'a', bit 1 for 'e', etc.). A bit is 1 if the vowel count is odd and 0 if it's even. This allows us to compactly represent the vowel parity information for any substring.
Prefix Parity Tracking: We iterate through the string, maintaining a mask
representing the parity of vowels seen so far (a prefix). When a vowel is encountered, we toggle the corresponding bit in the mask
using XOR.
Hash Table (or Array): We use a hash table (Python, JavaScript) or an array (Java, C++, Go) named d
to store the first index at which each mask
value was encountered. d[mask] = index
means the mask
was first seen at index index
. We initialize d[0] = -1
(or 0 for some implementations) to handle the case where the substring starts from the beginning (empty substring).
Finding the Longest Substring: For each prefix, we check if its mask
exists in d
. If it does, it means we have found another prefix with the same vowel parity. The substring between these two prefixes has even counts for all vowels. The length of this substring is the difference between their indices. We update ans
(the length of the longest substring found so far) accordingly. If the mask
is not in d
, we add it to d
with the current index.
Time Complexity: O(n), where n is the length of the string. We iterate through the string once. The hash table lookups (or array accesses) are considered O(1) on average.
Space Complexity: O(1). We use a fixed-size hash table (or array) of size 32 (to accommodate all possible 5-bit masks). The space used doesn't depend on the input string length.
The code examples in different languages below largely follow the same logic:
Python:
class Solution:
def findTheLongestSubstring(self, s: str) -> int:
d = {0: -1} # Initialize hash table: mask 0 (all even) at index -1
ans = mask = 0 # ans: length of longest substring, mask: current parity
for i, c in enumerate(s):
if c in "aeiou": # Check if it's a vowel
mask ^= 1 << ("aeiou".index(c)) # Toggle corresponding bit in mask
if mask in d: # Check if mask exists in the hash table
ans = max(ans, i - d[mask]) # Update ans with the longest substring length
else:
d[mask] = i # Store the index of the first occurrence of this mask
return ans
Java:
class Solution {
public int findTheLongestSubstring(String s) {
int[] d = new int[32]; //Array to store mask indices. Initialized with a large value
Arrays.fill(d, Integer.MAX_VALUE);
d[0] = 0; // mask 0 at index 0
int ans = 0, mask = 0;
for (int i = 0; i < s.length(); ++i) {
char c = s.charAt(i);
if (c == 'a') mask ^= 1;
else if (c == 'e') mask ^= 2;
else if (c == 'i') mask ^= 4;
else if (c == 'o') mask ^= 8;
else if (c == 'u') mask ^= 16;
ans = Math.max(ans, i + 1 - d[mask]); //Update ans
d[mask] = Math.min(d[mask], i + 1); //Update the index
}
return ans;
}
}
C++:
class Solution {
public:
int findTheLongestSubstring(string s) {
vector<int> d(32, INT_MAX); // Array to store mask indices.Initialized with INT_MAX
d[0] = 0; // mask 0 at index 0
int ans = 0, mask = 0;
for (int i = 0; i < s.length(); ++i) {
char c = s[i];
if (c == 'a') mask ^= 1;
else if (c == 'e') mask ^= 2;
else if (c == 'i') mask ^= 4;
else if (c == 'o') mask ^= 8;
else if (c == 'u') mask ^= 16;
ans = max(ans, i + 1 - d[mask]); //Update ans
d[mask] = min(d[mask], i + 1); //Update the index
}
return ans;
}
};
The other languages (Go, TypeScript) follow a very similar structure. The key is the efficient use of bit manipulation to track vowel parity and the hash table/array to quickly find previously seen parity patterns.