This problem asks us to count the number of substrings in a given string s
that start and end with the same character. A brute-force approach would involve iterating through all possible substrings and checking this condition, leading to a time complexity of O(n³), where n is the length of the string. However, a significantly more efficient solution exists.
The key observation is that we don't need to explicitly generate and check all substrings. Instead, we can leverage the fact that the number of substrings starting at a given index i
and ending with the same character is directly related to the frequency of that character before index i
.
Initialization: We use a hash map (or an array if we only have lowercase English letters) to store the frequency count of each character encountered so far.
Iteration: We iterate through the string s
, character by character. For each character c
at index i
:
i
that begin and end with c
. This is because every occurrence of c
before index i
creates a new substring that starts and ends with c
. We add this count to the total count of substrings.Return Value: After iterating through the entire string, the total count represents the number of substrings that begin and end with the same character.
s
. We iterate through the string only once.The code examples below demonstrate this optimized approach in various programming languages:
Python:
from collections import Counter
def numberOfSubstrings(s: str) -> int:
cnt = Counter()
ans = 0
for c in s:
cnt[c] += 1
ans += cnt[c]
return ans
Java:
class Solution {
public long numberOfSubstrings(String s) {
int[] cnt = new int[26];
long ans = 0;
for (char c : s.toCharArray()) {
ans += ++cnt[c - 'a'];
}
return ans;
}
}
C++:
class Solution {
public:
long long numberOfSubstrings(string s) {
int cnt[26] = {0};
long long ans = 0;
for (char c : s) {
ans += ++cnt[c - 'a'];
}
return ans;
}
};
JavaScript:
var numberOfSubstrings = function(s) {
let count = {};
let result = 0;
for (let char of s) {
count[char] = (count[char] || 0) + 1;
result += count[char];
}
return result;
};
These examples all follow the same basic logic: they use a frequency counter to efficiently count the substrings without needing to generate and check every possible substring. The use of an array (for a fixed character set like lowercase English letters) is slightly more efficient than a hash map in most cases because array lookups have constant time complexity.