{x}
blog image

Substrings That Begin and End With the Same Letter

Solution Explanation: Counting Substrings

This problem asks us to count the number of substrings in a given string s that start and end with the same character. A brute-force approach would involve iterating through all possible substrings and checking this condition, leading to a time complexity of O(n³), where n is the length of the string. However, a significantly more efficient solution exists.

Optimized Approach: Frequency Counting

The key observation is that we don't need to explicitly generate and check all substrings. Instead, we can leverage the fact that the number of substrings starting at a given index i and ending with the same character is directly related to the frequency of that character before index i.

  1. Initialization: We use a hash map (or an array if we only have lowercase English letters) to store the frequency count of each character encountered so far.

  2. Iteration: We iterate through the string s, character by character. For each character c at index i:

    • We increment its frequency count in the hash map.
    • The current character's frequency count represents the number of substrings ending at index i that begin and end with c. This is because every occurrence of c before index i creates a new substring that starts and ends with c. We add this count to the total count of substrings.
  3. Return Value: After iterating through the entire string, the total count represents the number of substrings that begin and end with the same character.

Time and Space Complexity

  • Time Complexity: O(n), where n is the length of the string s. We iterate through the string only once.
  • Space Complexity: O(1) in the case of lowercase English letters (we use a fixed-size array of 26 elements). If the character set were significantly larger or varied, it would become O(k), where k is the number of unique characters.

Code Examples

The code examples below demonstrate this optimized approach in various programming languages:

Python:

from collections import Counter
 
def numberOfSubstrings(s: str) -> int:
    cnt = Counter()
    ans = 0
    for c in s:
        cnt[c] += 1
        ans += cnt[c]
    return ans
 

Java:

class Solution {
    public long numberOfSubstrings(String s) {
        int[] cnt = new int[26];
        long ans = 0;
        for (char c : s.toCharArray()) {
            ans += ++cnt[c - 'a'];
        }
        return ans;
    }
}

C++:

class Solution {
public:
    long long numberOfSubstrings(string s) {
        int cnt[26] = {0};
        long long ans = 0;
        for (char c : s) {
            ans += ++cnt[c - 'a'];
        }
        return ans;
    }
};

JavaScript:

var numberOfSubstrings = function(s) {
    let count = {};
    let result = 0;
    for (let char of s) {
        count[char] = (count[char] || 0) + 1;
        result += count[char];
    }
    return result;
};

These examples all follow the same basic logic: they use a frequency counter to efficiently count the substrings without needing to generate and check every possible substring. The use of an array (for a fixed character set like lowercase English letters) is slightly more efficient than a hash map in most cases because array lookups have constant time complexity.