{x}
blog image

Count Number of Homogenous Substrings

Given a string s, return the number of homogenous substrings of s. Since the answer may be too large, return it modulo 109 + 7.

A string is homogenous if all the characters of the string are the same.

A substring is a contiguous sequence of characters within a string.

 

Example 1:

Input: s = "abbcccaa"
Output: 13
Explanation: The homogenous substrings are listed as below:
"a"   appears 3 times.
"aa"  appears 1 time.
"b"   appears 2 times.
"bb"  appears 1 time.
"c"   appears 3 times.
"cc"  appears 2 times.
"ccc" appears 1 time.
3 + 1 + 2 + 1 + 3 + 2 + 1 = 13.

Example 2:

Input: s = "xy"
Output: 2
Explanation: The homogenous substrings are "x" and "y".

Example 3:

Input: s = "zzzzz"
Output: 15

 

Constraints:

  • 1 <= s.length <= 105
  • s consists of lowercase letters.

1759. Count Number of Homogenous Substrings

Problem Description

Given a string s, the task is to find the number of homogenous substrings in s. A homogenous substring is a contiguous sequence of characters where all characters are the same. The result should be modulo 109 + 7.

Solution Approach

The most efficient way to solve this problem is by iterating through the string and keeping track of consecutive homogenous characters. For each sequence of identical characters, we calculate the number of homogenous substrings formed by that sequence and add it to the total count.

For a sequence of k identical characters, the number of homogenous substrings is the sum of integers from 1 to k, which is given by the formula k * (k + 1) / 2.

Detailed Explanation with Code (Python3)

The Python3 code below implements this approach:

class Solution:
    def countHomogenous(self, s: str) -> int:
        mod = 10**9 + 7
        count = 0  # Initialize the count of homogenous substrings
        current_run = 0 # Length of current run of same characters
        
        for i in range(len(s)):
            if i > 0 and s[i] == s[i-1]: #Check if current character is same as previous
                current_run += 1  # Extend current run
            else:
                # New run of characters started
                count = (count + current_run * (current_run + 1) // 2) % mod #Add homogenous substrings from previous run.
                current_run = 1 #Reset current run length to 1
 
        #Add homogenous substrings from last run.
        count = (count + current_run * (current_run + 1) // 2) % mod
        return count
 

Time Complexity Analysis:

The code iterates through the string once, performing constant-time operations in each iteration. Therefore, the time complexity is O(n), where n is the length of the string.

Space Complexity Analysis:

The space complexity is O(1) because we are only using a few constant-size variables to store the count and length of the current run of homogenous characters.

Code in Other Languages

The same approach can be implemented in other languages with minor syntactic variations. Here are examples in Java and C++:

Java:

class Solution {
    public int countHomogenous(String s) {
        long mod = (long)1e9 + 7;
        long count = 0;
        int current_run = 0;
 
        for (int i = 0; i < s.length(); i++) {
            if (i > 0 && s.charAt(i) == s.charAt(i - 1)) {
                current_run++;
            } else {
                count = (count + (long) current_run * (current_run + 1) / 2) % mod;
                current_run = 1;
            }
        }
        count = (count + (long) current_run * (current_run + 1) / 2) % mod;
        return (int) count;
    }
}

C++:

class Solution {
public:
    int countHomogenous(string s) {
        long long mod = 1e9 + 7;
        long long count = 0;
        long long current_run = 0;
 
        for (int i = 0; i < s.length(); i++) {
            if (i > 0 && s[i] == s[i - 1]) {
                current_run++;
            } else {
                count = (count + current_run * (current_run + 1) / 2) % mod;
                current_run = 1;
            }
        }
        count = (count + current_run * (current_run + 1) / 2) % mod;
        return count;
    }
};

The time and space complexity remain the same for all these implementations – O(n) and O(1), respectively.