Given a string s
, return the number of homogenous substrings of s
. Since the answer may be too large, return it modulo 109 + 7
.
A string is homogenous if all the characters of the string are the same.
A substring is a contiguous sequence of characters within a string.
Example 1:
Input: s = "abbcccaa" Output: 13 Explanation: The homogenous substrings are listed as below: "a" appears 3 times. "aa" appears 1 time. "b" appears 2 times. "bb" appears 1 time. "c" appears 3 times. "cc" appears 2 times. "ccc" appears 1 time. 3 + 1 + 2 + 1 + 3 + 2 + 1 = 13.
Example 2:
Input: s = "xy" Output: 2 Explanation: The homogenous substrings are "x" and "y".
Example 3:
Input: s = "zzzzz" Output: 15
Constraints:
1 <= s.length <= 105
s
consists of lowercase letters.Given a string s
, the task is to find the number of homogenous substrings in s
. A homogenous substring is a contiguous sequence of characters where all characters are the same. The result should be modulo 109 + 7.
The most efficient way to solve this problem is by iterating through the string and keeping track of consecutive homogenous characters. For each sequence of identical characters, we calculate the number of homogenous substrings formed by that sequence and add it to the total count.
For a sequence of k
identical characters, the number of homogenous substrings is the sum of integers from 1 to k
, which is given by the formula k * (k + 1) / 2
.
The Python3 code below implements this approach:
class Solution:
def countHomogenous(self, s: str) -> int:
mod = 10**9 + 7
count = 0 # Initialize the count of homogenous substrings
current_run = 0 # Length of current run of same characters
for i in range(len(s)):
if i > 0 and s[i] == s[i-1]: #Check if current character is same as previous
current_run += 1 # Extend current run
else:
# New run of characters started
count = (count + current_run * (current_run + 1) // 2) % mod #Add homogenous substrings from previous run.
current_run = 1 #Reset current run length to 1
#Add homogenous substrings from last run.
count = (count + current_run * (current_run + 1) // 2) % mod
return count
Time Complexity Analysis:
The code iterates through the string once, performing constant-time operations in each iteration. Therefore, the time complexity is O(n), where n is the length of the string.
Space Complexity Analysis:
The space complexity is O(1) because we are only using a few constant-size variables to store the count and length of the current run of homogenous characters.
The same approach can be implemented in other languages with minor syntactic variations. Here are examples in Java and C++:
Java:
class Solution {
public int countHomogenous(String s) {
long mod = (long)1e9 + 7;
long count = 0;
int current_run = 0;
for (int i = 0; i < s.length(); i++) {
if (i > 0 && s.charAt(i) == s.charAt(i - 1)) {
current_run++;
} else {
count = (count + (long) current_run * (current_run + 1) / 2) % mod;
current_run = 1;
}
}
count = (count + (long) current_run * (current_run + 1) / 2) % mod;
return (int) count;
}
}
C++:
class Solution {
public:
int countHomogenous(string s) {
long long mod = 1e9 + 7;
long long count = 0;
long long current_run = 0;
for (int i = 0; i < s.length(); i++) {
if (i > 0 && s[i] == s[i - 1]) {
current_run++;
} else {
count = (count + current_run * (current_run + 1) / 2) % mod;
current_run = 1;
}
}
count = (count + current_run * (current_run + 1) / 2) % mod;
return count;
}
};
The time and space complexity remain the same for all these implementations – O(n) and O(1), respectively.