Let's define a function countUniqueChars(s)
that returns the number of unique characters in s
.
countUniqueChars(s)
if s = "LEETCODE"
then "L"
, "T"
, "C"
, "O"
, "D"
are the unique characters since they appear only once in s
, therefore countUniqueChars(s) = 5
.Given a string s
, return the sum of countUniqueChars(t)
where t
is a substring of s
. The test cases are generated such that the answer fits in a 32-bit integer.
Notice that some substrings can be repeated so in this case you have to count the repeated ones too.
Example 1:
Input: s = "ABC" Output: 10 Explanation: All possible substrings are: "A","B","C","AB","BC" and "ABC". Every substring is composed with only unique letters. Sum of lengths of all substring is 1 + 1 + 1 + 2 + 2 + 3 = 10
Example 2:
Input: s = "ABA"
Output: 8
Explanation: The same as example 1, except countUniqueChars
("ABA") = 1.
Example 3:
Input: s = "LEETCODE" Output: 92
Constraints:
1 <= s.length <= 105
s
consists of uppercase English letters only.This problem asks to calculate the sum of unique characters across all substrings of a given string. A naive approach of generating all substrings and then counting unique characters for each would be computationally expensive (O(n³)). The optimal solution leverages a more efficient approach with linear time complexity.
The core idea is to analyze the contribution of each individual character to the total count of unique characters across all substrings. Instead of generating all substrings, we directly count how many substrings each character contributes a unique character to.
Algorithm:
Store Character Positions: Create a data structure (like a dictionary or list of lists in Python, or a HashMap in Java) to store the indices (positions) of each character in the input string s
. This allows for quick access to all occurrences of a particular character.
Calculate Contribution: For each character c
and each of its occurrences at index i
:
c
(let's call it prev
) and the index of the next occurrence of c
(let's call it next
). If there is no previous or next occurrence, set prev
to -1 and next
to the length of the string.c
is unique at position i
is calculated as (i - prev) * (next - i)
. This is because the substring must start after prev
and end before next
to contain only one instance of c
at position i
.Sum Contributions: Sum up the contributions of all characters across all their occurrences. This final sum represents the total number of unique characters across all substrings.
Time and Space Complexity:
The following code examples demonstrate the solution in different programming languages. The core logic remains the same across all of them.
Python:
from collections import defaultdict
def uniqueLetterString(s: str) -> int:
d = defaultdict(list) # Store indices for each character
for i, c in enumerate(s):
d[c].append(i)
ans = 0
for char_indices in d.values():
indices = [-1] + char_indices + [len(s)] # Add sentinel values for edge cases
for i in range(1, len(indices) - 1):
ans += (indices[i] - indices[i - 1]) * (indices[i + 1] - indices[i])
return ans
Java:
import java.util.*;
class Solution {
public int uniqueLetterString(String s) {
Map<Character, List<Integer>> charIndices = new HashMap<>();
for (int i = 0; i < s.length(); i++) {
charIndices.computeIfAbsent(s.charAt(i), k -> new ArrayList<>()).add(i);
}
int ans = 0;
for (List<Integer> indices : charIndices.values()) {
List<Integer> extendedIndices = new ArrayList<>(List.of(-1));
extendedIndices.addAll(indices);
extendedIndices.add(s.length());
for (int i = 1; i < extendedIndices.size() - 1; i++) {
ans += (extendedIndices.get(i) - extendedIndices.get(i - 1)) * (extendedIndices.get(i + 1) - extendedIndices.get(i));
}
}
return ans;
}
}
C++:
#include <iostream>
#include <string>
#include <vector>
#include <map>
using namespace std;
int uniqueLetterString(string s) {
map<char, vector<int>> charIndices;
for (int i = 0; i < s.length(); i++) {
charIndices[s[i]].push_back(i);
}
int ans = 0;
for (auto const& [key, val] : charIndices) {
vector<int> extendedIndices = {-1};
extendedIndices.insert(extendedIndices.end(), val.begin(), val.end());
extendedIndices.push_back(s.length());
for (size_t i = 1; i < extendedIndices.size() - 1; i++) {
ans += (extendedIndices[i] - extendedIndices[i - 1]) * (extendedIndices[i + 1] - extendedIndices[i]);
}
}
return ans;
}
These examples implement the efficient algorithm described above, achieving a linear time complexity. Remember to handle edge cases (first and last occurrences of characters) appropriately as shown in the code.