{x}
blog image

Unique Substrings With Equal Digit Frequency

Solution Explanation

This problem asks to find the number of unique substrings where each digit appears the same number of times. The solution uses a prefix sum array to efficiently count digit occurrences within substrings.

Approach

  1. Prefix Sum: A 2D array presum of size (n+1) x 10 is created, where n is the length of the input string s. presum[i][j] stores the count of digit j in the substring s[0:i]. This allows for quick calculation of digit counts within any substring s[i:j] using presum[j+1][k] - presum[i][k].

  2. check Function: This function determines if a substring s[i:j] satisfies the condition. It iterates through digits 0-9. For each digit, it calculates its count in the substring using the prefix sum. If the count is greater than 0 and a different count has already been found, it means the digits don't have equal frequencies, and false is returned.

  3. Iteration and Counting: The code iterates through all possible substrings of s. For each substring, it calls check to verify the equal frequency condition. If the condition is met, the substring is added to a set vis to ensure uniqueness. Finally, the size of vis (number of unique substrings) is returned.

Time Complexity Analysis

  • Prefix Sum Calculation: The prefix sum array is calculated in O(n*10) time, which simplifies to O(n).
  • Substring Check: The nested loops iterate through all possible substrings, taking O(n^2) time. Inside the loops, check function takes O(10) time which is O(1).
  • Set Operations: Adding substrings to the set takes O(1) on average, and getting the set size takes O(1) time.

Therefore, the overall time complexity is dominated by the nested loops, resulting in O(n^2).

Space Complexity Analysis

  • Prefix Sum Array: The presum array uses O(n*10) space, which is O(n).
  • Set: The vis set stores unique substrings. In the worst case, it could store all substrings, resulting in O(n^2) space complexity.

Therefore, the overall space complexity is O(n^2) in the worst case.

Code Explanation (Python)

The Python code implements the approach directly. The check function is clearly defined, and the prefix sum array efficiently counts digit frequencies. The use of a set guarantees that only unique substrings are counted.

class Solution:
    def equalDigitFrequency(self, s: str) -> int:
        def check(i, j):
            v = set()
            for k in range(10):
                cnt = presum[j + 1][k] - presum[i][k]
                if cnt > 0:
                    v.add(cnt)
                if len(v) > 1:
                    return False
            return True
 
        n = len(s)
        presum = [[0] * 10 for _ in range(n + 1)]
        for i, c in enumerate(s):
            presum[i + 1][int(c)] += 1
            for j in range(10):
                presum[i + 1][j] += presum[i][j]
        vis = set(s[i : j + 1] for i in range(n) for j in range(i, n) if check(i, j))
        return len(vis)

The Java and Go codes follow the same logic and data structures, with minor syntactic differences. The time and space complexity analyses apply equally to all three languages.