Distinct Subsequences II

Given a string s, return the number of distinct non-empty subsequences of s. Since the answer may be very large, return it modulo 10^9 + 7.

A subsequence of a string is a new string formed from the original string by deleting some (possibly none) of the characters without disturbing the relative positions of the remaining characters. (For example, "ace" is a subsequence of "abcde" while "aec" is not.)

 

Example 1:

Input: s = "abc"
Output: 7
Explanation: The 7 distinct subsequences are "a", "b", "c", "ab", "ac", "bc", and "abc".

Example 2:

Input: s = "aba"
Output: 6
Explanation: The 6 distinct subsequences are "a", "b", "ab", "aa", "ba", and "aba".

Example 3:

Input: s = "aaa"
Output: 3
Explanation: The 3 distinct subsequences are "a", "aa" and "aaa".

 

Constraints:

  • 1 <= s.length <= 2000
  • s consists of lowercase English letters.

Solution Explanation for Distinct Subsequences II

The problem asks for the number of distinct non-empty subsequences of a given string s. A subsequence is derived from the string by deleting zero or more characters without changing the order of the remaining ones. Since the answer can be very large, the result is returned modulo 10^9 + 7.

Approach:

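An efficient approach uses dynamic programming with a single pass over the string. It maintains two values: the total number of distinct non-empty subsequences seen so far, and, for each letter, how many of those end with that letter. When a character c arrives, appending it to every existing distinct subsequence, plus c on its own, yields total + 1 distinct subsequences ending with c. Some of these were already counted as subsequences ending with c, so only the difference is added to the total, and the count for c becomes the old total + 1.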
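To make the recurrence concrete, here is a minimal sketch that records the amount added and the running total at each step. The function name trace_distinct_subsequences and the variable ending_with are illustrative choices, not part of the original solution.

```python
def trace_distinct_subsequences(s: str) -> list[tuple[str, int, int]]:
    """Return (character, newly added count, running total) for each step."""
    MOD = 10**9 + 7
    ending_with = [0] * 26  # ending_with[i]: distinct subsequences ending with letter i
    total = 0               # distinct non-empty subsequences so far
    steps = []
    for ch in s:
        i = ord(ch) - ord("a")
        # Every earlier subsequence extended by ch, plus ch alone, gives
        # total + 1 strings ending with ch; ending_with[i] were counted already.
        add = (total + 1 - ending_with[i]) % MOD
        total = (total + add) % MOD
        ending_with[i] = (ending_with[i] + add) % MOD
        steps.append((ch, add, total))
    return steps


for ch, add, total in trace_distinct_subsequences("aba"):
    print(ch, add, total)
# a 1 1
# b 2 3
# a 3 6
```

For "aba", the second 'a' contributes only 3 new subsequences ("aa", "ba", "aba") because "a" itself was already counted, which gives the total of 6 from Example 2.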

Detailed Explanation with Code (Python):

class Solution:
    def distinctSubseqII(self, s: str) -> int:
        mod = 10**9 + 7
        dp = [0] * 26  # dp[i]: distinct subsequences ending with the i-th letter ('a' to 'z')
        ans = 0        # total distinct non-empty subsequences so far
        for c in s:
            i = ord(c) - ord('a')  # index of c in dp
            # Appending c to every subsequence so far, plus c alone, gives ans + 1
            # strings ending with c; dp[i] of them were already counted.
            add = (ans - dp[i] + 1 + mod) % mod
            ans = (ans + add) % mod      # update the total
            dp[i] = (dp[i] + add) % mod  # now equals the old ans + 1
        return ans
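One way to gain confidence in the recurrence is to compare it against exhaustive enumeration on short inputs. The sketch below is an assumed test harness, not part of the original solution: brute_force_count and dp_count are hypothetical names, and dp_count restates the DP above as a plain function so the block is self-contained.

```python
from itertools import combinations


def brute_force_count(s: str) -> int:
    """Count distinct non-empty subsequences by enumeration (exponential time)."""
    seen = set()
    for k in range(1, len(s) + 1):
        for idxs in combinations(range(len(s)), k):
            seen.add("".join(s[i] for i in idxs))
    return len(seen)


def dp_count(s: str) -> int:
    """Same recurrence as the solution above, written as a standalone function."""
    mod = 10**9 + 7
    dp = [0] * 26
    ans = 0
    for c in s:
        i = ord(c) - ord("a")
        add = (ans - dp[i] + 1) % mod
        ans = (ans + add) % mod
        dp[i] = (dp[i] + add) % mod
    return ans


# The three examples from the problem statement, plus one extra short string.
for text in ["abc", "aba", "aaa", "abab"]:
    assert brute_force_count(text) == dp_count(text), text
```

The brute-force version is only practical for very short strings, since it visits every one of the 2^n - 1 index subsets.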
 

Time Complexity Analysis:

The code iterates through the string s once. Inside the loop, the operations (like ord, %, addition) take constant time. Therefore, the overall time complexity is O(n), where n is the length of the string.

Space Complexity Analysis:

The space used is dominated by the dp array, which has a fixed size of 26 (for lowercase English letters). Therefore, the space complexity is O(1), which is constant space.

Other Language Implementations:

The core logic remains the same across programming languages; the variations lie mainly in syntax and in how the modulo operation behaves. In Java, C++, Go, TypeScript, Rust, and C, the % operator can return a negative result when the left operand is negative, so adding mod before reducing (as in ans - dp[i] + 1 + mod) is required there. In Python, % with a positive modulus never returns a negative value, so the extra + mod is redundant but harmless. Implementations in all of these languages keep the same DP approach with O(n) time and O(1) space. The key is to maintain the count of distinct subsequences ending with each character.
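The difference in modulo handling can be illustrated in Python by emulating a truncated remainder, which takes the sign of the dividend as % does in C, C++, Java, Go, Rust, and TypeScript. The helper truncated_rem below is a hypothetical function written for this illustration, and the values of ans and dp_i are made up to force a negative intermediate result.

```python
MOD = 10**9 + 7


def truncated_rem(a: int, b: int) -> int:
    """Remainder whose sign follows the dividend (C-style truncated division)."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q
    return a - b * q


ans, dp_i = 5, MOD - 1  # already-reduced values where ans - dp_i + 1 is negative

print((ans - dp_i + 1) % MOD)                    # 7: Python's % is already in [0, MOD)
print(truncated_rem(ans - dp_i + 1, MOD))        # -1000000000: truncated remainder is negative
print(truncated_rem(ans - dp_i + 1 + MOD, MOD))  # 7: adding MOD first restores the right value
```

Because ans and dp[i] are both kept in [0, mod), the expression ans - dp[i] + 1 is always greater than -mod, so adding mod once is enough to make it non-negative.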