{x}
blog image

Shortest Common Supersequence

Given two strings str1 and str2, return the shortest string that has both str1 and str2 as subsequences. If there are multiple valid strings, return any of them.

A string s is a subsequence of string t if deleting some number of characters from t (possibly 0) results in the string s.

 

Example 1:

Input: str1 = "abac", str2 = "cab"
Output: "cabac"
Explanation: 
str1 = "abac" is a subsequence of "cabac" because we can delete the first "c".
str2 = "cab" is a subsequence of "cabac" because we can delete the last "ac".
The answer provided is the shortest such string that satisfies these properties.

Example 2:

Input: str1 = "aaaaaaaa", str2 = "aaaaaaaa"
Output: "aaaaaaaa"

 

Constraints:

  • 1 <= str1.length, str2.length <= 1000
  • str1 and str2 consist of lowercase English letters.

Solution Explanation: Shortest Common Supersequence

This problem asks to find the shortest string that contains both str1 and str2 as subsequences. The solution uses dynamic programming to efficiently solve this.

Approach

  1. Dynamic Programming Table: We create a DP table f of size (m+1) x (n+1), where m and n are the lengths of str1 and str2 respectively. f[i][j] represents the length of the shortest common supersequence of str1[0...i-1] and str2[0...j-1].

  2. Base Cases: f[i][0] = i and f[0][j] = j because the shortest common supersequence of a string and an empty string is the string itself.

  3. Recursive Relation: For i > 0 and j > 0:

    • If str1[i-1] == str2[j-1], it means the last characters are the same. We can simply add this character to the shortest common supersequence of the remaining prefixes: f[i][j] = f[i-1][j-1] + 1.
    • If str1[i-1] != str2[j-1], we have two choices: either include str1[i-1] or str2[j-1] in the supersequence. We choose the option that leads to the shorter supersequence: f[i][j] = min(f[i-1][j] + 1, f[i][j-1] + 1). We add 1 because we're adding one character.
  4. Backtracking: After filling the DP table, we backtrack to reconstruct the shortest common supersequence. We start from f[m][n] and trace back based on the values in the DP table. If f[i][j] == f[i-1][j-1] + 1, it implies that str1[i-1] == str2[j-1], so we add this character to the result. Otherwise, we choose the direction that corresponds to the smaller value in f[i-1][j] or f[i][j-1], indicating which character we've added.

Time and Space Complexity

  • Time Complexity: O(m*n) due to the nested loops for filling the DP table. The backtracking step is also O(m+n) in the worst case, which is dominated by the DP table filling.
  • Space Complexity: O(m*n) to store the DP table f.

Code (Python):

def shortestCommonSupersequence(str1: str, str2: str) -> str:
    m, n = len(str1), len(str2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
 
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if str1[i - 1] == str2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
 
    res = ""
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and str1[i - 1] == str2[j - 1]:
            res = str1[i - 1] + res
            i -= 1
            j -= 1
        elif i > 0 and (j == 0 or dp[i - 1][j] >= dp[i][j - 1]):
            res = str1[i - 1] + res
            i -= 1
        else:
            res = str2[j - 1] + res
            j -= 1
    return res
 

The code in other languages (Java, C++, Go, TypeScript) follows the same algorithmic approach, with minor syntactic differences specific to each language. The core logic remains consistent across all implementations.