Given an array of strings words
, return the smallest string that contains each string in words
as a substring. If there are multiple valid strings of the smallest length, return any of them.
You may assume that no string in words
is a substring of another string in words
.
Example 1:
Input: words = ["alex","loves","leetcode"] Output: "alexlovesleetcode" Explanation: All permutations of "alex","loves","leetcode" would also be accepted.
Example 2:
Input: words = ["catg","ctaagt","gcta","ttca","atgcatc"] Output: "gctaagttcatgcatc"
Constraints:
1 <= words.length <= 12
1 <= words[i].length <= 20
words[i]
consists of lowercase English letters.words
are unique.This problem asks to find the shortest string that contains all strings in the input array words
as substrings. The solution uses a dynamic programming approach with bit manipulation to efficiently explore the possible combinations of string concatenations.
Approach:
Overlap Calculation: The solution starts by computing an overlap matrix g
. g[i][j]
stores the maximum overlap between the end of words[i]
and the beginning of words[j]
. This is crucial for determining how efficiently we can concatenate strings.
Dynamic Programming: The core of the solution is a dynamic programming table dp
. dp[i][j]
represents the maximum total overlap achievable when considering a subset of strings represented by the bitmask i
and ending with string j
. A bitmask is used to efficiently represent subsets of strings; each bit corresponds to a string in words
. If the j-th bit is set, the j-th string is included in the subset.
Backtracking: A second table p
stores the predecessor string index for each state in dp
. This allows for reconstructing the optimal concatenation order after the dp
table is computed. The backtracking starts from the state where all strings are included and traces back to find the optimal sequence.
String Construction: Once the optimal sequence is found, the strings are concatenated considering the overlaps to produce the shortest superstring.
Time Complexity Analysis:
Therefore, the overall time complexity is dominated by the dynamic programming step and is O(n * 2n + n²m). The exponential term makes it clear that this algorithm is not suitable for very large numbers of strings.
Space Complexity Analysis:
The space complexity is dominated by the dp
and p
tables, both of which have dimensions O(n * 2n). The overlap matrix g
requires O(n²) space. Therefore, the overall space complexity is O(n * 2n). This is also exponential with respect to the number of strings.
Code Explanation (Python):
The provided Python code implements this algorithm directly. The key parts are:
g
matrix calculation: Loops through pairs of words to find the maximum overlap.dp
and p
matrix computation: This is the dynamic programming core, recursively finding the best concatenation order using bit manipulation to represent subsets efficiently.p
to reconstruct the optimal sequence of word indices.The code in other languages (Java, C++, Go) follows the same logic and algorithmic structure, differing only in syntax and specific library functions used. The core dynamic programming and backtracking steps remain consistent across all implementations.