{x}
blog image

Maximum Length of a Concatenated String with Unique Characters

You are given an array of strings arr. A string s is formed by the concatenation of a subsequence of arr that has unique characters.

Return the maximum possible length of s.

A subsequence is an array that can be derived from another array by deleting some or no elements without changing the order of the remaining elements.

 

Example 1:

Input: arr = ["un","iq","ue"]
Output: 4
Explanation: All the valid concatenations are:
- ""
- "un"
- "iq"
- "ue"
- "uniq" ("un" + "iq")
- "ique" ("iq" + "ue")
Maximum length is 4.

Example 2:

Input: arr = ["cha","r","act","ers"]
Output: 6
Explanation: Possible longest valid concatenations are "chaers" ("cha" + "ers") and "acters" ("act" + "ers").

Example 3:

Input: arr = ["abcdefghijklmnopqrstuvwxyz"]
Output: 26
Explanation: The only string in arr has all 26 characters.

 

Constraints:

  • 1 <= arr.length <= 16
  • 1 <= arr[i].length <= 26
  • arr[i] contains only lowercase English letters.

Solution Explanation: Maximum Length of a Concatenated String with Unique Characters

This problem asks to find the maximum length of a string formed by concatenating a subsequence of strings from the input array arr, with the constraint that the resulting string must contain only unique characters. The solution leverages bit manipulation for efficient character tracking and backtracking (implicitly through the iterative approach) to explore all possible subsequences.

Approach: Bit Manipulation and State Compression

The core idea is to represent the presence or absence of each character (a-z) in a string using a 26-bit integer. Each bit corresponds to a character: bit 0 for 'a', bit 1 for 'b', and so on. A bit set to 1 indicates the character's presence; 0 indicates its absence. This technique is known as state compression.

  1. Initialization: Start with an empty string (represented by the integer 0).

  2. Iteration: Iterate through the input array arr. For each string t:

    • Create a bitmask x representing t. If t contains duplicate characters, set x to 0 (invalid).
    • Iterate through the existing valid states s (initially just 0). For each state y:
      • Check if x and y have any common bits (meaning common characters). If not, it's safe to concatenate.
      • Create a new state x | y (bitwise OR) representing the concatenation. This combines the characters from both states.
      • Add the new state to the set of valid states s.
      • Update the maximum length ans if the number of set bits in the new state is greater.
  3. Result: The final ans will hold the maximum length of a string with unique characters.

Time and Space Complexity

  • Time Complexity: O(2n + L), where n is the length of the input array arr and L is the sum of lengths of all strings in arr. The O(2n) factor comes from the exponential growth in the number of possible subsequences (subsets of arr). L is the time for character checking and bit manipulation in the loops.

  • Space Complexity: O(2n). In the worst case, we might need to store a bitmask for every possible subsequence.

Code Implementation (Python)

class Solution:
    def maxLength(self, arr: List[str]) -> int:
        s = {0}  # Set to store valid states (bitmasks)
        ans = 0
        for t in arr:
            x = 0
            duplicate = False
            for char in t:
                b = ord(char) - ord('a')
                if (x >> b) & 1:  # Check for duplicates
                    duplicate = True
                    break
                x |= 1 << b
            if not duplicate:
                new_states = set()
                for y in s:
                    if (x & y) == 0:  # Check for common characters
                        new_states.add(x | y)
                s.update(new_states)
                ans = max(ans, bin(max(s)).count('1')) #find max bits set 
        return ans
 

The code in other languages (Java, C++, Go, TypeScript) follows the same logic, differing mainly in syntax and bit manipulation functions. The core principle of bit manipulation and state compression remains consistent across all implementations.