{x}
blog image

String Compression

Given an array of characters chars, compress it using the following algorithm:

Begin with an empty string s. For each group of consecutive repeating characters in chars:

  • If the group's length is 1, append the character to s.
  • Otherwise, append the character followed by the group's length.

The compressed string s should not be returned separately, but instead, be stored in the input character array chars. Note that group lengths that are 10 or longer will be split into multiple characters in chars.

After you are done modifying the input array, return the new length of the array.

You must write an algorithm that uses only constant extra space.

 

Example 1:

Input: chars = ["a","a","b","b","c","c","c"]
Output: Return 6, and the first 6 characters of the input array should be: ["a","2","b","2","c","3"]
Explanation: The groups are "aa", "bb", and "ccc". This compresses to "a2b2c3".

Example 2:

Input: chars = ["a"]
Output: Return 1, and the first character of the input array should be: ["a"]
Explanation: The only group is "a", which remains uncompressed since it's a single character.

Example 3:

Input: chars = ["a","b","b","b","b","b","b","b","b","b","b","b","b"]
Output: Return 4, and the first 4 characters of the input array should be: ["a","b","1","2"].
Explanation: The groups are "a" and "bbbbbbbbbbbb". This compresses to "ab12".

 

Constraints:

  • 1 <= chars.length <= 2000
  • chars[i] is a lowercase English letter, uppercase English letter, digit, or symbol.

Solution Explanation for String Compression

This problem requires compressing a character array by grouping consecutive repeating characters. If a group has length 1, the character is appended to the result; otherwise, the character followed by its count is appended. The compressed string should be stored in the input array, and the function should return the new length.

Algorithm:

The solution uses a two-pointer approach. i is the starting index of a group of repeating characters, and j iterates to find the end of that group. The loop continues until i reaches the end of the input array.

  1. Find Group: The inner while loop finds all consecutive occurrences of chars[i]. j points to the character after the last occurrence in the group.

  2. Append Character: The character chars[i] is appended to the compressed array (at index k).

  3. Append Count (if necessary): If the group's length (j - i) is greater than 1, the count is converted to a string, and each digit is appended to the compressed array.

  4. Update Pointers: i is updated to j, moving to the next group of characters. k is the index of the next available position in the compressed array.

Time Complexity Analysis:

The outer loop iterates through the input array once. The inner loop iterates through each group of repeating characters. In the worst-case scenario, where all characters are unique, the inner loop executes only once for each character. Therefore, the total number of iterations is proportional to the length of the input array. The conversion of the count to a string takes O(log n) time, where n is the maximum length of a repeating character group. The overall time complexity remains O(n) because logarithmic time is dominated by linear time.

Space Complexity Analysis:

The solution uses only a constant amount of extra space for the pointers (i, j, k). The compression happens in-place, modifying the input array directly. Hence, the space complexity is O(1).

Code Explanation (Python):

class Solution:
    def compress(self, chars: List[str]) -> int:
        i, k, n = 0, 0, len(chars)  # i: start of group, k: index for compressed array, n: length of chars
        while i < n:
            j = i + 1
            while j < n and chars[j] == chars[i]: # Find end of group
                j += 1
            chars[k] = chars[i] # Append character
            k += 1
            if j - i > 1: # Append count if group length > 1
                cnt = str(j - i)
                for c in cnt:
                    chars[k] = c
                    k += 1
            i = j # Move to next group
        return k # Return length of compressed array
 

The other language implementations follow the same logic, with minor syntactic variations. The core idea of the two-pointer approach and in-place compression remains consistent across all languages.