{x}
blog image

Word Abbreviation

Solution Explanation for LeetCode 527: Word Abbreviation

This problem requires finding the minimal abbreviations for a list of words, following specific rules:

  1. Initial Abbreviation: The abbreviation starts with the first character, followed by the count of characters between the first and last, and ends with the last character. (e.g., "internal" -> "i7l")

  2. Uniqueness: If multiple words share the same abbreviation, increase the prefix length until all abbreviations are unique. (e.g., "internal", "intension" both initially "i7n" might become "inte4n", "inte4n" and then "int3n", "inte4n")

  3. No Lengthening: If an abbreviation doesn't shorten the word, keep the original word.

Approach: Trie-based Grouping

The most efficient approach leverages a Trie data structure to manage word groups based on their length and last character. This allows for efficient detection and resolution of abbreviation conflicts.

1. Grouping:

Words are grouped into buckets based on their length and their last character. This is because words with the same initial and final character and the same length will always have the same basic abbreviation. For example, "internal" and "intension" share the same last character and length, leading to an initial abbreviation conflict.

2. Trie Implementation:

A Trie is used for each group. Each node in the Trie represents a character in the word prefix. The cnt attribute in each node tracks how many words share that prefix.

  • Insertion: Words are inserted into the Trie character by character. The cnt of each node encountered is incremented.

  • Search: To find the correct prefix length for a word's abbreviation, we traverse the Trie. If a node's cnt is 1, it means the prefix up to that node is unique. We return the length of that prefix. If the whole word is traversed without finding a unique prefix, the word is too short for abbreviation.

3. Abbreviation Generation:

For each word, we perform the following steps:

  • Group Lookup: Identify the word's group based on length and the last character.
  • Trie Search: Search the Trie for the word, determining the optimal prefix length for the abbreviation.
  • Abbreviation Construction: Create the abbreviation based on the prefix length (from the Trie search), middle character count, and the last character.
  • Length Check: If the abbreviation is not shorter than the original word, keep the original word.

4. Time and Space Complexity:

  • Time Complexity: O(L), where L is the sum of the lengths of all words. This is because each character of each word is processed at most once for Trie insertion and search.

  • Space Complexity: O(L). The space used is primarily by the Trie data structure, which in the worst case can store all characters of all words.

Code Implementation (Python):

class Trie:
    def __init__(self):
        self.children = [None] * 26
        self.cnt = 0
 
    def insert(self, word):
        node = self
        for char in word:
            idx = ord(char) - ord('a')
            if not node.children[idx]:
                node.children[idx] = Trie()
            node = node.children[idx]
            node.cnt += 1
 
    def search(self, word):
        node = self
        count = 0
        for char in word:
            count += 1
            idx = ord(char) - ord('a')
            node = node.children[idx]
            if node.cnt == 1:
                return count
        return len(word)
 
 
def wordsAbbreviation(words):
    tries = {}
    for word in words:
        length = len(word)
        last_char = word[-1]
        key = (length, last_char)
        if key not in tries:
            tries[key] = Trie()
        tries[key].insert(word)
 
    result = []
    for word in words:
        length = len(word)
        last_char = word[-1]
        key = (length, last_char)
        prefix_len = tries[key].search(word)
        if prefix_len + 2 >= length:
            result.append(word)
        else:
            abbreviation = word[:prefix_len] + str(length - prefix_len - 1) + last_char
            result.append(abbreviation)
    return result
 

Implementations in other languages (Java, C++, Go, TypeScript) would follow a very similar structure, using appropriate data structures for Tries and maps/dictionaries. The core logic remains consistent across languages.