{x}
blog image

Unique Word Abbreviation

Solution Explanation: Unique Word Abbreviation

This problem requires designing a data structure that efficiently checks if a word's abbreviation is unique within a dictionary. The core idea is to use a hash table to store word abbreviations and their corresponding words.

Abbreviation Function:

The problem defines a word's abbreviation as follows:

  • If the word has fewer than 3 characters, the abbreviation is the word itself.
  • Otherwise, it's the first letter, the count of characters between the first and last letter, and the last letter. For example, "internationalization" becomes "i18n".

We implement a helper function abbr(word) to calculate this abbreviation for any given word.

Data Structure:

A hash table (or dictionary in Python) is the ideal data structure. The keys are the word abbreviations, and the values are sets of words that share that abbreviation. Using a set prevents duplicate words from being stored for the same abbreviation.

ValidWordAbbr Class:

The ValidWordAbbr class has two main methods:

  • __init__(self, dictionary) (constructor): This initializes the hash table. It iterates through the input dictionary, calculates the abbreviation for each word using abbr(), and adds the word to the set associated with its abbreviation in the hash table.

  • isUnique(self, word): This method checks if the given word has a unique abbreviation within the dictionary. It works as follows:

    1. It calculates the abbreviation of the input word using abbr().
    2. It checks if this abbreviation exists as a key in the hash table.
      • If the abbreviation is not in the table, it means no other word in the dictionary shares this abbreviation, so the method returns True.
      • If the abbreviation is in the table, it means other words share the same abbreviation. The method then checks if the set associated with this abbreviation contains only the input word. If it does (meaning only one word has this abbreviation), it returns True; otherwise, it returns False.

Time and Space Complexity:

  • Time Complexity:
    • __init__: O(N*M), where N is the number of words in the dictionary, and M is the maximum length of a word (for calculating abbreviations).
    • isUnique: O(1) on average because hash table lookups are O(1) on average. In the worst case (hash collisions), it could be O(K), where K is the number of words sharing the same abbreviation.
  • Space Complexity: O(N*M) in the worst case, to store the hash table, where N is the number of words and M is the maximum word length. In practice, it will often be less since many words will have different abbreviations.

Code Examples (Python, Java, C++, Go, TypeScript):

The code examples in the problem description demonstrate the implementation of this approach in several programming languages. They all follow the same logic outlined above, differing only in syntax and specific data structure implementations. Note the use of defaultdict(set) in Python for convenient handling of hash table entries. Other languages use similar techniques for efficient initialization and access.