{x}
blog image

Vowel Spellchecker

Given a wordlist, we want to implement a spellchecker that converts a query word into a correct word.

For a given query word, the spell checker handles two categories of spelling mistakes:

  • Capitalization: If the query matches a word in the wordlist (case-insensitive), then the query word is returned with the same case as the case in the wordlist.
    • Example: wordlist = ["yellow"], query = "YellOw": correct = "yellow"
    • Example: wordlist = ["Yellow"], query = "yellow": correct = "Yellow"
    • Example: wordlist = ["yellow"], query = "yellow": correct = "yellow"
  • Vowel Errors: If after replacing the vowels ('a', 'e', 'i', 'o', 'u') of the query word with any vowel individually, it matches a word in the wordlist (case-insensitive), then the query word is returned with the same case as the match in the wordlist.
    • Example: wordlist = ["YellOw"], query = "yollow": correct = "YellOw"
    • Example: wordlist = ["YellOw"], query = "yeellow": correct = "" (no match)
    • Example: wordlist = ["YellOw"], query = "yllw": correct = "" (no match)

In addition, the spell checker operates under the following precedence rules:

  • When the query exactly matches a word in the wordlist (case-sensitive), you should return the same word back.
  • When the query matches a word up to capitlization, you should return the first such match in the wordlist.
  • When the query matches a word up to vowel errors, you should return the first such match in the wordlist.
  • If the query has no matches in the wordlist, you should return the empty string.

Given some queries, return a list of words answer, where answer[i] is the correct word for query = queries[i].

 

Example 1:

Input: wordlist = ["KiTe","kite","hare","Hare"], queries = ["kite","Kite","KiTe","Hare","HARE","Hear","hear","keti","keet","keto"]
Output: ["kite","KiTe","KiTe","Hare","hare","","","KiTe","","KiTe"]

Example 2:

Input: wordlist = ["yellow"], queries = ["YellOw"]
Output: ["yellow"]

 

Constraints:

  • 1 <= wordlist.length, queries.length <= 5000
  • 1 <= wordlist[i].length, queries[i].length <= 7
  • wordlist[i] and queries[i] consist only of only English letters.

Solution Explanation for Vowel Spellchecker

This problem requires implementing a spellchecker that corrects query words based on capitalization and vowel errors, adhering to specific precedence rules. The optimal solution uses hash tables for efficient lookups.

Approach:

  1. Data Structures: We employ three hash tables:

    • s: A set (or unordered_set in C++) to store all words from wordlist for case-sensitive exact matches. This provides O(1) lookup time.
    • low: A map (or unordered_map in C++) to store lowercase versions of words as keys, with the original (case-preserving) word as the value. This handles capitalization errors.
    • pat: A map to store words with vowels replaced by '*' as keys and the original word as the value. This handles vowel errors.
  2. Preprocessing: We iterate through wordlist once:

    • Add each word to s.
    • Convert each word to lowercase and add it to low, overwriting if a lowercase duplicate exists (we only need the first occurrence).
    • Replace vowels with '*' in the lowercase word and add it to pat, again overwriting duplicates.
  3. Query Processing: For each query word:

    • Case-Sensitive Check: Check if the query is in s. If it is, this is the exact match, and we use it.
    • Case-Insensitive Check: Convert the query to lowercase. If it's in low, we've found a capitalization match; use the value from low.
    • Vowel-Insensitive Check: Replace vowels with '*' in the lowercase query. If it's in pat, we have a vowel error match; use the value from pat.
    • No Match: If none of the above conditions are met, there's no match, and we return an empty string.

Time Complexity:

  • Preprocessing: O(N * L), where N is the length of wordlist and L is the maximum length of a word. The vowel replacement in the preprocessing step is linear to the length of the word.
  • Query processing: O(M * L), where M is the length of queries. Each query involves a constant number of lookups in hash tables (O(1) on average), and potentially a linear-time vowel replacement in the worst case.

Therefore, the overall time complexity is O(N * L + M * L). Since L is relatively small and constant compared to N and M, it can be considered O(N + M).

Space Complexity:

  • O(N) to store the three hash tables. The space used is proportional to the number of words in wordlist.

Code Implementation (Python):

class Solution:
    def spellchecker(self, wordlist, queries):
        s = set(wordlist)
        low = {}
        pat = {}
        vowels = "aeiou"
 
        def replace_vowels(word):
            return "".join(['*' if c in vowels else c for c in word])
 
        for word in wordlist:
            lower_word = word.lower()
            low[lower_word] = word  # Overwrite duplicates
            pat[replace_vowels(lower_word)] = word # Overwrite duplicates
 
        ans = []
        for query in queries:
            if query in s:
                ans.append(query)
            elif query.lower() in low:
                ans.append(low[query.lower()])
            elif replace_vowels(query.lower()) in pat:
                ans.append(pat[replace_vowels(query.lower())])
            else:
                ans.append("")
 
        return ans

The code in other languages (Java, C++, Go) follows a similar structure and logic, differing primarily in syntax and specific data structure implementations. The core idea of using hash tables for efficient lookups remains consistent across all languages.