Given a wordlist
, we want to implement a spellchecker that converts a query word into a correct word.
For a given query
word, the spell checker handles two categories of spelling mistakes:
wordlist = ["yellow"]
, query = "YellOw"
: correct = "yellow"
wordlist = ["Yellow"]
, query = "yellow"
: correct = "Yellow"
wordlist = ["yellow"]
, query = "yellow"
: correct = "yellow"
('a', 'e', 'i', 'o', 'u')
of the query word with any vowel individually, it matches a word in the wordlist (case-insensitive), then the query word is returned with the same case as the match in the wordlist.
wordlist = ["YellOw"]
, query = "yollow"
: correct = "YellOw"
wordlist = ["YellOw"]
, query = "yeellow"
: correct = ""
(no match)wordlist = ["YellOw"]
, query = "yllw"
: correct = ""
(no match)In addition, the spell checker operates under the following precedence rules:
Given some queries
, return a list of words answer
, where answer[i]
is the correct word for query = queries[i]
.
Example 1:
Input: wordlist = ["KiTe","kite","hare","Hare"], queries = ["kite","Kite","KiTe","Hare","HARE","Hear","hear","keti","keet","keto"] Output: ["kite","KiTe","KiTe","Hare","hare","","","KiTe","","KiTe"]
Example 2:
Input: wordlist = ["yellow"], queries = ["YellOw"] Output: ["yellow"]
Constraints:
1 <= wordlist.length, queries.length <= 5000
1 <= wordlist[i].length, queries[i].length <= 7
wordlist[i]
and queries[i]
consist only of only English letters.This problem requires implementing a spellchecker that corrects query words based on capitalization and vowel errors, adhering to specific precedence rules. The optimal solution uses hash tables for efficient lookups.
Approach:
Data Structures: We employ three hash tables:
s
: A set
(or unordered_set
in C++) to store all words from wordlist
for case-sensitive exact matches. This provides O(1) lookup time.low
: A map
(or unordered_map
in C++) to store lowercase versions of words as keys, with the original (case-preserving) word as the value. This handles capitalization errors.pat
: A map
to store words with vowels replaced by '*' as keys and the original word as the value. This handles vowel errors.Preprocessing: We iterate through wordlist
once:
s
.low
, overwriting if a lowercase duplicate exists (we only need the first occurrence).pat
, again overwriting duplicates.Query Processing: For each query word:
s
. If it is, this is the exact match, and we use it.low
, we've found a capitalization match; use the value from low
.pat
, we have a vowel error match; use the value from pat
.Time Complexity:
wordlist
and L is the maximum length of a word. The vowel replacement in the preprocessing step is linear to the length of the word.queries
. Each query involves a constant number of lookups in hash tables (O(1) on average), and potentially a linear-time vowel replacement in the worst case.Therefore, the overall time complexity is O(N * L + M * L). Since L is relatively small and constant compared to N and M, it can be considered O(N + M).
Space Complexity:
wordlist
.Code Implementation (Python):
class Solution:
def spellchecker(self, wordlist, queries):
s = set(wordlist)
low = {}
pat = {}
vowels = "aeiou"
def replace_vowels(word):
return "".join(['*' if c in vowels else c for c in word])
for word in wordlist:
lower_word = word.lower()
low[lower_word] = word # Overwrite duplicates
pat[replace_vowels(lower_word)] = word # Overwrite duplicates
ans = []
for query in queries:
if query in s:
ans.append(query)
elif query.lower() in low:
ans.append(low[query.lower()])
elif replace_vowels(query.lower()) in pat:
ans.append(pat[replace_vowels(query.lower())])
else:
ans.append("")
return ans
The code in other languages (Java, C++, Go) follows a similar structure and logic, differing primarily in syntax and specific data structure implementations. The core idea of using hash tables for efficient lookups remains consistent across all languages.