This problem asks to design a data structure that efficiently finds the shortest distance between any two words in a given list. A naive approach would involve repeatedly scanning the list for each query, resulting in a high time complexity. The optimal solution uses a hash map to pre-process the data, enabling fast lookups for subsequent queries.
Preprocessing:
wordsDict
, and values are lists of their indices in the input array. This step allows for O(1) average-case lookup time for the index of each word.Querying:
shortest
query:
word1
and word2
from the hash map.ans
). This approach avoids redundant comparisons and ensures the minimum distance is found.Time Complexity:
__init__
or Constructor
): O(N), where N is the length of wordsDict
. This is because we iterate through the input array once to build the hash map.shortest
): O(M + K), where M and K are the number of occurrences of word1
and word2
respectively in wordsDict
. In the worst case (words appearing at the beginning and end of the array), the two-pointer approach iterates through both lists. However, on average, the time complexity will be significantly less than O(N).Space Complexity:
wordsDict
. In the worst case, each word appears only once, resulting in N entries in the hash map.class WordDistance:
def __init__(self, wordsDict: List[str]):
self.d = defaultdict(list) # Hash map (dictionary) to store word indices
for i, w in enumerate(wordsDict):
self.d[w].append(i) # Store each word's index
def shortest(self, word1: str, word2: str) -> int:
a, b = self.d[word1], self.d[word2] # Get index lists for the words
ans = inf # Initialize minimum distance to infinity
i = j = 0 # Initialize pointers for the index lists
while i < len(a) and j < len(b): # Two-pointer iteration
ans = min(ans, abs(a[i] - b[j])) # Update minimum distance
if a[i] <= b[j]:
i += 1 # Move pointer for the smaller index
else:
j += 1
return ans
The code in other languages (Java, C++, Go) follows the same logic and uses equivalent data structures to achieve the same result. The key is the use of a hash map for efficient word lookup and a two-pointer approach for optimized distance calculation.