This problem asks us to find all substrings of a given text that are present in a given word list and return their starting and ending indices as pairs. The solution can be approached using either a brute-force method or a Trie data structure for optimization.
Approach 1: Brute-Force (Python)
This approach iterates through all possible substrings of the text and checks if each substring is present in the words
set. It leverages Python's efficient set membership check.
class Solution:
def indexPairs(self, text: str, words: List[str]) -> List[List[int]]:
words = set(words)
n = len(text)
return [
[i, j] for i in range(n) for j in range(i, n) if text[i : j + 1] in words
]
words
list. This is because we generate all possible substrings (n^2) and check each against the words
set (m on average).words
set and n is the space taken by the resulting list of index pairs.Approach 2: Trie (Java, C++, Go)
Using a Trie significantly improves the efficiency. A Trie is a tree-like data structure optimized for string prefix searches. We insert all words from the words
array into the Trie. Then we traverse the text, checking if each prefix is a valid word in the Trie.
Code Examples:
The following code snippets demonstrate the Trie approach in Java, C++, and Go. The core logic remains the same: build a Trie, traverse the text, and collect index pairs.
Java:
class Trie {
Trie[] children = new Trie[26];
boolean isEnd = false;
// ... (Trie insert method) ...
}
class Solution {
public int[][] indexPairs(String text, String[] words) {
Trie trie = new Trie();
for (String w : words) trie.insert(w); // Insert words into Trie
int n = text.length();
List<int[]> ans = new ArrayList<>();
for (int i = 0; i < n; ++i) {
Trie node = trie;
for (int j = i; j < n; ++j) {
int idx = text.charAt(j) - 'a';
if (node.children[idx] == null) break;
node = node.children[idx];
if (node.isEnd) ans.add(new int[] {i, j});
}
}
return ans.toArray(new int[ans.size()][2]);
}
}
C++:
class Trie {
public:
vector<Trie*> children;
bool isEnd = false;
Trie() { children.resize(26); }
void insert(string word) { /* ... */ } //Insert method
};
class Solution {
public:
vector<vector<int>> indexPairs(string text, vector<string>& words) {
Trie* trie = new Trie();
for (auto w : words) trie->insert(w);
int n = text.size();
vector<vector<int>> ans;
for (int i = 0; i < n; ++i) {
Trie* node = trie;
for (int j = i; j < n; ++j) {
int idx = text[j] - 'a';
if (!node->children[idx]) break;
node = node->children[idx];
if (node->isEnd) ans.push_back({i, j});
}
}
return ans;
}
};
Go:
type Trie struct {
children [26]*Trie
isEnd bool
}
func (t *Trie) insert(word string) {
//...
}
func indexPairs(text string, words []string) [][]int {
//..Build Trie, similar to Java and C++
}
Conclusion:
The Trie-based approach offers a significant improvement in time complexity compared to the brute-force method, making it more efficient for larger inputs. The choice of the appropriate approach depends on the constraints of the problem and the expected input size. For smaller inputs, the simpler brute-force approach may suffice. For larger inputs, the Trie-based approach is highly recommended for its efficiency.