{x}
blog image

Index Pairs of a String

Solution Explanation for LeetCode 1065: Index Pairs of a String

This problem asks us to find all substrings of a given text that are present in a given word list and return their starting and ending indices as pairs. The solution can be approached using either a brute-force method or a Trie data structure for optimization.

Approach 1: Brute-Force (Python)

This approach iterates through all possible substrings of the text and checks if each substring is present in the words set. It leverages Python's efficient set membership check.

class Solution:
    def indexPairs(self, text: str, words: List[str]) -> List[List[int]]:
        words = set(words)
        n = len(text)
        return [
            [i, j] for i in range(n) for j in range(i, n) if text[i : j + 1] in words
        ]
  • Time Complexity: O(m*n^2), where n is the length of the text and m is the average length of words in the words list. This is because we generate all possible substrings (n^2) and check each against the words set (m on average).
  • Space Complexity: O(m+n), where m is the space taken by the words set and n is the space taken by the resulting list of index pairs.

Approach 2: Trie (Java, C++, Go)

Using a Trie significantly improves the efficiency. A Trie is a tree-like data structure optimized for string prefix searches. We insert all words from the words array into the Trie. Then we traverse the text, checking if each prefix is a valid word in the Trie.

Code Examples:

The following code snippets demonstrate the Trie approach in Java, C++, and Go. The core logic remains the same: build a Trie, traverse the text, and collect index pairs.

Java:

class Trie {
    Trie[] children = new Trie[26];
    boolean isEnd = false;
    // ... (Trie insert method) ...
}
 
class Solution {
    public int[][] indexPairs(String text, String[] words) {
        Trie trie = new Trie();
        for (String w : words) trie.insert(w); // Insert words into Trie
        int n = text.length();
        List<int[]> ans = new ArrayList<>();
        for (int i = 0; i < n; ++i) {
            Trie node = trie;
            for (int j = i; j < n; ++j) {
                int idx = text.charAt(j) - 'a';
                if (node.children[idx] == null) break;
                node = node.children[idx];
                if (node.isEnd) ans.add(new int[] {i, j});
            }
        }
        return ans.toArray(new int[ans.size()][2]);
    }
}

C++:

class Trie {
public:
    vector<Trie*> children;
    bool isEnd = false;
    Trie() { children.resize(26); }
    void insert(string word) { /* ... */ } //Insert method
};
 
class Solution {
public:
    vector<vector<int>> indexPairs(string text, vector<string>& words) {
        Trie* trie = new Trie();
        for (auto w : words) trie->insert(w);
        int n = text.size();
        vector<vector<int>> ans;
        for (int i = 0; i < n; ++i) {
            Trie* node = trie;
            for (int j = i; j < n; ++j) {
                int idx = text[j] - 'a';
                if (!node->children[idx]) break;
                node = node->children[idx];
                if (node->isEnd) ans.push_back({i, j});
            }
        }
        return ans;
    }
};
 

Go:

type Trie struct {
	children [26]*Trie
	isEnd    bool
}
 
func (t *Trie) insert(word string) {
    //...
}
 
func indexPairs(text string, words []string) [][]int {
    //..Build Trie, similar to Java and C++
}
  • Time Complexity: O(n*m) where n is the length of the text and m is the maximum length of words in words list. In the worst case, each character in the text could lead to traversing a path of length m in the Trie.
  • Space Complexity: O(sum of lengths of all words). The space is dominated by the Trie, which stores all words.

Conclusion:

The Trie-based approach offers a significant improvement in time complexity compared to the brute-force method, making it more efficient for larger inputs. The choice of the appropriate approach depends on the constraints of the problem and the expected input size. For smaller inputs, the simpler brute-force approach may suffice. For larger inputs, the Trie-based approach is highly recommended for its efficiency.