{x}
blog image

Add Bold Tag in String

Solution Explanation

This problem involves adding bold tags (<b> and </b>) around substrings of a given string s that are present in a list of words words. Overlapping substrings should be merged into a single bold-tagged section.

The solution utilizes a Trie data structure for efficient substring searching. Here's a breakdown:

1. Trie Construction:

A Trie is built from the words in the words array. Each Trie node represents a character, and paths through the Trie represent words. This allows for quick checking if a substring is present in words.

2. Bold Tag Identification:

The code iterates through the input string s. For each character, it traverses the Trie, checking if a substring starting at that character exists in the Trie. If a complete word (marked by is_end = True in the Trie node) is found, the start and end indices of the word are stored in the pairs list.

3. Merging Overlapping Bold Tags:

The pairs list contains potentially overlapping intervals. The code then processes pairs to merge overlapping intervals. It iterates through the list, combining adjacent or overlapping intervals into a single, larger interval. The result is a list of non-overlapping intervals (t) where bold tags should be applied.

4. Adding Bold Tags:

Finally, the code iterates through the string s and adds the <b> and </b> tags around the substrings defined by the intervals in t.

Time Complexity Analysis:

  • Trie Construction: Building the Trie takes O(M), where M is the total number of characters in all words in the words array. Inserting each word into the Trie takes time proportional to its length.
  • Substring Search: Searching for substrings in s using the Trie takes O(N*K), where N is the length of s and K is the maximum length of a word in words. In the worst case, each character in s could potentially lead to checking a complete word in the Trie.
  • Merging Intervals: Merging the overlapping intervals in pairs takes O(P log P) if a sorting algorithm is used (where P is the number of pairs found). However, a linear scan can achieve O(P) complexity.
  • Adding Tags: Adding the tags takes O(N) time, as it involves iterating through the string once.

Therefore, the overall time complexity is dominated by the substring search and potentially the merging step, resulting in O(N*K + P log P) or O(N*K + P), where P is the number of pairs found, which is at most O(N). In a simplified analysis, considering K as a constant, the time complexity is effectively O(N).

Space Complexity Analysis:

  • Trie: The Trie's space complexity is O(M), where M is the total number of characters in all words in the words array.
  • pairs and t lists: These lists can store up to O(N) pairs in the worst-case scenario.
  • Output String: The output string's length is at most O(N), to accommodate the added tags.

Therefore, the overall space complexity is O(N + M). If we consider the size of the words to be relatively small compared to the length of the input string, the space complexity simplifies to O(N).

The provided code examples in Python, Java, C++, and Go all implement this solution with the described time and space complexities. The choice of language doesn't significantly alter the algorithmic approach or its efficiency.