{x}
blog image

Stream of Characters

Design an algorithm that accepts a stream of characters and checks if a suffix of these characters is a string of a given array of strings words.

For example, if words = ["abc", "xyz"] and the stream added the four characters (one by one) 'a', 'x', 'y', and 'z', your algorithm should detect that the suffix "xyz" of the characters "axyz" matches "xyz" from words.

Implement the StreamChecker class:

  • StreamChecker(String[] words) Initializes the object with the strings array words.
  • boolean query(char letter) Accepts a new character from the stream and returns true if any non-empty suffix from the stream forms a word that is in words.

 

Example 1:

Input
["StreamChecker", "query", "query", "query", "query", "query", "query", "query", "query", "query", "query", "query", "query"]
[[["cd", "f", "kl"]], ["a"], ["b"], ["c"], ["d"], ["e"], ["f"], ["g"], ["h"], ["i"], ["j"], ["k"], ["l"]]
Output
[null, false, false, false, true, false, true, false, false, false, false, false, true]

Explanation
StreamChecker streamChecker = new StreamChecker(["cd", "f", "kl"]);
streamChecker.query("a"); // return False
streamChecker.query("b"); // return False
streamChecker.query("c"); // return False
streamChecker.query("d"); // return True, because 'cd' is in the wordlist
streamChecker.query("e"); // return False
streamChecker.query("f"); // return True, because 'f' is in the wordlist
streamChecker.query("g"); // return False
streamChecker.query("h"); // return False
streamChecker.query("i"); // return False
streamChecker.query("j"); // return False
streamChecker.query("k"); // return False
streamChecker.query("l"); // return True, because 'kl' is in the wordlist

 

Constraints:

  • 1 <= words.length <= 2000
  • 1 <= words[i].length <= 200
  • words[i] consists of lowercase English letters.
  • letter is a lowercase English letter.
  • At most 4 * 104 calls will be made to query.

Solution Explanation:

This problem requires designing a StreamChecker that efficiently checks if a suffix of a character stream matches any word in a given list. A Trie data structure is ideal for this because it allows for fast prefix (and in this case, suffix) searching.

Approach:

  1. Trie Construction: We build a Trie where each node represents a character, and words are inserted in reverse order. This allows us to efficiently check suffixes by traversing the Trie from the end of the stream.

  2. query() Method: The query() method appends the incoming character to a character stream (cs in Python, sb in Java, s in C++ and Go). It then uses the Trie's search method to check if any suffix of the stream (up to a certain length to avoid excessive computation) is a word present in the Trie. The suffix is checked in reverse.

  3. Time Complexity Analysis:

    • __init__ / Constructor: The time complexity for building the Trie is O(Σ len(word)), where the summation is across all words. In the worst case this becomes O(M * K) where M is the number of words and K is the length of the longest word.
    • query: The time complexity of the query() method is O(L), where L is the length of the current stream (or more precisely, the length of the suffix checked, which is capped at a reasonable length in the code). In the worst case it could be O(K) where K is the maximum length of words. This is because we traverse at most the length of the longest suffix stored in the cs array (limited to self.limit in Python to avoid unbounded growth) in the Trie to check for a match.
  4. Space Complexity Analysis: The space complexity is dominated by the Trie. In the worst case, if all words are of length K and no common prefixes exist, the Trie will have O(M*K) nodes (where M is the number of words). The space for storing the character stream cs is also a factor, but we limit its size (self.limit in Python) to a fixed value to prevent unbounded growth.

Code Explanation (Python):

The Python code demonstrates the approach clearly:

  • Trie Class: Implements the Trie with insert() to add words (in reverse) and search() to check for suffixes in the Trie.
  • StreamChecker Class:
    • The constructor creates the Trie and initializes a list cs to store the character stream.
    • query(letter) adds the letter to cs, limits the length of cs (to handle a large stream of characters), and uses trie.search() to check for a suffix match.

The Java, C++, and Go implementations follow the same logic but with the syntax specific to their respective languages. They all utilize a Trie data structure for efficient suffix checking within the character stream.

Example Usage:

In the provided example, StreamChecker is initialized with words ["cd", "f", "kl"]. The query() calls simulate a character stream. The outputs accurately reflect whether a suffix matches any of the given words.