{x}
blog image

Occurrences After Bigram

Given two strings first and second, consider occurrences in some text of the form "first second third", where second comes immediately after first, and third comes immediately after second.

Return an array of all the words third for each occurrence of "first second third".

 

Example 1:

Input: text = "alice is a good girl she is a good student", first = "a", second = "good"
Output: ["girl","student"]

Example 2:

Input: text = "we will we will rock you", first = "we", second = "will"
Output: ["we","rock"]

 

Constraints:

  • 1 <= text.length <= 1000
  • text consists of lowercase English letters and spaces.
  • All the words in text are separated by a single space.
  • 1 <= first.length, second.length <= 10
  • first and second consist of lowercase English letters.
  • text will not have any leading or trailing spaces.

Solution Explanation for LeetCode 1078: Occurrences After Bigram

This problem asks us to find all words that appear immediately after a specific bigram (a sequence of two words) within a given text.

Approach:

The most straightforward approach involves iterating through the words in the text. We split the text into words, and then use a sliding window of size 3 to check for occurrences of the target bigram. If the bigram is found, we append the third word in the window to the result.

Time Complexity Analysis:

The algorithm iterates through the words in the text once. Splitting the text into words takes O(N) time, where N is the number of words. The subsequent iteration also takes O(N) time. Therefore, the overall time complexity is O(N), which is linear with respect to the number of words in the input text.

Space Complexity Analysis:

The space complexity is determined by the size of the output array ans, which stores the third words following the bigram. In the worst-case scenario, where every bigram is a match, the size of ans could be proportional to the number of words in the input. Therefore, the space complexity is O(N) in the worst case, and O(1) in the best case (no matches).

Code Explanation (Python):

class Solution:
    def findOcurrences(self, text: str, first: str, second: str) -> List[str]:
        words = text.split()  #Split the text into a list of words. O(N)
        ans = [] # Initialize an empty list to store the results
        for i in range(len(words) - 2): # Iterate up to the third to last word. O(N)
            a, b, c = words[i : i + 3] # Create a sliding window of three words
            if a == first and b == second: # Check if the bigram is found
                ans.append(c) #Append the third word to results
        return ans #Return the list of third words

Code Explanation (Java):

class Solution {
    public String[] findOcurrences(String text, String first, String second) {
        String[] words = text.split(" "); //Split the text into an array of words
        List<String> ans = new ArrayList<>(); //Initialize an ArrayList to store results
        for (int i = 0; i < words.length - 2; ++i) { //Iterate up to the third to last word
            if (first.equals(words[i]) && second.equals(words[i + 1])) { //Check if the bigram is found
                ans.add(words[i + 2]); // Add the third word to the ArrayList
            }
        }
        return ans.toArray(new String[0]); // Convert ArrayList to String array and return
    }
}

The other languages (C++, Go, TypeScript) follow a very similar pattern: they split the input string into words, iterate through the words using a sliding window of size 3, and check for the bigram to collect the words that follow. The core logic remains the same across all implementations.