{x}
blog image

Longest Happy Prefix

A string is called a happy prefix if is a non-empty prefix which is also a suffix (excluding itself).

Given a string s, return the longest happy prefix of s. Return an empty string "" if no such prefix exists.

 

Example 1:

Input: s = "level"
Output: "l"
Explanation: s contains 4 prefix excluding itself ("l", "le", "lev", "leve"), and suffix ("l", "el", "vel", "evel"). The largest prefix which is also suffix is given by "l".

Example 2:

Input: s = "ababab"
Output: "abab"
Explanation: "abab" is the largest prefix which is also suffix. They can overlap in the original string.

 

Constraints:

  • 1 <= s.length <= 105
  • s contains only lowercase English letters.

Solution Explanation for Longest Happy Prefix

The problem asks to find the longest prefix of a string that is also a suffix (excluding the string itself). Two efficient solutions are presented: String Hashing and the Knuth-Morris-Pratt (KMP) algorithm.

Solution 1: String Hashing

This approach leverages string hashing to efficiently compare prefixes and suffixes. It works as follows:

  1. Hashing: A rolling hash function is used to compute the hash values of all prefixes and suffixes of the input string s. The rolling hash allows for efficient computation of the hash of a substring without recomputing the entire hash every time. A common choice for the base is a prime number (like 131 or 13331) to minimize collisions.

  2. Comparison: The algorithm iterates through possible prefix lengths (l), comparing the hash of the prefix of length l with the hash of the suffix of length l. If the hashes match, it means the prefix and suffix are identical, and the prefix of length l is a happy prefix.

  3. Longest Prefix: The algorithm keeps track of the longest happy prefix found so far and returns it.

Time Complexity: O(n), where n is the length of the string. The hashing and comparison steps each take linear time.

Space Complexity: O(n) due to the arrays used for storing hash values.

Solution 2: Knuth-Morris-Pratt (KMP) Algorithm

The KMP algorithm is a string-searching algorithm that can be adapted to find the longest happy prefix. It works by pre-processing the input string to create a "next" array. The next[i] value indicates the length of the longest proper prefix of the string s[0...i] which is also a suffix of s[0...i].

  1. Prefix Function (Next Array): The KMP algorithm computes the next array. next[i] stores the index of the longest proper prefix of s[0...i] that is also a suffix of s[0...i]. If no such prefix exists, next[i] is 0.

  2. Longest Happy Prefix: The last element of the next array (next[n-1], where n is the length of the string plus a special character) directly represents the length of the longest happy prefix. The prefix of that length is then extracted from the original string.

Time Complexity: O(n) because the prefix function computation and the search itself are both linear.

Space Complexity: O(n) for the next array.

Code Examples (Python)

Solution 1 (String Hashing): (Note: This Python implementation uses a simpler substring comparison instead of a full rolling hash for brevity, but the concept remains the same. A true rolling hash would be more efficient for extremely large strings.)

class Solution:
    def longestPrefix(self, s: str) -> str:
        for i in range(len(s) - 1, 0, -1):
            if s[:i] == s[len(s) - i:]:
                return s[:i]
        return ""
 

Solution 2 (KMP):

class Solution:
    def longestPrefix(self, s: str) -> str:
        s += "#"  # Add a special character to handle the last element in next array.
        n = len(s)
        next = [0] * n
        next[0] = -1
        i, j = 1, 0
        while i < n:
            if s[i] == s[j]:
                j += 1
                next[i] = j
            else:
                j = next[j]
            i += 1
        return s[:next[n-1]]
 

The other language examples (Java, C++, Go, TypeScript, Rust) implement the same two algorithms, adapting the syntax to each language's specific features. The KMP approach is generally preferred for its clarity and guaranteed linear time complexity. However, for very large strings, a highly optimized rolling hash might offer a slight performance advantage in practice.