{x}
blog image

Find Substring With Given Hash Value

The hash of a 0-indexed string s of length k, given integers p and m, is computed using the following function:

  • hash(s, p, m) = (val(s[0]) * p0 + val(s[1]) * p1 + ... + val(s[k-1]) * pk-1) mod m.

Where val(s[i]) represents the index of s[i] in the alphabet from val('a') = 1 to val('z') = 26.

You are given a string s and the integers power, modulo, k, and hashValue. Return sub, the first substring of s of length k such that hash(sub, power, modulo) == hashValue.

The test cases will be generated such that an answer always exists.

A substring is a contiguous non-empty sequence of characters within a string.

 

Example 1:

Input: s = "leetcode", power = 7, modulo = 20, k = 2, hashValue = 0
Output: "ee"
Explanation: The hash of "ee" can be computed to be hash("ee", 7, 20) = (5 * 1 + 5 * 7) mod 20 = 40 mod 20 = 0. 
"ee" is the first substring of length 2 with hashValue 0. Hence, we return "ee".

Example 2:

Input: s = "fbxzaad", power = 31, modulo = 100, k = 3, hashValue = 32
Output: "fbx"
Explanation: The hash of "fbx" can be computed to be hash("fbx", 31, 100) = (6 * 1 + 2 * 31 + 24 * 312) mod 100 = 23132 mod 100 = 32. 
The hash of "bxz" can be computed to be hash("bxz", 31, 100) = (2 * 1 + 24 * 31 + 26 * 312) mod 100 = 25732 mod 100 = 32. 
"fbx" is the first substring of length 3 with hashValue 32. Hence, we return "fbx".
Note that "bxz" also has a hash of 32 but it appears later than "fbx".

 

Constraints:

  • 1 <= k <= s.length <= 2 * 104
  • 1 <= power, modulo <= 109
  • 0 <= hashValue < modulo
  • s consists of lowercase English letters only.
  • The test cases are generated such that an answer always exists.

Solution Explanation: Finding Substring with Given Hash Value

This problem requires finding the first substring of length k within a given string s that matches a target hashValue. The hash function is defined as:

hash(s, p, m) = (val(s[0]) * p<sup>0</sup> + val(s[1]) * p<sup>1</sup> + ... + val(s[k-1]) * p<sup>k-1</sup>) mod m

where val(s[i]) is the alphabetical index of the character (a=1, b=2,... z=26), p is the power, and m is the modulo.

A naive approach would calculate the hash for every substring of length k, which has a time complexity of O(n*k), where n is the length of the string. However, we can optimize this using a sliding window and rolling hash technique.

Optimized Approach (Sliding Window and Rolling Hash)

The core idea is to efficiently update the hash value as the window slides along the string, avoiding redundant calculations. We process the string from right to left to simplify the hash update.

  1. Initial Hash: Calculate the hash of the last k characters of the string.

  2. Sliding Window: Iterate through the string from right to left (index i from n-k-1 to 0).

  3. Hash Update: For each iteration:

    • Remove the contribution of the character leaving the window (at i+k).
    • Add the contribution of the character entering the window (at i).
    • This update is done efficiently using modular arithmetic.
  4. Check and Return: If the updated hash matches hashValue, the substring starting at index i is the solution.

Time and Space Complexity

  • Time Complexity: O(n), where n is the length of the string. We iterate through the string once. The hash update within the loop is constant time.
  • Space Complexity: O(1). We use a constant amount of extra space to store the current hash and other variables.

Code Implementation (Python)

class Solution:
    def subStrHash(self, s, power, modulo, k, hashValue):
        h, n = 0, len(s)
        p = 1  # power^0 = 1. We use this to avoid repetitive calculation of powers.
        for i in range(n - 1, n - 1 - k, -1):
            val = ord(s[i]) - ord("a") + 1
            h = ((h * power) + val) % modulo
            if i != n - k:
                p = (p * power) % modulo # update p for next iteration
 
        j = n - k # Initialize j as the start index of the last substring
        for i in range(n - 1 - k, -1, -1):
            pre = ord(s[i + k]) - ord("a") + 1 #char leaving the window
            cur = ord(s[i]) - ord("a") + 1 #char entering the window
            h = ((h - (pre * p) % modulo + modulo) * power + cur) % modulo # efficient hash update
            if h == hashValue:
                j = i
        return s[j : j + k]

The code in other languages (Java, C++, Go, TypeScript, JavaScript) follows a similar structure and logic, adapting syntax and data types as needed. The core algorithm remains the same, offering efficient computation by leveraging rolling hash and sliding window techniques. The use of BigInt in Javascript and Typescript is necessary to handle potentially large hash values to avoid integer overflow.