{x}
blog image

Longest Common Subpath

There is a country of n cities numbered from 0 to n - 1. In this country, there is a road connecting every pair of cities.

There are m friends numbered from 0 to m - 1 who are traveling through the country. Each one of them will take a path consisting of some cities. Each path is represented by an integer array that contains the visited cities in order. The path may contain a city more than once, but the same city will not be listed consecutively.

Given an integer n and a 2D integer array paths where paths[i] is an integer array representing the path of the ith friend, return the length of the longest common subpath that is shared by every friend's path, or 0 if there is no common subpath at all.

A subpath of a path is a contiguous sequence of cities within that path.

 

Example 1:

Input: n = 5, paths = [[0,1,2,3,4],
                       [2,3,4],
                       [4,0,1,2,3]]
Output: 2
Explanation: The longest common subpath is [2,3].

Example 2:

Input: n = 3, paths = [[0],[1],[2]]
Output: 0
Explanation: There is no common subpath shared by the three paths.

Example 3:

Input: n = 5, paths = [[0,1,2,3,4],
                       [4,3,2,1,0]]
Output: 1
Explanation: The possible longest common subpaths are [0], [1], [2], [3], and [4]. All have a length of 1.

 

Constraints:

  • 1 <= n <= 105
  • m == paths.length
  • 2 <= m <= 105
  • sum(paths[i].length) <= 105
  • 0 <= paths[i][j] < n
  • The same city is not listed multiple times consecutively in paths[i].

Solution Explanation for Longest Common Subpath

This problem asks to find the length of the longest common subpath among multiple paths. A naive approach would have a very high time complexity. The optimal solution uses a combination of rolling hashing and binary search.

Core Idea:

  1. Rolling Hash: We use a rolling hash function to efficiently compute the hash value of all subpaths of length k in each path. The rolling hash allows us to compute the hash of the next subpath in O(1) time instead of O(k) time, which is crucial for performance.

  2. Binary Search: We perform a binary search on the length k of the potential common subpaths. For a given k, we check if there exists a subpath of length k common to all paths. If such a subpath exists, we try a larger k; otherwise, we try a smaller k. This significantly reduces the search space.

  3. Hashing Subpaths: For each path and length k, we compute the hashes of all subpaths of length k. Then, we count the frequency of each hash value across all paths. If a hash has a frequency equal to the number of paths, we've found a common subpath of length k.

Time Complexity Analysis:

  • Rolling Hash Computation: For each path and each length k, we compute hashes for at most len(path) - k + 1 subpaths. This takes O(len(path)) time per path, and summing over all paths gives a time complexity of O(sum(len(path))).

  • Binary Search: The binary search iterates through at most O(log(min(len(path)))) steps.

  • Frequency Counting: In the worst case, we have O(sum(len(path))) unique hash values. Counting their frequencies takes O(sum(len(path))) time.

Therefore, the overall time complexity is dominated by the rolling hash computation and frequency counting, which is approximately O(sum(len(path)) * log(min(len(path)))). The logarithmic factor comes from the binary search. Note that sum(len(path)) is the total number of elements in all the paths, which is given as being less than or equal to 105 in the constraints.

Space Complexity Analysis:

The space complexity is dominated by the storage of hashes and counts which is approximately O(sum(len(path))) in the worst case.

Code Explanation (Python):

class Solution:
    def longestCommonSubpath(self, n: int, paths: List[List[int]]) -> int:
        def check(k: int) -> bool:
            cnt = Counter() #Counts hash frequencies across paths
            for h in hh: #Iterates through precomputed rolling hashes for each path
                vis = set() #Keeps track of seen hashes for a given path
                for i in range(1, len(h) - k + 1): #Iterates through subpaths
                    j = i + k - 1
                    x = (h[j] - h[i - 1] * p[j - i + 1]) % mod #Rolling hash calculation
                    if x not in vis:
                        vis.add(x)
                        cnt[x] += 1 #Increment hash count if unique for a given path
            return max(cnt.values()) == m #Check if any hash appears in all paths
 
 
        m = len(paths)
        mx = max(len(path) for path in paths) #Find maximum path length
        base = 133331  #Hash base
        mod = 2**64 + 1  #Hash modulus for larger numbers
        p = [0] * (mx + 1)  #Precompute powers of the hash base
        p[0] = 1
        for i in range(1, len(p)):
            p[i] = p[i - 1] * base % mod
        hh = [] #Store precomputed rolling hashes for each path
        for path in paths:
            k = len(path)
            h = [0] * (k + 1)
            for i, x in enumerate(path, 1):
                h[i] = h[i - 1] * base % mod + x
            hh.append(h)
        l, r = 0, min(len(path) for path in paths) #Binary search range
        while l < r:
            mid = (l + r + 1) >> 1
            if check(mid):
                l = mid
            else:
                r = mid - 1
        return l
 

The Java code follows a similar structure, using the same algorithms and data structures. The key differences lie in syntax and how some data structures (like Counter) are handled differently in Java compared to Python.

This detailed explanation clarifies the approach, complexity, and implementation of an efficient solution to the Longest Common Subpath problem. The use of rolling hashing and binary search dramatically improves performance compared to brute-force methods.