{x}
blog image

Sequence Reconstruction

Solution Explanation: Sequence Reconstruction

This problem asks whether a given integer array nums is the unique shortest supersequence of a set of subsequences provided in sequences. A supersequence contains all sequences in sequences as subsequences. The solution uses topological sorting to efficiently determine this.

Approach:

  1. Build a Directed Acyclic Graph (DAG): We create a directed graph where each node represents a number in the range [1, n] (where n is the length of nums). A directed edge from node a to node b indicates that b must appear after a in any valid supersequence. This is derived from the input sequences. If sequences contains [a, b], there is an edge from a to b.

  2. In-degree Calculation: For each node, we calculate its in-degree (the number of incoming edges). This represents how many numbers must precede it in any valid supersequence.

  3. Topological Sort: We perform a topological sort using Kahn's algorithm. This algorithm starts by adding nodes with an in-degree of 0 to a queue. We process nodes one by one, removing them from the graph and decreasing the in-degree of their neighbors. If, at any point, the queue contains more than one node, it means multiple topological orderings (and therefore multiple shortest supersequences) are possible.

  4. Uniqueness Check: If the topological sort completes and the queue is empty, it means there's only one possible topological ordering, indicating that nums is the unique shortest supersequence. If the queue is not empty at the end, it means there are multiple possible supersequences.

Time Complexity Analysis:

  • Building the graph takes O(sum(lengths of sequences)) time, where the sum is taken over all subsequences in the sequences array.
  • Topological sorting takes O(V + E) time, where V is the number of vertices (n) and E is the number of edges (at most n² but typically less).
  • Therefore, the overall time complexity is dominated by the graph construction and topological sort, resulting in O(n + sum(lengths of sequences)).

Space Complexity Analysis:

  • The graph requires O(n + E) space to store the adjacency list.
  • The indeg array takes O(n) space.
  • The queue takes O(n) space in the worst case.
  • Thus, the overall space complexity is O(n + E).

Code Implementation (Python):

from collections import deque
 
def sequenceReconstruction(nums, sequences):
    n = len(nums)
    graph = [[] for _ in range(n)]
    in_degree = [0] * n
 
    # Build the graph and calculate in-degrees
    for seq in sequences:
        for i in range(len(seq) - 1):
            u = seq[i] - 1
            v = seq[i+1] - 1
            graph[u].append(v)
            in_degree[v] += 1
 
    # Topological Sort
    queue = deque([i for i, degree in enumerate(in_degree) if degree == 0])
    result = []
    while len(queue) == 1: #Key condition for uniqueness check.  Stops early if multiple paths exist.
        node = queue.popleft()
        result.append(node + 1)  # Add 1 to get back to original numbering
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    
    return len(queue) == 0 and result == nums # Check if the topological sort produced nums
 

The other language implementations (Java, C++, Go, TypeScript) follow the same algorithmic structure, only differing in syntax and data structures. The core idea of building the DAG, calculating in-degrees, performing topological sort, and checking for uniqueness remains consistent across all implementations.