{x}
blog image

Dot Product of Two Sparse Vectors

Solution Explanation

This problem asks to compute the dot product of two sparse vectors efficiently. A sparse vector is a vector with mostly zero values. A naive approach would iterate through all elements, leading to O(n) time complexity where n is the length of the vectors. However, for sparse vectors, this is inefficient. The optimal solution leverages hash tables (dictionaries in Python) to store only non-zero elements and their indices.

Approach

The solution uses a hash table (or map) to store only the non-zero elements of the vector, along with their indices. This significantly reduces the storage space required for sparse vectors. The dot product is then computed by iterating through the smaller hash table and looking up corresponding elements in the larger hash table.

Code Explanation (Python)

class SparseVector:
    def __init__(self, nums: List[int]):
        self.d = {i: v for i, v in enumerate(nums) if v}
 
    # Return the dotProduct of two sparse vectors
    def dotProduct(self, vec: "SparseVector") -> int:
        a, b = self.d, vec.d
        if len(b) < len(a):
            a, b = b, a  # Ensure 'a' is the smaller dictionary
        return sum(v * b.get(i, 0) for i, v in a.items())
  1. __init__(self, nums): The constructor initializes a dictionary self.d. It iterates through the input nums list, and for each non-zero element, it adds a key-value pair to the dictionary where the key is the index and the value is the element's value.

  2. dotProduct(self, vec): This method computes the dot product.

    • It assigns self.d and vec.d to a and b for readability.
    • It checks which dictionary is smaller (if len(b) < len(a): a, b = b, a) to optimize the loop. Iterating through the smaller dictionary is more efficient.
    • The sum() function efficiently calculates the dot product. The generator expression (v * b.get(i, 0) for i, v in a.items()) iterates through the smaller dictionary a. For each key-value pair (i, v), it retrieves the corresponding value from dictionary b using b.get(i, 0). If the key i is not found in b, b.get(i, 0) returns 0. The product v * b.get(i, 0) is then added to the sum.

Time and Space Complexity Analysis

Time Complexity: O(min(m, n)), where m and n are the number of non-zero elements in the two vectors. This is because we iterate through the smaller dictionary, which contains at most min(m, n) elements. The get() operation on a dictionary is O(1) on average.

Space Complexity: O(m + n) in the worst case, where m and n are the number of non-zero elements in each vector. This is because we store the non-zero elements and their indices in the dictionaries. In the best case (both vectors are entirely zeros), the space complexity would be O(1). However, given the problem statement, this is not likely.

The other languages (Java, C++, Go, TypeScript) implement the same logic, using their respective hash table data structures (HashMap, unordered_map, map). The core concept and complexity remain consistent across all implementations.