{x}
blog image

Change Null Values in a Table to the Previous Value

Solution Explanation for LeetCode Problem 2388: Change Null Values in a Table to the Previous Value

This problem requires updating NULL values in the drink column of the CoffeeShop table with the most recent non-NULL value encountered. We'll explore two SQL solutions with different approaches and analyze their time complexities.

Solution 1: Using Variables (MySQL)

This solution leverages MySQL's variable capabilities for a concise and efficient approach.

Code:

SELECT
    id,
    CASE
        WHEN drink IS NOT NULL THEN @cur := drink
        ELSE @cur
    END AS drink
FROM CoffeeShop;

Explanation:

  1. @cur := drink: We declare and initialize a user-defined variable @cur. This line assigns the current non-NULL drink value to @cur. This assignment happens only when drink is NOT NULL.

  2. ELSE @cur: If drink IS NULL, the CASE statement returns the value currently stored in @cur (the last non-NULL value encountered).

  3. FROM CoffeeShop: The query iterates through the CoffeeShop table row by row. The order of rows is guaranteed to be preserved.

Time Complexity: O(N), where N is the number of rows in the CoffeeShop table. The query iterates through each row once.

Space Complexity: O(1). The space used is constant regardless of the table size. The variable @cur occupies a fixed amount of space.

Solution 2: Using Window Functions (MySQL)

This solution employs window functions to achieve the same result. It's more readable but might be slightly less efficient for very large tables.

Code:

WITH
    S AS (
        SELECT *, ROW_NUMBER() OVER () AS rk
        FROM CoffeeShop
    ),
    T AS (
        SELECT
            *,
            SUM(
                CASE
                    WHEN drink IS NULL THEN 0
                    ELSE 1
                END
            ) OVER (ORDER BY rk) AS gid
        FROM S
    )
SELECT
    id,
    MAX(drink) OVER (
        PARTITION BY gid
        ORDER BY rk
    ) AS drink
FROM T;

Explanation:

  1. CTE S: Assigns a row number (rk) to each row in the original table using ROW_NUMBER() OVER (). This is crucial for maintaining the original order and enabling subsequent operations.

  2. CTE T: This CTE calculates a group ID (gid) for each consecutive sequence of non-NULL drink values. The SUM() with OVER (ORDER BY rk) acts as a cumulative counter that increments only when a non-NULL drink is encountered. Rows with NULL drinks will share the same gid as the preceding non-NULL row.

  3. Final SELECT: Uses MAX(drink) OVER (PARTITION BY gid ORDER BY rk) to find the maximum (i.e., the last non-NULL) drink within each group (gid) and assigns it to all rows within that group, ordered by rk.

Time Complexity: O(N log N) or potentially O(N) depending on the specific database engine's implementation of window functions. The sorting involved in window functions can contribute to the log N factor, though some optimizations can reduce it to linear time in some cases.

Space Complexity: O(N) in the worst case because of the creation of CTEs S and T. The size of these CTEs is proportional to the number of rows in the input table.

Comparison:

Solution 1 (using variables) is generally more efficient, especially for very large tables, because it avoids the overhead associated with window functions. Solution 2 (using window functions) is arguably more readable and easier to understand, but the performance difference might be negligible for smaller datasets. The choice depends on the trade-off between performance and code readability.