Table: Products
+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| product_id    | int     |
| new_price     | int     |
| change_date   | date    |
+---------------+---------+
(product_id, change_date) is the primary key (combination of columns with unique values) of this table.
Each row of this table indicates that the price of some product was changed to a new price at some date.
Write a solution to find the prices of all products on 2019-08-16. Assume the price of all products before any change is 10.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input:
Products table:
+------------+-----------+-------------+
| product_id | new_price | change_date |
+------------+-----------+-------------+
| 1          | 20        | 2019-08-14  |
| 2          | 50        | 2019-08-14  |
| 1          | 30        | 2019-08-15  |
| 1          | 35        | 2019-08-16  |
| 2          | 65        | 2019-08-17  |
| 3          | 20        | 2019-08-18  |
+------------+-----------+-------------+
Output:
+------------+-------+
| product_id | price |
+------------+-------+
| 2          | 50    |
| 1          | 35    |
| 3          | 10    |
+------------+-------+
This problem requires finding the price of each product on a specific date ('2019-08-16'). The Products table stores price changes over time. If a product has no price change on or before the given date, its price is assumed to be 10.
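Before looking at the SQL, the rule itself can be sketched in plain Python. This is only an illustrative reference, not part of either solution: the function name prices_on and the ROWS list are hypothetical, and the rows are copied from Example 1.

```python
from datetime import date

# Example rows from the problem statement: (product_id, new_price, change_date).
ROWS = [
    (1, 20, date(2019, 8, 14)),
    (2, 50, date(2019, 8, 14)),
    (1, 30, date(2019, 8, 15)),
    (1, 35, date(2019, 8, 16)),
    (2, 65, date(2019, 8, 17)),
    (3, 20, date(2019, 8, 18)),
]


def prices_on(rows, target, default=10):
    """Return {product_id: price in effect on `target`} (hypothetical helper)."""
    latest_date = {}  # product_id -> date of the latest change on or before target
    prices = {}
    for product_id, new_price, change_date in rows:
        # Every product must appear in the result, even with no qualifying change.
        prices.setdefault(product_id, default)
        if change_date <= target:
            seen = latest_date.get(product_id)
            if seen is None or change_date > seen:
                latest_date[product_id] = change_date
                prices[product_id] = new_price
    return prices


result = prices_on(ROWS, date(2019, 8, 16))
print(result)  # {1: 35, 2: 50, 3: 10}
```

Product 3 only changes price on 2019-08-18, after the target date, so it keeps the default of 10. Both SQL solutions have to reproduce exactly this behavior.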
Two SQL solutions are presented: one using subqueries and joins, and another using window functions.
The first solution uses two Common Table Expressions (CTEs):
T: This CTE selects all distinct product_id values from the Products table. This ensures that all products are included in the final result, even those without price changes on or before the target date.
P: This CTE finds the most recent price change for each product on or before '2019-08-16'. It does this by finding the maximum change_date for each product_id that is less than or equal to '2019-08-16', then matching that (product_id, change_date) pair against the Products table to retrieve the new_price corresponding to that maximum change_date.
Finally, a LEFT JOIN is performed between T and P using product_id. This ensures that all product_id values from T are included. IFNULL(price, 10) handles cases where a product has no entry in P (meaning no price change on or before the target date), setting the price to the default 10.
MySQL Code (Solution 1):
WITH
    T AS (SELECT DISTINCT product_id FROM Products),
    P AS (
        SELECT product_id, new_price AS price
        FROM Products
        WHERE
            (product_id, change_date) IN (
                SELECT product_id, MAX(change_date) AS change_date
                FROM Products
                WHERE change_date <= '2019-08-16'
                GROUP BY 1
            )
    )
SELECT product_id, IFNULL(price, 10) AS price
FROM
    T
    LEFT JOIN P USING (product_id);
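To try Solution 1 without a MySQL server, the sketch below runs it against an in-memory SQLite database through Python's sqlite3 module. It assumes SQLite is an acceptable stand-in for MySQL here: the bundled SQLite must support row-value IN comparisons (3.15 or later), and dates are stored as ISO-8601 text, which compares correctly as strings. The ORDER BY is added only to make the printed output deterministic, and GROUP BY 1 is spelled out as GROUP BY product_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (
        product_id  INTEGER,
        new_price   INTEGER,
        change_date TEXT,  -- ISO-8601 text, so string comparison orders dates
        PRIMARY KEY (product_id, change_date)
    );
    INSERT INTO Products VALUES
        (1, 20, '2019-08-14'),
        (2, 50, '2019-08-14'),
        (1, 30, '2019-08-15'),
        (1, 35, '2019-08-16'),
        (2, 65, '2019-08-17'),
        (3, 20, '2019-08-18');
""")

SOLUTION_1 = """
WITH
    T AS (SELECT DISTINCT product_id FROM Products),
    P AS (
        SELECT product_id, new_price AS price
        FROM Products
        WHERE (product_id, change_date) IN (
            SELECT product_id, MAX(change_date) AS change_date
            FROM Products
            WHERE change_date <= '2019-08-16'
            GROUP BY product_id
        )
    )
SELECT product_id, IFNULL(price, 10) AS price
FROM T LEFT JOIN P USING (product_id)
ORDER BY product_id  -- added for deterministic output only
"""

rows_1 = conn.execute(SOLUTION_1).fetchall()
print(rows_1)  # [(1, 35), (2, 50), (3, 10)]
```

Product 3 has no row in P, so the LEFT JOIN yields a NULL price for it and IFNULL replaces that with 10.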
The second solution uses a window function for a more concise approach:
P: This CTE performs a LEFT JOIN between the distinct product_id values and the Products table, with the condition change_date <= '2019-08-16' placed in the ON clause. Every product is paired with its price changes up to the target date, and a product with no such change is kept as a single row whose new_price and change_date are NULL.
T: This CTE uses the RANK() window function to rank each product's price changes by change_date in descending order, so rank 1 marks the most recent price change.
Finally, the query selects product_id and new_price (renamed as price) from T where the rank is 1 (the most recent change). IFNULL(new_price, 10) handles the case of no price changes on or before the target date.
MySQL Code (Solution 2):
WITH
    P AS (
        SELECT p1.product_id, new_price, change_date
        FROM
            (
                SELECT DISTINCT product_id
                FROM Products
            ) AS p1
            LEFT JOIN Products AS p2
                ON p1.product_id = p2.product_id AND p2.change_date <= '2019-08-16'
    ),
    T AS (
        SELECT
            *,
            RANK() OVER (
                PARTITION BY product_id
                ORDER BY change_date DESC
            ) AS rk
        FROM P
    )
SELECT product_id, IFNULL(new_price, 10) AS price
FROM T
WHERE rk = 1;
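The ranking step is easier to follow with the intermediate rows in view. The sketch below runs Solution 2 on an in-memory SQLite database via Python's sqlite3 module, again assuming SQLite as a stand-in for MySQL (window functions need SQLite 3.25 or later). The explicit AS product_id alias and the ORDER BY clauses are additions for portability and deterministic output; the CTES, ranked, and rows_2 names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (
        product_id  INTEGER,
        new_price   INTEGER,
        change_date TEXT,  -- ISO-8601 text, so string comparison orders dates
        PRIMARY KEY (product_id, change_date)
    );
    INSERT INTO Products VALUES
        (1, 20, '2019-08-14'),
        (2, 50, '2019-08-14'),
        (1, 30, '2019-08-15'),
        (1, 35, '2019-08-16'),
        (2, 65, '2019-08-17'),
        (3, 20, '2019-08-18');
""")

# The two CTEs of Solution 2, shared by both queries below.
CTES = """
WITH
    P AS (
        SELECT p1.product_id AS product_id, new_price, change_date
        FROM (SELECT DISTINCT product_id FROM Products) AS p1
        LEFT JOIN Products AS p2
            ON p1.product_id = p2.product_id AND p2.change_date <= '2019-08-16'
    ),
    T AS (
        SELECT
            *,
            RANK() OVER (PARTITION BY product_id ORDER BY change_date DESC) AS rk
        FROM P
    )
"""

# Intermediate view: every qualifying change with its rank.
ranked = conn.execute(
    CTES + "SELECT product_id, new_price, change_date, rk FROM T "
           "ORDER BY product_id, rk"
).fetchall()
for row in ranked:
    print(row)

# Final result: keep rank 1 and apply the default price.
rows_2 = conn.execute(
    CTES + "SELECT product_id, IFNULL(new_price, 10) AS price FROM T "
           "WHERE rk = 1 ORDER BY product_id"
).fetchall()
print(rows_2)  # [(1, 35), (2, 50), (3, 10)]
```

Product 3 appears in the ranked rows once, with NULL new_price and change_date and rank 1, which is why the final IFNULL is needed. Because (product_id, change_date) is the primary key, two changes for the same product can never tie on date, so RANK() assigns rank 1 to exactly one row per product.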
Both solutions have similar time complexity. The dominant cost is the sorting or grouping required to find the most recent price change for each product, which is typically O(N log N), where N is the number of rows in the Products table. Which query runs faster in practice depends on the database engine and the indexes available; neither is asymptotically better. The space complexity is also similar: the intermediate CTE results can be proportional to N in the worst case, although the database optimizer may choose an execution plan that avoids materializing them in full.