The problem requires finding product IDs that have at least three orders in two consecutive years. The solutions use SQL queries to achieve this. Both solutions utilize a Common Table Expression (CTE) to simplify the query.
This solution uses a CTE called P
. P
aggregates the Orders
table to count the number of orders for each product in each year. The mark
column indicates whether a product had at least three orders in a given year (1 for true, 0 for false). The final SELECT
statement then joins P
with itself to find product IDs that have mark
as 1 (at least three orders) in consecutive years.
MySQL Code (Solution 1):
WITH
P AS (
SELECT product_id, YEAR(purchase_date) AS y, COUNT(1) >= 3 AS mark
FROM Orders
GROUP BY 1, 2
)
SELECT DISTINCT p1.product_id
FROM
P AS p1
JOIN P AS p2 ON p1.y = p2.y - 1 AND p1.product_id = p2.product_id
WHERE p1.mark AND p2.mark;
Explanation:
CTE P
:
SELECT product_id, YEAR(purchase_date) AS y, COUNT(1) >= 3 AS mark
: Selects the product ID, the year of purchase, and a boolean mark
indicating whether the count of orders for that product in that year is greater than or equal to 3.FROM Orders
: Specifies the source table.GROUP BY 1, 2
: Groups the results by product ID and year.Final SELECT
statement:
SELECT DISTINCT p1.product_id
: Selects the distinct product IDs.FROM P AS p1 JOIN P AS p2 ON p1.y = p2.y - 1 AND p1.product_id = p2.product_id
: Joins P
with itself (p1
and p2
). The join condition ensures that we're comparing consecutive years (p1.y = p2.y - 1
) for the same product ID.WHERE p1.mark AND p2.mark
: Filters the results to include only product IDs that had at least three orders in both consecutive years.This solution simplifies the CTE by directly selecting the product_id
and year
only for those products that meet the order count criteria. The final join logic remains the same.
MySQL Code (Solution 2):
WITH
P AS (
SELECT product_id, YEAR(purchase_date) AS y
FROM Orders
GROUP BY 1, 2
HAVING COUNT(1) >= 3
)
SELECT DISTINCT p1.product_id
FROM
P AS p1
JOIN P AS p2 ON p1.y = p2.y - 1 AND p1.product_id = p2.product_id;
Explanation:
The difference lies in the CTE P
. Instead of creating a mark
column, it directly filters out years with fewer than three orders using the HAVING
clause. This makes the final WHERE
clause unnecessary.
Both solutions have similar time complexity. The dominant operation is the self-join of the CTE P
. The size of P
is proportional to the number of unique product-year combinations. Let's denote this number as N
. The self-join operation has a time complexity of O(N^2) in the worst case (though it might be optimized by the database engine). The GROUP BY
and COUNT
operations in the CTE are also O(N) but are dominated by the self-join.
Therefore, the overall time complexity of both solutions is O(N^2), where N is the number of unique (product_id, year) combinations in the Orders
table. The space complexity is also O(N) due to the size of the CTE. The actual performance might vary depending on the database engine's optimization strategies.