{x}
blog image

Products With Three or More Orders in Two Consecutive Years

Solution Explanation

The problem requires finding product IDs that have at least three orders in two consecutive years. The solutions use SQL queries to achieve this. Both solutions utilize a Common Table Expression (CTE) to simplify the query.

Solution 1: Using a Mark in the CTE

This solution uses a CTE called P. P aggregates the Orders table to count the number of orders for each product in each year. The mark column indicates whether a product had at least three orders in a given year (1 for true, 0 for false). The final SELECT statement then joins P with itself to find product IDs that have mark as 1 (at least three orders) in consecutive years.

MySQL Code (Solution 1):

WITH
    P AS (
        SELECT product_id, YEAR(purchase_date) AS y, COUNT(1) >= 3 AS mark
        FROM Orders
        GROUP BY 1, 2
    )
SELECT DISTINCT p1.product_id
FROM
    P AS p1
    JOIN P AS p2 ON p1.y = p2.y - 1 AND p1.product_id = p2.product_id
WHERE p1.mark AND p2.mark;

Explanation:

  1. CTE P:

    • SELECT product_id, YEAR(purchase_date) AS y, COUNT(1) >= 3 AS mark: Selects the product ID, the year of purchase, and a boolean mark indicating whether the count of orders for that product in that year is greater than or equal to 3.
    • FROM Orders: Specifies the source table.
    • GROUP BY 1, 2: Groups the results by product ID and year.
  2. Final SELECT statement:

    • SELECT DISTINCT p1.product_id: Selects the distinct product IDs.
    • FROM P AS p1 JOIN P AS p2 ON p1.y = p2.y - 1 AND p1.product_id = p2.product_id: Joins P with itself (p1 and p2). The join condition ensures that we're comparing consecutive years (p1.y = p2.y - 1) for the same product ID.
    • WHERE p1.mark AND p2.mark: Filters the results to include only product IDs that had at least three orders in both consecutive years.

Solution 2: Simplified CTE

This solution simplifies the CTE by directly selecting the product_id and year only for those products that meet the order count criteria. The final join logic remains the same.

MySQL Code (Solution 2):

WITH
    P AS (
        SELECT product_id, YEAR(purchase_date) AS y
        FROM Orders
        GROUP BY 1, 2
        HAVING COUNT(1) >= 3
    )
SELECT DISTINCT p1.product_id
FROM
    P AS p1
    JOIN P AS p2 ON p1.y = p2.y - 1 AND p1.product_id = p2.product_id;

Explanation:

The difference lies in the CTE P. Instead of creating a mark column, it directly filters out years with fewer than three orders using the HAVING clause. This makes the final WHERE clause unnecessary.

Time Complexity Analysis

Both solutions have similar time complexity. The dominant operation is the self-join of the CTE P. The size of P is proportional to the number of unique product-year combinations. Let's denote this number as N. The self-join operation has a time complexity of O(N^2) in the worst case (though it might be optimized by the database engine). The GROUP BY and COUNT operations in the CTE are also O(N) but are dominated by the self-join.

Therefore, the overall time complexity of both solutions is O(N^2), where N is the number of unique (product_id, year) combinations in the Orders table. The space complexity is also O(N) due to the size of the CTE. The actual performance might vary depending on the database engine's optimization strategies.