Table: Sales
+-------------+-------+
| Column Name | Type |
+-------------+-------+
| sale_id | int |
| product_id | int |
| year | int |
| quantity | int |
| price | int |
+-------------+-------+
(sale_id, year) is the primary key (combination of columns with unique values) of this table.
product_id is a foreign key (reference column) to Product
table.
Each row of this table shows a sale on the product product_id in a certain year.
Note that the price is per unit.
Table: Product
+--------------+---------+ | Column Name | Type | +--------------+---------+ | product_id | int | | product_name | varchar | +--------------+---------+ product_id is the primary key (column with unique values) of this table. Each row of this table indicates the product name of each product.
Write a solution to select the product id, year, quantity, and price for the first year of every product sold.
Return the resulting table in any order.
The result format is in the following example.
Example 1:
Input: Sales table: +---------+------------+------+----------+-------+ | sale_id | product_id | year | quantity | price | +---------+------------+------+----------+-------+ | 1 | 100 | 2008 | 10 | 5000 | | 2 | 100 | 2009 | 12 | 5000 | | 7 | 200 | 2011 | 15 | 9000 | +---------+------------+------+----------+-------+ Product table: +------------+--------------+ | product_id | product_name | +------------+--------------+ | 100 | Nokia | | 200 | Apple | | 300 | Samsung | +------------+--------------+ Output: +------------+------------+----------+-------+ | product_id | first_year | quantity | price | +------------+------------+----------+-------+ | 100 | 2008 | 10 | 5000 | | 200 | 2011 | 15 | 9000 | +------------+------------+----------+-------+
This problem requires finding the first year of sale for each product and returning the corresponding product ID, year, quantity, and price. We can solve this using two main approaches: using subqueries or using window functions.
This approach uses a subquery to find the minimum year for each product and then joins this information back to the original Sales
table to retrieve the required details.
MySQL Code:
SELECT
product_id,
year AS first_year,
quantity,
price
FROM Sales
WHERE
(product_id, year) IN (
SELECT
product_id,
MIN(year) AS year
FROM Sales
GROUP BY product_id
);
Explanation:
Inner Query: SELECT product_id, MIN(year) AS year FROM Sales GROUP BY product_id
This subquery groups the Sales
table by product_id
and finds the minimum year (MIN(year)
) for each product. This identifies the first year each product was sold.
Outer Query: SELECT product_id, year AS first_year, quantity, price FROM Sales WHERE (product_id, year) IN (...)
This query selects the product_id
, year
(renamed as first_year
), quantity
, and price
from the Sales
table. The WHERE
clause filters the results to only include rows where the (product_id, year)
combination matches the (product_id, minimum year)
combinations found in the inner query.
Time Complexity Analysis:
The inner query has a time complexity of O(N log N) due to the GROUP BY
and MIN
operations (assuming sorting is involved). The outer query then performs a nested loop join with the inner query result set, giving an overall time complexity of approximately O(N log N), where N is the number of rows in the Sales
table.
This approach leverages the RANK()
window function to assign a rank to each sale within a product's sales history based on the year. We can then filter for only the rows with rank 1, representing the first year of sale.
MySQL Code:
WITH
T AS (
SELECT
*,
RANK() OVER (
PARTITION BY product_id
ORDER BY year
) AS rk
FROM Sales
)
SELECT product_id, year AS first_year, quantity, price
FROM T
WHERE rk = 1;
Explanation:
Common Table Expression (CTE): WITH T AS (...)
This defines a CTE named T
that calculates the rank for each sale.
RANK() Function: RANK() OVER (PARTITION BY product_id ORDER BY year)
This function assigns a rank to each row within each product_id
partition, ordered by year
in ascending order. The first sale for each product will have rank 1.
Final Query: SELECT product_id, year AS first_year, quantity, price FROM T WHERE rk = 1
This query selects the relevant columns from the CTE T
and filters the results to only include rows where rk
(rank) is equal to 1.
Time Complexity Analysis:
Window functions typically have a time complexity of O(N log N) because they may involve sorting. The final selection of rows where rk = 1
is O(N), making the overall time complexity O(N log N), similar to Approach 1.
In summary: Both approaches achieve the desired result. The choice between them depends on personal preference and the specific database system being used; window functions are often considered more elegant and efficient in more complex scenarios. The time complexity of both approaches is approximately O(N log N).