Table: Product
+--------------+---------+ | Column Name | Type | +--------------+---------+ | product_id | int | | product_name | varchar | | unit_price | int | +--------------+---------+ product_id is the primary key (column with unique values) of this table. Each row of this table indicates the name and the price of each product.
Table: Sales
+-------------+---------+ | Column Name | Type | +-------------+---------+ | seller_id | int | | product_id | int | | buyer_id | int | | sale_date | date | | quantity | int | | price | int | +-------------+---------+ This table can have duplicate rows. product_id is a foreign key (reference column) to the Product table. Each row of this table contains some information about one sale.
Write a solution to report the products that were only sold in the first quarter of 2019
. That is, between 2019-01-01
and 2019-03-31
inclusive.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Product table: +------------+--------------+------------+ | product_id | product_name | unit_price | +------------+--------------+------------+ | 1 | S8 | 1000 | | 2 | G4 | 800 | | 3 | iPhone | 1400 | +------------+--------------+------------+ Sales table: +-----------+------------+----------+------------+----------+-------+ | seller_id | product_id | buyer_id | sale_date | quantity | price | +-----------+------------+----------+------------+----------+-------+ | 1 | 1 | 1 | 2019-01-21 | 2 | 2000 | | 1 | 2 | 2 | 2019-02-17 | 1 | 800 | | 2 | 2 | 3 | 2019-06-02 | 1 | 800 | | 3 | 3 | 4 | 2019-05-13 | 2 | 2800 | +-----------+------------+----------+------------+----------+-------+ Output: +-------------+--------------+ | product_id | product_name | +-------------+--------------+ | 1 | S8 | +-------------+--------------+ Explanation: The product with id 1 was only sold in the spring of 2019. The product with id 2 was sold in the spring of 2019 but was also sold after the spring of 2019. The product with id 3 was sold after spring 2019. We return only product 1 as it is the product that was only sold in the spring of 2019.
This problem requires finding products that were sold only during the first quarter of 2019. We need to analyze the Sales
and Product
tables to achieve this.
Approach:
The solution uses SQL to efficiently query and filter the data. The core idea is to:
Sales
and Product
tables using product_id
to get product names along with sales information.product_id
to analyze sales for each product.MySQL Solution:
SELECT product_id, product_name
FROM
Sales
JOIN Product USING (product_id)
GROUP BY 1
HAVING COUNT(1) = SUM(sale_date BETWEEN '2019-01-01' AND '2019-03-31');
Explanation:
SELECT product_id, product_name
: This selects the product_id
and product_name
from the joined tables.FROM Sales JOIN Product USING (product_id)
: This performs an inner join between the Sales
and Product
tables using product_id
as the common key.GROUP BY 1
: Groups the results by the first column in the SELECT
statement, which is product_id
. This allows us to count sales for each product.HAVING COUNT(1) = SUM(sale_date BETWEEN '2019-01-01' AND '2019-03-31')
: This is the crucial filtering condition.
COUNT(1)
counts the total number of sales for each product.SUM(sale_date BETWEEN '2019-01-01' AND '2019-03-31')
counts the number of sales that fall within the specified date range. The BETWEEN
operator is inclusive.HAVING
clause ensures that only products where these two counts are equal are included in the result—meaning only those sold exclusively in Q1 2019.Time Complexity Analysis:
The time complexity of this SQL query depends on the database system's optimization techniques. However, in a general sense:
HAVING
clause adds some overhead, but it's generally proportional to the number of rows after the join.Therefore, the overall time complexity is approximately linear with respect to the size of the data (after considering database optimizations). It is not strictly O(N) due to the join, but it is often practically linear in the size of the dataset.
Space Complexity Analysis:
The space complexity is primarily determined by the size of the intermediate result sets created during the query execution (the joined table and the grouped results). This is also proportional to the size of the input data. So, the space complexity is also roughly linear with respect to the size of the input data.