{x}
blog image

Sales Analysis III

Table: Product

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| product_id   | int     |
| product_name | varchar |
| unit_price   | int     |
+--------------+---------+
product_id is the primary key (column with unique values) of this table.
Each row of this table indicates the name and the price of each product.

Table: Sales

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| seller_id   | int     |
| product_id  | int     |
| buyer_id    | int     |
| sale_date   | date    |
| quantity    | int     |
| price       | int     |
+-------------+---------+
This table can have duplicate rows.
product_id is a foreign key (reference column) to the Product table.
Each row of this table contains some information about one sale.

 

Write a solution to report the products that were only sold in the first quarter of 2019. That is, between 2019-01-01 and 2019-03-31 inclusive.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Product table:
+------------+--------------+------------+
| product_id | product_name | unit_price |
+------------+--------------+------------+
| 1          | S8           | 1000       |
| 2          | G4           | 800        |
| 3          | iPhone       | 1400       |
+------------+--------------+------------+
Sales table:
+-----------+------------+----------+------------+----------+-------+
| seller_id | product_id | buyer_id | sale_date  | quantity | price |
+-----------+------------+----------+------------+----------+-------+
| 1         | 1          | 1        | 2019-01-21 | 2        | 2000  |
| 1         | 2          | 2        | 2019-02-17 | 1        | 800   |
| 2         | 2          | 3        | 2019-06-02 | 1        | 800   |
| 3         | 3          | 4        | 2019-05-13 | 2        | 2800  |
+-----------+------------+----------+------------+----------+-------+
Output: 
+-------------+--------------+
| product_id  | product_name |
+-------------+--------------+
| 1           | S8           |
+-------------+--------------+
Explanation: 
The product with id 1 was only sold in the spring of 2019.
The product with id 2 was sold in the spring of 2019 but was also sold after the spring of 2019.
The product with id 3 was sold after spring 2019.
We return only product 1 as it is the product that was only sold in the spring of 2019.

Solution Explanation for LeetCode 1084: Sales Analysis III

This problem requires finding products that were sold only during the first quarter of 2019. We need to analyze the Sales and Product tables to achieve this.

Approach:

The solution uses SQL to efficiently query and filter the data. The core idea is to:

  1. Join the tables: Combine Sales and Product tables using product_id to get product names along with sales information.
  2. Group by product: Group the results by product_id to analyze sales for each product.
  3. Filter by sales period: For each product, count the total number of sales and the number of sales within the first quarter of 2019 (Jan 1st to March 31st, 2019).
  4. Select only qualifying products: Only include products where the total number of sales is equal to the number of sales during the first quarter. This ensures that the product was sold only during that period.

MySQL Solution:

SELECT product_id, product_name
FROM
    Sales
    JOIN Product USING (product_id)
GROUP BY 1
HAVING COUNT(1) = SUM(sale_date BETWEEN '2019-01-01' AND '2019-03-31');

Explanation:

  • SELECT product_id, product_name: This selects the product_id and product_name from the joined tables.
  • FROM Sales JOIN Product USING (product_id): This performs an inner join between the Sales and Product tables using product_id as the common key.
  • GROUP BY 1: Groups the results by the first column in the SELECT statement, which is product_id. This allows us to count sales for each product.
  • HAVING COUNT(1) = SUM(sale_date BETWEEN '2019-01-01' AND '2019-03-31'): This is the crucial filtering condition.
    • COUNT(1) counts the total number of sales for each product.
    • SUM(sale_date BETWEEN '2019-01-01' AND '2019-03-31') counts the number of sales that fall within the specified date range. The BETWEEN operator is inclusive.
    • The HAVING clause ensures that only products where these two counts are equal are included in the result—meaning only those sold exclusively in Q1 2019.

Time Complexity Analysis:

The time complexity of this SQL query depends on the database system's optimization techniques. However, in a general sense:

  • The join operation has a time complexity that depends on the size of the tables (e.g., O(N*M) for a nested loop join, where N and M are the sizes of the Sales and Product tables, respectively). However, optimized join algorithms used by database systems can often achieve much better performance than this worst case.
  • The grouping operation and HAVING clause adds some overhead, but it's generally proportional to the number of rows after the join.

Therefore, the overall time complexity is approximately linear with respect to the size of the data (after considering database optimizations). It is not strictly O(N) due to the join, but it is often practically linear in the size of the dataset.

Space Complexity Analysis:

The space complexity is primarily determined by the size of the intermediate result sets created during the query execution (the joined table and the grouped results). This is also proportional to the size of the input data. So, the space complexity is also roughly linear with respect to the size of the input data.