{x}
blog image

Average Selling Price

Table: Prices

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| product_id    | int     |
| start_date    | date    |
| end_date      | date    |
| price         | int     |
+---------------+---------+
(product_id, start_date, end_date) is the primary key (combination of columns with unique values) for this table.
Each row of this table indicates the price of the product_id in the period from start_date to end_date.
For each product_id there will be no two overlapping periods. That means there will be no two intersecting periods for the same product_id.

 

Table: UnitsSold

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| product_id    | int     |
| purchase_date | date    |
| units         | int     |
+---------------+---------+
This table may contain duplicate rows.
Each row of this table indicates the date, units, and product_id of each product sold. 

 

Write a solution to find the average selling price for each product. average_price should be rounded to 2 decimal places. If a product does not have any sold units, its average selling price is assumed to be 0.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Prices table:
+------------+------------+------------+--------+
| product_id | start_date | end_date   | price  |
+------------+------------+------------+--------+
| 1          | 2019-02-17 | 2019-02-28 | 5      |
| 1          | 2019-03-01 | 2019-03-22 | 20     |
| 2          | 2019-02-01 | 2019-02-20 | 15     |
| 2          | 2019-02-21 | 2019-03-31 | 30     |
+------------+------------+------------+--------+
UnitsSold table:
+------------+---------------+-------+
| product_id | purchase_date | units |
+------------+---------------+-------+
| 1          | 2019-02-25    | 100   |
| 1          | 2019-03-01    | 15    |
| 2          | 2019-02-10    | 200   |
| 2          | 2019-03-22    | 30    |
+------------+---------------+-------+
Output: 
+------------+---------------+
| product_id | average_price |
+------------+---------------+
| 1          | 6.96          |
| 2          | 16.96         |
+------------+---------------+
Explanation: 
Average selling price = Total Price of Product / Number of products sold.
Average selling price for product 1 = ((100 * 5) + (15 * 20)) / 115 = 6.96
Average selling price for product 2 = ((200 * 15) + (30 * 30)) / 230 = 16.96

Solution Explanation for LeetCode 1251: Average Selling Price

This problem requires calculating the average selling price for each product, considering sales across different date ranges and prices. The solution uses SQL queries to achieve this efficiently.

Approach

The core idea is to join the Prices and UnitsSold tables to connect product prices with their corresponding sales. Since a product might have multiple price periods and sales dates, careful aggregation is necessary.

  1. Join Operation: A LEFT JOIN is used between Prices and UnitsSold. The join condition is based on matching product_id and ensuring that the purchase_date in UnitsSold falls within the start_date and end_date range specified in Prices. A LEFT JOIN ensures that all products from Prices are included, even if they have no sales in UnitsSold.

  2. Conditional Aggregation: The joined data needs aggregation to calculate the average selling price for each product_id. The SUM(price * units) calculates the total revenue for each product, and SUM(units) calculates the total units sold. Dividing the total revenue by total units gives the average price.

  3. Handling Missing Sales: If a product has no sales, the SUM(units) will be 0, leading to division by zero. The IFNULL function handles this by replacing NULL (resulting from division by zero) with 0.

  4. Rounding: The ROUND function ensures that the average price is rounded to two decimal places, as specified in the problem.

SQL Code (MySQL)

SELECT
    p.product_id,
    IFNULL(ROUND(SUM(price * units) / SUM(units), 2), 0) AS average_price
FROM
    Prices AS p
    LEFT JOIN UnitsSold AS u
        ON p.product_id = u.product_id AND purchase_date BETWEEN start_date AND end_date
GROUP BY 1;

Time Complexity Analysis

The time complexity of this SQL query is dominated by the JOIN operation and the subsequent GROUP BY operation. The JOIN operation's complexity depends on the size of the tables and the indexing used but is generally considered O(N log N) or O(N*M) in the worst case (N and M being the number of rows in Prices and UnitsSold respectively), if no efficient indexes are present. The GROUP BY operation has a complexity that's typically linear to the number of rows after the join, O(K) (K being the number of rows after join).

Therefore, the overall time complexity is dominated by the join, making it approximately O(N log N) or O(N*M) in the worst case, depending on the database's query optimizer and the presence of relevant indexes on product_id, start_date, end_date, and purchase_date. The addition of indexes significantly improves performance.

Space Complexity Analysis

The space complexity is primarily determined by the intermediate result set created during the JOIN and GROUP BY operations. In the worst case, the space required would be proportional to the size of the largest of the input tables, or the size of the joined table if it's larger than the inputs. Hence, it's approximately O(max(N, M)) or O(K), depending on the size of the intermediate tables.