This problem requires calculating the total sales amount for each product in each year based on the provided Product
and Sales
tables. The solution involves joining the tables, calculating the number of days each product was sold in each year, and then computing the total sales amount.
The core idea is to iterate through each year (2018, 2019, 2020) and, for each sales entry:
period_start
, period_end
) overlaps with the current year.average_daily_sales
to get the total sales amount for that product in that year.Product
table to get the product_name
.The provided MySQL solution efficiently implements this approach using a combination of JOIN
operations, date functions, and conditional logic.
SELECT
s.product_id,
p.product_name,
y.YEAR report_year,
s.average_daily_sales * (
IF(
YEAR(s.period_end) > y.YEAR,
y.days_of_year,
DAYOFYEAR(s.period_end)
) - IF(
YEAR(s.period_start) < y.YEAR,
1,
DAYOFYEAR(s.period_start)
) + 1
) total_amount
FROM
Sales s
INNER JOIN (
SELECT
'2018' YEAR,
365 days_of_year
UNION
ALL
SELECT
'2019' YEAR,
365 days_of_year
UNION
ALL
SELECT
'2020' YEAR,
366 days_of_year
) y ON YEAR(s.period_start) <= y.YEAR
AND YEAR(s.period_end) >= y.YEAR
INNER JOIN Product p ON p.product_id = s.product_id
ORDER BY
s.product_id,
y.YEAR
Let's break down the code:
INNER JOIN (SELECT ... ) y
: This subquery creates a temporary table y
containing the years (2018, 2019, 2020) and their respective number of days (accounting for leap year). This avoids explicit looping in SQL.
ON YEAR(s.period_start) <= y.YEAR AND YEAR(s.period_end) >= y.YEAR
: This condition ensures that only sales periods overlapping with a given year are considered.
INNER JOIN Product p ON p.product_id = s.product_id
: This joins the Sales
table with the Product
table using product_id
to retrieve the product_name
.
s.average_daily_sales * (...) total_amount
: This is the core calculation. The expression within the parentheses calculates the number of days the product was sold in the given year y.YEAR
:
IF(YEAR(s.period_end) > y.YEAR, y.days_of_year, DAYOFYEAR(s.period_end))
: If the sales period ends after the year, it takes the full number of days in that year (y.days_of_year
). Otherwise, it takes the day of the year the sales period ends.
IF(YEAR(s.period_start) < y.YEAR, 1, DAYOFYEAR(s.period_start))
: If the sales period starts before the year, it uses 1 (the first day of the year). Otherwise, it uses the day of the year the sales period starts.
The difference between these two values, plus 1, gives the number of days in the year.
ORDER BY s.product_id, y.YEAR
: This sorts the results as required by the problem statement.
The time complexity of the SQL query is dominated by the JOIN
operations. In the worst case, if the Sales
table is very large, the joins will have a time complexity of O(NM), where N is the number of rows in the Sales
table and M is the number of rows in the Product
table (although database optimizations like indexes can significantly improve this in practice). The calculation within the SELECT
statement is O(1) per row. Therefore, the overall time complexity is approximately **O(NM)**. However, with appropriate indexing on the database, the performance is much better in real-world scenarios.
The space complexity is determined by the size of the output which is proportional to the number of sales entries and years, making it roughly O(N) (or O(N*K) where K is the number of years if we do not create a temporary table for years).