{x}
blog image

Total Sales Amount by Year

Solution Explanation for LeetCode Problem 1384: Total Sales Amount by Year

This problem requires calculating the total sales amount for each product in each year based on the provided Product and Sales tables. The solution involves joining the tables, calculating the number of days each product was sold in each year, and then computing the total sales amount.

Approach

The core idea is to iterate through each year (2018, 2019, 2020) and, for each sales entry:

  1. Determine overlap: Check if the sales period (period_start, period_end) overlaps with the current year.
  2. Calculate days: If there's an overlap, calculate the number of days the product was sold within that year. This requires careful handling of cases where the sales period starts or ends within the year.
  3. Compute total amount: Multiply the number of days by the average_daily_sales to get the total sales amount for that product in that year.
  4. Join with Product table: Join the result with the Product table to get the product_name.

MySQL Solution Explained

The provided MySQL solution efficiently implements this approach using a combination of JOIN operations, date functions, and conditional logic.

SELECT
    s.product_id,
    p.product_name,
    y.YEAR report_year,
    s.average_daily_sales * (
        IF(
            YEAR(s.period_end) > y.YEAR,
            y.days_of_year,
            DAYOFYEAR(s.period_end)
        ) - IF(
            YEAR(s.period_start) < y.YEAR,
            1,
            DAYOFYEAR(s.period_start)
        ) + 1
    ) total_amount
FROM
    Sales s
    INNER JOIN (
        SELECT
            '2018' YEAR,
            365 days_of_year
        UNION
        ALL
        SELECT
            '2019' YEAR,
            365 days_of_year
        UNION
        ALL
        SELECT
            '2020' YEAR,
            366 days_of_year
    ) y ON YEAR(s.period_start) <= y.YEAR
    AND YEAR(s.period_end) >= y.YEAR
    INNER JOIN Product p ON p.product_id = s.product_id
ORDER BY
    s.product_id,
    y.YEAR

Let's break down the code:

  1. INNER JOIN (SELECT ... ) y: This subquery creates a temporary table y containing the years (2018, 2019, 2020) and their respective number of days (accounting for leap year). This avoids explicit looping in SQL.

  2. ON YEAR(s.period_start) <= y.YEAR AND YEAR(s.period_end) >= y.YEAR: This condition ensures that only sales periods overlapping with a given year are considered.

  3. INNER JOIN Product p ON p.product_id = s.product_id: This joins the Sales table with the Product table using product_id to retrieve the product_name.

  4. s.average_daily_sales * (...) total_amount: This is the core calculation. The expression within the parentheses calculates the number of days the product was sold in the given year y.YEAR:

    • IF(YEAR(s.period_end) > y.YEAR, y.days_of_year, DAYOFYEAR(s.period_end)): If the sales period ends after the year, it takes the full number of days in that year (y.days_of_year). Otherwise, it takes the day of the year the sales period ends.

    • IF(YEAR(s.period_start) < y.YEAR, 1, DAYOFYEAR(s.period_start)): If the sales period starts before the year, it uses 1 (the first day of the year). Otherwise, it uses the day of the year the sales period starts.

    • The difference between these two values, plus 1, gives the number of days in the year.

  5. ORDER BY s.product_id, y.YEAR: This sorts the results as required by the problem statement.

Time Complexity Analysis

The time complexity of the SQL query is dominated by the JOIN operations. In the worst case, if the Sales table is very large, the joins will have a time complexity of O(NM), where N is the number of rows in the Sales table and M is the number of rows in the Product table (although database optimizations like indexes can significantly improve this in practice). The calculation within the SELECT statement is O(1) per row. Therefore, the overall time complexity is approximately **O(NM)**. However, with appropriate indexing on the database, the performance is much better in real-world scenarios.

The space complexity is determined by the size of the output which is proportional to the number of sales entries and years, making it roughly O(N) (or O(N*K) where K is the number of years if we do not create a temporary table for years).