This problem requires comparing the average salary of each department to the overall company average salary for each month. The solution involves using SQL window functions for efficient computation.
Both solutions follow a similar approach:
Join Tables: First, the Salary
and Employee
tables are joined using employee_id
to link salaries to their respective departments.
Calculate Averages: Window functions are crucial here. We use AVG() OVER (PARTITION BY ...)
to calculate:
company_avg_amount
: The average salary across the entire company for each month (PARTITION BY pay_date
).department_avg_amount
: The average salary for each department in each month (PARTITION BY pay_date, department_id
).Compare and Categorize: A CASE
statement compares company_avg_amount
and department_avg_amount
to determine whether the department's average is 'higher', 'lower', or 'same' as the company average.
Result Table: The final query selects the month (pay_month
), department ID (department_id
), and the comparison result (comparison
). DISTINCT
is used in Solution 1 to avoid duplicate rows; Solution 2 uses GROUP BY
to achieve the same effect.
This solution uses DISTINCT
to remove redundant rows after calculating the averages.
WITH
t AS (
SELECT
DATE_FORMAT(pay_date, '%Y-%m') AS pay_month,
department_id,
AVG(amount) OVER (PARTITION BY pay_date) AS company_avg_amount,
AVG(amount) OVER (PARTITION BY pay_date, department_id) AS department_avg_amount
FROM
Salary AS s
JOIN Employee AS e ON s.employee_id = e.employee_id
)
SELECT DISTINCT
pay_month,
department_id,
CASE
WHEN company_avg_amount = department_avg_amount THEN 'same'
WHEN company_avg_amount < department_avg_amount THEN 'higher'
ELSE 'lower'
END AS comparison
FROM t;
Time Complexity: O(N log N), where N is the number of rows in the Salary
table. This is dominated by the sorting implicit in the window functions. The DISTINCT
clause adds a small overhead.
Space Complexity: O(N) in the worst case, due to the intermediate CTE (t
).
This solution uses GROUP BY
to achieve the same result as DISTINCT
in Solution 1.
WITH
S AS (
SELECT *
FROM
Salary
JOIN Employee USING (employee_id)
),
T AS (
SELECT
DATE_FORMAT(pay_date, '%Y-%m') AS pay_month,
department_id,
AVG(amount) OVER (PARTITION BY pay_date, department_id) AS department_avg,
AVG(amount) OVER (PARTITION BY pay_date) AS company_avg
FROM S
)
SELECT
pay_month,
department_id,
CASE
WHEN AVG(department_avg) > AVG(company_avg) THEN 'higher'
WHEN AVG(department_avg) < AVG(company_avg) THEN 'lower'
ELSE 'same'
END AS comparison
FROM T
GROUP BY 1, 2;
Time Complexity: O(N log N), similar to Solution 1. The GROUP BY
operation involves sorting.
Space Complexity: O(N) in the worst case, due to the intermediate CTEs (S
and T
).
Both solutions provide correct results. The choice between them might depend on specific database optimization strategies and potential index usage. The time and space complexities are similar, dominated by the window functions and the grouping/distinct operations. Both are efficient solutions for this problem.