{x}
blog image

Department Highest Salary

Table: Employee

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| id           | int     |
| name         | varchar |
| salary       | int     |
| departmentId | int     |
+--------------+---------+
id is the primary key (column with unique values) for this table.
departmentId is a foreign key (reference columns) of the ID from the Department table.
Each row of this table indicates the ID, name, and salary of an employee. It also contains the ID of their department.

 

Table: Department

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| name        | varchar |
+-------------+---------+
id is the primary key (column with unique values) for this table. It is guaranteed that department name is not NULL.
Each row of this table indicates the ID of a department and its name.

 

Write a solution to find employees who have the highest salary in each of the departments.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Employee table:
+----+-------+--------+--------------+
| id | name  | salary | departmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Jim   | 90000  | 1            |
| 3  | Henry | 80000  | 2            |
| 4  | Sam   | 60000  | 2            |
| 5  | Max   | 90000  | 1            |
+----+-------+--------+--------------+
Department table:
+----+-------+
| id | name  |
+----+-------+
| 1  | IT    |
| 2  | Sales |
+----+-------+
Output: 
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Jim      | 90000  |
| Sales      | Henry    | 80000  |
| IT         | Max      | 90000  |
+------------+----------+--------+
Explanation: Max and Jim both have the highest salary in the IT department and Henry has the highest salary in the Sales department.

Solution Explanation for LeetCode 184: Department Highest Salary

This problem requires finding the employees with the highest salary in each department. Two SQL solutions are provided: one using a subquery and another using window functions. Both achieve the same result but differ in their approach and efficiency.

Solution 1: Equi-Join + Subquery

This approach uses a subquery to find the maximum salary for each department and then joins this result with the Employee and Department tables to retrieve the employee information.

Steps:

  1. Inner Join: The Employee and Department tables are joined using an INNER JOIN based on Employee.departmentId = Department.id. This combines data from both tables based on matching department IDs.

  2. Subquery: A subquery (SELECT departmentId, MAX(salary) FROM Employee GROUP BY 1) is used. This subquery groups the Employee table by departmentId and finds the maximum salary (MAX(salary)) for each department. GROUP BY 1 is a shorthand for GROUP BY departmentId.

  3. Filtering: The WHERE clause filters the joined result to include only those rows where the (departmentId, salary) tuple matches the result from the subquery. This ensures that only employees with the maximum salary in their respective departments are selected.

  4. Result Selection: The SELECT statement retrieves the department name (d.name), employee name (e.name), and salary.

MySQL Code:

SELECT d.name AS department, e.name AS employee, salary
FROM
    Employee AS e
    JOIN Department AS d ON e.departmentId = d.id
WHERE
    (d.id, salary) IN (
        SELECT departmentId, MAX(salary)
        FROM Employee
        GROUP BY 1
    );

Time Complexity: The subquery has a time complexity of O(N log N) due to the sorting involved in finding the maximum salary for each department (N being the number of employees). The outer join is O(M log M) where M is the total number of rows after the join. Overall, the complexity is dominated by the sorting in the subquery.

Space Complexity: The space complexity is O(D), where D is the number of distinct departments. This is because the subquery needs to store the maximum salary for each department.

Solution 2: Equi-Join + Window Function

This approach uses a window function (RANK()) to assign a rank to each employee within their department based on their salary. The highest-paid employee in each department receives rank 1.

Steps:

  1. Common Table Expression (CTE): A CTE T is created to perform the join and ranking.

  2. Inner Join: Similar to Solution 1, an INNER JOIN combines Employee and Department tables.

  3. Window Function: The RANK() window function assigns a rank to each employee based on salary within each department (PARTITION BY d.name). Employees with the same salary get the same rank. ORDER BY salary DESC ensures that higher salaries get lower ranks (rank 1 being the highest).

  4. Filtering: The outer query selects only those rows from the CTE where the rank (rk) is 1.

  5. Result Selection: The SELECT statement retrieves the department name, employee name, and salary for those with rank 1.

MySQL Code:

WITH
    T AS (
        SELECT
            d.name AS department,
            e.name AS employee,
            salary,
            RANK() OVER (
                PARTITION BY d.name
                ORDER BY salary DESC
            ) AS rk
        FROM
            Employee AS e
            JOIN Department AS d ON e.departmentId = d.id
    )
SELECT department, employee, salary
FROM T
WHERE rk = 1;

Time Complexity: The join operation takes O(N log N) time. The RANK() window function also takes O(N log N) time in the worst case (although implementations may optimize this). Overall, the time complexity is O(N log N).

Space Complexity: The space complexity is O(N) because the CTE needs to store all the employee and department information with the added rank.

Comparison:

Both solutions provide correct results. Solution 2 (using window functions) is generally preferred for its better readability and often better performance, especially in database systems optimized for window functions. Solution 1 can be less efficient, particularly with very large datasets, because of the subquery's need to repeatedly find the maximum for each department. The specific performance difference can depend on the database system and its query optimizer.