Table: Employees
+-------------+----------+ | Column Name | Type | +-------------+----------+ | employee_id | int | | name | varchar | | manager_id | int | | salary | int | +-------------+----------+ In SQL, employee_id is the primary key for this table. This table contains information about the employees, their salary, and the ID of their manager. Some employees do not have a manager (manager_id is null).
Find the IDs of the employees whose salary is strictly less than $30000
and whose manager left the company. When a manager leaves the company, their information is deleted from the Employees
table, but the reports still have their manager_id
set to the manager that left.
Return the result table ordered by employee_id
.
The result format is in the following example.
Example 1:
Input: Employees table: +-------------+-----------+------------+--------+ | employee_id | name | manager_id | salary | +-------------+-----------+------------+--------+ | 3 | Mila | 9 | 60301 | | 12 | Antonella | null | 31000 | | 13 | Emery | null | 67084 | | 1 | Kalel | 11 | 21241 | | 9 | Mikaela | null | 50937 | | 11 | Joziah | 6 | 28485 | +-------------+-----------+------------+--------+ Output: +-------------+ | employee_id | +-------------+ | 11 | +-------------+ Explanation: The employees with a salary less than $30000 are 1 (Kalel) and 11 (Joziah). Kalel's manager is employee 11, who is still in the company (Joziah). Joziah's manager is employee 6, who left the company because there is no row for employee 6 as it was deleted.
This problem requires finding employees who meet two conditions: their salary is less than $30,000, and their manager is no longer in the company (meaning their manager_id
exists in the Employees
table but the corresponding employee record is missing).
Two solutions are provided: one using a LEFT JOIN
and another using a subquery. Both achieve the same result but differ in their approach.
This approach uses a LEFT JOIN
to connect the Employees
table to itself. We join the table on e1.manager_id = e2.employee_id
. This effectively links each employee (e1
) to their manager (e2
). A LEFT JOIN
ensures that all rows from the left table (e1
) are included, even if there's no matching row in the right table (e2
).
The WHERE
clause filters the results:
e1.salary < 30000
: Selects employees with salaries below $30,000.e1.manager_id IS NOT NULL
: Ensures the employee has a manager.e2.employee_id IS NULL
: This is the key condition. It identifies cases where a manager_id
exists in e1
but there is no corresponding entry in e2
, indicating the manager has left the company.Finally, ORDER BY 1
sorts the results by employee_id
.
MySQL Code:
SELECT e1.employee_id
FROM
Employees AS e1
LEFT JOIN Employees AS e2 ON e1.manager_id = e2.employee_id
WHERE e1.salary < 30000 AND e1.manager_id IS NOT NULL AND e2.employee_id IS NULL
ORDER BY 1;
This solution uses a subquery to first identify the employee_id
s of managers who have left the company. The subquery (SELECT employee_id FROM Employees)
returns all existing employee_id
s. The outer query then filters for employees whose salary is less than $30,000 and whose manager_id
is not in the set of existing employee_id
s (meaning their manager is not in the company anymore).
MySQL Code:
SELECT employee_id
FROM Employees
WHERE salary < 30000 AND manager_id NOT IN (SELECT employee_id FROM Employees)
ORDER BY 1;
Both solutions have similar time complexities.
Solution 1 (LEFT JOIN): The LEFT JOIN
operation has a time complexity that depends on the database's implementation but is generally considered to be O(N log N) or O(N*M) in the worst case for large tables (N and M being the number of rows in the respective tables). The filtering operation in the WHERE
clause is linear, O(N), where N is the number of rows after the join.
Solution 2 (Subquery): The subquery has a time complexity of O(N) to select all employee_id
s. The outer query then performs a linear scan, O(M), where M is the number of rows in the Employees
table. The NOT IN
operation itself can be computationally expensive, sometimes even causing performance degradation compared to a LEFT JOIN
.
In practice, the performance difference might depend on the database's optimizer and the size of the data. Often, LEFT JOIN
s are optimized better than NOT IN
subqueries with large datasets.
Therefore, while both are theoretically O(N), Solution 1 (using LEFT JOIN
) is generally preferred for its clarity and often better performance with larger datasets in practice.