{x}
blog image

The Number of Employees Which Report to Each Employee

Table: Employees

+-------------+----------+
| Column Name | Type     |
+-------------+----------+
| employee_id | int      |
| name        | varchar  |
| reports_to  | int      |
| age         | int      |
+-------------+----------+
employee_id is the column with unique values for this table.
This table contains information about the employees and the id of the manager they report to. Some employees do not report to anyone (reports_to is null). 

 

For this problem, we will consider a manager an employee who has at least 1 other employee reporting to them.

Write a solution to report the ids and the names of all managers, the number of employees who report directly to them, and the average age of the reports rounded to the nearest integer.

Return the result table ordered by employee_id.

The result format is in the following example.

 

Example 1:

Input: 
Employees table:
+-------------+---------+------------+-----+
| employee_id | name    | reports_to | age |
+-------------+---------+------------+-----+
| 9           | Hercy   | null       | 43  |
| 6           | Alice   | 9          | 41  |
| 4           | Bob     | 9          | 36  |
| 2           | Winston | null       | 37  |
+-------------+---------+------------+-----+
Output: 
+-------------+-------+---------------+-------------+
| employee_id | name  | reports_count | average_age |
+-------------+-------+---------------+-------------+
| 9           | Hercy | 2             | 39          |
+-------------+-------+---------------+-------------+
Explanation: Hercy has 2 people report directly to him, Alice and Bob. Their average age is (41+36)/2 = 38.5, which is 39 after rounding it to the nearest integer.

Example 2:

Input: 
Employees table:
+-------------+---------+------------+-----+ 
| employee_id | name    | reports_to | age |
|-------------|---------|------------|-----|
| 1           | Michael | null       | 45  |
| 2           | Alice   | 1          | 38  |
| 3           | Bob     | 1          | 42  |
| 4           | Charlie | 2          | 34  |
| 5           | David   | 2          | 40  |
| 6           | Eve     | 3          | 37  |
| 7           | Frank   | null       | 50  |
| 8           | Grace   | null       | 48  |
+-------------+---------+------------+-----+ 
Output: 
+-------------+---------+---------------+-------------+
| employee_id | name    | reports_count | average_age |
| ----------- | ------- | ------------- | ----------- |
| 1           | Michael | 2             | 40          |
| 2           | Alice   | 2             | 37          |
| 3           | Bob     | 1             | 37          |
+-------------+---------+---------------+-------------+

Solution Explanation: SQL Query for Manager Statistics

This problem requires retrieving manager information along with the number of direct reports and their average age. The solution uses a SQL query leveraging self-joins and aggregate functions.

Approach:

  1. Self-Join: The Employees table is self-joined using JOIN Employees AS e2 ON e1.reports_to = e2.employee_id. This links each employee (e1) to their manager (e2) based on the reports_to column. e1 represents the employees, while e2 represents their respective managers.

  2. Grouping: The GROUP BY 1 clause groups the results by the manager's employee_id (the first column in the SELECT statement). This allows us to aggregate data for each manager.

  3. Aggregation:

    • COUNT(1) counts the number of employees reporting to each manager (reports_count).
    • ROUND(AVG(e1.age)) calculates the average age of the employees reporting to each manager and rounds the result to the nearest integer (average_age).
  4. Filtering (Implicit): Managers are implicitly identified because only employees who have at least one subordinate will appear in the results after the GROUP BY operation.

  5. Ordering: ORDER BY 1 orders the final result set by employee_id as required.

Time Complexity Analysis:

The time complexity of this SQL query depends heavily on the database engine's optimization strategies and the size of the Employees table. However, it is generally considered to be O(N log N) or better due to the efficient indexing and algorithms employed by most database systems for JOIN, GROUP BY, and ORDER BY operations. In the worst-case scenario without optimal indexes, it might approach O(N^2), where N is the number of rows in the Employees table. Proper indexing on employee_id and reports_to columns is crucial for efficient query execution.

MySQL Code:

SELECT
    e2.employee_id,
    e2.name,
    COUNT(1) AS reports_count,
    ROUND(AVG(e1.age)) AS average_age
FROM
    Employees AS e1
    JOIN Employees AS e2 ON e1.reports_to = e2.employee_id
GROUP BY 1
ORDER BY 1;

Other SQL Dialects:

The query's structure is largely database-agnostic. Minor syntax adjustments might be necessary depending on the specific database system (e.g., column aliases might need adjustments). The core concepts of self-join, grouping, and aggregation remain the same. For instance, in PostgreSQL, you might use ROUND(AVG(e1.age)::numeric) for more precise rounding.

The provided solution efficiently solves the problem using standard SQL features and is generally adaptable to most SQL databases. The focus on proper indexing is key to maintaining good performance as the table size grows.