{x}
blog image

Nth Highest Salary

Table: Employee

+-------------+------+
| Column Name | Type |
+-------------+------+
| id          | int  |
| salary      | int  |
+-------------+------+
id is the primary key (column with unique values) for this table.
Each row of this table contains information about the salary of an employee.

 

Write a solution to find the nth highest salary from the Employee table. If there is no nth highest salary, return null.

The result format is in the following example.

 

Example 1:

Input: 
Employee table:
+----+--------+
| id | salary |
+----+--------+
| 1  | 100    |
| 2  | 200    |
| 3  | 300    |
+----+--------+
n = 2
Output: 
+------------------------+
| getNthHighestSalary(2) |
+------------------------+
| 200                    |
+------------------------+

Example 2:

Input: 
Employee table:
+----+--------+
| id | salary |
+----+--------+
| 1  | 100    |
+----+--------+
n = 2
Output: 
+------------------------+
| getNthHighestSalary(2) |
+------------------------+
| null                   |
+------------------------+

Solution Explanation for Nth Highest Salary

This problem requires finding the Nth highest salary from an Employee table containing employee IDs and salaries. The solution needs to handle cases where the Nth highest salary doesn't exist.

Approach

The core idea is to sort the unique salaries in descending order and then select the Nth element. If there are fewer than N unique salaries, we return NULL (or a representation of NULL in the specific language).

SQL Solution (MySQL)

The MySQL solution leverages SQL's built-in capabilities for sorting and limiting results:

CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT
BEGIN
    SET N = N - 1; -- Adjust N because LIMIT uses 0-based indexing
  RETURN (
      SELECT DISTINCT salary -- Select distinct salaries to avoid duplicates
      FROM Employee
      ORDER BY salary DESC -- Order in descending order
      LIMIT 1 OFFSET N -- Limit to 1 row, offset by N-1 to get the Nth highest
  );
END
  • CREATE FUNCTION getNthHighestSalary(N INT) RETURNS INT: This defines a function that takes an integer N as input and returns an integer representing the Nth highest salary.
  • SET N = N - 1;: This line adjusts the input N because the LIMIT clause in MySQL uses 0-based indexing. If we want the 2nd highest salary, we need to offset by 1.
  • SELECT DISTINCT salary ...: This selects the unique salaries to handle duplicate salary values.
  • ORDER BY salary DESC: This sorts the salaries in descending order.
  • LIMIT 1 OFFSET N: This limits the result set to one row, offsetting by N to retrieve the Nth highest salary. If no Nth highest salary exists, this returns an empty result set, which is handled implicitly by the function's RETURN statement. An empty result set translates to NULL.

Time Complexity: O(M log M), where M is the number of unique salaries. Sorting dominates the time complexity. If an index exists on the salary column, this could be reduced. Space Complexity: O(M) in the worst case, to store the sorted unique salaries if no index is used.

Python Solution (using Pandas)

The Python solution uses the Pandas library for efficient data manipulation:

import pandas as pd
import numpy as np
 
def nth_highest_salary(employee: pd.DataFrame, N: int) -> pd.DataFrame:
    unique_salaries = employee.salary.unique()
    if len(unique_salaries) < N:
        return pd.DataFrame([np.NaN], columns=[f"getNthHighestSalary({N})"])
    else:
        salary = sorted(unique_salaries, reverse=True)[N - 1]
        return pd.DataFrame([salary], columns=[f"getNthHighestSalary({N})"])
  • unique_salaries = employee.salary.unique(): This extracts the unique salaries from the Pandas DataFrame.
  • if len(unique_salaries) < N:: This checks if there are fewer than N unique salaries; if so, it returns a DataFrame with NaN representing NULL.
  • salary = sorted(unique_salaries, reverse=True)[N - 1]: This sorts the unique salaries in descending order and selects the (N-1)th element (again, because of 0-based indexing).
  • return pd.DataFrame([salary], columns=[f"getNthHighestSalary({N})"]): This creates a Pandas DataFrame with the Nth highest salary and a specific column name for consistency with the problem's output format.

Time Complexity: O(M log M), where M is the number of unique salaries, due to the sorting operation. Space Complexity: O(M) to store the unique salaries.

Both solutions efficiently solve the problem, with the SQL solution being more concise and potentially faster if the database is optimized for the query. The Python solution offers more flexibility if you're working with data outside of a relational database.