{x}
blog image

Customers Who Never Order

Table: Customers

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| name        | varchar |
+-------------+---------+
id is the primary key (column with unique values) for this table.
Each row of this table indicates the ID and name of a customer.

 

Table: Orders

+-------------+------+
| Column Name | Type |
+-------------+------+
| id          | int  |
| customerId  | int  |
+-------------+------+
id is the primary key (column with unique values) for this table.
customerId is a foreign key (reference columns) of the ID from the Customers table.
Each row of this table indicates the ID of an order and the ID of the customer who ordered it.

 

Write a solution to find all customers who never order anything.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Customers table:
+----+-------+
| id | name  |
+----+-------+
| 1  | Joe   |
| 2  | Henry |
| 3  | Sam   |
| 4  | Max   |
+----+-------+
Orders table:
+----+------------+
| id | customerId |
+----+------------+
| 1  | 3          |
| 2  | 1          |
+----+------------+
Output: 
+-----------+
| Customers |
+-----------+
| Henry     |
| Max       |
+-----------+

Solution Explanation for Customers Who Never Order

This problem requires finding customers from a Customers table who have not placed any orders in an Orders table. We can achieve this using two primary SQL approaches: NOT IN and LEFT JOIN.

Approach 1: Using NOT IN

This approach first identifies all customerId values present in the Orders table. Then, it selects customers from the Customers table whose id is not present in that list.

MySQL Code:

SELECT name AS Customers
FROM Customers
WHERE
    id NOT IN (
        SELECT customerId
        FROM Orders
    );

Explanation:

  1. SELECT name AS Customers: This selects the name column from the Customers table and renames it to Customers for the output.
  2. FROM Customers: Specifies the Customers table as the source of data.
  3. WHERE id NOT IN (...): This is the core of the query. It filters the results to include only those customers whose id is not found within the subquery.
  4. (SELECT customerId FROM Orders): The subquery selects all customerId values from the Orders table. This represents the set of customers who have placed at least one order.

Python (Pandas) Code:

import pandas as pd
 
def find_customers(customers: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
    df = customers[~customers["id"].isin(orders["customerId"])]
    df = df[["name"]].rename(columns={"name": "Customers"})
    return df

Explanation:

  1. customers[~customers["id"].isin(orders["customerId"])]: This line uses Pandas' isin() function to check if each id in the customers DataFrame is present in the customerId column of the orders DataFrame. The ~ operator inverts the boolean result, selecting only those rows where the id is not found in orders["customerId"].
  2. df = df[["name"]].rename(columns={"name": "Customers"}): This selects only the name column and renames it to Customers for consistency with the SQL output.

Approach 2: Using LEFT JOIN

This approach uses a LEFT JOIN to combine the Customers and Orders tables. A LEFT JOIN returns all rows from the left table (Customers), even if there's no match in the right table (Orders). We then filter the results to include only rows where the customerId is NULL, indicating customers with no matching orders.

MySQL Code:

SELECT name AS Customers
FROM
    Customers AS c
    LEFT JOIN Orders AS o ON c.id = o.customerId
WHERE o.id IS NULL;

Explanation:

  1. SELECT name AS Customers: Selects the name column and renames it.
  2. FROM Customers AS c LEFT JOIN Orders AS o ON c.id = o.customerId: Performs a LEFT JOIN between Customers (aliased as c) and Orders (aliased as o) based on the id and customerId columns.
  3. WHERE o.id IS NULL: Filters the results to keep only rows where the id from the Orders table is NULL. This signifies that no matching order exists for that customer.

Time Complexity Analysis

Both approaches have similar time complexities.

  • NOT IN Approach: The subquery (SELECT customerId FROM Orders) takes O(N) time, where N is the number of rows in the Orders table. The outer query then iterates through the Customers table (O(M), where M is the number of rows in the Customers table), performing a lookup in the result set of the subquery for each customer (O(N) in the worst case). Therefore, the overall complexity is approximately O(M*N). However, database optimizers often significantly improve this performance in practice.

  • LEFT JOIN Approach: A LEFT JOIN operation generally has a time complexity of O(M*N) in the worst case, where M and N are the number of rows in the Customers and Orders tables, respectively. Again, database optimization can improve this significantly.

In summary, while the theoretical worst-case complexity is O(M*N) for both approaches, database systems employ optimizations that usually result in much faster execution times than a naive implementation would suggest. The choice between NOT IN and LEFT JOIN often depends on personal preference and the specific database system's capabilities. LEFT JOIN is often preferred for its readability and generally better performance in many database systems.