Table: Customers
+-------------+---------+ | Column Name | Type | +-------------+---------+ | id | int | | name | varchar | +-------------+---------+ id is the primary key (column with unique values) for this table. Each row of this table indicates the ID and name of a customer.
Table: Orders
+-------------+------+ | Column Name | Type | +-------------+------+ | id | int | | customerId | int | +-------------+------+ id is the primary key (column with unique values) for this table. customerId is a foreign key (reference columns) of the ID from the Customers table. Each row of this table indicates the ID of an order and the ID of the customer who ordered it.
Write a solution to find all customers who never order anything.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Customers table: +----+-------+ | id | name | +----+-------+ | 1 | Joe | | 2 | Henry | | 3 | Sam | | 4 | Max | +----+-------+ Orders table: +----+------------+ | id | customerId | +----+------------+ | 1 | 3 | | 2 | 1 | +----+------------+ Output: +-----------+ | Customers | +-----------+ | Henry | | Max | +-----------+
This problem requires finding customers from a Customers
table who have not placed any orders in an Orders
table. We can achieve this using two primary SQL approaches: NOT IN
and LEFT JOIN
.
NOT IN
This approach first identifies all customerId
values present in the Orders
table. Then, it selects customers from the Customers
table whose id
is not present in that list.
MySQL Code:
SELECT name AS Customers
FROM Customers
WHERE
id NOT IN (
SELECT customerId
FROM Orders
);
Explanation:
SELECT name AS Customers
: This selects the name
column from the Customers
table and renames it to Customers
for the output.FROM Customers
: Specifies the Customers
table as the source of data.WHERE id NOT IN (...)
: This is the core of the query. It filters the results to include only those customers whose id
is not found within the subquery.(SELECT customerId FROM Orders)
: The subquery selects all customerId
values from the Orders
table. This represents the set of customers who have placed at least one order.Python (Pandas) Code:
import pandas as pd
def find_customers(customers: pd.DataFrame, orders: pd.DataFrame) -> pd.DataFrame:
df = customers[~customers["id"].isin(orders["customerId"])]
df = df[["name"]].rename(columns={"name": "Customers"})
return df
Explanation:
customers[~customers["id"].isin(orders["customerId"])]
: This line uses Pandas' isin()
function to check if each id
in the customers
DataFrame is present in the customerId
column of the orders
DataFrame. The ~
operator inverts the boolean result, selecting only those rows where the id
is not found in orders["customerId"]
.df = df[["name"]].rename(columns={"name": "Customers"})
: This selects only the name
column and renames it to Customers
for consistency with the SQL output.LEFT JOIN
This approach uses a LEFT JOIN
to combine the Customers
and Orders
tables. A LEFT JOIN
returns all rows from the left table (Customers
), even if there's no match in the right table (Orders
). We then filter the results to include only rows where the customerId
is NULL
, indicating customers with no matching orders.
MySQL Code:
SELECT name AS Customers
FROM
Customers AS c
LEFT JOIN Orders AS o ON c.id = o.customerId
WHERE o.id IS NULL;
Explanation:
SELECT name AS Customers
: Selects the name
column and renames it.FROM Customers AS c LEFT JOIN Orders AS o ON c.id = o.customerId
: Performs a LEFT JOIN
between Customers
(aliased as c
) and Orders
(aliased as o
) based on the id
and customerId
columns.WHERE o.id IS NULL
: Filters the results to keep only rows where the id
from the Orders
table is NULL
. This signifies that no matching order exists for that customer.Both approaches have similar time complexities.
NOT IN
Approach: The subquery (SELECT customerId FROM Orders
) takes O(N) time, where N is the number of rows in the Orders
table. The outer query then iterates through the Customers
table (O(M), where M is the number of rows in the Customers
table), performing a lookup in the result set of the subquery for each customer (O(N) in the worst case). Therefore, the overall complexity is approximately O(M*N). However, database optimizers often significantly improve this performance in practice.
LEFT JOIN
Approach: A LEFT JOIN
operation generally has a time complexity of O(M*N) in the worst case, where M and N are the number of rows in the Customers
and Orders
tables, respectively. Again, database optimization can improve this significantly.
In summary, while the theoretical worst-case complexity is O(M*N) for both approaches, database systems employ optimizations that usually result in much faster execution times than a naive implementation would suggest. The choice between NOT IN
and LEFT JOIN
often depends on personal preference and the specific database system's capabilities. LEFT JOIN
is often preferred for its readability and generally better performance in many database systems.