{x}
blog image

Customers Who Bought All Products

Table: Customer

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| customer_id | int     |
| product_key | int     |
+-------------+---------+
This table may contain duplicates rows. 
customer_id is not NULL.
product_key is a foreign key (reference column) to Product table.

 

Table: Product

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| product_key | int     |
+-------------+---------+
product_key is the primary key (column with unique values) for this table.

 

Write a solution to report the customer ids from the Customer table that bought all the products in the Product table.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Customer table:
+-------------+-------------+
| customer_id | product_key |
+-------------+-------------+
| 1           | 5           |
| 2           | 6           |
| 3           | 5           |
| 3           | 6           |
| 1           | 6           |
+-------------+-------------+
Product table:
+-------------+
| product_key |
+-------------+
| 5           |
| 6           |
+-------------+
Output: 
+-------------+
| customer_id |
+-------------+
| 1           |
| 3           |
+-------------+
Explanation: 
The customers who bought all the products (5 and 6) are customers with IDs 1 and 3.

Solution Explanation: Finding Customers Who Bought All Products

The problem requires identifying customers who have purchased every product listed in the Product table. The solution leverages SQL's grouping and aggregation capabilities to achieve this efficiently.

Approach:

The core idea is to:

  1. Count distinct products per customer: Group the Customer table by customer_id and count the unique product_key values for each customer. This gives us the number of distinct products each customer bought.

  2. Count total distinct products: Determine the total number of distinct products available using a subquery on the Product table. This is the benchmark against which we compare each customer's purchases.

  3. Filter customers: Compare the count of distinct products bought by each customer (from step 1) with the total number of distinct products (from step 2). Only customers whose purchase count matches the total product count have bought all products and should be included in the result.

SQL Solution (MySQL):

SELECT customer_id
FROM Customer
GROUP BY 1
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(*) FROM Product);

Explanation:

  • SELECT customer_id: This selects the customer_id column to be included in the output.
  • FROM Customer: This specifies the Customer table as the source of data.
  • GROUP BY 1: This groups the rows by the first column in the SELECT list (customer_id), so we get counts of distinct products for each customer.
  • HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(*) FROM Product): This is the crucial filtering step.
    • COUNT(DISTINCT product_key) counts the number of distinct products each customer purchased within their group.
    • (SELECT COUNT(*) FROM Product) is a subquery that calculates the total number of unique products in the Product table.
    • The HAVING clause filters the grouped results, keeping only groups (customers) where the number of distinct products they bought equals the total number of products.

Time Complexity Analysis:

The time complexity is dominated by the grouping and aggregation operations. In general, these operations have a time complexity of O(N log N) or O(N), where N is the number of rows in the Customer table, depending on the database engine's optimization strategies. The subquery to count products in the Product table is a separate operation with complexity proportional to the size of the Product table (let's call it M). The overall complexity would be O(max(N log N, M)). If the Product table is relatively small, the overall complexity is effectively O(N log N) or O(N).

Space Complexity Analysis:

The space complexity is determined by the intermediate data structures used during the grouping and aggregation. The space used will be proportional to the number of unique customers and distinct products, which is likely to be less than or equal to the size of the Customer table. Therefore, the space complexity is considered to be O(N) in the worst case. In practice, this will likely be less than O(N) as not all customers will have bought all the products.