This problem requires finding users who made two purchases within seven days of each other. The optimal approach uses window functions in SQL to efficiently compare purchase dates for each user.
Window Function: We employ the LAG()
window function. LAG(purchase_date, 1) OVER (PARTITION BY user_id ORDER BY purchase_date)
fetches the previous purchase date for each user, ordered chronologically. PARTITION BY user_id
ensures that the LAG()
function operates independently for each user.
Date Difference: DATEDIFF(purchase_date, LAG(purchase_date, 1) OVER (PARTITION BY user_id ORDER BY purchase_date))
calculates the difference in days between the current purchase date and the previous purchase date for each user.
Filtering: The WHERE d <= 7
clause filters the results, retaining only those users with a date difference of 7 days or less between consecutive purchases.
Distinct Users: SELECT DISTINCT user_id
ensures that each user ID appears only once in the output, even if they have multiple purchase pairs within the 7-day window.
Ordering: ORDER BY user_id
sorts the final result by user ID.
WITH
t AS (
SELECT
user_id,
DATEDIFF(
purchase_date,
LAG(purchase_date, 1) OVER (
PARTITION BY user_id
ORDER BY purchase_date
)
) AS d
FROM Purchases
)
SELECT DISTINCT user_id
FROM t
WHERE d <= 7
ORDER BY user_id;
The time complexity is dominated by the window function operations. Window functions typically have a time complexity of O(N log N) or O(N) depending on the specific database implementation and the size of the input data N
being the number of rows in the Purchases
table. The subsequent filtering and sorting steps add minimal overhead compared to the window function calculations. Therefore, the overall time complexity is approximately O(N log N) or O(N). The space complexity is also O(N) in the worst case due to the creation of intermediate tables in the CTE (Common Table Expression).