{x}
blog image

User Activity for the Past 30 Days I

Table: Activity

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| session_id    | int     |
| activity_date | date    |
| activity_type | enum    |
+---------------+---------+
This table may have duplicate rows.
The activity_type column is an ENUM (category) of type ('open_session', 'end_session', 'scroll_down', 'send_message').
The table shows the user activities for a social media website. 
Note that each session belongs to exactly one user.

 

Write a solution to find the daily active user count for a period of 30 days ending 2019-07-27 inclusively. A user was active on someday if they made at least one activity on that day.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Activity table:
+---------+------------+---------------+---------------+
| user_id | session_id | activity_date | activity_type |
+---------+------------+---------------+---------------+
| 1       | 1          | 2019-07-20    | open_session  |
| 1       | 1          | 2019-07-20    | scroll_down   |
| 1       | 1          | 2019-07-20    | end_session   |
| 2       | 4          | 2019-07-20    | open_session  |
| 2       | 4          | 2019-07-21    | send_message  |
| 2       | 4          | 2019-07-21    | end_session   |
| 3       | 2          | 2019-07-21    | open_session  |
| 3       | 2          | 2019-07-21    | send_message  |
| 3       | 2          | 2019-07-21    | end_session   |
| 4       | 3          | 2019-06-25    | open_session  |
| 4       | 3          | 2019-06-25    | end_session   |
+---------+------------+---------------+---------------+
Output: 
+------------+--------------+ 
| day        | active_users |
+------------+--------------+ 
| 2019-07-20 | 2            |
| 2019-07-21 | 2            |
+------------+--------------+ 
Explanation: Note that we do not care about days with zero active users.

Solution Explanation for LeetCode 1141: User Activity for the Past 30 Days I

This problem requires retrieving the daily active user count for the 30 days leading up to and including 2019-07-27. A user is considered active on a day if they performed at least one activity.

The solution uses SQL queries to efficiently process the Activity table.

Approach:

  1. Filtering the Data: We first filter the Activity table to include only the activities that occurred within the 30-day period ending on 2019-07-27. This is achieved using a WHERE clause with DATEDIFF function in MySQL which calculates the difference between '2019-07-27' and the activity_date. We only include entries where the difference is less than 30 days.

  2. Grouping by Day: The GROUP BY clause groups the results by activity_date. This allows us to count distinct users for each day.

  3. Counting Active Users: The COUNT(DISTINCT user_id) function counts the number of unique users for each day. DISTINCT ensures that each user is counted only once per day, even if they have multiple activities.

  4. Result Formatting: The SELECT clause selects the activity_date as day and the count of distinct users as active_users.

MySQL Code:

SELECT activity_date AS day, COUNT(DISTINCT user_id) AS active_users
FROM Activity
WHERE activity_date <= '2019-07-27' AND DATEDIFF('2019-07-27', activity_date) < 30
GROUP BY 1;

Time Complexity Analysis:

The time complexity of this SQL query is dominated by the GROUP BY and COUNT(DISTINCT) operations. In the worst case, where all entries are within the 30-day window, the time complexity would be O(N log N) or potentially O(N) depending on the database's optimization techniques, where N is the number of rows in the Activity table. This is because sorting or hashing might be needed to group and count distinct users efficiently. However, for practical purposes with relatively small datasets, the performance will be quite fast.

Space Complexity Analysis:

The space complexity is primarily determined by the intermediate results generated during the GROUP BY and COUNT(DISTINCT) operations. In the worst case, this could be O(N) to store the grouped data and counts, where N is the number of rows in the Activity table. Again, the actual space usage in practice will depend on the database's optimization strategies. The output size (the result table) will be at most 30 rows (one for each day), so this space is considered relatively small and constant.