{x}
blog image

Game Play Analysis V

Solution Explanation for LeetCode 1097: Game Play Analysis V

This problem requires analyzing player activity data to calculate the day-one retention rate for each installation date. The solution involves finding the installation date for each player and then determining the proportion of players who returned the day after installation.

Approach

The core idea is to use a window function to determine each player's installation date (the first date they logged in) and then group the data by installation date to calculate the retention rate.

  1. Find Installation Date: A common table expression (CTE) is used to determine the installation date for each player. The MIN(event_date) OVER (PARTITION BY player_id) clause finds the minimum event_date for each player, representing their installation date. This is assigned to a new column install_dt.

  2. Calculate Retention: The main query then groups the data by install_dt. For each installation date:

    • COUNT(DISTINCT player_id) counts the total number of players who installed the game on that day (installs).
    • SUM(DATEDIFF(event_date, install_dt) = 1) counts the number of players who logged in exactly one day after their installation date. DATEDIFF calculates the difference in days.
    • The retention rate is calculated by dividing the number of returning players by the total number of installs and rounded to two decimal places using ROUND.

MySQL Code and Explanation

WITH
    T AS (
        SELECT
            player_id,
            event_date,
            MIN(event_date) OVER (PARTITION BY player_id) AS install_dt
        FROM Activity
    )
SELECT
    install_dt,
    COUNT(DISTINCT player_id) AS installs,
    ROUND(
        SUM(DATEDIFF(event_date, install_dt) = 1) / COUNT(DISTINCT player_id),
        2
    ) AS day1_retention
FROM T
GROUP BY 1;
  • WITH T AS (...): This defines a CTE named T. CTEs are useful for breaking down complex queries into smaller, more manageable parts.
  • SELECT player_id, event_date, MIN(event_date) OVER (PARTITION BY player_id) AS install_dt FROM Activity: This selects the player ID, event date, and calculates the installation date using the window function. PARTITION BY player_id ensures that the MIN function is applied separately to each player's events.
  • SELECT install_dt, COUNT(DISTINCT player_id) AS installs, ... FROM T GROUP BY 1: This is the main query that uses the CTE T. It groups the results by install_dt and calculates the total installs and day-one retention for each installation date. GROUP BY 1 is a shorthand for GROUP BY install_dt.

Time Complexity Analysis

The time complexity of this solution is dominated by the window function and the GROUP BY operation. The window function has a time complexity of O(N log N) in the worst case (where N is the number of rows in the Activity table), and the GROUP BY operation also has a time complexity of O(N log N) in the worst case (due to sorting). Therefore, the overall time complexity is O(N log N). The space complexity is O(N) due to the CTE and intermediate results. In practice, optimized database engines might achieve better performance.