{x}
blog image

The Number of Passengers in Each Bus II

Solution Explanation: The Number of Passengers in Each Bus II

This problem requires processing two tables, Buses and Passengers, to determine how many passengers board each bus. The challenge lies in considering the arrival times of both buses and passengers, the bus capacity, and ensuring that passengers are only counted once (even if they arrive before multiple buses).

Approach

The solution uses a MySQL query with common table expressions (CTEs) to efficiently solve this problem. Here's a breakdown:

  1. Union All and Sorting: We first combine the Buses and Passengers tables using UNION ALL. The Passengers table entries are given a dummy bus_id of -1. This combined table is then sorted by arrival time (arrival_time) and then by bus_id. This ensures that we process passengers and buses in chronological order.

  2. Cumulative Sum: A running sum (cur) is calculated using the SUM() OVER (ORDER BY dt, bus_id) window function. This tracks the total number of passengers and buses processed up to the current row.

  3. Tracking Passengers and Buses: A variable @t is used to keep track of the cumulative number of passengers that have boarded buses. If cur_sum is greater than 0, it means the current bus already has passengers, otherwise, it is a new bus without passengers.

  4. Calculating Passengers per Bus: The final SELECT statement filters for rows representing buses (bus_id > 0) and calculates the number of passengers (passengers_cnt) for each bus. It subtracts the cumulative number of passengers already boarded (cur_sum) from the bus capacity (cnt) to get the number of passengers for the current bus.

MySQL Code Explained

WITH
    T AS (
        SELECT
            *,
            SUM(cnt) OVER (ORDER BY dt, bus_id) AS cur,
            IF(@t > 0, @t := cnt, @t := @t + cnt) AS cur_sum
        FROM
            (
                SELECT bus_id, arrival_time AS dt, capacity AS cnt FROM Buses
                UNION ALL
                SELECT -1, arrival_time AS dt, -1 FROM Passengers
            ) AS a JOIN (SELECT @t := 0 x) AS b
    )
SELECT
    bus_id,
    IF(cur_sum > 0, cnt - cur_sum, cnt) AS passengers_cnt
FROM T
WHERE bus_id > 0
ORDER BY bus_id;
  • WITH T AS (...): This defines a CTE named T which performs the intermediate calculations.
  • UNION ALL: Combines Buses and Passengers data, adding a dummy bus_id for passengers.
  • SUM() OVER (...): Calculates the running sum of cnt (capacity for buses, -1 for passengers).
  • @t: User-defined variable to track the accumulated number of passengers.
  • IF(@t > 0, ...): Conditional logic to handle passengers getting on the bus efficiently.
  • SELECT ... FROM T WHERE bus_id > 0: Filters for bus data and calculates the number of passengers on each bus.
  • ORDER BY bus_id: Sorts the final result by bus ID.

Time Complexity Analysis

The time complexity of this solution is dominated by the sorting and window function operations within the CTE. The sorting step has a time complexity of O(N log N), where N is the total number of rows in the combined Buses and Passengers tables. The window function also adds to the complexity, but it's generally implemented efficiently in database systems. Therefore, the overall time complexity is O(N log N).

The space complexity is primarily determined by the size of the CTE T, which stores the combined data with additional calculated columns. This space complexity is O(N).