This problem requires processing two tables, Buses
and Passengers
, to determine how many passengers board each bus. The challenge lies in considering the arrival times of both buses and passengers, the bus capacity, and ensuring that passengers are only counted once (even if they arrive before multiple buses).
The solution uses a MySQL query with common table expressions (CTEs) to efficiently solve this problem. Here's a breakdown:
Union All and Sorting: We first combine the Buses
and Passengers
tables using UNION ALL
. The Passengers
table entries are given a dummy bus_id
of -1. This combined table is then sorted by arrival time (arrival_time
) and then by bus_id
. This ensures that we process passengers and buses in chronological order.
Cumulative Sum: A running sum (cur
) is calculated using the SUM() OVER (ORDER BY dt, bus_id)
window function. This tracks the total number of passengers and buses processed up to the current row.
Tracking Passengers and Buses: A variable @t
is used to keep track of the cumulative number of passengers that have boarded buses. If cur_sum
is greater than 0, it means the current bus already has passengers, otherwise, it is a new bus without passengers.
Calculating Passengers per Bus: The final SELECT
statement filters for rows representing buses (bus_id > 0
) and calculates the number of passengers (passengers_cnt
) for each bus. It subtracts the cumulative number of passengers already boarded (cur_sum
) from the bus capacity (cnt
) to get the number of passengers for the current bus.
WITH
T AS (
SELECT
*,
SUM(cnt) OVER (ORDER BY dt, bus_id) AS cur,
IF(@t > 0, @t := cnt, @t := @t + cnt) AS cur_sum
FROM
(
SELECT bus_id, arrival_time AS dt, capacity AS cnt FROM Buses
UNION ALL
SELECT -1, arrival_time AS dt, -1 FROM Passengers
) AS a JOIN (SELECT @t := 0 x) AS b
)
SELECT
bus_id,
IF(cur_sum > 0, cnt - cur_sum, cnt) AS passengers_cnt
FROM T
WHERE bus_id > 0
ORDER BY bus_id;
T
which performs the intermediate calculations.Buses
and Passengers
data, adding a dummy bus_id
for passengers.cnt
(capacity for buses, -1 for passengers).The time complexity of this solution is dominated by the sorting and window function operations within the CTE. The sorting step has a time complexity of O(N log N), where N is the total number of rows in the combined Buses
and Passengers
tables. The window function also adds to the complexity, but it's generally implemented efficiently in database systems. Therefore, the overall time complexity is O(N log N).
The space complexity is primarily determined by the size of the CTE T
, which stores the combined data with additional calculated columns. This space complexity is O(N).