This problem involves two tables: Tasks
and Executed
. Tasks
defines the total number of subtasks for each task, while Executed
records which subtasks have been completed. The goal is to identify and report the subtasks that were not executed for each task.
The most efficient approach uses a recursive CTE to generate all possible subtasks for each task and then performs a left join with the Executed
table to find the missing ones.
1. Recursive CTE (T):
The CTE T
recursively generates all subtask IDs for each task.
Base Case: It starts by selecting task_id
and subtasks_count
from the Tasks
table. subtasks_count
initially represents the highest subtask ID for that task.
Recursive Step: It then recursively subtracts 1 from subtask_id
until it reaches 1. This generates all subtask IDs (from 1 to subtasks_count
) for each task.
2. Left Join:
A LEFT JOIN
is performed between the CTE T
(containing all possible subtask IDs) and the Executed
table. This join uses task_id
and subtask_id
as the join keys.
3. Filtering Missing Subtasks:
Finally, a WHERE
clause filters the results. Executed.subtask_id IS NULL
selects only those rows where a corresponding entry is not found in the Executed
table, indicating that the subtask was not executed.
WITH RECURSIVE
T(task_id, subtask_id) AS (
SELECT
task_id,
subtasks_count
FROM Tasks
UNION ALL
SELECT
task_id,
subtask_id - 1
FROM t
WHERE subtask_id > 1
)
SELECT
T.*
FROM
T
LEFT JOIN Executed USING (task_id, subtask_id)
WHERE Executed.subtask_id IS NULL;
Recursive CTE: The time complexity of generating the CTE T
is O(N*M), where N is the number of tasks and M is the maximum number of subtasks for any single task (in this case, M is capped at 20). This is because each task may require up to M recursive steps.
Left Join: The left join has a time complexity of O(N*M) in the worst case (if every subtask from T is present in the Executed table) because of its nested-loops nature. However, the average performance can be better due to the use of indexes on task_id
and subtask_id
.
Filtering: The filtering step has a linear time complexity O(N*M) with respect to the size of the intermediate result of the left join.
Therefore, the overall time complexity of the solution is dominated by the CTE generation and left join which is approximately O(N*M). Since M is a small constant (20), we can consider the complexity to be effectively linear with respect to the number of tasks (N).
The space complexity is determined primarily by the size of the intermediate results generated by the CTE and the left join. In the worst case, the CTE could generate NM rows, and the left join could also result in an intermediate table of similar size. Therefore, the space complexity is also approximately **O(NM)**. Again, considering M as a small constant, the space complexity can be considered effectively linear with respect to N.