This problem requires finding businesses with more than one event type where the number of occurrences exceeds the average occurrences for that event type. We'll examine two SQL solutions.
This approach uses a subquery to calculate the average occurrences for each event type and then joins it with the original Events
table.
MySQL Code:
SELECT business_id
FROM
EVENTS AS t1
JOIN (
SELECT
event_type,
AVG(occurences) AS occurences
FROM EVENTS
GROUP BY event_type
) AS t2
ON t1.event_type = t2.event_type
WHERE t1.occurences > t2.occurences
GROUP BY business_id
HAVING COUNT(1) > 1;
Explanation:
Inner Query (t2): This selects the event_type
and calculates the average occurrences
for each event_type
using AVG(occurrences)
. GROUP BY event_type
ensures the average is calculated separately for each event type.
Outer Query (t1): This joins the Events
table (aliased as t1
) with the results of the inner query (t2
) using ON t1.event_type = t2.event_type
. This ensures we compare each business's occurrences to the average for its corresponding event type.
WHERE
Clause: This filters the results to include only those businesses where the occurrences
are strictly greater than the average calculated in the inner query (t1.occurences > t2.occurences
).
GROUP BY business_id
: This groups the results by business_id
to count the number of event types that meet the criteria for each business.
HAVING COUNT(1) > 1
: This filters the results to include only businesses that have more than one event type satisfying the condition in the WHERE
clause (i.e., active businesses).
Time Complexity: The time complexity is dominated by the GROUP BY
and JOIN
operations. In a well-optimized database system, these operations typically have a complexity of O(N log N) or potentially O(N) with appropriate indexing, where N is the number of rows in the Events
table.
This solution leverages window functions for a more concise and potentially more efficient approach.
MySQL Code:
WITH
T AS (
SELECT
business_id,
occurences > AVG(occurences) OVER (PARTITION BY event_type) AS mark
FROM Events
)
SELECT business_id
FROM T
WHERE mark = 1
GROUP BY 1
HAVING COUNT(1) > 1;
Explanation:
Common Table Expression (CTE) T
: This CTE calculates a boolean mark
for each row. AVG(occurrences) OVER (PARTITION BY event_type)
calculates the average occurrences for each event type using a window function. The mark
is set to 1 (true) if the business's occurrences exceed the average for that event type, otherwise 0 (false).
Outer Query: This selects business_id
from CTE T
where mark = 1
(meaning occurrences exceeded the average), groups by business_id
, and filters for businesses with more than one event type meeting this criterion using HAVING COUNT(1) > 1
.
Time Complexity: Similar to Solution 1, the time complexity is primarily determined by the window function and GROUP BY
operations. Window functions are generally optimized well in modern database systems, so the complexity should be comparable to or potentially better than Solution 1, often around O(N log N) or O(N) with suitable indexing.
In summary: Both solutions effectively solve the problem. Solution 2, using window functions, tends to be more elegant and can be more efficient in some database systems due to the optimized implementation of window functions. The choice between them depends on preference and the specific database system's capabilities.