{x}
blog image

Countries You Can Safely Invest In

Solution Explanation

This problem requires finding countries where the average call duration is higher than the global average call duration. The solution involves several steps: joining tables, calculating averages, and filtering based on the global average.

Approach:

The solution uses SQL queries to manipulate the provided tables (Person, Country, Calls). The core idea is as follows:

  1. Join Tables: Join the Person, Calls, and Country tables to connect call durations to countries. The Person table links caller/callee IDs to phone numbers, which contain country codes. The Country table maps country codes to country names. This join ensures we can associate call durations with specific countries.

  2. Calculate Average Call Duration per Country: Group the joined data by country and compute the average call duration for each country using the AVG() aggregate function.

  3. Calculate Global Average Call Duration: Compute the global average call duration using AVG() on the duration column of the Calls table.

  4. Filter Countries: Filter the results from step 2, selecting only countries where the average call duration exceeds the global average calculated in step 3.

Time Complexity Analysis:

The time complexity is dominated by the join operations and the GROUP BY clause. In general, the time complexity of joining tables can be O(n*m) in the worst case (nested loop join), where 'n' and 'm' are the number of rows in the tables being joined. However, database systems typically optimize joins using techniques like hash joins or merge joins, which can reduce the complexity to closer to O(n+m) in many cases. The GROUP BY operation also has a complexity that depends on the implementation (e.g., sorting or hashing) but is generally considered to be at most O(n log n) or O(n) depending on the sorting method used. Therefore, the overall time complexity is likely to be close to O(n log n) or even O(n) in well-optimized database systems, where 'n' represents the total number of rows across all tables.

Space Complexity Analysis:

The space complexity is determined primarily by the size of intermediate results generated during the join and grouping operations. In the worst case, this could be proportional to the size of the input data, but efficient database implementations usually manage intermediate results efficiently, minimizing space usage.

Code (MySQL):

Both solutions achieve the same result, differing primarily in syntax (using a subquery or a Common Table Expression (CTE)).

Solution 1 (Subquery):

SELECT country
FROM
    (
        SELECT c.name AS country, AVG(duration) AS duration
        FROM
            Person
            JOIN Calls ON id IN(caller_id, callee_id)
            JOIN Country AS c ON LEFT(phone_number, 3) = country_code
        GROUP BY 1
    ) AS t
WHERE duration > (SELECT AVG(duration) FROM Calls);

Solution 2 (CTE):

WITH
    T AS (
        SELECT c.name AS country, AVG(duration) AS duration
        FROM
            Person
            JOIN Calls ON id IN(caller_id, callee_id)
            JOIN Country AS c ON LEFT(phone_number, 3) = country_code
        GROUP BY 1
    )
SELECT country
FROM T
WHERE duration > (SELECT AVG(duration) FROM Calls);

Both solutions perform the same operations, with CTE offering slightly improved readability for complex queries. The choice between them is largely a matter of preference and coding style within the context of the specific database system being used. The CTE approach might be preferred for more complex scenarios to make the query logic more modular and easier to maintain.