This problem requires finding pairs of users who have at least three common friends. The solution uses SQL queries to efficiently achieve this.
The solution employs a clever use of SQL joins and aggregate functions. Here's a breakdown:
Creating a Symmetric Friendship Table (t): The initial WITH t AS (...)
statement creates a temporary table t
. This table is crucial because it makes the friendship relationships bidirectional. The original Friendship
table only lists friendships as (user1_id, user2_id) where user1_id < user2_id. The UNION ALL
operation combines the original table with a reversed version, ensuring that if A is friends with B, the relationship is represented as both (A, B) and (B, A).
Joining the Table: The main query joins t
with itself three times using aliases t1
, t2
, and t3
. This allows us to find common friends:
t1
represents the pair of users we're checking for strong friendship.t2
finds all friends of t1.user2_id
.t3
finds all friends of t1.user1_id
.The JOIN
conditions t1.user2_id = t2.user1_id
and t1.user1_id = t3.user1_id
link friends of each user. The crucial condition t3.user2_id = t2.user2_id
ensures that we're only counting friends that are common to both t1.user1_id
and t1.user2_id
.
Counting Common Friends: COUNT(1)
counts the number of common friends. The GROUP BY
clause groups the results by pairs of user1_id
and user2_id
.
Filtering for Strong Friendships: The HAVING COUNT(1) >= 3
clause filters out pairs with fewer than three common friends, leaving only the strong friendships.
Ensuring Uniqueness: The condition t1.user1_id < t1.user2_id
in the WHERE
clause ensures that each strong friendship pair appears only once (preventing duplicates like (1,2) and (2,1)).
WITH
t AS (
SELECT
*
FROM Friendship
UNION ALL
SELECT
user2_id,
user1_id
FROM Friendship
)
SELECT
t1.user1_id,
t1.user2_id,
COUNT(1) AS common_friend
FROM
t AS t1
JOIN t AS t2 ON t1.user2_id = t2.user1_id
JOIN t AS t3 ON t1.user1_id = t3.user1_id
WHERE t3.user2_id = t2.user2_id AND t1.user1_id < t1.user2_id
GROUP BY t1.user1_id, t1.user2_id
HAVING COUNT(1) >= 3;
The code directly implements the steps described above using standard SQL syntax.
The time complexity is dominated by the self-joins. In the worst case, the number of rows in the t
table is proportional to the number of friendships (let's say N). The joins can lead to a complexity of approximately O(N^3) in the worst case. However, this is a worst-case scenario; in practice, the complexity might be significantly lower because not every pair of users will have many common friends. The GROUP BY
and HAVING
operations add some overhead, but they are generally less significant than the joins. The database's query optimizer plays a role in actual performance as well.
The space complexity is dominated by the size of the temporary table t
and the intermediate results of the joins. This is roughly proportional to the size of the input data (number of friendships), making it O(N). Again, database optimizations can affect the actual space used.