{x}
blog image

Page Recommendations

Solution Explanation for LeetCode 1264: Page Recommendations

This problem requires us to find pages that should be recommended to a user (user_id = 1) based on the pages liked by their friends. We need to exclude pages the user already likes. The solution utilizes SQL queries to achieve this efficiently. Both solutions presented leverage the Friendship and Likes tables.

Approach

Both solutions follow a similar strategy:

  1. Identify Friends: Find all users who are friends with user_id = 1. This is done using UNION or UNION ALL to combine results from both sides of the Friendship table (since friendship is bidirectional).

  2. Find Liked Pages of Friends: Join the results from step 1 with the Likes table to find all pages liked by the identified friends.

  3. Exclude Already Liked Pages: Filter out pages that user_id = 1 already likes using a NOT IN subquery or a WHERE clause with a NOT EXISTS condition (demonstrated in alternative solutions not originally provided).

Solution 1: Using CTE (Common Table Expression)

This solution uses a Common Table Expression (CTE) named T to store the friends of user 1. This improves readability and potentially query optimization.

MySQL Code:

WITH T AS (
    SELECT user1_id AS user_id FROM Friendship WHERE user2_id = 1
    UNION
    SELECT user2_id AS user_id FROM Friendship WHERE user1_id = 1
)
SELECT DISTINCT page_id AS recommended_page
FROM T JOIN Likes USING (user_id)
WHERE page_id NOT IN (SELECT page_id FROM Likes WHERE user_id = 1);

Explanation:

  • WITH T AS (...): Defines a CTE named T containing the user IDs of friends of user 1. The UNION combines results from both columns of the Friendship table.
  • SELECT DISTINCT page_id AS recommended_page: Selects the unique page IDs and renames the column to recommended_page.
  • FROM T JOIN Likes USING (user_id): Joins the T CTE with the Likes table using the user_id column. This efficiently links friends to the pages they liked.
  • WHERE page_id NOT IN (SELECT page_id FROM Likes WHERE user_id = 1): Filters out pages already liked by user 1. This ensures recommendations are novel.

Solution 2: Direct Subquery

This solution integrates the friend identification directly into the main query using subqueries.

MySQL Code:

SELECT DISTINCT page_id AS recommended_page
FROM Likes
WHERE user_id IN (
    SELECT user1_id AS user_id FROM Friendship WHERE user2_id = 1
    UNION ALL
    SELECT user2_id AS user_id FROM Friendship WHERE user1_id = 1
)
AND page_id NOT IN (SELECT page_id FROM Likes WHERE user_id = 1);

Explanation:

  • The WHERE clause contains two conditions:
    • user_id IN (...): Checks if the user_id is in the set of friends (obtained via the subquery).
    • AND page_id NOT IN (...): Excludes pages already liked by user 1 (using another subquery).

Time Complexity Analysis

Both solutions have similar time complexities. The dominant factor is the nested subqueries (or join in Solution 1) which depends on the size of the Friendship and Likes tables.

  • Worst-Case Time Complexity: O(F * L), where F is the number of rows in Friendship and L is the number of rows in Likes. This is because the join and subquery operations could potentially require iterating through a significant portion of both tables in the worst case. However, database optimizers often reduce this complexity in practice. Index usage on user_id and page_id columns will significantly improve performance.

  • Space Complexity: The space complexity is dominated by the temporary results created during query execution. In the worst case, this could be proportional to the size of the input tables, but efficient database systems will minimize this.

In summary, both solutions effectively solve the problem. Solution 1, using a CTE, often enhances readability and may help the database optimizer generate a more efficient execution plan. The choice between them is largely a matter of preference and database-specific optimization considerations. Appropriate indexing on the database tables is crucial for optimal performance, regardless of which solution is chosen.