{x}
blog image

Number of Comments per Post

Solution Explanation for LeetCode 1241: Number of Comments per Post

This problem requires querying a database table (Submissions) to determine the number of unique comments associated with each post. The solution involves using SQL's JOIN, COUNT, GROUP BY, and DISTINCT functionalities.

Understanding the Data

The Submissions table has two columns:

  • sub_id: A unique identifier for each submission (post or comment).
  • parent_id: If a submission is a comment, this column holds the sub_id of the post it's commenting on. parent_id is NULL for posts.

Approach

The core idea is to identify posts (submissions with parent_id as NULL) and then count the unique comments associated with each post. This is achieved through a LEFT JOIN to connect posts with their corresponding comments. The DISTINCT keyword ensures we only count unique comments, even if the same comment appears multiple times in the table.

SQL Solution (MySQL)

WITH
    t AS (
        SELECT DISTINCT s1.sub_id AS post_id, s2.sub_id AS sub_id
        FROM
            Submissions AS s1
            LEFT JOIN Submissions AS s2 ON s1.sub_id = s2.parent_id
        WHERE s1.parent_id IS NULL
    )
SELECT post_id, COUNT(sub_id) AS number_of_comments
FROM t
GROUP BY post_id
ORDER BY post_id;

Code Breakdown:

  1. WITH t AS (...) (Common Table Expression): This creates a temporary named result set called t. This improves readability and simplifies the main query.

  2. SELECT DISTINCT s1.sub_id AS post_id, s2.sub_id AS sub_id: This selects the sub_id from the Submissions table (aliased as s1) as post_id representing the post IDs. It also selects the sub_id from the Submissions table (aliased as s2) as sub_id, representing the comment IDs. DISTINCT ensures that duplicate comments are not counted.

  3. FROM Submissions AS s1 LEFT JOIN Submissions AS s2 ON s1.sub_id = s2.parent_id: This performs a LEFT JOIN between the Submissions table (aliased twice as s1 and s2). The LEFT JOIN ensures that all posts are included in the result, even those without comments. The JOIN condition s1.sub_id = s2.parent_id links posts (s1.sub_id) with their comments (s2.sub_id).

  4. WHERE s1.parent_id IS NULL: This filters the results to include only rows where s1.parent_id is NULL, which identifies the actual posts.

  5. SELECT post_id, COUNT(sub_id) AS number_of_comments FROM t GROUP BY post_id: This selects post_id and counts the number of comments (COUNT(sub_id)) for each post. The GROUP BY post_id clause groups the results by post ID.

  6. ORDER BY post_id: This sorts the final results in ascending order of post_id.

Time Complexity Analysis

The time complexity of this SQL query depends on the database engine's optimization strategies, but it's generally considered to be proportional to the size of the Submissions table (O(N), where N is the number of rows in the Submissions table). The JOIN operation and subsequent grouping can be optimized by database indexing, potentially reducing the time to near-linear or even logarithmic time in best-case scenarios with efficient indexing.

Space Complexity Analysis

The space complexity is also related to the size of the Submissions table. In the worst-case scenario, where the temporary table t needs to store all rows from the Submissions table (unlikely with proper database optimization), the space complexity would be O(N). In practice, it would likely be much less, particularly with efficient indexing and query optimization performed by the database engine.