Table: Views
+---------------+---------+ | Column Name | Type | +---------------+---------+ | article_id | int | | author_id | int | | viewer_id | int | | view_date | date | +---------------+---------+ There is no primary key (column with unique values) for this table, the table may have duplicate rows. Each row of this table indicates that some viewer viewed an article (written by some author) on some date. Note that equal author_id and viewer_id indicate the same person.
Write a solution to find all the authors that viewed at least one of their own articles.
Return the result table sorted by id
in ascending order.
The result format is in the following example.
Example 1:
Input: Views table: +------------+-----------+-----------+------------+ | article_id | author_id | viewer_id | view_date | +------------+-----------+-----------+------------+ | 1 | 3 | 5 | 2019-08-01 | | 1 | 3 | 6 | 2019-08-02 | | 2 | 7 | 7 | 2019-08-01 | | 2 | 7 | 6 | 2019-08-02 | | 4 | 7 | 1 | 2019-07-22 | | 3 | 4 | 4 | 2019-07-21 | | 3 | 4 | 4 | 2019-07-21 | +------------+-----------+-----------+------------+ Output: +------+ | id | +------+ | 4 | | 7 | +------+
This problem requires finding authors who have viewed their own articles. The solution leverages the Views
table's structure to identify matching author_id
and viewer_id
entries.
The core idea is to filter the Views
table to only include rows where the author_id
and viewer_id
are identical. This indicates that an author viewed their own article. We then select the distinct author_id
values (to avoid duplicates) and order the results for consistency.
SELECT DISTINCT author_id AS id
FROM Views
WHERE author_id = viewer_id
ORDER BY 1;
Explanation:
SELECT DISTINCT author_id AS id
: This selects the unique author_id
values and renames the column to id
for clarity. The DISTINCT
keyword ensures that each author appears only once in the result, even if they viewed multiple of their own articles.
FROM Views
: This specifies that the data should be retrieved from the Views
table.
WHERE author_id = viewer_id
: This crucial clause filters the rows, keeping only those where the author ID and viewer ID match. This identifies authors who viewed their articles.
ORDER BY 1
: This sorts the resulting id
column in ascending order. 1
refers to the first column in the SELECT
statement, which is id
.
Time Complexity: O(N log N), where N is the number of rows in the Views
table. The dominant operation is the sorting (ORDER BY
), which typically has a time complexity of O(N log N) for efficient sorting algorithms used in database systems. The filtering (WHERE
clause) is likely optimized within the database engine and has a time complexity proportional to the number of rows it scans, but it's generally outweighed by the sorting.
Space Complexity: O(M), where M is the number of distinct authors who viewed their articles. This space is used to store the intermediate result set before sorting and output. In the worst case, M could be close to N, but it is expected to be significantly smaller in most cases (assuming not every author views their own articles). The space used by the database engine for internal processing is not usually considered in Big O space complexity analysis.
This solution is efficient because it directly targets the required information using a simple WHERE
clause and DISTINCT
for efficient data retrieval and output formatting. The database's internal query optimizer further helps in managing the time and space efficiency.