Table: Queries
+-------------+---------+ | Column Name | Type | +-------------+---------+ | query_name | varchar | | result | varchar | | position | int | | rating | int | +-------------+---------+ This table may have duplicate rows. This table contains information collected from some queries on a database. Theposition
column has a value from 1 to 500. Therating
column has a value from 1 to 5. Query withrating
less than 3 is a poor query.
We define query quality
as:
The average of the ratio between query rating and its position.
We also define poor query percentage
as:
The percentage of all queries with rating less than 3.
Write a solution to find each query_name
, the quality
and poor_query_percentage
.
Both quality
and poor_query_percentage
should be rounded to 2 decimal places.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Queries table: +------------+-------------------+----------+--------+ | query_name | result | position | rating | +------------+-------------------+----------+--------+ | Dog | Golden Retriever | 1 | 5 | | Dog | German Shepherd | 2 | 5 | | Dog | Mule | 200 | 1 | | Cat | Shirazi | 5 | 2 | | Cat | Siamese | 3 | 3 | | Cat | Sphynx | 7 | 4 | +------------+-------------------+----------+--------+ Output: +------------+---------+-----------------------+ | query_name | quality | poor_query_percentage | +------------+---------+-----------------------+ | Dog | 2.50 | 33.33 | | Cat | 0.66 | 33.33 | +------------+---------+-----------------------+ Explanation: Dog queries quality is ((5 / 1) + (5 / 2) + (1 / 200)) / 3 = 2.50 Dog queries poor_ query_percentage is (1 / 3) * 100 = 33.33 Cat queries quality equals ((2 / 5) + (3 / 3) + (4 / 7)) / 3 = 0.66 Cat queries poor_ query_percentage is (1 / 3) * 100 = 33.33
This problem requires calculating two metrics for each query name in the Queries
table: quality
and poor_query_percentage
.
1. Quality Calculation:
The quality
is defined as the average of the ratio between the query's rating and its position. For each query, we need to:
rating / position
for each row.query_name
.2. Poor Query Percentage Calculation:
The poor_query_percentage
represents the percentage of queries with a rating less than 3 for each query_name
. We need to:
rating < 3
for each query_name
.query_name
.SQL Solution:
The SQL solution leverages the power of aggregate functions like AVG
, SUM
, COUNT
, and ROUND
. The GROUP BY
clause is crucial for performing these calculations separately for each query_name
.
MySQL Code:
The provided MySQL code efficiently computes both metrics. Let's break it down:
SELECT
query_name,
ROUND(AVG(rating / position), 2) AS quality,
ROUND(AVG(rating < 3) * 100, 2) AS poor_query_percentage
FROM Queries
WHERE query_name IS NOT NULL
GROUP BY 1;
SELECT query_name, ...
: This selects the query_name
and the calculated metrics.ROUND(AVG(rating / position), 2)
: This calculates the average of rating / position
for each query_name
and rounds it to two decimal places. The AVG
function computes the average of the ratios.ROUND(AVG(rating < 3) * 100, 2)
: This is a clever way to compute the percentage of poor queries. The expression rating < 3
evaluates to 1 (true) or 0 (false). AVG
then computes the average of these 1s and 0s, which represents the fraction of poor queries. Multiplying by 100 gives the percentage, and ROUND
rounds to two decimal places.FROM Queries
: This specifies the table to query.WHERE query_name IS NOT NULL
: This filters out any rows where query_name
is null, ensuring accurate calculations.GROUP BY 1
: This groups the results by query_name
, so the aggregations are performed for each unique query name (1 refers to the first column in the SELECT statement).Time Complexity Analysis:
The time complexity of this SQL query is dominated by the GROUP BY
operation. In the worst case, if there are n rows in the Queries
table, the GROUP BY
operation will take O(n) time to group the rows based on query_name
. The aggregate functions (AVG
, SUM
, COUNT
) then take linear time proportional to the number of rows within each group. Therefore, the overall time complexity is O(n), where n is the number of rows in the Queries
table. The sorting involved in GROUP BY
is typically done using efficient algorithms, making the overall time complexity linear on average.
Space Complexity Analysis:
The space complexity depends primarily on the size of the output. In the worst case, if all query_name
values are unique, the output will have a size proportional to the number of unique query names in the table. The intermediate space required for the aggregation process also depends on the size of the input, but it is generally considered linear with respect to input size. Hence, the space complexity is O(m) where m is the number of unique query_name
values. In the worst-case scenario where every row has a unique query_name
, this simplifies to O(n).