Table: Users
+--------------+---------+ | Column Name | Type | +--------------+---------+ | account | int | | name | varchar | +--------------+---------+ account is the primary key (column with unique values) for this table. Each row of this table contains the account number of each user in the bank. There will be no two users having the same name in the table.
Table: Transactions
+---------------+---------+ | Column Name | Type | +---------------+---------+ | trans_id | int | | account | int | | amount | int | | transacted_on | date | +---------------+---------+ trans_id is the primary key (column with unique values) for this table. Each row of this table contains all changes made to all accounts. amount is positive if the user received money and negative if they transferred money. All accounts start with a balance of 0.
Write a solution to report the name and balance of users with a balance higher than 10000
. The balance of an account is equal to the sum of the amounts of all transactions involving that account.
Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Users table: +------------+--------------+ | account | name | +------------+--------------+ | 900001 | Alice | | 900002 | Bob | | 900003 | Charlie | +------------+--------------+ Transactions table: +------------+------------+------------+---------------+ | trans_id | account | amount | transacted_on | +------------+------------+------------+---------------+ | 1 | 900001 | 7000 | 2020-08-01 | | 2 | 900001 | 7000 | 2020-09-01 | | 3 | 900001 | -3000 | 2020-09-02 | | 4 | 900002 | 1000 | 2020-09-12 | | 5 | 900003 | 6000 | 2020-08-07 | | 6 | 900003 | 6000 | 2020-09-07 | | 7 | 900003 | -4000 | 2020-09-11 | +------------+------------+------------+---------------+ Output: +------------+------------+ | name | balance | +------------+------------+ | Alice | 11000 | +------------+------------+ Explanation: Alice's balance is (7000 + 7000 - 3000) = 11000. Bob's balance is 1000. Charlie's balance is (6000 + 6000 - 4000) = 8000.
This problem requires querying data from two tables, Users
and Transactions
, to determine the balance of each user's account and then filter for users with a balance greater than 10000. The solution uses SQL's JOIN
, SUM
, GROUP BY
, and HAVING
clauses.
Approach:
JOIN: We perform an INNER JOIN
between the Users
and Transactions
tables using the account
column as the common key. This combines the user's name with their transaction data. USING (account)
is a shorthand for ON Users.account = Transactions.account
.
SUM and GROUP BY: The SUM(amount)
function calculates the total amount for each account. The GROUP BY account
clause groups the transactions by account number, ensuring that the SUM
function operates independently for each user.
HAVING: The HAVING balance > 10000
clause filters the results, keeping only those accounts (and their corresponding user names) where the calculated balance exceeds 10000. HAVING
is used because the filtering is based on an aggregate function (SUM
). A WHERE
clause would not work here because it filters before the aggregation.
Time Complexity Analysis:
The time complexity of the SQL query depends on the size of the input tables. The JOIN
operation has a time complexity of O(M * N) in the worst case (nested loop join), where M and N are the number of rows in the Users
and Transactions
tables, respectively. However, database systems often use optimized join algorithms (e.g., hash joins, merge joins) that can significantly reduce this complexity. The GROUP BY
and SUM
operations have a time complexity of O(K log K) or O(K), where K is the number of distinct account numbers, depending on the database system's implementation. The HAVING
clause adds a linear time complexity, O(K). Therefore, the overall time complexity is dominated by the join operation and is roughly O(M * N) in the worst case, but practically much faster with optimized database implementations.
Space Complexity Analysis:
The space complexity is determined by the amount of intermediate data stored during the query execution. The space complexity of the JOIN
operation is O(M + N) in the worst case if we assume a hash join, where M and N are the number of rows in the Users
and Transactions
tables. The space used for GROUP BY
and SUM
depends on the number of distinct accounts, which is O(K). The final result set is also stored in memory and contributes to the space complexity. Therefore, the overall space complexity is roughly O(M + N + K).
MySQL Code:
SELECT
name,
SUM(amount) AS balance
FROM
Users
JOIN Transactions USING (account)
GROUP BY account
HAVING balance > 10000;
This solution is efficient for the given problem because it uses SQL's built-in capabilities for data manipulation and aggregation. More complex solutions involving procedural programming would be less efficient.