{x}
blog image

Bank Account Summary II

Table: Users

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| account      | int     |
| name         | varchar |
+--------------+---------+
account is the primary key (column with unique values) for this table.
Each row of this table contains the account number of each user in the bank.
There will be no two users having the same name in the table.

 

Table: Transactions

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| trans_id      | int     |
| account       | int     |
| amount        | int     |
| transacted_on | date    |
+---------------+---------+
trans_id is the primary key (column with unique values) for this table.
Each row of this table contains all changes made to all accounts.
amount is positive if the user received money and negative if they transferred money.
All accounts start with a balance of 0.

 

Write a solution to report the name and balance of users with a balance higher than 10000. The balance of an account is equal to the sum of the amounts of all transactions involving that account.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Users table:
+------------+--------------+
| account    | name         |
+------------+--------------+
| 900001     | Alice        |
| 900002     | Bob          |
| 900003     | Charlie      |
+------------+--------------+
Transactions table:
+------------+------------+------------+---------------+
| trans_id   | account    | amount     | transacted_on |
+------------+------------+------------+---------------+
| 1          | 900001     | 7000       |  2020-08-01   |
| 2          | 900001     | 7000       |  2020-09-01   |
| 3          | 900001     | -3000      |  2020-09-02   |
| 4          | 900002     | 1000       |  2020-09-12   |
| 5          | 900003     | 6000       |  2020-08-07   |
| 6          | 900003     | 6000       |  2020-09-07   |
| 7          | 900003     | -4000      |  2020-09-11   |
+------------+------------+------------+---------------+
Output: 
+------------+------------+
| name       | balance    |
+------------+------------+
| Alice      | 11000      |
+------------+------------+
Explanation: 
Alice's balance is (7000 + 7000 - 3000) = 11000.
Bob's balance is 1000.
Charlie's balance is (6000 + 6000 - 4000) = 8000.

Solution Explanation

This problem requires querying data from two tables, Users and Transactions, to determine the balance of each user's account and then filter for users with a balance greater than 10000. The solution uses SQL's JOIN, SUM, GROUP BY, and HAVING clauses.

Approach:

  1. JOIN: We perform an INNER JOIN between the Users and Transactions tables using the account column as the common key. This combines the user's name with their transaction data. USING (account) is a shorthand for ON Users.account = Transactions.account.

  2. SUM and GROUP BY: The SUM(amount) function calculates the total amount for each account. The GROUP BY account clause groups the transactions by account number, ensuring that the SUM function operates independently for each user.

  3. HAVING: The HAVING balance > 10000 clause filters the results, keeping only those accounts (and their corresponding user names) where the calculated balance exceeds 10000. HAVING is used because the filtering is based on an aggregate function (SUM). A WHERE clause would not work here because it filters before the aggregation.

Time Complexity Analysis:

The time complexity of the SQL query depends on the size of the input tables. The JOIN operation has a time complexity of O(M * N) in the worst case (nested loop join), where M and N are the number of rows in the Users and Transactions tables, respectively. However, database systems often use optimized join algorithms (e.g., hash joins, merge joins) that can significantly reduce this complexity. The GROUP BY and SUM operations have a time complexity of O(K log K) or O(K), where K is the number of distinct account numbers, depending on the database system's implementation. The HAVING clause adds a linear time complexity, O(K). Therefore, the overall time complexity is dominated by the join operation and is roughly O(M * N) in the worst case, but practically much faster with optimized database implementations.

Space Complexity Analysis:

The space complexity is determined by the amount of intermediate data stored during the query execution. The space complexity of the JOIN operation is O(M + N) in the worst case if we assume a hash join, where M and N are the number of rows in the Users and Transactions tables. The space used for GROUP BY and SUM depends on the number of distinct accounts, which is O(K). The final result set is also stored in memory and contributes to the space complexity. Therefore, the overall space complexity is roughly O(M + N + K).

MySQL Code:

SELECT
    name,
    SUM(amount) AS balance
FROM
    Users
    JOIN Transactions USING (account)
GROUP BY account
HAVING balance > 10000;

This solution is efficient for the given problem because it uses SQL's built-in capabilities for data manipulation and aggregation. More complex solutions involving procedural programming would be less efficient.