{x}
blog image

Suspicious Bank Accounts

Solution Explanation for LeetCode 1843: Suspicious Bank Accounts

This problem requires identifying suspicious bank accounts based on their transaction history. An account is deemed suspicious if its total income exceeds the max_income for two or more consecutive months. The solution involves querying two tables: Accounts (containing account IDs and maximum income limits) and Transactions (containing transaction details).

Approach:

Both solutions leverage the power of SQL's window functions and aggregate functions to efficiently process the data. The core idea is to:

  1. Calculate monthly income: For each account, determine the total income for each month.
  2. Compare with max_income: Check if the monthly income surpasses the account's max_income limit.
  3. Identify consecutive months: Detect if the income exceeded the limit for at least two consecutive months.

Solution 1 (Using Window Functions and Timestamp Difference):

This solution uses a Common Table Expression (CTE) called S.

  • S CTE: This selects the account ID, the beginning of the month (DATE_FORMAT(day, '%Y-%m-01')), the transaction ID, and a boolean flag marked indicating whether the monthly income exceeded max_income. The SUM(amount) OVER (...) is a window function that calculates the sum of amount for each account and month.

  • Main Query: This joins S with itself (s1 and s2) to compare consecutive months. TIMESTAMPDIFF(Month, s1.day, s2.day) = 1 ensures that we're comparing adjacent months. The WHERE clause filters for cases where both consecutive months have marked = 1.

Time Complexity Analysis (Solution 1):

The time complexity is dominated by the self-join in the main query. In the worst case, if there are N transactions, the self-join might have a complexity close to O(N^2). However, because the join is on account_id and month, and the data is likely to be clustered, the practical performance will be significantly better than O(N^2). The window functions used within the CTE have a time complexity proportional to the number of transactions in each month, which is typically much smaller than the total number of transactions. The overall runtime is highly dependent on the database's query optimizer and indexing.

Solution 2 (Using PERIOD_ADD and Grouping):

This solution is more concise. The CTE S calculates monthly income for each account and filters out months where the total income does not exceed max_income. It uses GROUP BY and HAVING clause for efficiency. The main query then checks for consecutive months using PERIOD_ADD to add one month to the yearmonth value.

Time Complexity Analysis (Solution 2):

The time complexity is also highly dependent on the database's query optimizer and indexing, as it involves grouping and joining. A well-optimized database with appropriate indexes could reduce the complexity to something close to O(N log N) or even O(N) in many practical scenarios (N being the number of transactions). The worst-case complexity, however, remains comparable to Solution 1.

MySQL Code (Solution 1):

WITH
    S AS (
        SELECT DISTINCT
            t.account_id,
            DATE_FORMAT(day, '%Y-%m-01') AS day,
            transaction_id AS tx,
            SUM(amount) OVER (
                PARTITION BY account_id, DATE_FORMAT(day, '%Y-%m-01')
            ) > max_income AS marked
        FROM
            Transactions AS t
            LEFT JOIN Accounts AS a ON t.account_id = a.account_id
        WHERE type = 'Creditor'
    )
SELECT DISTINCT s1.account_id
FROM
    S AS s1
    LEFT JOIN S AS s2 ON s1.account_id = s2.account_id AND TIMESTAMPDIFF(Month, s1.day, s2.day) = 1
WHERE s1.marked = 1 AND s2.marked = 1
ORDER BY s1.tx;

MySQL Code (Solution 2):

WITH
    S AS (
        SELECT
            account_id,
            DATE_FORMAT(day, '%Y%m') AS yearmonth,
            transaction_id AS tx
        FROM
            Transactions
            JOIN Accounts USING (account_id)
        WHERE type = 'Creditor'
        GROUP BY account_id, yearmonth
        HAVING SUM(amount) > AVG(max_income)
    )
SELECT DISTINCT account_id
FROM S
WHERE (account_id, PERIOD_ADD(yearmonth, 1)) IN (SELECT account_id, yearmonth FROM S)
ORDER BY tx;

Both solutions provide correct results, but their efficiency can vary based on database optimization and data distribution. Solution 2 might be slightly more efficient in some cases due to its more concise approach. Proper indexing on account_id and day in the Transactions table is crucial for optimal performance in both cases.