{x}
blog image

Combine Two Tables

Table: Person

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| personId    | int     |
| lastName    | varchar |
| firstName   | varchar |
+-------------+---------+
personId is the primary key (column with unique values) for this table.
This table contains information about the ID of some persons and their first and last names.

 

Table: Address

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| addressId   | int     |
| personId    | int     |
| city        | varchar |
| state       | varchar |
+-------------+---------+
addressId is the primary key (column with unique values) for this table.
Each row of this table contains information about the city and state of one person with ID = PersonId.

 

Write a solution to report the first name, last name, city, and state of each person in the Person table. If the address of a personId is not present in the Address table, report null instead.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Person table:
+----------+----------+-----------+
| personId | lastName | firstName |
+----------+----------+-----------+
| 1        | Wang     | Allen     |
| 2        | Alice    | Bob       |
+----------+----------+-----------+
Address table:
+-----------+----------+---------------+------------+
| addressId | personId | city          | state      |
+-----------+----------+---------------+------------+
| 1         | 2        | New York City | New York   |
| 2         | 3        | Leetcode      | California |
+-----------+----------+---------------+------------+
Output: 
+-----------+----------+---------------+----------+
| firstName | lastName | city          | state    |
+-----------+----------+---------------+----------+
| Allen     | Wang     | Null          | Null     |
| Bob       | Alice    | New York City | New York |
+-----------+----------+---------------+----------+
Explanation: 
There is no address in the address table for the personId = 1 so we return null in their city and state.
addressId = 1 contains information about the address of personId = 2.

Solution Explanation: Combining Two Tables

This problem requires combining data from two tables, Person and Address, to produce a result showing each person's first name, last name, city, and state. The challenge lies in handling cases where a person in the Person table doesn't have a corresponding entry in the Address table.

Approach:

The most efficient approach is to use a LEFT JOIN operation in SQL. A LEFT JOIN ensures that all rows from the left table (in this case, Person) are included in the result. If a personId from the Person table doesn't have a match in the Address table, the corresponding city and state columns in the result will be filled with NULL.

Time and Space Complexity Analysis:

The time complexity of the LEFT JOIN operation depends on the database system's implementation but generally scales proportionally to the size of the input tables. In the worst case, it could be O(M * N), where M is the number of rows in the Person table and N is the number of rows in the Address table (if no efficient indexes are used). However, with appropriate indexes (e.g., an index on personId in both tables), the join operation typically runs much faster, often closer to O(M + N).

The space complexity is determined by the size of the output table, which is at most the size of the Person table. Thus, the space complexity is O(M).

Code Implementation:

The provided solutions demonstrate the LEFT JOIN approach using both Python with pandas (for data manipulation in a non-database context) and MySQL (for direct database query).

Python (using pandas):

import pandas as pd
 
 
def combine_two_tables(person: pd.DataFrame, address: pd.DataFrame) -> pd.DataFrame:
    return pd.merge(left=person, right=address, how="left", on="personId")[
        ["firstName", "lastName", "city", "state"]
    ]

This Python code leverages the pandas library. pd.merge performs the LEFT JOIN operation, matching rows based on the personId column. The how="left" argument specifies a left join. Finally, it selects only the desired columns (firstName, lastName, city, state).

MySQL:

SELECT firstName, lastName, city, state
FROM
    Person
    LEFT JOIN Address USING (personId);

This SQL query directly performs a LEFT JOIN using the USING clause, which implicitly joins on the common column personId. The SELECT statement specifies the columns to be included in the output. This is concise and efficient for database operations.

In summary, both the Python and MySQL solutions effectively combine data from two tables while gracefully handling missing addresses, resulting in a complete representation of the combined information. The LEFT JOIN is the core technique that guarantees the inclusion of all persons and efficiently manages the potential absence of corresponding address information.