Rising Temperature

Table: Weather

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| id            | int     |
| recordDate    | date    |
| temperature   | int     |
+---------------+---------+
id is the column with unique values for this table.
There are no different rows with the same recordDate.
This table contains information about the temperature on a certain day.

 

Write a solution to find the id of every date whose temperature is higher than that of the previous day (yesterday).

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Weather table:
+----+------------+-------------+
| id | recordDate | temperature |
+----+------------+-------------+
| 1  | 2015-01-01 | 10          |
| 2  | 2015-01-02 | 25          |
| 3  | 2015-01-03 | 20          |
| 4  | 2015-01-04 | 30          |
+----+------------+-------------+
Output: 
+----+
| id |
+----+
| 2  |
| 4  |
+----+
Explanation: 
On 2015-01-02, the temperature was higher than the previous day (10 -> 25).
On 2015-01-04, the temperature was higher than the previous day (20 -> 30).

Solution Explanation for LeetCode 197: Rising Temperature

This problem requires finding all IDs in the Weather table where the temperature is higher than the temperature on the previous calendar day. The solutions below do this with a self-join in MySQL and with a sort-and-diff in Pandas.

Approach 1: Self-Join with Date Difference Calculation

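In MySQL, this approach uses a self-join to pair each record with the record dated one day earlier, using DATEDIFF to measure the gap between the two dates. The Pandas version reaches the same result without a join: it sorts the rows by date and uses .diff() to compare each row with the one before it. In both cases the date difference must be checked explicitly, because the table may skip days and adjacent rows are not guaranteed to be consecutive dates.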

MySQL:

SELECT w1.id
FROM
    Weather AS w1
    JOIN Weather AS w2
        ON DATEDIFF(w1.recordDate, w2.recordDate) = 1 AND w1.temperature > w2.temperature;

This query joins the Weather table (aliased as w1) with itself (aliased as w2). The ON clause specifies the join condition:

  • DATEDIFF(w1.recordDate, w2.recordDate) = 1: This ensures that w1.recordDate is exactly one day after w2.recordDate.
  • w1.temperature > w2.temperature: This checks if the temperature on w1.recordDate is greater than the temperature on w2.recordDate.

The SELECT w1.id statement retrieves the ID from the w1 table (the later date) where both conditions are true.
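
To check the join against the example data, here is a minimal sketch that runs an equivalent query in an in-memory SQLite database through Python's standard sqlite3 module. SQLite has no DATEDIFF, so the sketch swaps in julianday(a) - julianday(b) for the day difference; that substitution and the function name rising_ids_sqlite are assumptions of this sketch, not part of the MySQL solution.

```python
import sqlite3

# Rows from Example 1: (id, recordDate, temperature).
EXAMPLE_ROWS = [
    (1, "2015-01-01", 10),
    (2, "2015-01-02", 25),
    (3, "2015-01-03", 20),
    (4, "2015-01-04", 30),
]


def rising_ids_sqlite(rows):
    """Return ids whose temperature exceeds the previous calendar day's."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Weather ("
            "id INTEGER PRIMARY KEY, recordDate TEXT, temperature INTEGER)"
        )
        conn.executemany("INSERT INTO Weather VALUES (?, ?, ?)", rows)
        # julianday() difference stands in for MySQL's DATEDIFF (assumption).
        query = """
            SELECT w1.id
            FROM Weather AS w1
            JOIN Weather AS w2
              ON julianday(w1.recordDate) - julianday(w2.recordDate) = 1
             AND w1.temperature > w2.temperature
            ORDER BY w1.id
        """
        return [row[0] for row in conn.execute(query)]
    finally:
        conn.close()


print(rising_ids_sqlite(EXAMPLE_ROWS))  # prints [2, 4]
```

The ORDER BY is only there to make the printed list deterministic; the problem accepts the ids in any order.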

Python (Pandas):

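import pandas as pd

def rising_temperature(weather: pd.DataFrame) -> pd.DataFrame:
    # Assumes recordDate is already a datetime64 column.
    # Sort into a new DataFrame so the caller's data is not modified.
    weather = weather.sort_values(by="recordDate")
    # Keep rows that are warmer than the previous row AND exactly one day after it.
    warmer = weather["temperature"].diff() > 0
    next_day = weather["recordDate"].diff().dt.days == 1
    return weather.loc[warmer & next_day, ["id"]]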

The Python solution uses the Pandas library. First, the rows are sorted by recordDate. Then .diff() computes, for each row, the difference from the row immediately before it, once for 'temperature' and once for 'recordDate'. The filter keeps a row only when the temperature difference is positive and the date difference is exactly one day. The first row has no predecessor, so both of its differences are missing values and it is never selected. Finally, only the 'id' column is returned.
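
To make the two conditions explicit, the same sort-and-compare logic can be sketched in plain Python without Pandas. The function name rising_ids_sorted and the sample rows are hypothetical; the third row is deliberately three days after the second, to show why the date check is needed in addition to the temperature check.

```python
from datetime import date


def rising_ids_sorted(rows):
    """rows: iterable of (id, record_date, temperature) tuples."""
    ordered = sorted(rows, key=lambda r: r[1])  # sort by date
    result = []
    # Pair each row with its predecessor, mirroring what .diff() compares.
    for prev, curr in zip(ordered, ordered[1:]):
        warmer = curr[2] - prev[2] > 0
        consecutive = (curr[1] - prev[1]).days == 1
        if warmer and consecutive:
            result.append(curr[0])
    return result


rows = [
    (1, date(2015, 1, 1), 10),
    (2, date(2015, 1, 2), 25),
    (3, date(2015, 1, 5), 30),  # warmer than the previous row, but 3 days later
]
print(rising_ids_sorted(rows))  # prints [2]
```

Without the consecutive-day check, id 3 would be reported even though there is no record for 2015-01-04 to compare it with.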

Approach 2: Self-Join with SUBDATE (MySQL)

This approach is similar to the first one but uses the SUBDATE function in MySQL to simplify the date comparison.

SELECT w1.id
FROM
    Weather AS w1
    JOIN Weather AS w2
        ON SUBDATE(w1.recordDate, 1) = w2.recordDate AND w1.temperature > w2.temperature;

SUBDATE(w1.recordDate, 1) subtracts one day from w1.recordDate. The join condition then directly compares this result with w2.recordDate.
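
The equality form of the join can be tried out with Python's standard sqlite3 module. This is a rough sketch, not the MySQL query itself: SQLite has no SUBDATE, so date(w1.recordDate, '-1 day') is swapped in as an assumed equivalent, and the function name rising_ids_prev_day and the sample rows are hypothetical. The rows cross a month boundary and skip a day, to show that the subtraction is calendar-aware.

```python
import sqlite3


def rising_ids_prev_day(rows):
    """Return ids that are warmer than the row dated exactly one day earlier."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Weather ("
            "id INTEGER PRIMARY KEY, recordDate TEXT, temperature INTEGER)"
        )
        conn.executemany("INSERT INTO Weather VALUES (?, ?, ?)", rows)
        # date(x, '-1 day') stands in for MySQL's SUBDATE(x, 1) (assumption).
        query = """
            SELECT w1.id
            FROM Weather AS w1
            JOIN Weather AS w2
              ON date(w1.recordDate, '-1 day') = w2.recordDate
             AND w1.temperature > w2.temperature
            ORDER BY w1.id
        """
        return [row[0] for row in conn.execute(query)]
    finally:
        conn.close()


rows = [
    (1, "2015-01-31", 10),
    (2, "2015-02-01", 15),  # previous calendar day is in January
    (3, "2015-02-03", 20),  # 2015-02-02 is missing, so there is no match
]
print(rising_ids_prev_day(rows))  # prints [2]
```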

Time Complexity Analysis

The two SQL approaches return the same rows, but their cost depends on how the database executes the join. Because recordDate is unique, each row can match at most one other row (the one dated exactly one day earlier), so the result has at most N rows, where N is the number of rows in the Weather table. Finding that match is the expensive part. With DATEDIFF, both date columns are wrapped in a function call, so an index on recordDate generally cannot be used for the lookup, and a nested-loop join may examine up to O(N^2) row pairs. With SUBDATE, w2.recordDate appears bare on one side of the equality, so an index on recordDate can typically serve each lookup in about O(log N), or O(N log N) overall. The Pandas solution is dominated by the sort, which is O(N log N); the two .diff() calls and the filter are O(N). Actual runtimes depend on the query optimizer and on which indexes exist.
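
The observation that each row has at most one candidate partner can be made concrete with a hash lookup. The following is a minimal sketch in plain Python, not one of the solutions above; the name rising_ids_lookup is hypothetical. It builds a dictionary from date to temperature and then probes it once per row, which takes expected O(N) time in total.

```python
from datetime import date, timedelta


def rising_ids_lookup(rows):
    """rows: list of (id, record_date, temperature) tuples with unique dates."""
    # Map each date to its temperature; unique dates mean no overwrites.
    temp_by_date = {record_date: temp for _, record_date, temp in rows}
    one_day = timedelta(days=1)
    result = []
    for row_id, record_date, temp in rows:
        # Single probe for the previous calendar day; None if that day is absent.
        prev_temp = temp_by_date.get(record_date - one_day)
        if prev_temp is not None and temp > prev_temp:
            result.append(row_id)
    return result


example = [
    (1, date(2015, 1, 1), 10),
    (2, date(2015, 1, 2), 25),
    (3, date(2015, 1, 3), 20),
    (4, date(2015, 1, 4), 30),
]
print(rising_ids_lookup(example))  # prints [2, 4]
```

No sorting is needed here, because the lookup is by date value and not by row position.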

Space Complexity Analysis

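Neither approach runs in constant extra space. The result alone can hold up to N - 1 ids. The Pandas solution also builds the intermediate Series produced by .diff() and the boolean masks derived from them, each of length N, so it uses O(N) extra space. For the SQL queries, working memory depends on the join strategy the database chooses: an index-driven nested-loop join needs little beyond the output, while a hash join would build a hash table of up to O(N) entries.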