Table: Users
+---------------+---------+ | Column Name | Type | +---------------+---------+ | user_id | int | | name | varchar | | mail | varchar | +---------------+---------+ user_id is the primary key (column with unique values) for this table. This table contains information of the users signed up in a website. Some e-mails are invalid.
Write a solution to find the users who have valid emails.
A valid e-mail has a prefix name and a domain where:
'_'
, period '.'
, and/or dash '-'
. The prefix name must start with a letter.'@leetcode.com'
.Return the result table in any order.
The result format is in the following example.
Example 1:
Input: Users table: +---------+-----------+-------------------------+ | user_id | name | mail | +---------+-----------+-------------------------+ | 1 | Winston | winston@leetcode.com | | 2 | Jonathan | jonathanisgreat | | 3 | Annabelle | bella-@leetcode.com | | 4 | Sally | sally.come@leetcode.com | | 5 | Marwan | quarz#2020@leetcode.com | | 6 | David | david69@gmail.com | | 7 | Shapiro | .shapo@leetcode.com | +---------+-----------+-------------------------+ Output: +---------+-----------+-------------------------+ | user_id | name | mail | +---------+-----------+-------------------------+ | 1 | Winston | winston@leetcode.com | | 3 | Annabelle | bella-@leetcode.com | | 4 | Sally | sally.come@leetcode.com | +---------+-----------+-------------------------+ Explanation: The mail of user 2 does not have a domain. The mail of user 5 has the # sign which is not allowed. The mail of user 6 does not have the leetcode domain. The mail of user 7 starts with a period.
This problem requires filtering users from a table based on the validity of their email addresses. The solution uses regular expressions to efficiently check the email format.
The core idea is to leverage a regular expression within an SQL query to filter the Users
table. The regular expression defines the pattern of a valid email address, and only rows matching this pattern are selected.
^[a-zA-Z][a-zA-Z0-9_.-]*@leetcode[.]com$
Let's break down the regular expression:
^
: Matches the beginning of the string. Ensures the pattern matches the entire email, not just a portion.[a-zA-Z]
: Matches a single letter (uppercase or lowercase) at the start of the email (prefix must begin with a letter).[a-zA-Z0-9_.-]*
: Matches zero or more occurrences of letters, digits, underscores, periods, or hyphens. This covers the rest of the prefix.@leetcode[.]com
: Matches the literal string "@leetcode.com". The .
is escaped because it's a special character in regex (it represents "any character").$
: Matches the end of the string. Ensures the entire email matches the pattern.SELECT *
FROM Users
WHERE mail REGEXP '^[a-zA-Z][a-zA-Z0-9_.-]*@leetcode[.]com$';
This SQL query selects all columns (*
) from the Users
table where the mail
column matches the regular expression. The REGEXP
operator is used for regular expression matching in MySQL.
Time Complexity: O(N), where N is the number of rows in the Users
table. The regular expression matching happens for each row, and the database engine will likely optimize this efficiently. The complexity is essentially linear with respect to the number of rows in the table.
Space Complexity: O(1). The space used by the query is constant, irrespective of the input size. It only needs to store the regular expression and the result set. This is negligible compared to the size of the Users
table itself.
This approach provides an efficient and concise solution to the problem, utilizing the built-in capabilities of SQL and regular expressions. Other database systems might have slightly different syntax for regular expression matching but the underlying approach remains the same.