Snowflake Regular Expressions for Effective Validation
When working with user data, email validation is crucial to ensure clean, consistent, and valid information.
Recently, I faced an interesting challenge where I had to validate email addresses in SQL using regular expressions.
Creating the USERS Table
First, let’s create a temporary USERS table to store our users' email addresses:
create or replace temp table "USERS" ( "USER_ID" int autoincrement primary key, "EMAIL" varchar);
Next, we populate this table with a few test email addresses, including valid, invalid, subdomains, and plus-addressing (used by some services for email filtering):
insert overwrite into "USERS" ("EMAIL")values ('[email protected]'), ('[email protected]'), ('[email protected]'), ('[email protected]'), ('[email protected]'), -- Plus-addressing test case ('[email protected]'), -- Subdomain test case ('user@invalid_domain@com'), -- Invalid case ('[email protected]');
The Regular Expression
Now, we get to the core of the solution: the regular expression to validate the email addresses.
select , regexp_like("EMAIL", $$[a-zA-Z0-9.-]+[+]?[a-zA-Z0-9.-]@[a-zA-Z0-9.-]+[.][a-zA-Z0-9]{2,6}$$) as "IS_VALID_EMAIL"from "USERS";
This SQL query checks whether each email follows a valid pattern using the regexp_like() function. Let's break down the regular expression pattern:
Running the Validation
When we run the query, we get a table of email addresses and whether they pass the regex validation