Handy Snowflake functions

There are many very handy functions in Snowflake.

1. LIKE ALL and LIKE ANY

Let’s start with LIKE ALL. How often do you write: WHERE Column1 LIKE '%X%' AND Column1 LIKE '%Y%'? Well we have a function for that in Snowflake. It’s like this: WHERE Column1 LIKE ALL ('%X%', '%Y%'). Handy isn’t it? Obviously in Snowflake strings are case sensitive so we need to use UPPER(). And we have LIKE ANY too, meaning it’s matching ANY of the items in the brackets.

We also have ILIKE and ILIKE ANY, for case insensitive comparison. Which is also handy.

2. DECODE, IIF, GREATEST

I write CASE WHEN like this quite often:

SELECT CASE WHEN Column1 = 1 THEN 'X' 
            WHEN Column1 = 2 THEN 'Y' 
            ELSE 'Z' END         

But we have DECODE in Snowflake, like this: SELECT DECODE(Column1, 1, 'X', 2, 'Y', 'Z').

What's the difference? Decode is a function, whereas CASE WHEN is an expression. So Decode is handy when you want to combine it with another function. And it’s shorter too. Obviously if it is more than one variable/column then we still need to CASE WHEN.

Similar to that is the IFF function. That second F stands for “Function” (because we have IF statement in Snowflake). Like this: IIF(Column1 = 1, 'A', 'B'). So if Column1 = 1 it returns A, otherwise it returns B. Simple isn’t it?

Talking about comparing stuff (string, date, number), we have GREATEST in Snowflake. It takes many arguments, not just two. Like this:

·??????? GREATEST(1,2,3) – it returns 3.

·??????? GREATEST('a','A','b') – it returns b. Because lower case has higher ASCII.

·??????? GREATEST(TO_DATE('2024-08-21'),TO_DATE('2024-07-13')) – it returns 21/8/2024.

GREATEST is handy because it supports any data type including variant. Obviously it has to be the same data type. You can’t compare a date to a string or to a number. And we have GREATEST_IGNORE_NULL too, which is very handy as it ignores the NULLs.

3. LAST_DAY and NEXT_DAY

When I need to find out the end of the month I used to start from the first day of the following month, then substract one day. Well we have a function for that in Snowflake. It's called LAST_DAY. You can use it to find the last day of the month, or the last day of the quarter, or of the year. Like this: LAST_DAY({date}). If you put today's date for example (28th July 2024) it will return 31st July 2024. Handy isn't it? To return the end of quarter, put 'quarter' as the second parameter: LAST_DAY({date}, 'quarter').

Snowflake also have another handy date function called next_day, which return the date of next Thursday, or next Monday, etc. Like this next_day({date}, 'Thursday'). For example, if we put in today's date (28th July 2024) it will return 1st August 2024, the date. You don't need to type "Thursday", you can just type "Thu" it will understand.

4. EQUAL_NULL and NVL

In the WHERE clause we often compare two columns (could be from different tables). And we always have to use IFNULL or COALESCE, in case the data contains NULL. So it’s like this: WHERE COALESCE(Column1,'') = COALESCE(Column2,''). In Snowflake we have a function for that. It’s called EQUAL_NULL. Like this: WHERE EQUAL_NULL(Column1,Column2). If Column1 is NULL but Column2 is not the output is FALSE. The same thing with the reverse (Column2 is NULL but Column1 is not). But if Column1 and Column2 are both NULL, then the output is TRUE. Which is what we want. So this EQUAL_NULL function is handy.

There is a similar function to that, called IS_DISTINCT_FROM. I’ll let you have a look.

We often use COALESCE(Column1, Column2) which means that if Column1 is NULL, then use Column2. Well there is a function for that in Snowflake. It’s called NVL, which stands for NULL Value Logic. Same usage, i.e. NVL(Column1, Column2). But we also have NVL2 like this: NVL2(Column1, Column2, Column3). If Column1 is NULL, it returns Column3, otherwise it returns Column2.

5. EditDistance and JaroWinkler

I’ll close this article with EditDistance, which is very useful when comparing strings. It’s like this: EDITDISTANCE('Night', 'Lighter'). It returns how many changes we need to do to change Night to become Light. In this case it is 3: change N to L, add e, add r.

A more sophisticated version of it is JaroWinkler_Similarity, which returns the similarity percentage. JAROWINKLER_SIMILARITY('Night', 'Lighter') – the output is 79%. Whereas between Night and Light it is 86%.

6. CONCAT, || and CONCAT_WS

Last one, I promise. For string concatenation you can do CONCAT('A', 'B', 'C') – which returns ABC. But you can also do 'A' || 'B' || 'C' which also returns ABC.

And in Snowflake we also have CONCAT_WS. WS stands for “with separator”. It’s like CONCAT but the first argument is the separator. For example if we do CONCAT_WS(',', 'A', 'B', 'C') it returns A,B,C. So it puts comma between the items.

That’s all folks. Happy learning!

要查看或添加评论,请登录

Vincent Rainardi的更多文章

  • DQ Engineering

    DQ Engineering

    DQ stands for Data Quality. If you don't have a background in data quality, read this first: https://www.

    2 条评论
  • Data Product

    Data Product

    For those of you who don't know what a data product and “data as a product” are, please read this first:…

    13 条评论
  • Snowflake vs SQL Server

    Snowflake vs SQL Server

    Sometimes we need to remind ourselves that Snowflake is not an OLTP database. I know today is the era of Hybrid tables…

    6 条评论
  • Data engineer becoming solution architect

    Data engineer becoming solution architect

    Are you a data engineer thinking about transitioning to a cloud solution architect? Data engineer are good with…

    2 条评论
  • Asset Mgt vs Fund Mgt vs Investment Mgt vs Wealth Mgt: What's the difference?

    Asset Mgt vs Fund Mgt vs Investment Mgt vs Wealth Mgt: What's the difference?

    If you work in banking or investment or any other sector in financial services, you might be wondering about the above.…

  • Data Warehousing Basics: Cost

    Data Warehousing Basics: Cost

    If you call yourself a data engineer you need to be aware of 2 additional things compared to a developer. The first one…

    2 条评论
  • My Linkedin post & articles

    My Linkedin post & articles

    The list below goes back to Nov 2024. For older than that see here.

    9 条评论
  • Data Warehousing Basics: Single Customer View

    Data Warehousing Basics: Single Customer View

    Imagine that you work for an insurance company who sell health insurance (HI), life insurance (LI), general insurance…

    2 条评论
  • Data Warehousing Basics: NFR

    Data Warehousing Basics: NFR

    What I’m about to tell you today failed a lot of data warehousing projects which is why it’s worth paying attention so…

    1 条评论
  • ML and AI - What's the difference?

    ML and AI - What's the difference?

    Machine Learning covers about 20-30 algorithms such as Logistic Regression, Decision Tree, Gradient Boosting, Random…

    4 条评论

社区洞察

其他会员也浏览了