Snowflake Shorts: Identifying Patterns and Trends from Queries
Divyansh Saxena
?? Snowflake Advanced Certified Architect ?? Snowflake Data SuperHero 2023-25 ?? | Snowflake Jaipur User Group Leader | Snowflake SnowPro SME | ex-IBM | 12X Multi-Hyperscalar Cloud ?? Certified | 7K Network @ LinkedIn
If you are a data enthusiast, I am sure that you must have been in a phase where you spent hours identifying queries with similar execution plans to understand the optimization scope. Snowflake now has a feature in PuPr, allowing us to identify patterns and trends from queries. Let's discuss more about it in today's snowflake shorts.
Introduction of New Columns
query_hash and query_parameterized_hash are new output columns in the ACCOUNT_USAGE views and in the output of INFORMATION_SCHEMA table functions. These columns contain a hash of the query text. You can use this hash to analyze repeated queries.
These columns are available in the following views and in the output of the following table functions:
As of today, they are part of the behavior bundle release 2023_06. You can enable it with the below query:
SELECT SYSTEM$ENABLE_BEHAVIOR_CHANGE_BUNDLE('2023_06');
Understanding QUERY_HASH
The query_hash column contains a hash value that is computed, based on the canonicalized text of the SQL statement. Repeated queries that have exactly the same query text have the same query_hash values.
For instance, the below queries will have the same QUERY_HASH
SELECT * FROM TEST_DB.SF_SCH_203.CUSTOMER WHERE NAME = 'DIVYANSH';
SELECT * from test_db.sf_sch_203.customer where name= 'DIVYANSH';
Queries will have the same hash if their text differs only in:
Understanding QUERY_PARAMETERIZED_HASH
query_parameterized_hash contains a hash value that is computed based on the parameterized query, which means the version of the query after the literals are parameterized. These literals must be used in the query predicate and must be used with one of the following comparison operators:
For below 2 queries, the query_parameterized_hash will be the same:
领英推荐
SELECT * FROM TEST_DB.SF_SCH_203.CUSTOMER WHERE NAME = 'DIVYANSH';
SELECT * FROM TEST_DB.SF_SCH_203.CUSTOMER WHERE NAME = 'PAUL';
Queries will have the same parameterized hash if their text differs only in:
Important Things to Note
Over time, the logic used by Snowflake to generate the query hash can change. Changes to this logic can result in different hashes produced for the same query.
The views and table function output that include the query_hash and query_parameterized_hash columns also include the following columns that specify the version of the logic used to produce the hashes:
If these columns contain different version numbers for different periods of time, you can use these version columns to identify the different hashes for the same query.
About Me:
Hi there! I am Divyansh Saxena
I am an experienced Data Engineer with a proven track record of success in Snowflake Data Cloud technology. Highly skilled in designing, implementing, and maintaining data pipelines, ETL workflows, and data warehousing solutions. Possessing advanced knowledge of Snowflake’s features and functionality, I am a Snowflake Data superhero & and Snowflake Snowpro Core SME. With a major career in Snowflake Data Cloud, I have a deep understanding of cloud-native data architecture and can leverage it to deliver high-performing, scalable, and secure data solutions.
Follow me on Medium for regular updates on Snowflake Best Practices and other trending topics:
Data Engineer @ IBM | Microsoft Certified, Cloud Computing, Data Management
1 年This inspires me to get one snowpro certificate ?? .