Troubleshooting performance on serverless Synapse SQL pool using QPI library

Jovan Popovic

Principal Program Manager at Microsoft, working on Microsoft Fabric Warehouse. Worked on Azure Synapse, Azure SQL, Azure SQL Managed Instance, and SQL Server.

Published: December 21, 2021

Query performance insights

Four years ago, I created a Query performance insights library (QPI). This is a T-SQL library that enables troubleshooting and monitoring performance on #AzureSQL #ManagedInstance. While I worked on Managed Instance performance, this set of scripts helped me to troubleshoot various performance issues. I also created separate versions of QPI library for #AzureSQL database and #SQLServer.

Now I have created another QPI version for the serverless SQL pools in #AzureSynapseAnalytics. In this article, I will explain how to use this library to monitor and optimize your workload on serverless SQL pools.

QPI is an open-source library that is not part of the Azure service.

Setup QPI

The first thing that you need to do is to install the QPI scripts on your serverless SQL pool database (they cannot be installed in the master database). Follow this link, pick the version for the serverless SQL pools, and execute the script on a database. You may want to review the script before you execute it to make sure that it contains no dangerous actions. It creates views and functions that read only the system/catalog views; they do not read your data.

Once you run the script, you will be able to use T-SQL views and functions that can help you to:

Analyze your queries
Optimize your schema

Analyze the queries

QPI enables you to easily see the queries that are running on your serverless SQL pool using qpi.queries view:

SELECT *
FROM qpi.queries

As a result, you will get a list of the currently executing queries.

Another useful query returns the queries that were executed in the past:

SELECT *
FROM qpi.query_history

This query will return all finished (completed, canceled, and failed) queries that are kept in history. Note that some queries might expire from the history.

Every query has its own request id (or distributed statement id that you might see in the message window when you execute a query). Make sure that you provide this information to Azure support if you are reporting some performance issues with the queries.

If you are running the queries using Synapse Studio, SSMS, ADS, or some other query tool, you can take this distributed statement id from the messages window. However, if you need to report an issue with a query executed from Power BI or some other tool where you cannot see the info messages, find the query execution in this view and include its start_time and request_id.
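As a minimal sketch of that lookup, the query below lists the most recent finished queries with the two values that support would need. It uses only the start_time and request_id columns mentioned above; the TOP 20 limit is an arbitrary choice for illustration.

```sql
-- Most recent finished queries, newest first.
-- start_time and request_id are the columns named in the text above;
-- the limit of 20 rows is arbitrary.
SELECT TOP 20
       start_time,
       request_id
FROM qpi.query_history
ORDER BY start_time DESC;
```

Pick the row whose start_time matches the moment the report or dashboard was refreshed, and include both values in the support ticket.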

Another interesting column is query_text_id, which groups the "identical" query texts. It is a hash value computed on the query text with the constants/literals removed. Note that the queries might not be identical, but this column is useful when you need to find the executions of the same query.

The serverless SQL pool discards the literals in TOP, file paths, and other parts of the queries so you might have the same hash for different query shapes. But this is still the best way to find similar queries.
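One way to use this column, sketched under the assumption that query_text_id is exposed by qpi.query_history alongside start_time, is to count the executions per hash and see which query shapes run most often:

```sql
-- Group finished queries by the hash of their literal-free text.
-- Assumption: query_text_id and start_time are both columns of qpi.query_history.
SELECT query_text_id,
       COUNT(*)        AS executions,
       MIN(start_time) AS first_execution,
       MAX(start_time) AS last_execution
FROM qpi.query_history
GROUP BY query_text_id
ORDER BY executions DESC;
```

Because different query shapes can share a hash, treat each group as a set of similar queries rather than as guaranteed duplicates.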

You can compare the characteristics of two query executions by passing their request ids to the function qpi.cmp_queries().

This function might be useful if you need to compare two different executions of a query.
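A call might look like the sketch below. It assumes that qpi.cmp_queries() is a table-valued function that takes the two request ids as string arguments; check the installed script for the exact signature. The angle-bracket values are placeholders, not real ids.

```sql
-- Compare two executions of a query side by side.
-- Assumption: cmp_queries is a table-valued function with two string parameters.
-- Replace the placeholders with request_id values taken from qpi.query_history.
SELECT *
FROM qpi.cmp_queries('<request id of the first execution>',
                     '<request id of the second execution>');
```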

Optimize schema

You need to create tables with optimized schemas to get the best performance. There are many rules for schema optimization, such as:

Minimize the size of the types - for example, use VARCHAR(100) instead of VARCHAR(MAX) where it is possible.
Use VARCHAR type with UTF8 collation when you read Parquet, Delta, or CosmosDB data.
Minimize the size of the types of filter columns and always use a BIN2 collation on string filter columns.

You can find a list of best practices on the Synapse SQL documentation page.
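The rules above can be illustrated with an OPENROWSET query that declares small, explicit types in the WITH clause. This is only a sketch: the storage path is a placeholder, and the column names customer_id, country_code, and customer_name are hypothetical and must be replaced with columns that exist in your files.

```sql
-- Explicit, minimal types instead of inferred ones.
-- The path and the column names are hypothetical placeholders.
SELECT customer_id, customer_name
FROM OPENROWSET(
         BULK 'https://<storage-account>.dfs.core.windows.net/<container>/<folder>/*.parquet',
         FORMAT = 'PARQUET'
     )
     WITH (
         customer_id   INT,
         -- short string filter column with a BIN2 UTF8 collation
         country_code  VARCHAR(2)   COLLATE Latin1_General_100_BIN2_UTF8,
         -- bounded length instead of VARCHAR(MAX)
         customer_name VARCHAR(100) COLLATE Latin1_General_100_BIN2_UTF8
     ) AS customers
WHERE country_code = 'RS';
```

Latin1_General_100_BIN2_UTF8 satisfies both the UTF8 rule and the BIN2 rule, which is why it is used for the string columns here.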

Sometimes, it is hard to inspect all tables or views to apply the best practices. With the QPI library, you can use the qpi.recommendations view to get the possible improvements that you might make in your database.

This view inspects your schema and provides recommendations. The score column tells you how important the change is. The recommendations with a score of 1 are the most critical.

Note that these are the best-effort recommendations and some of them might not be applicable to your schema. However, it would be good to review them and ensure that you don't have some hidden issue in your schema that might impact your performance.
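A simple way to start, sketched here using only the score column described above, is to look at the most critical recommendations first and then review the rest:

```sql
-- The most critical recommendations (score of 1, as described above).
SELECT *
FROM qpi.recommendations
WHERE score = 1;

-- All recommendations, for a full review.
SELECT *
FROM qpi.recommendations;
```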

Conclusion

Query performance insights (QPI) is a simple T-SQL library that enables you to analyze query performance and get recommendations that might optimize your workload. It is a lightweight and free library that can help you if you are using serverless SQL pools to analyze your data.

Note that this is an open-source library that is not a part of Azure services. You can use it if it can help you, and you can report bugs/feature requests on the GitHub project site. However, Azure support will not handle any issue related to this library. Use this library to get the additional information that might help you to optimize your databases or provide additional troubleshooting information to support.

