What is making Azure Synapse DW (dedicated Pool) Tempdb reaching it’s limit ?
This is Post2 in a series of helpful Articles for Performance Tuning of Azure Synapse Dedicated Pool . In this we will take a closer look on what is consuming Tempdb and how we can get further help . so Let's Begin !
Tempdb Is always grabbing attention from DBAs and for any long running process. Enhanced version of Synapse and very large and distributed tempdb architecture of Azure Synapse supports very large workloads but sometimes you may still struggle with TempDB contention. Scalable Tempdb (For every DW100c configured, 399 GB of tempdb space is allocated (DW1000c would have 3.99 TB of total tempdb space) is here but still nobody can save even the best system from bad code or bad design of underlying objects!
tempdb is used for temporary tables and intermediate space for shuffle, sort, Polybase loads and other operations. The amount of space used will tend to scale with the size of the data processed and size of data relative to memory size for spooling memory overflows to disk. Since tempdb is a shared resource for the data warehouse large consumers of tempdb space may impact other users in the warehouse. For more details, please refer to https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-manage-monitor#monitor-tempdb .
Here we will discuss how to troubleshoot and find the right way of finding the process which is causing your tempdb to reach its limits .
Synapse gives you metric for Tempdb usage which can show you graphical usage pattern of Tempdb . You can also set an alert for tempDb Usage for anytime it cross 90% .
To begin with lets look at the following error msgs which is encountered when you reach the limits of assigned Tempdb space for per node in Azure Synapse Dedicated Pool :
Error message : Failed to run query on the underlying database: Error Message: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Could not allocate space for object '<temporary system object: xxxxxxxx>' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
Initial Troubleshooting for Tempdb space utilized by Synapse
1) Use following DMV query to see which node is utilizing how much of space:
/* Monitor tempdb per compute node */
SELECT ssu.pdw_node_id,(SUM((ssu.user_objects_alloc_page_count * 8)) + SUM((ssu.internal_objects_alloc_page_count * 8))) AS 'Tempdb_Space_Allocated_KB' FROM sys.dm_pdw_nodes_db_session_space_usage AS ssu WHERE DB_NAME(ssu.database_id) = 'tempdb' GROUP BY ssu.pdw_node_id order by pdw_Node_id
2) Use following DMV to monitor Tempdb Used by individual running sessions:
/* Monitor tempdb per session*/
-- Monitor tempdb per session SELECT sr.request_id, ssu.session_id, ssu.pdw_node_id, sr.command, sr.total_elapsed_time, exs.login_name AS 'LoginName', DB_NAME(ssu.database_id) AS 'DatabaseName', (es.memory_usage * 8) AS 'MemoryUsage (in KB)', (ssu.user_objects_alloc_page_count * 8) AS 'Space Allocated For User Objects (in KB)', (ssu.user_objects_dealloc_page_count * 8) AS 'Space Deallocated For User Objects (in KB)', (ssu.internal_objects_alloc_page_count * 8) AS 'Space Allocated For Internal Objects (in KB)', (ssu.internal_objects_dealloc_page_count * 8) AS 'Space Deallocated For Internal Objects (in KB)', CASE es.is_user_process WHEN 1 THEN 'User Session' WHEN 0 THEN 'System Session' END AS 'SessionType', es.row_count AS 'RowCount' FROM sys.dm_pdw_nodes_db_session_space_usage AS ssu INNER JOIN sys.dm_pdw_nodes_exec_sessions AS es ON ssu.session_id = es.session_id AND ssu.pdw_node_id = es.pdw_node_id INNER JOIN sys.dm_pdw_nodes_exec_connections AS er ON ssu.session_id = er.session_id AND ssu.pdw_node_id = er.pdw_node_id INNER JOIN microsoft.vw_sql_requests AS sr ON ssu.session_id = sr.spid AND ssu.pdw_node_id = sr.pdw_node_id LEFT JOIN sys.dm_pdw_exec_requests exr on exr.request_id = sr.request_id LEFT JOIN sys.dm_pdw_exec_sessions exs on exr.session_id = exs.session_id WHERE DB_NAME(ssu.database_id) = 'tempdb' AND es.session_id <> @@SPID AND es.login_name <> 'sa'
ORDER BY sr.request_id;
How to understand the Tempdb Usage and what are ways to fix this :
- Ensure table statistics are created and kept up to date.
SQL Data Warehouse supports automatic statistics creation and manual statistics updates by users. As you load data into your data warehouse, query plans can regress if statistics are not up to date, updating or creating table statistics, does cause some load in the database, we do recommend daily stat updates to tables that are fed big volumes of data, due to this you will definitely benefit from setting low-workload time windows dedicated to database maintenance.
o The SQL Data Warehouse query optimizer is a cost-based optimizer. It compares the cost of various query plans, and then chooses the plan with the lowest cost, which is in most cases the plan that executes the fastest. The cost-based optimized relies on table statistics to ensure the most optimized query plan is selected, inaccurate distributed plan creation with undesired intermediate data movements will in turn will reflect on tempdb used space.
- Another recommended mitigation is to break your CTAS or INSERT SELECT statement into multiple load statements so the data volume will not exceed the 1TB per node tempdb limit.
- Pay special attention to queries using DISTINCT, which might be a cover for poor query statements which have CROSS JOINS or poorly understood table relationships resulting in duplicated values when joining.
- Avoid long standing user temporary tables/transactions.
- Avoid directly issuing queries against external tables by loading data first (Ref: Best practices for dedicated SQL pools - Azure Synapse Analytics | Microsoft Docs) :
o External tables are not optimal for queries. External tables residing in Blob storage or Azure data lake does not have compute resources backing them; there the data warehouse cannot offload this work. Therefore, your data warehouse will be required to read the entire file into tempdb to scan the data.
- Ensure queries are running with enough memory allocation, each query has a specific memory grant depending on its resource class and the data warehouse service level. When queries consume their grant, they must spill into tempdb slowing down query performance.
Use the guidelines on how to judge the most appropriate resource classes according to the type of workload, which you can reference here: Resource classes for workload management - Azure Synapse Analytics | Microsoft Docs
I hope this article helps you to understand your Azure Synapse datawarehouse Tempdb utilization .
In this performance tuning series , next (post 3) we will look deeper in Query Execution Plans and statistics for Db objects !
Senior Data Engineer | SQL | Python | Azure | Databricks
2 å¹´Reecha Mishra Is this typo microsoft.vw_sql_requests?
Senior Specialist Solutions Architect at Databricks (Data Warehousing & BI)
3 å¹´Love these articles Reecha Mishra - really insightful and gives a much deeper understanding of what goes on "under the hood." Bookmarked ??