What is making Azure Synapse DW (dedicated Pool) Tempdb reaching it’s limit ?

What is making Azure Synapse DW (dedicated Pool) Tempdb reaching it’s limit ?

This is Post2 in a series of helpful Articles for Performance Tuning of Azure Synapse Dedicated Pool . In this we will take a closer look on what is consuming Tempdb and how we can get further help . so Let's Begin !

Tempdb Is always grabbing attention from DBAs and for any long running process. Enhanced version of Synapse and very large and distributed tempdb architecture of Azure Synapse supports very large workloads but sometimes you may still struggle with TempDB contention. Scalable Tempdb (For every DW100c configured, 399 GB of tempdb space is allocated (DW1000c would have 3.99 TB of total tempdb space)  is here but still nobody can save even the best system from bad code or bad design of underlying objects!

tempdb is used for temporary tables and intermediate space for shuffle, sort, Polybase loads and other operations. The amount of space used will tend to scale with the size of the data processed and size of data relative to memory size for spooling memory overflows to disk. Since tempdb is a shared resource for the data warehouse large consumers of tempdb space may impact other users in the warehouse. For more details, please refer to https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-manage-monitor#monitor-tempdb .

Here we will discuss how to troubleshoot and find the right way of finding the process which is causing your tempdb to reach its limits .

Synapse gives you metric for Tempdb usage which can show you graphical usage pattern of Tempdb . You can also set an alert for tempDb Usage for anytime it cross 90% .

No alt text provided for this image

        

To begin with lets look at the following error msgs which is encountered when you reach the limits of assigned Tempdb space for per node in Azure Synapse Dedicated Pool :

Error message : Failed to run query on the underlying database: Error Message: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Could not allocate space for object '<temporary system object: xxxxxxxx>' in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup. 

Initial Troubleshooting for Tempdb space utilized by Synapse

1)     Use following DMV query to see which node is utilizing how much of space:

/* Monitor tempdb per compute node */

SELECT  ssu.pdw_node_id,(SUM((ssu.user_objects_alloc_page_count * 8)) + SUM((ssu.internal_objects_alloc_page_count * 8))) AS 'Tempdb_Space_Allocated_KB' FROM sys.dm_pdw_nodes_db_session_space_usage AS ssu WHERE DB_NAME(ssu.database_id) = 'tempdb' GROUP BY ssu.pdw_node_id  order by pdw_Node_id

2)     Use following DMV to monitor Tempdb Used by individual running sessions:

/* Monitor tempdb per session*/

-- Monitor tempdb per session 
SELECT
    sr.request_id,
    ssu.session_id,
    ssu.pdw_node_id,
    sr.command,
    sr.total_elapsed_time,
    exs.login_name AS 'LoginName',
    DB_NAME(ssu.database_id) AS 'DatabaseName',
    (es.memory_usage * 8) AS 'MemoryUsage (in KB)',
    (ssu.user_objects_alloc_page_count * 8) AS 'Space Allocated For User Objects (in KB)',
    (ssu.user_objects_dealloc_page_count * 8) AS 'Space Deallocated For User Objects (in KB)',
    (ssu.internal_objects_alloc_page_count * 8) AS 'Space Allocated For Internal Objects (in KB)',
    (ssu.internal_objects_dealloc_page_count * 8) AS 'Space Deallocated For Internal Objects (in KB)',
    CASE es.is_user_process
    WHEN 1 THEN 'User Session'
    WHEN 0 THEN 'System Session'
    END AS 'SessionType',
    es.row_count AS 'RowCount'
FROM sys.dm_pdw_nodes_db_session_space_usage AS ssu
    INNER JOIN sys.dm_pdw_nodes_exec_sessions AS es ON ssu.session_id = es.session_id AND ssu.pdw_node_id = es.pdw_node_id
    INNER JOIN sys.dm_pdw_nodes_exec_connections AS er ON ssu.session_id = er.session_id AND ssu.pdw_node_id = er.pdw_node_id
    INNER JOIN microsoft.vw_sql_requests AS sr ON ssu.session_id = sr.spid AND ssu.pdw_node_id = sr.pdw_node_id
    LEFT JOIN sys.dm_pdw_exec_requests exr on exr.request_id = sr.request_id
    LEFT JOIN sys.dm_pdw_exec_sessions exs on exr.session_id = exs.session_id
WHERE DB_NAME(ssu.database_id) = 'tempdb'
    AND es.session_id <> @@SPID
    AND es.login_name <> 'sa'
ORDER BY sr.request_id;

How to understand the Tempdb Usage and what are ways to fix this :

  • Ensure table statistics are created and kept up to date.

SQL Data Warehouse supports automatic statistics creation and manual statistics updates by users. As you load data into your data warehouse, query plans can regress if statistics are not up to date, updating or creating table statistics, does cause some load in the database, we do recommend daily stat updates to tables that are fed big volumes of data, due to this you will definitely benefit from setting low-workload time windows dedicated to database maintenance.

o  The SQL Data Warehouse query optimizer is a cost-based optimizer. It compares the cost of various query plans, and then chooses the plan with the lowest cost, which is in most cases the plan that executes the fastest. The cost-based optimized relies on table statistics to ensure the most optimized query plan is selected, inaccurate distributed plan creation with undesired intermediate data movements will in turn will reflect on tempdb used space.

  • Another recommended mitigation is to break your CTAS or INSERT SELECT statement into multiple load statements so the data volume will not exceed the 1TB per node tempdb limit. 
  • Pay special attention to queries using DISTINCT, which might be a cover for poor query statements which have CROSS JOINS or poorly understood table relationships resulting in duplicated values when joining.
  •  Avoid long standing user temporary tables/transactions.
  •  Avoid directly issuing queries against external tables by loading data first (Ref: Best practices for dedicated SQL pools - Azure Synapse Analytics | Microsoft Docs) :

o  External tables are not optimal for queries. External tables residing in Blob storage or Azure data lake does not have compute resources backing them; there the data warehouse cannot offload this work. Therefore, your data warehouse will be required to read the entire file into tempdb to scan the data.

  •  Ensure queries are running with enough memory allocation, each query has a specific memory grant depending on its resource class and the data warehouse service level. When queries consume their grant, they must spill into tempdb slowing down query performance.

 Use the  guidelines on how to judge the most appropriate resource classes according to the type of workload, which you can reference here: Resource classes for workload management - Azure Synapse Analytics | Microsoft Docs

 I hope this article helps you to understand your Azure Synapse datawarehouse Tempdb utilization .

In this performance tuning series , next (post 3) we will look deeper in Query Execution Plans and statistics for Db objects !

 

 

 

Sajawal Ismaeel

Senior Data Engineer | SQL | Python | Azure | Databricks

2 å¹´

Reecha Mishra Is this typo microsoft.vw_sql_requests?

赞
回复
Mike Dobing

Senior Specialist Solutions Architect at Databricks (Data Warehousing & BI)

3 å¹´

Love these articles Reecha Mishra - really insightful and gives a much deeper understanding of what goes on "under the hood." Bookmarked ??

要查看或添加评论,请登录

Reecha Mishra的更多文章

  • Navigating Organizational Changes: Thriving in midst of changes

    Navigating Organizational Changes: Thriving in midst of changes

    Introduction Change is a constant in the modern workplaces and super common in Microsoft. In last 22 yrs of career, I…

    5 条评论
  • World Mental Health Day

    World Mental Health Day

    What should we know and How can we help ? It’s hard to talk about “Mental Health” in mid of War starting in middle east…

    1 条评论
  • Data & Analytics Lens: What Well-Architected practice look like ?

    Data & Analytics Lens: What Well-Architected practice look like ?

    Do you have your Data Platform or Analytics solution running on cloud ? Do you revisit the design and architecture to…

    4 条评论
  • Steps to recover from your own Mistakes !

    Steps to recover from your own Mistakes !

    Do not brood over your past mistakes and failures as this will only fill your mind with grief, regret and depression…

    7 条评论
  • Oracle databases to Azure - Data Management and Innovation at same place

    Oracle databases to Azure - Data Management and Innovation at same place

    Keeping focus on Data Strategy and culture to meet operational excellence with innovation , I would like to share a key…

    7 条评论
  • How to tune long running Queries on Azure Synapse DW (Dedicated Pool) ?

    How to tune long running Queries on Azure Synapse DW (Dedicated Pool) ?

    By every passing day I am getting this frequent ask from all my DBA friends on help to look into long running queries…

    8 条评论

社区洞察

其他会员也浏览了