登录查看更多内容

Importance of Analyzing Waits in SQL Server

Sandro Servino

M.Sc, MBA, Database′s Administrator and Architect

发布日期: 2024年12月15日

+ 关注

Understanding and analyzing waits in SQL Server is crucial for diagnosing and improving performance bottlenecks.

www.sandroservino.com.br

Wait statistics indicate where SQL Server processes spend most of their time waiting, offering insights into resource contention, query optimization opportunities, and system bottlenecks. Each wait type provides valuable clues about the underlying issue, such as delays caused by CPU, memory, lock, disk I/O, or network resources. By identifying and addressing these waits, you can optimize your system for better performance and reliability.

Below is a SQL script to capture the main wait types currently affecting your environment.

This script is widely known as part of an approach for analyzing wait statistics (wait stats) in SQL Server and was created by Paul Randal, a renowned SQL Server expert and founder of SQLskills. He is one of the leading advocates for using wait statistics to diagnose performance issues in SQL Server.

The script is available on the SQLskills website, where Paul and his team publish educational materials and scripts to assist with SQL Server administration and optimization. The dynamically generated URL in the script (https://www.sqlskills.com/help/waits/) links directly to pages explaining the different types of waits in SQL Server, providing detailed information about their meaning and potential solutions.

Purpose of the Script

This script is used to identify major bottlenecks in SQL Server based on recorded wait types.
It prioritizes the most significant wait types, excluding those generally irrelevant for analysis.
It provides helpful links for additional information, facilitating understanding and troubleshooting.

If you are using this script, it is highly recommended to follow Paul Randal's materials and SQLskills resources to maximize your knowledge about wait stats analysis.

This can help pinpoint which resources or queries are contributing to performance issues:

WITH [Waits]

AS (

SELECT

[wait_type],

[wait_time_ms] / 1000.0 AS [WaitS],

( [wait_time_ms] - [signal_wait_time_ms] ) / 1000.0 AS [ResourceS],

[signal_wait_time_ms] / 1000.0 AS [SignalS],

[waiting_tasks_count] AS [WaitCount],

100.0 * [wait_time_ms] / SUM([wait_time_ms]) OVER () AS [Percentage],

ROW_NUMBER() OVER ( ORDER BY [wait_time_ms] DESC ) AS [RowNum]

FROM sys.dm_os_wait_stats

WHERE [wait_type] NOT IN (

-- Filtered waits types, not usually relevant for analysis

N'BROKER_EVENTHANDLER', N'BROKER_RECEIVE_WAITFOR',

N'CXCONSUMER', N'LAZYWRITER_SLEEP', N'SLEEP_TASK',

N'XE_TIMER_EVENT', -- Add more irrelevant waits here

N'SQLTRACE_BUFFER_FLUSH')

AND [waiting_tasks_count] > 0

)

SELECT

MAX([W1].[wait_type]) AS [WaitType],

CAST(MAX([W1].[WaitS]) AS DECIMAL(16, 2)) AS [Wait_S],

领英推荐

Execution Plan in MS SQL Server

Home Credit Vietnam 1 年前

Endless SQL Possibilities

ARNAB MUKHERJEE ???? 4 个月前

Top 50 Microsoft SQL Server Interview Questions:…

Vivek Kumar 3 个月前

CAST(MAX([W1].[ResourceS]) AS DECIMAL(16, 2)) AS [Resource_S],

CAST(MAX([W1].[SignalS]) AS DECIMAL(16, 2)) AS [Signal_S],

MAX([W1].[WaitCount]) AS [WaitCount],

CAST(MAX([W1].[Percentage]) AS DECIMAL(5, 2)) AS [Percentage],

CAST(( MAX([W1].[WaitS]) / MAX([W1].[WaitCount])) AS DECIMAL(16, 4)) AS [AvgWait_S],

CAST(( MAX([W1].[ResourceS]) / MAX([W1].[WaitCount])) AS DECIMAL(16, 4)) AS [AvgRes_S],

CAST(( MAX([W1].[SignalS]) / MAX([W1].[WaitCount])) AS DECIMAL(16, 4)) AS [AvgSig_S],

CAST('https://www.sqlskills.com/help/waits/' + MAX([W1].[wait_type]) AS XML) AS [Help/Info URL]

FROM [Waits] AS [W1]

INNER JOIN [Waits] AS [W2] ON [W2].[RowNum] <= [W1].[RowNum]

GROUP BY [W1].[RowNum]

HAVING SUM([W2].[Percentage]) - MAX([W1].[Percentage]) < 95; -- Threshold for analysis

Some common wait events and their possible causes:

ASYNC_NETWORK_IO / NETWORKIO: The NETWORKIO (SQL 2000) and ASYNC_NETWORK_IO (SQL 2005+) wait events can point to network-related issues (rarely), but generally indicate that a client application is not processing SQL Server results quickly enough. This event often appears in cases of RBAR (Row-By-Agonizing-Row).

CXPACKET: This event is typically related to queries using parallel processing. It may indicate that one thread has completed its processing and is waiting for other threads in the process to finish. If this wait time is very high, it may be worth revisiting the MAXDOP configuration, evaluating whether the OPTION(MAXDOP x) hint could be helpful for certain queries, reassessing the indexes used by disk-heavy queries (likely those utilizing parallelism), and ensuring that sargable arguments are being used.

DTC: This wait event is not local. When using the Microsoft Distributed Transaction Coordinator (MS-DTC), a transaction is opened across multiple systems simultaneously and only completes once executed on all systems.

OLEDB: This wait event indicates that the process made a call to an OLEDB provider and is waiting for that call to process on the destination server and return the data. This event also occurs when running commands on other instances using Linked Servers (remote calls), BULK INSERT commands, and FULLTEXT-SEARCH queries. There's nothing to do on the local instance, but in cases of remote calls to other instances, you can analyze the destination instance to identify why processing and data return are delayed.

PAGEIOLATCH_*:** This event occurs when SQL Server is waiting to read disk/storage pages not currently in memory, causing disk contention. To reduce this I/O event, consider increasing disk speed (to reduce time), adding more memory (to allocate more pages), or tuning queries causing this I/O event. A very common cause of this event is the lack of optimal indexes for a given query, which can often be resolved by creating new indexes with the help of Missing Index DMVs to avoid Scan operations (check out the post "Understanding Indexes in SQL Server" for more details).

PAGELATCH_*:** The most common causes of this event are contention in TempDB on heavily loaded instances. The PAGELATCH_UP wait generally occurs when multiple threads are trying to access the same bitmap, while the PAGELATCH_EX event occurs when threads are trying to insert data into the same disk page, and PAGELATCH_SH indicates that a thread is attempting to read data from a page being modified.

IO_COMPLETION: This wait event occurs when SQL Server is waiting for I/O operations to complete processing. These are not index or data reads from the disk but rather reads of disk allocation bitmaps (GAM, SGAM, PFS), transaction logs, writes of sort buffers to disk, reads of VLF headers in transaction logs, reads/writes of merge join/eager spools on disk, etc. This event is normal, as it often appears just before starting operations requiring I/O, but it could be problematic if the wait time is very high and your top waits include ASYNC_IO_COMPLETION, LOGMGR, WRITELOG, or PAGEIOLATCH_*.

When there is an I/O issue on the instance, the SQL Server log often shows messages like "I/O requests are taking longer than X seconds to complete," which could occur due to slow disks or operations consuming high I/O, such as BACKUP, ALTER/CREATE DATABASE, and AutoGrowth events.

To investigate this issue, look for queries with high disk read/write times, including the following counters: Disk Avg. Read Time, Disk Avg. Write Time, Avg. Disk Queue Length, Buffer Cache Hit Ratio, Buffer Free Pages, Buffer Page Life Expectancy, and Machine: Memory Used.

SOS_SCHEDULER_YIELD: This event occurs when SqlOS (SOS) is waiting for more CPU resources to finish processing, which could indicate CPU overload, preventing the server from processing all tasks being requested. However, this doesn't always mean a general issue with the instance; it might indicate a specific query requiring more CPU.

If your query has parallelism inhibitors (e.g., HINT MAXDOP(1), serial UDF functions, system table queries, etc.), it might force processing to use only one CPU core, causing the SOS_SCHEDULER_YIELD event even if the server has multiple cores available and low CPU usage. To identify this, try locating queries consuming the most CPU or where CPU time is less than execution time (in parallelism, a query might consume 10s of CPU but execute in 2s, meaning it ran across several CPU cores).

WRITELOG: When a transaction is waiting on the WRITELOG event, it means the SQL Server is waiting for log cache data to be flushed to disk (transaction log). Since this involves disk writes, which are generally much slower than memory access or CPU processing, this type of event can be common on instances with slower disks. To reduce this wait, one option is to use the Delayed Durability feature available from SQL Server 2014 onwards.

LCK*:** A very common event not directly related to performance, this occurs when a transaction modifies an object and places a lock on it, preventing other transactions from modifying or accessing the data while processing. If another transaction tries to access this object, it must wait for the lock to be released before continuing, generating this wait event. This is often mistaken for a performance issue because the query will take as long as necessary for the lock to be released, giving users the impression of slowness, especially if it's a heavily used query in the system. In such cases, analyze the session generating the lock to determine why it’s taking so long to process and release the object.

Thanks

Sandro

Jorge Zacharias

2 个月

Very helpful

1 次回应

Malcolm P.

TSQL C# Python Javascript Typescript Node.js Powershell AWS Github Jenkins solution provider

2 个月

Absolutely agree, this is the dark matter of SQL as it is the lack of resource usage identifying the problem. The biggest gain I made from analysing waits is when the transfer of the data to the client took too long. This identified the correct area to focus on for optimisation work.

1 次回应

Sandro Rafael Hwizdaleck

2 个月

Dicas úteis

1 次回应

查看更多评论

要查看或添加评论，请登录

Sandro Servino的更多文章

Usando Hints no PostgreSQL para For?ar o Uso de índices e Outros Métodos de Execu??o

2025年3月1日

Usando Hints no PostgreSQL para For?ar o Uso de índices e Outros Métodos de Execu??o

www.academiadba.
Guia Completo de Particionamento no SQL Server

2025年2月27日

Guia Completo de Particionamento no SQL Server

www.academiadba.
Estrutura de Diretórios do PostgreSQL 17 (data)

2025年2月25日

Estrutura de Diretórios do PostgreSQL 17 (data)

www.academiadba.
Particionamento de Tabelas no SQL Server: Melhorando a Performance e a Organiza??o dos Dados

2025年2月24日

Particionamento de Tabelas no SQL Server: Melhorando a Performance e a Organiza??o dos Dados

www.academiadba.
O que é PostgreSQL-VACUUM ?

2025年2月23日

O que é PostgreSQL-VACUUM ?

www.academiadba.
Quais bancos relacionais tem recurso de particionamento de tabelas e índices e quais n?o tem?

2025年2月22日

Quais bancos relacionais tem recurso de particionamento de tabelas e índices e quais n?o tem?

www.academiadba.
Redis vs. Bancos de Dados Relacionais: Quando e Por Que Utilizar Cada um?

2025年2月22日

Redis vs. Bancos de Dados Relacionais: Quando e Por Que Utilizar Cada um?

www.AcademiaDBA.
A Importancia da Ordem dos Campos em um índice Composto no SQL Server e no Oracle.

2025年2月21日

A Importancia da Ordem dos Campos em um índice Composto no SQL Server e no Oracle.

www.academiadba.
As Vantagens de Trabalhar com Power BI

2025年2月17日

As Vantagens de Trabalhar com Power BI

O Power BI é uma das ferramentas de Business Intelligence (BI) mais populares do mercado, oferecendo uma solu??o…
Clonagem de PDBs e CDBs no Oracle Database

2025年2月17日

Clonagem de PDBs e CDBs no Oracle Database

Forma??o Oracle com super desconto: https://www.udemy.

See all articles

Importance of Analyzing Waits in SQL Server

Sandro Servino

M.Sc, MBA, Database′s Administrator and Architect

Purpose of the Script

领英推荐

Some common wait events and their possible causes:

Sandro Servino的更多文章

社区洞察

其他会员也浏览了

It's The Assumptions That Get You

A Comprehensive Guide to Enhancing Server Performance

The impasse of SQL performance optimizing

SQL Server Key Lookup & Scan

If Marie Kondo did SQL...

Using SQL to Retrieve a System Value

Why are functions bad SQL Server performance?

Memory Grant Feed - SQL 2019

?? Interview questions for SQL (Beginner to Intermediate)

Optimizing SQL Server: Mitigating the Impact of Expensive Query Operations

Purpose of the Script

领英推荐

Some common wait events and their possible causes:

Sandro Servino的更多文章

Usando Hints no PostgreSQL para For?ar o Uso de índices e Outros Métodos de Execu??o

Guia Completo de Particionamento no SQL Server

Estrutura de Diretórios do PostgreSQL 17 (data)

Particionamento de Tabelas no SQL Server: Melhorando a Performance e a Organiza??o dos Dados

O que é PostgreSQL-VACUUM ?

Quais bancos relacionais tem recurso de particionamento de tabelas e índices e quais n?o tem?

Redis vs. Bancos de Dados Relacionais: Quando e Por Que Utilizar Cada um?

A Importancia da Ordem dos Campos em um índice Composto no SQL Server e no Oracle.

As Vantagens de Trabalhar com Power BI

Clonagem de PDBs e CDBs no Oracle Database

社区洞察

其他会员也浏览了

It's The Assumptions That Get You

A Comprehensive Guide to Enhancing Server Performance

The impasse of SQL performance optimizing

SQL Server Key Lookup & Scan

If Marie Kondo did SQL...

Using SQL to Retrieve a System Value

Why are functions bad SQL Server performance?

Memory Grant Feed - SQL 2019

?? Interview questions for SQL (Beginner to Intermediate)

Optimizing SQL Server: Mitigating the Impact of Expensive Query Operations