Negative SPIDs in sql server and how to deal with it

Negative SPIDs in sql server and how to deal with it

What is a negative SPID?

The negative SPID is known as a distributed transaction SPID or an orphaned distributed transaction SPID. It’s very rare to see the Negative SPID as it is not so common. Negative SPID happens when someone kills a distributed transaction SPID. An orphaned distributed transaction SPID to be precise that involves MSDTC – The Microsoft Distributed Transaction Coordinator.

?What causes negative SPIDs?

When it is needed to work on two or more server at a time and fetch data from many those servers within a transaction, means when a transaction involves data that resides on more than one server – MSDTC needs to become involved.

Let’s assume if something goes wrong with any of the servers involved, like a server drops off a network or some power outage, the distributed transaction is impacted. Usually, MSDTC handles such scenarios very well in itself however when MSDTC can’t recover itself for any reasons, the SPID which is handling the distributed transaction on one or more servers cannot do anymore work and result in an orphaned SPID.

To mark this as an orphaned distributed transaction SPID, the SQL server changes the SPID from the positive number to the negative SPID.

?Impact of the negative SPID

When the SQL changes the orphaned distributed transaction SPID to negative, the only problem is the SPID may still be holding on the resources like table, page, or row locks, and blocking other SPIDs which want to access the database objects. This impacts the SQL performance due to the resultant block chain that blocks dozen other transactions behind it to acquire the resources.

?How to deal with the negative SPID

Until the negative SPID is killed, nothing will happen however the issue is how to kill the negative SPID.

Let’s assume you tried to kill -2 using KILL -2, it will throw an error message as below:

This is because the KILL command can’t handle SPIDs with a value of less than 1.

So now how to resolve this issue.? There is a way to handle this which is explained below.

?You need to look up for the UOW (Unit of Work) ID of the offending SPID to terminate the process.

To get the UOW value, run the following command:

Note down the UOW values returned by the above command. This will return a 32-digit UOW number like 'FA4A8712-12D5-11D3-7213-345ABC7E9'.

Use the below command to kill the process:
Kill 'FA4A8712-12D5-11D3-7213-345ABC7E9'        

This will kill the lead blocker with the negative SPID -2.

?Types of negative SPIDs explained

?Negative SPIDs can be between 2 and 5. Below is an explanation for each of the negative SPIDs.

SESSION ID -2

-2 indicates that the blocking resource is owned by an orphaned distributed transaction.

You can deal with it using the above example. Finding the UOW value and killing the offending process.

?SESSION ID -3

-3 indicates the blocking resource is owned by the deferred recovery transaction.

This indicates a block is occurring because a page being accessed by your query was possibly damaged. This is something that probably needs research. Check the wait_resource column of the sys.dm_exec_requests to see which page is at the issue.

SESSION ID -4

It occurs when the session id of the blocking latch owner could not be determined at this time because of internal latch state transitions.

A session ID of -4 is usually not an issue. It may have been timing of when query to sys.dm_exec_requests ran, and it would show -4.

SESSION ID -5

It occurs when the session id of the blocking latch owner could not be determined because it is not tracked for this latch type. This too is usually not an issue, and this is going to show up for lot many things such as a session waiting on PAGEIOLATCH_SH (which is physical IO wait).? In general, it relates to performance issue. So, if you encounter -5 SPID don’t focus on blocks, instead focus on the wait type the session is encountering.

?Conclusions:

A negative session ID is a session that does not really exists. It is a placeholder for a workload that doesn’t have the session ID associated with it. Some negative session IDs are just noise while others indicate a problem.

要查看或添加评论,请登录

Vishal Srivastava的更多文章

社区洞察

其他会员也浏览了