Article ID: 832524 - View products that this article applies to.
SQL Server Technical BulletinTopic covered in this issue: How to resolve a deadlock
To identify, to troubleshoot, and to recommend a solution for resolving a deadlock.
This article examines a deadlock situation and provides steps for resolving the deadlock. Each deadlock may be different and can be caused by several different environment variables. The information provided in this article can help you identify and resolve a deadlock.
In a case study, we examine a 911 system that has six operators. During peak activity, the Microsoft Visual Basic front-end application they are using experiences broken connections. Because of the broken connections, the operators must re-input data. For a 911 system that operates 24 hours a day, seven days a week, this behavior is unacceptable.
A deadlock occurs when two system server process IDs (SPIDs) are waiting for a resource and neither process can advance because the other process is preventing it from getting the resource.
The lock manager’s thread checks for deadlocks. When a lock manager’s deadlock detection algorithm detects a deadlock, the lock manager chooses one of the SPIDs as a victim. The lock manager initiates a 1205 error message that is sent to the client, and the lock manager kills the SPID. Killing the SPID frees the resources and allows the other SPID to continue. Killing the SPID that is the deadlock victim is what causes the broken connection that the Visual Basic front-end application experiences.
In a well designed application, the front-end application should trap for the 1205 error, reconnect to SQL Server, and then re-submit the transaction.
Although deadlocks can be minimized, they cannot be completely avoided. That is why the front-end application should be designed to handle deadlocks.
To identify a deadlock, you must first obtain log information. If you suspect a deadlock, you must gather information about the (SPIDs) and the resources that are involved in the deadlock. To do this, add the -T1204 and the -T3605 startup parameters to SQL Server. To add these two startup parameters, follow these steps:
The startup parameters will take effect when SQL Server is stopped and then re-started.
The -T1204 startup parameter collects information about the process and the resources when the deadlock detection algorithm encounters a deadlock. The -T3605 startup parameter writes this information to the SQL Server error logs.
The -T1205 startup parameter collects information every time that the deadlock algorithm checks for a deadlock, not when a deadlock is encountered. You do not have to use the -T1205 startup parameter.
If you do use the -T1205 startup parameter, the following is a sample of the output that will be in the SQL Server error log:
2003-05-14 11:46:26.76 spid4 Starting deadlock search 1 2003-05-14 11:46:26.76 spid4 Target Resource Owner: 2003-05-14 11:46:26.76 spid4 ResType:LockOwner Stype:'OR' Mode: S SPID:55 ECID:0 Ec:(0x43CAB580) Value:0x42bdf340 2003-05-14 11:46:26.76 spid4 Node:1 ResType:LockOwner Stype:'OR' Mode: S SPID:55 ECID:0 Ec:(0x43CAB580) Value:0x42bdf340 2003-05-14 11:46:26.76 spid4 2003-05-14 11:46:26.76 spid4 End deadlock search 1 ... a deadlock was not found. 2003-05-14 11:46:26.76 spid4 ---------------------------------- 2003-05-14 11:46:31.76 spid4 ---------------------------------- 2003-05-14 11:46:31.76 spid4 Starting deadlock search 2
Sometimes, you might not be able to stop and re-start SQL Server. In that case, you can use Query Analyzer to run the following command to enable the deadlock trace flags.
Note This way you can gather information about the deadlocks immediately. The "-1" indicates all SPIDs.
Next, you must collect a SQL Profiler trace. If you turn on the deadlock trace flag, you will get most of the required information, but not always. For example, in a case study the trace flag output identified that a sp_cursoropen system stored procedure and an "UPDATE tblQueuedEvents set notifyid = 3, ResynchDate" statement were involved in a deadlock. Unfortunately, you do not know the definition of the sp_cursoropen system stored procedure. You also do not have the complete UPDATE statement because it was truncated.
SQL Profiler can obtain the full statements in addition to the execution plans of the statements. A SQL Profiler trace also has a lock event for "deadlock" and for "deadlock chain." "Deadlock" corresponds to the -T1204 flag, and "deadlock chain" corresponds to the -T1205 flag. Turning on the deadlock trace flags and running a SQL Profiler trace during the occurrence of a deadlock should provide you the data that you must have to troubleshoot a deadlock. In this case, and in others, running SQL Profiler changes the timing of execution enough to prevent the deadlock. Therefore, you will typically capture the deadlock information with the trace flags, and then you run SQL Profiler.
After a deadlock occurs, you can gather information about the deadlock by using the sqldiag utility and by using SQL Profiler. In the output of the SQLDiag.txt file, look for a "Wait-for-graph" entry. A "Wait-for graph" entry indicates that a deadlock was encountered.
The following is a sample of the output that you might see in the SQL Server error log when you use the -T1205 startup parameter.
2003-05-05 15:11:50.80 spid4 Wait-for graph 2003-05-05 15:11:50.80 spid4 Node:1 2003-05-05 15:11:50.80 spid4 ResType:LockOwner Stype:'OR' Mode: S SPID:55 ECID:0 Ec:(0x33AE1538) Value:0x193 2003-05-05 15:11:50.80 spid4 Victim Resource Owner: 2003-05-05 15:11:50.80 spid4 ResType:LockOwner Stype:'OR' Mode: X SPID:60 ECID:0 Ec:(0x1F1BB5B0) Value:0x193 2003-05-05 15:11:50.80 spid4 Requested By: 2003-05-05 15:11:50.80 spid4 Input Buf: RPC Event: sp_cursoropen;1 2003-05-05 15:11:50.80 spid4 SPID: 55 ECID: 0 Statement Type: EXECUTE Line #: 1 2003-05-05 15:11:50.80 spid4 Owner:0x1937f2a0 Mode: S Flg:0x0 Ref:1 Life:00000000 SPID:55 ECID:0 2003-05-05 15:11:50.80 spid4 Grant List 0:: 2003-05-05 15:11:50.80 spid4 KEY: 8:1653632984:2 (da00ce043a9e) CleanCnt:1 Mode: U Fl ags: 0x0 2003-05-05 15:11:50.80 spid4 Node:2 2003-05-05 15:11:50.80 spid4 ResType:LockOwner Stype:'OR' Mode: S SPID:55 ECID:0 Ec:(0x33AE1538) Value:0x193 2003-05-05 15:11:50.80 spid4 Requested By: 2003-05-05 15:11:50.80 spid4 Input Buf: Language Event: Update tblQueuedEvents Set NotifyID = 2, ResynchDate 2003-05-05 15:11:50.80 spid4 SPID: 60 ECID: 0 Statement Type: UPDATE Line #: 1 2003-05-05 15:11:50.80 spid4 Owner:0x1936e420 Mode: X Flg:0x0 Ref:0 Life:02000000 SPID:60 ECID:0 2003-05-05 15:11:50.80 spid4 Grant List 0:: 2003-05-05 15:11:50.80 spid4 KEY: 8:1653632984:1 (2d018af70d80) CleanCnt:1 Mode: X Flags: 0x0
In the "Wait-for-graph" entry, you have Node 1 and Node 2. In each node, you have a grant section and a request section. The grant section is the "Grant List", and the request section is the "Request By."
In each node, you can identify the following:
For example, in Node 1, the Grant List, SPID 55 had been granted an update lock, Mode: U, on resource KEY: 8:1653632984:2. 8=DBID, 1653632984=ObjectID, and 2=Indid. To obtain the database identification number, run the sp_helpdb stored procedure. To obtain the table, run the following code:
To obtain the index, run the following code:
If IndexId is equal to 2, you know the index is a nonclustered index. The command that SPID 55 was executing was the sp_cursoropen stored procedure.
In Node 2, the Grant List, SPID 60 has been granted an exclusive lock, Mode: X, on resource KEY: 8:1653632984:1. 8=DBID, 1653632984=ObjectID, 1=Indid. This is on the same table but index 1 is the clustered index. The command that SPID 60 was executing was:
An IndexId that is equal to 1 is a clustered index.
An IndexId that is equal to 2 is a nonclustered index.
Note Deadlocks are very time sensitive.
Next, in Node 1, Request By, SPID 55 requested a shared lock, Mode: S, on IndexId=1. In Node 2, Request By, SPID 60 requested an exclusive lock, Mode: X, on IndexId=2. Because these lock requests occur at the same time, the deadlock occurs. Each SPID’s granted locks are preventing the requested locks from continuing.
The following table shows the lock compatibility chart. For more information about lock compatibility, see the "Lock Compatibility" topic in SQL Server 2000 Books Online.
Lock compatibility chart
Collapse this tableExpand this table
Next, by looking at the output, you identify ObjectId 1653632984 as the tblQueuedEvents table, and you obtain a sp_help stored procedure output for the table. There were two indexes on the table. The two indexes were ix_tblQueuedEvents and PK_tblQueuedEvent. ix_tblQueuedEvents is a clustered index on ResynchDate, and PK_tblQueuedEvent is a primary key, unique nonclustered index on EventSID.
The SQL Profiler trace was not able to capture the deadlock occurrence. Remember, deadlocks are very time dependent. The overhead of SQL Profiler probably added some time to the execution of one of the processes and that prevented SQL Profiler from getting in a deadlock situation. However, it did provide information that you can use to troubleshoot the issue. You found the full update tblQueuedEvents statement to be similar to the following:
You also found the execution plan. You still do not have the full sp_cursoropen stored procedure statement, but you do have enough information to recommend a solution that will resolve the deadlock.
Here is the execution plan.
Note This particular execution plan is read right to left and bottom to top.
StmtText -------------------------------------------------------------------------------------------------------------------------------------------------------------------- Update tblQueuedEvents Set NotifyID = 2, ResynchDate = '5/7/2003 10:44:16' where eventSID = 73023 |--Clustered Index Update(OBJECT:([SOTS].[dbo].[tblQueuedEvents].[ix_tblQueuedEvents ]), SET:([tblQueuedEvents].[NotifyID]=[@1], [tblQueuedEvents].[ResynchDate]=[Expr1004])) |--Top(1) |--Compute Scalar(DEFINE:([Expr1004]=Convert([@2]))) |--Index Seek(OBJECT:([SOTS].[dbo].[tblQueuedEvents].[PK_tblQueuedEvents]), SEEK:([tblQueuedEvents].[EventSID]=[@3])
Note that the UPDATE statement is performing a "clustered index update" on the clustered index. Therefore, the nonclustered index and the clustered index must both be updated. The clustered index is ix_tblQueuedEvents and the nonclustered index is PK_tblQueuedEvents. To perform the updates, the UPDATE statement must obtain exclusive locks on both indexes. These two indexes are the indexes that are involved in the deadlock. From reviewing SQL Profiler traces, you did not see any queries that used the ResynchDate in the WHERE clause. All the statements were very specific, and they used the EventSID in the WHERE clause. A better choice of a clustered index would be EventSID. With this information and a discussion with the customer, we found that the ResynchDate index was old, and it was not necessary. We recommended that the customer drop the ix_tblQueuedEvents index on ResynchDate, and that they make PK_tblQueuedEvent a clustered index. This resolved the deadlock situation.
This is only one example of a deadlock case that involves locks. Deadlocks can also involve parallelism and involve threads. They can involve one, two, three, or more SPIDs, and resources. With any deadlock case, you must obtain the –T1204 startup parameter output, and the SQL Profiler trace to identify, to troubleshoot, and to resolve the deadlock. Your deadlock situation will involve different processes and resources. Therefore, solutions will vary from case to case. Typical methods you can use to resolve deadlocks include:
For more information about deadlocks, visit the following Microsoft Web sites: