Deadlock detection – unindexed foreign keys

Recently I have had a shortage of things to blog about. Nevertheless I have resisted the urge to join the cacophony by just announcing that 11g is in fact available for download on Linux.

Sure enough, eventually an opportunity has presented itself. Well, two opportunities but more of the other elsewhere later, but in this instance – deadlocks, specifically regular and numerous deadlocks caused by regular and numerous jobs, jobs which should have been non-conflicting.

However, dear reader, I shall spoil the surprise by revealing that these deadlocks were caused by unindexed foreign keys – more detail further down. It’s an old problem that’s been around for a long time, but it still happens (although, granted, more often than not in the databases that I see these days, this tends not to be a problem because people haven’t bothered with any sort of constraint in their database, but that’s a rant for another day).

I’m reluctant to admit it for fear of being shot down by the BAAG police 😉 , but at the time of my proclaimation that unindexed foreign keys was the issue, it was a hunch, a gut feeling backed by a decade of experience and the briefest of brief glances at the resulting trace file from the deadlock (is it just me or is there a blurred line between a guess and a hunch / gutfeeling / shortcut that experience suggests?) . Fortunately, a longer look at the trace file backed me up and the subsequent indexing of the columns in the foreign keys was a massive success.

From an end user perspective (including both developers connecting to the database via whatever tool and also application users), the symptoms initially were of an unresponsive development database, seemingly hanging every couple of minutes for a couple of minutes.

Immediately, to the alert log, where there were deadlock entries and in the background dump dest, where there were trace files from the jobs with more detailed information on the deadlocks detected.

Deadlock detection takes some time and so whilst waiting for these to be detected, the sessions on both sides of the deadlock were hogging the limelight and in the meantime bringing the database to its knees for everyone else.

In the trace files, the regular entries were all very similar, such as this:


*** SESSION ID:(37.18266) 2007-08-09 03:22:47.678
DEADLOCK DETECTED
Current SQL statement for this session:
UPDATE TABLE_A RC SET COLUMN_A = :B2 WHERE COLUMN_B = :B1
----- PL/SQL Call Stack -----
object line object
handle number name
000006FBE7909428 375 package body PACKAGE_A
000006FBE7909428 1443 package body PACKAGE A
000006FBF4399740 1 anonymous block
The following deadlock is not an ORACLE error. It is a
deadlock due to user error in the design of an application
or from issuing incorrect ad-hoc SQL. The following
information may aid in determining the deadlock:
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TM-00024f42-00000000 17 37 SX SSX 24 22 SX SSX
TM-00024f42-00000000 24 22 SX SSX 17 37 SX SSX
session 37: DID 0001-0011-00023C06 session 22: DID 0001-0018-000032FA
session 22: DID 0001-0018-000032FA session 37: DID 0001-0011-00023C06

The key information was firstly the statement being issued. This was always the same – there was just this one statement causing all the problem:


UPDATE TABLE_A RC SET COLUMN_A = :B2 WHERE COLUMN_B = :B1

Secondly, the sessions that were deadlocking were using different values for :B1 – this was visible from the additional bespoke logging as part of this PACKAGE_A code module.

Next, the deadlocks caused by this one statement were on a variety of objects – all child tables of TABLE_A or primary key indexes on child tables of TABLE_A.

In this particular trace file:


---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TM-00024f42-00000000 17 37 SX SSX 24 22 SX SSX
TM-00024f42-00000000 24 22 SX SSX 17 37 SX SSX

the two sessions are trying to escalate a held TM lock from SX to SSX. A TM lock, a DML Enqueue, makes sure that the structure of a table is not altered whilst, in this example, you are updating a row in a table.
From the concepts guide:

A row exclusive table lock (also called a subexclusive table lock, SX) generally indicates that the transaction holding the lock has made one or more updates to rows in the table.
...
A share row exclusive table lock (also sometimes called a share-subexclusive table lock, SSX) is more restrictive than a share table lock... Only one transaction at a time can acquire a share row exclusive table lock on a given table. A share row exclusive table lock held by a transaction allows other transactions to query or lock specific rows using SELECT with the FOR UPDATE clause, but not to update the table... A share row exclusive table lock held by a transaction prevents other transactions from obtaining row exclusive table locks and modifying the same table.

What object is this TM lock being held on?
Note the lock name – TM-00024f42-00000000 – that 00024f42 is a hex number, 151362 is its decimal converstion. Look up object_id 151362 in dba_objects and you’ll find the details of the object. In the continued interest of obscuring client code and object names, in this example it was TABLE_B, a child of TABLE_A with a foreign key enforcing the constraint but no index.

So, to the solution – to index all the foreign keys between these handful of tables. A bit of a sledgehammer to crack a nut approach, but under the circumstances and the local urgency to get this done and out, an advisable one with the caveat that there might be some impact to blindly adding indexes left, right and centre.

In terms of SQL to identify unindexed foreign keys, I was lazy and did a search on google, albeit with this caveat in mind. Interestingly I found a script on experts-exchange that I wrote to do exactly the same thing nearly a decade ago. But it was a rubbish script, and I used the one linked to from Tom Kyte’s page.

Just as a conclusion on this topic, note some points that Tom Kyte makes on how things change, the subject matter of which is exactly this problem. Also, I managed to deadlock myself in a single session last week by using an autonomous transaction to update a table, the child of which had been modified by the outer transaction – not clever.

9 Responses to Deadlock detection – unindexed foreign keys

  1. Gary says:

    “Is it just me or is there a blurred line between a guess and a hunch / gutfeeling / shortcut that experience suggests?”
    I think the difference between a “guess” and a “hypothesis” is that you make an attempt to validate the hypothesis before proceeding, like you did in check the trace files.
    My feeling on BAAG is that they are against making changes based on guesses, rather than using them as a starting point for investigation.

  2. Jason says:

    Also on the “hunch/gutfeeling” topic, I’ve been reading lately a good book that discusses research on the process of decision-making that gives some credence to the “skilled guess” way of doing things. “Sources of Power: How People Make Decisions”, Gary A. Klein.

  3. Pingback: OraStory Activity « OraStory

  4. Pingback: Three months since my last confession… « OraStory

  5. DeadlockLearner says:

    When you update any row in Child table, does it create a lock on parent table index row?

  6. Pingback: Confluence: Datenbank

  7. Pingback: Confluence: Datenbankadministration

  8. Pingback: Confluence: Datenbankadministration

Leave a reply to Dom Brooks Cancel reply