WFG – mode 5 TM deadlock

Just a brief note to preserve some observations on a deadlock situation reported a while back on the OTN forums.

The symptoms of the problem were deadlocks on a RAC system and this example partial extract of a wait-for-graph from an LMD trace file:

Global Wait-For-Graph(WFG) at ddTS[0.5] :
BLOCKED 0x3d2438198 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[41000-0001-000004A5] inst 1 
BLOCKER 0x3d7846108 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[38000-0002-00002C68] inst 2 
BLOCKED 0x3d7846108 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[38000-0002-00002C68] inst 2 
BLOCKER 0x3d75a4fd8 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[43000-0003-000009C1] inst 3 
BLOCKED 0x3d75a4fd8 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[43000-0003-000009C1] inst 3 
BLOCKER 0x3d2438198 4 wq 2 cvtops x1 TM 0x1df87.0x0(ext 0x0,0x0)[41000-0001-000004A5] inst 1 

The details of a WFG are slightly different compared to a non-RAC/LMD deadlock graph, not least how to interpret lock mode.

Column 8 indicates a TM lock on the object in column 9 – 0x1df87 (hex that can be converted to dec and used in dba_objects)

The lock mode in column 3 is slightly misleading in that “4” does not mean mode 4 but mode 5, thanks to the following resources:

and this:

#define KJUSERNL 0         /* no permissions */   (Null)
#define KJUSERCR 1         /* concurrent read */ (Row-S (SS))
#define KJUSERCW 2         /* concurrent write */ (Row-X (SX))
#define KJUSERPR 3         /* protected read */    (Share)
#define KJUSERPW 4         /* protected write */ (S/Row-X (SSX))
#define KJUSEREX 5         /* exclusive access */ (Exclusive)

“wq 2″ is apparently the convert queue, “wq 1″ being the grant queue.

So, in summary, deadlock issues regarding mode 5 TM locks.

In this particular case, the usual culprit was an unindexed foreign key, albeit a less common variation with a self-referencing foreign key in conjunction with updates to the primary key column – a questionable habit which in my mind then raises questions about the design.

E.g.

Session 1>create table t1
  2  (col1         number       not null
  3  ,col2         varchar2(10) not null
  4  ,related_col1 number
  5  ,constraint pk_t1 primary key (col1)
  6  ,constraint fk_t1 foreign key (related_col1) references t1 (col1));

Table created.

Session 1>insert into t1 values (1,'X',null);

1 row created.

Session 1>insert into t1 values (2,'X',null);

1 row created.

Session 1>commit;

Commit complete.

Session 1>

If we try to update the pk then session 2 blocks:

Session 1:

Session 1>update t1 set col1 = 1  where col1 = 1;

1 row updated.

Session 1>

Session 2:

Session 2>update t1 set col1 = 2 where col1 = 2;
... hangs...

Blocking on a mode 5 TM:

Session 1>select process,
  2         l.sid,
  3         type,
  4         lmode,
  5         request,
  6         do.object_name
  7  from   v$locked_object lo,
  8         dba_objects do,
  9         v$lock l
 10  where  lo.object_id   = do.object_id
 11  AND    l.sid          = lo.session_id
 12  AND    do.object_name IN ('T1');

PROCESS           SID TY      LMODE    REQUEST OBJECT_NAME
---------- ---------- -- ---------- ---------- -----------
6052:1248        1010 AE          4          0 T1
6052:1248        1010 TM          0          5 T1
496:4000         1640 AE          4          0 T1
496:4000         1640 TM          3          0 T1
496:4000         1640 TX          6          0 T1

Session 1>

It’s not hard to go and engineer a deadlock scenario but as it would be so artificial, I won’t bother.

And as expected, an index addresses the issue.

Session 1:

Session 1>rollback;

Rollback complete.

Session 1>create index i_t1 on t1 (related_col1);

Index created.

Session 1>update t1 set col1 = 1  where col1 = 1;

1 row updated.

Session 1>

Session 2:

Session 2>rollback;

Rollback complete.

Session 2>update t1 set col1 = 2 where col1 = 2;

1 row updated.

Session 2>

Red Tape

Modern IT departments…..

IO is slow on a new server.

I talk to DBA.

DBA talks to SysAdmin.

I get told to contact Storage team.

Storage team says to talk to Platform team and get Platform team to raise to Storage if there’s an issue.

Platform team is represented by SysAdmin already involved.

Sod it.

Follow

Get every new post delivered to your Inbox.

Join 75 other followers