Which of my sql statements are using dynamic sampling?

From a reply I gave on an OTN forum thread, how to see all queries currently in memory which use dynamic sampling?

Dynamic sampling is an attribute in V$SQL_PLAN.OTHER_XML.

In 11gR1, it says “yes”.
In 11gR2, it gives the level.

Using EXTRACTVALUE which is deprecated in 11.2:

select p.sql_id,  extractvalue(h.column_value,'/info') lvl
from   v$sql_plan p
,      table(xmlsequence(extract(xmltype(p.other_xml),'/other_xml/info'))) h
where  p.other_xml is not null
and    extractvalue(h.column_value,'/info/@type') = 'dynamic_sampling';

Another approach using one of the preferred alternatives – XMLTABLE:

select p.sql_id, t.val
from   v$sql_plan p
,      xmltable('for $i in /other_xml/info
                 where $i/@type eq "dynamic_sampling"
                 return $i'
                passing xmltype(p.other_xml)
                columns attr varchar2(50) path '@type',
                        val  varchar2(50) path '/') t
where  p.other_xml is not null;

Warning: I tried a number of approaches with XMLTABLE and whilst they all worked in 11.2.0.3, in 11.1.0.7 they all ran into the error below. It might well just be a local issue but I did not investigate:

ORA-00600: internal error code, arguments: [qctVCO:csform], [0], [0], [0], [0], [112], [2], [224], [], [], [], [] 

Fun with Distributed Transactions II

Short on examples, long on words…

Last week I did a quick post about a couple of the more obscure implications of using distributed transactions, in particular:

  • The current impact on subquery materialisations
  • The incompatibility with MVs
  • The unusual situation of transactions with locks but no sessions

Anyway, earlier today I was having a closer look at some “DFS lock handle” wait events in an 11gR1 RAC database (not because there was an obvious problem but because I saw a few of them and wanted to have a closer look).

On a DFS lock handle wait, you have to decode P1 to find out what it’s all about as discussed by Riyaj Shamsudeen.
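
Something along these lines usually does the decode (a sketch, not Riyaj’s script – the bitand arithmetic is the commonly published one and worth verifying on your version):

-- decode lock type and requested mode from P1 for current 'DFS lock handle' waits
select sid
,      chr(bitand(p1, -16777216)/16777215) ||
       chr(bitand(p1, 16711680)/65535)      lock_type
,      bitand(p1, 65535)                    lock_mode
from   v$session
where  event = 'DFS lock handle';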

In my case, these were mostly DX and BB enqueues which are related to my old friend the distributed transaction.

Why am I still banging on about distributed transactions?

Well, you might think that they’re not very common but in the JDBC world they seem to be everywhere.

XA transactions are used (overused) there, and so they might well be very relevant to any database with a JDBC application sitting on top of it.

A common pattern seen is to take some message off a queue, do something related to that message in the database, and use an XA transaction so that both stand or fail together.

In my previous post, I mentioned an AskTom thread discussing Materialized Views and distributed transactions.

I added some thoughts to that thread and Tom’s observation was

that the java/jdbc world for some reasons wishes to use an external resource manager so they have to do two phase commits against a single database isn’t what I was talking about. when I’ve seen the XA stuff – there is typically *one* database involved and it makes everything really complex,
hard to understand and slower than it needs to be.

Which might well be a fair point but still doesn’t change the fact that if your insert/update/delete comes in on an XA, then being unable to use a fast refresh on commit MV is really quite a restriction.

Anyway, back to my observations on these enqueues.

There’s not much information available about the BB enqueue but I believe that it’s related to the coordination of global transactions on a RAC cluster.

From 11gR1, distributed transactions can be processed on any instance in the cluster and these BB enqueues seem to be part of that picture. Prior to this change, all branches of a distributed transaction had to execute on the same node.

Whilst we’re on the subject, inevitably there are some bugs around these, in particular the BB enqueue and GTX processes (processes introduced to manage these enhanced cross cluster distributed transaction features).
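
As a quick sanity check (a sketch, assuming the 11g parameter name and the GTX background process naming), you can see how this cross-instance support is configured and whether the GTX processes are running:

-- how many GTX processes are configured
select name, value
from   v$parameter
where  name = 'global_txn_processes';

-- and whether the GTX background processes are up on each instance
select inst_id, pname
from   gv$process
where  pname like 'GTX%';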

The DX enqueue is perhaps more interesting.

It looks after “tightly coupled distributed transactions”.

So, WTF are tightly coupled distributed transactions and how do they differ from loosely coupled distributed transactions?

Tightly coupled transactions enable other branches of the transaction – and from our database perspective we’re only talking about multiple sessions in the same database which are part of the same distributed transaction – to:

  • share each others locks and
  • see each others changes.

The DX enqueue helps manage this.

As part of this, what the DX enqueue does is make sure that only one transaction branch is actively executing SQL at any one time.

So that’s potentially a pretty hefty point of serialisation then.
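
If you want to see whether that serialisation is actually biting, one sketch (assuming the wait event is named 'enq: DX - contention' on your version) is to scan ASH for it:

-- which statements have recently waited on the DX enqueue
select sql_id, count(*) samples
from   v$active_session_history
where  event = 'enq: DX - contention'
group  by sql_id
order  by samples desc;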

For more information, the Oracle whitepaper “XA and Oracle controlled Distributed Transactions” linked to below is a really good resource particularly the section “Distributed Transactions and Database Locking”.

However… just to emphasise the point …

In a tightly coupled transaction, the DX enqueue is obtained before executing any statement.

By contrast, loosely coupled transactions do not need to get this DX lock before executing a statement, i.e. there is no serialisation between different transaction branches.

As the developer’s guide below says “loosely coupled transaction branches result in greater concurrency.”

So, not only is there an overhead to XA transactions but there is an additional overhead to tightly coupled transactions.

And how many applications really use multiple transaction branches in a single database?

Very, very few, I wager!

And if they do, how many of those applications appreciate the serialisation involved anyway?

Now depending on your version, you might not see the DX and BB enqueues in your enqueue statistics.

In 11.1.0.7 I don’t seem to see anything in V$ENQUEUE_STAT for DX or BB.

There are however some relevant session/system statistic buckets:

  • DX/BB enqueue lock foreground requests
  • DX/BB enqueue lock foreground wait time
  • DX/BB enqueue lock background gets
  • DX/BB enqueue lock background get time
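
So a quick way to see whether a system is doing any of this at all is to look at those counters directly, e.g. (a sketch – the LIKE filter assumes the statistic names are exactly as listed above):

select name, value
from   v$sysstat
where  name like 'DX/BB enqueue lock%'
order  by name;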

For more information see:

RAC – It’s all about the node affinity

Or what happens when you don’t have node affinity.

Here’s an old, short post that was lying around unpublished.

SQL executed on the 14th on node 2, on 15th on node 1.

DT      SQL ID         Node       Rows  Execs  Elapsed  Cluster Wait Time
15-NOV  7287jpw5vtm8j     1    4129697   7591    32625              32175
14-NOV  7287jpw5vtm8j     2   19703872  30909       78                 60

Timings in seconds.

32625 seconds is just over 9 hours, of which 32175 is cluster wait time.
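
For what it’s worth, figures like these can be pulled from AWR along these lines (a sketch only – the column choices and the conversion from microseconds to seconds are my assumptions):

select trunc(sn.end_interval_time)                 dt
,      st.sql_id
,      st.instance_number                          node
,      sum(st.rows_processed_delta)                num_rows
,      sum(st.executions_delta)                    execs
,      round(sum(st.elapsed_time_delta)/1000000)   elapsed_secs
,      round(sum(st.clwait_delta)/1000000)         cluster_wait_secs
from   dba_hist_sqlstat  st
,      dba_hist_snapshot sn
where  sn.snap_id         = st.snap_id
and    sn.dbid            = st.dbid
and    sn.instance_number = st.instance_number
and    st.sql_id          = '7287jpw5vtm8j'
group  by trunc(sn.end_interval_time), st.sql_id, st.instance_number
order  by dt desc, node;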

Database services and application datasource config have now been reconfigured to the extent that there is node affinity and the data required for the workload on node 1 is completely segregated from the workload on node 2.

If you use RAC as a black box and you have problems with performance, or with the predictability of performance, then look into whether node affinity, or a lack thereof, is your problem.

Fun with distributed transactions

Three items within:
Part 1 – Subquery materialisation not possible (fixed in 12c)
Part 2 – Materialised views
Part 3 – Lock scripts

Part 1 – Subquery materialisation not possible
Yesterday I was reminded of a familiar issue, officially listed as a bug, that can occur when combining subquery materialisation with distributed transactions.

I had previously tuned a query for a colleague, the best approach for which was implicit materialised subquery factoring (implicit because it wasn’t hinted with /*+ materialize */ but the factored subquery was referenced multiple times in the main query), and I had provided such a solution without even a thought to the performance threat of the query running in an XA transaction.

As a result, unsurprisingly with the benefit of hindsight, the query was not performing as expected from the application in said XA transaction.

Let me provide a refresher on the issue.

create table t1
as
select rownum col1
from   dual
connect by rownum <= 100000;

alter session set statistics_level = all;

with x as  
(select count(*) col1  
 from   t1)  
select x1.*, x2.*
from x x1, x x2;  

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

provides

PLAN_TABLE_OUTPUT 
---------------------------------------------------------------------------------------------
SQL_ID  aqsc5m0wtj7wn, child number 0
-------------------------------------                              
with x as   (select count(*) col1    from   t1)   select x1.*, x2.*
from x x1, x x2

Plan hash value: 1115620710 

---------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                        | Starts | E-Rows | A-Rows |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                             |      1 |        |      1 |
|   1 |  TEMP TABLE TRANSFORMATION |                             |      1 |        |      1 |
|   2 |   LOAD AS SELECT           |                             |      1 |        |      1 |
|   3 |    SORT AGGREGATE          |                             |      1 |      1 |      1 |
|   4 |     TABLE ACCESS FULL      | T1                          |      1 |    122K|    100K|
|   5 |   MERGE JOIN CARTESIAN     |                             |      1 |      1 |      1 |
|   6 |    VIEW                    |                             |      1 |      1 |      1 |
|   7 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6717_BCA95716 |      1 |      1 |      1 |
|   8 |    BUFFER SORT             |                             |      1 |      1 |      1 |
|   9 |     VIEW                   |                             |      1 |      1 |      1 |
|  10 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6717_BCA95716 |      1 |      1 |      1 |
---------------------------------------------------------------------------------------------

Note
-----  
   - dynamic sampling used for this statement
                  

 27 rows selected 

As you might expect, subquery materialisation using a temp table.

But if you execute the query within a distributed transaction, it will probably not use a temporary table (see original post above about why I say probably).

I’m using a simple insert script over a db link to start the distributed transaction but in my real world occurrence it was the JDBC XA driver.

drop table t2;

create table t2
(col1 number);

alter system flush shared_pool;

select count(*) from gv$sql where sql_text like 'with x%';

alter session set statistics_level = all;

insert into t2@mydba values(1);

with x as  
(select count(*) col1  
 from   t1)  
select x1.*, x2.*
from x x1, x x2;  

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

rollback;

Which gives:

PLAN_TABLE_OUTPUT  
---------------------------------------------------------------------------------------
SQL_ID  aqsc5m0wtj7wn, child number 0
-------------------------------------
with x as   (select count(*) col1    from   t1)   select x1.*, x2.*
from x x1, x x2

Plan hash value: 3433669518

---------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |      1 |00:00:00.08 |     314 |
|   1 |  NESTED LOOPS        |      |      1 |      1 |      1 |00:00:00.08 |     314 |
|   2 |   VIEW               |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   3 |    SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   4 |     TABLE ACCESS FULL| T1   |      1 |    122K|    100K|00:00:00.01 |     157 |
|   5 |   VIEW               |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   6 |    SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   7 |     TABLE ACCESS FULL| T1   |      1 |    122K|    100K|00:00:00.01 |     157 |
--------------------------------------------------------------------------------------- 

Note
-----
   - dynamic sampling used for this statement


 24 rows selected 

As mentioned, the distributed transaction bypasses materialisation which could have a very negative effect on the performance of a particular query where that materialisation was key.

Part 2 – Materialized views

This trouble with distributed transactions reminds me of a particular bugbear with materialised views.

This restriction is listed here in the documentation

http://docs.oracle.com/cd/E11882_01/server.112/e10706/repmview.htm

I’m surprised that a bigger deal is not made of it, given the proliferation of XA drivers and distributed transactions in many enterprise applications.

This is demonstrated quite easily, again using a db link to do the distributed transaction.

drop table t2;

create table t2
(col1 number
,constraint pk_t2 primary key(col1));

create materialized view log on t2;

create materialized view mv2
build immediate
refresh fast on commit
as
select * from t2;

insert into t2@mydba values (1);

commit;

Which, on commit, gives:

SQL Error: ORA-02050: transaction 26.8.810414 rolled back, some remote DBs may be in-doubt
ORA-02050: transaction 26.8.810414 rolled back, some remote DBs may be in-doubt
ORA-02051: another session in same transaction failed
ORA-02063: preceding 2 lines from MYDBA
02050. 00000 -  "transaction %s rolled back, some remote DBs may be in-doubt"
*Cause:    network or remote failure in 2PC.
*Action:   Notify operations; remote DBs will automatically re-sync when the
           failure is repaired.

I keep meaning to double check that the same applies when using XA driver and Java but there’s no reason to think it wouldn’t be the same.

Tom Kyte has mentioned this several times on AskTom, “recently” here.

I’m not quite sure I get his comment about OLTP databases and distributed transactions but, not for the first time, I’ve found myself wanting to use MVs to mitigate certain performance problems and have run into this restriction.

Part 3 – Locking scripts
There are a fair number of scripts out there on t’internet for showing / diagnosing locking.

During my special distributed transaction day yesterday, I was reminded that anything lock related that starts with V$SESSION or depends on / makes assumptions about SID/SESSION_ID can be misleading.

I initially thought something a little unusual was up because neither V$SESSION nor ASH could tell me who was blocking a bunch of requested mode 6 TX locks. But I thought some of the blocking diagnosis was missing because of cross-node RAC complications.

But it actually was because there were transactions holding locks but no sessions.

Admittedly, this was a slightly strange situation, but essentially what happened was that an XA JDBC transaction that was also mixing with ActiveMQ ended up causing what I suppose would essentially be orphaned transactions, and locks held by transactions that had no associated sessions.

This may be an “exotic” situation but V$LOCKED_OBJECT, for example, exposes SESSION_ID but no SESSION_SERIAL# so itself can be very misleading if you go off joining willy nilly by SESSION_ID because SIDs get reused quickly.

Similarly, DBA_BLOCKERS and DBA_WAITERS both showed zilch.

A more reliable starting point was to ignore V$LOCKED_OBJECT.SESSION_ID and pay more attention to the transaction information, joining to V$TRANSACTION and then outer joining to V$SESSION, like so:

select lo.inst_id
,      lo.object_id
,      lo.session_id
,      lo.os_user_name
,      lo.process
,      lo.locked_mode
,      ob.owner
,      ob.object_name
,      tx.addr
,      tx.start_time txn_start_time
,      tx.status
,      s.*
from   gv$locked_object lo
,      dba_objects      ob
,      gv$transaction    tx
,      gv$session        s
where  ob.object_id = lo.object_id
and    tx.xidusn    (+) = lo.xidusn
and    tx.xidslot   (+) = lo.xidslot
and    tx.xidsqn    (+) = lo.xidsqn
and    s.taddr      (+) = tx.addr; 

Which showed me, as mentioned, a whole bunch of locks in these orphaned, PREPARED distributed transactions dating back a few hours.
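
As an aside, in-doubt or prepared distributed transactions may also show up in the data dictionary (whether they do depends on how they got stuck), so another place worth a quick look is something like:

select local_tran_id, global_tran_id, state, fail_time
from   dba_2pc_pending
order  by fail_time;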

CPU_time much higher than elapsed_time

Of course, one of the first places you should go is Oracle Support, but if you, like me, put off doing that and scratch your head for a while wondering why ELAPSED_TIME in V$SQL might be significantly lower than CPU_TIME when there are no obvious other explanations, then note these two bugs in 11.1.0.7:

BUG 7168259 – Millisecond elapsed_time shows as 0 : Elapsed_time for SQL statements which execute in under one millisecond can show up as 0

BUG 7561762 – Elapsed_time is under-reported : Elapsed_time can be 0 or significantly less than V$SQL.CPU_TIME if there is significant idle wait class time
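
If you suspect you’re hitting these, a quick sanity check (a sketch) is to look for cursors where the reported CPU exceeds the reported elapsed time:

-- times in v$sql are in microseconds
select sql_id
,      executions
,      round(elapsed_time/1000000,2) elapsed_secs
,      round(cpu_time/1000000,2)     cpu_secs
from   v$sql
where  cpu_time > elapsed_time
order  by cpu_time - elapsed_time desc;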

Modelling a “simple” ITL problem

Today I have been looking at an application problem.

Last night, important client processing missed its SLA, with the application reporting ORA-02049: timeout: distributed transaction waiting for lock.

This post isn’t about the distributed lock timing out.

But consider the evidence that further investigation into these distributed locks threw up.

  1. The parameter distributed_lock_timeout is set to the default of 60 (seconds) – this means that any distributed transaction that waits for an enqueue for 60 seconds will be picked off by a sniper, similar to the standard deadlock detection after 3 seconds.
  2. The 1-in-10 second ASH samples in DBA_HIST_ACTIVE_SESS_HISTORY show clear waits on “enq: TX – allocate ITL entry”
  3. These same ASH samples show one particular sql id as being involved in the ITL waits.
  4. This particular sql id is a DELETE on a single table, no triggers, no on delete cascade referential integrity.

Sounds easy right?

But…

  1. V$SEGMENT_STATISTICS and DBA_HIST_SEG_STAT show 0 ITL waits for the segment/s related to the DELETE sql id above.
  2. In the ASH data, for the event “enq: TX – allocate ITL entry”, the CURRENT_OBJ# column indicates a variety of objects that might show ITL waits under concurrent DML but, again, all unrelated to the sql id identified by ASH.

So…

Time for some theories:

  1. ASH could be misleading us about the sql ids and/or
  2. ASH could be misleading about current_obj# and/or
  3. If our enqueue times out because of the distributed lock timeout, then perhaps the ITL wait is not recorded – i.e. segment stats are misleading.

Theories about ASH being misleading under certain circumstances are incredibly convoluted to test but, as far as I am concerned, even up to the latest 11gR2 this is a known known. There are certain values that are not cleared out or not recorded properly or disappear under certain conditions.

But perhaps we can test theory 3 relatively easily?

I like recycling – I’m going to reuse Randolf Geist’s ITL test case here.

SQL> get itl_case.sql
  1  create table t1 (c1 varchar2(1336)) pctfree 10;
  2  insert into t1 values ('a');
  3  insert into t1 values ('b');
  4  insert into t1 values ('c');
  5  insert into t1 values ('d');
  6  insert into t1 values ('e');
  7  insert into t1 values ('f');
  8  commit;
  9  update t1 set c1 = rpad(c1, 1335, c1) where length(c1) = 1;
 10* commit;
 11  .
SQL> @itl_case

Table created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
Commit complete.
6 rows updated.
Commit complete.

SQL>

Now I’m going to delete some rows across three sessions.
The first two sessions do simple uncommitted deletes. The third will do a select over a db link in order to start a distributed transaction, followed by the same simple delete.

First session – delete some rows and leave uncommitted so the ITL cannot be cleared:

SQL_SESS1>select distinct sid from v$mystat;

       SID
----------
      2056

SQL_SESS1>delete from t1 where c1 like 'a%';

1 row deleted.

SQL_SESS1>

Second session – delete some rows and leave uncommitted so the ITL cannot be cleared:

SQL_SESS2>select distinct sid from v$mystat;

       SID
----------
      2126

1 row selected.

SQL_SESS2>select to_char(sysdate,'DD-MON-YYYY HH24:MI') from dual;

TO_CHAR(SYSDATE,'
-----------------
01-NOV-2012 23:32

1 row selected.

SQL_SESS2>delete from t1 where c1 like 'b%';

1 row deleted.

SQL_SESS2>

Finally – third session – try to delete some rows within a distributed transaction:

SQL_SESS3>select distinct sid from v$mystat;

       SID
----------
      1884

1 row selected.

SQL_SESS3>select 1 from dual@mydba;

         1
----------
         1

1 row selected.

SQL_SESS3>delete from t1 where c1 like 'c%';
delete from t1 where c1 like 'c%'
*
ERROR at line 1:
ORA-02049: timeout: distributed transaction waiting for lock


SQL_SESS3>

Right, so what does ASH tell us about what happened?

SQL_SESS2>l
  1  select to_char(sample_time,'DD-MON-YYYY HH24:MI:SS'), sql_id, event, blocking_session, current_obj#
  2  from   v$active_Session_history
  3  where  event = 'enq: TX - allocate ITL entry'
  4  and    sample_time >= to_date('01-NOV-2012 23:32','DD-MON-YYYY HH24:MI')
  5* order  by sample_time desc
SQL_SESS2>/

TO_CHAR(SAMPLE_TIME, SQL_ID        EVENT                          BLOCKING_SESSION CURRENT_OBJ#
-------------------- ------------- ------------------------------ ---------------- ------------
01-NOV-2012 23:35:14 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:13 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:12 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:11 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:10 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:09 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:08 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:07 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:06 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:05 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:04 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:03 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:02 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:01 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:00 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:59 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:58 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:57 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:56 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:55 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:54 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:53 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:52 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:51 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:50 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:49 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:48 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:47 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:46 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:45 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:44 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:43 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:42 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:41 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:40 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:39 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:38 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:37 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:36 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:35 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:34 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:33 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:32 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:31 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:30 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:29 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:28 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:27 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:26 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:25 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:24 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:23 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:22 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:21 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:20 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:19 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:18 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:17 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:16 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:15 430xqxrrucgc5 enq: TX - allocate ITL entry               2056           -1
01-NOV-2012 23:34:14 430xqxrrucgc5 enq: TX - allocate ITL entry               2056           -1
01-NOV-2012 23:34:13 430xqxrrucgc5 enq: TX - allocate ITL entry               2056           -1
01-NOV-2012 23:34:12 430xqxrrucgc5 enq: TX - allocate ITL entry               2056           -1
01-NOV-2012 23:34:11 430xqxrrucgc5 enq: TX - allocate ITL entry               2056           -1

71 rows selected.

Elapsed: 00:00:00.04
SQL_SESS2>

So, we are reporting wait event “enq: TX – allocate ITL entry” which was the aim.

And look at the current_obj# column – it seems that it is perhaps not set properly for this wait event (even though in my real-world example the values were not -1).

Sql id 430xqxrrucgc5 is my delete statement so that is bang on the money.

Finally, what do segment statistics say about ITL waits for this object t1?

SQL_SESS3>l
  1  select statistic_name, value
  2  from   v$segment_statistics
  3  where  object_name = 'T1'
  4* and    statistic_name = 'ITL waits'
SQL_SESS3>/

STATISTIC_NAME                                                        VALUE
---------------------------------------------------------------- ----------
ITL waits                                                                 0

1 row selected.

SQL_SESS3>

Because the statement was sniped by the distributed lock timeout, the segment ITL statistics are not incremented and so these (and the AWR snapshots thereof) miss the significance.

A good point made by Coskan is that I did not include the blocking session information in my ASH output.

Simply this is because it made the output too wide.

One of the sessions is reported as blocker but in reality there may be many blockers – it’s as many as there are ITL slots.

Also, depending on how many sessions are waiting, you may find that your blocked statement is unlucky and gets superseded by other sessions which get the ITL slot just through good timing of when they tried to get the enqueue. This is what was happening in my particular situation, where an unlucky session was blocked for 1 minute – up to the distributed lock timeout – by a variety of blockers which, at other times in that minute, were busy doing other things and not all waiting on this enqueue.

Manual Concatenation and LNNVL

A quick example of things to consider when rewriting a SQL OR condition to a UNION/UNION ALL, as pointed out to me by Jonathan Lewis referencing this post on this recent OTN forum thread.

At the heart of the issue are:

  • how to deal with the possibility of duplicates
  • the potential impact of NULLS

The impact of NULLS is best described by Jonathan in the article already mentioned so I’m going to use a noddy example to illustrate:

  • the problem of duplicates
  • why suggesting a UNION is not an appropriate generic approach (although it may work for specific examples)
  • why that generic approach should be a UNION ALL plus LNNVL

So… if, for whatever reason, we want to rewrite this statement:

select *
from   t2, t1
where  t2.col1 = t1.par
or     t2.col1 = t1.child;

If this is our setup that includes a noddy duplicate:

SQL> create table t1
  2  (par   number
  3  ,child number);

Table created.

SQL> create table t2
  2  (col1 number
  3  ,col2 varchar2(1));

Table created.

SQL> insert into t1 values (1,1);

1 row created.

SQL> insert into t1 values (1,1);

1 row created.

SQL> insert into t2 values (1,'A');

1 row created.

SQL> select * from t1;

       PAR      CHILD
---------- ----------
         1          1
         1          1

2 rows selected.

SQL> select * from t2;

      COL1 C
---------- -
         1 A

1 row selected.

SQL> 

Such that our target statement returns this:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  or     t2.col1 = t1.child;

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1
         1 A          1          1

2 rows selected.

SQL> 

We can see that rewriting with a UNION is not equivalent to the original:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  union
  5  select *
  6  from   t2, t1
  7  where  t2.col1 = t1.child;

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1

1 row selected.

SQL> 

And if we use a UNION ALL:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  union all
  5  select *
  6  from   t2, t1
  7  where  t2.col1 = t1.child;

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1
         1 A          1          1
         1 A          1          1
         1 A          1          1

4 rows selected.

Then we need LNNVL to eliminate the duplicates from the second branch, the lower half of the statement:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  union all
  5  select *
  6  from   t2, t1
  7  where  t2.col1 = t1.child
  8  and    lnnvl(t1.par=t1.child);

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1
         1 A          1          1

2 rows selected.

SQL> 