CPU_time much higher than elapsed_time

Of course, one of the first places you should go is Oracle Support. But if, like me, you put off doing that and scratch your head for a while wondering why ELAPSED_TIME in V$SQL might be significantly lower than CPU_TIME when there are no other obvious explanations, then note these two bugs in 11.1.0.7:

BUG 7168259 – Millisecond elapsed_time shows as 0 : Elapsed_time for SQL statements which execute in under one millisecond can show up as 0

BUG 7561762 – Elapsed_time is under-reported : Elapsed_time can be 0 or significantly less than V$SQL.CPU_TIME if there is significant idle wait class time
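
If you suspect you are affected, a quick sanity check against V$SQL might look like this – a minimal sketch, with the filter and ordering entirely arbitrary:

select sql_id
,      executions
,      cpu_time
,      elapsed_time
from   v$sql
where  cpu_time > elapsed_time
and    executions > 0
order by cpu_time - elapsed_time desc;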

Modelling a “simple” ITL problem

Today I have been looking at an application problem.

Last night important client processing missed its SLA with the application reporting ORA-02049: timeout: distributed transaction waiting for lock.

This post isn’t about the distributed lock timing out.

But consider the evidence that further investigation into these distributed locks threw up.

  1. The parameter distributed_lock_timeout is set to the default of 60 (seconds) – this means that any distributed transaction that waits on an enqueue for 60 seconds will be picked off by a sniper, similar to the standard deadlock detection after 3 seconds (a quick check of the setting is sketched just after this list).
  2. The 1-in-10 second ASH samples in DBA_HIST_ACTIVE_SESS_HISTORY show clear waits on “enq: TX – allocate ITL entry”
  3. These same ASH samples show one particular sql id as being involved in the ITL waits.
  4. This particular sql id is a DELETE on a single table, no triggers, no on delete cascade referential integrity.
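
For reference, the timeout setting from point 1 is easy to confirm (standard parameter, visible in V$PARAMETER):

select name
,      value
,      isdefault
from   v$parameter
where  name = 'distributed_lock_timeout';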

Sounds easy right?

But…

  1. V$SEGMENT_STATISTICS and DBA_HIST_SEG_STAT show 0 ITL waits for the segment/s related to the DELETE sql id above.
  2. In the ASH data, for the event “enq: TX – allocate ITL entry”, the CURRENT_OBJ# column indicates a variety of objects that might show ITL waits under concurrent DML but, again, all unrelated to the sql id identified by ASH.

So…

Time for some theories:

  1. ASH could be misleading us about the sql ids and/or
  2. ASH could be misleading about current_obj# and/or
  3. If our enqueue times out because of the distributed lock timeout, then perhaps the ITL wait is not recorded – i.e. segment stats are misleading.

Theories about ASH being misleading under certain circumstances are incredibly convoluted to test but, as far as I am concerned, even up to the latest 11gR2 this is a known known. There are certain values that are not cleared out or not recorded properly or disappear under certain conditions.

But perhaps we can test theory 3 relatively easily?

I like recycling – I’m going to reuse Randolf Geist’s ITL test case here.

SQL> get itl_case.sql
  1  create table t1 (c1 varchar2(1336)) pctfree 10;
  2  insert into t1 values ('a');
  3  insert into t1 values ('b');
  4  insert into t1 values ('c');
  5  insert into t1 values ('d');
  6  insert into t1 values ('e');
  7  insert into t1 values ('f');
  8  commit;
  9  update t1 set c1 = rpad(c1, 1335, c1) where length(c1) = 1;
 10* commit;
 11  .
SQL> @itl_case

Table created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
1 row created.
Commit complete.
6 rows updated.
Commit complete.

SQL>

Now I’m going to delete some rows across three sessions.
The first two sessions do simple uncommitted deletes. The third will do a select over a db link in order to start a distributed transaction, followed by the same simple delete.

First session – delete some rows and leave uncommitted so the ITL cannot be cleared:

SQL_SESS1>select distinct sid from v$mystat;

       SID
----------
      2056

SQL_SESS1>delete from t1 where c1 like 'a%';

1 row deleted.

SQL_SESS1>

Second session – delete some rows and leave uncommitted so the ITL cannot be cleared:

SQL_SESS2>select distinct sid from v$mystat;

       SID
----------
      2126

1 row selected.

SQL_SESS2>select to_char(sysdate,'DD-MON-YYYY HH24:MI') from dual;

TO_CHAR(SYSDATE,'
-----------------
01-NOV-2012 23:32

1 row selected.

SQL_SESS2>delete from t1 where c1 like 'b%';

1 row deleted.

SQL_SESS2>

Finally – third session – try to delete some rows within a distributed transaction:

SQL_SESS3>select distinct sid from v$mystat;

       SID
----------
      1884

1 row selected.

SQL_SESS3>select 1 from dual@mydba;

         1
----------
         1

1 row selected.

SQL_SESS3>delete from t1 where c1 like 'c%';
delete from t1 where c1 like 'c%'
*
ERROR at line 1:
ORA-02049: timeout: distributed transaction waiting for lock


SQL_SESS3>

Right, so what does ASH tell us about what happened?

SQL_SESS2>l
  1  select to_char(sample_time,'DD-MON-YYYY HH24:MI:SS'), sql_id, event, blocking_session, current_obj#
  2  from   v$active_Session_history
  3  where  event = 'enq: TX - allocate ITL entry'
  4  and    sample_time >= to_date('01-NOV-2012 23:32','DD-MON-YYYY HH24:MI')
  5* order  by sample_time desc
SQL_SESS2>/

TO_CHAR(SAMPLE_TIME, SQL_ID        EVENT                          BLOCKING_SESSION CURRENT_OBJ#
-------------------- ------------- ------------------------------ ---------------- ------------
01-NOV-2012 23:35:14 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:13 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:12 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:11 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:10 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:09 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:08 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:07 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:06 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:05 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:04 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:03 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:02 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:01 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:35:00 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:59 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:58 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:57 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:56 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:55 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:54 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:53 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:52 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:51 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:50 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:49 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:48 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:47 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:46 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:45 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:44 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:43 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:42 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:41 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:40 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:39 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:38 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:37 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:36 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:35 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:34 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:33 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:32 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:31 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:30 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:29 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:28 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:27 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:26 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:25 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:24 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:23 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:22 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:21 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:20 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:19 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:18 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:17 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:16 430xqxrrucgc5 enq: TX - allocate ITL entry               2126           -1
01-NOV-2012 23:34:15 430xqxrrucgc5 enq: TX - allocate ITL entry               2056           -1
01-NOV-2012 23:34:14 430xqxrrucgc5 enq: TX - allocate ITL entry               2056           -1
01-NOV-2012 23:34:13 430xqxrrucgc5 enq: TX - allocate ITL entry               2056           -1
01-NOV-2012 23:34:12 430xqxrrucgc5 enq: TX - allocate ITL entry               2056           -1
01-NOV-2012 23:34:11 430xqxrrucgc5 enq: TX - allocate ITL entry               2056           -1

71 rows selected.

Elapsed: 00:00:00.04
SQL_SESS2>

So, we are reporting wait event “enq: TX – allocate ITL entry”, which was the aim.

And look at column CURRENT_OBJ# – it seems that it is not set properly for this wait event (even though in my real-world example the values were not -1).

Sql id 430xqxrrucgc5 is my delete statement, so that is bang on the money.

Finally, what do segment statistics say about ITL waits for this object t1?

SQL_SESS3>l
  1  select statistic_name, value
  2  from   v$segment_statistics
  3  where  object_name = 'T1'
  4* and    statistic_name = 'ITL waits'
SQL_SESS3>/

STATISTIC_NAME                                                        VALUE
---------------------------------------------------------------- ----------
ITL waits                                                                 0

1 row selected.

SQL_SESS3>

Because the statement was sniped by the distributed transaction timeout, the segment ITL statistics are not incremented and so these (and the AWR snapshots thereof) miss the significance.
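
If you want to double-check the repository view of segment ITL statistics over time, something like this is a reasonable sketch – assuming the usual OBJ#-to-OBJECT_ID mapping applies to your segment:

select sn.end_interval_time
,      st.itl_waits_delta
from   dba_hist_seg_stat st
,      dba_hist_snapshot sn
,      dba_objects       o
where  st.snap_id      = sn.snap_id
and    st.obj#         = o.object_id
and    o.object_name   = 'T1'
order by sn.snap_id desc;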

A good point made by Coskan is that I did not initially include the blocking session information in my ASH output (it is included above).

This was simply because it made the output too wide.

One of the sessions is reported as the blocker but in reality there may be many blockers – as many as there are ITL slots.

Also, depending on how many sessions are waiting, you may find that your blocked statement is unlucky and gets superseded by other sessions which get the ITL just through good timing of when they tried to get the enqueue. This is what was happening in my particular situation – an unlucky session was blocked for one minute, up to the distributed lock timeout, by a variety of blockers which at other times in that minute were busy doing other stuff, not all waiting on this enqueue.
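
ASH can give a rough idea of how many distinct blockers a waiting session actually encountered – a minimal sketch (sample-based, so it only counts blockers that happened to be caught in a sample):

select session_id
,      count(distinct blocking_session) distinct_blockers
,      count(*)                         samples
from   v$active_session_history
where  event = 'enq: TX - allocate ITL entry'
group by session_id;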

Manual Concatenation and LNNVL

A quick example of things to consider when rewriting a SQL OR condition to a UNION/UNION ALL, as pointed out to me by Jonathan Lewis, referencing this post on a recent OTN forum thread.

At the heart of the issue are:

  • how to deal with the possibility of duplicates
  • the potential impact of NULLS

The impact of NULLS is best described by Jonathan in the article already mentioned so I’m going to use a noddy example to illustrate:

  • the problem of duplicates
  • why suggesting a UNION is not an appropriate generic approach (although it may work for specific examples)
  • why that generic approach should be a UNION ALL plus LNNVL

So… if, for whatever reason, we want to rewrite this statement:

select *
from   t2, t1
where  t2.col1 = t1.par
or     t2.col1 = t1.child;

And if this is our setup, which includes a noddy duplicate:

SQL> create table t1
  2  (par   number
  3  ,child number);

Table created.

SQL> create table t2
  2  (col1 number
  3  ,col2 varchar2(1));

Table created.

SQL> insert into t1 values (1,1);

1 row created.

SQL> insert into t1 values (1,1);

1 row created.

SQL> insert into t2 values (1,'A');

1 row created.

SQL> select * from t1;

       PAR      CHILD
---------- ----------
         1          1
         1          1

2 rows selected.

SQL> select * from t2;

      COL1 C
---------- -
         1 A

1 row selected.

SQL> 

Such that our target statement returns this:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  or     t2.col1 = t1.child;

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1
         1 A          1          1

2 rows selected.

SQL> 

We can see that rewriting with a UNION is not equivalent to the original:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  union
  5  select *
  6  from   t2, t1
  7  where  t2.col1 = t1.child;

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1

1 row selected.

SQL> 

And if we use a UNION ALL:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  union all
  5  select *
  6  from   t2, t1
  7  where  t2.col1 = t1.child;

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1
         1 A          1          1
         1 A          1          1
         1 A          1          1

4 rows selected.

Then we need LNNVL to eliminate the duplicates from the second branch, the lower half of the statement:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  union all
  5  select *
  6  from   t2, t1
  7  where  t2.col1 = t1.child
  8  and    lnnvl(t1.par=t1.child);

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1
         1 A          1          1

2 rows selected.

SQL> 
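
The reason LNNVL is the right tool here is its three-valued logic: LNNVL(condition) returns TRUE when the condition evaluates to FALSE or UNKNOWN, and FALSE when it evaluates to TRUE – so rows where t1.par or t1.child is NULL are not wrongly eliminated from the second branch. A quick illustration against DUAL:

-- condition TRUE: LNNVL returns FALSE, no rows
select 'kept' from dual where lnnvl(1=1);
-- condition FALSE: LNNVL returns TRUE, row returned
select 'kept' from dual where lnnvl(1=2);
-- condition UNKNOWN (NULL involved): LNNVL returns TRUE, row returned
select 'kept' from dual where lnnvl(1=null);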

Plan_hash_value and internal temporary table names

Here’s a little thing about plan_hash_2 that’s come via a pointer from Jonathan Lewis to bug 10162430.

From querying DBA_HIST_SQLSTAT I happened to notice that over a period of 65 days a particular query had 63 different plans (PLAN_HASH_VALUE).

I found that the differences were down to the different internal temporary table names used for a materialized subquery, i.e. the plans were essentially the same but for the name of SYS_TEMP_0FD9D6786_B6EF87D2, SYS_TEMP_0FD9D6825_BE8671F, SYS_TEMP_0FD9D6684_EAD00827, etc.

Exactly what bug 10162430 addresses is unclear because it is listed as fixed in 11.2.0.3 and my different plan hashes come from an 11.2.0.3 database.

But the key information in the bug description highlighted that PLAN_HASH_2 is not affected by these differing temp table names.

Exactly what plan_hash_2 is and how it differs from plan_hash_value is undocumented but:

  • it is found in V$SQL_PLAN.OTHER_XML;
  • also in the AWR repository DBA_HIST_SQL_PLAN.OTHER_XML;
  • and is used as the enforcing plan hash in sql plan baselines – not exposed in DBA_SQL_PLAN_BASELINES but internally represented in column PLAN_ID in SYS.SQLOBJ$, SYS.SQLOBJ$DATA, SYS.SQLOBJ$AUXDATA.

So, if you regularly use DBA_HIST_SQLSTAT to look at the historic execution statistics of a SQL statement, and your queries use materialized subqueries, then I recommend adding PLAN_HASH_2 to the output, something like this:

select sn.snap_id
,      sn.end_interval_time
,      st.module
,      st.sql_id
,      st.plan_hash_value
,      (select extractvalue(xmltype(other_xml),'other_xml/info[@type="plan_hash_2"]')
        from   dba_hist_sql_plan hp
        where  hp.sql_id          = st.sql_id
        and    hp.plan_hash_value = st.plan_hash_value
        and    hp.other_xml is not null) plan_hash_2
,      rows_processed_delta rws
,      executions_delta     execs
,      elapsed_time_delta   elp
,      cpu_time_delta       cpu
,      buffer_gets_delta    gets
,      iowait_delta         io
from   dba_hist_snapshot sn
,      dba_hist_sqlstat  st
where  st.snap_id            = sn.snap_id
and    st.sql_id             = '&sql_id'
and    st.elapsed_time_delta > 0
order by sn.snap_id desc; 
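
The same extractvalue path works against the cursor cache if the statement is still in memory – a minimal sketch against V$SQL_PLAN:

select sql_id
,      plan_hash_value
,      extractvalue(xmltype(other_xml),'other_xml/info[@type="plan_hash_2"]') plan_hash_2
from   v$sql_plan
where  sql_id = '&sql_id'
and    other_xml is not null;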

And just to show this difference between PHV and PH2:

   SNAP_ID END_INTERVAL_TIME         PLAN_HASH_VALUE PLAN_HASH_2
---------- ------------------------- --------------- ------------
      9131 12/07/2012 00:00               1987570133 3399935153
      9083 11/07/2012 00:00               1567016512 3399935153
      9035 10/07/2012 00:00               3386529786 3399935153
      8891 07/07/2012 00:00               2197008736 3399935153
      8842 05/07/2012 23:30                991904690 3399935153
      8794 04/07/2012 23:30               1331394991 3399935153
      8746 03/07/2012 23:30               2577119670 3399935153
      8699 03/07/2012 00:00               3774955286 3399935153
      8698 02/07/2012 23:30               3774955286 3399935153
      8555 30/06/2012 00:00               3308813978 3399935153
      8507 29/06/2012 00:00                796849390 3399935153
      8459 28/06/2012 00:00                917735929 3399935153
      8410 26/06/2012 23:30                139934889 3399935153
      8363 26/06/2012 00:00               1936665492 3399935153
      8219 23/06/2012 00:00                666306815 3399935153
      8171 22/06/2012 00:00               1053584101 3399935153
      8123 21/06/2012 00:00               1353471101 3399935153
...

Virtual Column

Stated requirement from the forum:
To have a foreign key to a parent table only when the status in the child table is a certain value, whilst allowing rubbish and NULLs in the child’s FK column when the status is not ACTIVE.

We can do this with a virtual column.

Here’s how.

Setup:

SQL> create table t1
  2  (col1 number primary key);

Table created.

SQL> insert into t1 values (1);

1 row created.

SQL> create table t2
  2  (col1    number primary key
  3  ,status  varchar2(12) not null
  4  ,col2    number
  5  ,col2v   number generated always as (case when status = 'ACTIVE' then col2 end) VIRTUAL
  6  ,constraint t2_fk foreign key (col2v) references t1(col1)
  7  ,constraint t2_ck check (status IN ('ACTIVE','INACTIVE') and (status = 'INACTIVE' or col2 IS NOT NULL)));

Table created.

SQL> create sequence s1;

Sequence created.

SQL> 

First requirement – insert an active row into T2 with a valid FK in col2:

SQL> insert into t2 (col1, status, col2) values (s1.nextval,'ACTIVE',1);

1 row created.

SQL> 

Second requirement – ensure that active rows have a valid foreign key:

SQL> insert into t2 (col1, status, col2) values (s1.nextval,'ACTIVE',-99);
insert into t2 (col1, status, col2) values (s1.nextval,'ACTIVE',-99)
*
ERROR at line 1:
ORA-02291: integrity constraint (RIMS.T2_FK) violated - parent key not found

SQL> 

Third requirement – allow inactive rows to have rubbish in col2 (for historic data quality reasons?):

SQL> insert into t2 (col1, status, col2) values (s1.nextval,'INACTIVE',-99);

1 row created.

SQL> 

Fourth requirement – prevent NULL foreign key values for active rows:

SQL> insert into t2 (col1, status, col2) values (s1.nextval,'ACTIVE',NULL);
insert into t2 (col1, status, col2) values (s1.nextval,'ACTIVE',NULL)
*
ERROR at line 1:
ORA-02290: check constraint (RIMS.T2_CK) violated

SQL> 
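
To see why this works, it’s worth selecting the virtual column back – COL2V is only populated when STATUS is ACTIVE, so for other rows the foreign key has nothing to check:

select col1
,      status
,      col2
,      col2v
from   t2;

For the INACTIVE row, COL2V comes back NULL and so the -99 in COL2 never troubles T2_FK.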

Just some minor tweaks on the requirements which came up in this forum question…

Firstly, let’s say it’s not just statuses of ACTIVE/INACTIVE. We want to ensure that data quality for ACTIVE is protected but there’s a whole bunch of other statuses not just INACTIVE.

So, change the check constraint:

SQL>  alter table t2 drop constraint t2_ck;

Table altered.

SQL> alter table t2 add constraint t2_ck check (NOT (status = 'ACTIVE' and col2 IS NULL));

Table altered.

SQL> insert into t2 (col1, status, col2) values (s1.nextval,'WHO CARES',-75);

1 row created.

SQL> 

Or what if ACTIVE rows are allowed either a valid foreign key or NULL?

Just lose the check constraint altogether:

SQL>  alter table t2 drop constraint t2_ck;

Table altered.

SQL> insert into t2 (col1, status, col2) values (s1.nextval,'ACTIVE',NULL);

1 row created.

SQL> insert into t2 (col1, status, col2) values (s1.nextval,'WHO KNOWS',-75);

1 row created.

SQL> 
SQL> insert into t2 (col1, status, col2) values (s1.nextval,'INACTIVE',-99);

1 row created.

SQL> 
SQL> insert into t2 (col1, status, col2) values (s1.nextval,'ACTIVE',-99);
insert into t2 (col1, status, col2) values (s1.nextval,'ACTIVE',-99)
*
ERROR at line 1:
ORA-02291: integrity constraint (RIMS.T2_FK) violated - parent key not found


SQL> 
SQL> insert into t2 (col1, status, col2) values (s1.nextval,'ACTIVE',1);

1 row created.

SQL> 

Thoughts on TM_DELTA_TIME

What are these columns in V$ACTIVE_SESSION_HISTORY – TM_DELTA_TIME, TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME?

What does the doco say?

  • TM_DELTA_TIME – Time interval (in microseconds) over which TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME are accumulated
  • TM_DELTA_CPU_TIME – Amount of time this session spent on CPU over the last TM_DELTA_TIME microseconds
  • TM_DELTA_DB_TIME – Amount of time spent by this session in database calls over the last TM_DELTA_TIME microseconds

Need a reminder on TIME MODEL STATISTICS?

The most important of the time model statistics is DB time.

This statistic represents the total time spent in database calls and is an indicator of the total instance workload.

It is calculated by aggregating the CPU and wait times of all sessions not waiting on idle wait events (non-idle user sessions).

i.e. DB TIME is time spent either actively working or actively waiting

What this effectively means is that TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME have no context without TM_DELTA_TIME.

So when looking at TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME, all other sample times/periods are pretty much irrelevant apart from TM_DELTA_TIME – the latter is the only value that gives context to the other two.

TM_DELTA_DB_TIME is time spent actively working or actively waiting; some of this will be time spent on CPU – TM_DELTA_CPU_TIME – and the rest will be time spent actively waiting on non-idle wait events, plus some overheads no doubt.

And TM_DELTA_DB_TIME should never exceed TM_DELTA_TIME, right?

The difference between TM_DELTA_TIME and TM_DELTA_DB_TIME should be idle time.
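
If that holds, deriving idle time is a one-liner – a minimal sketch against V$ACTIVE_SESSION_HISTORY, assuming the accounting holds (which, as we will see, it does not always):

select to_char(sample_time,'HH24:MI:SS') sample_time
,      tm_delta_time
,      tm_delta_db_time
,      tm_delta_time - tm_delta_db_time                           idle_time
,      round(100 * tm_delta_db_time / nullif(tm_delta_time,0), 1) pct_db_time
from   v$active_session_history
where  tm_delta_time is not null
order by sample_id desc;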

Sounds like something that might be quite useful particularly to counterbalance some of the pitfalls of looking at sampled data?

How might we investigate this further?

Shall we start with a plsql loop that should do nothing but sit on CPU for a short while?

SQL> select * from v$version;

BANNER
-------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select distinct sid from v$mystat;

       SID
----------
       642

SQL> alter session set plsql_optimize_level = 0;

Session altered.

SQL> set timing on
SQL> DECLARE
  2  BEGIN
  3   FOR i IN 1 .. 1000000000 LOOP
  4     NULL;
  5   END LOOP;
  6  END;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:16.46
SQL> 

What does ASH say?

SQL> select to_char(sample_time,'HH24:MI:SS')
  2  ,      is_awr_sample
  3  ,      event
  4  ,      session_state
  5  ,      to_char(tm_delta_time) tm_delta_time
  6  ,      to_char(tm_delta_cpu_time) tm_delta_cpu_time
  7  ,      to_char(tm_delta_db_time) tm_delta_db_time
  8  from   v$active_session_history
  9  where  session_id = 642
 10  and    sql_id     = '6b5drmtrvs6vv'
 11  order by sample_id desc;

TO_CHAR( I EVENT                     SESSION TM_DELTA_TIME   TM_DELTA_CPU_TIME TM_DELTA_DB_TIME
-------- - ------------------------- ------- --------------- ----------------- ----------------
14:45:25 N                           ON CPU
14:45:24 N                           ON CPU  2019831         2016694           2019831
14:45:23 N                           ON CPU
14:45:22 N                           ON CPU  2019978         2017693           2019978
14:45:21 N                           ON CPU
14:45:20 N                           ON CPU  2019848         2017694           2019848
14:45:19 N                           ON CPU
14:45:18 Y                           ON CPU
14:45:17 N                           ON CPU  2019934         2017693           2019934
14:45:16 N                           ON CPU
14:45:15 N                           ON CPU  2019803         2017693           2019803
14:45:14 N                           ON CPU
14:45:13 N                           ON CPU  2019925         2017693           2019925
14:45:12 N                           ON CPU
14:45:11 N                           ON CPU  4711327         1921708           1926278
14:45:10 N                           ON CPU  40504273        130979            134479

16 rows selected.

SQL> 

So, in this example, roughly every two seconds I get feedback on what my session was doing for that DELTA TIME period.

Unsurprisingly, given what we were doing, the session mainly spent 100% of its time in TM_DELTA_DB_TIME actively working, and TM_DELTA_CPU_TIME is the major component of that.

The difference between TM_DELTA_DB_TIME and TM_DELTA_CPU_TIME would be the time spent actively waiting plus some overhead.

If we did more waiting on a non-idle wait event – IO for example – our DB_TIME would be similar but the CPU_TIME element of it would be smaller, e.g.

SQL> DECLARE
  2  BEGIN
  3   FOR i IN (SELECT * FROM all_source) LOOP
  4     NULL;
  5   END LOOP;
  6  END;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:37.82

which gives

SQL> select to_char(sample_time,'HH24:MI:SS')
  2  ,      is_awr_sample
  3  ,      event
  4  ,      session_state
  5  ,      to_char(tm_delta_time) tm_delta_time
  6  ,      to_char(tm_delta_cpu_time) tm_delta_cpu_time
  7  ,      to_char(tm_delta_db_time) tm_delta_db_time
  8  from   v$active_session_history
  9  where  session_id = 642
 10  and    sql_id     IN ('2nj4dm3qwmkqh','34y8jhfh5w4q9')
 11  order by sample_id desc;

TO_CHAR( I EVENT                     SESSION TM_DELTA_TIME   TM_DELTA_CPU_TIME TM_DELTA_DB_TIME
-------- - ------------------------- ------- --------------- ----------------- ----------------
15:05:43 N                           ON CPU  2019899         1089835           2019899
15:05:42 Y db file scattered read    WAITING
15:05:41 N db file scattered read    WAITING 2020879         1359793           2020879
15:05:40 N                           ON CPU
15:05:39 N db file scattered read    WAITING 2022891         1487774           2022891
15:05:38 N db file scattered read    WAITING
15:05:37 N db file scattered read    WAITING 2021913         1123828           2021913
15:05:36 N                           ON CPU
15:05:35 N db file scattered read    WAITING 2036914         953856            2036914
15:05:34 N                           ON CPU
15:05:33 N db file scattered read    WAITING 1017936         686895            1017936
15:05:32 Y db file scattered read    WAITING 2039867         897864            2039867
15:05:31 N db file scattered read    WAITING
15:05:30 N db file scattered read    WAITING 1928599         1035843           1928599
15:05:29 N                           ON CPU
15:05:28 N                           ON CPU  2069218         1185819           2069218
15:05:27 N                           ON CPU
15:05:26 N                           ON CPU  2092901         840872            2092901
15:05:25 N                           ON CPU
15:05:24 N db file scattered read    WAITING 1998272         1044841           1998272
15:05:23 N db file scattered read    WAITING
15:05:22 Y db file scattered read    WAITING 2052570         1131828           2052570
15:05:21 N                           ON CPU
15:05:20 N db file scattered read    WAITING 2021863         1411786           2021863
15:05:19 N                           ON CPU
15:05:18 N db file scattered read    WAITING 1674137         797879            1674137
15:05:17 N db file scattered read    WAITING
15:05:16 N                           ON CPU  2350338         270958            2350338
15:05:15 N                           ON CPU
15:05:14 N db file scattered read    WAITING 1944057         165975            1944057
15:05:13 N db file scattered read    WAITING
15:05:12 Y db file scattered read    WAITING 2078929         1194819           2078929
15:05:11 N db file scattered read    WAITING
15:05:10 N                           ON CPU  2052015         1266807           2052015
15:05:09 N                           ON CPU
15:05:08 N db file scattered read    WAITING 985873021       915860            1846906
15:05:07 N                           ON CPU
15:05:06 N                           ON CPU  6746807         836873            843056

82 rows selected.

SQL> 

As expected, we see from TM_DELTA_DB_TIME that we are mainly spending 100% of our time actively working or actively waiting.

But, because we’ve had to do some IO, the proportion of that time which is CPU has reduced and the difference between TM_DELTA_DB_TIME and TM_DELTA_CPU_TIME should be roughly the time spent actively waiting during that TM_DELTA_TIME.

But in this output immediately above, do we start to see some of the limitations in this data?

What about this?

TO_CHAR( I EVENT                     SESSION TM_DELTA_TIME   TM_DELTA_CPU_TIME TM_DELTA_DB_TIME
-------- - ------------------------- ------- --------------- ----------------- ----------------
15:05:08 N db file scattered read    WAITING 985873021       915860            1846906

Why the big TM_DELTA_TIME all of a sudden? What’s that all about?

I’ve no idea and if anyone does then please comment.

I would presume that it’s a bug in the accounting.

And if it is a bug, it seriously limits the usefulness of this data for now and the ability to build diagnostic scripts on top of it.

So, moving quickly on to how this data is persisted to the repository – DBA_HIST_ACTIVE_SESS_HISTORY.

You probably already know that roughly every 10th ASH sample is marked for storage in the repository, as flagged by V$ACTIVE_SESSION_HISTORY.IS_AWR_SAMPLE (included in the output above).

From the last buffer output above (executing the loop of CPU+IO workload), it should be these rows:

TO_CHAR( I EVENT                     SESSION TM_DELTA_TIME   TM_DELTA_CPU_TIME TM_DELTA_DB_TIME
-------- - ------------------------- ------- --------------- ----------------- ----------------
15:05:42 Y db file scattered read    WAITING
15:05:32 Y db file scattered read    WAITING 2039867         897864            2039867
15:05:22 Y db file scattered read    WAITING 2052570         1131828           2052570
15:05:12 Y db file scattered read    WAITING 2078929         1194819           2078929

My initial thought was to wonder whether these TM_DELTA figures are persisted “as is” – in which case their value would be seriously limited.

Don’t want to wait for the flushing of the ASH buffer to the repository?
We can force the flush by creating an AWR snapshot.

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> 

And wham, bam, we can check out what has actually been persisted to DBA_HIST_ACTIVE_SESS_HISTORY:

SQL> select to_char(sample_time,'HH24:MI:SS')
  2  ,      event
  3  ,      session_state
  4  ,      to_char(tm_delta_time) tm_delta_time
  5  ,      to_char(tm_delta_cpu_time) tm_delta_cpu_time
  6  ,      to_char(tm_delta_db_time) tm_delta_db_time
  7  from   dba_hist_active_sess_history
  8  where  session_id = 642
  9  and    sql_id     IN ('2nj4dm3qwmkqh','34y8jhfh5w4q9')
 10  order by sample_id desc;

TO_CHAR( EVENT                     SESSION TM_DELTA_TIME   TM_DELTA_CPU_TIME TM_DELTA_DB_TIME
-------- ------------------------- ------- --------------- ----------------- ----------------
15:05:42 db file scattered read    WAITING 9120533         5612146           703679604684133
15:05:32 db file scattered read    WAITING 10128857        7202905           281471853489246
15:05:22 db file scattered read    WAITING 10042965        7493861           20148167
15:05:12 db file scattered read    WAITING 1194282071      20199931          31190611

9 rows selected.

SQL> 

So, the good news would be that the values from the buffer are not just copied down but seem to be an aggregation, presumably of the sampled rows. I’m not sure because some of the numbers add up and some don’t.

But the big, bad news is that there are similar accounting bugs – note the values in the first two rows for TM_DELTA_DB_TIME.

It should not be possible for TM_DELTA_DB_TIME to be greater than TM_DELTA_TIME.
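
Hunting for such anomalies in your own repository is straightforward – a minimal sketch:

select snap_id
,      session_id
,      to_char(sample_time,'DD-MON-YYYY HH24:MI:SS') sample_time
,      tm_delta_time
,      tm_delta_db_time
from   dba_hist_active_sess_history
where  tm_delta_db_time > tm_delta_time;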

As a result, this data seems occasionally unreliable and this unreliability undermines its usefulness.

Unless of course someone else has another explanation?

Flip Flop Hooray, Ho

(Title reference for any old skool hip hop fans?)

An investigation into a flip-flopping sql statement – the downside of dynamic sampling and why it can be a problem for fast-changing tables and frequently executed SQL.

Yesterday morning a sql statement which processes outstanding messages was having performance problems.

There were two child cursors being executed by the four concurrent sessions, depending on their optimizer settings.

I’m not going to detail the optimizer environment differences because they are mainly irrelevant other than that, because of the child cursors, there was a hard parse at time T0 and a hard parse at time T1, and then these cursors were executed repeatedly / regularly over a number of hours.

SQL statistics show this:

SQL> select to_char(sn.end_interval_time,'DD/MM/YYYY HH24:MI') dt
  2  ,      st.plan_hash_value plan_hash
  3  ,      rows_processed_delta rws
  4  ,      executions_delta     execs
  5  ,      elapsed_time_delta   elp
  6  ,      cpu_time_delta       cpu
  7  ,      buffer_gets_delta    gets
  8  ,      iowait_delta         io
  9  from   dba_hist_snapshot sn
 10  ,      dba_hist_sqlstat  st
 11  where  st.snap_id            = sn.snap_id
 12  and    st.sql_id             = '1s1vgw24g2u1k'
 13  and    st.elapsed_time_delta > 0
 14  and    sn.end_interval_time between to_timestamp('18/06/2012 09:00','DD/MM/YYYY HH24:MI')
 15                              and     to_timestamp('18/06/2012 10:00','DD/MM/YYYY HH24:MI')
 16  order by sn.snap_id, st.plan_hash_value; 

DT                PLAN_HASH        RWS      EXECS        ELP        CPU       GETS         IO
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
18/06/2012 09:00 1081606725       1525        182 2163175676  101353586    3938349 2062115102
18/06/2012 09:00 3791039348         48        299     449781     380944      71472      68103
18/06/2012 09:30 1081606725       2032         24 3019992733  146890665   12995474 2886335201
18/06/2012 09:30 3791039348         43        289   73670959   73468837   56740610      24170

SQL> 

Whereas this is representative of “normal”:

DT                PLAN_HASH        RWS      EXECS        ELP        CPU       GETS         IO
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
15/06/2012 09:00 3791039348       1289       1077  174434637   16030542    5470115     280865

SQL> 

I mentioned that this code was processing outstanding messages.
These different processes aren’t balanced and aren’t expected to be.

But plan hash 3791039348 is good for everyone and in general plan hash 1081606725 suits no-one.

Here’s the sanitised SQL – remember this isn’t so much a SQL tuning/rewriting exercise as a root-cause-of-slowness exercise (where the root cause is not initially meant to be “because the SQL is not brilliant”):

      SELECT  ....
      FROM    cmsg,
              sint,
              setm,
              comp,
              trga,
              seti,
              trad,
              (SELECT COUNT(*) count,
                      setm_il.sett_mesg_tag sett_mesg_tag
               FROM   prco prco_il,
                      cmsg cmsg_il,
                      setm setm_il,
                      seti seti_il
               WHERE  cmsg_il.unacknowledged = prco_il.unack_code
               AND    cmsg_il.evnt_type      = 'SI'
               AND   (cmsg_il.evnt_tag       = seti_il.sett_istr_tag 
               OR     cmsg_il.evnt_tag       = seti_il.amending_si_tag)
               AND    setm_il.sett_istr_tag  = seti_il.sett_istr_tag
               GROUP BY setm_il.sett_mesg_tag) cmsg_si
      WHERE   cmsg.rowid IN (SELECT row_id
                             FROM (SELECT rowid row_id
                                   FROM   cmsg  cmsg2
                                   WHERE  unacknowledged = :p_unack_code
                                   AND    evnt_type     != 'SOD'
                                   ORDER BY seq_num)
                             WHERE rownum <= :p_batch_lim)
      AND     sint.sint_tag         (+) = cmsg.evnt_tag
      AND     sint.stus_ind         (+) = 'C'
      AND     setm.sett_mesg_tag    (+) = sint.sett_mesg_tag
      AND     comp.cmpy_num         (+) = setm.cmpy_num
      AND     trga.trga_code        (+) = comp.trga_code
      AND     seti.sett_istr_tag    (+) = setm.sett_istr_tag
      AND     trad.trad_tag         (+) = seti.trad_tag
      AND     trad.cmpy_num         (+) = seti.cmpy_num
      AND     cmsg_si.sett_mesg_tag (+) = sint.sett_mesg_tag
      ORDER by cmsg.seq_num;

Here are the relevant sections of the two execution plans:

The good:

SQL_ID 1s1vgw24g2u1k
 
Plan hash value: 3791039348
 
------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name       | Rows  | Cost  | Time  |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                  |            |       |  9192 |       |
|   1 |  FOR UPDATE                                       |            |       |       |       |
|   2 |   SORT ORDER BY                                   |            |     1 |  9192 | 00:46 |
|   3 |    NESTED LOOPS OUTER                             |            |     1 |  9191 | 00:46 |
|   4 |     NESTED LOOPS OUTER                            |            |     1 |  9190 | 00:46 |
|   5 |      NESTED LOOPS OUTER                           |            |     1 |  9188 | 00:46 |
|   6 |       NESTED LOOPS OUTER                          |            |     1 |  9187 | 00:46 |
|   7 |        NESTED LOOPS OUTER                         |            |     1 |  9186 | 00:46 |
|   8 |         HASH JOIN OUTER                           |            |     1 |  9184 | 00:46 |
|   9 |          NESTED LOOPS OUTER                       |            |     1 |    10 | 00:01 |
|  10 |           NESTED LOOPS                            |            |     1 |     7 | 00:01 |
|  11 |            VIEW                                   | VW_NSO_1   |     3 |     5 | 00:01 |
|  12 |             SORT UNIQUE                           |            |     1 |       |       |
|  13 |              COUNT STOPKEY                        |            |       |       |       |
|  14 |               VIEW                                |            |     3 |     5 | 00:01 |
|  15 |                SORT ORDER BY STOPKEY              |            |     3 |     5 | 00:01 |
|  16 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |     3 |     4 | 00:01 |
|  17 |                  INDEX RANGE SCAN                 | CMSG3_IDX  |     4 |     3 | 00:01 |
|  18 |            TABLE ACCESS BY USER ROWID             | CMSG       |     1 |     1 | 00:01 |
|  19 |           TABLE ACCESS BY INDEX ROWID             | SETI       |     1 |     3 | 00:01 |
|  20 |            INDEX RANGE SCAN                       | SETI_1     |     1 |     2 | 00:01 |
|  21 |          VIEW                                     |            | 14490 |  9173 | 00:46 |
|  22 |           SORT GROUP BY                           |            | 14490 |       |       |
|  23 |            CONCATENATION                          |            |       |       |       |
|  24 |             NESTED LOOPS                          |            |       |       |       |
|  25 |              NESTED LOOPS                         |            |   562 |  8472 | 00:43 |
|  26 |               NESTED LOOPS                        |            |   428 |  6757 | 00:34 |
|  27 |                NESTED LOOPS                       |            |   304 |    30 | 00:01 |
|  28 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30 | 00:01 |
|  29 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29 | 00:01 |
|  30 |                 INDEX UNIQUE SCAN                 | PRCO_2_IDX |     1 |     0 |       |
|  31 |                PARTITION LIST ALL                 |            |     1 |    24 | 00:01 |
|  32 |                 TABLE ACCESS BY LOCAL INDEX ROWID | SETI       |     1 |    24 | 00:01 |
|  33 |                  INDEX RANGE SCAN                 | SETI_3_IDX |    36 |    22 | 00:01 |
|  34 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     2 | 00:01 |
|  35 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     4 | 00:01 |
|  36 |             NESTED LOOPS                          |            |       |       |       |
|  37 |              NESTED LOOPS                         |            |    20 |   700 | 00:04 |
|  38 |               NESTED LOOPS                        |            |    15 |   640 | 00:04 |
|  39 |                NESTED LOOPS                       |            |    15 |   640 | 00:04 |
|  40 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30 | 00:01 |
|  41 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29 | 00:01 |
|  42 |                 TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2 | 00:01 |
|  43 |                  INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1 | 00:01 |
|  44 |                INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0 |       |
|  45 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     2 | 00:01 |
|  46 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     4 | 00:01 |
|  47 |         TABLE ACCESS BY INDEX ROWID               | SETM       |     1 |     2 | 00:01 |
|  48 |          INDEX UNIQUE SCAN                        | SETM_IDX   |     1 |     1 | 00:01 |
|  49 |        TABLE ACCESS BY INDEX ROWID                | COMP       |     1 |     1 | 00:01 |
|  50 |         INDEX UNIQUE SCAN                         | COMP_PK    |     1 |     0 |       |
|  51 |       TABLE ACCESS BY INDEX ROWID                 | TRGA       |     1 |     1 | 00:01 |
|  52 |        INDEX UNIQUE SCAN                          | TRGA_IDX   |     1 |     0 |       |
|  53 |      TABLE ACCESS BY GLOBAL INDEX ROWID           | SETI       |     1 |     2 | 00:01 |
|  54 |       INDEX UNIQUE SCAN                           | SETI_IDX   |     1 |     1 | 00:01 |
|  55 |     PARTITION LIST ITERATOR                       |            |     1 |     1 | 00:01 |
|  56 |      INDEX UNIQUE SCAN                            | TRAD_IDX   |     1 |     1 | 00:01 |
------------------------------------------------------------------------------------------------

And the downright bad and ugly:

SQL_ID 1s1vgw24g2u1k
 
Plan hash value: 1081606725
 
------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name       | Rows  | Cost  | Time  |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                  |            |       |   436K|       |
|   1 |  FOR UPDATE                                       |            |       |       |       |
|   2 |   SORT ORDER BY                                   |            |     1 |   436K| 36:24 |
|   3 |    NESTED LOOPS OUTER                             |            |     1 |   436K| 36:24 |
|   4 |     NESTED LOOPS OUTER                            |            |     1 |   436K| 36:24 |
|   5 |      NESTED LOOPS OUTER                           |            |     1 |   436K| 36:23 |
|   6 |       NESTED LOOPS OUTER                          |            |     1 |   436K| 36:23 |
|   7 |        NESTED LOOPS OUTER                         |            |     1 |   436K| 36:23 |
|   8 |         HASH JOIN OUTER                           |            |     1 |   436K| 36:23 |
|   9 |          NESTED LOOPS OUTER                       |            |     1 |    10 | 00:01 |
|  10 |           NESTED LOOPS                            |            |     1 |     7 | 00:01 |
|  11 |            VIEW                                   | VW_NSO_1   |     2 |     5 | 00:01 |
|  12 |             SORT UNIQUE                           |            |     1 |       |       |
|  13 |              COUNT STOPKEY                        |            |       |       |       |
|  14 |               VIEW                                |            |     2 |     5 | 00:01 |
|  15 |                SORT ORDER BY STOPKEY              |            |     2 |     5 | 00:01 |
|  16 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |     2 |     4 | 00:01 |
|  17 |                  INDEX RANGE SCAN                 | CMSG3_IDX  |     2 |     3 | 00:01 |
|  18 |            TABLE ACCESS BY USER ROWID             | CMSG       |     1 |     1 | 00:01 |
|  19 |           TABLE ACCESS BY INDEX ROWID             | SETI       |     1 |     3 | 00:01 |
|  20 |            INDEX RANGE SCAN                       | SETI_1     |     1 |     2 | 00:01 |
|  21 |          VIEW                                     |            |   719K|   436K| 36:23 |
|  22 |           SORT GROUP BY                           |            |   719K|       |       |
|  23 |            CONCATENATION                          |            |       |       |       |
|  24 |             NESTED LOOPS                          |            |       |       |       |
|  25 |              NESTED LOOPS                         |            | 27212 |   403K| 33:40 |
|  26 |               HASH JOIN                           |            | 20520 |   301K| 25:06 |
|  27 |                HASH JOIN                          |            | 14473 |    35 | 00:01 |
|  28 |                 INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1 | 00:01 |
|  29 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33 | 00:01 |
|  30 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32 | 00:01 |
|  31 |                PARTITION LIST ALL                 |            |    30M|   300K| 25:02 |
|  32 |                 TABLE ACCESS FULL                 | SETI       |    30M|   300K| 25:02 |
|  33 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3 | 00:01 |
|  34 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5 | 00:01 |
|  35 |             NESTED LOOPS                          |            |       |       |       |
|  36 |              NESTED LOOPS                         |            |   960 | 32673 | 02:44 |
|  37 |               NESTED LOOPS                        |            |   724 | 29046 | 02:26 |
|  38 |                NESTED LOOPS                       |            |   724 | 29045 | 02:26 |
|  39 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33 | 00:01 |
|  40 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32 | 00:01 |
|  41 |                 TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2 | 00:01 |
|  42 |                  INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1 | 00:01 |
|  43 |                INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0 |       |
|  44 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3 | 00:01 |
|  45 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5 | 00:01 |
|  46 |         TABLE ACCESS BY INDEX ROWID               | SETM       |     1 |     2 | 00:01 |
|  47 |          INDEX UNIQUE SCAN                        | SETM_IDX   |     1 |     1 | 00:01 |
|  48 |        TABLE ACCESS BY INDEX ROWID                | COMP       |     1 |     2 | 00:01 |
|  49 |         INDEX UNIQUE SCAN                         | COMP_PK    |     1 |     1 | 00:01 |
|  50 |       TABLE ACCESS BY INDEX ROWID                 | TRGA       |     1 |     1 | 00:01 |
|  51 |        INDEX UNIQUE SCAN                          | TRGA_IDX   |     1 |     0 |       |
|  52 |      TABLE ACCESS BY GLOBAL INDEX ROWID           | SETI       |     1 |     2 | 00:01 |
|  53 |       INDEX UNIQUE SCAN                           | SETI_IDX   |     1 |     1 | 00:01 |
|  54 |     PARTITION LIST ITERATOR                       |            |     1 |     1 | 00:01 |
|  55 |      INDEX UNIQUE SCAN                            | TRAD_IDX   |     1 |     1 | 00:01 |
------------------------------------------------------------------------------------------------
 
Note
-----
   - dynamic sampling used for this statement (level=4)
 

The trouble with specific examples is that sometimes they’re a bit longer than ideal (as well as being a bit out of context).

Let me highlight what I see as the problem:

  • There is a non-merged view with an OR (which is expanded to a CONCATENATION)
  • In the good plan, @ line 33, one part of the OR condition is dealt with via an INDEX RANGE SCAN on SETI_3_IDX.
  • In the bad plan, @ line 32, this is a TABLE ACCESS FULL on SETI.
  • There’s a dynamic sampling note in the bad plan

Good:

|  21 | VIEW                                    |            | 14490 |  9173   (1)| 00:00:46 |
...
|  26 |     NESTED LOOPS                        |            |   428 |  6757   (1)| 00:00:34 |
|  27 |      NESTED LOOPS                       |            |   304 |    30   (0)| 00:00:01 |
|  28 |       TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30   (0)| 00:00:01 |
|  29 |        INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29   (0)| 00:00:01 |
|  30 |       INDEX UNIQUE SCAN                 | PRCO_2_IDX |     1 |     0   (0)|          |
|  31 |      PARTITION LIST ALL                 |            |     1 |    24   (0)| 00:00:01 |
|  32 |       TABLE ACCESS BY LOCAL INDEX ROWID | SETI       |     1 |    24   (0)| 00:00:01 |
|  33 |        INDEX RANGE SCAN                 | SETI_3_IDX |    36 |    22   (0)| 00:00:01 |
...

Bad:

|  21 | VIEW                                    |            |   719K|   436K  (3)| 00:36:23 |
...
|  26 |     HASH JOIN                           |            | 20520 |   301K  (4)| 00:25:06 |
|  27 |      HASH JOIN                          |            | 14473 |    35   (3)| 00:00:01 |
|  28 |       INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1   (0)| 00:00:01 |
|  29 |       TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33   (0)| 00:00:01 |
|  30 |        INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|  31 |      PARTITION LIST ALL                 |            |    30M|   300K  (4)| 00:25:02 |
|  32 |       TABLE ACCESS FULL                 | SETI       |    30M|   300K  (4)| 00:25:02 |
...
Note
-----
   - dynamic sampling used for this statement (level=4)

It’s always nice to simplify a situation and, because the problem seems to be in this non-merged view, we can eliminate half the SQL and then see if we can reproduce the problem just by adjusting the estimates.

Which we can, and via trial and error, we see that as we approach an estimate of 14000 rows from CMSG, the plan changes to the FULL TABLE SCAN of SETI:

SQL> explain plan for
  2                 SELECT /*+
  3                          opt_estimate(table cmsg_il rows=14000)
  4                          */
  5                        COUNT(*) count,
  6                        setm_il.sett_mesg_tag sett_mesg_tag
  7                 FROM   prco    prco_il,
  8                        cmsg    cmsg_il,
  9                        setm    setm_il,
 10                        seti    seti_il
 11                 WHERE  cmsg_il.unacknowledged = prco_il.unack_code
 12                 AND    cmsg_il.evnt_type = 'SI'
 13                 AND   (cmsg_il.evnt_tag = seti_il.sett_istr_tag OR
 14                        cmsg_il.evnt_tag = seti_il.amending_si_tag)
 15                 AND    setm_il.sett_istr_tag = seti_il.sett_istr_tag
 16                 GROUP BY setm_il.sett_mesg_tag;

Explained.

SQL> select * from table(dbms_xplan.display);
Plan hash value: 273475595

-----------------------------------------------------------------------------------------------
| Id  | Operation                                | Name       | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |            |   695K|   432K  (3)| 00:36:01 |
|   1 |  HASH GROUP BY                           |            |   695K|            |          |
|   2 |   CONCATENATION                          |            |       |            |          |
|   3 |    NESTED LOOPS                          |            |       |            |          |
|   4 |     NESTED LOOPS                         |            | 26323 |   400K  (3)| 00:33:23 |
|*  5 |      HASH JOIN                           |            | 19850 |   301K  (4)| 00:25:06 |
|*  6 |       HASH JOIN                          |            | 14000 |    35   (3)| 00:00:01 |
|   7 |        INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1   (0)| 00:00:01 |
|*  8 |        TABLE ACCESS BY INDEX ROWID       | CMSG       | 14000 |    33   (0)| 00:00:01 |
|*  9 |         INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|  10 |       PARTITION LIST ALL                 |            |    30M|   300K  (4)| 00:25:02 |
|  11 |        TABLE ACCESS FULL                 | SETI       |    30M|   300K  (4)| 00:25:02 |
|* 12 |      INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3   (0)| 00:00:01 |
|  13 |     TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5   (0)| 00:00:01 |
|  14 |    NESTED LOOPS                          |            |       |            |          |
|  15 |     NESTED LOOPS                         |            |   928 | 31605   (1)| 00:02:39 |
|  16 |      NESTED LOOPS                        |            |   700 | 28097   (1)| 00:02:21 |
|  17 |       NESTED LOOPS                       |            |   700 | 28097   (1)| 00:02:21 |
|* 18 |        TABLE ACCESS BY INDEX ROWID       | CMSG       | 14000 |    33   (0)| 00:00:01 |
|* 19 |         INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|* 20 |        TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2   (0)| 00:00:01 |
|* 21 |         INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1   (0)| 00:00:01 |
|* 22 |       INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0   (0)| 00:00:01 |
|* 23 |      INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3   (0)| 00:00:01 |
|  24 |     TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("CMSG_IL"."EVNT_TAG"="SETI_IL"."AMENDING_SI_TAG")
   6 - access("CMSG_IL"."UNACKNOWLEDGED"="PRCO_IL"."UNACK_CODE")
   8 - filter("CMSG_IL"."EVNT_TYPE"='SI')
   9 - filter("CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL)
  12 - access("SETM_IL"."SETT_ISTR_TAG"="SETI_IL"."SETT_ISTR_TAG")
  18 - filter("CMSG_IL"."EVNT_TYPE"='SI')
  19 - filter("CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL)
  20 - filter(LNNVL("CMSG_IL"."EVNT_TAG"="SETI_IL"."AMENDING_SI_TAG"))
  21 - access("CMSG_IL"."EVNT_TAG"="SETI_IL"."SETT_ISTR_TAG")
  22 - access("CMSG_IL"."UNACKNOWLEDGED"="PRCO_IL"."UNACK_CODE")
  23 - access("SETM_IL"."SETT_ISTR_TAG"="SETI_IL"."SETT_ISTR_TAG")

46 rows selected.

SQL> 

So, optimizer environment differences and bind variable peeking are pretty much ruled out as influencing factors.
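If you want to double-check the environment angle for yourself, one way (a sketch only – '&sql_id' is a placeholder to substitute) is to look for optimizer parameters whose values differ between child cursors of the statement:

-- List optimizer parameters that differ between child cursors
select child_number, name, value
from   v$sql_optimizer_env
where  sql_id = '&sql_id'
and    name in (select name
                from   v$sql_optimizer_env
                where  sql_id = '&sql_id'
                group by name
                having count(distinct value) > 1)
order  by name, child_number;

No rows back means no optimizer environment differences between the children.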

What remains is that the bad plan has a higher estimate of rows than the good plan and the bad plan has dynamic sampling.

Both SQL statements are the same.

Dynamic sampling level is the same for both (4 set at system level).
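(As a sanity check, the system-level setting can be confirmed with a trivial query:)

select value, isdefault
from   v$parameter
where  name = 'optimizer_dynamic_sampling';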

Dynamic sampling should be applied because of the parameter setting and because there are two single-table access predicates
– one from the predicate on CMSG_IL.EVNT_TYPE and one implicit IS NOT NULL predicate on the join column UNACKNOWLEDGED.

So why would we see dynamic sampling in one plan and not the other?
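A 10053 trace shows the optimizer’s working. As a reminder of how to generate one (a sketch using the standard event syntax; the tracefile_identifier is just to make the trace file easier to find):

alter session set tracefile_identifier = 'ds_10053';
alter session set events '10053 trace name context forever, level 1';

explain plan for
select ... ;  -- the distilled statement under investigation

alter session set events '10053 trace name context off';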

If we do a 10053 trace of the distilled non-mergeable view, we see:

Access path analysis for CMSG
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for CMSG[CMSG_IL]

*** 2012-06-19 14:32:04.576
** Performing dynamic sampling initial checks. **
  Column (#4):
    NewDensity:0.000062, OldDensity:0.000000 BktCnt:8090, PopBktCnt:8084, PopValCnt:46, NDV:80
  Column (#4): EVNT_TYPE(
    AvgLen: 4 NDV: 80 Nulls: 0 Density: 0.000062
    Histogram: Freq  #Bkts: 52  UncompBkts: 8090  EndPtVals: 52
** Dynamic sampling initial checks returning TRUE (level = 4).

*** 2012-06-19 14:32:04.576
** Generated dynamic sampling query:
    query text :
SELECT /* OPT_DYN_SAMP */ 
       /*+ 
         ALL_ROWS 
         IGNORE_WHERE_CLAUSE 
         NO_PARALLEL(SAMPLESUB) 
         opt_param('parallel_execution_enabled', 'false') 
         NO_PARALLEL_INDEX(SAMPLESUB) 
         NO_SQL_TUNE 
         */ 
       NVL(SUM(C1),0), 
       NVL(SUM(C2),0) 
FROM (SELECT /*+ 
               IGNORE_WHERE_CLAUSE NO_PARALLEL("CMSG_IL") 
               FULL("CMSG_IL") 
               NO_PARALLEL_INDEX("CMSG_IL") 
               */ 
             1 AS C1, 
             CASE WHEN "CMSG_IL"."EVNT_TYPE"=:B1 
                  AND  "CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL 
                  THEN 1 
                  ELSE 0 
             END AS C2 
FROM "CMSG" SAMPLE BLOCK (0.002711 , 1) SEED (1) "CMSG_IL") SAMPLESUB

*** 2012-06-19 14:32:04.612
** Executed dynamic sampling query:
    level : 4
    sample pct. : 0.002711
    actual sample size : 1756
    filtered sample card. : 0
    orig. card. : 75899119
    block cnt. table stat. : 1118909
    block cnt. for sampling: 1143461
    max. sample block cnt. : 32
    sample block cnt. : 31
    min. sel. est. : 0.00000001
** Not using dynamic sampling for single table sel. or cardinality.

So, because the filtered sample cardinality was 0, the results of dynamic sampling are discarded as unreliable and the estimates fall back on the stored object stats.

But at some point (on one of the hard parses) the sample must have returned some relevant data and been considered reliable, which led to the “bad” plan being executed repeatedly in inappropriate circumstances.
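One way to see which plans a statement has been using over time is via AWR (a sketch – '&sql_id' is a placeholder and the statement must have been captured in the repository):

-- Which plan hash values were used, and when
select snap_id, plan_hash_value, executions_delta
from   dba_hist_sqlstat
where  sql_id = '&sql_id'
order  by snap_id;

-- And to see the plans themselves
select * from table(dbms_xplan.display_awr('&sql_id'));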

Two footnotes to this investigation.

Firstly, when you have tables with fast-changing data and row counts, there are usually two choices of approach for statistics:
– either find a set of stats that gives the best (balance of) plans and lock them (see the sketch after this list)
– or use dynamic sampling.
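Locking a representative set of stats is straightforward with DBMS_STATS (a sketch – the owner and table names here are placeholders):

-- Gather stats at a representative point in time, then lock them
-- so that the regular stats job cannot overwrite them
exec dbms_stats.gather_table_stats(ownname => 'APP', tabname => 'CMSG')
exec dbms_stats.lock_table_stats(ownname => 'APP', tabname => 'CMSG')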

The scenario above shows the downside of the dynamic sampling approach.
It may well be that plan hash 1081606725 was good for the initial execution(s) of the statement, when there were perhaps a large number of outstanding messages, but it very quickly became unsuitable given the frequent executions.

Secondly, I mentioned that this was not a SQL rewriting exercise but, if it were, I’m not a great fan of the outer-joined inline view. It’s not merged, but it seems sensible that a predicate be pushed into the view.
Pushing the predicate, whether happening by default or hinted explicitly, would give a subsection of the plan like:

|  20 | VIEW PUSHED PREDICATE                  |            |     1 |    38   (0)| 00:00:01 |
|  21 |  SORT GROUP BY                         |            |     1 |    38   (0)| 00:00:01 |
|  22 |   NESTED LOOPS                         |            |     1 |    38   (0)| 00:00:01 |
|  23 |    NESTED LOOPS                        |            |     1 |    38   (0)| 00:00:01 |
|  24 |     NESTED LOOPS                       |            |     1 |     5   (0)| 00:00:01 |
|  25 |      TABLE ACCESS BY INDEX ROWID       | SETM       |     1 |     3   (0)| 00:00:01 |
|* 26 |       INDEX UNIQUE SCAN                | SETM_IDX   |     1 |     2   (0)| 00:00:01 |
|  27 |      TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |    30M|     2   (0)| 00:00:01 |
|* 28 |       INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1   (0)| 00:00:01 |
|* 29 |     TABLE ACCESS BY INDEX ROWID        | CMSG       |     1 |    33   (0)| 00:00:01 |
|* 30 |      INDEX FULL SCAN                   | CMSG_IDX   |     1 |    32   (0)| 00:00:01 |
|* 31 |    INDEX UNIQUE SCAN                   | PRCO_2_IDX |     1 |     0   (0)| 00:00:01 |

However, I’m not keen on explicitly hinting this, nor on leaving it to chance, and, as it’s outer joined, I would strongly consider using a scalar subselect:

      WITH cmsg AS
           (SELECT *
            FROM   (SELECT cmsg.evnt_tag,
                           cmsg.evnt_type,
                           cmsg.seq_num
                    FROM   cmsg
                    WHERE  unacknowledged = :p_unack_code
                    AND    evnt_type != 'SOD'
                    ORDER BY seq_num)
            WHERE rownum <= :p_batch_lim)
      SELECT  ...,
              (SELECT COUNT(*)
                FROM   prco    prco_il,
                      cmsg    cmsg_il,
                      setm    setm_il,
                      seti    seti_il
               WHERE  cmsg_il.unacknowledged = prco_il.unack_code
               AND    cmsg_il.evnt_type      = 'SI'
               AND   (cmsg_il.evnt_tag       = seti_il.sett_istr_tag
               OR     cmsg_il.evnt_tag       = seti_il.amending_si_tag)
               AND    setm_il.sett_istr_tag  = seti_il.sett_istr_tag
               AND    setm_il.sett_mesg_tag  = sint.sett_mesg_tag) count
      FROM    cmsg,
              sint,
              setm,
              comp,
              trga,
              seti,
              trad
      WHERE   sint.sint_tag      (+) = cmsg.evnt_tag
      AND     sint.stus_ind      (+) = 'C'
      AND     setm.sett_mesg_tag (+) = sint.sett_mesg_tag
      AND     comp.cmpy_num      (+) = setm.cmpy_num
      AND     trga.trga_code     (+) = comp.trga_code
      AND     seti.sett_istr_tag (+) = setm.sett_istr_tag
      AND     trad.trad_tag      (+) = seti.trad_tag
      AND     trad.cmpy_num      (+) = seti.cmpy_num
      ORDER by cmsg.seq_num;