AWR: Was a baselined plan used?

Sometimes a simple question turns out to be harder than expected.

“Can we see if a particular SQL execution in AWR used a baselined plan?”

Initial thoughts might be:

Q: Does DBMS_XPLAN.DISPLAY_AWR tell us this?
A: Apparently not. See below. This question could also be rephrased as two other possible questions:

Q:Isn’t there a column on DBA_HIST_SQLSTAT which tell us this?
A: No. You’d think there should be. There is a SQL_PROFILE column. There isn’t a SQL_PLAN_BASELINE column.
There also isn’t an EXACT_MATCHING_SIGNATURE although there is a FORCE_MATCHING_SIGNATURE.

Q: Is it in DBA_HIST_SQL_PLAN.OTHER_XML?
A. No although this is where DBMS_XPLAN.DISPLAY_AWR gets it’s notes about cardinality feedback and dynamic sampling from.

First of all, Let’s see that it’s not shown in DBMS_XPLAN.DISPLAY_AWR.
Let’s get a sql statement baselined, in AWR and not in memory.

The usual setup

SQL> create table t1
  2  (col1  number
  3  ,col2  varchar2(50)
  4  ,flag  varchar2(2));

Table created.

SQL> insert into t1
  2  select rownum
  3  ,      lpad('X',50,'X')
  4  ,      case when rownum = 1
  5              then 'Y1'
  6              when rownum = 2
  7              then 'Y2'
  8              when mod(rownum,2) = 0
  9              then 'N1'
 10              else 'N2'
 11         end
 12  from   dual
 13* connect by rownum <= 100000
SQL> /

100000 rows created.

SQL> commit;

Commit complete.

Get a plan and put it in a baseline:

SQL> var n varchar2(2);
SQL> exec :n := 'N1';

PL/SQL procedure successfully completed.

SQL> select count(*), max(col2) from t1 where flag = :n;
select * from table(dbms_xplan.display_cursor);
  COUNT(*) MAX(COL2)
---------- --------------------------------------------------
     49999 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

SQL> select * from table(dbms_xplan.display_cursor);


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  731b98a8u0knf, child number 0
-------------------------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   221 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    30 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 50254 |  1472K|   221   (1)| 00:00:03 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("FLAG"=:N)

Note
-----
   - dynamic sampling used for this statement (level=2)

SQL> declare
  2   l_op pls_integer;
  3  begin
  4    l_op :=
  5    dbms_spm.load_plans_from_cursor_cache('731b98a8u0knf');
  6* end;
SQL> /

PL/SQL procedure successfully completed.

Verify that DBMS_XPLAN.DISPLAY_CURSOR reports baseline usage:

SQL> select count(*), max(col2) from t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- --------------------------------------------------
     49999 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  731b98a8u0knf, child number 0
-------------------------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   221 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    30 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 50254 |  1472K|   221   (1)| 00:00:03 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("FLAG"=:N)

Note
-----
   - dynamic sampling used for this statement (level=2)
   - SQL plan baseline SQL_PLAN_13w748wknkcwd616acf47 used for this statement

See that dynamic_sampling note most likely comes from OTHER_XML and Baseline note doesn’t, it comes from V$SQL:

SQL>select sql_id
  2  ,      to_char(exact_matching_signature) sig
  3  ,      plan_hash_value
  4  ,      sql_plan_baseline
  5  from   v$sql
  6* where  sql_id = '731b98a8u0knf'
SQL> /

SQL_ID        SIG                                      PLAN_HASH_VALUE SQL_PLAN_BASELINE
------------- ---------------------------------------- --------------- ------------------------------
731b98a8u0knf 1292784087274697613                           3724264953 SQL_PLAN_13w748wknkcwd616acf47

SQL> set long 10000
SQL> select other_xml
  2  from   v$sql_plan
  3  where  sql_id = '731b98a8u0knf'
  4* and    other_xml is not null
SQL> /

OTHER_XML
--------------------------------------------------------------------------------
<other_xml><info type="db_version">11.2.0.3</info><info type="parse_schema"><![C
DATA["E668983_DBA"]]></info><info type="dynamic_sampling">2</info><info type="pl
an_hash">3724264953</info><info type="plan_hash_2">1634389831</info><peeked_bind
s><bind nam=":N" pos="1" dty="1" csi="178" frm="1" mxl="32">4e31</bind></peeked_
binds><outline_data><hint><![CDATA[IGNORE_OPTIM_EMBEDDED_HINTS]]></hint><hint><!
[CDATA[OPTIMIZER_FEATURES_ENABLE('11.2.0.3')]]></hint><hint><![CDATA[DB_VERSION(
'11.2.0.3')]]></hint><hint><![CDATA[OPT_PARAM('query_rewrite_enabled' 'false')]]
></hint><hint><![CDATA[ALL_ROWS]]></hint><hint><![CDATA[OUTLINE_LEAF(@"SEL$1")]]
></hint><hint><![CDATA[FULL(@"SEL$1" "T1"@"SEL$1")]]></hint></outline_data></oth
er_xml>

Now, let’s add statement to AWR, do an AWR snap and flush the SP:

SQL> exec dbms_workload_repository.add_colored_sql('731b98a8u0knf');

PL/SQL procedure successfully completed.

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> alter system flush shared_pool;

System altered.

Now if we look back at the plan in memory, there’s nothing there.

SQL> select * from v$sql where sql_id = '731b98a8u0knf';

no rows selected

SQL> select * from table(dbms_xplan.display_cursor('731b98a8u0knf'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------
SQL_ID: 731b98a8u0knf, child number: 0 cannot be found

But the SQL is in AWR and let’s see that DBMS_XPLAN.DISPLAY_AWR does not tells us that a baseline is used:

SQL> select * from table(dbms_xplan.display_awr('731b98a8u0knf'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------
SQL_ID 731b98a8u0knf
--------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   221 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    30 |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 50238 |  1471K|   221   (1)| 00:00:03 |
---------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

But it will tell us if a sql profile is used because it can via DBA_HIST_SQLSTAT.SQL_PROFILE – I leave that exercise to the reader.

So, back to the original question:

“Can we see if a particular SQL execution in AWR used a baselined plan?”

I’ve never needed to do this but as an exercise we can infer this connection by the presence of a baselined plan with the same phv2 but it’s convoluted.

It may be that there are simpler ways but it seems to me that we need to:

  1. Get the exact_matching_signature from the sql text
  2. Get the phv2 out of DBA_HIST_SQL_PLAN.OTHER_XML
  3. Match that phv2 with planid which is not exposed in DBA_SQL_PLAN_BASELINES but is in the underlying SYS.SQLOBJ$ tables
SQL> with subq_mysql as
  2  (select sql_id
  3   ,      (select dbms_sqltune.sqltext_to_signature(ht.sql_text)
  4           from dual) sig
  5   from   dba_hist_sqltext       ht
  6   where  sql_id = '731b98a8u0knf')
  7  ,    subq_baselines as
  8  (select b.signature
  9   ,      b.plan_name
 10   ,      b.accepted
 11   ,      b.created
 12   ,      o.plan_id
 13   from   subq_mysql             ms
 14   ,      dba_sql_plan_baselines b
 15   ,      sys.sqlobj$            o
 16   where  b.signature   = ms.sig
 17   and    o.signature   = b.signature
 18   and    o.name        = b.plan_name)
 19  ,    subq_awr_plans as
 20  (select  sn.snap_id
 21   ,       to_char(sn.end_interval_time,'DD-MON-YYYY HH24:MI') dt
 22   ,       hs.sql_id
 23   ,       hs.plan_hash_value
 24   ,       t.phv2
 25   ,       ms.sig
 26   from    subq_mysql        ms
 27   ,       dba_hist_sqlstat  hs
 28   ,       dba_hist_snapshot sn
 29   ,       dba_hist_sql_plan hp
 30   ,       xmltable('for $i in /other_xml/info
 31                     where $i/@type eq "plan_hash_2"
 32                     return $i'
 33                    passing xmltype(hp.other_xml)
 34                    columns phv2 number path '/') t
 35   where   hs.sql_id          = ms.sql_id
 36   and     sn.snap_id         = hs.snap_id
 37   and     sn.instance_number = hs.instance_number
 38   and     hp.sql_id          = hs.sql_id
 39   and     hp.plan_hash_value = hs.plan_hash_value
 40   and     hp.other_xml      is not null)
 41  select awr.*
 42  ,       nvl((select max('Y')
 43               from   subq_baselines b
 44               where  b.signature = awr.sig
 45               and    b.accepted  = 'YES'),'N') does_baseline_exist
 46  ,      nvl2(b.plan_id,'Y','N') is_baselined_plan
 47  ,      to_char(b.created,'DD-MON-YYYY HH24:MI')  when_baseline_created
 48  from   subq_awr_plans awr
 49  ,      subq_baselines b
 50  where  b.signature (+) = awr.sig
 51  and    b.plan_id   (+) = awr.phv2
 52* order by awr.snap_id
SQL> /

   SNAP_ID DT                SQL_ID        PLAN_HASH_VALUE       PHV2        SIG D I WHEN_BASELINE_CRE
---------- ----------------- ------------- --------------- ---------- ---------- - - -----------------
      8703 05-FEB-2014 15:18 731b98a8u0knf      3724264953 1634389831 1.2928E+18 Y Y 05-FEB-2014 15:08

SQL>

Surprising.
Unless, of course, I’ve overlooked something bleeding obvious.

Failed Logins can cause row cache lock on dc_users

One “feature” to be aware of in old-fashioned Oracle 11g – took me a while to figure it out, so just in case anyone else gets hit by it…

I changed the password of a schema in a test database to stop anything connecting whilst that schema was being logically refreshed.

Following this change, this environment started crawling but, despite being heavily used, no-one was complaining apart from me.

I couldn’t connect, just seemed to be “hanging” and for any existing connections certain statements were very slow.

I could see lots of row cache locks, the p1 of which decoded to dc_users.

Spent a while trying to figure it out, and eventually asked for a couple of second opinions on at the environment – they had no problems connecting, could see the row cache locks and eventually bounced the database.

Nothing got any better.

Another strange thing was that when looking at the sessions via ASH, they were all SYS sessions but with a session module that could not have been SYS – i.e application connections, etc.

So, that pointed at recursive SQL.

select count(*) over (partition by sample_time) sess_cnt
,      (select username from dba_users u where u.user_id = h.user_id) u
,      sample_time, session_state, session_id, sql_id
,      module, event, p1, blocking_session
from   dba_hist_active_sess_history h
where  event = 'row cache lock'
order by sample_time desc;
SESS_CNT U     SAMPLE_TIME                   SESSION_STATE SESSION_ID MODULE                  EVENT             P1 BLOCKING_SESSION
-------- ----- ----------------------------- ------------- ---------- ----------------------- ----------------- -- ----------------
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1828 sqlplus@xx (TNS V1-V3)  row cache lock     7                  
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2172 JDBC Thin Client        row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1837 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1848 JDBC Thin Client        row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1880 sqlplus@xx (TNS V1-V3)  row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1886 JDBC Thin Client        row cache lock     7             1909 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2012 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2026 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2031 JDBC Thin Client        row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2041 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2058 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2063 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2064 SQL Developer           row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2073 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2084 sqlplus@xx (TNS V1-V3)  row cache lock     7                  
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2105 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2123 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2144 JDBC Thin Client        row cache lock     7             2041 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1832 sqlplus@xx (TNS V1-V3)  row cache lock     7             2144 

 19 rows selected 
select trunc(sample_time), count(*), min(sample_time), max(sample_time)
from   dba_hist_active_Sess_history
where  event = 'row cache lock'
group by trunc(sample_time)
order by trunc(sample_time);
TRUNC(SAMPLE_TIME) COUNT(*) MIN(SAMPLE_TIME)              MAX(SAMPLE_TIME)            
------------------ -------- ----------------------------- -----------------------------
02-JUL-13                10 02-JUL-13 04.08.48.955000000  02-JUL-13 16.12.30.128000000  
03-JUL-13                 2 03-JUL-13 11.42.30.367000000  03-JUL-13 22.02.38.886000000  
04-JUL-13                 6 04-JUL-13 04.10.25.568000000  04-JUL-13 22.02.52.536000000  
05-JUL-13                16 05-JUL-13 04.09.00.444000000  05-JUL-13 22.01.13.067000000  
06-JUL-13                 2 06-JUL-13 01.06.31.261000000  06-JUL-13 14.07.15.208000000  
08-JUL-13                26 08-JUL-13 03.10.53.909000000  08-JUL-13 22.00.45.416000000  
09-JUL-13                19 09-JUL-13 04.06.55.191000000  09-JUL-13 20.23.16.801000000  
10-JUL-13             48293 10-JUL-13 04.00.41.732000000  10-JUL-13 19.00.04.750000000  
11-JUL-13              5412 11-JUL-13 09.20.40.833000000  11-JUL-13 10.41.49.610000000  

 9 rows selected 

Eventually the penny dropped …

See bug 7715339 – Logon failures causes “row cache lock” waits – Allow disable of logon delay [ID 7715339.8]

In 11g there is an intentional delay between allowing failed logon
attempts to retry. For some specific application types this can cause
a problem as the row cache entry is locked for the duration of the
delay . This can lead to excessive row cache lock waits for DC_USERS
for specific users / schemas.

After 3 successive failures a sleep delay is introduced starting
at 3 seconds and extending to 10 seconds max. During each delay
the user X row cache lock is held in exclusive mode preventing
any concurrent logon attempt as user X (and preventing any
other operation which would need the row cache lock for user X).

That explains a) the problem and b) why the impact was limited.

RAC – It’s all about the node affinity

Or what happens when you don’t have node affinity.

Here’s an old, short post that was lying around unpublished.

SQL executed on the 14th on node 2, on 15th on node 1.

DT SQL ID Node Rows Execs Elapsed Cluster Wait Time
15-NOV 7287jpw5vtm8j 1 4129697 7591 32625 32175
14-NOV 7287jpw5vtm8j 2 19703872 30909 78 60

Timings in seconds.

32625 seconds is just over 9 hours, of which 32175 is cluster wait time.

Database services and application datasource config have now been reconfigured to the extent that there is node affinity and the data required for the workload on node 1 is completely segregated from the workload on node 2.

If you use RAC as a black box and you have problems with performance or problems with predictability of performance then look into whether node affinity, or a lack thereof, is your problem.

CPU_time much higher than elapsed_time

Of course, one of the first places you should go is Oracle Support, but if you, like me, put off doing that and scratch your head for a while wondering why ELAPSED_TIME in V$SQL might be significantly lower than CPU_TIME when there are no obvious other explanations, then note these two bugs in 11.1.0.7:

BUG 7168259 – Millisecond elapsed_time shows as 0 : Elapsed_time for SQL statements which execute in under one millisecond can show up as 0

BUG 7561762 – Elapsed_time is under-reported : Elapsed_time can be 0 or significantly less than V$SQL.CPU_TIME if there is significant idle wait class time

Flip Flop Hooray, Ho

(Title reference for any old skool hip hop fans?)

An investigation into a flip flopping sql statement – the downside of dynamic sampling – why it can be a problem for fast changing tables and frequently executed SQL.

Yesterday morning a sql statement which processes outstanding messages was having performance problems.

There were two child cursors being executed by the four concurrent sessions, depending on their optimizer settings.

I’m not going to detail the optimzer environment differences because it’s mainly irrelvant other than, because of the child cursors, there was a hard parse at time T0 and a hard parse at time T1 and then these cursors were executed repeatedly / regularly over a number of hours.

SQL statistics show this:

SQL> select to_char(sn.end_interval_time,'DD/MM/YYYY HH24:MI') dt
  2  ,      st.plan_hash_value plan_hash
  3  ,      rows_processed_delta rws
  4  ,      executions_delta     execs
  5  ,      elapsed_time_delta   elp
  6  ,      cpu_time_delta       cpu
  7  ,      buffer_gets_delta    gets
  8  ,      iowait_delta         io
  9  from   dba_hist_snapshot sn
 10  ,      dba_hist_sqlstat  st
 11  where  st.snap_id            = sn.snap_id
 12  and    st.sql_id             = '1s1vgw24g2u1k'
 13  and    st.elapsed_time_delta > 0
 14  and    sn.end_interval_time between to_timestamp('18/06/2012 09:00','DD/MM/YYYY HH24:MI')
 15                              and     to_timestamp('18/06/2012 10:00','DD/MM/YYYY HH24:MI')
 16  order by sn.snap_id, st.plan_hash_value; 

DT                PLAN_HASH        RWS      EXECS        ELP        CPU       GETS         IO
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
18/06/2012 09:00 1081606725       1525        182 2163175676  101353586    3938349 2062115102
18/06/2012 09:00 3791039348         48        299     449781     380944      71472      68103
18/06/2012 09:30 1081606725       2032         24 3019992733  146890665   12995474 2886335201
18/06/2012 09:30 3791039348         43        289   73670959   73468837   56740610      24170

SQL> 

Whereas this is representative of “normal”:

DT                PLAN_HASH        RWS      EXECS        ELP        CPU       GETS         IO
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
15/06/2012 09:00 3791039348       1289       1077  174434637   16030542    5470115     280865

SQL> 

I mentioned that this code was processing outstanding messages.
These different processes aren’t balanced and aren’t expected to be.

But plan hash 3791039348 is good for everyone and in general plan hash 1081606725 suits no-one.

Here’s the sanitised SQL – remember this isn’t so much a SQL tuning/rewriting exercise so much as a root cause of slowness exercise (where the root cause is not initially meant to be “because the SQL is not brilliant”)

      SELECT  ....
      FROM    cmsg,
              sint,
              setm,
              comp,
              trga,
		          seti,
		          trad,
              (SELECT COUNT(*) count,
                      setm_il.sett_mesg_tag sett_mesg_tag
               FROM   prco prco_il,
                      cmsg cmsg_il,
                      setm setm_il,
                      seti seti_il
               WHERE  cmsg_il.unacknowledged = prco_il.unack_code
               AND    cmsg_il.evnt_type      = 'SI'
               AND   (cmsg_il.evnt_tag       = seti_il.sett_istr_tag 
               OR     cmsg_il.evnt_tag       = seti_il.amending_si_tag)
               AND    setm_il.sett_istr_tag  = seti_il.sett_istr_tag
               GROUP BY setm_il.sett_mesg_tag) cmsg_si
      WHERE   cmsg.rowid IN (SELECT row_id
                             FROM (SELECT rowid row_id
                                   FROM   cmsg  cmsg2
                                   WHERE  unacknowledged = :p_unack_code
                                   AND    evnt_type     != 'SOD'
                                   ORDER BY seq_num)
                             WHERE rownum <= :p_batch_lim)
      AND     sint.sint_tag         (+) = cmsg.evnt_tag
      AND     sint.stus_ind         (+) = 'C'
      AND     setm.sett_mesg_tag    (+) = sint.sett_mesg_tag
      AND     comp.cmpy_num         (+) = setm.cmpy_num
      AND     trga.trga_code        (+) = comp.trga_code
      AND     seti.sett_istr_tag    (+) = setm.sett_istr_tag
      AND     trad.trad_tag         (+) = seti.trad_tag
      AND     trad.cmpy_num         (+) = seti.cmpy_num
      AND     cmsg_si.sett_mesg_tag (+) = sint.sett_mesg_tag
      ORDER by cmsg.seq_num;

Here is the relevant sections of the two execution plans:

The good:

SQL_ID 1s1vgw24g2u1k
 
Plan hash value: 3791039348
 
------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name       | Rows  | Cost  | Time  |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                  |            |       |  9192 |       |
|   1 |  FOR UPDATE                                       |            |       |       |       |
|   2 |   SORT ORDER BY                                   |            |     1 |  9192 | 00:46 |
|   3 |    NESTED LOOPS OUTER                             |            |     1 |  9191 | 00:46 |
|   4 |     NESTED LOOPS OUTER                            |            |     1 |  9190 | 00:46 |
|   5 |      NESTED LOOPS OUTER                           |            |     1 |  9188 | 00:46 |
|   6 |       NESTED LOOPS OUTER                          |            |     1 |  9187 | 00:46 |
|   7 |        NESTED LOOPS OUTER                         |            |     1 |  9186 | 00:46 |
|   8 |         HASH JOIN OUTER                           |            |     1 |  9184 | 00:46 |
|   9 |          NESTED LOOPS OUTER                       |            |     1 |    10 | 00:01 |
|  10 |           NESTED LOOPS                            |            |     1 |     7 | 00:01 |
|  11 |            VIEW                                   | VW_NSO_1   |     3 |     5 | 00:01 |
|  12 |             SORT UNIQUE                           |            |     1 |       |       |
|  13 |              COUNT STOPKEY                        |            |       |       |       |
|  14 |               VIEW                                |            |     3 |     5 | 00:01 |
|  15 |                SORT ORDER BY STOPKEY              |            |     3 |     5 | 00:01 |
|  16 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |     3 |     4 | 00:01 |
|  17 |                  INDEX RANGE SCAN                 | CMSG3_IDX  |     4 |     3 | 00:01 |
|  18 |            TABLE ACCESS BY USER ROWID             | CMSG       |     1 |     1 | 00:01 |
|  19 |           TABLE ACCESS BY INDEX ROWID             | SETI       |     1 |     3 | 00:01 |
|  20 |            INDEX RANGE SCAN                       | SETI_1     |     1 |     2 | 00:01 |
|  21 |          VIEW                                     |            | 14490 |  9173 | 00:46 |
|  22 |           SORT GROUP BY                           |            | 14490 |       |       |
|  23 |            CONCATENATION                          |            |       |       |       |
|  24 |             NESTED LOOPS                          |            |       |       |       |
|  25 |              NESTED LOOPS                         |            |   562 |  8472 | 00:43 |
|  26 |               NESTED LOOPS                        |            |   428 |  6757 | 00:34 |
|  27 |                NESTED LOOPS                       |            |   304 |    30 | 00:01 |
|  28 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30 | 00:01 |
|  29 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29 | 00:01 |
|  30 |                 INDEX UNIQUE SCAN                 | PRCO_2_IDX |     1 |     0 |       |
|  31 |                PARTITION LIST ALL                 |            |     1 |    24 | 00:01 |
|  32 |                 TABLE ACCESS BY LOCAL INDEX ROWID | SETI       |     1 |    24 | 00:01 |
|  33 |                  INDEX RANGE SCAN                 | SETI_3_IDX |    36 |    22 | 00:01 |
|  34 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     2 | 00:01 |
|  35 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     4 | 00:01 |
|  36 |             NESTED LOOPS                          |            |       |       |       |
|  37 |              NESTED LOOPS                         |            |    20 |   700 | 00:04 |
|  38 |               NESTED LOOPS                        |            |    15 |   640 | 00:04 |
|  39 |                NESTED LOOPS                       |            |    15 |   640 | 00:04 |
|  40 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30 | 00:01 |
|  41 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29 | 00:01 |
|  42 |                 TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2 | 00:01 |
|  43 |                  INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1 | 00:01 |
|  44 |                INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0 |       |
|  45 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     2 | 00:01 |
|  46 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     4 | 00:01 |
|  47 |         TABLE ACCESS BY INDEX ROWID               | SETM       |     1 |     2 | 00:01 |
|  48 |          INDEX UNIQUE SCAN                        | SETM_IDX   |     1 |     1 | 00:01 |
|  49 |        TABLE ACCESS BY INDEX ROWID                | COMP       |     1 |     1 | 00:01 |
|  50 |         INDEX UNIQUE SCAN                         | COMP_PK    |     1 |     0 |       |
|  51 |       TABLE ACCESS BY INDEX ROWID                 | TRGA       |     1 |     1 | 00:01 |
|  52 |        INDEX UNIQUE SCAN                          | TRGA_IDX   |     1 |     0 |       |
|  53 |      TABLE ACCESS BY GLOBAL INDEX ROWID           | SETI       |     1 |     2 | 00:01 |
|  54 |       INDEX UNIQUE SCAN                           | SETI_IDX   |     1 |     1 | 00:01 |
|  55 |     PARTITION LIST ITERATOR                       |            |     1 |     1 | 00:01 |
|  56 |      INDEX UNIQUE SCAN                            | TRAD_IDX   |     1 |     1 | 00:01 |
------------------------------------------------------------------------------------------------

And the downright bad and ugly:

SQL_ID 1s1vgw24g2u1k
 
Plan hash value: 1081606725
 
------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name       | Rows  | Cost  | Time  |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                  |            |       |   436K|       |
|   1 |  FOR UPDATE                                       |            |       |       |       |
|   2 |   SORT ORDER BY                                   |            |     1 |   436K| 36:24 |
|   3 |    NESTED LOOPS OUTER                             |            |     1 |   436K| 36:24 |
|   4 |     NESTED LOOPS OUTER                            |            |     1 |   436K| 36:24 |
|   5 |      NESTED LOOPS OUTER                           |            |     1 |   436K| 36:23 |
|   6 |       NESTED LOOPS OUTER                          |            |     1 |   436K| 36:23 |
|   7 |        NESTED LOOPS OUTER                         |            |     1 |   436K| 36:23 |
|   8 |         HASH JOIN OUTER                           |            |     1 |   436K| 36:23 |
|   9 |          NESTED LOOPS OUTER                       |            |     1 |    10 | 00:01 |
|  10 |           NESTED LOOPS                            |            |     1 |     7 | 00:01 |
|  11 |            VIEW                                   | VW_NSO_1   |     2 |     5 | 00:01 |
|  12 |             SORT UNIQUE                           |            |     1 |       |       |
|  13 |              COUNT STOPKEY                        |            |       |       |       |
|  14 |               VIEW                                |            |     2 |     5 | 00:01 |
|  15 |                SORT ORDER BY STOPKEY              |            |     2 |     5 | 00:01 |
|  16 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |     2 |     4 | 00:01 |
|  17 |                  INDEX RANGE SCAN                 | CMSG3_IDX  |     2 |     3 | 00:01 |
|  18 |            TABLE ACCESS BY USER ROWID             | CMSG       |     1 |     1 | 00:01 |
|  19 |           TABLE ACCESS BY INDEX ROWID             | SETI       |     1 |     3 | 00:01 |
|  20 |            INDEX RANGE SCAN                       | SETI_1     |     1 |     2 | 00:01 |
|  21 |          VIEW                                     |            |   719K|   436K| 36:23 |
|  22 |           SORT GROUP BY                           |            |   719K|       |       |
|  23 |            CONCATENATION                          |            |       |       |       |
|  24 |             NESTED LOOPS                          |            |       |       |       |
|  25 |              NESTED LOOPS                         |            | 27212 |   403K| 33:40 |
|  26 |               HASH JOIN                           |            | 20520 |   301K| 25:06 |
|  27 |                HASH JOIN                          |            | 14473 |    35 | 00:01 |
|  28 |                 INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1 | 00:01 |
|  29 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33 | 00:01 |
|  30 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32 | 00:01 |
|  31 |                PARTITION LIST ALL                 |            |    30M|   300K| 25:02 |
|  32 |                 TABLE ACCESS FULL                 | SETI       |    30M|   300K| 25:02 |
|  33 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3 | 00:01 |
|  34 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5 | 00:01 |
|  35 |             NESTED LOOPS                          |            |       |       |       |
|  36 |              NESTED LOOPS                         |            |   960 | 32673 | 02:44 |
|  37 |               NESTED LOOPS                        |            |   724 | 29046 | 02:26 |
|  38 |                NESTED LOOPS                       |            |   724 | 29045 | 02:26 |
|  39 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33 | 00:01 |
|  40 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32 | 00:01 |
|  41 |                 TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2 | 00:01 |
|  42 |                  INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1 | 00:01 |
|  43 |                INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0 |       |
|  44 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3 | 00:01 |
|  45 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5 | 00:01 |
|  46 |         TABLE ACCESS BY INDEX ROWID               | SETM       |     1 |     2 | 00:01 |
|  47 |          INDEX UNIQUE SCAN                        | SETM_IDX   |     1 |     1 | 00:01 |
|  48 |        TABLE ACCESS BY INDEX ROWID                | COMP       |     1 |     2 | 00:01 |
|  49 |         INDEX UNIQUE SCAN                         | COMP_PK    |     1 |     1 | 00:01 |
|  50 |       TABLE ACCESS BY INDEX ROWID                 | TRGA       |     1 |     1 | 00:01 |
|  51 |        INDEX UNIQUE SCAN                          | TRGA_IDX   |     1 |     0 |       |
|  52 |      TABLE ACCESS BY GLOBAL INDEX ROWID           | SETI       |     1 |     2 | 00:01 |
|  53 |       INDEX UNIQUE SCAN                           | SETI_IDX   |     1 |     1 | 00:01 |
|  54 |     PARTITION LIST ITERATOR                       |            |     1 |     1 | 00:01 |
|  55 |      INDEX UNIQUE SCAN                            | TRAD_IDX   |     1 |     1 | 00:01 |
------------------------------------------------------------------------------------------------
 
Note
-----
   - dynamic sampling used for this statement (level=4)
 

The trouble with specific examples is that sometimes they’re a bit longer than ideal (as well as being a bit out of context).

Let me highlight what I see as the problem:

  • There is a non-merged view with an OR (which is expanded to a CONCATENATION)
  • In the good plan, @ line 33, one part of the OR condition is dealt with via an INDEX RANGE SCAN on SETI_3_IDX.
  • In the bad plan, @ line 32, this is a TABLE ACCESS FULL on SETI.
  • There’s a dynamic sampling note in the bad plan

Good:

|  21 | VIEW                                    |            | 14490 |  9173   (1)| 00:00:46 |
...
|  26 |     NESTED LOOPS                        |            |   428 |  6757   (1)| 00:00:34 |
|  27 |      NESTED LOOPS                       |            |   304 |    30   (0)| 00:00:01 |
|  28 |       TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30   (0)| 00:00:01 |
|  29 |        INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29   (0)| 00:00:01 |
|  30 |       INDEX UNIQUE SCAN                 | PRCO_2_IDX |     1 |     0   (0)|          |
|  31 |      PARTITION LIST ALL                 |            |     1 |    24   (0)| 00:00:01 |
|  32 |       TABLE ACCESS BY LOCAL INDEX ROWID | SETI       |     1 |    24   (0)| 00:00:01 |
|  33 |        INDEX RANGE SCAN                 | SETI_3_IDX |    36 |    22   (0)| 00:00:01 |
...

Bad:

|  21 | VIEW                                    |            |   719K|   436K  (3)| 00:36:23 |
...
|  26 |     HASH JOIN                           |            | 20520 |   301K  (4)| 00:25:06 |
|  27 |      HASH JOIN                          |            | 14473 |    35   (3)| 00:00:01 |
|  28 |       INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1   (0)| 00:00:01 |
|  29 |       TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33   (0)| 00:00:01 |
|  30 |        INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|  31 |      PARTITION LIST ALL                 |            |    30M|   300K  (4)| 00:25:02 |
|  32 |       TABLE ACCESS FULL                 | SETI       |    30M|   300K  (4)| 00:25:02 |
...
Note
-----
   - dynamic sampling used for this statement (level=4)

It’s always nice to simplify a situation and because the problem seems to be in this non-merged view, we can actually eliminate half the sql and then see if we can just reproduce the problem by adjusting the estimates.

Which we can, and via trial and error, we see that as we approach an estimate of 14000 rows from CMSG, the plan changes to the FULL TABLE SCAN of SETI:

SQL> explain plan for
  2                 SELECT /*+
  3                          opt_estimate(table cmsg_il rows=14000)
  4                          */
  5                        COUNT(*) count,
  6                        setm_il.sett_mesg_tag sett_mesg_tag
  7                 FROM   prco    prco_il,
  8                        cmsg    cmsg_il,
  9                        setm    setm_il,
 10                        seti    seti_il
 11                 WHERE  cmsg_il.unacknowledged = prco_il.unack_code
 12                 AND    cmsg_il.evnt_type = 'SI'
 13                 AND   (cmsg_il.evnt_tag = seti_il.sett_istr_tag OR
 14                        cmsg_il.evnt_tag = seti_il.amending_si_tag)
 15                 AND    setm_il.sett_istr_tag = seti_il.sett_istr_tag
 16                 GROUP BY setm_il.sett_mesg_tag;

Explained.

SQL> select * from table(dbms_xplan.display);
Plan hash value: 273475595

-----------------------------------------------------------------------------------------------
| Id  | Operation                                | Name       | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |            |   695K|   432K  (3)| 00:36:01 |
|   1 |  HASH GROUP BY                           |            |   695K|            |          |
|   2 |   CONCATENATION                          |            |       |            |          |
|   3 |    NESTED LOOPS                          |            |       |            |          |
|   4 |     NESTED LOOPS                         |            | 26323 |   400K  (3)| 00:33:23 |
|*  5 |      HASH JOIN                           |            | 19850 |   301K  (4)| 00:25:06 |
|*  6 |       HASH JOIN                          |            | 14000 |    35   (3)| 00:00:01 |
|   7 |        INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1   (0)| 00:00:01 |
|*  8 |        TABLE ACCESS BY INDEX ROWID       | CMSG       | 14000 |    33   (0)| 00:00:01 |
|*  9 |         INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|  10 |       PARTITION LIST ALL                 |            |    30M|   300K  (4)| 00:25:02 |
|  11 |        TABLE ACCESS FULL                 | SETI       |    30M|   300K  (4)| 00:25:02 |
|* 12 |      INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3   (0)| 00:00:01 |
|  13 |     TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5   (0)| 00:00:01 |
|  14 |    NESTED LOOPS                          |            |       |            |          |
|  15 |     NESTED LOOPS                         |            |   928 | 31605   (1)| 00:02:39 |
|  16 |      NESTED LOOPS                        |            |   700 | 28097   (1)| 00:02:21 |
|  17 |       NESTED LOOPS                       |            |   700 | 28097   (1)| 00:02:21 |
|* 18 |        TABLE ACCESS BY INDEX ROWID       | CMSG       | 14000 |    33   (0)| 00:00:01 |
|* 19 |         INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|* 20 |        TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2   (0)| 00:00:01 |
|* 21 |         INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1   (0)| 00:00:01 |
|* 22 |       INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0   (0)| 00:00:01 |
|* 23 |      INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3   (0)| 00:00:01 |
|  24 |     TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("CMSG_IL"."EVNT_TAG"="SETI_IL"."AMENDING_SI_TAG")
   6 - access("CMSG_IL"."UNACKNOWLEDGED"="PRCO_IL"."UNACK_CODE")
   8 - filter("CMSG_IL"."EVNT_TYPE"='SI')
   9 - filter("CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL)
  12 - access("SETM_IL"."SETT_ISTR_TAG"="SETI_IL"."SETT_ISTR_TAG")
  18 - filter("CMSG_IL"."EVNT_TYPE"='SI')
  19 - filter("CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL)
  20 - filter(LNNVL("CMSG_IL"."EVNT_TAG"="SETI_IL"."AMENDING_SI_TAG"))
  21 - access("CMSG_IL"."EVNT_TAG"="SETI_IL"."SETT_ISTR_TAG")
  22 - access("CMSG_IL"."UNACKNOWLEDGED"="PRCO_IL"."UNACK_CODE")
  23 - access("SETM_IL"."SETT_ISTR_TAG"="SETI_IL"."SETT_ISTR_TAG")

46 rows selected.

SQL> 

So, optimizer environment differences and bind variable peeking are pretty much ruled out as influencing factors.

What remains is that the bad plan has a higher estimate of rows than the good plan and the bad plan has dynamic sampling.

Both SQL statements are the same.

Dynamic sampling level is the same for both (4 set at system level).

Dynamic sampling should be being applied because of the parameter setting and because there are two single table access predicates
- one from the predicate on CMSG_IL.EVENT_TYPE and one implicit IS NOT NULL predicate on the join column UNACKNOWLEDGED.

So why would we see DS in one plan and not anohther?

If we do a 10053 trace of the distilled non-mergeable view, we see:

Access path analysis for CMSG
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for CMSG[CMSG_IL]

*** 2012-06-19 14:32:04.576
** Performing dynamic sampling initial checks. **
  Column (#4):
    NewDensity:0.000062, OldDensity:0.000000 BktCnt:8090, PopBktCnt:8084, PopValCnt:46, NDV:80
  Column (#4): EVNT_TYPE(
    AvgLen: 4 NDV: 80 Nulls: 0 Density: 0.000062
    Histogram: Freq  #Bkts: 52  UncompBkts: 8090  EndPtVals: 52
** Dynamic sampling initial checks returning TRUE (level = 4).

*** 2012-06-19 14:32:04.576
** Generated dynamic sampling query:
    query text :
SELECT /* OPT_DYN_SAMP */ 
       /*+ 
         ALL_ROWS 
         IGNORE_WHERE_CLAUSE 
         NO_PARALLEL(SAMPLESUB) 
         opt_param('parallel_execution_enabled', 'false') 
         NO_PARALLEL_INDEX(SAMPLESUB) 
         NO_SQL_TUNE 
         */ 
       NVL(SUM(C1),0), 
       NVL(SUM(C2),0) 
FROM (SELECT /*+ 
               IGNORE_WHERE_CLAUSE NO_PARALLEL("CMSG_IL") 
               FULL("CMSG_IL") 
               NO_PARALLEL_INDEX("CMSG_IL") 
               */ 
             1 AS C1, 
             CASE WHEN "CMSG_IL"."EVNT_TYPE"=:B1 
                  AND  "CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL 
                  THEN 1 
                  ELSE 0 
             END AS C2 
FROM "CMSG" SAMPLE BLOCK (0.002711 , 1) SEED (1) "CMSG_IL") SAMPLESUB

*** 2012-06-19 14:32:04.612
** Executed dynamic sampling query:
    level : 4
    sample pct. : 0.002711
    actual sample size : 1756
    filtered sample card. : 0
    orig. card. : 75899119
    block cnt. table stat. : 1118909
    block cnt. for sampling: 1143461
    max. sample block cnt. : 32
    sample block cnt. : 31
    min. sel. est. : 0.00000001
** Not using dynamic sampling for single table sel. or cardinality.

So, the results of dynamic sampling are discarded as unreliable and the estimates fall back on using the stored object stats.

But at some point (one of the hard parses) the data returned in the sample must have found some relevant data and been considered reliable which led to the “bad” plan being executed repeatedly in inappropriate circumstances.

Two footnotes to this investigation.

Firstly, when you have tables with fast-changing data and row counts, there are usually two choices in approach for statistics
- either find a set of stats that gives the best (balance of) plans and lock them.
- or use dynamic sampling

The scenario above shows the downside of the dynamic sampling approach.
It may well be that the plan hash 1081606725 was good for the initial execution(s) of the statement when there maybe were a large number of outstanding messages but it very quickly became unsuitable given the frequent executions.

Secondly, I mentioned that this was not a SQL rewriting exercise, but, if it was, I’m not a great fan of the outer-joined inline view. It’s not merged but it seems sensible that a predicate be pushed into view.
Pushing the predicate, whether happening by default or hinted explicitly, would give a subsection of the plan like:

|  20 | VIEW PUSHED PREDICATE                  |            |     1 |    38   (0)| 00:00:01 |
|  21 |  SORT GROUP BY                         |            |     1 |    38   (0)| 00:00:01 |
|  22 |   NESTED LOOPS                         |            |     1 |    38   (0)| 00:00:01 |
|  23 |    NESTED LOOPS                        |            |     1 |    38   (0)| 00:00:01 |
|  24 |     NESTED LOOPS                       |            |     1 |     5   (0)| 00:00:01 |
|  25 |      TABLE ACCESS BY INDEX ROWID       | SETM       |     1 |     3   (0)| 00:00:01 |
|* 26 |       INDEX UNIQUE SCAN                | SETM_IDX   |     1 |     2   (0)| 00:00:01 |
|  27 |      TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |    30M|     2   (0)| 00:00:01 |
|* 28 |       INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1   (0)| 00:00:01 |
|* 29 |     TABLE ACCESS BY INDEX ROWID        | CMSG       |     1 |    33   (0)| 00:00:01 |
|* 30 |      INDEX FULL SCAN                   | CMSG_IDX   |     1 |    32   (0)| 00:00:01 |
|* 31 |    INDEX UNIQUE SCAN                   | PRCO_2_IDX |     1 |     0   (0)| 00:00:01 |

However, I’m not keen on explicitly hinting this nor leaving it to chance and as it’s outer joined, I would strongly consider using a scalar subselect:

      WITH cmsg AS
           (SELECT *
            FROM   (SELECT cmsg.evnt_tag,
                           cmsg.evnt_type,
                           cmsg.seq_num
                    FROM   cmsg
                    WHERE  unacknowledged = :p_unack_code
                    AND    evnt_type != 'SOD'
                    ORDER BY seq_num)
            WHERE rownum <= :p_batch_lim)
      SELECT  ...,
              (SELECT COUNT(*)
               FROM   proc    prco_il,
                      cmsg    cmsg_il,
                      setm    setm_il,
                      seti    seti_il
               WHERE  cmsg_il.unacknowledged = prco_il.unack_code
               AND    cmsg_il.evnt_type      = 'SI'
               AND   (cmsg_il.evnt_tag       = seti_il.sett_istr_tag
               OR     cmsg_il.evnt_tag       = seti_il.amending_si_tag)
               AND    setm_il.sett_istr_tag  = seti_il.sett_istr_tag
               AND    setm_il.sett_mesg_tag  = sint.sett_mesg_tag) count
      FROM    cmsg,
              sint,
              setm,
              comp,
              trga,
              seti,
              trad
      WHERE   sint.sint_tag      (+) = cmsg.evnt_tag
      AND     sint.stus_ind      (+) = 'C'
      AND     setm.sett_mesg_tag (+) = sint.sett_mesg_tag
      AND     comp.cmpy_num      (+) = setm.cmpy_num
      AND     trga.trga_code     (+) = comp.trga_code
      AND     seti.sett_istr_tag (+) = setm.sett_istr_tag
      AND     trad.trad_tag      (+) = seti.trad_tag
      AND     trad.cmpy_num      (+) = seti.cmpy_num
      ORDER by cmsg.seq_num;

Flippin’ Witch Hunt – adaptive direct path read

Plan flips… here seems to be general impression that most application performance issues are going to be caused by some sort of “plan flip” and that “plan flips” are bad.

  • If you don’t use plan stability features, should you really expect plan stability?
  • Even outside your stats job window whenever that is?
  • What if you’re using bind variables with histograms?
  • And you’ve got features like dynamic sampling which will probably get a slightly different sample for each hard parse?
  • And if you’ve also got cardinality feedback kicking in?

The old battle between stability and optimisation, I suppose.

I made a throwaway comment on a forum thread about having seen a performance problem caused by a sql statement which changed to doing direct path IO.

I thought it might be vaguely interesting to do a quick blog post and back that up with a tiny bit of “evidence”.

This was getting on for three weeks ago now (29th May) and there were bigger fish to fry at the time so some of the evidence and some of the accuracy of the evidence might wont a little but I love having a long AWR retention and the outputs below are what ASH & AWR can tell us now about what happened back then.

Problem:
A particular sql statment was running slowly.
Initial analysis had identified a change in the plan and that change was caused by cardinality feedback and changes in plans are bad of course ;)

Looking at DBA_HIST_SQLSTAT, I could see that the SQL regularly flip-flopped between two plans, one of which was the plan normally received after a hard parse, the other a change brought on by cardinality feedback.

But there really wasn’t much to choose between the plans in terms of statistics and averages.

In the execution plan, we can also skip specifics, the only relevant detail that there was a FULL TABLE SCAN which was the inner part (i.e. executed multiple times for each row in the driving / outer rowsource) of a NESTED LOOP – something which is rarely going to be a good thing, i.e.

NESTED LOOP
  some row source
  FULL TABLE SCAN

However, if we cross-reference DBA_HIST_ACTIVE_SESS_HISTORY, then we see the critical differences between the executions of this FULL TABLE SCAN on the 28th / 29th May and other days.

If you don’t like averages then I don’t blame you but you won’t like this query:

with ash as
(select trunc(ash.sample_time) snap_day
 ,      min(ash.snap_id) min_snap_id
 ,      max(ash.snap_id) max_snap_id
 ,      ash.current_obj# obj_id
 ,      ash.sql_id
 ,      ash.sql_plan_hash_value
 ,      ash.sql_plan_operation
 ,      ash.sql_plan_options
 ,      ash.current_obj#
 ,      ash.event
 ,      count(distinct ash.sql_exec_id) ash_cnt
 from   dba_hist_active_sess_history ash
 where  ash.sql_id              = 'drbkfkfxg6a6a'
 and    ash.sql_plan_operation  = 'TABLE ACCESS'
 and    ash.sql_plan_options    = 'FULL'
 and    ash.current_obj#        = 169748
 and    ash.sample_time        >= to_date('01/05/2012','DD/MM/YYYY')
 and    ash.event              is not null
 group by
        trunc(ash.sample_time)
 ,      ash.sql_plan_hash_value
 ,      ash.sql_id
 ,      ash.sql_plan_operation
 ,      ash.sql_plan_options
 ,      ash.current_obj#
 ,      ash.event)
select ash.snap_day
,      ash.event
,      ash.ash_cnt
,      (select round(sum(elapsed_time_delta) / sum(executions_delta) /1000/1000)
        from   dba_hist_sqlstat   st
        where  st.sql_id        = ash.sql_id
        and    st.snap_id between ash.min_snap_id
                          and     ash.max_snap_id) avg_elp
,      (select round(sum(rows_processed_delta) / sum(executions_delta))
        from   dba_hist_sqlstat   st
        where  st.sql_id        = ash.sql_id
        and    st.snap_id between ash.min_snap_id
                          and     ash.max_snap_id) avg_rws
,      round((1275068416 -
       (select sum(space_allocated_delta)
        from   dba_hist_seg_stat st
        where  st.obj#          = ash.obj_id
        and    st.snap_id      >= ash.min_snap_id))
      / 1024/1024) "SPACE_USED (ISH)"
,      (select min(bytes)/1024/1024
        from   dba_hist_sgastat st
        where  name             = 'buffer_cache'
        and    st.snap_id between ash.min_snap_id
                          and     ash.max_snap_id) min_bc
from   ash
order by snap_day;

What am I trying to see:

  • Historic running of a particular sql statment
  • Focusing on this FULL TABLE SCAN operation which was a problem on the 28th / 29th
  • Occurences in ASH (be very careful of reading too much into samples of samples and aggregates and averages)
  • What the statement tended to wait on and a feel for how much waiting (beware of sampled samples)
  • And finally a rough approximation of how big the object was relative to the buffer cache (we know what size it is now (which I’ve plugged into the query) so work back through the growth in segment stats. It’s using space allocated because I don’t have access to dbms_space to get space used and space allocated is going to be a high estimate of actual space used – it certainly shouldn’t be more than that)
SNAP_DAY  EVENT                               ASH_CNT    AVG_ELP SPACE_USED (ISH)     MIN_BC
--------- -------------------------------- ---------- ---------- ---------------- -------
01-MAY-12 db file scattered read                    1         19             1152      61696
01-MAY-12 db file sequential read                   1         19             1152      61696
01-MAY-12 db file sequential read                   1         25             1152      61696
02-MAY-12 db file sequential read                   1          8             1152      61696
02-MAY-12 db file scattered read                    1          8             1152      61696
07-MAY-12 db file scattered read                    1          7             1152      61696
08-MAY-12 db file scattered read                    1         24             1152      61696
08-MAY-12 db file sequential read                   1         24             1152      61696
09-MAY-12 db file sequential read                   1        100             1152      61696
09-MAY-12 db file scattered read                    1        100             1152      61696
10-MAY-12 db file sequential read                   1          6             1152      61696
10-MAY-12 db file scattered read                    1          6             1152      61696
11-MAY-12 db file sequential read                   1          7             1152      61696
11-MAY-12 db file scattered read                    1         13             1152      61696
11-MAY-12 db file sequential read                   1         13             1152      61696
15-MAY-12 db file scattered read                    1        148             1152      62208
16-MAY-12 db file scattered read                    1          8             1152      62464
16-MAY-12 db file sequential read                   2          5             1152      62464
17-MAY-12 db file scattered read                    1         17             1152      62208
17-MAY-12 db file sequential read                   1         17             1152      62208
18-MAY-12 db file sequential read                   2         17             1152      62208
18-MAY-12 db file scattered read                    2         17             1152      62208
21-MAY-12 db file scattered read                    1         16             1216      62208
21-MAY-12 db file scattered read                    1          6             1216      62208
22-MAY-12 db file scattered read                    1         19             1216      62208
23-MAY-12 db file scattered read                    3         17             1216      62208
23-MAY-12 read by other session                     1         15             1216      62208
24-MAY-12 db file scattered read                    1         28             1216      61696
25-MAY-12 db file scattered read                    1         67             1216      62208
25-MAY-12 db file sequential read                   1         67             1216      62208
28-MAY-12 direct path read                          3         45             1216      62464
28-MAY-12 direct path read                         96         70             1216      62464
29-MAY-12 direct path read                         10        484             1216      62464
29-MAY-12 direct path read                        137        643             1216      62464
29-MAY-12 enq: KO - fast object checkpoint          1        545             1216      62464

35 rows selected.

SQL> 

The FULL TABLE SCAN as the outer rowsource of a NESTED LOOP is a performance threat anyway but in summary, a performance problem, irrelevant plan stability and a key switch to direct path IO, not obviously caused by a significant growth in object size.

Correlation / causation

On Tuesday morning I walked into a right sh*t storm.

I hadn’t been called but on Monday night performance had been absolutely catastrophic – SLAs missed left, right and centre.

Straight into a meeting where the following observations were made:

  • Performance was somewhat suboptimal the previous night
  • Some long-running queries were observed that aren’t normally top of the lists, these were “uncontrolled” access via Excel or Access, not via the reporting Front End or other application processes
  • When these queries stopped, performance returned to normal

So:

  • Why had these queries brought the database to its knees?
  • Was it stats and the stats job on Sunday night?
  • Was it “plan flips”? (I hear a lot about “plan flips” and them being “bad” – do people really expect plan stability without plan stability features?)
  • Was it cardinality feedback? (Admittedly this does seem to have caused us more positives than negatives particularly when working in conjunction with dynamic sampling when I’m not sure it should be).

At this point, I still hadn’t seen any evidence or symptoms of the issue, just hearsay.

So, what did the problem look like?

Below is a snippet from an AWR report from a “normal” period.
(It’s longer than usually advisable – I just want to show the symptoms over the problem period)

Platform                         CPUs Cores Sockets Memory(GB)
-------------------------------- ---- ----- ------- ----------
Linux x86 64-bit                   24    12       2     141.44

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:      6678 21-May-12 21:30:26       584      22.8
  End Snap:      6682 21-May-12 23:30:33       565      24.6
   Elapsed:              120.13 (mins)
   DB Time:              582.50 (mins)

Foreground Wait Events
                                                             Avg
                                        %Time Total Wait    wait    Waits   % DB
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
db file sequential read       2,664,063     0      7,050       3      2.5   20.2
log file sync                 1,027,185     0      3,501       3      1.0   10.0
direct path read                159,706     0      1,191       7      0.2    3.4
db file scattered read           61,649     0        284       5      0.1     .8

Background Wait Events
                                                             Avg
                                        %Time Total Wait    wait    Waits   % bg
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
log file parallel write         839,190     0      1,465       2      0.8   23.6
db file parallel write          551,198     0        674       1      0.5   10.9
Log archive I/O                  21,277     0        311      15      0.0    5.0
log file sequential read          6,714     0        150      22      0.0    2.4
control file parallel writ       14,311     0         48       3      0.0     .8

and this was what had happened:

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:      7014 28-May-12 21:30:28       560      22.8
  End Snap:      7018 28-May-12 23:30:01       538      23.7
   Elapsed:              119.54 (mins)
   DB Time:            2,067.54 (mins)

Foreground Wait Events
                                                             Avg
                                        %Time Total Wait    wait    Waits   % DB
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
log file sync                   399,560     0     83,938     210      1.0   67.7
direct path read                123,631     0     15,639     126      0.3   12.6
db file sequential read         363,158     0      9,406      26      0.9    7.6
db file scattered read           93,140     0      4,083      44      0.2    3.3

Background Wait Events
                                                             Avg
                                        %Time Total Wait    wait    Waits   % bg
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
db file parallel write          394,366     0      8,048      20      1.0   37.6
log file parallel write         101,489     0      6,197      61      0.2   28.9
Log archive I/O                  12,026     0      1,814     151      0.0    8.5
control file parallel writ       11,240     0      1,793     160      0.0    8.4
log file sequential read          9,587     0      1,179     123      0.0    5.5

So, hang on a second.

We think that a couple of long-running queries caused this devastation – log file syncs out to 70x our chosen “normal” above and no CPU issues? All IO basically choked.

Doesn’t make sense – wouldn’t that take some serious misconfiguration?

And yet, how could this be a storage issue if it all resolved itself when those long-running queries finished? It was this doubt that meant we spent longer than needed checking and double checking this box and this database before contacting the storage team.

Then to cut a long story short and to miss out a whole bunch of toing and froing and whys and wherefores… the storage team put the focus on another of our databases that uses the same disk groups.

That database didn’t seem to be doing very much:

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:      8271 28-May-12 21:00:33        71        .9
  End Snap:      8273 28-May-12 23:00:17        72        .9
   Elapsed:              119.74 (mins)
   DB Time:              155.20 (mins)

Although the few things it was doing were similarly slow:

                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
control file sequential read        272,003       6,778     25   72.8 System I/O
db file sequential read              36,346       1,409     39   15.1 User I/O
db file parallel read                 1,127         797    707    8.6 User I/O
log file sync                           300         253    843    2.7 Commit
DB CPU                                               72            .8

But, hang on a sec. What’s that:

Time Model Statistics

Statistic Name                                       Time (s) % of DB Time
------------------------------------------ ------------------ ------------
RMAN cpu time (backup/restore)                       42,305.3        454.3

and

Background Wait Events
                                                             Avg
                                        %Time Total Wait    wait    Waits   % bg
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
RMAN backup & recovery I/O    1,166,261     0     72,242      62 2.65E+04   60.2

Turns out that it was an RMAN backup, running when it shouldn’t have been, on a 3.5T archive database on one of the other nodes of the cluster and which uses the same disk groups.

SQL> select to_char(start_time,'DD-MON-YYYY HH24:MI')
  2  ,      to_char(end_time,'DD-MON-YYYY HH24:MI')
  3  ,      input_bytes_display
  4  ,      output_bytes_display
  5  ,      time_taken_display  
  6  from v$rman_backup_job_details  
  7  where session_stamp = 784494089  
  8  order by start_time desc  
  9  /

TO_CHAR(START_TIM TO_CHAR(END_TIME, INPUT_BYTES_DISPLAY  OUTPUT_BYTES_DISPLAY TIME_TAKEN_DISPLAY
----------------- ----------------- -------------------- -------------------- --------------------
28-MAY-2012 19:01 28-MAY-2012 23:47     2.99T              532.82G            04:46:30

SQL> 

That correlation was the causation, not those queries and this backup finished just before those long-running queries finished and then everything was ok.

But the long-running queries were just a red-herring and another symptom of everything that was running on the live database was choking.

Sometimes it’s difficult to distinguish between cause and symptom.

Poll: SQL Plan Management

I’m interested in what SQL Plan Management features people are actively using.
Read more of this post

Test config changes for Veritas CIO and Linux IO Scheduler

Just a short note to report on the impact that some config changes have made on IO times in a specific environment for a specific workload.

I mentioned previously that I’m working on an upgrade of an application from 9.2.0.8 to 11.2.0.2

  • There are changes in pretty much every area – new hardware, different OS, etc, etc.
  • We’re using a full-volume UAT environment within the new set-up to compare new against old (production) and that will form the main basis for performance changes to the application required for this upgrade.
  • It’s pretty much an apples vs oranges comparison and not helped by the fact that UAT runs on tier 2 storage to be compared against the current tier 1 storage – UAT IO is slow.
  • In summary, not exactly best practice upgrade approach – but that’s just how it is sometimes in the real world.

Anyway, anyway, anyway… we’ve been waiting for recommendations and official go-ahead from the database engineering group who run the tests and control the builds of the machines and the following config changes have been made to the following:

  • Veritas VxFS CIO
  • Linux Deadline scheduler

Ideally such changes would be made individually to gauge their individual impact, however, as mentioned, it’s not always like that is it?.

And on the UAT environments above, based on a before-flashback-after run of the main application workload, the following IO times were observed:

Wait Event Average wait time (before) Average wait time (after)
db file sequential read 10 ms 7 ms
db file scattered read 21 ms 11 ms
db file parallel read 38 ms 60 ms
direct path read 95 ms 64 ms
direct path read temp 32 ms 9 ms
direct path write temp 34 ms 8 ms
log file sync 9 ms 3 ms

I’m not convinced that we have a level of control over the whole environment and time to deliver change incrementally to read too much into a single before/after comparison of the same workload, however these initial findings were better than I expected.
(Bottom line is that it’s still apples vs oranges)

So… it may be that someone finds this useful.

“The Correct Plan”

Part I – What is “The Correct Plan”?

How many SQL statements are genuinely straightforward enough that you can look at it and instantly say what “The Correct Plan” is?

When you make this judgement on “The Correct Plan”, do you take into account index clustering factors, single and multiblock read times, etc, etc? Should you?

What about new features, new access paths, new join methods, does “The Correct Plan” take these into account?

Or perhaps you think it is independent of them!?!?

What about if you’ve got a plan that you’re happy with on production? Is that “The Correct Plan”?

So maybe this is just semantics?

What people mean when they say “The Correct Plan” and “The Wrong Plan/s” is probably “An Acceptable Plan” and “An Unacceptable Plan”.

Maybe I need to get over the terminology, move on.

There might be something better, there might be something worse, but this plan is what they’re happy with at the moment.

This plan is acceptable.

Part II – “Hi, my name is Bob and I’m addicted to hinting.”

Why am I going on about this?

It’s a follow-up to my previous post on some frustrations on a SQL Tuning gig.

One of the reasons I blog is that it has carthartic properties. But it’s not carthartic enough – I can’t stop going on and on about the same stuff which drives me to distraction day in, day out.

I’m in a team that is addicted to hinting.

Developers addicted to hinting.

Managers addicted to developers hinting.

In fact, I can’t recall being somewhere with a greater affliction of hinting addiction.

And they are insistent on me hinting, all day, every day.

I’ve tried but they won’t give it up. They’ve got to want to give it up.

It is “The Silver Bullet” and “The Comfort Blanket”, whilst in reality being neither.

And hinting begets more hinting. See Rules for Hinting.

For a big statement, have you ever tried hinting it manually to get “The Correct Plan” consistently? There might be shortcuts by nicking the full set of hints from an outline or a profile or other_xml depending on version, but it’s hideous, it’s long winded, and at some point it’s likely to “go bad”.

I’ve tried to get the team to look at baselines as a way to lock in “The Correct Plan” but they have huge – quite possibly insurmountable (in terms of willingness to surmount them) – FUD about how these will fit into the development lifecycle, the release management process, change control.

The first nail in the coffin of getting them to experiment with baselines was a performance bug with recursive merge statements into the sqlobj$* tables. Mentioned that before. Several times.

That now having been fixed, the second nail in the coffin happened today.

There’s a lot of Pro*C code. I managed to get them to compile some of it with common_parser=yes so that, for the SQL that is inline in the Pro*C and not in a stored proc, we could use some of these new-fangled features that have come out since something like Oracle 8, you know like scalar subqueries, WITH, analytics even LISTAGG for example.

But this has had the nasty side-effect of materially changing some of the SQL statements – particularly around removing unnecessary parentheses around predicates – such that both the force and exact matching signatures of certain SQL statements changed. So, the few statements we were trying with baselines failed to use the baselined plans and “went bad”.

Not necessarily a problem in itself but adding to the insurmountable FUD…

Part III – Ignore the Plan

A while ago Doug Burns was talking about getting developers to ignore Cost.

Sometimes – not all the time, but most of the time – I would go further than that – “Ignore the plan”.

Obviously I don’t really mean “Ignore the plan”, at least not the whole plan.

When, for example, a statement performs acceptably or better than acceptably (which will then instantly become the new “acceptably”) – who cares about the plan at all (until it performs badly).

However, when performance problems set in, maybe there’s a tendency sometimes to get too hung up on whether the starting point should be a nested loop driven by an index range scan of index_1, etc?

A lot of the time, I bet that your idea of “The Correct Plan” is based on heuristics, a set of rules, a bit like the RBO but probably not as effective.

And there’s a reason why those similar heuristics in the RBO have been discarded.

So maybe Ignore the Plan?

Focus on the row estimates, the cardinalities.

If these are accurate, at least nine times out of ten, you’ll get an appropriate plan, “An Acceptable Plan”.

And if they’re not accurate, before looking at anything else, review the SQL.

Is that the best way to express the logic? There’s a good chance it’s not.

In the 9i to 11gR2 upgrade that I’m currently involved with, most of the SQL with the biggest performance problems and which deviate from “The Correct Plan”, can be rewritten, expressed in a slightly different, sometimes better, sometimes more natural, logical, set-based way.

And often, that’s enough.

Follow

Get every new post delivered to your Inbox.

Join 68 other followers