Flip Flop Hooray, Ho

(Title reference for any old skool hip hop fans?)

An investigation into a flip-flopping SQL statement and the downside of dynamic sampling – why it can be a problem for fast-changing tables and frequently executed SQL.

Yesterday morning a sql statement which processes outstanding messages was having performance problems.

There were two child cursors being executed by the four concurrent sessions, depending on their optimizer settings.

I’m not going to detail the optimizer environment differences because they’re mostly irrelevant, other than that, because of the child cursors, there was a hard parse at time T0 and a hard parse at time T1, and then these cursors were executed repeatedly / regularly over a number of hours.
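
Incidentally, if you want to see exactly which settings differ between child cursors, V$SQL_OPTIMIZER_ENV is one place to look – a quick sketch using this example’s sql_id:

select child_number
,      name
,      value
from   v$sql_optimizer_env
where  sql_id = '1s1vgw24g2u1k'
order by name, child_number;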

SQL statistics show this:

SQL> select to_char(sn.end_interval_time,'DD/MM/YYYY HH24:MI') dt
  2  ,      st.plan_hash_value plan_hash
  3  ,      rows_processed_delta rws
  4  ,      executions_delta     execs
  5  ,      elapsed_time_delta   elp
  6  ,      cpu_time_delta       cpu
  7  ,      buffer_gets_delta    gets
  8  ,      iowait_delta         io
  9  from   dba_hist_snapshot sn
 10  ,      dba_hist_sqlstat  st
 11  where  st.snap_id            = sn.snap_id
 12  and    st.sql_id             = '1s1vgw24g2u1k'
 13  and    st.elapsed_time_delta > 0
 14  and    sn.end_interval_time between to_timestamp('18/06/2012 09:00','DD/MM/YYYY HH24:MI')
 15                              and     to_timestamp('18/06/2012 10:00','DD/MM/YYYY HH24:MI')
 16  order by sn.snap_id, st.plan_hash_value; 

DT                PLAN_HASH        RWS      EXECS        ELP        CPU       GETS         IO
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
18/06/2012 09:00 1081606725       1525        182 2163175676  101353586    3938349 2062115102
18/06/2012 09:00 3791039348         48        299     449781     380944      71472      68103
18/06/2012 09:30 1081606725       2032         24 3019992733  146890665   12995474 2886335201
18/06/2012 09:30 3791039348         43        289   73670959   73468837   56740610      24170

SQL> 

Whereas this is representative of “normal”:

DT                PLAN_HASH        RWS      EXECS        ELP        CPU       GETS         IO
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
15/06/2012 09:00 3791039348       1289       1077  174434637   16030542    5470115     280865

SQL> 

I mentioned that this code was processing outstanding messages.
These different processes aren’t balanced and aren’t expected to be.

But plan hash 3791039348 is good for everyone and in general plan hash 1081606725 suits no-one.

Here’s the sanitised SQL – remember this isn’t so much a SQL tuning/rewriting exercise as a root-cause-of-slowness exercise (where the root cause is not initially meant to be “because the SQL is not brilliant”):

      SELECT  ....
      FROM    cmsg,
              sint,
              setm,
              comp,
              trga,
              seti,
              trad,
              (SELECT COUNT(*) count,
                      setm_il.sett_mesg_tag sett_mesg_tag
               FROM   prco prco_il,
                      cmsg cmsg_il,
                      setm setm_il,
                      seti seti_il
               WHERE  cmsg_il.unacknowledged = prco_il.unack_code
               AND    cmsg_il.evnt_type      = 'SI'
               AND   (cmsg_il.evnt_tag       = seti_il.sett_istr_tag 
               OR     cmsg_il.evnt_tag       = seti_il.amending_si_tag)
               AND    setm_il.sett_istr_tag  = seti_il.sett_istr_tag
               GROUP BY setm_il.sett_mesg_tag) cmsg_si
      WHERE   cmsg.rowid IN (SELECT row_id
                             FROM (SELECT rowid row_id
                                   FROM   cmsg  cmsg2
                                   WHERE  unacknowledged = :p_unack_code
                                   AND    evnt_type     != 'SOD'
                                   ORDER BY seq_num)
                             WHERE rownum <= :p_batch_lim)
      AND     sint.sint_tag         (+) = cmsg.evnt_tag
      AND     sint.stus_ind         (+) = 'C'
      AND     setm.sett_mesg_tag    (+) = sint.sett_mesg_tag
      AND     comp.cmpy_num         (+) = setm.cmpy_num
      AND     trga.trga_code        (+) = comp.trga_code
      AND     seti.sett_istr_tag    (+) = setm.sett_istr_tag
      AND     trad.trad_tag         (+) = seti.trad_tag
      AND     trad.cmpy_num         (+) = seti.cmpy_num
      AND     cmsg_si.sett_mesg_tag (+) = sint.sett_mesg_tag
      ORDER by cmsg.seq_num;

Here are the relevant sections of the two execution plans:
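
Historic plans like these can be pulled from AWR with something like:

select * from table(dbms_xplan.display_awr('1s1vgw24g2u1k'));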

The good:

SQL_ID 1s1vgw24g2u1k
 
Plan hash value: 3791039348
 
------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name       | Rows  | Cost  | Time  |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                  |            |       |  9192 |       |
|   1 |  FOR UPDATE                                       |            |       |       |       |
|   2 |   SORT ORDER BY                                   |            |     1 |  9192 | 00:46 |
|   3 |    NESTED LOOPS OUTER                             |            |     1 |  9191 | 00:46 |
|   4 |     NESTED LOOPS OUTER                            |            |     1 |  9190 | 00:46 |
|   5 |      NESTED LOOPS OUTER                           |            |     1 |  9188 | 00:46 |
|   6 |       NESTED LOOPS OUTER                          |            |     1 |  9187 | 00:46 |
|   7 |        NESTED LOOPS OUTER                         |            |     1 |  9186 | 00:46 |
|   8 |         HASH JOIN OUTER                           |            |     1 |  9184 | 00:46 |
|   9 |          NESTED LOOPS OUTER                       |            |     1 |    10 | 00:01 |
|  10 |           NESTED LOOPS                            |            |     1 |     7 | 00:01 |
|  11 |            VIEW                                   | VW_NSO_1   |     3 |     5 | 00:01 |
|  12 |             SORT UNIQUE                           |            |     1 |       |       |
|  13 |              COUNT STOPKEY                        |            |       |       |       |
|  14 |               VIEW                                |            |     3 |     5 | 00:01 |
|  15 |                SORT ORDER BY STOPKEY              |            |     3 |     5 | 00:01 |
|  16 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |     3 |     4 | 00:01 |
|  17 |                  INDEX RANGE SCAN                 | CMSG3_IDX  |     4 |     3 | 00:01 |
|  18 |            TABLE ACCESS BY USER ROWID             | CMSG       |     1 |     1 | 00:01 |
|  19 |           TABLE ACCESS BY INDEX ROWID             | SETI       |     1 |     3 | 00:01 |
|  20 |            INDEX RANGE SCAN                       | SETI_1     |     1 |     2 | 00:01 |
|  21 |          VIEW                                     |            | 14490 |  9173 | 00:46 |
|  22 |           SORT GROUP BY                           |            | 14490 |       |       |
|  23 |            CONCATENATION                          |            |       |       |       |
|  24 |             NESTED LOOPS                          |            |       |       |       |
|  25 |              NESTED LOOPS                         |            |   562 |  8472 | 00:43 |
|  26 |               NESTED LOOPS                        |            |   428 |  6757 | 00:34 |
|  27 |                NESTED LOOPS                       |            |   304 |    30 | 00:01 |
|  28 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30 | 00:01 |
|  29 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29 | 00:01 |
|  30 |                 INDEX UNIQUE SCAN                 | PRCO_2_IDX |     1 |     0 |       |
|  31 |                PARTITION LIST ALL                 |            |     1 |    24 | 00:01 |
|  32 |                 TABLE ACCESS BY LOCAL INDEX ROWID | SETI       |     1 |    24 | 00:01 |
|  33 |                  INDEX RANGE SCAN                 | SETI_3_IDX |    36 |    22 | 00:01 |
|  34 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     2 | 00:01 |
|  35 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     4 | 00:01 |
|  36 |             NESTED LOOPS                          |            |       |       |       |
|  37 |              NESTED LOOPS                         |            |    20 |   700 | 00:04 |
|  38 |               NESTED LOOPS                        |            |    15 |   640 | 00:04 |
|  39 |                NESTED LOOPS                       |            |    15 |   640 | 00:04 |
|  40 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30 | 00:01 |
|  41 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29 | 00:01 |
|  42 |                 TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2 | 00:01 |
|  43 |                  INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1 | 00:01 |
|  44 |                INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0 |       |
|  45 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     2 | 00:01 |
|  46 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     4 | 00:01 |
|  47 |         TABLE ACCESS BY INDEX ROWID               | SETM       |     1 |     2 | 00:01 |
|  48 |          INDEX UNIQUE SCAN                        | SETM_IDX   |     1 |     1 | 00:01 |
|  49 |        TABLE ACCESS BY INDEX ROWID                | COMP       |     1 |     1 | 00:01 |
|  50 |         INDEX UNIQUE SCAN                         | COMP_PK    |     1 |     0 |       |
|  51 |       TABLE ACCESS BY INDEX ROWID                 | TRGA       |     1 |     1 | 00:01 |
|  52 |        INDEX UNIQUE SCAN                          | TRGA_IDX   |     1 |     0 |       |
|  53 |      TABLE ACCESS BY GLOBAL INDEX ROWID           | SETI       |     1 |     2 | 00:01 |
|  54 |       INDEX UNIQUE SCAN                           | SETI_IDX   |     1 |     1 | 00:01 |
|  55 |     PARTITION LIST ITERATOR                       |            |     1 |     1 | 00:01 |
|  56 |      INDEX UNIQUE SCAN                            | TRAD_IDX   |     1 |     1 | 00:01 |
------------------------------------------------------------------------------------------------

And the downright bad and ugly:

SQL_ID 1s1vgw24g2u1k
 
Plan hash value: 1081606725
 
------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name       | Rows  | Cost  | Time  |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                  |            |       |   436K|       |
|   1 |  FOR UPDATE                                       |            |       |       |       |
|   2 |   SORT ORDER BY                                   |            |     1 |   436K| 36:24 |
|   3 |    NESTED LOOPS OUTER                             |            |     1 |   436K| 36:24 |
|   4 |     NESTED LOOPS OUTER                            |            |     1 |   436K| 36:24 |
|   5 |      NESTED LOOPS OUTER                           |            |     1 |   436K| 36:23 |
|   6 |       NESTED LOOPS OUTER                          |            |     1 |   436K| 36:23 |
|   7 |        NESTED LOOPS OUTER                         |            |     1 |   436K| 36:23 |
|   8 |         HASH JOIN OUTER                           |            |     1 |   436K| 36:23 |
|   9 |          NESTED LOOPS OUTER                       |            |     1 |    10 | 00:01 |
|  10 |           NESTED LOOPS                            |            |     1 |     7 | 00:01 |
|  11 |            VIEW                                   | VW_NSO_1   |     2 |     5 | 00:01 |
|  12 |             SORT UNIQUE                           |            |     1 |       |       |
|  13 |              COUNT STOPKEY                        |            |       |       |       |
|  14 |               VIEW                                |            |     2 |     5 | 00:01 |
|  15 |                SORT ORDER BY STOPKEY              |            |     2 |     5 | 00:01 |
|  16 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |     2 |     4 | 00:01 |
|  17 |                  INDEX RANGE SCAN                 | CMSG3_IDX  |     2 |     3 | 00:01 |
|  18 |            TABLE ACCESS BY USER ROWID             | CMSG       |     1 |     1 | 00:01 |
|  19 |           TABLE ACCESS BY INDEX ROWID             | SETI       |     1 |     3 | 00:01 |
|  20 |            INDEX RANGE SCAN                       | SETI_1     |     1 |     2 | 00:01 |
|  21 |          VIEW                                     |            |   719K|   436K| 36:23 |
|  22 |           SORT GROUP BY                           |            |   719K|       |       |
|  23 |            CONCATENATION                          |            |       |       |       |
|  24 |             NESTED LOOPS                          |            |       |       |       |
|  25 |              NESTED LOOPS                         |            | 27212 |   403K| 33:40 |
|  26 |               HASH JOIN                           |            | 20520 |   301K| 25:06 |
|  27 |                HASH JOIN                          |            | 14473 |    35 | 00:01 |
|  28 |                 INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1 | 00:01 |
|  29 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33 | 00:01 |
|  30 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32 | 00:01 |
|  31 |                PARTITION LIST ALL                 |            |    30M|   300K| 25:02 |
|  32 |                 TABLE ACCESS FULL                 | SETI       |    30M|   300K| 25:02 |
|  33 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3 | 00:01 |
|  34 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5 | 00:01 |
|  35 |             NESTED LOOPS                          |            |       |       |       |
|  36 |              NESTED LOOPS                         |            |   960 | 32673 | 02:44 |
|  37 |               NESTED LOOPS                        |            |   724 | 29046 | 02:26 |
|  38 |                NESTED LOOPS                       |            |   724 | 29045 | 02:26 |
|  39 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33 | 00:01 |
|  40 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32 | 00:01 |
|  41 |                 TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2 | 00:01 |
|  42 |                  INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1 | 00:01 |
|  43 |                INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0 |       |
|  44 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3 | 00:01 |
|  45 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5 | 00:01 |
|  46 |         TABLE ACCESS BY INDEX ROWID               | SETM       |     1 |     2 | 00:01 |
|  47 |          INDEX UNIQUE SCAN                        | SETM_IDX   |     1 |     1 | 00:01 |
|  48 |        TABLE ACCESS BY INDEX ROWID                | COMP       |     1 |     2 | 00:01 |
|  49 |         INDEX UNIQUE SCAN                         | COMP_PK    |     1 |     1 | 00:01 |
|  50 |       TABLE ACCESS BY INDEX ROWID                 | TRGA       |     1 |     1 | 00:01 |
|  51 |        INDEX UNIQUE SCAN                          | TRGA_IDX   |     1 |     0 |       |
|  52 |      TABLE ACCESS BY GLOBAL INDEX ROWID           | SETI       |     1 |     2 | 00:01 |
|  53 |       INDEX UNIQUE SCAN                           | SETI_IDX   |     1 |     1 | 00:01 |
|  54 |     PARTITION LIST ITERATOR                       |            |     1 |     1 | 00:01 |
|  55 |      INDEX UNIQUE SCAN                            | TRAD_IDX   |     1 |     1 | 00:01 |
------------------------------------------------------------------------------------------------
 
Note
-----
   - dynamic sampling used for this statement (level=4)
 

The trouble with specific examples is that sometimes they’re a bit longer than ideal (as well as being a bit out of context).

Let me highlight what I see as the problem:

  • There is a non-merged view with an OR (which is expanded to a CONCATENATION)
  • In the good plan, @ line 33, one part of the OR condition is dealt with via an INDEX RANGE SCAN on SETI_3_IDX.
  • In the bad plan, @ line 32, this is a TABLE ACCESS FULL on SETI.
  • There’s a dynamic sampling note in the bad plan

Good:

|  21 | VIEW                                    |            | 14490 |  9173   (1)| 00:00:46 |
...
|  26 |     NESTED LOOPS                        |            |   428 |  6757   (1)| 00:00:34 |
|  27 |      NESTED LOOPS                       |            |   304 |    30   (0)| 00:00:01 |
|  28 |       TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30   (0)| 00:00:01 |
|  29 |        INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29   (0)| 00:00:01 |
|  30 |       INDEX UNIQUE SCAN                 | PRCO_2_IDX |     1 |     0   (0)|          |
|  31 |      PARTITION LIST ALL                 |            |     1 |    24   (0)| 00:00:01 |
|  32 |       TABLE ACCESS BY LOCAL INDEX ROWID | SETI       |     1 |    24   (0)| 00:00:01 |
|  33 |        INDEX RANGE SCAN                 | SETI_3_IDX |    36 |    22   (0)| 00:00:01 |
...

Bad:

|  21 | VIEW                                    |            |   719K|   436K  (3)| 00:36:23 |
...
|  26 |     HASH JOIN                           |            | 20520 |   301K  (4)| 00:25:06 |
|  27 |      HASH JOIN                          |            | 14473 |    35   (3)| 00:00:01 |
|  28 |       INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1   (0)| 00:00:01 |
|  29 |       TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33   (0)| 00:00:01 |
|  30 |        INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|  31 |      PARTITION LIST ALL                 |            |    30M|   300K  (4)| 00:25:02 |
|  32 |       TABLE ACCESS FULL                 | SETI       |    30M|   300K  (4)| 00:25:02 |
...
Note
-----
   - dynamic sampling used for this statement (level=4)

It’s always nice to simplify a situation and, because the problem seems to be in this non-merged view, we can eliminate half the sql and see if we can reproduce the problem just by adjusting the estimates.

Which we can, and via trial and error, we see that as we approach an estimate of 14000 rows from CMSG, the plan changes to the FULL TABLE SCAN of SETI:

SQL> explain plan for
  2                 SELECT /*+
  3                          opt_estimate(table cmsg_il rows=14000)
  4                          */
  5                        COUNT(*) count,
  6                        setm_il.sett_mesg_tag sett_mesg_tag
  7                 FROM   prco    prco_il,
  8                        cmsg    cmsg_il,
  9                        setm    setm_il,
 10                        seti    seti_il
 11                 WHERE  cmsg_il.unacknowledged = prco_il.unack_code
 12                 AND    cmsg_il.evnt_type = 'SI'
 13                 AND   (cmsg_il.evnt_tag = seti_il.sett_istr_tag OR
 14                        cmsg_il.evnt_tag = seti_il.amending_si_tag)
 15                 AND    setm_il.sett_istr_tag = seti_il.sett_istr_tag
 16                 GROUP BY setm_il.sett_mesg_tag;

Explained.

SQL> select * from table(dbms_xplan.display);
Plan hash value: 273475595

-----------------------------------------------------------------------------------------------
| Id  | Operation                                | Name       | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |            |   695K|   432K  (3)| 00:36:01 |
|   1 |  HASH GROUP BY                           |            |   695K|            |          |
|   2 |   CONCATENATION                          |            |       |            |          |
|   3 |    NESTED LOOPS                          |            |       |            |          |
|   4 |     NESTED LOOPS                         |            | 26323 |   400K  (3)| 00:33:23 |
|*  5 |      HASH JOIN                           |            | 19850 |   301K  (4)| 00:25:06 |
|*  6 |       HASH JOIN                          |            | 14000 |    35   (3)| 00:00:01 |
|   7 |        INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1   (0)| 00:00:01 |
|*  8 |        TABLE ACCESS BY INDEX ROWID       | CMSG       | 14000 |    33   (0)| 00:00:01 |
|*  9 |         INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|  10 |       PARTITION LIST ALL                 |            |    30M|   300K  (4)| 00:25:02 |
|  11 |        TABLE ACCESS FULL                 | SETI       |    30M|   300K  (4)| 00:25:02 |
|* 12 |      INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3   (0)| 00:00:01 |
|  13 |     TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5   (0)| 00:00:01 |
|  14 |    NESTED LOOPS                          |            |       |            |          |
|  15 |     NESTED LOOPS                         |            |   928 | 31605   (1)| 00:02:39 |
|  16 |      NESTED LOOPS                        |            |   700 | 28097   (1)| 00:02:21 |
|  17 |       NESTED LOOPS                       |            |   700 | 28097   (1)| 00:02:21 |
|* 18 |        TABLE ACCESS BY INDEX ROWID       | CMSG       | 14000 |    33   (0)| 00:00:01 |
|* 19 |         INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|* 20 |        TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2   (0)| 00:00:01 |
|* 21 |         INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1   (0)| 00:00:01 |
|* 22 |       INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0   (0)| 00:00:01 |
|* 23 |      INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3   (0)| 00:00:01 |
|  24 |     TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("CMSG_IL"."EVNT_TAG"="SETI_IL"."AMENDING_SI_TAG")
   6 - access("CMSG_IL"."UNACKNOWLEDGED"="PRCO_IL"."UNACK_CODE")
   8 - filter("CMSG_IL"."EVNT_TYPE"='SI')
   9 - filter("CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL)
  12 - access("SETM_IL"."SETT_ISTR_TAG"="SETI_IL"."SETT_ISTR_TAG")
  18 - filter("CMSG_IL"."EVNT_TYPE"='SI')
  19 - filter("CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL)
  20 - filter(LNNVL("CMSG_IL"."EVNT_TAG"="SETI_IL"."AMENDING_SI_TAG"))
  21 - access("CMSG_IL"."EVNT_TAG"="SETI_IL"."SETT_ISTR_TAG")
  22 - access("CMSG_IL"."UNACKNOWLEDGED"="PRCO_IL"."UNACK_CODE")
  23 - access("SETM_IL"."SETT_ISTR_TAG"="SETI_IL"."SETT_ISTR_TAG")

46 rows selected.

SQL> 

So, optimizer environment differences and bind variable peeking are pretty much ruled out as influencing factors.

What remains is that the bad plan has a higher estimate of rows than the good plan and the bad plan has dynamic sampling.

Both SQL statements are the same.

Dynamic sampling level is the same for both (4 set at system level).
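
For reference, the system-level setting can be confirmed with a quick look at v$parameter:

select value, isdefault
from   v$parameter
where  name = 'optimizer_dynamic_sampling';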

Dynamic sampling should be applied because of the parameter setting and because there are two single-table access predicates
- one from the predicate on CMSG_IL.EVNT_TYPE and one implicit IS NOT NULL predicate on the join column UNACKNOWLEDGED.

So why would we see DS used in one plan and not another?
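
For anyone wanting to follow along, event 10053 produces the optimizer trace used below; a minimal sketch of enabling it for a session (standard event syntax – the statement itself is elided here):

alter session set events '10053 trace name context forever, level 1';
explain plan for select ... ; -- the distilled inline view from above
alter session set events '10053 trace name context off';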

If we do a 10053 trace of the distilled non-mergeable view, we see:

Access path analysis for CMSG
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for CMSG[CMSG_IL]

*** 2012-06-19 14:32:04.576
** Performing dynamic sampling initial checks. **
  Column (#4):
    NewDensity:0.000062, OldDensity:0.000000 BktCnt:8090, PopBktCnt:8084, PopValCnt:46, NDV:80
  Column (#4): EVNT_TYPE(
    AvgLen: 4 NDV: 80 Nulls: 0 Density: 0.000062
    Histogram: Freq  #Bkts: 52  UncompBkts: 8090  EndPtVals: 52
** Dynamic sampling initial checks returning TRUE (level = 4).

*** 2012-06-19 14:32:04.576
** Generated dynamic sampling query:
    query text :
SELECT /* OPT_DYN_SAMP */ 
       /*+ 
         ALL_ROWS 
         IGNORE_WHERE_CLAUSE 
         NO_PARALLEL(SAMPLESUB) 
         opt_param('parallel_execution_enabled', 'false') 
         NO_PARALLEL_INDEX(SAMPLESUB) 
         NO_SQL_TUNE 
         */ 
       NVL(SUM(C1),0), 
       NVL(SUM(C2),0) 
FROM (SELECT /*+ 
               IGNORE_WHERE_CLAUSE NO_PARALLEL("CMSG_IL") 
               FULL("CMSG_IL") 
               NO_PARALLEL_INDEX("CMSG_IL") 
               */ 
             1 AS C1, 
             CASE WHEN "CMSG_IL"."EVNT_TYPE"=:B1 
                  AND  "CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL 
                  THEN 1 
                  ELSE 0 
             END AS C2 
FROM "CMSG" SAMPLE BLOCK (0.002711 , 1) SEED (1) "CMSG_IL") SAMPLESUB

*** 2012-06-19 14:32:04.612
** Executed dynamic sampling query:
    level : 4
    sample pct. : 0.002711
    actual sample size : 1756
    filtered sample card. : 0
    orig. card. : 75899119
    block cnt. table stat. : 1118909
    block cnt. for sampling: 1143461
    max. sample block cnt. : 32
    sample block cnt. : 31
    min. sel. est. : 0.00000001
** Not using dynamic sampling for single table sel. or cardinality.

Here the filtered sample cardinality was zero, so the results of dynamic sampling are discarded as unreliable and the estimates fall back on the stored object stats.

But at some point (at one of the hard parses) the sample must have found some relevant data and been considered reliable, which led to the “bad” plan being executed repeatedly in inappropriate circumstances.

Two footnotes to this investigation.

Firstly, when you have tables with fast-changing data and row counts, there are usually two choices of approach for statistics:

  • either find a set of stats that gives the best (balance of) plans and lock them (sketched just below)
  • or use dynamic sampling
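
A minimal sketch of the first option, using this article’s CMSG table as the example:

exec dbms_stats.gather_table_stats(user, 'CMSG');  -- gather a representative set first
exec dbms_stats.lock_table_stats(user, 'CMSG');    -- subsequent gathers now skip the table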

The scenario above shows the downside of the dynamic sampling approach.
It may well be that plan hash 1081606725 was good for the initial execution(s) of the statement, when there were perhaps a large number of outstanding messages, but it very quickly became unsuitable given the frequent executions.

Secondly, I mentioned that this was not a SQL rewriting exercise but, if it were, I’m not a great fan of the outer-joined inline view. It’s not merged, but it seems sensible that the predicate be pushed into the view.
Pushing the predicate, whether happening by default or hinted explicitly, would give a subsection of the plan like:

|  20 | VIEW PUSHED PREDICATE                  |            |     1 |    38   (0)| 00:00:01 |
|  21 |  SORT GROUP BY                         |            |     1 |    38   (0)| 00:00:01 |
|  22 |   NESTED LOOPS                         |            |     1 |    38   (0)| 00:00:01 |
|  23 |    NESTED LOOPS                        |            |     1 |    38   (0)| 00:00:01 |
|  24 |     NESTED LOOPS                       |            |     1 |     5   (0)| 00:00:01 |
|  25 |      TABLE ACCESS BY INDEX ROWID       | SETM       |     1 |     3   (0)| 00:00:01 |
|* 26 |       INDEX UNIQUE SCAN                | SETM_IDX   |     1 |     2   (0)| 00:00:01 |
|  27 |      TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |    30M|     2   (0)| 00:00:01 |
|* 28 |       INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1   (0)| 00:00:01 |
|* 29 |     TABLE ACCESS BY INDEX ROWID        | CMSG       |     1 |    33   (0)| 00:00:01 |
|* 30 |      INDEX FULL SCAN                   | CMSG_IDX   |     1 |    32   (0)| 00:00:01 |
|* 31 |    INDEX UNIQUE SCAN                   | PRCO_2_IDX |     1 |     0   (0)| 00:00:01 |
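
For illustration, the explicit version of that hint would target the inline view’s alias, along these lines (heavily elided):

      SELECT  /*+ push_pred(cmsg_si) */ ....
      FROM    cmsg,
              ...
              (SELECT ... ) cmsg_si
      WHERE   ...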

However, I’m not keen on explicitly hinting this nor on leaving it to chance and, as it’s outer joined, I would strongly consider using a scalar subselect:

      WITH cmsg AS
           (SELECT *
            FROM   (SELECT cmsg.evnt_tag,
                           cmsg.evnt_type,
                           cmsg.seq_num
                    FROM   cmsg
                    WHERE  unacknowledged = :p_unack_code
                    AND    evnt_type != 'SOD'
                    ORDER BY seq_num)
            WHERE rownum <= :p_batch_lim)
      SELECT  ...,
              (SELECT COUNT(*)
                FROM   prco    prco_il,
                      cmsg    cmsg_il,
                      setm    setm_il,
                      seti    seti_il
               WHERE  cmsg_il.unacknowledged = prco_il.unack_code
               AND    cmsg_il.evnt_type      = 'SI'
               AND   (cmsg_il.evnt_tag       = seti_il.sett_istr_tag
               OR     cmsg_il.evnt_tag       = seti_il.amending_si_tag)
               AND    setm_il.sett_istr_tag  = seti_il.sett_istr_tag
               AND    setm_il.sett_mesg_tag  = sint.sett_mesg_tag) count
      FROM    cmsg,
              sint,
              setm,
              comp,
              trga,
              seti,
              trad
      WHERE   sint.sint_tag      (+) = cmsg.evnt_tag
      AND     sint.stus_ind      (+) = 'C'
      AND     setm.sett_mesg_tag (+) = sint.sett_mesg_tag
      AND     comp.cmpy_num      (+) = setm.cmpy_num
      AND     trga.trga_code     (+) = comp.trga_code
      AND     seti.sett_istr_tag (+) = setm.sett_istr_tag
      AND     trad.trad_tag      (+) = seti.trad_tag
      AND     trad.cmpy_num      (+) = seti.cmpy_num
      ORDER by cmsg.seq_num;

Flippin’ Witch Hunt – adaptive direct path read

Plan flips… There seems to be a general impression that most application performance issues are going to be caused by some sort of “plan flip” and that “plan flips” are bad.

  • If you don’t use plan stability features, should you really expect plan stability?
  • Even outside your stats job window whenever that is?
  • What if you’re using bind variables with histograms?
  • And you’ve got features like dynamic sampling which will probably get a slightly different sample for each hard parse?
  • And if you’ve also got cardinality feedback kicking in?

The old battle between stability and optimisation, I suppose.

I made a throwaway comment on a forum thread about having seen a performance problem caused by a sql statement which changed to doing direct path IO.

I thought it might be vaguely interesting to do a quick blog post and back that up with a tiny bit of “evidence”.

This was getting on for three weeks ago now (29th May) and there were bigger fish to fry at the time, so some of the evidence, and some of its accuracy, may be found wanting. But I love having a long AWR retention, and the outputs below are what ASH & AWR can tell us now about what happened back then.

Problem:
A particular sql statement was running slowly.
Initial analysis had identified a change in the plan, that change was caused by cardinality feedback, and changes in plans are bad of course ;)

Looking at DBA_HIST_SQLSTAT, I could see that the SQL regularly flip-flopped between two plans, one of which was the plan normally received after a hard parse, the other a change brought on by cardinality feedback.

But there really wasn’t much to choose between the plans in terms of statistics and averages.

In the execution plan, we can also skip the specifics; the only relevant detail is that there was a FULL TABLE SCAN as the inner part (i.e. executed once for each row in the driving / outer rowsource) of a NESTED LOOP – something which is rarely going to be a good thing, i.e.

NESTED LOOP
  some row source
  FULL TABLE SCAN

However, if we cross-reference DBA_HIST_ACTIVE_SESS_HISTORY, then we see the critical differences between the executions of this FULL TABLE SCAN on the 28th / 29th May and other days.

If you don’t like averages then I don’t blame you but you won’t like this query:

with ash as
(select trunc(ash.sample_time) snap_day
 ,      min(ash.snap_id) min_snap_id
 ,      max(ash.snap_id) max_snap_id
 ,      ash.current_obj# obj_id
 ,      ash.sql_id
 ,      ash.sql_plan_hash_value
 ,      ash.sql_plan_operation
 ,      ash.sql_plan_options
 ,      ash.current_obj#
 ,      ash.event
 ,      count(distinct ash.sql_exec_id) ash_cnt
 from   dba_hist_active_sess_history ash
 where  ash.sql_id              = 'drbkfkfxg6a6a'
 and    ash.sql_plan_operation  = 'TABLE ACCESS'
 and    ash.sql_plan_options    = 'FULL'
 and    ash.current_obj#        = 169748
 and    ash.sample_time        >= to_date('01/05/2012','DD/MM/YYYY')
 and    ash.event              is not null
 group by
        trunc(ash.sample_time)
 ,      ash.sql_plan_hash_value
 ,      ash.sql_id
 ,      ash.sql_plan_operation
 ,      ash.sql_plan_options
 ,      ash.current_obj#
 ,      ash.event)
select ash.snap_day
,      ash.event
,      ash.ash_cnt
,      (select round(sum(elapsed_time_delta) / sum(executions_delta) /1000/1000)
        from   dba_hist_sqlstat   st
        where  st.sql_id        = ash.sql_id
        and    st.snap_id between ash.min_snap_id
                          and     ash.max_snap_id) avg_elp
,      (select round(sum(rows_processed_delta) / sum(executions_delta))
        from   dba_hist_sqlstat   st
        where  st.sql_id        = ash.sql_id
        and    st.snap_id between ash.min_snap_id
                          and     ash.max_snap_id) avg_rws
,      round((1275068416 -
       (select sum(space_allocated_delta)
        from   dba_hist_seg_stat st
        where  st.obj#          = ash.obj_id
        and    st.snap_id      >= ash.min_snap_id))
      / 1024/1024) "SPACE_USED (ISH)"
,      (select min(bytes)/1024/1024
        from   dba_hist_sgastat st
        where  name             = 'buffer_cache'
        and    st.snap_id between ash.min_snap_id
                          and     ash.max_snap_id) min_bc
from   ash
order by snap_day;

What am I trying to see:

  • Historic running of a particular sql statement
  • Focusing on this FULL TABLE SCAN operation which was a problem on the 28th / 29th
  • Occurrences in ASH (be very careful of reading too much into samples of samples and aggregates and averages)
  • What the statement tended to wait on and a feel for how much waiting (beware of sampled samples)
  • And finally a rough approximation of how big the object was relative to the buffer cache. We know what size the object is now (which I’ve plugged into the query), so we can work back through the growth in segment stats. It uses space allocated because I don’t have access to DBMS_SPACE to get space used; space allocated is going to be a high estimate of actual space used – the space used certainly shouldn’t be more than that.

SNAP_DAY  EVENT                               ASH_CNT    AVG_ELP SPACE_USED (ISH)     MIN_BC
--------- -------------------------------- ---------- ---------- ---------------- -------
01-MAY-12 db file scattered read                    1         19             1152      61696
01-MAY-12 db file sequential read                   1         19             1152      61696
01-MAY-12 db file sequential read                   1         25             1152      61696
02-MAY-12 db file sequential read                   1          8             1152      61696
02-MAY-12 db file scattered read                    1          8             1152      61696
07-MAY-12 db file scattered read                    1          7             1152      61696
08-MAY-12 db file scattered read                    1         24             1152      61696
08-MAY-12 db file sequential read                   1         24             1152      61696
09-MAY-12 db file sequential read                   1        100             1152      61696
09-MAY-12 db file scattered read                    1        100             1152      61696
10-MAY-12 db file sequential read                   1          6             1152      61696
10-MAY-12 db file scattered read                    1          6             1152      61696
11-MAY-12 db file sequential read                   1          7             1152      61696
11-MAY-12 db file scattered read                    1         13             1152      61696
11-MAY-12 db file sequential read                   1         13             1152      61696
15-MAY-12 db file scattered read                    1        148             1152      62208
16-MAY-12 db file scattered read                    1          8             1152      62464
16-MAY-12 db file sequential read                   2          5             1152      62464
17-MAY-12 db file scattered read                    1         17             1152      62208
17-MAY-12 db file sequential read                   1         17             1152      62208
18-MAY-12 db file sequential read                   2         17             1152      62208
18-MAY-12 db file scattered read                    2         17             1152      62208
21-MAY-12 db file scattered read                    1         16             1216      62208
21-MAY-12 db file scattered read                    1          6             1216      62208
22-MAY-12 db file scattered read                    1         19             1216      62208
23-MAY-12 db file scattered read                    3         17             1216      62208
23-MAY-12 read by other session                     1         15             1216      62208
24-MAY-12 db file scattered read                    1         28             1216      61696
25-MAY-12 db file scattered read                    1         67             1216      62208
25-MAY-12 db file sequential read                   1         67             1216      62208
28-MAY-12 direct path read                          3         45             1216      62464
28-MAY-12 direct path read                         96         70             1216      62464
29-MAY-12 direct path read                         10        484             1216      62464
29-MAY-12 direct path read                        137        643             1216      62464
29-MAY-12 enq: KO - fast object checkpoint          1        545             1216      62464

35 rows selected.

SQL> 

The FULL TABLE SCAN as the inner rowsource of a NESTED LOOP is a performance threat anyway, but in summary: a performance problem, plan stability largely irrelevant, and a key switch to direct path IO not obviously caused by any significant growth in object size.
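
As a footnote: the serial direct path read decision in 11g is commonly described as depending on the segment size relative to a multiple of the hidden parameter _small_table_threshold. A peek at that threshold (an X$ query, so SYS access required – offered as an assumption to verify rather than documented behaviour):

select ksppinm  name
,      ksppstvl value
from   x$ksppi  i
,      x$ksppcv v
where  i.indx   = v.indx
and    ksppinm  = '_small_table_threshold';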

Copy_table_stats

An observation on DBMS_STATS.COPY_TABLE_STATS in 11.2.0.3

Prior to 10.2.0.5, 11.1.0.7 (I think) and 11.2, COPY_TABLE_STATS copied the high/low values of the partition key from the source to the destination rather than using the implicit boundaries of that partition. Doug mentions it in his series on statistics on partitioned objects.

Looking at it in 11.2.0.3, this issue is still addressed, but there’s a “but” if we copy at the SUBPARTITION level:

Setup:

SQL> create table t1
  2  (dt    DATE
  3  ,col1  VARCHAR2(10)
  4  ,stuff VARCHAR2(100))
  5  PARTITION BY RANGE (DT)
  6  SUBPARTITION BY LIST (COL1)
  7  SUBPARTITION TEMPLATE
  8  (SUBPARTITION A VALUES ('A'),
  9   SUBPARTITION B VALUES ('B'),
 10   SUBPARTITION C VALUES ('C'))
 11  (PARTITION T1_WK1 VALUES LESS THAN (TO_DATE('07-01-2012','DD-MM-YYYY')),
 12   PARTITION T1_WK2 VALUES LESS THAN (TO_DATE('14-01-2012','DD-MM-YYYY')));

Table created.

SQL> insert into t1
  2  values (TO_DATE('01-01-2012','DD-MM-YYYY'),'A',RPAD('X',100,'X'));

1 row created.

SQL> insert into t1
  2  values (TO_DATE('02-01-2012','DD-MM-YYYY'),'B',RPAD('X',100,'X'));

1 row created.

SQL> insert into t1
  2  values (TO_DATE('03-01-2012','DD-MM-YYYY'),'C',RPAD('X',100,'X'));

1 row created.

SQL> exec dbms_stats.gather_table_stats(USER,'T1','T1_WK1');

PL/SQL procedure successfully completed.

SQL> 

So, in one partition, we have 1 row per subpartition. The other partition is empty.

And we’ve gathered stats on one partition.
(Why just one partition? It’s not really important but I’ll touch on it at the end).

You should see the following statistics:

SQL> select global_stats
  2  ,      column_name
  3  ,      case
  4              when column_name = 'DT'   then display_raw(low_value,'DATE')
  5              when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  6         end lo
  7  ,      case
  8              when column_name = 'DT'   then display_raw(high_value,'DATE')
  9              when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 10         end hi
 11  from   user_tab_col_statistics
 12  where  table_name = 'T1'
 13  and    column_name IN ('DT','COL1');

GLO COLUMN_NAME          LO         HI
--- -------------------- ---------- ----------
YES DT                   01-JAN-12  03-JAN-12
YES COL1                 A          C

SQL> select global_stats
  2  ,      partition_name
  3  ,      column_name
  4  ,      case
  5              when column_name = 'DT'   then display_raw(low_value,'DATE')
  6              when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  7         end lo
  8  ,      case
  9              when column_name = 'DT'   then display_raw(high_value,'DATE')
 10              when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 11         end hi
 12  from   user_part_col_statistics
 13  where  table_name  = 'T1'
 14  and    column_name IN ('DT','COL1')
 15  order by partition_name, column_name;

GLO PARTITION_NAME       COLUMN_NAME          LO         HI
--- -------------------- -------------------- ---------- ----------
YES T1_WK1               COL1                 A          C
YES T1_WK1               DT                   01-JAN-12  03-JAN-12
NO  T1_WK2               COL1
NO  T1_WK2               DT

SQL> select subpartition_name
  2  ,      column_name
  3  ,      case
  4              when column_name = 'DT'   then display_raw(low_value,'DATE')
  5              when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  6         end lo
  7  ,      case
  8              when column_name = 'DT'   then display_raw(high_value,'DATE')
  9              when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 10         end hi
 11  from   user_subpart_col_statistics
 12  where  table_name  = 'T1'
 13  and    column_name IN ('DT','COL1')
 14  order by subpartition_name, column_name;

SUBPARTITION_NAME    COLUMN_NAME          LO         HI
-------------------- -------------------- ---------- ----------
T1_WK1_A             COL1                 A          A
T1_WK1_A             DT                   01-JAN-12  01-JAN-12
T1_WK1_B             COL1                 B          B
T1_WK1_B             DT                   02-JAN-12  02-JAN-12
T1_WK1_C             COL1                 C          C
T1_WK1_C             DT                   03-JAN-12  03-JAN-12
T1_WK2_A             COL1
T1_WK2_A             DT
T1_WK2_B             COL1
T1_WK2_B             DT
T1_WK2_C             COL1
T1_WK2_C             DT

12 rows selected.

SQL> 

Hopefully no surprises there.

BTW: display_raw is a helper function for decoding the raw LOW_VALUE / HIGH_VALUE columns used in the queries above.
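
A minimal sketch of such a helper, built on the DBMS_STATS.CONVERT_RAW_VALUE overloads and covering only the datatypes used in this post:

create or replace function display_raw (rawval raw, type varchar2)
return varchar2
is
  cn number;
  cv varchar2(32);
  cd date;
begin
  if type = 'VARCHAR2' then
    dbms_stats.convert_raw_value(rawval, cv);
    return to_char(cv);
  elsif type = 'DATE' then
    dbms_stats.convert_raw_value(rawval, cd);
    return to_char(cd,'DD-MON-RR');
  elsif type = 'NUMBER' then
    dbms_stats.convert_raw_value(rawval, cn);
    return to_char(cn);
  else
    return 'UNKNOWN DATATYPE';
  end if;
end;
/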

Now, if we use COPY_TABLE_STATS at a partition level:

SQL> exec dbms_stats.copy_table_stats(USER,'T1','T1_WK1','T1_WK2');

PL/SQL procedure successfully completed.

SQL>

the table level stats are unaffected as you might expect:

SQL> select global_stats
  2  ,      column_name
  3  ,      case
  4              when column_name = 'DT'   then display_raw(low_value,'DATE')
  5              when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  6         end lo
  7  ,      case
  8              when column_name = 'DT'   then display_raw(high_value,'DATE')
  9              when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 10         end hi
 11  from   user_tab_col_statistics
 12  where  table_name = 'T1'
 13  and    column_name IN ('DT','COL1');

GLO COLUMN_NAME          LO         HI
--- -------------------- ---------- ----------
YES DT                   01-JAN-12  03-JAN-12
YES COL1                 A          C

SQL> 

and, most importantly, the partition key high/low values are no longer just copied:

SQL> select global_stats
  2  ,      partition_name
  3  ,      column_name
  4  ,      case
  5              when column_name = 'DT'   then display_raw(low_value,'DATE')
  6              when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  7         end lo
  8  ,      case
  9              when column_name = 'DT'   then display_raw(high_value,'DATE')
 10              when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 11         end hi
 12  from   user_part_col_statistics
 13  where  table_name  = 'T1'
 14  and    column_name IN ('DT','COL1')
 15  order by partition_name, column_name;

GLO PARTITION_NAME       COLUMN_NAME          LO         HI
--- -------------------- -------------------- ---------- ----------
YES T1_WK1               COL1                 A          C
YES T1_WK1               DT                   01-JAN-12  03-JAN-12
YES T1_WK2               COL1                 A          C
YES T1_WK2               DT                   07-JAN-12  14-JAN-12

The statistics above show that the high/low values of the destination partition match the partition boundaries, which nearly makes sense, but note that the high value is the LESS THAN limit of the partition.
(Robin Moffat also mentioned this in a post.)

Now, copying the statistics at a PARTITION level does nothing to the SUBPARTITION statistics – it never has done.
That’s not changed, nor was it expected to:

SQL>  select subpartition_name
  2   ,      column_name
  3   ,      case
  4               when column_name = 'DT'   then display_raw(low_value,'DATE')
  5               when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  6          end lo
  7   ,      case
  8               when column_name = 'DT'   then display_raw(high_value,'DATE')
  9               when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 10          end hi
 11   from   user_subpart_col_statistics
 12   where  table_name  = 'T1'
 13   and    column_name IN ('DT','COL1')
 14   order by subpartition_name, column_name;

SUBPARTITION_NAME    COLUMN_NAME          LO                   HI
-------------------- -------------------- -------------------- --------------------
T1_WK1_A             COL1                 A                    A
T1_WK1_A             DT                   01-JAN-12            01-JAN-12
T1_WK1_B             COL1                 B                    B
T1_WK1_B             DT                   02-JAN-12            02-JAN-12
T1_WK1_C             COL1                 C                    C
T1_WK1_C             DT                   03-JAN-12            03-JAN-12
T1_WK2_A             COL1
T1_WK2_A             DT
T1_WK2_B             COL1
T1_WK2_B             DT
T1_WK2_C             COL1
T1_WK2_C             DT

12 rows selected.

Perhaps then, if we value our subpartition statistics, we might think it’s a good idea to do our operations at a subpartition level instead.

What happens if we reset these statistics and do a COPY_TABLE_STATS at a subpartition level?

Reset:

SQL> exec dbms_stats.delete_table_stats(USER,'T1');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(USER,'T1','T1_WK1');

PL/SQL procedure successfully completed.

And COPY_TABLE_STATS on the subpartitions:

SQL>  begin
  2    dbms_stats.copy_table_stats(USER,'T1','T1_WK1_A','T1_WK2_A');
  3    dbms_stats.copy_table_stats(USER,'T1','T1_WK1_B','T1_WK2_B');
  4    dbms_stats.copy_table_stats(USER,'T1','T1_WK1_C','T1_WK2_C');
  5   end;
  6  /

PL/SQL procedure successfully completed.

The table level stats are still unaffected.

But at the subpartition level, the RANGE PARTITION KEY (DT) is back to just being copied, not set to the boundary of the PARTITION KEY:

SQL>  select subpartition_name
  2   ,      column_name
  3   ,      case
  4               when column_name = 'DT'   then display_raw(low_value,'DATE')
  5               when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  6          end lo
  7   ,      case
  8               when column_name = 'DT'   then display_raw(high_value,'DATE')
  9               when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 10          end hi
 11   from   user_subpart_col_statistics
 12   where  table_name  = 'T1'
 13   and    column_name IN ('DT','COL1')
 14   order by subpartition_name, column_name;

SUBPARTITION_NAME    COLUMN_NAME          LO                   HI
-------------------- -------------------- -------------------- --------------------
T1_WK1_A             COL1                 A                    A
T1_WK1_A             DT                   01-JAN-12            01-JAN-12
T1_WK1_B             COL1                 B                    B
T1_WK1_B             DT                   02-JAN-12            02-JAN-12
T1_WK1_C             COL1                 C                    C
T1_WK1_C             DT                   03-JAN-12            03-JAN-12
T1_WK2_A             COL1                 A                    A
T1_WK2_A             DT                   01-JAN-12            01-JAN-12
T1_WK2_B             COL1                 B                    B
T1_WK2_B             DT                   02-JAN-12            02-JAN-12
T1_WK2_C             COL1                 C                    C
T1_WK2_C             DT                   03-JAN-12            03-JAN-12

12 rows selected.

And, at the partition level, any aggregation of these incorrect values depends on whether we already have GLOBAL STATS or not (hence the initial stats gathering on just the partition).

If no global stats then the error is propagated up from the subpartition:

SQL>  select global_stats
  2   ,      partition_name
  3   ,      column_name
  4   ,      case
  5               when column_name = 'DT'   then display_raw(low_value,'DATE')
  6               when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  7          end lo
  8   ,      case
  9               when column_name = 'DT'   then display_raw(high_value,'DATE')
 10               when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 11          end hi
 12   from   user_part_col_statistics
 13   where  table_name  = 'T1'
 14   and    column_name IN ('DT','COL1')
 15   order by partition_name, column_name;

GLO PARTITION_NAME       COLUMN_NAME          LO                   HI
--- -------------------- -------------------- -------------------- --------------------
YES T1_WK1               COL1                 A                    C
YES T1_WK1               DT                   01-JAN-12            03-JAN-12
NO  T1_WK2               COL1                 A                    C
NO  T1_WK2               DT                   01-JAN-12            03-JAN-12

Or if instead of gathering statistics at the partition level initially (and at reset), we do a table level gather, then
the PARTITION stats are not aggregated, i.e.

GLO PARTITION_NAME       COLUMN_NAME          LO         HI
--- -------------------- -------------------- ---------- ----------
YES T1_WK1               COL1                 A          C
YES T1_WK1               DT                   01-JAN-12  03-JAN-12
YES T1_WK2               COL1
YES T1_WK2               DT

So, in summary, I just wanted to show that whilst the issue with COPY_TABLE_STATS and HIGH/LOW has been addressed at the PARTITION level, there still seem to be issues for the PARTITION KEY at the SUBPARTITION LEVEL.

Partitioned table statistics are more complicated than you might expect.

There are some excellent resources out there that can guide you through.

HugePages – Overhead of Not Using

I’ve read a fair bit about HugePages and their importance especially for Oracle databases with large SGAs and a large number of dedicated connections.

Admittedly, some of what I read is over my head – I’m not a production DBA and quite often I don’t even have access to the box, unfortunately – but I know who the experts are and I trust them (which doesn’t mean that an expert’s opinion shouldn’t be tested, of course).

Now, the title of one of Kevin’s posts asks (rhetorically) “Configuring Linux Hugepages for Oracle Database Is Just Too Difficult! Isn’t It?”

And I think a lot of the time, it is. Not because of the topic but because of the structure of modern IT departments.

When I tried to tackle the subject before at a client, it was awkward – it wasn’t a subject I could talk about with any great authority, although I could refer them to the expert material above and… well, to cut a short story shorter and keep me out of trouble… we didn’t. I was overruled. No-one else at the client was using HugePages on 11g. So no HugePages and no chance to test it.

Well, in the end, we didn’t go with AMM, because for our applications, according to our tests, we were better suited to ASMM (damn these acronyms).

Statements like this from Kevin’s articles above are easy to understand:

Reasons for Using Hugepages

  1. Use hugepages if OLTP or ERP. Full stop.
  2. Use hugepages if DW/BI with large numbers of dedicated connections or a large SGA. Full stop.
  3. Use hugepages if you don’t like the amount of memory page tables are costing you (/proc/meminfo). Full stop.

and

“‘Large number of dedicated connections’ and ‘large SGA’ in the context of hugepages can only be quantified by the amount of memory wasted in page tables and whether the administrator is satisfied with that cost.”

Those are pretty clear cut.

So, if I understand everything correctly, I should not like this, right?:

$ grep "PageTables" /proc/meminfo
PageTables:   41622228 kB
$

That’s 40 gig of overhead for an SGA of 60 gig and a whole bunch of connections.
So, don’t need HugePages, eh?
Bit late now.

Regexp hungry for CPU? Real time sql monitoring shows just how

Not exactly a snappy blog post title…

First up, an execution plan showing the old problem of how the costs of a scalar subquery are not properly accounted for, even in the latest 11gR2 releases.

Poll: SQL Plan Management

I’m interested in what SQL Plan Management features people are actively using.

ORA-00600 [kkqtSetOp.1] – Join Factorization

Just a quick note about an ORA-00600 that recently occurred following an upgrade to 11.2.0.3.

I’ve not been able to distill an isolated test case from the specific production code generating this error so I’ll skip the application-specific example.

The error message reported was ORA-00600 [kkqtSetOp.1].

Deadlock trace file enhancements

Just browsing some deadlock trace files on 11gR2.

I normally look at sections:

  • Deadlock Graph
  • Information for the OTHER waiting session
  • Information for THIS session

and very rarely go beyond section PROCESS_STATE.

But, casually browsing, I came across these sections relating to the killed session:

  • Session Wait History
  • Sampled Session History of session

Nice additions. When did that happen?

Upgrade 11.2.0.3 Days 1 and 2

AKA being knee deep in …

Go-live at the weekend from 9i Solaris to 11.2.0.3 Linux.

There were a couple of minor issues on arrival Monday morning, but the first critical situation started @ 8:29 am Monday morning and is still ongoing Tuesday evening.

More details later but key points are:
1. Degradation of various durations, longest so far about two hours.
2. Three deliberate bounces, one crash.
3. Unrelated to above, ongoing ORA-00600s at a rate of a couple per hour, ongoing ORA-07445s at a peak rate of about 4 per minute.

Technical considerations from above:
1. Library Cache: Mutex X contention and a change to DBMS_ALERT.REGISTER
If you make significant use of DBMS_ALERT then beware a signature change to DBMS_ALERT.REGISTER in 11.2 – a new cleanup parameter, relating to the cleanup of orphaned pipes, whose default value is TRUE.

This new cleanup behaviour, introduced due to some other bug that I don’t have details of right now, is apparently quite an expensive operation and, in practice for us, it has resulted in critical hanging scenarios on “library cache: mutex x”, effectively bringing the db to its knees.

The key recursive SQL that seems to be at the heart of this mutex issue is:

SELECT DISTINCT SUBSTR(KGLNAOBJ,11) SID
FROM   X$KGLOB
WHERE  KGLHDNSP = 7
AND    KGLNAOBJ LIKE 'ORA$ALERT$%'
AND    BITAND(KGLHDFLG,128) != 0
UNION
SELECT DISTINCT SID
FROM   DBMS_ALERT_INFO

Changing calls to DBMS_ALERT.REGISTER to pass an explicit FALSE for cleanup is currently the advised approach, but that raises the question of how you should deal with the cleanup that, you’d think, must be required somewhere if it was introduced as default behaviour.
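
For illustration, a minimal sketch of the workaround, assuming the 11.2 signature with its new cleanup parameter (the alert name here is hypothetical):

SQL> begin
  2     -- pass cleanup => false explicitly to skip the expensive orphaned-pipe cleanup
  3     dbms_alert.register(name => 'MY_ALERT', cleanup => false);
  4  end;
  5  /

PL/SQL procedure successfully completed.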

At the moment, I am unconvinced by the declaration that this is expected behaviour of DBMS_ALERT.REGISTER. An expensive operation is not one that causes a massive chain of “library cache: mutex X” waits for two hours until the database is effectively down and has to be bounced. That smacks of a bug to me.

Some observations from the issue. Firstly, 11.2.0.3 seems to report mutex waits better than previous versions. But whilst all the blocking indicators (in ASH, V$SESSION and V$MUTEX_SLEEP_HISTORY) were clear about who they thought the suspect was – i.e. they all identified the same session – that session itself was waiting on the same mutex with a blocking session of UNKNOWN. And killing that session at the head of the chain just resulted in someone higher up the chain taking its place and everyone else pointing the finger at the new #1.
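
For reference, the sort of quick chain check I mean – a minimal sketch only, using the standard V$SESSION blocking columns (the event name is the 11g library cache mutex wait):

SQL> select sid, event, blocking_session, blocking_session_status
  2  from   v$session
  3  where  event = 'library cache: mutex X';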

2. ORA-00600 and ORA-07445s
We have multiple ORA-00600 and ORA-07445 errors, most of which seem to be related to the same area – application contexts and the introduction of the caching of instantiation objects (iobs) for application contexts in order to reduce library cache mutex contention. As mentioned, this has caused one crash from PMON, which inevitably took the instance down with it, plus a steady stream of application errors. Not all of the below are related to the same issue – those that are will “probably” be fixed by the same patch – but this is the current list of ongoing errors:

ORA-600: [kgldafr-bad-ext]
ORA-600: [kglbrk-bad-lock]
ORA-07445: [kglbrk()+45]
ORA-07445: [kglLockHandle()+23]
ORA-07445: [kksampp()+97]
ORA-07445: [qertbStart()+119]
ORA-07445: [keswxCurPushPlanMonitoring()+510]
ORA-07445: [ktspsrchsc_free()+321]

Just a brief summary for visibility, more details another time…

As I was coming in on Monday morning, I thought that maybe I was being too pessimistic with my expectations and that it might turn out to be a quiet day. Turns out I wasn’t pessimistic enough. And we spent months testing this…

Fun, fun…

SQL Patch IV – Why is the SQL Patch applied but not effective?

(or Part II Addendum)

In a comment on one of my previous articles about SQL Patches, a reader (hurrah!) effectively asks the question “Why is my SQL Patch being applied but the hint not obeyed?”

The original article itself was long enough without me adding a long comment so I’d like to break out into a fresh post.

Here is the test case.

Setup:

SQL> create table a_test as select * from dba_tables;

Table created.

SQL> create index a_test_idx on a_test(table_name);

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'a_test');

PL/SQL procedure successfully completed.

SQL> select table_name from a_test where table_name='xxx';

no rows selected

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  d4knkaxjhqpgw, child number 0
-------------------------------------
select table_name from a_test where table_name='xxx'

Plan hash value: 2434419982

-------------------------------------------------------------------------------
| Id  | Operation        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |       |       |     1 (100)|          |
|*  1 |  INDEX RANGE SCAN| A_TEST_IDX |     1 |    18 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TABLE_NAME"='xxx')

Create a patch to hint a FULL TABLE SCAN:

SQL> begin
  2     sys.dbms_sqldiag_internal.i_create_patch
  3     (sql_text  => 'select table_name from a_test a where table_name=''xxx''',
  4      hint_text => 'full(a)',
  5      name      => 'patch_test');
  6  end;
  7  /

PL/SQL procedure successfully completed.

See that the patch is applied but the hint apparently not obeyed:

SQL> select table_name from a_test a where table_name='xxx';

no rows selected

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
SQL_ID  frmrwzdcc9p65, child number 0
-------------------------------------
select table_name from a_test a where table_name='xxx'

Plan hash value: 2434419982

-------------------------------------------------------------------------------
| Id  | Operation        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |       |       |     1 (100)|          |
|*  1 |  INDEX RANGE SCAN| A_TEST_IDX |     1 |    18 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TABLE_NAME"='xxx')

Note
-----
   - SQL patch "patch_test" used for this statement

The issue is that to use a SQL Patch to hint a statement, it seems that you have to be very careful to use the complete hint specification.

The complete hint specification is an ugly beast, and the need for it should be considered a strong indicator that you really want to think twice before doing any hinting. As Oracle versions march onwards and the Optimizer gets more complicated, the complete hint specification becomes ever more important.
See Jonathan Lewis’s How to Hint post for a good illustration.

Back to the test case…

If we drop the existing sql patch:

SQL> begin
  2   sys.dbms_sqldiag.drop_sql_patch('patch_test');
  3  end;
  4  /

PL/SQL procedure successfully completed.

Use the +OUTLINE format option for DBMS_XPLAN to get a nudge in the right direction:

SQL> select table_name from a_test a where table_name='xxx';

no rows selected

SQL> select * from table(dbms_xplan.display_cursor(format=> '+OUTLINE'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  frmrwzdcc9p65, child number 0
-------------------------------------
select table_name from a_test a where table_name='xxx'

Plan hash value: 2434419982

-------------------------------------------------------------------------------
| Id  | Operation        | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |            |       |       |     1 (100)|          |
|*  1 |  INDEX RANGE SCAN| A_TEST_IDX |     1 |    18 |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
      DB_VERSION('11.2.0.3')
      OPT_PARAM('optimizer_dynamic_sampling' 4)
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      INDEX(@"SEL$1" "A"@"SEL$1" ("A_TEST"."TABLE_NAME"))
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TABLE_NAME"='xxx')

And create our patch with the complete specification:

SQL> begin
  2     sys.dbms_sqldiag_internal.i_create_patch
  3     (sql_text  => 'select table_name from a_test a where table_name=''xxx''',
  4      hint_text => 'FULL(@"SEL$1" "A"@"SEL$1")',
  5      name      => 'patch_test');
  6  end;
  7  /

PL/SQL procedure successfully completed.

We should find that it is more effective:

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
SQL_ID  frmrwzdcc9p65, child number 0
-------------------------------------
select table_name from a_test a where table_name='xxx'

Plan hash value: 3679270240

----------------------------------------------------------------------------
| Id  | Operation         | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |        |       |       |    40 (100)|          |
|*  1 |  TABLE ACCESS FULL| A_TEST |     1 |    18 |    40   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TABLE_NAME"='xxx')

Note
-----
   - SQL patch "patch_test" used for this statement


22 rows selected.

SQL> 

Bear in mind that the above is a test case only, and consider what I said in the original article about what I see as the difference between a SQL Patch and a SQL Plan Baseline and their different use cases.
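
Finally, if you want to sanity-check which patches exist and their status, a minimal sketch against the 11g dictionary view (the patch name is from the test case above):

SQL> select name, category, status
  2  from   dba_sql_patches
  3  where  name = 'patch_test';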
