An example of where the optimizer should ignore a hint

Here’s an example of where the optimizer really should ignore a hint.

I’m testing an implementation of deliberately unusable index partitions – i.e. some empty index partitions, some usable.

This is 11.2.0.3 so whilst there are enhancements in 12c, they’re no use here.

SQL> create table t1
  2  (col1 number)
  3  partition by list(col1)
  4  (partition p0 values(0),
  5   partition pdef values(default));

Table created.

SQL> create index i1 on t1 (col1) local unusable;

Index created.

SQL> alter index i1 rebuild partition pdef;

Index altered.

SQL> select * from t1;

no rows selected

SQL> select * from t1 partition (p0);

no rows selected

SQL> select /*+ index(t1 i1) */ * from t1 partition (p0);

no rows selected

SQL> select * from t1 where col1 = 0;

no rows selected

SQL> select /*+ index(t1 i1) */ * from t1 where col1 = 0;
select /*+ index(t1 i1) */ * from t1 where col1 = 0
*
ERROR at line 1:
ORA-01502: index 'I1' or partition of such index is in unusable state

SQL>

ORs, IN lists and LNNVL

I’ve previously written about manually rewriting an OR condition into a UNION ALL using LNNVL.

This is a description of a performance issue observed in the real world from the optimizer coming up with a CONCATENATION operation against many child operations including an INLIST operator and other children which then have to use LNNVL as a filter.

Really this is nothing more than an illustration of our old performance killing friend the function call from within SQL, particularly from within predicates, and the problem of context switching.

Ok. So, the real world example is a crazy query.
It contains 1091 OR clauses each stipulating a unique key lookup.
Yes, “1091″ OR clauses.
I know, I know.
Probably itself a workaround to avoid the 1000 limit on an IN list, who knows?
Anyway.

I’m going to use a table just to illustrate the execution plan and then use some metrics from the execution of the real world example.

drop table t1;

create table t1
(col1 number
,col2 number
,col3 varchar2(10)
,constraint pk_t1 primary key (col1, col2));

insert into t1
select mod(rownum,10000)+1 col1
,      ceil(rownum/10000)  col2
,      rpad('X',10,'X')    col3
from   dual
connect by rownum <= 100000
order by col1, col2;

commit;

If we use a simple multi-column OR condition, we don’t get the desired LNNVLs.

explain plan for
select *
from   t1
where (col1 = 1
and    col2 = 1)
or    (col1 = 2
and    col2 = 2)
or    (col1 = 3
and    col2 = 3)
or    (col1 = 3
and    col2 = 4);

select * from table(dbms_xplan.display);

The optimizer just uses an INLIST iterator:

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |  1129 | 37257 |     5   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |  1129 | 37257 |     5   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | PK_T1 |     7 |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2 OR "COL1"=3 AND 
              "COL2"=3 OR "COL1"=3 AND "COL2"=4))

However, if we combine an IN list with some of those ORs:

explain plan for
select *
from   t1
where (col1 = 1
and    col2 = 1)
or    (col1 = 2
and    col2 = 2)
or    (col1 = 3
and    col2 IN (3,4));

select * from table(dbms_xplan.display);

Which gives

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |  1140 | 37620 |     2   (0)| 00:00:01 |
|   1 |  CONCATENATION                |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1    |    11 |   363 |     1   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN           | PK_T1 |     1 |       |     2   (0)| 00:00:01 |
|   4 |   INLIST ITERATOR             |       |       |       |            |          |
|   5 |    TABLE ACCESS BY INDEX ROWID| T1    |  1129 | 37257 |     1   (0)| 00:00:01 |
|*  6 |     INDEX UNIQUE SCAN         | PK_T1 |     1 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("COL1"=3)
       filter("COL2"=3 OR "COL2"=4)
   6 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2))
       filter(LNNVL("COL1"=3) OR LNNVL("COL2"=3) AND LNNVL("COL2"=4))

We get the CONCATENATION of the simpler OR conditions which are combined into a single INLIST ITERATOR operations with another child operation which applies LNNVLs in the filter predicate.

Now imagine we have a much larger number of the simpler OR predicates combined with liberal scatterings of such IN clauses throughout our 1091 OR’d predicates.

Perhaps we’d be a bit concerned about the context switching and increased cpu usage?

Looking at my real world example right here… there’s no point me pasting in the thousands of lines.

If I run the real world behemoth:

Client elapsed time 108 seconds

DBMS_XPLAN.DISPLAY_CURSOR tells me it ran in 1.26 seconds:

plan hash value: 4262066066  
 
-------------------------------------------------------------------------------------------------------------------- 
| Id  | Operation                     | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | 
--------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                 |      1 |        |   1628 |00:00:01.26 |    3352 |    418 | 
|   1 |  CONCATENATION                |                 |      1 |        |   1628 |00:00:01.26 |    3352 |    418 |  
|   2 |   INLIST ITERATOR             |                 |      1 |        |    966 |00:00:00.94 |    2735 |    389 | 
|   3 |    TABLE ACCESS BY INDEX ROWID| XXXXXXXXXX      |    966 |    624 |    966 |00:00:00.94 |    2735 |    389 | 
|*  4 |     INDEX RANGE SCAN          | XXXXXXXXXX_UK01 |    966 |    624 |    966 |00:00:00.86 |    1915 |    380 | 
|   5 |   TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX      |      1 |      1 |      2 |00:00:00.01 |       4 |      0 |  
|*  6 |    INDEX RANGE SCAN           | XXXXXXXXXX_UK01 |      1 |      1 |      2 |00:00:00.01 |       3 |      0 |  
....
| 253 |   TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX      |      1 |      2 |     47 |00:00:00.01 |      14 |      1 |  
|*254 |    INDEX RANGE SCAN           | XXXXXXXXXX_UK01 |      1 |      2 |     47 |00:00:00.01 |       3 |      1 |   
--------------------------------------------------------------------------------------------------------------------  

Predicate Information (identified by operation id):
--------------------------------------------------- 
   4 - access(((("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722) OR             
              ("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795) OR ("XXXXXXXXXX"."VERSION"=1 AND                                       
              "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863) OR ("XXXXXXXXXX"."VERSION"=1 AND
              "XXXXXXXXXX"."ID_XXXXXXXXXX"=1612023) OR ("XXXXXXXXXX"."VERSION"=1 AND 			  .....
  6 - access "XXXXXXXXXX"."ID_XXXXXXXXXX"=2046939) 
       filter(((LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722)) AND 
           (LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795)) AND 
              (LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863)) AND 
.....

DBMS_MONITOR tells me the query ran in 1.68 seconds:

Global Information
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  2                   
 Session ID          :  1220                
 SQL ID              :  5fvt3tfbgmqp3       
 SQL Execution ID    :  33554432            
 Plan Hash Value     :  4262066066          
 Execution Started   :  04/05/2013 14:53:42 
 First Refresh Time  :  04/05/2013 14:53:42 
 Last Refresh Time   :  04/05/2013 14:53:45 

--------------------------------------------------------------------
| Elapsed |   Cpu   |    IO    | Cluster  | Fetch | Buffer | Reads |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |  Gets  |       |
--------------------------------------------------------------------
|    1.68 |    0.50 |     1.13 |     0.05 |     4 |   3352 |   418 |
--------------------------------------------------------------------

Fortunately V$SQL tells me the real story:

select sql_id, executions, rows_processed, elapsed_time/1000/1000, cpu_time/1000/1000 
from v$sql where sql_id = '5fvt3tfbgmqp3';

Gives

SQL_ID        EXECUTIONS ROWS_PROCESSED ELAPSED_TIME/1000/1000 CPU_TIME/1000/1000
------------- ---------- -------------- ---------------------- ------------------
5fvt3tfbgmqp3          1           1628              105.81511         104.581102

Then again, if you write nasty code, nasty things can happen.

Sql tuning request

Without knowing anything about the problem in advance, I thought it would be good to do a walkthrough post of a sql tuning request.

But now that I’m done I’m unconvinced as it’s probably too long, the query too meaningless and the real time sql monitoring text output too unreadable in a blog post.

I’m always reluctant to post real, specific application issues because I’m never sure how well they translate and illustrate the desired points unless you convert them to a standalone test case.

Let’s see how it goes.

  • The idea is to touch on the broad strategies that I’m going through.
  • The scope of the solution should match the scope of the problem – so for a single problem query, table design and current indexes, stats and histograms should be considered as set in stone.
  • Ideally we want to avoid hinting as much as possible or at least stick to acceptable hints.
  • And if we’re going to manually intervene, we’re not particularly interested in what the current production plan is nor the usage of any plan stability features to preserve it (unless it’s better than we can do ourselves).

This is from an 11gR2 testing environment, 11.2.0.3 to be specific.

Having done the latest merge of a production code release (9.2.0.8) into our 11gR2 environment (upgrade ETA March), a report has been reported as slow, taking about 1.5 minutes in production and some 20 minutes in the 11gR2 environment.

So, having traced the report and found that the driving query is the prime suspect, let’s get some feedback on the performance of the SQL statement.

Here’s the original sql statement to give some context – I’ve commented out most of the columns because they don’t add much other than just length to the post.

SELECT ... some columns ...,
       ... a function call ...,
       ... some more columns ...
FROM   isbk,
       inst,
       bsta,
       isco,
       sdol,
       borg,
       book,
       isdm,
       rule
WHERE  book.book_num = rule.book_num
AND    rule.rule_type_code = 'IBK'
AND    isbk.inst_num = inst.inst_num
AND    inst.inst_num = bsta.inst_num
AND    inst.inst_num = isco.inst_num
AND    isco.inst_num = isdm.inst_num
AND    isco.xcod_code = 'SEDL'
AND    isco.inst_num = sdol.inst_num
AND    sdol.xcod_code = 'ISIN'
AND    borg.borg_num = inst.issuer_num
AND    book.book_num = isbk.book_num
AND    bsta.current_mat_date >= TO_DATE(v_bus_date, 'yyyymmdd')
UNION ALL
SELECT ... some columns ...
FROM   inst, 
       bsta,
       isco,
       sdol,
       borg,
       isdm,
       inix
WHERE  inst.inst_num = bsta.inst_num
AND    inst.inst_num = isco.inst_num
AND    isco.inst_num = isdm.inst_num
AND    isco.xcod_code = 'SEDL'
AND    isco.inst_num = sdol.inst_num
AND    sdol.xcod_code = 'ISIN'
AND    borg.borg_num = inst.issuer_num
AND    bsta.current_mat_date >= TO_DATE(v_bus_date, 'yyyymmdd')
AND    isco.inst_num = inix.inst_num
AND    inix.xcod_code = 'NIX'
AND    NOT EXISTS (SELECT 1
                   FROM   isbk,
                          rule,
                          book,
                          bsta
                   WHERE  rule.rule_type_code = 'IBK'
                   AND    book.book_num = rule.book_num
                   AND    book.book_num = isbk.book_num
                   AND    inst.inst_num = isbk.inst_num
                   AND    inst.inst_num = bsta.inst_num
                   AND    bsta.current_mat_date >= TO_DATE(v_bus_date, 'yyyymmdd'));

Looking at the query and the repetition of tables and joins between the two UNION ALL parts, it looks like a classic case of two sets of disparate driving data that then need to be joined to the same additional tables.

Ironically, expanding ORs out into UNIONs is a common initial tuning step for performance problems with OR predicates.

Here’s what little extra schema knowledge might be useful:

  • INST has a pk of INST_NUM.
  • ISCO has an n:1 relationship with INST.
  • ISDM has an n:1 relationship with INST.
  • BORG has a 1:1 relationship with INST.ISSUER_NUM.
  • BSTA has a 1:1 relationship with INST.
  • ISBK has a n:1 relationship with INST and a n:1 relationship with BOOK.
  • BOOK has a pk of BOOK_NUM.
  • RULE is a table used for generic filtering rules, in this case related to BOOK.BOOK_NUM.

Let’s use Real-Time SQL Monitoring (usual license caveats apply) to see what the current performance story is.

SELECT dbms_sqltune.report_sql_monitor(<sql_id>) FROM DUAL;

If there’s one downside to RTSM, it’s a bit too wide for these blog posts :(
Maybe the RTSM pictures would have been better?

Anyway, I’ve sacrificed some of the columns in the report and tried to shrink the font (there’s a scrollbar at the bottom of the plan window).


Global Stats
===========================================================
| Elapsed |   Cpu   | PL/SQL  |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
===========================================================
|    1094 |    1094 |    0.29 |     0.20 |    60 |   257K |
===========================================================

SQL Plan Monitoring Details (Plan Hash Value=1036829859)
===============================================================================================================================
| Id |                Operation                |       Name         |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                                         |                    | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
===============================================================================================================================
|  0 | SELECT STATEMENT                        |                    |         |     1 |    29225 |          |                 |
|  1 |   NESTED LOOPS                          |                    |       1 |     1 |    29225 |          |                 |
|  2 |    NESTED LOOPS                         |                    |       1 |     1 |    29205 |          |                 |
|  3 |     NESTED LOOPS                        |                    |       1 |     1 |    29205 |          |                 |
|  4 |      VIEW                               | VW_JF_SET$AD8EBC08 |    6147 |     1 |    29205 |          |                 |
|  5 |       UNION-ALL                         |                    |         |     1 |    29205 |          |                 |
|  6 |        NESTED LOOPS                     |                    |       9 |     1 |    28719 |          |                 |
|  7 |         NESTED LOOPS                    |                    |      53 |     1 |    28722 |          |                 |
|  8 |          NESTED LOOPS                   |                    |     370 |     1 |    31674 |          |                 |
|  9 |           NESTED LOOPS                  |                    |     370 |     1 |    31674 |          |                 |
| 10 |            NESTED LOOPS                 |                    |       3 |     1 |        1 |          |                 |
| 11 |             INDEX RANGE SCAN            | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 12 |             TABLE ACCESS BY INDEX ROWID | BOOK               |       1 |     1 |        1 |          |                 |
| 13 |              INDEX UNIQUE SCAN          | BOOK_IDX           |       1 |     1 |        1 |          |                 |
| 14 |            INDEX RANGE SCAN             | ISBK_IDX           |     116 |     1 |    31674 |          |                 |
| 15 |           TABLE ACCESS BY INDEX ROWID   | INST               |       1 | 31674 |    31674 |          |                 |
| 16 |            INDEX UNIQUE SCAN            | ISTR_PK            |       1 | 31674 |    31674 |          |                 |
| 17 |          TABLE ACCESS BY INDEX ROWID    | BSTA               |       1 | 31674 |    28722 |          |                 |
| 18 |           INDEX UNIQUE SCAN             | BSTAC_IDX          |       1 | 31674 |    31654 |          |                 |
| 19 |         INDEX RANGE SCAN                | ISCO_4_IDX         |       1 | 28722 |    28719 |          |                 |
| 20 |        HASH JOIN                        |                    |    6138 |     1 |      486 |          |                 |
| 21 |         HASH JOIN ANTI                  |                    |    5716 |     1 |    54961 |    67.16 | Cpu (732)       |
| 22 |          HASH JOIN                      |                    |   24385 |     1 |    83680 |          |                 |
| 23 |           INDEX RANGE SCAN              | ISCO_4_IDX         |    170K |     1 |     377K |          |                 |
| 24 |           HASH JOIN                     |                    |    144K |     1 |     266K |          |                 |
| 25 |            TABLE ACCESS FULL            | BSTA               |    144K |     1 |     266K |     0.09 | Cpu (1)         |
| 26 |            TABLE ACCESS FULL            | INST               |      1M |     1 |       1M |          |                 |
| 27 |          VIEW                           | VW_SQ_1            |     53M |     1 |       4G |     5.60 | Cpu (61)        |
| 28 |           HASH JOIN                     |                    |     53M |     1 |       4G |    27.06 | Cpu (295)       |
| 29 |            MERGE JOIN CARTESIAN         |                    |    458K |     1 |     266K |          |                 |
| 30 |             NESTED LOOPS                |                    |       3 |     1 |        1 |          |                 |
| 31 |              INDEX RANGE SCAN           | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 32 |              INDEX UNIQUE SCAN          | BOOK_IDX           |       1 |     1 |        1 |          |                 |
| 33 |             BUFFER SORT                 |                    |    144K |     1 |     266K |          |                 |
| 34 |              INDEX FAST FULL SCAN       | BSTA_IDX_2         |    144K |     1 |     266K |          |                 |
| 35 |            TABLE ACCESS FULL            | ISBK               |      2M |     1 |       2M |          |                 |
| 36 |         INDEX RANGE SCAN                | ISCO_4_IDX         |    170K |     1 |    32983 |          |                 |
| 37 |      TABLE ACCESS BY INDEX ROWID        | BORG               |       1 | 29205 |    29205 |          |                 |
| 38 |       INDEX UNIQUE SCAN                 | BORG_PK            |       1 | 29205 |    29205 |          |                 |
| 39 |     INDEX RANGE SCAN                    | ISCO_4_IDX         |       1 | 29205 |    29205 |          |                 |
| 40 |    INDEX RANGE SCAN                     | ISDM_IDX           |       1 | 29205 |    29225 |          |                 |
===============================================================================================================================

There’s a lot in this report.

First thing you might have spotted is the Join Factorisation going on @ step 4 as indicated by the name VW_JF*.

Maybe you’re always wary of the MERGE JOIN CARTESIAN … BUFFER SORT? Not always a problem of course but where there’s trouble you’ll often find her (in real production scenarios, I find that this mechanism is a problem not so much when there are missing join conditions – because these are rarely found in production code – but rather as a valid join mechanism but where the rowsource estimates are significantly inaccurate).

If you look at the SQL, you’ll see a function call as well.

There are also a whole bunch of estimates that are significantly off.

Where to start?

There are four main areas of questioning:

  1. What’s taking all the time? Are there some particular steps in the plan which are more problematic than others?
  2. If estimates are inaccurate, where do they go most wrong or go wrong first?
  3. Which predicates eliminate the most data? i.e. it’s rarely a good thing to join thousands upon thousands of rows only to do a late filter in the plan to reduce it down to a few handfuls. Aka eliminate early.
  4. What is the simplest / quickest / least invasive change that can be made to significantly improve performance? And will it actually be sufficient?

From “Activity %”, I hope it’s clear from the report that all the time is taken up by the bottom half of the UNION ALL.

  • The rowsource cardinality estimates are not that accurate.
  • Plus we find in that second half our old friend the MJC+BS.
  • And we’re just burning CPU down there.

So, I’d like to isolate that bottom half of the UNION ALL and run it standalone.

However, in a clear indication of the issues with it, it won’t run standalone – It’s just blown 30+ gig of temp space.

But it returns only 486 rows so let’s try to get a runnable standalone version.

Before we start looking at possible solutions, let’s start to ask questions about the query logic itself, keeping an eye out for redundant tables and joins and asking ourselves whether this is the best way to word the question we think is being asked.

Best way to start that is

So, we’ve got a few “filters” predicates, a couple of “join predicates” and a NOT EXISTS correlated subquery.

The logic of the subquery is the first thing that jumps out at me.
1. BOOK seems redundant
We join RULE to BOOK and BOOK to ISBK all by BOOK_NUM and ISBK is correlated to the outer INST by INST_NUM.
We do no filtering by any BOOK attribute so it serves no purpose so, let’s remove BOOK, join RULE straight to ISBK.

2. The subquery filtering by BSTA.CURRENT_MAT_DATE is irrelevant.
This is the same filter as in the outer select.
It doesn’t make sense.
Ignoring the actual evaluation order of all these predicates, we should effectively only be checking the NOT EXISTS against INST_NUMS that have passed the outer BSTA filter. So, by definition, this particular predicate in the subquery will always be true. Why repeat it? This is a mistake.
The only thing this subquery should be doing is checking they’re not in the RULE/ISBK combo. So, let’s remove that.

If we run just comment out the tables and joins as per suggestion above, then that bottom query runs in a couple of seconds:


Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    1.12 |    1.12 |     0.00 |     2 |  73879 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=2734204492)
==========================================================================================================
| Id |         Operation          |   Name     |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                            |            | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
==========================================================================================================
|  0 | SELECT STATEMENT           |            |         |     1 |      489 |          |                 |
|  1 |   HASH JOIN RIGHT ANTI     |            |   38162 |     1 |      489 |          |                 |
|  2 |    VIEW                    | VW_SQ_1    |     555 |     1 |    31674 |          |                 |
|  3 |     NESTED LOOPS           |            |     555 |     1 |    31674 |          |                 |
|  4 |      INDEX RANGE SCAN      | RULE_IDX_2 |       3 |     1 |        1 |          |                 |
|  5 |      INDEX RANGE SCAN      | ISBK_IDX   |     176 |     1 |    31674 |          |                 |
|  6 |    HASH JOIN               |            |   38189 |     1 |    28700 |          |                 |
|  7 |     HASH JOIN              |            |   37626 |     1 |    28700 |          |                 |
|  8 |      HASH JOIN             |            |   35036 |     1 |    86956 |   100.00 | Cpu (1)         |
|  9 |       HASH JOIN            |            |   32625 |     1 |    87212 |          |                 |
| 10 |        HASH JOIN           |            |   31565 |     1 |    83593 |          |                 |
| 11 |         INDEX RANGE SCAN   | ISCO_4_IDX |    170K |     1 |     377K |          |                 |
| 12 |         HASH JOIN          |            |    144K |     1 |     266K |          |                 |
| 13 |          TABLE ACCESS FULL | BSTA       |    144K |     1 |     266K |          |                 |
| 14 |          TABLE ACCESS FULL | INST       |    796K |     1 |     796K |          |                 |
| 15 |        TABLE ACCESS FULL   | ISDM       |      1M |     1 |       1M |          |                 |
| 16 |       INDEX RANGE SCAN     | ISCO_4_IDX |    170K |     1 |     861K |          |                 |
| 17 |      INDEX RANGE SCAN      | ISCO_4_IDX |    170K |     1 |    32983 |          |                 |
| 18 |     TABLE ACCESS FULL      | BORG       |    328K |     1 |     328K |          |                 |
==========================================================================================================

Note that I’ve had to hint this with the /*+ monitor */ hint because by default this now executes beneath the default threshold for monitoring.

It looks like we’ve arrived at our quickest/simplest change just by going through the query quickly trying to understand the question it’s asking. No hints required.

So this is would be a good place to stop, thoroughly test the change, validate the original and the changed results and move on to something else.

We should drop it back into the UNION ALL and see what we get:


Global Stats
===========================================================
| Elapsed |   Cpu   | PL/SQL  |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
===========================================================
|    4.68 |    4.68 |    0.38 |     0.00 |    60 |   230K |
===========================================================

SQL Plan Monitoring Details (Plan Hash Value=4171599168)
===============================================================================================================================
| Id |                Operation                |       Name         |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                                         |                    | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
===============================================================================================================================
|  0 | SELECT STATEMENT                        |                    |         |     1 |    29225 |          |                 |
|  1 |   NESTED LOOPS                          |                    |       1 |     1 |    29225 |          |                 |
|  2 |    NESTED LOOPS                         |                    |       1 |     1 |    29205 |          |                 |
|  3 |     HASH JOIN                           |                    |       1 |     1 |    29205 |          |                 |
|  4 |      TABLE ACCESS FULL                  | BORG               |    328K |     1 |     328K |          |                 |
|  5 |      VIEW                               | VW_JF_SET$AD8EBC08 |   26182 |     1 |    29205 |          |                 |
|  6 |       UNION-ALL                         |                    |         |     1 |    29205 |          |                 |
|  7 |        NESTED LOOPS                     |                    |       9 |     1 |    28719 |          |                 |
|  8 |         NESTED LOOPS                    |                    |      53 |     1 |    28722 |          |                 |
|  9 |          NESTED LOOPS                   |                    |     370 |     1 |    31674 |          |                 |
| 10 |           NESTED LOOPS                  |                    |     370 |     1 |    31674 |          |                 |
| 11 |            NESTED LOOPS                 |                    |       3 |     1 |        1 |          |                 |
| 12 |             INDEX RANGE SCAN            | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 13 |             TABLE ACCESS BY INDEX ROWID | BOOK               |       1 |     1 |        1 |          |                 |
| 14 |              INDEX UNIQUE SCAN          | BOOK_IDX           |       1 |     1 |        1 |          |                 |
| 15 |            INDEX RANGE SCAN             | ISBK_IDX           |     116 |     1 |    31674 |          |                 |
| 16 |           TABLE ACCESS BY INDEX ROWID   | INST               |       1 | 31674 |    31674 |          |                 |
| 17 |            INDEX UNIQUE SCAN            | ISTR_PK            |       1 | 31674 |    31674 |          |                 |
| 18 |          TABLE ACCESS BY INDEX ROWID    | BSTA               |       1 | 31674 |    28722 |          |                 |
| 19 |           INDEX UNIQUE SCAN             | BSTA_IDX           |       1 | 31674 |    31654 |          |                 |
| 20 |         INDEX RANGE SCAN                | ISCO_4_IDX         |       1 | 28722 |    28719 |          |                 |
| 21 |        HASH JOIN RIGHT ANTI             |                    |   26173 |     1 |      486 |          |                 |
| 22 |         VIEW                            | VW_SQ_1            |     555 |     1 |    31674 |          |                 |
| 23 |          NESTED LOOPS                   |                    |     555 |     1 |    31674 |          |                 |
| 24 |           INDEX RANGE SCAN              | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 25 |           INDEX RANGE SCAN              | ISBK_IDX           |     176 |     1 |    31674 |          |                 |
| 26 |         HASH JOIN                       |                    |   26187 |     1 |    28693 |          |                 |
| 27 |          HASH JOIN                      |                    |   24385 |     1 |    83680 |    33.33 | Cpu (1)         |
| 28 |           INDEX RANGE SCAN              | ISCO_4_IDX         |    170K |     1 |     377K |          |                 |
| 29 |           HASH JOIN                     |                    |    144K |     1 |     266K |          |                 |
| 30 |            TABLE ACCESS FULL            | BSTA               |    144K |     1 |     266K |          |                 |
| 31 |            TABLE ACCESS FULL            | INST               |      1M |     1 |       1M |          |                 |
| 32 |          INDEX RANGE SCAN               | ISCO_4_IDX         |    170K |     1 |    32983 |          |                 |
| 33 |     INDEX RANGE SCAN                    | ISCO_4_IDX         |       1 | 29205 |    29205 |          |                 |
| 34 |    INDEX RANGE SCAN                     | ISDM_IDX           |       1 | 29205 |    29225 |          |                 |
===============================================================================================================================

So we could leave it there.

However, I wouldn’t blame you if you wanted to go further, even if we’re bordering on Compulsive Tuning Disorder:

  • I don’t really like leaving behind a plan that’s got significantly inaccurate estimates – it leaves behind too much of a future threat.
  • And maybe we can also look at the original query and the direction the Join Factorisation was indicating and take it further? How about doing the UNION ALL a bit earlier and then do one lot of joining to the shared tables from the original SQL?
  • Let’s use some dynamic sampling to improve some single table cardinality estimates but then perhaps let’s go too far and add some join estimate adjustments (I say too far because I’m going to use opt_estimate but it’s undocumented and I’m not recommending it but if you do use it, then use it in conjunction with qb_name).
  • And let’s get rid of the function call which, take it from me, in this case is effectively a single table outer join lookup anyway.
SELECT  /*+
          find_me
          monitor
          qb_name(main)
          dynamic_sampling(sdol@main 4)
          dynamic_sampling(isin@main 4)
          */
        ... some columns ...
FROM   (SELECT /*+
                 qb_name(union1)
                 opt_estimate(join(isbk@union1 rule@union1) scale_rows=100)
                 */
               isbk.inst_num  inst_num
        ,      inix.inst_code nix_code
        ,      (SELECT book.name
                FROM   book
                WHERE  book.book_num = rule.book_num) book_name
        FROM   isbk
        ,      rule
        ,      inix
        WHERE  rule.rule_type_code    = 'IBK'
        AND    rule.book_num          = isbk.book_num 
        AND    inix.inst_num      (+) = isbk.inst_num
        AND    inix.xcod_code     (+) = 'NIX'
        UNION ALL
        SELECT /*+
                 qb_name(union2)
                 dynamic_sampling(inix@union2 4)
                 */
               inix.inst_num  inst_num
        ,      inix.inst_code nix_code
        ,      NULL           book_name
        FROM   inix
        WHERE  inix.xcod_code         = 'NIX'
        AND    NOT EXISTS (SELECT /*+ 
                                    qb_name(sub1)
                                    opt_estimate(join(isbk@sub1 rule@sub1) scale_rows=100)
                                    */
                                  1
                           FROM   isbk
                           ,      rule
                           WHERE  rule.rule_type_code    = 'IBK'
                           AND    rule.book_num          = isbk.book_num
                           AND    isbk.inst_num         = inix.inst_num))
       xxxx
,      bsta
,      isin
,      sdol
,      inst
,      borg
,      isdm
WHERE  bsta.inst_num          = xxxx.inst_num
AND    bsta.current_mat_date >= TO_DATE(:v_bus_date, 'yyyymmdd')
AND    isin.inst_num          = xxxx.inst_num 
AND    isin.xcod_code         = 'SEDL'
AND    sdol.inst_num          = xxxx.inst_num
AND    sdol.xcod_code         = 'ISIN'
AND    inst.inst_num          = xxxx.inst_num
AND    borg.borg_num          = inst.issuer_num
AND    isdm.inst_num          = xxxx.inst_num;

Which gives:


Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    1.40 |    1.40 |     0.00 |    60 |  79273 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=2392527494)
=======================================================================================================================
| Id |               Operation                |   Name      |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                                        |             | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
=======================================================================================================================
|  0 | SELECT STATEMENT                       |             |         |     1 |    29225 |          |                 |
|  1 |   HASH JOIN                            |             |   94337 |     1 |    29225 |          |                 |
|  2 |    INDEX RANGE SCAN                    | ISCO_4_IDX  |    373K |     1 |     377K |          |                 |
|  3 |    HASH JOIN                           |             |   80747 |     1 |    29629 |          |                 |
|  4 |     HASH JOIN                          |             |   80747 |     1 |    29629 |          |                 |
|  5 |      HASH JOIN                         |             |   80747 |     1 |    29629 |          |                 |
|  6 |       HASH JOIN                        |             |   78122 |     1 |    29604 |          |                 |
|  7 |        MERGE JOIN                      |             |   78003 |     1 |    33947 |          |                 |
|  8 |         INDEX RANGE SCAN               | ISCO_4_IDX  |    818K |     1 |     861K |          |                 |
|  9 |         SORT JOIN                      |             |   55765 |  861K |    33947 |          |                 |
| 10 |          VIEW                          |             |   55765 |     1 |    33960 |          |                 |
| 11 |           UNION-ALL                    |             |         |     1 |    33960 |          |                 |
| 12 |            TABLE ACCESS BY INDEX ROWID | BOOK        |       1 |     1 |        1 |          |                 |
| 13 |             INDEX UNIQUE SCAN          | BOOK_IDX    |       1 |     1 |        1 |          |                 |
| 14 |            HASH JOIN OUTER             |             |   55485 |     1 |    31674 |          |                 |
| 15 |             NESTED LOOPS               |             |   55485 |     1 |    31674 |          |                 |
| 16 |              INDEX RANGE SCAN          | RULE_IDX_2  |       3 |     1 |        1 |          |                 |
| 17 |              INDEX RANGE SCAN          | ISBK_IDX    |   17609 |     1 |    31674 |          |                 |
| 18 |             INDEX RANGE SCAN           | ISCO_4_IDX  |    170K |     1 |    32983 |          |                 |
| 19 |            HASH JOIN ANTI              |             |     280 |     1 |     2286 |          |                 |
| 20 |             INDEX RANGE SCAN           | ISCO_4_IDX  |   28028 |     1 |    32983 |          |                 |
| 21 |             VIEW                       | VW_SQ_1     |   55485 |     1 |    31674 |          |                 |
| 22 |              NESTED LOOPS              |             |   55485 |     1 |    31674 |          |                 |
| 23 |               INDEX RANGE SCAN         | RULE_IDX_2  |       3 |     1 |        1 |          |                 |
| 24 |               INDEX RANGE SCAN         | ISBK_IDX    |   17609 |     1 |    31674 |          |                 |
| 25 |        TABLE ACCESS FULL               | BSTA        |    144K |     1 |     266K |          |                 |
| 26 |       TABLE ACCESS FULL                | ISDM        |      1M |     1 |       1M |          |                 |
| 27 |      TABLE ACCESS FULL                 | INST        |    796K |     1 |     796K |   100.00 | Cpu (1)         |
| 28 |     TABLE ACCESS FULL                  | BORG        |    328K |     1 |     328K |          |                 |
=======================================================================================================================

It might not be perfect but it’s certainly a lot better.

An acceptable compromise might be to refactor the UNION ALL as per above but omit the undocumented opt_estimate hints.

All we need to do is some more testing to validate the results and to also triple check the performance when the data is not cached and we’re done.

If you made it this far then Wow! I probably wouldn’t have done.

So, to summarise, what have we done?

  • Well, we haven’t needed to know much about the original intention of the query nor the schema.
  • We’ve improved the accuracies of some of the estimates.
  • We’ve found some redundant tables and joins.
  • And we’ve reordered the query slightly to better phrase the question that we think was being asked, in the process moving the UNION ALL earlier in the processing so that some of the joining tables only needed to be referenced once (an advantage that might not be preserved if the CBO decides to merge the UNION ALL but that would be the opposite of the recently developed join factorisation mechanism).

And these simple, quick steps effectively reduced the execution time of a query from 20 minutes to a few seconds.

In hindsight, a good example because of the gains realised but a bad example because of the length of query and the associated detail, particularly the redundant tables.

Scalar Subselect Costing

This issue is an oldie but deserving of a quick post to stop me going off on a tangent in another post.

It is an oddity of scalar subselects/subqueries that their cost is not taken into account in the top level cost of a query.

In older versions of Oracle, it used to be the case that you didn’t even see the scalar subquery in the execution plan.

However, even in the latest versions, the cost still isn’t accounted for.

Always something to keep in mind.

For example:

SQL> create table t1
  2  (col1 number not null);

Table created.

SQL> 
SQL> insert into t1
  2  select rownum
  3  from   dual
  4  connect by rownum <= 10000;

10000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create table t2
  2  (col1 number not null primary key);

Table created.

SQL> 
SQL> 
SQL> insert into t2
  2  select rownum
  3  from   dual
  4  connect by rownum <= 10000;

10000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 

Let’s do a scalar subselect to do an index lookup on t2 for every row in t1:

SQL> explain plan for
  2  select t1.col1
  3  ,      (select t2.col1 from t2 where t2.col1 = t1.col1)
  4  from   t1;

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 2339000913

----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              | 10000 |   126K|     8   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| SYS_C0078310 |     1 |    13 |     1   (0)| 00:00:01 |
|   2 |  TABLE ACCESS FULL| T1           | 10000 |   126K|     8   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T2"."COL1"=:B1)

You can see that the cost of the scalar subquery is 1 per execution and it’s not accounted for at the top level.

Let’s force a full table scan of the row-by-row lookup:

SQL> explain plan for
  2  select t1.col1
  3  ,      (select /*+ full(t2) */ t2.col1 from t2 where t2.col1 = t1.col1)
  4  from   t1;

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
Plan hash value: 637946564

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |   126K|     8   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |     1 |    13 |     2   (0)| 00:00:01 |
|   2 |  TABLE ACCESS FULL| T1   | 10000 |   126K|     8   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T2"."COL1"=:B1)

Obviously a much more expensive operation but, again, not properly accounted for in the overall costing.

Wouldn’t it be preferable that as the optimizer has estimated the number of rows in the top level select:

|   0 | SELECT STATEMENT  |      | 10000 |   126K|     8   (0)| 00:00:01 |

and it has estimated the cost per execution of the scalar subselect:

|*  1 |  TABLE ACCESS FULL| T2   |     1 |    13 |     2   (0)| 00:00:01 |

that the top level cost include to some degree the cost of scalar subselect per execution * estimated executions?

For example, if we code a join roughly equivalent to the scalar subselect then:

SQL> explain plan for
  2  select /*+ 
  3           full(t2) 
  4           use_nl(t2)
  5           */
  6         t1.col1
  7  ,      t2.col1
  8  from   t1
  9  ,      t2
 10  where t2.col1 (+) = t1.col1;

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
Plan hash value: 2453408398

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 10000 |   253K| 66919   (7)| 00:05:35 |
|   1 |  NESTED LOOPS OUTER|      | 10000 |   253K| 66919   (7)| 00:05:35 |
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   126K|     8   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   |     1 |    13 |     7  (15)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("T2"."COL1"(+)="T1"."COL1")

Also see:
http://jonathanlewis.wordpress.com/2007/10/12/scalar-subqueries/

http://oracle-randolf.blogspot.com/2010/01/when-your-projection-is-not-cost-free.html

http://blog.sydoracle.com/2005/09/explain-plans-and-scalar-subqueries.html

Cardinality Feedback

True to form, Kerry Osborne has another excellent post, this time on Cardinality Feedback.

If the mark of a good post is that it sets you off thinking and investigating then he’s nailed it again.

I started to comment on his post, but it was getting long and messy and I thought it easy to make a comment with a post of my own. Apart from anything else, I always seem to cock up the formatting when making longer comments with examples, etc.

So, hopefully, you’ve read Kerry’s article and he asked where Oracle stored the opt_estimate fudge factor feedback that he could see in his trace and that came through when using a baseline created from the plan changed by cardinality feedback.

My initial reaction was that I thought they would be independent and so began this little investigation (which isn’t finished yet).

First tangent of the day … on the subject of OPT_ESTIMATE – it was my belief that this was one of the principal mechanisms used by sql plan baselines. I know it was very popular with standalone sql profiles. I’ve got a controlled number of baselines in my environment – a couple of hundred – none of them use OPT_ESTIMATE in SQLOBJ$DATA.COMP_DATA. Surprising.

Anyway, I thought in terms of having a standalone demo on Cardinality Feedback, why reinvent the wheel?

On his blog article on the subject, Tom Kyte has a standalone demo. Perfect I thought, I’ll reuse it.

And it goes a little something like this:

1. Create table and package:

create or replace type str2tblType as table of varchar2(30);
/
create or replace function str2tbl( 
  p_str in varchar2, 
  p_delim in varchar2 default ',' )
return str2tblType
PIPELINED
as
  l_str      long default p_str || p_delim;
  l_n        number;
begin
   loop
       l_n := instr( l_str, p_delim );
       exit when (nvl(l_n,0) = 0);
       pipe row ( ltrim(rtrim(substr(l_str,1,l_n-1))) );
       l_str := substr( l_str, l_n+1 );
   end loop;
   return;
end;
/

2. Set up variable in sql*plus and execute sql statement once, get plan

var in_list varchar2(255)
exec :in_list := 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE';

select * from TABLE(cast( str2tbl(:in_list ) as str2tblType) ) t;

select * from table(dbms_xplan.display_cursor);

Which produced the following plan:

SQL> select * from TABLE(cast( str2tbl(:in_list ) as str2tblType) ) t;

COLUMN_VALUE
------------------------------
DBMS_PIPE
DBMS_OUTPUT
UTL_FILE

3 rows selected.

SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  d7bc3g2ccymkt, child number 0
-------------------------------------
select * from TABLE(cast( str2tbl(:in_list ) as str2tblType) ) t

Plan hash value: 2407808827

---------------------------------------------------------------------------------------------
| Id  | Operation                         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |         |       |       |    34 (100)|          |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| STR2TBL |  8168 | 16336 |    34   (3)| 00:00:01 |
---------------------------------------------------------------------------------------------


13 rows selected.

SQL> 

Hopefully, no surprises there.

According to the original demo, if I rinse and repeat and should get some feedback kicking right in there.

SQL> select * from TABLE(cast( str2tbl(:in_list ) as str2tblType) ) t;

COLUMN_VALUE
------------------------------
DBMS_PIPE
DBMS_OUTPUT
UTL_FILE

3 rows selected.

SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------
SQL_ID  d7bc3g2ccymkt, child number 0
-------------------------------------
select * from TABLE(cast( str2tbl(:in_list ) as str2tblType) ) t

Plan hash value: 2407808827

---------------------------------------------------------------------------------------------
| Id  | Operation                         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |         |       |       |    34 (100)|          |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| STR2TBL |  8168 | 16336 |    34   (3)| 00:00:01 |
---------------------------------------------------------------------------------------------


13 rows selected.

SQL> 

Oh!

Nope.

And doesn’t matter how often I repeated, same difference.

So, this probably the right time to mention version, right?

SQL> select * from v$version;

BANNER
------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
PL/SQL Release 11.2.0.2.0 - Production
CORE    11.2.0.2.0      Production
TNS for Linux: Version 11.2.0.2.0 - Production
NLSRTL Version 11.2.0.2.0 - Production

At this point, something I’d read by the Optimzer Development Group rang a bell:

For similar reasons, execution statistics for queries containing bind variables can be problematic. In cardinality feedback, we limit the feedback to portions of the plan whose estimates are not affected by bind variables. For instance, if a table has a filter comparing a column to a bind value, the cardinality of that table will not be used. However, cardinality feedback can still be used for cardinalities elsewhere in the plan.

Sounds relevant, doesn’t it?

Let’s substitute the bind with a literal.

So, we’re going to run this:

select * from TABLE(cast( str2tbl('DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType) ) t;

select * from table(dbms_xplan.display_cursor);

which gave me:

SQL> select * from TABLE(cast( str2tbl('DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType) ) t;

COLUMN_VALUE
------------------------------
DBMS_PIPE
DBMS_OUTPUT
UTL_FILE

3 rows selected.

SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
SQL_ID  8udfh089hyh6d, child number 0
-------------------------------------
select * from TABLE(cast( str2tbl('DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as
str2tblType) ) t

Plan hash value: 2407808827

---------------------------------------------------------------------------------------------
| Id  | Operation                         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |         |       |       |    34 (100)|          |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| STR2TBL |  8168 | 16336 |    34   (3)| 00:00:01 |
---------------------------------------------------------------------------------------------


14 rows selected.

And again

SQL> select * from TABLE(cast( str2tbl('DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType) ) t;

COLUMN_VALUE
------------------------------
DBMS_PIPE
DBMS_OUTPUT
UTL_FILE

3 rows selected.

SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
SQL_ID  8udfh089hyh6d, child number 1
-------------------------------------
select * from TABLE(cast( str2tbl('DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as
str2tblType) ) t

Plan hash value: 2407808827

---------------------------------------------------------------------------------------------
| Id  | Operation                         | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |         |       |       |    33 (100)|          |
|   1 |  COLLECTION ITERATOR PICKLER FETCH| STR2TBL |     6 |    12 |    33   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Note
-----
   - cardinality feedback used for this statement


18 rows selected.

SQL> 

So, cardinality feedback successfully demoed with cardinality estimates reduced from the blocksize-related default for a collection to something altogether more accurate.

Now we need to move on to a case where these improved cardinality estimates actually affect the plan.

Fortunately, Tom’s demo does just that with a new table:

create table data  as
select * from all_objects;

create index data_idx on data(object_name);

exec dbms_stats.gather_table_stats( user, 'DATA' );

And a new statement:

with T as
( select /*+ find_me */ 
         distinct * 
  from   TABLE(cast( str2tbl( 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType) ) t 
  where rownum > 0 )
select * from data, t 
where data.object_name = t.column_value;

select * from table(dbms_xplan.display_cursor(format=>'-bytes'));

which gives the following (actual sql output snipped):

SQL> with T as
  2  ( select /*+ find_me */
  3           distinct *
  4    from   TABLE(cast( str2tbl( 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType) ) t
  5    where rownum > 0 )
  6  select * from data, t
  7  where data.object_name = t.column_value;

.......

5 rows selected.

SQL> 
SQL> select * from table(dbms_xplan.display_cursor(format=>'-bytes'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  c4m3t9jwvs3ht, child number 0
-------------------------------------
with T as ( select /*+ find_me */          distinct *   from
TABLE(cast( str2tbl( 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType)
) t   where rownum > 0 ) select * from data, t where data.object_name =
t.column_value

Plan hash value: 892089582

------------------------------------------------------------------------------------------
| Id  | Operation                              | Name    | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |         |       |   318 (100)|          |
|*  1 |  HASH JOIN                             |         | 15014 |   318   (5)| 00:00:02 |
|   2 |   VIEW                                 |         |  8168 |    36   (9)| 00:00:01 |
|   3 |    HASH UNIQUE                         |         |  8168 |    36   (9)| 00:00:01 |
|   4 |     COUNT                              |         |       |            |          |
|*  5 |      FILTER                            |         |       |            |          |
|   6 |       COLLECTION ITERATOR PICKLER FETCH| STR2TBL |  8168 |    34   (3)| 00:00:01 |
|   7 |   TABLE ACCESS FULL                    | DATA    | 58781 |   279   (3)| 00:00:02 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("DATA"."OBJECT_NAME"="T"."COLUMN_VALUE")
   5 - filter(ROWNUM>0)


28 rows selected.

SQL> 

As expected, no cardinality feedback yet, collection estimate back to default.

And I’m just going to monitor what’s in V$SQL:

SQL> select sql_id, child_number, plan_hash_value, executions, is_shareable, sql_plan_baseline 
  2  from v$sql 
  3  where sql_id = 'c4m3t9jwvs3ht';

SQL_ID        CHILD_NUMBER PLAN_HASH_VALUE EXECUTIONS I SQL_PLAN_BASELINE
------------- ------------ --------------- ---------- - ------------------------------
c4m3t9jwvs3ht            0       892089582          1 Y

1 row selected.

SQL> 

Now re-run:

SQL> with T as
  2  ( select /*+ find_me */ 
  3           distinct * 
  4    from   TABLE(cast( str2tbl( 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType) ) t 
  5    where rownum > 0 )
  6  select * from data, t 
  7  where data.object_name = t.column_value;

...............

5 rows selected.

SQL> 
SQL> select * from table(dbms_xplan.display_cursor(format=>'-bytes'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  c4m3t9jwvs3ht, child number 1
-------------------------------------
with T as ( select /*+ find_me */          distinct *   from
TABLE(cast( str2tbl( 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType)
) t   where rownum > 0 ) select * from data, t where data.object_name =
t.column_value

Plan hash value: 3947981921

--------------------------------------------------------------------------------------------
| Id  | Operation                               | Name     | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |          |       |    43 (100)|          |
|   1 |  NESTED LOOPS                           |          |       |            |          |
|   2 |   NESTED LOOPS                          |          |     6 |    43   (3)| 00:00:01 |
|   3 |    VIEW                                 |          |     3 |    34   (3)| 00:00:01 |
|   4 |     HASH UNIQUE                         |          |     3 |    34   (3)| 00:00:01 |
|   5 |      COUNT                              |          |       |            |          |
|*  6 |       FILTER                            |          |       |            |          |
|   7 |        COLLECTION ITERATOR PICKLER FETCH| STR2TBL  |     6 |    33   (0)| 00:00:01 |
|*  8 |    INDEX RANGE SCAN                     | DATA_IDX |     2 |     2   (0)| 00:00:01 |
|   9 |   TABLE ACCESS BY INDEX ROWID           | DATA     |     2 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter(ROWNUM>0)
   8 - access("DATA"."OBJECT_NAME"="T"."COLUMN_VALUE")

Note
-----
   - cardinality feedback used for this statement


34 rows selected.

SQL> 

And in V$SQL:

SQL> select sql_id, child_number, plan_hash_value, executions, is_shareable, sql_plan_baseline 
  2  from v$sql 
  3  where sql_id = 'c4m3t9jwvs3ht';

SQL_ID        CHILD_NUMBER PLAN_HASH_VALUE EXECUTIONS I SQL_PLAN_BASELINE
------------- ------------ --------------- ---------- - ------------------------------
c4m3t9jwvs3ht            0       892089582          1 N
c4m3t9jwvs3ht            1      3947981921          1 Y

2 rows selected.

SQL> 

So, cardinality feedback has come to the rescue, adjusted our estimates and this time we’ve got a better plan for the numbers concerned with NESTED LOOPS rather than a HASH JOIN.

What happens if, as Kerry did, I create a baseline for this improved statement using his create_baseline.sql script?

SQL> @create_baseline
Enter value for sql_id: c4m3t9jwvs3ht
Enter value for plan_hash_value: 3947981921
Enter value for fixed (NO): 
Enter value for enabled (YES): 
Enter value for plan_name (ID_sqlid_planhashvalue): my_cf_baseline
old  16: l_sql_id := '&&sql_id';
new  16: l_sql_id := 'c4m3t9jwvs3ht';
old  17: l_plan_hash_value := to_number('&&plan_hash_value');
new  17: l_plan_hash_value := to_number('3947981921');
old  18: l_fixed := '&&fixed';
new  18: l_fixed := 'NO';
old  19: l_enabled := '&&enabled';
new  19: l_enabled := 'YES';
old  40:     decode('&&plan_name','X0X0X0X0','SQLID_'||'&&sql_id'||'_'||'&&plan_hash_value','&&plan_
new  40:     decode('my_cf_baseline','X0X0X0X0','SQLID_'||'c4m3t9jwvs3ht'||'_'||'3947981921','my_cf_
Baseline MY_CF_BASELINE created.
SQL> 

And now I re-run the statement:

SQL> with T as
  2  ( select /*+ find_me */ 
  3           distinct * 
  4    from   TABLE(cast( str2tbl( 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType) ) t 
  5    where rownum > 0 )
  6  select * from data, t 
  7  where data.object_name = t.column_value;

....

5 rows selected.

SQL> 
SQL> select * from table(dbms_xplan.display_cursor(format=>'-bytes'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  c4m3t9jwvs3ht, child number 2

An uncaught error happened in prepare_sql_statement : ORA-01403: no data found

NOTE: cannot fetch plan for SQL_ID: c4m3t9jwvs3ht, CHILD_NUMBER: 2
      Please verify value of SQL_ID and CHILD_NUMBER;
      It could also be that the plan is no longer in cursor cache (check v$sql_plan)


8 rows selected.

SQL> 

Oh dear. Can we just ignore that and come back to it at a later date? Let’s stick a pin in it, as they say.

I’ll just try again:

SQL> with T as
  2  ( select /*+ find_me */ 
  3           distinct * 
  4    from   TABLE(cast( str2tbl( 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType) ) t 
  5    where rownum > 0 )
  6  select * from data, t 
  7  where data.object_name = t.column_value;

........

5 rows selected.

SQL> 
SQL> select * from table(dbms_xplan.display_cursor(format=>'-bytes'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  c4m3t9jwvs3ht, child number 2
-------------------------------------
with T as ( select /*+ find_me */          distinct *   from
TABLE(cast( str2tbl( 'DBMS_PIPE,DBMS_OUTPUT,UTL_FILE' ) as str2tblType)
) t   where rownum > 0 ) select * from data, t where data.object_name =
t.column_value

Plan hash value: 3947981921

--------------------------------------------------------------------------------------------
| Id  | Operation                               | Name     | Rows  | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |          |       | 24591 (100)|          |
|   1 |  NESTED LOOPS                           |          |       |            |          |
|   2 |   NESTED LOOPS                          |          | 15014 | 24591   (1)| 00:02:03 |
|   3 |    VIEW                                 |          |  8168 |    36   (9)| 00:00:01 |
|   4 |     HASH UNIQUE                         |          |  8168 |    36   (9)| 00:00:01 |
|   5 |      COUNT                              |          |       |            |          |
|*  6 |       FILTER                            |          |       |            |          |
|   7 |        COLLECTION ITERATOR PICKLER FETCH| STR2TBL  |  8168 |    34   (3)| 00:00:01 |
|*  8 |    INDEX RANGE SCAN                     | DATA_IDX |     2 |     2   (0)| 00:00:01 |
|   9 |   TABLE ACCESS BY INDEX ROWID           | DATA     |     2 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   6 - filter(ROWNUM>0)
   8 - access("DATA"."OBJECT_NAME"="T"."COLUMN_VALUE")

Note
-----
   - SQL plan baseline SQL_PLAN_6w2j9bx7xvu6h5268a54a used for this statement


34 rows selected.

SQL> 

And what do we have in V$SQL?

SQL> select sql_id, child_number, plan_hash_value, executions, is_shareable, sql_plan_baseline 
  2  from v$sql 
  3  where sql_id = 'c4m3t9jwvs3ht';

SQL_ID        CHILD_NUMBER PLAN_HASH_VALUE EXECUTIONS I SQL_PLAN_BASELINE
------------- ------------ --------------- ---------- - ------------------------------
c4m3t9jwvs3ht            0       892089582          1 N
c4m3t9jwvs3ht            2      3947981921          1 Y SQL_PLAN_6w2j9bx7xvu6h5268a54a

2 rows selected.

SQL> 

So we can see that the baseline worked.

We can also see that the baseline, in this example, operated independently of the cardinality feedback so whilst we got our nice plan from the more accurate cardinality, we got the estimates from the default estimate of the collection.

Which is not exactly what I expected.

From Kerry’s post, he managed to get a statement using both a baseline and cardinality feedback as illustrated by his dbms_xplan output of:

Note
-----
   - SQL plan baseline SQLID_0CM4R08VJ075R_1255158658 used for this statement
   - cardinality feedback used for this statement

This is what I expected to get and then I was going to flush the shared pool in the hope of having a baseline and the original non-feedback estimates.

But no matter how many times I re-ran the statement I couldn’t get the cardinality feedback to kick back in.

So, unfortunately, I got to that end result without the all-important bit in the middle. Which is a bit weird. And warrants further investigation at some point.

I thought maybe the MONITOR hint might be the key difference to getting feedback working with baselines but not according to what I’ve tried so far. I couldn’t get the two features to work together.

But the bottom line is that I think this shows that the baseline did not use anything to maintain that those nice accurate estimates from cardinality feedback, only the end result plan.

And that’s backed up by the set of hints in the baseline which doesn’t include any OPT_ESTIMATE.

Suffice to say that this is really fledgling functionality and, as such, you expect subtle and no-so-subtle tweaks along the way as illustrated by the possible change of behaviour with bind variables when comparing my results above with the ones in Tom’s original article.

JF – Join Factorization

Just a quickie. No nice isolated demo etc.

I was just looking at rewriting a query that someone asked me to review.

I was playing with what should be the initial driving logic of a bigger query, changing it from a DECODE to a UNION ALL (or an OR) to make use of a FILTER condition.

i.e. from something like:

SELECT t.*
FROM   yyyyy c
,      xxxxx t
WHERE  c.trga_code   = 'LON'
AND    t.cmpy_num    = c.cmpy_num
AND    DECODE(:b1,'S',t.value_date,t.tran_date) >= TO_DATE(:b2,'YYYYMMDD')
AND    DECODE(:b1,'S',t.value_date,t.tran_date) <= TO_DATE(:b4,'YYYYMMDD');

to something not dissimilar to:

SELECT t.*
FROM   yyyyy c
,      xxxxx t
WHERE  c.trga_code   = 'LON'
AND    t.cmpy_num    = c.cmpy_num
AND    :b1           = 'S'
AND    t.value_date >= TO_DATE(:b2,'YYYYMMDD')
AND    t.value_date <= TO_DATE(:b4,'YYYYMMDD')
UNION ALL
SELECT t.*
FROM   yyyyy c
,      xxxxx t 
WHERE  c.trga_code  = 'LON'
AND    t.cmpy_num   = c.cmpy_num
AND    :b1         != 'S'
AND    t.tran_date >= TO_DATE(:b2,'YYYYMMDD')
AND    t.tran_date <= TO_DATE(:b4,'YYYYMMDD');

And I got an unexpected execution plan:

---------------------------------------------------------------------------------------------
| Id  | Operation                              | Name               | Rows  | Pstart| Pstop |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                    | 13271 |       |       |
|*  1 |  HASH JOIN                             |                    | 13271 |       |       |
|*  2 |   INDEX RANGE SCAN                     | YYYYY_3_IDX        |   268 |       |       |
|   3 |   VIEW                                 | VW_JF_SET$49BA79CF | 13304 |       |       |
|   4 |    UNION-ALL                           |                    |       |       |       |
|*  5 |     FILTER                             |                    |       |       |       |
|   6 |      PARTITION LIST ALL                |                    |  6652 |     1 |    11 |
|*  7 |       TABLE ACCESS BY LOCAL INDEX ROWID| XXXXX              |  6652 |     1 |    11 |
|*  8 |        INDEX RANGE SCAN                | XXXXX_16_IDX       |   112K|     1 |    11 |
|*  9 |     FILTER                             |                    |       |       |       |
|  10 |      PARTITION LIST ALL                |                    |  6652 |     1 |    11 |
|* 11 |       TABLE ACCESS BY LOCAL INDEX ROWID| XXXXX              |  6652 |     1 |    11 |
|* 12 |        INDEX RANGE SCAN                | XXXXX_4_IDX        | 67411 |     1 |    11 |
---------------------------------------------------------------------------------------------

That’s not quite what I was expecting – weird.

I was sitting here thinking “what’s that? … vw_jf? vw_jf?… hang on … vw_jf … jf … jf rings a bell… join factorization”.
Bingo.

See the Oracle Optimizer Blog for more information on Join Factorization and where they summarise the feature as

Join factorization is a cost-based transformation.
It can factorize common computations from branches in a UNION ALL query which can lead to huge performance improvement.

Fair enough. But I don’t want it. Not just yet. Let’s turn it off.
(Although I’ve mentioned before that I don’t tend to like being too prescriptive regarding a “correct plan”, in this case, for my purposes, I don’t want it doing that. I might let the CBO reconsider it later once I piece everything back together but for now…)

V$SQL_HINT mentions a NO_FACTORIZE_JOIN hint.

Unfortunately (or perhaps fortunately given the vast array of undocumented hints exposed here), it doesn’t tell us how to use it and I fiddled around for a bit but couldn’t get it to work.

So, the proper approach is to use ALTER SESSION or OPT_PARAM to change the setting of _optimizer_join_factorization (don’t go changing the setting of hidden parameters without the approval of Oracle Support, etc, etc, etc).

Which gave me what I wanted (for now).

--------------------------------------------------------------------------------------------- 
| Id  | Operation                             | Name                | Rows  | Pstart| Pstop | 
--------------------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT                      |                     | 13272 |       |       | 
|   1 |  UNION-ALL                            |                     |       |       |       | 
|*  2 |   FILTER                              |                     |       |       |       | 
|*  3 |    HASH JOIN                          |                     |  6636 |       |       | 
|*  4 |     INDEX RANGE SCAN                  | YYYYY_3_IDX         |   268 |       |       | 
|   5 |     PARTITION LIST SUBQUERY           |                     |  6652 |KEY(SQ)|KEY(SQ)| 
|*  6 |      TABLE ACCESS BY LOCAL INDEX ROWID| XXXXX               |  6652 |KEY(SQ)|KEY(SQ)| 
|*  7 |       INDEX RANGE SCAN                | XXXXX_16_IDX        |   112K|KEY(SQ)|KEY(SQ)| 
|*  8 |   FILTER                              |                     |       |       |       | 
|*  9 |    HASH JOIN                          |                     |  6636 |       |       | 
|* 10 |     INDEX RANGE SCAN                  | YYYYY_3_IDX         |   268 |       |       | 
|  11 |     PARTITION LIST SUBQUERY           |                     |  6652 |KEY(SQ)|KEY(SQ)| 
|* 12 |      TABLE ACCESS BY LOCAL INDEX ROWID| XXXXX               |  6652 |KEY(SQ)|KEY(SQ)| 
|* 13 |       INDEX RANGE SCAN                | XXXXX_4_IDX         | 67411 |KEY(SQ)|KEY(SQ)| 
--------------------------------------------------------------------------------------------- 

P.S. This is not meant to be an investigation into the whys and wherefores of why JF was picked and whether it was a good choice, nor any reflection of whether JF is a good thing and whether there are any issues with it.

Further sources on Join Factorization:
Oracle Optimizer Blog – Join Factorization
Jože Senegačnik – Query Transformations
Timur Akhmadeev – Join Factorization

Further information on those Internal View names:
Jonathan Lewis – Internal Views

Index Skip Scan skipping specified column

Everything I can remember reading about the Index Skip Scan access path has always talked about it being used in queries where the leading column in the index is not specified in a query AND where that leading column has a relatively low number of distinct values in that leading column.

For example:

SQL> drop table t1;

Table dropped.

SQL> 
SQL> create table t1
  2  as
  3  select case when mod(rownum,2) = 0
  4              then 'Y'
  5              else 'N'
  6         end flagyn
  7  ,      rownum id
  8  ,      rpad(rownum,200,'0') col2
  9  from   dual 
 10  connect by rownum <=100000;

Table created.

SQL> 
SQL> create index i1 on t1 (flagyn, id);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(USER,'T1');

PL/SQL procedure successfully completed.

SQL> explain plan for
  2  select *
  3  from   t1
  4  where  id   = 3222;

Explained.

SQL>         
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------

-------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |   208 |     4   (0)|
|   1 |  TABLE ACCESS BY INDEX ROWID| T1   |     1 |   208 |     4   (0)|
|*  2 |   INDEX SKIP SCAN           | I1   |     1 |       |     3   (0)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------

   2 - access("ID"=3222)
       filter("ID"=3222)


18 rows selected.

SQL> 

So, I was slightly surprised to the see this access path being used on a query where the leading column WAS specified. I can’t remember having noticed this before in similar circumstances.

Here is the output from the actual situation (test case not supplied).

SQL> explain plan for
  2  SELECT  *
  3  FROM   daily_accr_pos accr
  4  WHERE  accr.inst_num       = 243961
  5  AND    accr.accr_date      >= to_date(20110221120000,'YYYYMMDDHHMISS')
  6  AND    accr.accr_date      <= to_date(20110303120000,'YYYYMMDDHHMISS');

Explained.

SQL>         
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name                 | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                      |     1 |    86 |    11  (19)|
|   1 |  TABLE ACCESS BY INDEX ROWID| DAILY_ACCR_POS       |     1 |    86 |    11  (19)|
|*  2 |   INDEX SKIP SCAN           | DAILY_ACCR_POS_8_IDX |     1 |       |    10  (20)|
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------

   2 - access("ACCR"."ACCR_DATE">=TO_DATE(' 2011-02-21 12:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "ACCR"."INST_NUM"=243961 AND "ACCR"."ACCR_DATE"<=TO_DATE('
              2011-03-03 12:00:00', 'syyyy-mm-dd hh24:mi:ss'))
       filter("ACCR"."INST_NUM"=243961)

20 rows selected.

Note that this index is defined on (ACCR_DATE, INST_NUM), both of which are supplied literals in the query above and also noted as ACCESS PREDICATES in the predicate section.

Some extra metadata about the index:

SQL> select index_name iname
  2  ,      uniqueness uq
  3  ,      blevel
  4  ,      leaf_blocks num_lf
  5  ,      distinct_keys dst_ky
  6  ,      avg_leaf_blocks_per_key avg_lf
  7  ,      avg_data_blocks_per_key avg_dt 
  8  ,      clustering_factor cf
  9  ,      num_rows  
 10  from   dba_indexes 
 11  where  index_name = 'DAILY_ACCR_POS_8_IDX';

INAME                UQ            BLEVEL     NUM_LF     DST_KY     AVG_LF     AVG_DT         CF
-------------------- --------- ---------- ---------- ---------- ---------- ---------- ----------
  NUM_ROWS
----------
DAILY_ACCR_POS_8_IDX NONUNIQUE          3     167844   30386593          1          1   44902638
  50323152


SQL> 
SQL> select column_name
  2  ,      column_position
  3  from   dba_ind_columns 
  4  where  index_name = 'DAILY_ACCR_POS_8_IDX';

COLUMN_NAME     COLUMN_POSITION
--------------- ---------------
ACCR_DATE                     1
INST_NUM                      2

SQL> 
SQL> select column_name
  2  ,      data_type
  3  ,      num_distinct num_dis
  4  ,      low_value, display_raw(low_value, data_type) lo_val
  5  ,      high_value, display_raw(high_value, data_type) hi_val
  6  ,      density
  7  ,      num_nulls
  8  from   dba_tab_columns 
  9  where  table_name = 'DAILY_ACCR_POS' 
 10  and    column_name in ('ACCR_DATE','INST_NUM');

COLUMN_NAME     DATA_TYP    NUM_DIS LOW_VALUE       LO_VAL          HIGH_VALUE      HI_VAL
--------------- -------- ---------- --------------- --------------- --------------- ---------------
   DENSITY  NUM_NULLS
---------- ----------
INST_NUM        NUMBER        91824 C112            17              C3646453        999982
 .00001089          0

ACCR_DATE       DATE           5693 77C30B07010101  07-NOV-1995     78950C1F010101  31-DEC-2049
.000175654          0


SQL> 

SQL_PLAN – other_xml

Just a quick follow-up to my previous post on investigating bad plans using AWR & ASH.

In that article, I mentioned

  1. that it was a shame that the ACCESS_PREDICATES and FILTER_PREDICATES columns were not populated (how this cannot be classified as a bug is beyond comprehension) and
  2. that I was using DBA_HIST_SQLBIND to investigate various peeked binds for that particular SQL based on the LAST_CAPTURED column

However, I should also have mentioned that, from 10gR2 onwards, populated in both V$SQL_PLAN and DBA_HIST_SQL_PLAN, there is a very useful column called OTHER_XML.

From the documentation:

OTHER_XML: “Provides extra information specific to an execution step of the execution plan. The content of this column is structured using XML because it allows multiple pieces of information to be stored, including the following:

  • Name of the schema against which the query was parsed.
  • Release number of the Oracle Database that produced the explain plan.
  • Hash value associated with the execution plan.
  • Name (if any) of the outline or the SQL profile used to build the execution plan.
  • Indication of whether or not dynamic sampling was used to produce the plan.
  • The outline data, a set of optimizer hints that can be used to regenerate the same plan.

Here is a sample of the peeked bind section from my sql with the problem plan:


  <peeked_binds>
    <bind nam=”:B3″ pos=”3″ dty=”2″ pre=”0″ scl=”0″ mxl=”22″>c3023102</bind>
    <bind nam=”:B2″ pos=”4″ dty=”1″ csi=”873″ frm=”1″ mxl=”32″>524d46</bind>
    <bind nam=”:B1″ pos=”5″ dty=”12″ mxl=”7″>786d0a1f010101</bind>
    <bind nam=”:B3″ pos=”13″ ppo=”3″ dty=”2″ pre=”0″ scl=”0″ mxl=”22″/>
    <bind nam=”:B2″ pos=”14″ ppo=”4″ dty=”1″ csi=”873″ frm=”1″ mxl=”32″/>
    <bind nam=”:B1″ pos=”15″ ppo=”5″ dty=”12″ mxl=”7″/>
    <bind nam=”:B3″ pos=”16″ ppo=”3″ dty=”2″ pre=”0″ scl=”0″ mxl=”22″/>
    <bind nam=”:B1″ pos=”19″ ppo=”5″ dty=”12″ mxl=”7″/>
    <bind nam=”:B3″ pos=”20″ ppo=”3″ dty=”2″ pre=”0″ scl=”0″ mxl=”22″/>
    <bind nam=”:B5″ pos=”21″ ppo=”1″ dty=”2″ pre=”0″ scl=”0″ mxl=”22″/>
    <bind nam=”:B3″ pos=”22″ ppo=”3″ dty=”2″ pre=”0″ scl=”0″ mxl=”22″/>
    <bind nam=”:B6″ pos=”23″ dty=”2″ pre=”0″ scl=”0″ mxl=”22″>c10d</bind>
    <bind nam=”:B1″ pos=”24″ ppo=”5″ dty=”12″ mxl=”7″/>
    <bind nam=”:B6″ pos=”26″ ppo=”23″ dty=”2″ pre=”0″ scl=”0″ mxl=”22″/>
    <bind nam=”:B1″ pos=”27″ ppo=”5″ dty=”12″ mxl=”7″/>
    <bind nam=”:B1″ pos=”29″ ppo=”5″ dty=”12″ mxl=”7″/>
    <bind nam=”:B3″ pos=”31″ ppo=”3″ dty=”2″ pre=”0″ scl=”0″ mxl=”22″/>
    <bind nam=”:B6″ pos=”32″ ppo=”23″ dty=”2″ pre=”0″ scl=”0″ mxl=”22″/>
    <bind nam=”:B1″ pos=”33″ ppo=”5″ dty=”12″ mxl=”7″/>
  </peeked_binds>

I’m not sure if the repetition of bind names and positions but not values is normal – one for further investigation.

But bind variable 1 is what I was talking about previously as my date bind with the data skew and we can convert the values back out of their raw value using, for example, the overloaded DBMS_STATS.CONVERT_RAW_VALUE.

So, from above, taking my B1 bind of “786d0a1f010101″:

declare
l_date date;
begin
dbms_stats.convert_raw_value(rawval => '786d0a1f010101', resval => l_date);
dbms_output.put_line('dt: '||to_char(l_date,'DD/MM/YYYY HH24:MI:SS'));
end;
/

dt: 31/10/2009 00:00:00

Some other posts regarding OTHER_XML:
Kerry Osborne using OTHER_XML to create test scripts with bind variables
Randolf Geist using OTHER_XML to get outlines / create profiles

Top 3 Oracle Features of the Decade

Yesterday I made an early break for the annual New Year resolutions post.

However, there’s another end-of-year post that’s up for consideration in this particular year and, rather than looking forward to the coming year, that’s a look back over the past 10, 2000-2010.

So, I ask you what are your top 3 Oracle RDBMS features over the past decade.

Oracle 8i was 1999 so we’re really talking about:

  1. 9iR1 in 2001
  2. 9iR2 in 2002
  3. 10gR1 in 2003
  4. 10gR2 in 2005
  5. 11gR1 in 2007
  6. 11gR2 in 2009

A Top 3 is very, very challenging when you think of some of the enhancements which have come along.

Obviously it depends on your own, personal perspective – naturally, mine are going to have a Development bias.

Each version fixes a significant number of bugs from the previous release, but I seem to remember 8i being a rush job for the buzzwords ahead of the new millenium (the early patches at least), 9iR1 a hatchet job and 9iR2 being a relative bastion of stability.

Then again it’s also a long time ago now and my memories may be unreliable.

The New Features Guides of 9i (links above) lists amongst others:

  • LogMiner improvements, DataGuard, RAC, Flashback Query, Some online redefinition & reorganisation features, VPD enhancements, Automatic Undo Management, dynamic memory management, spfles, RMAN improvements, native XML functionality and XMLDB, the MERGE statement, TIMESTAMPs and INTERVALs, CASE statement, External Tables, associative arrays/index-by tables indexed by VARCHAR2, Streams, CDC, Index Skip Scans, ANSI SQL, OMF, multiple blocksizes, dynamic sampling, table compression, subquery factoring, pipelined table functions, etc.

A list of new features from 10g (or my list at least) is shorter, but even so it seems a bigger hitter in terms of weighty marketing-savvy acronyms and features:

  • ADDM, ASH, ASM, AWR, Automatic SQL Tuning, DataPump, Job Scheduler, SQL Access Advisor, HTMLDB, Online Table Redefinition, Oracle XE, DBMS_XPLAN…

11g has always felt like 10gR3. Nevertheless, there have been still some impressive features therein:

  • Adaptive Cursor Sharing, Result Cache, Database Resident Connection Pool Caching, Invisible Indexes, SecureFiles, Binary XML…

And then recently Exadata and the Oracle Database Machine (although I think it’s too soon to make any judgement on these and if they have a big impact then it will be mostly in the next decade).

For the summaries above, I’ve missed out a lot (probably accidentally – let me know – but I was rushing towards the end).

But I’ve focused on the initial introductions of headline new features.

However, I would argue that the gradual evolution of some of the main features and functionality has had some of the biggest impact as we are now at the end of the decade compared to the end of 1999. In addition, some of the tweaks and internals have also had a big impact whilst avoiding the headlines – mutexes for example.

Think about how the CBO has moved on over the past ten years.

Or how HTMLDB has morphed and moved on to APEX.

Or how parallel capabilities have developed. Or partitoning.

Maybe you like your GUIs like OEM.

Or the evolution of IDEs has been revolutionary for you e.g. SQL Developer, or the explosion of functionality in third party tools like Toad or Quest Code Tester for Oracle.

Or the combination of several features – external tables plus pipelined functions are pretty cool for loading in data, for example. Or partitioning plus parallel.

All of these are up for consideration.

What’s not up for consideration is stuff that predates the decade. There are several features which I thought of while doing this and which turned out to be introduced in 8i or even before. Man, tempus fugit…

So, my Top 3 is based on what I do or use in Oracle most days (or even what I don’t have to do anymore) because of features introduced or evolved over the past decade. The evolution of collection functionality in SQL and PLSQL came close. A bit further behind was some of the XML capabilities. A whole host of ineligible features from older versions were scrubbed out. However, in no particular order, my vote goes to:

  • Analytics – It’s amazing how many time I end up using analytic functions. As Tom Kyte says “Analytics rock, Analytics roll”. Countless multipasses of data have been avoided with these babies.(Analytics ruled ineligible as an 8i introduction, dammit)
  • The WITH clause aka Subquery Factoring – Most of my sql, if it has any sort of complexity, ends up using this feature; fantastic for breaking up and understanding a complex bit of sql that you’ve never seen before; great at encouraging you to think in SETS.
  • AWR / ASH / Wait model evolution – Shame AWR & ASH are licensed separately and there are alternatives for earlier Oracle versions or if you don’t want to fork out the big bucks. But so useful for diagnosing performance problems, particularly retrospectively and/or identifying session problems which might have drowned previously in a system-level Statspack.
  • DBMS_XPLAN – Another feature that I use day in, day out. So easy to get explain plans, actual plans from memory or from AWR. Invaluable!

I would imagine that if you were a production DBA then the evolution of RMAN and Flashback has been fantastically useful and time saving.

These are not things that I’ve tended to use much in my role.

But if something’s revolutionised your decade, let me know.

_replace_virtual_columns = false

A follow-up to my post yesterday.

A quick recap on the situation:

A sql statement such as this:


SELECT col1
FROM my_schema.table1 tab1
WHERE …
AND NVL(col2,’UNASSIGNED’) = SYS_CONTEXT(’MY_CONTEXT’,'MY_COL2_VALUE’)

is raising an error such as this:


ORA-00904: “MY_SCHEMA”.”TAB1″.”SYS_NC00062$”: invalid identifier

where SYS_NC00062$ is not a column directly specified in the sql but a virtual column related to a function-based index.

This matches a bug which is currently under development and not yet published (Bug 7026584).

The explanation is that the 11g optimizer uses a new mechanism to work with function-based indexes at parse time, a new mechanism which as yet not fully robust.


The problem comes when virtual column processing is done from qkacol where it clones the predicates and later tries to resolve the copied columns in the predicate with the supplied fro. The root cause here, the unnesting driver generates an alias for table and adds it for a correlated column. [sic]

In addition to the 10046 and 10053 trace files, a 904 errorstack trace file provided the information required:


alter system set events='904 trace name errorstack';
...
alter system set events='904 trace name errorstack off';

Oracle’s currently (at the time of writing this) recommended solution to that bug to set undocumeted parameter _replace_virtual_columns to false.

I’m still awaiting an explanation on that cost_io virtual column adjustment.

Follow

Get every new post delivered to your Inbox.

Join 62 other followers