Outer Join with OR and Lateral View Decorrelation

Use of ANSI SQL is a personal thing.

Historically I have not been a fan apart from where it makes things easier/possible.

This reticence was mainly due to optimizer bugs and limitations in the earlier days.

Recently I have been using it much more because I find that the developers I interact with prefer it / understand it better.

You might/should be aware that Oracle will rewrite ANSI SQL to an Oracle syntax representation, this transformation being listed in the optimizer trace file.

You might/should also be aware that Oracle outer join syntax does not allow OR or IN:

drop table t1;
drop table t2;

create table t1
as
select floor((rownum+1)/2) col1
,      case when mod(rownum,2) = 0 then 1 else 2 end col2
,      10 col3
from   dual
connect by rownum <= 20;

create table t2
as
select rownum col1
,      case when mod(rownum,2) = 0 then 2 else 1 end col3
from   dual
connect by rownum <= 10;

select *
from   t1
,      t2
where  t1.col1 = t2.col1 (+) 
and  ((t1.col2 = 1
and    t2.col3 (+) > t1.col3)
or    (t1.col2 = 2
and    t2.col3 (+) < t1.col3));

ORA-01719: outer join operator (+) not allowed in operand of OR or IN

ANSI SQL remedies this:

alter session tracefile_identifier = 'domlg1';
alter session set events 'trace[rdbms.SQL_Optimizer.*]';
select *
from   t1
left join t2
on    t1.col1 = t2.col1
and ((t1.col2 = 1
and   t2.col3 > t1.col3)
or   (t1.col2 = 2
and   t2.col3 < t1.col3));

alter session set events 'trace off';

But it comes at a price.

Note the execution plan:

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    20 |  1300 |    42   (0)| 00:00:01 |
|   1 |  NESTED LOOPS OUTER |      |    20 |  1300 |    42   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL | T1   |    20 |   780 |     2   (0)| 00:00:01 |
|   3 |   VIEW              |      |     1 |    26 |     2   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |     1 |    26 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("T1"."COL1"="T2"."COL1" AND ("T1"."COL2"=1 AND
              "T2"."COL3">"T1"."COL3" OR "T1"."COL2"=2 AND "T2"."COL3"<"T1"."COL3"))   

Now, maybe you will have better luck than me but no matter what I try I cannot change the NESTED LOOPS OUTER operation (I could rewrite it to do two joins rather than one and remove the OR).

So, if that lateral view involves some full table scans or other significant operations, they might be very expense on the outer operation of a nested loop.

The reason is in the optimizer trace.

Query after View Removal
******* UNPARSED QUERY IS ********
SELECT "T1."COL1" "COL1", "T1."COL2" "COL2", "T1."COL3" "COL3", "VW_LAT_AE9E49E8"."ITEM_1_0" "COL1", "VW_LAT_AE9E49E8"."ITEM_2_1" "COL3" FROM "DOM"."T1" "T1", LATERAL( (SELECT "T2"."COL1" "ITEM_1_0", "T2"."COL3" "ITEM_2_1" FROM "DOM"."T2" "T2" WHERE "T1"."COL1"="T2"."COL1" AND ("T1"."COL2"=1 AND "T2"."COL3">"T1"."COL3" OR "T1"."COL2"=2 AND "T2"."COL3" < "T1"."COL3"))) (+) "VW_LAT_AE9E49E8"
DCL:Checking validity of lateral view decorrelation SEL$BCD4421C (#1)
DCL: Bypassed: view has non-well-formed predicate
DCL: Failed decorrelation validity for lateral view block SEL$BCD4421C (#1)

The OR prevents the decorrelation which seems to mean that we’re stuck with a NESTED LOOP for now.

Further Reading on ANSI:
Oracle Optimizer Blog
Jonathan Lewis on ANSI Outer
Jonathan Lewis on ANSI

Did it really fix it 1: OFE & nasty FIRST_ROWS plan on 11.2.0.4

Plan degradations on upgrade are normal.

This one’s no different.

On further investigation, turned out application was setting optimizer_mode = first_rows somewhere.

First point about this is that first_rows really shouldn’t be used either as a hint or as an optimizer_mode.

What does FIRST_ROWS mean?
From 11g doco:

The optimizer uses a mix of costs and heuristics to find a best plan for fast delivery of the first few rows.

If any sort of FIRST_ROWS type intervention is required, then it might be better to be using a more modern, fully-costed FIRST_ROWS_n (where n = 1 | 10 | 100 | 1000).
If it’s even required.

The 12c documentation is clearer about this:

FIRST_ROWS is available for backward compatibility and plan stability; use FIRST_ROWS_n instead.

But I don’t think we should attach too much credence to that statement not being in the 11g documentation.

Here’s the problem plan.

Pre upgrade, snippet of the plan looked as per below.
Please note position of T1.

-------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                             |    34 | 13668 |  3003   (1)| 00:00:22 |       |       |
|   1 |  SORT ORDER BY                              |                             |    34 | 13668 |  3003   (1)| 00:00:22 |       |       |
|   2 |   NESTED LOOPS OUTER                        |                             |    34 | 13668 |  3002   (1)| 00:00:22 |       |       |
|   3 |    NESTED LOOPS OUTER                       |                             |    34 | 12648 |  2968   (1)| 00:00:22 |       |       |
|   4 |     NESTED LOOPS OUTER                      |                             |    34 | 12274 |  2935   (1)| 00:00:22 |       |       |
|   5 |      NESTED LOOPS OUTER                     |                             |    34 | 11900 |  2867   (1)| 00:00:21 |       |       |
|   6 |       NESTED LOOPS OUTER                    |                             |    34 | 11628 |  2833   (1)| 00:00:21 |       |       |
|   7 |        NESTED LOOPS OUTER                   |                             |    34 | 10404 |  2799   (1)| 00:00:21 |       |       |
|   8 |         NESTED LOOPS OUTER                  |                             |    34 |  8228 |  2765   (1)| 00:00:20 |       |       |
|   9 |          NESTED LOOPS                       |                             |    34 |  6732 |  2731   (1)| 00:00:20 |       |       |
|  10 |           NESTED LOOPS                      |                             |    69 | 12282 |  2524   (1)| 00:00:19 |       |       |
|  11 |            NESTED LOOPS                     |                             |    69 |  9591 |  2386   (1)| 00:00:18 |       |       |
|* 12 |             TABLE ACCESS FULL               | T1                          |  1658 |   213K|   725   (3)| 00:00:06 |       |       |
|* 13 |             TABLE ACCESS BY INDEX ROWID     | T2                          |     1 |     7 |     1   (0)| 00:00:01 |       |       |
|* 14 |              INDEX UNIQUE SCAN              | T2_U1                       |     1 |       |     0   (0)| 00:00:01 |       |       |
|  15 |            TABLE ACCESS BY INDEX ROWID      | T3                          |     1 |    39 |     2   (0)| 00:00:01 |       |       |
|* 16 |             INDEX UNIQUE SCAN               | T3_U1                       |     1 |       |     1   (0)| 00:00:01 |       |       |
|  17 |           TABLE ACCESS BY GLOBAL INDEX ROWID| T4                          |     1 |    20 |     3   (0)| 00:00:01 | ROWID | ROWID |
|* 18 |            INDEX RANGE SCAN                 | T4_N1                       |     1 |       |     2   (0)| 00:00:01 |       |       |

Post upgrade, plan has degraded to:

--------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name                        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                             |    69 | 27738 |   508M  (3)|999:59:59 |       |       |
|   1 |  NESTED LOOPS OUTER                          |                             |    69 | 27738 |   508M  (3)|999:59:59 |       |       |
|   2 |   NESTED LOOPS OUTER                         |                             |    69 | 26979 |   508M  (3)|999:59:59 |       |       |
|   3 |    NESTED LOOPS OUTER                        |                             |    69 | 26220 |   508M  (3)|999:59:59 |       |       |
|   4 |     NESTED LOOPS OUTER                       |                             |    69 | 23184 |   508M  (3)|999:59:59 |       |       |
|   5 |      NESTED LOOPS OUTER                      |                             |    69 | 22632 |   508M  (3)|999:59:59 |       |       |
|   6 |       NESTED LOOPS OUTER                     |                             |    69 | 20562 |   508M  (3)|999:59:59 |       |       |
|   7 |        NESTED LOOPS OUTER                    |                             |    69 | 16146 |   508M  (3)|999:59:59 |       |       |
|   8 |         NESTED LOOPS                         |                             |    69 | 13662 |   508M  (3)|999:59:59 |       |       |
|   9 |          NESTED LOOPS                        |                             |  1675 |   312K|   508M  (3)|999:59:59 |       |       |
|  10 |           NESTED LOOPS                       |                             |   703K|    39M|  3476K  (1)| 06:54:55 |       |       |
|* 11 |            TABLE ACCESS BY GLOBAL INDEX ROWID| T4                          |   703K|    13M|  2068K  (1)| 04:06:54 | ROWID | ROWID |
|  12 |             INDEX FULL SCAN                  | T4_N2                       |  2240K|       |  7943   (1)| 00:00:57 |       |       |
|  13 |            TABLE ACCESS BY INDEX ROWID       | T3                          |     1 |    39 |     2   (0)| 00:00:01 |       |       |
|* 14 |             INDEX UNIQUE SCAN                | T3_U1                       |     1 |       |     1   (0)| 00:00:01 |       |       |
|* 15 |           TABLE ACCESS FULL                  | T1                          |     1 |   132 |   718   (3)| 00:00:06 |       |       |

In the post-upgrade plan, the most pertinent part was step 15.

A FULL TABLE SCAN as the inner rowsource of a NESTED LOOP, in other words for every row in the driving rowsource do a full table scan.”

Standard reaction to such a degradation is to test the SQL with OPTIMIZER_FEATURES_ENABLE (OFE).

With OFE set either via OPT_PARAM hint or via ALTER SESSION then the SQL reverted to pre-upgrade plan.

This is good!

But… how much confidence can we have that this actually fixed the problem?

It would be great if we could find a bug which was definitively responsible for this change in execution plan.

OFE is an umbrella but it doesn’t turn off ALL fixes/optimizer features in a release.

Many of those fixes covered by the OFE umbrella are represented in V$SYSTEM_FIX_CONTROL

So, I went looking for anything which sounded related and went back to default 11.2.0.4 OFE and starting turning off the BUGNOs one-by-one – no difference, still a bad plan

Then still at OFE 11.2.0.4, I turned off all the BUGNOs…. no difference, still a bad plan.

At this point, my confidence in OFE as a fix for this problem was starting to waiver.

So, I tried to reduce the problem SQL to it’s smallest form representing the problem.

alter session set optimizer_mode = first_rows;
alter session set optimizer_features_enable = '11.2.0.4';

 explain plan for
 SELECT *
 FROM   T1
 ,      T3
 ,      T4
 WHERE  T1.COL1             = T3.COL1
 AND    T3.COL2             = T4.COL2
 AND    T1.DATE_COL        '+OUTLINE'));

This distilled statement reproduced the issue at the heart of the original statement:

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |         |  6029 |  4533K|  1612M  (3)|999:59:59 |       |       |
|   1 |  NESTED LOOPS                        |         |  6029 |  4533K|  1612M  (3)|999:59:59 |       |       |
|   2 |   NESTED LOOPS                       |         |  2237K|   795M|  6551K  (1)| 13:01:53 |       |       |
|   3 |    TABLE ACCESS BY GLOBAL INDEX ROWID| T4      |  2240K|   188M|  2068K  (1)| 04:06:53 | ROWID | ROWID |
|   4 |     INDEX FULL SCAN                  | T4_N2   |  2240K|       |  7943   (1)| 00:00:57 |       |       |
|   5 |    TABLE ACCESS BY INDEX ROWID       | T3      |     1 |   285 |     2   (0)| 00:00:01 |       |       |
|*  6 |     INDEX UNIQUE SCAN                | T3_U1   |     1 |       |     1   (0)| 00:00:01 |       |       |
|*  7 |   TABLE ACCESS FULL                  | T1      |     1 |   397 |   718   (3)| 00:00:06 |       |       |
----------------------------------------------------------------------------------------------------------------

Is this test case a 100% indication that OFE wasn’t applicable to the original problem?
Not sure.
But at this point I had littleconfidence that OFE provided any sort of reliable fix to the original SQL statement.

I don’t know what the internal heuristics are/were for FIRST_ROWS.

But that is a stupid place for a full table scan regardless of the number of rows expected to return.

If we remove one of the T1 predicates above and the various row estimates go up then the FTS moves to a driving position.

I could/should get a 10053 trace but FIRST_ROWS shouldn’t be in use in the first place.

Solution – hunt down where the app code is doing the FIRST_ROWS thing and exterminate.

All the other possible OPTIMIZER_MODE modes gave a plan for the original SQL which started like this:

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |      |    35 | 14105 |  1386   (2)| 00:00:10 |       |       |
|   1 |  SORT ORDER BY                              |      |    35 | 14105 |  1386   (2)| 00:00:10 |       |       |
|   2 |   NESTED LOOPS OUTER                        |      |    35 | 14105 |  1385   (2)| 00:00:10 |       |       |
|*  3 |    HASH JOIN OUTER                          |      |    35 | 12565 |  1350   (2)| 00:00:10 |       |       |
|*  4 |     HASH JOIN OUTER                         |      |    35 | 11305 |  1315   (2)| 00:00:10 |       |       |
|   5 |      NESTED LOOPS OUTER                     |      |    35 |  9065 |  1280   (2)| 00:00:10 |       |       |
|   6 |       NESTED LOOPS OUTER                    |      |    35 |  8015 |  1245   (2)| 00:00:09 |       |       |
|   7 |        NESTED LOOPS OUTER                   |      |    35 |  7735 |  1210   (2)| 00:00:09 |       |       |
|   8 |         NESTED LOOPS OUTER                  |      |    35 |  7350 |  1176   (2)| 00:00:09 |       |       |
|   9 |          NESTED LOOPS                       |      |    35 |  6965 |  1106   (2)| 00:00:08 |       |       |
|  10 |           NESTED LOOPS                      |      |    69 | 12351 |   899   (3)| 00:00:07 |       |       |
|* 11 |            HASH JOIN                        |      |    69 |  9660 |   761   (3)| 00:00:06 |       |       |
|* 12 |             TABLE ACCESS FULL               | T2   |   237 |  1659 |    36   (6)| 00:00:01 |       |       |
|* 13 |             TABLE ACCESS FULL               | T1   |  1671 |   217K|   725   (3)| 00:00:06 |       |       |

So…. when optimizer upgrade problems strike, you should always try OFE.

But it’s also worth further investigation to confirm that OFE really is a reliable fix for the problem.

Plan Instability

There seems to me to be a relatively simple choice.

Either you except that the Oracle Optimizer has a wealth of complicated strategies and, in this complex effort to get the best executions it can, will inevitably either get it wrong sometimes (or take some extra executions to realise it’s wrong).

Or you stick your head in the sand and raise a bug for every unexpected poor execution or plan flip.

But let’s say that above else you wanted Plan Stability.

This is an widespread desire.

What would be the best strategy?
And to what lengths would you take it?

SQL Plan Management features are designed to give you some degree of stability.

You could baseline statements which have given you a problem in the past.

What if that is not deemed adequate?

So, on the one hand, you could baseline everything you could find.

Taking a long history of AWR and taking regularly snaps of V$SQL, you could put all in a SQL Plan Baseline.

But, depending on your application, you might regularly get new SQL appear unprotected by a baseline.
In one Hibernate application I know, a change in the application domain model often results in the table aliases changing, meaning that there is lots of brand new, but very familiar, SQL.
So, you then become dependant on having a reliable testing infrastructure which will generate nearly all the SQL you’re likely to get in advance of production.

In addition, you might want multiple plans in your baseline – for ACS for example – and then, once that extra bindset awareness is flushed out of memory, you then need a couple of executions to rediscover it (unless you use a SQL Patch to inject BIND_AWARE into the specific sql statements).

It’s certainly no guarantee of stability.

What is the alternative?

I like to argue that most problematic executions stem from two sources:

  1. Badly written SQL
  2. Using poor combinations of features – e.g. bind variable peeking + histograms

The other day I made a comment to someone that “if you’re heavily reliant on ACS, perhaps you should rethink whether you should be using literals rather than bind variables”.

So, you might then take the position that plan instability stems from increasing complexity in the optimizer.

In which case, maybe a viable strategy might be to turn off much of the complexity:

  • Bind Variable Peeking
  • Cardinality feedback
  • Adaptive Cursor Sharing
  • Dynamic Sampling
  • Adaptive direct path reads
  • 12c adaptive execution plans
  • Anything with the word “adaptive” in it?
  • Default Stats job and/or default method_opt histogram behaviour

This seems quite radical to many. Perhaps justifiably.

Perhaps it would be nice if there were new optimizer strategies – OPTIMIZER_MODE? – perhaps “AGGRESSIVE_PERFORMANCE” (maybe “ADAPTIVE_PERFORMANCE_WITH_THE_ODD_HICCUP” ?) vs “PERFORMANCE_STABILITY” which were an umbrella of many of the features above.

Thoughts?

To what extent are you concerned with performance stability, above all else?

How many of the optimizer features in each release are actually aligned with that widespread concern?

An example of where the optimizer should ignore a hint

Here’s an example of where the optimizer really should ignore a hint.

I’m testing an implementation of deliberately unusable index partitions – i.e. some unusable index partitions, some usable.

This is 11.2.0.3 so whilst there are enhancements in 12c, they’re no use here.

SQL> create table t1
  2  (col1 number)
  3  partition by list(col1)
  4  (partition p0 values(0),
  5   partition pdef values(default));

Table created.

SQL> create index i1 on t1 (col1) local unusable;

Index created.

SQL> alter index i1 rebuild partition pdef;

Index altered.

SQL> select * from t1;

no rows selected

SQL> select * from t1 partition (p0);

no rows selected

SQL> select /*+ index(t1 i1) */ * from t1 partition (p0);

no rows selected

SQL> select * from t1 where col1 = 0;

no rows selected

SQL> select /*+ index(t1 i1) */ * from t1 where col1 = 0;
select /*+ index(t1 i1) */ * from t1 where col1 = 0
*
ERROR at line 1:
ORA-01502: index 'I1' or partition of such index is in unusable state

SQL>

ORs, IN lists and LNNVL

I’ve previously written about manually rewriting an OR condition into a UNION ALL using LNNVL.

This is a description of a performance issue observed in the real world from the optimizer coming up with a CONCATENATION operation against many child operations including an INLIST operator and other children which then have to use LNNVL as a filter.

Really this is nothing more than an illustration of our old performance killing friend the function call from within SQL, particularly from within predicates, and the problem of context switching.

Ok. So, the real world example is a crazy query.
It contains 1091 OR clauses each stipulating a unique key lookup.
Yes, “1091” OR clauses.
I know, I know.
Probably itself a workaround to avoid the 1000 limit on an IN list, who knows?
Anyway.

I’m going to use a table just to illustrate the execution plan and then use some metrics from the execution of the real world example.

drop table t1;

create table t1
(col1 number
,col2 number
,col3 varchar2(10)
,constraint pk_t1 primary key (col1, col2));

insert into t1
select mod(rownum,10000)+1 col1
,      ceil(rownum/10000)  col2
,      rpad('X',10,'X')    col3
from   dual
connect by rownum <= 100000
order by col1, col2;

commit;

If we use a simple multi-column OR condition, we don’t get the desired LNNVLs.

explain plan for
select *
from   t1
where (col1 = 1
and    col2 = 1)
or    (col1 = 2
and    col2 = 2)
or    (col1 = 3
and    col2 = 3)
or    (col1 = 3
and    col2 = 4);

select * from table(dbms_xplan.display);

The optimizer just uses an INLIST iterator:

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |  1129 | 37257 |     5   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |  1129 | 37257 |     5   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | PK_T1 |     7 |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2 OR "COL1"=3 AND 
              "COL2"=3 OR "COL1"=3 AND "COL2"=4))

However, if we combine an IN list with some of those ORs:

explain plan for
select *
from   t1
where (col1 = 1
and    col2 = 1)
or    (col1 = 2
and    col2 = 2)
or    (col1 = 3
and    col2 IN (3,4));

select * from table(dbms_xplan.display);

Which gives

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |  1140 | 37620 |     2   (0)| 00:00:01 |
|   1 |  CONCATENATION                |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1    |    11 |   363 |     1   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN           | PK_T1 |     1 |       |     2   (0)| 00:00:01 |
|   4 |   INLIST ITERATOR             |       |       |       |            |          |
|   5 |    TABLE ACCESS BY INDEX ROWID| T1    |  1129 | 37257 |     1   (0)| 00:00:01 |
|*  6 |     INDEX UNIQUE SCAN         | PK_T1 |     1 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("COL1"=3)
       filter("COL2"=3 OR "COL2"=4)
   6 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2))
       filter(LNNVL("COL1"=3) OR LNNVL("COL2"=3) AND LNNVL("COL2"=4))

We get the CONCATENATION of the simpler OR conditions which are combined into a single INLIST ITERATOR operations with another child operation which applies LNNVLs in the filter predicate.

Now imagine we have a much larger number of the simpler OR predicates combined with liberal scatterings of such IN clauses throughout our 1091 OR’d predicates.

Perhaps we’d be a bit concerned about the context switching and increased cpu usage?

Looking at my real world example right here… there’s no point me pasting in the thousands of lines.

If I run the real world behemoth:

Client elapsed time 108 seconds

DBMS_XPLAN.DISPLAY_CURSOR tells me it ran in 1.26 seconds:

plan hash value: 4262066066  
 
-------------------------------------------------------------------------------------------------------------------- 
| Id  | Operation                     | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | 
--------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                 |      1 |        |   1628 |00:00:01.26 |    3352 |    418 | 
|   1 |  CONCATENATION                |                 |      1 |        |   1628 |00:00:01.26 |    3352 |    418 |  
|   2 |   INLIST ITERATOR             |                 |      1 |        |    966 |00:00:00.94 |    2735 |    389 | 
|   3 |    TABLE ACCESS BY INDEX ROWID| XXXXXXXXXX      |    966 |    624 |    966 |00:00:00.94 |    2735 |    389 | 
|*  4 |     INDEX RANGE SCAN          | XXXXXXXXXX_UK01 |    966 |    624 |    966 |00:00:00.86 |    1915 |    380 | 
|   5 |   TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX      |      1 |      1 |      2 |00:00:00.01 |       4 |      0 |  
|*  6 |    INDEX RANGE SCAN           | XXXXXXXXXX_UK01 |      1 |      1 |      2 |00:00:00.01 |       3 |      0 |  
....
| 253 |   TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX      |      1 |      2 |     47 |00:00:00.01 |      14 |      1 |  
|*254 |    INDEX RANGE SCAN           | XXXXXXXXXX_UK01 |      1 |      2 |     47 |00:00:00.01 |       3 |      1 |   
--------------------------------------------------------------------------------------------------------------------  

Predicate Information (identified by operation id):
--------------------------------------------------- 
   4 - access(((("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722) OR             
              ("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795) OR ("XXXXXXXXXX"."VERSION"=1 AND                                       
              "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863) OR ("XXXXXXXXXX"."VERSION"=1 AND
              "XXXXXXXXXX"."ID_XXXXXXXXXX"=1612023) OR ("XXXXXXXXXX"."VERSION"=1 AND 			  .....
  6 - access "XXXXXXXXXX"."ID_XXXXXXXXXX"=2046939) 
       filter(((LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722)) AND 
           (LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795)) AND 
              (LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863)) AND 
.....

DBMS_MONITOR tells me the query ran in 1.68 seconds:

Global Information
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  2                   
 Session ID          :  1220                
 SQL ID              :  5fvt3tfbgmqp3       
 SQL Execution ID    :  33554432            
 Plan Hash Value     :  4262066066          
 Execution Started   :  04/05/2013 14:53:42 
 First Refresh Time  :  04/05/2013 14:53:42 
 Last Refresh Time   :  04/05/2013 14:53:45 

--------------------------------------------------------------------
| Elapsed |   Cpu   |    IO    | Cluster  | Fetch | Buffer | Reads |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |  Gets  |       |
--------------------------------------------------------------------
|    1.68 |    0.50 |     1.13 |     0.05 |     4 |   3352 |   418 |
--------------------------------------------------------------------

Fortunately V$SQL tells me the real story:

select sql_id, executions, rows_processed, elapsed_time/1000/1000, cpu_time/1000/1000 
from v$sql where sql_id = '5fvt3tfbgmqp3';

Gives

SQL_ID        EXECUTIONS ROWS_PROCESSED ELAPSED_TIME/1000/1000 CPU_TIME/1000/1000
------------- ---------- -------------- ---------------------- ------------------
5fvt3tfbgmqp3          1           1628              105.81511         104.581102

Then again, if you write nasty code, nasty things can happen.

Sql tuning request

Without knowing anything about the problem in advance, I thought it would be good to do a walkthrough post of a sql tuning request.

But now that I’m done I’m unconvinced as it’s probably too long, the query too meaningless and the real time sql monitoring text output too unreadable in a blog post.

I’m always reluctant to post real, specific application issues because I’m never sure how well they translate and illustrate the desired points unless you convert them to a standalone test case.

Let’s see how it goes.

  • The idea is to touch on the broad strategies that I’m going through.
  • The scope of the solution should match the scope of the problem – so for a single problem query, table design and current indexes, stats and histograms should be considered as set in stone.
  • Ideally we want to avoid hinting as much as possible or at least stick to acceptable hints.
  • And if we’re going to manually intervene, we’re not particularly interested in what the current production plan is nor the usage of any plan stability features to preserve it (unless it’s better than we can do ourselves).

This is from an 11gR2 testing environment, 11.2.0.3 to be specific.

Having done the latest merge of a production code release (9.2.0.8) into our 11gR2 environment (upgrade ETA March), a report has been reported as slow, taking about 1.5 minutes in production and some 20 minutes in the 11gR2 environment.

So, having traced the report and found that the driving query is the prime suspect, let’s get some feedback on the performance of the SQL statement.

Here’s the original sql statement to give some context – I’ve commented out most of the columns because they don’t add much other than just length to the post.

SELECT ... some columns ...,
       ... a function call ...,
       ... some more columns ...
FROM   isbk,
       inst,
       bsta,
       isco,
       sdol,
       borg,
       book,
       isdm,
       rule
WHERE  book.book_num = rule.book_num
AND    rule.rule_type_code = 'IBK'
AND    isbk.inst_num = inst.inst_num
AND    inst.inst_num = bsta.inst_num
AND    inst.inst_num = isco.inst_num
AND    isco.inst_num = isdm.inst_num
AND    isco.xcod_code = 'SEDL'
AND    isco.inst_num = sdol.inst_num
AND    sdol.xcod_code = 'ISIN'
AND    borg.borg_num = inst.issuer_num
AND    book.book_num = isbk.book_num
AND    bsta.current_mat_date >= TO_DATE(v_bus_date, 'yyyymmdd')
UNION ALL
SELECT ... some columns ...
FROM   inst, 
       bsta,
       isco,
       sdol,
       borg,
       isdm,
       inix
WHERE  inst.inst_num = bsta.inst_num
AND    inst.inst_num = isco.inst_num
AND    isco.inst_num = isdm.inst_num
AND    isco.xcod_code = 'SEDL'
AND    isco.inst_num = sdol.inst_num
AND    sdol.xcod_code = 'ISIN'
AND    borg.borg_num = inst.issuer_num
AND    bsta.current_mat_date >= TO_DATE(v_bus_date, 'yyyymmdd')
AND    isco.inst_num = inix.inst_num
AND    inix.xcod_code = 'NIX'
AND    NOT EXISTS (SELECT 1
                   FROM   isbk,
                          rule,
                          book,
                          bsta
                   WHERE  rule.rule_type_code = 'IBK'
                   AND    book.book_num = rule.book_num
                   AND    book.book_num = isbk.book_num
                   AND    inst.inst_num = isbk.inst_num
                   AND    inst.inst_num = bsta.inst_num
                   AND    bsta.current_mat_date >= TO_DATE(v_bus_date, 'yyyymmdd'));

Looking at the query and the repetition of tables and joins between the two UNION ALL parts, it looks like a classic case of two sets of disparate driving data that then need to be joined to the same additional tables.

Ironically, expanding ORs out into UNIONs is a common initial tuning step for performance problems with OR predicates.

Here’s what little extra schema knowledge might be useful:

  • INST has a pk of INST_NUM.
  • ISCO has an n:1 relationship with INST.
  • ISDM has an n:1 relationship with INST.
  • BORG has a 1:1 relationship with INST.ISSUER_NUM.
  • BSTA has a 1:1 relationship with INST.
  • ISBK has a n:1 relationship with INST and a n:1 relationship with BOOK.
  • BOOK has a pk of BOOK_NUM.
  • RULE is a table used for generic filtering rules, in this case related to BOOK.BOOK_NUM.

Let’s use Real-Time SQL Monitoring (usual license caveats apply) to see what the current performance story is.

SELECT dbms_sqltune.report_sql_monitor(<sql_id>) FROM DUAL;

If there’s one downside to RTSM, it’s a bit too wide for these blog posts­čśŽ
Maybe the RTSM pictures would have been better?

Anyway, I’ve sacrificed some of the columns in the report and tried to shrink the font (there’s a scrollbar at the bottom of the plan window).


Global Stats
===========================================================
| Elapsed |   Cpu   | PL/SQL  |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
===========================================================
|    1094 |    1094 |    0.29 |     0.20 |    60 |   257K |
===========================================================

SQL Plan Monitoring Details (Plan Hash Value=1036829859)
===============================================================================================================================
| Id |                Operation                |       Name         |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                                         |                    | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
===============================================================================================================================
|  0 | SELECT STATEMENT                        |                    |         |     1 |    29225 |          |                 |
|  1 |   NESTED LOOPS                          |                    |       1 |     1 |    29225 |          |                 |
|  2 |    NESTED LOOPS                         |                    |       1 |     1 |    29205 |          |                 |
|  3 |     NESTED LOOPS                        |                    |       1 |     1 |    29205 |          |                 |
|  4 |      VIEW                               | VW_JF_SET$AD8EBC08 |    6147 |     1 |    29205 |          |                 |
|  5 |       UNION-ALL                         |                    |         |     1 |    29205 |          |                 |
|  6 |        NESTED LOOPS                     |                    |       9 |     1 |    28719 |          |                 |
|  7 |         NESTED LOOPS                    |                    |      53 |     1 |    28722 |          |                 |
|  8 |          NESTED LOOPS                   |                    |     370 |     1 |    31674 |          |                 |
|  9 |           NESTED LOOPS                  |                    |     370 |     1 |    31674 |          |                 |
| 10 |            NESTED LOOPS                 |                    |       3 |     1 |        1 |          |                 |
| 11 |             INDEX RANGE SCAN            | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 12 |             TABLE ACCESS BY INDEX ROWID | BOOK               |       1 |     1 |        1 |          |                 |
| 13 |              INDEX UNIQUE SCAN          | BOOK_IDX           |       1 |     1 |        1 |          |                 |
| 14 |            INDEX RANGE SCAN             | ISBK_IDX           |     116 |     1 |    31674 |          |                 |
| 15 |           TABLE ACCESS BY INDEX ROWID   | INST               |       1 | 31674 |    31674 |          |                 |
| 16 |            INDEX UNIQUE SCAN            | ISTR_PK            |       1 | 31674 |    31674 |          |                 |
| 17 |          TABLE ACCESS BY INDEX ROWID    | BSTA               |       1 | 31674 |    28722 |          |                 |
| 18 |           INDEX UNIQUE SCAN             | BSTAC_IDX          |       1 | 31674 |    31654 |          |                 |
| 19 |         INDEX RANGE SCAN                | ISCO_4_IDX         |       1 | 28722 |    28719 |          |                 |
| 20 |        HASH JOIN                        |                    |    6138 |     1 |      486 |          |                 |
| 21 |         HASH JOIN ANTI                  |                    |    5716 |     1 |    54961 |    67.16 | Cpu (732)       |
| 22 |          HASH JOIN                      |                    |   24385 |     1 |    83680 |          |                 |
| 23 |           INDEX RANGE SCAN              | ISCO_4_IDX         |    170K |     1 |     377K |          |                 |
| 24 |           HASH JOIN                     |                    |    144K |     1 |     266K |          |                 |
| 25 |            TABLE ACCESS FULL            | BSTA               |    144K |     1 |     266K |     0.09 | Cpu (1)         |
| 26 |            TABLE ACCESS FULL            | INST               |      1M |     1 |       1M |          |                 |
| 27 |          VIEW                           | VW_SQ_1            |     53M |     1 |       4G |     5.60 | Cpu (61)        |
| 28 |           HASH JOIN                     |                    |     53M |     1 |       4G |    27.06 | Cpu (295)       |
| 29 |            MERGE JOIN CARTESIAN         |                    |    458K |     1 |     266K |          |                 |
| 30 |             NESTED LOOPS                |                    |       3 |     1 |        1 |          |                 |
| 31 |              INDEX RANGE SCAN           | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 32 |              INDEX UNIQUE SCAN          | BOOK_IDX           |       1 |     1 |        1 |          |                 |
| 33 |             BUFFER SORT                 |                    |    144K |     1 |     266K |          |                 |
| 34 |              INDEX FAST FULL SCAN       | BSTA_IDX_2         |    144K |     1 |     266K |          |                 |
| 35 |            TABLE ACCESS FULL            | ISBK               |      2M |     1 |       2M |          |                 |
| 36 |         INDEX RANGE SCAN                | ISCO_4_IDX         |    170K |     1 |    32983 |          |                 |
| 37 |      TABLE ACCESS BY INDEX ROWID        | BORG               |       1 | 29205 |    29205 |          |                 |
| 38 |       INDEX UNIQUE SCAN                 | BORG_PK            |       1 | 29205 |    29205 |          |                 |
| 39 |     INDEX RANGE SCAN                    | ISCO_4_IDX         |       1 | 29205 |    29205 |          |                 |
| 40 |    INDEX RANGE SCAN                     | ISDM_IDX           |       1 | 29205 |    29225 |          |                 |
===============================================================================================================================

There’s a lot in this report.

First thing you might have spotted is the Join Factorisation going on @ step 4 as indicated by the name VW_JF*.

Maybe you’re always wary of the MERGE JOIN CARTESIAN … BUFFER SORT? Not always a problem of course but where there’s trouble you’ll often find her (in real production scenarios, I find that this mechanism is a problem not so much when there are missing join conditions – because these are rarely found in production code – but rather as a valid join mechanism but where the rowsource estimates are significantly inaccurate).

If you look at the SQL, you’ll see a function call as well.

There are also a whole bunch of estimates that are significantly off.

Where to start?

There are four main areas of questioning:

  1. What’s taking all the time? Are there some particular steps in the plan which are more problematic than others?
  2. If estimates are inaccurate, where do they go most wrong or go wrong first?
  3. Which predicates eliminate the most data? i.e. it’s rarely a good thing to join thousands upon thousands of rows only to do a late filter in the plan to reduce it down to a few handfuls. Aka eliminate early.
  4. What is the simplest / quickest / least invasive change that can be made to significantly improve performance? And will it actually be sufficient?

From “Activity %”, I hope it’s clear from the report that all the time is taken up by the bottom half of the UNION ALL.

  • The rowsource cardinality estimates are not that accurate.
  • Plus we find in that second half our old friend the MJC+BS.
  • And we’re just burning CPU down there.

So, I’d like to isolate that bottom half of the UNION ALL and run it standalone.

However, in a clear indication of the issues with it, it won’t run standalone – It’s just blown 30+ gig of temp space.

But it returns only 486 rows so let’s try to get a runnable standalone version.

Before we start looking at possible solutions, let’s start to ask questions about the query logic itself, keeping an eye out for redundant tables and joins and asking ourselves whether this is the best way to word the question we think is being asked.

Best way to start that is

So, we’ve got a few “filters” predicates, a couple of “join predicates” and a NOT EXISTS correlated subquery.

The logic of the subquery is the first thing that jumps out at me.
1. BOOK seems redundant
We join RULE to BOOK and BOOK to ISBK all by BOOK_NUM and ISBK is correlated to the outer INST by INST_NUM.
We do no filtering by any BOOK attribute so it serves no purpose so, let’s remove BOOK, join RULE straight to ISBK.

2. The subquery filtering by BSTA.CURRENT_MAT_DATE is irrelevant.
This is the same filter as in the outer select.
It doesn’t make sense.
Ignoring the actual evaluation order of all these predicates, we should effectively only be checking the NOT EXISTS against INST_NUMS that have passed the outer BSTA filter. So, by definition, this particular predicate in the subquery will always be true. Why repeat it? This is a mistake.
The only thing this subquery should be doing is checking they’re not in the RULE/ISBK combo. So, let’s remove that.

If we run just comment out the tables and joins as per suggestion above, then that bottom query runs in a couple of seconds:


Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    1.12 |    1.12 |     0.00 |     2 |  73879 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=2734204492)
==========================================================================================================
| Id |         Operation          |   Name     |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                            |            | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
==========================================================================================================
|  0 | SELECT STATEMENT           |            |         |     1 |      489 |          |                 |
|  1 |   HASH JOIN RIGHT ANTI     |            |   38162 |     1 |      489 |          |                 |
|  2 |    VIEW                    | VW_SQ_1    |     555 |     1 |    31674 |          |                 |
|  3 |     NESTED LOOPS           |            |     555 |     1 |    31674 |          |                 |
|  4 |      INDEX RANGE SCAN      | RULE_IDX_2 |       3 |     1 |        1 |          |                 |
|  5 |      INDEX RANGE SCAN      | ISBK_IDX   |     176 |     1 |    31674 |          |                 |
|  6 |    HASH JOIN               |            |   38189 |     1 |    28700 |          |                 |
|  7 |     HASH JOIN              |            |   37626 |     1 |    28700 |          |                 |
|  8 |      HASH JOIN             |            |   35036 |     1 |    86956 |   100.00 | Cpu (1)         |
|  9 |       HASH JOIN            |            |   32625 |     1 |    87212 |          |                 |
| 10 |        HASH JOIN           |            |   31565 |     1 |    83593 |          |                 |
| 11 |         INDEX RANGE SCAN   | ISCO_4_IDX |    170K |     1 |     377K |          |                 |
| 12 |         HASH JOIN          |            |    144K |     1 |     266K |          |                 |
| 13 |          TABLE ACCESS FULL | BSTA       |    144K |     1 |     266K |          |                 |
| 14 |          TABLE ACCESS FULL | INST       |    796K |     1 |     796K |          |                 |
| 15 |        TABLE ACCESS FULL   | ISDM       |      1M |     1 |       1M |          |                 |
| 16 |       INDEX RANGE SCAN     | ISCO_4_IDX |    170K |     1 |     861K |          |                 |
| 17 |      INDEX RANGE SCAN      | ISCO_4_IDX |    170K |     1 |    32983 |          |                 |
| 18 |     TABLE ACCESS FULL      | BORG       |    328K |     1 |     328K |          |                 |
==========================================================================================================

Note that I’ve had to hint this with the /*+ monitor */ hint because by default this now executes beneath the default threshold for monitoring.

It looks like we’ve arrived at our quickest/simplest change just by going through the query quickly trying to understand the question it’s asking. No hints required.

So this is would be a good place to stop, thoroughly test the change, validate the original and the changed results and move on to something else.

We should drop it back into the UNION ALL and see what we get:


Global Stats
===========================================================
| Elapsed |   Cpu   | PL/SQL  |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
===========================================================
|    4.68 |    4.68 |    0.38 |     0.00 |    60 |   230K |
===========================================================

SQL Plan Monitoring Details (Plan Hash Value=4171599168)
===============================================================================================================================
| Id |                Operation                |       Name         |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                                         |                    | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
===============================================================================================================================
|  0 | SELECT STATEMENT                        |                    |         |     1 |    29225 |          |                 |
|  1 |   NESTED LOOPS                          |                    |       1 |     1 |    29225 |          |                 |
|  2 |    NESTED LOOPS                         |                    |       1 |     1 |    29205 |          |                 |
|  3 |     HASH JOIN                           |                    |       1 |     1 |    29205 |          |                 |
|  4 |      TABLE ACCESS FULL                  | BORG               |    328K |     1 |     328K |          |                 |
|  5 |      VIEW                               | VW_JF_SET$AD8EBC08 |   26182 |     1 |    29205 |          |                 |
|  6 |       UNION-ALL                         |                    |         |     1 |    29205 |          |                 |
|  7 |        NESTED LOOPS                     |                    |       9 |     1 |    28719 |          |                 |
|  8 |         NESTED LOOPS                    |                    |      53 |     1 |    28722 |          |                 |
|  9 |          NESTED LOOPS                   |                    |     370 |     1 |    31674 |          |                 |
| 10 |           NESTED LOOPS                  |                    |     370 |     1 |    31674 |          |                 |
| 11 |            NESTED LOOPS                 |                    |       3 |     1 |        1 |          |                 |
| 12 |             INDEX RANGE SCAN            | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 13 |             TABLE ACCESS BY INDEX ROWID | BOOK               |       1 |     1 |        1 |          |                 |
| 14 |              INDEX UNIQUE SCAN          | BOOK_IDX           |       1 |     1 |        1 |          |                 |
| 15 |            INDEX RANGE SCAN             | ISBK_IDX           |     116 |     1 |    31674 |          |                 |
| 16 |           TABLE ACCESS BY INDEX ROWID   | INST               |       1 | 31674 |    31674 |          |                 |
| 17 |            INDEX UNIQUE SCAN            | ISTR_PK            |       1 | 31674 |    31674 |          |                 |
| 18 |          TABLE ACCESS BY INDEX ROWID    | BSTA               |       1 | 31674 |    28722 |          |                 |
| 19 |           INDEX UNIQUE SCAN             | BSTA_IDX           |       1 | 31674 |    31654 |          |                 |
| 20 |         INDEX RANGE SCAN                | ISCO_4_IDX         |       1 | 28722 |    28719 |          |                 |
| 21 |        HASH JOIN RIGHT ANTI             |                    |   26173 |     1 |      486 |          |                 |
| 22 |         VIEW                            | VW_SQ_1            |     555 |     1 |    31674 |          |                 |
| 23 |          NESTED LOOPS                   |                    |     555 |     1 |    31674 |          |                 |
| 24 |           INDEX RANGE SCAN              | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 25 |           INDEX RANGE SCAN              | ISBK_IDX           |     176 |     1 |    31674 |          |                 |
| 26 |         HASH JOIN                       |                    |   26187 |     1 |    28693 |          |                 |
| 27 |          HASH JOIN                      |                    |   24385 |     1 |    83680 |    33.33 | Cpu (1)         |
| 28 |           INDEX RANGE SCAN              | ISCO_4_IDX         |    170K |     1 |     377K |          |                 |
| 29 |           HASH JOIN                     |                    |    144K |     1 |     266K |          |                 |
| 30 |            TABLE ACCESS FULL            | BSTA               |    144K |     1 |     266K |          |                 |
| 31 |            TABLE ACCESS FULL            | INST               |      1M |     1 |       1M |          |                 |
| 32 |          INDEX RANGE SCAN               | ISCO_4_IDX         |    170K |     1 |    32983 |          |                 |
| 33 |     INDEX RANGE SCAN                    | ISCO_4_IDX         |       1 | 29205 |    29205 |          |                 |
| 34 |    INDEX RANGE SCAN                     | ISDM_IDX           |       1 | 29205 |    29225 |          |                 |
===============================================================================================================================

So we could leave it there.

However, I wouldn’t blame you if you wanted to go further, even if we’re bordering on Compulsive Tuning Disorder:

  • I don’t really like leaving behind a plan that’s got significantly inaccurate estimates – it leaves behind too much of a future threat.
  • And maybe we can also look at the original query and the direction the Join Factorisation was indicating and take it further? How about doing the UNION ALL a bit earlier and then do one lot of joining to the shared tables from the original SQL?
  • Let’s use some dynamic sampling to improve some single table cardinality estimates but then perhaps let’s go too far and add some join estimate adjustments (I say too far because I’m going to use opt_estimate but it’s undocumented and I’m not recommending it but if you do use it, then use it in conjunction with qb_name).
  • And let’s get rid of the function call which, take it from me, in this case is effectively a single table outer join lookup anyway.
SELECT  /*+
          find_me
          monitor
          qb_name(main)
          dynamic_sampling(sdol@main 4)
          dynamic_sampling(isin@main 4)
          */
        ... some columns ...
FROM   (SELECT /*+
                 qb_name(union1)
                 opt_estimate(join(isbk@union1 rule@union1) scale_rows=100)
                 */
               isbk.inst_num  inst_num
        ,      inix.inst_code nix_code
        ,      (SELECT book.name
                FROM   book
                WHERE  book.book_num = rule.book_num) book_name
        FROM   isbk
        ,      rule
        ,      inix
        WHERE  rule.rule_type_code    = 'IBK'
        AND    rule.book_num          = isbk.book_num 
        AND    inix.inst_num      (+) = isbk.inst_num
        AND    inix.xcod_code     (+) = 'NIX'
        UNION ALL
        SELECT /*+
                 qb_name(union2)
                 dynamic_sampling(inix@union2 4)
                 */
               inix.inst_num  inst_num
        ,      inix.inst_code nix_code
        ,      NULL           book_name
        FROM   inix
        WHERE  inix.xcod_code         = 'NIX'
        AND    NOT EXISTS (SELECT /*+ 
                                    qb_name(sub1)
                                    opt_estimate(join(isbk@sub1 rule@sub1) scale_rows=100)
                                    */
                                  1
                           FROM   isbk
                           ,      rule
                           WHERE  rule.rule_type_code    = 'IBK'
                           AND    rule.book_num          = isbk.book_num
                           AND    isbk.inst_num         = inix.inst_num))
       xxxx
,      bsta
,      isin
,      sdol
,      inst
,      borg
,      isdm
WHERE  bsta.inst_num          = xxxx.inst_num
AND    bsta.current_mat_date >= TO_DATE(:v_bus_date, 'yyyymmdd')
AND    isin.inst_num          = xxxx.inst_num 
AND    isin.xcod_code         = 'SEDL'
AND    sdol.inst_num          = xxxx.inst_num
AND    sdol.xcod_code         = 'ISIN'
AND    inst.inst_num          = xxxx.inst_num
AND    borg.borg_num          = inst.issuer_num
AND    isdm.inst_num          = xxxx.inst_num;

Which gives:


Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    1.40 |    1.40 |     0.00 |    60 |  79273 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=2392527494)
=======================================================================================================================
| Id |               Operation                |   Name      |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                                        |             | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
=======================================================================================================================
|  0 | SELECT STATEMENT                       |             |         |     1 |    29225 |          |                 |
|  1 |   HASH JOIN                            |             |   94337 |     1 |    29225 |          |                 |
|  2 |    INDEX RANGE SCAN                    | ISCO_4_IDX  |    373K |     1 |     377K |          |                 |
|  3 |    HASH JOIN                           |             |   80747 |     1 |    29629 |          |                 |
|  4 |     HASH JOIN                          |             |   80747 |     1 |    29629 |          |                 |
|  5 |      HASH JOIN                         |             |   80747 |     1 |    29629 |          |                 |
|  6 |       HASH JOIN                        |             |   78122 |     1 |    29604 |          |                 |
|  7 |        MERGE JOIN                      |             |   78003 |     1 |    33947 |          |                 |
|  8 |         INDEX RANGE SCAN               | ISCO_4_IDX  |    818K |     1 |     861K |          |                 |
|  9 |         SORT JOIN                      |             |   55765 |  861K |    33947 |          |                 |
| 10 |          VIEW                          |             |   55765 |     1 |    33960 |          |                 |
| 11 |           UNION-ALL                    |             |         |     1 |    33960 |          |                 |
| 12 |            TABLE ACCESS BY INDEX ROWID | BOOK        |       1 |     1 |        1 |          |                 |
| 13 |             INDEX UNIQUE SCAN          | BOOK_IDX    |       1 |     1 |        1 |          |                 |
| 14 |            HASH JOIN OUTER             |             |   55485 |     1 |    31674 |          |                 |
| 15 |             NESTED LOOPS               |             |   55485 |     1 |    31674 |          |                 |
| 16 |              INDEX RANGE SCAN          | RULE_IDX_2  |       3 |     1 |        1 |          |                 |
| 17 |              INDEX RANGE SCAN          | ISBK_IDX    |   17609 |     1 |    31674 |          |                 |
| 18 |             INDEX RANGE SCAN           | ISCO_4_IDX  |    170K |     1 |    32983 |          |                 |
| 19 |            HASH JOIN ANTI              |             |     280 |     1 |     2286 |          |                 |
| 20 |             INDEX RANGE SCAN           | ISCO_4_IDX  |   28028 |     1 |    32983 |          |                 |
| 21 |             VIEW                       | VW_SQ_1     |   55485 |     1 |    31674 |          |                 |
| 22 |              NESTED LOOPS              |             |   55485 |     1 |    31674 |          |                 |
| 23 |               INDEX RANGE SCAN         | RULE_IDX_2  |       3 |     1 |        1 |          |                 |
| 24 |               INDEX RANGE SCAN         | ISBK_IDX    |   17609 |     1 |    31674 |          |                 |
| 25 |        TABLE ACCESS FULL               | BSTA        |    144K |     1 |     266K |          |                 |
| 26 |       TABLE ACCESS FULL                | ISDM        |      1M |     1 |       1M |          |                 |
| 27 |      TABLE ACCESS FULL                 | INST        |    796K |     1 |     796K |   100.00 | Cpu (1)         |
| 28 |     TABLE ACCESS FULL                  | BORG        |    328K |     1 |     328K |          |                 |
=======================================================================================================================

It might not be perfect but it’s certainly a lot better.

An acceptable compromise might be to refactor the UNION ALL as per above but omit the undocumented opt_estimate hints.

All we need to do is some more testing to validate the results and to also triple check the performance when the data is not cached and we’re done.

If you made it this far then Wow! I probably wouldn’t have done.

So, to summarise, what have we done?

  • Well, we haven’t needed to know much about the original intention of the query nor the schema.
  • We’ve improved the accuracies of some of the estimates.
  • We’ve found some redundant tables and joins.
  • And we’ve reordered the query slightly to better phrase the question that we think was being asked, in the process moving the UNION ALL earlier in the processing so that some of the joining tables only needed to be referenced once (an advantage that might not be preserved if the CBO decides to merge the UNION ALL but that would be the opposite of the recently developed join factorisation mechanism).

And these simple, quick steps effectively reduced the execution time of a query from 20 minutes to a few seconds.

In hindsight, a good example because of the gains realised but a bad example because of the length of query and the associated detail, particularly the redundant tables.

Scalar Subselect Costing

This issue is an oldie but deserving of a quick post to stop me going off on a tangent in another post.

It is an oddity of scalar subselects/subqueries that their cost is not taken into account in the top level cost of a query.

In older versions of Oracle, it used to be the case that you didn’t even see the scalar subquery in the execution plan.

However, even in the latest versions, the cost still isn’t accounted for.

Always something to keep in mind.

For example:

SQL> create table t1
  2  (col1 number not null);

Table created.

SQL> 
SQL> insert into t1
  2  select rownum
  3  from   dual
  4  connect by rownum <= 10000;

10000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create table t2
  2  (col1 number not null primary key);

Table created.

SQL> 
SQL> 
SQL> insert into t2
  2  select rownum
  3  from   dual
  4  connect by rownum <= 10000;

10000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 

Let’s do a scalar subselect to do an index lookup on t2 for every row in t1:

SQL> explain plan for
  2  select t1.col1
  3  ,      (select t2.col1 from t2 where t2.col1 = t1.col1)
  4  from   t1;

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 2339000913

----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              | 10000 |   126K|     8   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| SYS_C0078310 |     1 |    13 |     1   (0)| 00:00:01 |
|   2 |  TABLE ACCESS FULL| T1           | 10000 |   126K|     8   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T2"."COL1"=:B1)

You can see that the cost of the scalar subquery is 1 per execution and it’s not accounted for at the top level.

Let’s force a full table scan of the row-by-row lookup:

SQL> explain plan for
  2  select t1.col1
  3  ,      (select /*+ full(t2) */ t2.col1 from t2 where t2.col1 = t1.col1)
  4  from   t1;

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
Plan hash value: 637946564

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |   126K|     8   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |     1 |    13 |     2   (0)| 00:00:01 |
|   2 |  TABLE ACCESS FULL| T1   | 10000 |   126K|     8   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T2"."COL1"=:B1)

Obviously a much more expensive operation but, again, not properly accounted for in the overall costing.

Wouldn’t it be preferable that as the optimizer has estimated the number of rows in the top level select:

|   0 | SELECT STATEMENT  |      | 10000 |   126K|     8   (0)| 00:00:01 |

and it has estimated the cost per execution of the scalar subselect:

|*  1 |  TABLE ACCESS FULL| T2   |     1 |    13 |     2   (0)| 00:00:01 |

that the top level cost include to some degree the cost of scalar subselect per execution * estimated executions?

For example, if we code a join roughly equivalent to the scalar subselect then:

SQL> explain plan for
  2  select /*+ 
  3           full(t2) 
  4           use_nl(t2)
  5           */
  6         t1.col1
  7  ,      t2.col1
  8  from   t1
  9  ,      t2
 10  where t2.col1 (+) = t1.col1;

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
Plan hash value: 2453408398

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 10000 |   253K| 66919   (7)| 00:05:35 |
|   1 |  NESTED LOOPS OUTER|      | 10000 |   253K| 66919   (7)| 00:05:35 |
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   126K|     8   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   |     1 |    13 |     7  (15)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("T2"."COL1"(+)="T1"."COL1")

Also see:
http://jonathanlewis.wordpress.com/2007/10/12/scalar-subqueries/

http://oracle-randolf.blogspot.com/2010/01/when-your-projection-is-not-cost-free.html

http://blog.sydoracle.com/2005/09/explain-plans-and-scalar-subqueries.html