SQL Patch III / Parallel Baselines

From my previous investigation, I had one outstanding question at the back of my mind:

Can we apply more than one hint with a SQL Patch?

In addition, I had previously noted that SQL Plan baselines have issues if you have a plan that has to use a parallel hint to get parallel operations.

By investigating the latter question, we can answer the former.

Initial setup, same as examples in previous post.

SQL> create table t1  
  2  (col1  number  
  3  ,col2  varchar2(200)  
  4  ,flag  varchar2(1));  

Table created.

SQL> 
SQL> insert into t1  
  2  select rownum  
  3  ,      lpad('X',200,'X')  
  4  ,      case when mod(rownum,10000) = 0  
  5              then 'Y'  
  6              else 'N'  
  7         end  
  8  from   dual  
  9  connect by rownum <= 100000;  

100000 rows created.

SQL> create index i1 on t1 (col1);

Index created.

SQL> exec dbms_stats.gather_table_stats(USER,'T1'); 

PL/SQL procedure successfully completed.

SQL> 

Let’s run a piece of code that does an index access:

SQL> select * from t1 where col1 = 1;

      COL1 COL2                                                                  F
---------- --------------------------------------------------------------------- -
         1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX N


SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
SQL_ID  81qv4d7vkb571, child number 0
-------------------------------------
select * from t1 where col1 = 1

Plan hash value: 1704772559

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1   |     1 |   208 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I1   |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("COL1"=1)


19 rows selected.

SQL> 

What about if I want to hint this to a parallel full table scan?

SQL> select /*+ full(t1) parallel(t1 4) */ * from t1 where col1 = 1;

      COL1 COL2                                                                  F
---------- --------------------------------------------------------------------- -
         1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX N


SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  chhkmc32mdkak, child number 0
-------------------------------------
select /*+ full(t1) parallel(t1 4) */ * from t1 where col1 = 1

Plan hash value: 2494645258

----------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |   264 (100)|          |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |     1 |   208 |   264   (1)| 00:00:02 |  Q1,00 | P->S | QC
|   3 |    PX BLOCK ITERATOR |          |     1 |   208 |   264   (1)| 00:00:02 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       |     1 |   208 |   264   (1)| 00:00:02 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access(:Z>=:Z AND :Z<=:Z)
       filter("COL1"=1)


22 rows selected.

SQL> 

As expected.

Now, can I preserve this with a baseline?

SQL> declare     
  2    sqltext clob;     
  3    spm_op pls_integer;     
  4  begin     
  5    sqltext := 'select * from t1 where col1 = 1';     
  6    spm_op  :=     
  7    dbms_spm.load_plans_from_cursor_cache     
  8    (sql_id => 'chhkmc32mdkak',     
  9     plan_hash_value => 2494645258,     
 10     sql_text => sqltext);     
 11  end;     
 12  /

PL/SQL procedure successfully completed.

SQL> 

What happens when we re-run the original SQL?

SQL> select * from t1 where col1 = 1;

      COL1 COL2                                                                  F
---------- --------------------------------------------------------------------- -
         1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX N


SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
SQL_ID  81qv4d7vkb571, child number 1
-------------------------------------
select * from t1 where col1 = 1

Plan hash value: 1704772559

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1   |     1 |   208 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I1   |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("COL1"=1)


19 rows selected.

SQL> 

Nothing. No baseline used.

The reason for this is that parallel is not in the plan hints but is essential to force our parallel plan and get the appropriate plan hash value.

Remember that a baseline must reproduce the stored plan hash or it will be ignored and the stored plan hash is SQLOBJ$.PLAN_ID which maps to the PLAN_HASH_2 value found in V$SQL_PLAN.OTHER_XML.

Let’s try to get the hints in the baseline:

SQL> select to_char(exact_matching_signature) from v$sql where sql_id = '81qv4d7vkb571';

TO_CHAR(EXACT_MATCHING_SIGNATURE)
----------------------------------------
9005682359107037619

SQL> select cast(extractvalue(value(x), '/hint') as varchar2(500)) as outline_hints  
  2  from   xmltable('/outline_data/hint'  
  3         passing (select xmltype(comp_data) xml  
  4                  from   sys.sqlobj$data  
  5                  where  signature = 9005682359107037619)) x; 
       passing (select xmltype(comp_data) xml
                *
ERROR at line 3:
ORA-01427: single-row subquery returns more than one row

SQL> 

Is that interesting? There’s now more than one baseline – should not be surprising if you are familiar with the baseline mechanism.

DBA_SQL_PLAN_BASELINES doesn’t give us everything we need to investigate so back under the covers:

SQL> SELECT TO_CHAR(so.signature) signature
  2  ,      so.plan_id
  3  ,      DECODE(ad.origin, 1, 'MANUAL-LOAD',
  4                        2, 'AUTO-CAPTURE',
  5                        3, 'MANUAL-SQLTUNE',
  6                        4, 'AUTO-SQLTUNE',
  7                        5, 'STORED-OUTLINE',
  8                           'UNKNOWN') origin
  9  ,      DECODE(BITAND(so.flags, 1), 1, 'YES', 'NO') enabled
 10  ,      DECODE(BITAND(so.flags, 2), 2, 'YES', 'NO') accepted
 11  ,      DECODE(BITAND(so.flags, 64), 64, 'NO', 'YES') reproduced
 12  FROM   sys.sqlobj$        so
 13  ,      sys.sqlobj$auxdata ad
 14  WHERE  ad.signature = so.signature 
 15  AND    ad.plan_id   = so.plan_id
 16  AND    so.signature = 9005682359107037619;

SIGNATURE                                   PLAN_ID ORIGIN         ENA ACC REP
---------------------------------------- ---------- -------------- --- --- ---
9005682359107037619                       263533726 MANUAL-LOAD    YES YES NO
9005682359107037619                      3860916006 AUTO-CAPTURE   YES NO  YES

SQL> 

The plan for our manually created baseline was not reproducible so the optimizer ignores it, generates its best plan and stores that as an AUTO_CAPTURE plan for future evolution and acceptance. The AUTO_CAPTURE plan gives the outline for the original index scan that makes the most sense but we’re not interested in that.

Here are the hints for our manually loaded baseline that failed to reproduce the plan:

SQL> select cast(extractvalue(value(x), '/hint') as varchar2(500)) as outline_hints  
  2  from   xmltable('/outline_data/hint'  
  3         passing (select xmltype(comp_data) xml  
  4                  from   sys.sqlobj$data  
  5                  where  signature = 9005682359107037619
  6                  and    plan_id   = 263533726)) x;  

OUTLINE_HINTS
---------------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
DB_VERSION('11.2.0.3')
OPT_PARAM('optimizer_dynamic_sampling' 4)
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
FULL(@"SEL$1" "T1"@"SEL$1")

7 rows selected.

SQL> 

You can see the FULL hint but no PARALLEL.
With no PARALLEL hint, we can’t get the hash for the parallel plan that our baseline requires.

Shall we just check a SPM trace to verify what we see when a baseline cannot be reproduced?

SQL> alter session set tracefile_identifier = 'dom_spm';

Session altered.

SQL> alter session set events 'trace[RDBMS.SQL_Plan_Management.*]';

Session altered.

SQL> 
SQL> select * from t1 where col1 = 1;

      COL1 COL2                                                                  F
---------- --------------------------------------------------------------------- -
         1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX N


SQL> 
SQL> alter session set events 'trace[RDBMS.SQL_Plan_Management.*] off';

Session altered.

SQL> 

And in the trace:

SPM: statement found in SMB
SPM: planId's of plan baseline are: 263533726
SPM: using qksan to reproduce, cost and select accepted plan, sig = 9005682359107037619
SPM: plan reproducibility round 1 (plan outline + session OFE)
SPM: using qksan to reproduce accepted plan, planId = 263533726
SPM: planId in plan baseline = 263533726, planId of reproduced plan = 3688435342
------- START SPM Plan Dump -------
SPM: failed to reproduce the plan using the following info:
  parse_schema name        : RIMS
  plan_baseline signature  : 9005682359107037619
  plan_baseline plan_id    : 263533726
  plan_baseline hintset    :
    hint num  1 len 27 text: IGNORE_OPTIM_EMBEDDED_HINTS
    hint num  2 len 37 text: OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
    hint num  3 len 22 text: DB_VERSION('11.2.0.3')
    hint num  4 len 41 text: OPT_PARAM('optimizer_dynamic_sampling' 4)
    hint num  5 len  8 text: ALL_ROWS
    hint num  6 len 22 text: OUTLINE_LEAF(@"SEL$1")
    hint num  7 len 27 text: FULL(@"SEL$1" "T1"@"SEL$1")
SPM: generated non-matching plan:
...
SPM: plan reproducibility round 2 (hinted OFE only)
SPM: using qksan to reproduce accepted plan, planId = 263533726
SPM: planId in plan baseline = 263533726, planId of reproduced plan = 3860916006
------- START SPM Plan Dump -------
SPM: failed to reproduce the plan using the following info:
  parse_schema name        : RIMS
  plan_baseline signature  : 9005682359107037619
  plan_baseline plan_id    : 263533726
  plan_baseline hintset    :
    hint num  1 len 37 text: OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
SPM: generated non-matching plan:
...
------- END SPM Plan Dump -------
SPM: couldn't reproduce any enabled+accepted plan so using the cost-based plan, planId = 3860916006

The Optimizer has two attempts are reproducing the required plan – the first one with the full set of hints, the second with just the OFE hint before reverting back to the best cost plan and storing that as mentioned as an AUTO-CAPTURE plan.

So, it seems as if using just a baseline to preserve parallelism does not work.

And perhaps that’s not surprising if you know what the PARALLEL hint does – it doesn’t force parallelism.

I understand what’s going here but but I’m not convinced by this baseline behaviour.

If a PARALLEL hint is required to get a specific plan, I’d expect it to be preserved in the baseline but the bottom line, I suppose, is that it’s not preserved in the hints in V$SQL_PLAN.OTHER_XML.

So, can SQL Patch help here?

Leaving our baseline in place and just adding a SQL Patch (i’ve said before that a SQL Patch can seem quite fussy
about using the proper hint syntax):

SQL> begin  
  2    sys.dbms_sqldiag_internal.i_create_patch  
  3    (sql_text  => 'select * from t1 where col1 = 1',  
  4     hint_text => 'PARALLEL(@"SEL$1" "T1"@"SEL$1" 4)',  
  5     name      => 'patch_test');   
  6  end;  
  7  /  

PL/SQL procedure successfully completed.

SQL> 

Shall we try the SQL again?

SQL> select * from t1 where col1 = 1;

      COL1 COL2                                                                  F
---------- --------------------------------------------------------------------- -
         1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX N


SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  81qv4d7vkb571, child number 0
-------------------------------------
select * from t1 where col1 = 1

Plan hash value: 1704772559

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |       |       |     2 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1   |     1 |   208 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | I1   |     1 |       |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("COL1"=1)

Note
-----
   - SQL patch "patch_test" used for this statement


23 rows selected.

SQL> 

So SQL Patch + SQL Baseline does not work as desired – the Baseline by itself can’t reproduce the desired plan and the Patch hint is irrelevant once the best cost plan is picked.

So, how about just using a SQL Patch with multiple hints? Can that work?

SQL> declare
  2   l_op pls_integer;
  3  begin
  4   l_op :=
  5   dbms_spm.drop_sql_plan_baseline('SQL_7cfa9c643693a9b3');
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> begin  
  2    sys.dbms_sqldiag.drop_sql_patch('patch_test');   
  3  end;  
  4  / 

PL/SQL procedure successfully completed.

SQL> begin  
  2    sys.dbms_sqldiag_internal.i_create_patch  
  3    (sql_text  => 'select * from t1 where col1 = 1',  
  4     hint_text => 'PARALLEL(@"SEL$1" "T1"@"SEL$1" 4) FULL(@"SEL$1" "T1"@"SEL$1")',  
  5     name      => 'patch_test');   
  6  end;  
  7  / 

PL/SQL procedure successfully completed.

SQL> select * from t1 where col1 = 1;  

      COL1 COL2                                                                  F
---------- --------------------------------------------------------------------- -
         1 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX N

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  81qv4d7vkb571, child number 0
-------------------------------------
select * from t1 where col1 = 1

Plan hash value: 2494645258

----------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |   264 (100)|          |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |     1 |   208 |   264   (1)| 00:00:02 |  Q1,00 | P->S | QC
|   3 |    PX BLOCK ITERATOR |          |     1 |   208 |   264   (1)| 00:00:02 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       |     1 |   208 |   264   (1)| 00:00:02 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access(:Z>=:Z AND :Z<=:Z)
       filter("COL1"=1)

Note
-----
   - SQL patch "patch_test" used for this statement


26 rows selected.

SQL>

Success!

So, in summary, it looks like SQL Plan baselines cannot preserve PARALLEL but a SQL Patch can apply it.

And, at the same time, this shows that you can apply multiple hints in one SQL Patch.

By the way, above I made a throwaway comment above that the PARALLEL hint is not preserved in V$SQL_PLAN.OTHER_XML which is why it’s not preserved in a baseline. We can use the ‘+OUTLINE’ format to prove this from that last example above with the SQL Patch:

SQL> select * from table(dbms_xplan.display_cursor(format=>'+OUTLINE'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  81qv4d7vkb571, child number 0
-------------------------------------
select * from t1 where col1 = 1

Plan hash value: 2494645258

----------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |       |       |   264 (100)|          |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |     1 |   208 |   264   (1)| 00:00:02 |  Q1,00 | P->S | QC
|   3 |    PX BLOCK ITERATOR |          |     1 |   208 |   264   (1)| 00:00:02 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       |     1 |   208 |   264   (1)| 00:00:02 |  Q1,00 | PCWP |            |
----------------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
      DB_VERSION('11.2.0.3')
      OPT_PARAM('optimizer_dynamic_sampling' 4)
      ALL_ROWS
      OUTLINE_LEAF(@"SEL$1")
      FULL(@"SEL$1" "T1"@"SEL$1")
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access(:Z>=:Z AND :Z<=:Z)
       filter("COL1"=1)

Note
-----
   - SQL patch "patch_test" used for this statement


41 rows selected.

SQL> 

And finally…

If you’ve been looking at the undocumented DBMS_SQLDIAG_INTERNAL package that you need to create a SQL Patch, you might have noticed that alongside I_CREATE_PATCH there is an I_CREATE_HINTSET procedure.

This isn’t another new type of SQL Object but an internal api that probably all these SQL Objects use. So, if you try to create a hintset with I_CREATE_HINTSET then by default it will create a SQL Profile that can be dropped via DBMS_SQLTUNE.DROP_SQL_PROFILE.
Note that DBMS_SPM_INTERNAL has an interface to GET_SPM_HINTSET that amongst others takes an OBJ_TYPE parameter.

SQL Patch II

Following on from my previous post, let’s look at some of the workings of a SQL patch to try to get a bigger picture about this little-known feature.

Warning – this is quite a long post.

Let’s start with a similar example to that on the Oracle Optimizer blog article.

First, let’s setup a table that might work with the BIND_AWARE hint as per the Optimizer blog example.

SQL> create table t1
  2  (col1  number
  3  ,col2  varchar2(200)
  4  ,flag  varchar2(1));

Table created.

SQL> insert into t1
  2  select rownum
  3  ,      lpad('X',200,'X')
  4  ,      case when mod(rownum,10000) = 0
  5              then 'Y'
  6              else 'N'
  7         end
  8  from   dual
  9  connect by rownum <= 100000;

100000 rows created.

SQL> select flag, count(*) from t1 group by flag;  

F   COUNT(*)
- ----------
Y         10
N      99990

SQL> create index i1 on t1 (flag);

Index created.

SQL> exec dbms_stats.gather_table_stats(USER,'T1',method_opt=>'FOR COLUMNS flag SIZE 5');

PL/SQL procedure successfully completed.

And now a query with some bind variables.
Let’s execute this three or four times per bind variable:

SQL> var n varchar2(1)
SQL> exec :n := 'N'

PL/SQL procedure successfully completed.

SQL> select count(*), max(col2) from t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- -------------------------------------------------------------------
     99990 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

SQL> /
.... etc...

SQL> exec :n := 'Y';

PL/SQL procedure successfully completed.

SQL> select count(*), max(col2) from t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- -------------------------------------------------------------------
        10 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

SQL> /
.... etc...

Let’s double check that we have a sql statement that the BIND_AWARE hint might be applicable to:

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID  731b98a8u0knf, child number 1
-------------------------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3625400295

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |   104 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |    18 |  1872 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |    18 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FLAG"=:N)


20 rows selected.

SQL> select sql_id, is_bind_aware, to_char(exact_matching_signature) sig, executions
  2  from   v$sql
  3  where  sql_id = '731b98a8u0knf';

SQL_ID        I SIG                                      EXECUTIONS
------------- - ---------------------------------------- ----------
731b98a8u0knf N 1292784087274697613                               6
731b98a8u0knf Y 1292784087274697613                               1

So far so good – we know we have a table and a statement that can be bind aware.

The next steps – as per the Optimizer blog – are to show how we can get the BIND_AWARE hint to apply from the first execution.

So, how about we flush this statement above and use the SQL Patch functionality to inject the BIND_AWARE hint.

SQL> select sql_id, sql_text, is_bind_aware, to_char(exact_matching_signature) sig, executions
  2  from   v$sql
  3  where  sql_id = '731b98a8u0knf';

no rows selected

SQL> begin
  2   sys.dbms_sqldiag_internal.i_create_patch
  3   (sql_text  => 'select count(*), max(col2) from t1 where flag = :n',
  4    hint_text => 'BIND_AWARE',
  5    name      => 'patch_test'); 
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> select count(*), max(col2) from t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- -------------------------------------------------------------------
        10 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------
SQL_ID  731b98a8u0knf, child number 0
-------------------------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3625400295

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |   104 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |    18 |  1872 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |    18 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FLAG"=:N)

Note
-----
   - SQL patch "patch_test" used for this statement


24 rows selected.

SQL> select sql_id, sql_text, is_bind_aware, to_char(exact_matching_signature) sig, executions
  2  from   v$sql
  3  where  sql_id = '731b98a8u0knf';

SQL_ID        I SIG                                      EXECUTIONS
------------- - ---------------------------------------- ----------
731b98a8u0knf Y 1292784087274697613                               1

SQL> 

Great – it all works as per the Optimizer team told us it would.

The execution plan has one of those invaluable Notes confirming that a SQL patch was used.

So, next up, why can’t we use a baseline to do this?

Well, as mentioned previously, it’s not because baselines can’t change execution plans in a similar way.

Au contraire, they can and, in most circumstances, they are probably the best placed feature to do this using DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE.

What they can’t do is inject the BIND_AWARE hint.

Allow me to demonstrate.

The approach with a baseline is slightly more complicated because we have to generate the required plan with one bit of sql and then, while it’s still in the cursor cache, transfer the whole plan into a baseline for the target sql statement.

First, let me delete that SQL Patch and verify that nothing is affecting our target statement.

SQL> exec dbms_sqldiag.drop_sql_patch('patch_test');

PL/SQL procedure successfully completed.

SQL> select count(*), max(col2) from t1 t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- -------------------------------------------------------------------
        10 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  5ya8tff6fdrsa, child number 0
-------------------------------------
select count(*), max(col2) from t1 t1 where flag = :n

Plan hash value: 3625400295

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |   104 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |    18 |  1872 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |    18 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FLAG"=:N)


20 rows selected.

SQL> 

Then let’s get a plan for a statement with the BIND_AWARE hint.

SQL> select /*+ bind_aware */ count(*), max(col2) from t1 t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- -------------------------------------------------------------------
        10 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  7qzu8gp22qr0v, child number 0
-------------------------------------
select /*+ bind_aware */ count(*), max(col2) from t1 t1 where flag = :n

Plan hash value: 3625400295

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |   104 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |    18 |  1872 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |    18 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FLAG"=:N)


20 rows selected.

SQL> select sql_id, sql_text, is_bind_aware, to_char(exact_matching_signature) sig, executions
  2  from   v$sql
  3  where  sql_id in ('731b98a8u0knf','7qzu8gp22qr0v');

SQL_ID        I SIG                                      EXECUTIONS
------------- - ---------------------------------------- ----------
7qzu8gp22qr0v Y 17219147247362199511                              1

SQL> 

Our new hinted statement is instantly bind aware when hinted as such.

Now, let’s load this plan from the cursor cache to the unhinted target statement using the sql_id and plan_hash_value from the DBMS_XPLAN output above.

SQL> declare   
  2    sqltext clob;   
  3    spm_op pls_integer;   
  4  begin   
  5    sqltext := 'select count(*), max(col2) from t1 where flag = :n';   
  6    spm_op  :=   
  7    dbms_spm.load_plans_from_cursor_cache   
  8    (sql_id => '7qzu8gp22qr0v',   
  9     plan_hash_value => 3625400295,   
 10     sql_text => sqltext);   
 11  end;   
 12  /   

PL/SQL procedure successfully completed.

And let’s see what difference the baseline makes:

SQL> select count(*), max(col2) from t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- -------------------------------------------------------------------
        10 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  731b98a8u0knf, child number 0
-------------------------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3625400295

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |   104 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |    18 |  1872 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |    18 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FLAG"=:N)

Note
-----
   - SQL plan baseline SQL_PLAN_13w748wknkcwd8576eb1f used for this statement


24 rows selected.

Great! So our baseline is in full effect.

But, is the statement bind aware as it was the SQL Patch was applied?

SQL> select sql_id, sql_text, is_bind_aware, to_char(exact_matching_signature) sig, executions
  2  from   v$sql
  3  where  sql_id in ('731b98a8u0knf');

SQL_ID        I SIG                                      EXECUTIONS
------------- - ---------------------------------------- ----------
731b98a8u0knf N 1292784087274697613                               1

No, it’s not.

Now, why should this be?

Let’s recap the different intentions of SQL Baselines and SQL Patches

SQL Baselines exist to reproduce a specific plan. In fact, a plan hash exists as part of the baseline.
If, on application of the baseline, the Optimizer is unable to reproduce the desired plan, the baseline is rejected outright.

On the other hand, SQL Patches have been developed primarily to get the Optimizer to avoid a particular problem path in an execution plan, specifically to avoid failures due to certain access methods, join methods, etc.

Or to put it another way, the answer is in the internals.

Let me recreate the patch again:

SQL> begin
  2   sys.dbms_sqldiag_internal.i_create_patch
  3   (sql_text  => 'select count(*), max(col2) from t1 where flag = :n',
  4    hint_text => 'BIND_AWARE',
  5    name      => 'patch_test'); 
  6  end;
  7  /

PL/SQL procedure successfully completed.

You could look at DBA_SQL_PATCHES but it doesn’t tell you very much.

Or at least it does – for example the FORCE_MATCHING column indicates that SQL Patches have the same force matching ability as SQL Profiles and which Baselines don’t have – but it doesn’t show what I want you to see.

It can be a bit more revealing to look around under the hood at SYS.SQLOBJ$, SYS.SQLOBJ$DATA and SYS.SQLOBJ$AUXDATA.

Using the signature from V$SQL above:

SQL> select signature, obj_type, plan_id, name, flags, last_executed
  2  from sys.sqlobj$ 
  3  where signature = 1292784087274697613;

 SIGNATURE   OBJ_TYPE    PLAN_ID NAME                                FLAGS LAST_EXECUTED
---------- ---------- ---------- ------------------------------ ---------- -------------------------
1.2928E+18          2 2239163167 SQL_PLAN_13w748wknkcwd8576eb1f         11 06-MAR-12 03.01.19.000000 PM
1.2928E+18          3          0 patch_test                              1

Profiles have an OBJ_TYPE of 1.

Our Baseline has a type of 2.

The Patch has a type of 3.

The flags translate to ENABLED/DISABLED for SQL Profiles and SQL Patches and to values for ENABLED, ACCEPTED, FIXED, REPRODUCED and AUTOPURGE for SQL Plan Baselines.

Another important aspect is that PLAN_ID.
I mentioned that Baselines are designed to reproduce a specific plan and if the optimizer is unable to reproduce the specific plan when applying the baseline then it won’t use it. Plan_id is another hash, not plan_hash_value but plan_hash_2 from V$SQL.OTHER_XML.

There’s a contrast of information elsewhere.
For example, SQLOBJ$AUXDATA will highlight the performance metrics that are tracked for Baselines to aid evolution.

But it’s the hints that really should be interesting us, for that we need to look at SQLOBJ$DATA.

Here’s what our SQL Patch uses:

SQL> select cast(extractvalue(value(x), '/hint') as varchar2(500)) as outline_hints
  2  from   xmltable('/outline_data/hint'
  3         passing (select xmltype(comp_data) xml
  4                  from   sys.sqlobj$data
  5                  where  signature = 1292784087274697613
  6                  and    obj_type  = 3)) x;

OUTLINE_HINTS
-------------------------------------------------------------------------------------------
BIND_AWARE

SQL> 

Just what we asked it to do – only inject BIND_AWARE

And our baseline contains all that is required to reproduce the exact plan it was asked to baseline.

SQL> select cast(extractvalue(value(x), '/hint') as varchar2(500)) as outline_hints
  2  from   xmltable('/outline_data/hint'
  3         passing (select xmltype(comp_data) xml
  4                  from   sys.sqlobj$data
  5                  where  signature = 1292784087274697613
  6                  and    obj_type  = 2)) x;

OUTLINE_HINTS
---------------------------------------------------------------------------------------
IGNORE_OPTIM_EMBEDDED_HINTS
OPTIMIZER_FEATURES_ENABLE('11.2.0.3')
DB_VERSION('11.2.0.3')
OPT_PARAM('optimizer_dynamic_sampling' 4)
ALL_ROWS
OUTLINE_LEAF(@"SEL$1")
INDEX_RS_ASC(@"SEL$1" "T1"@"SEL$1" ("T1"."FLAG"))

7 rows selected.

SQL> 

No sign of BIND_AWARE – only the hints that are required to reproduce the exact plan we asked for. And in fact BIND_AWARE might be a threat to that plan stability.

And so to the question of whether a SQL Patch and a Baseline can work together?

And the answer is yes PROVIDED that the hints are such that the pair can happily co-exist:

SQL> select count(*), max(col2) from t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- -------------------------------------------------------------------
        10 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  731b98a8u0knf, child number 0
-------------------------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3625400295

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |   104 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |    18 |  1872 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |    18 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FLAG"=:N)

Note
-----
   - SQL patch "patch_test" used for this statement
   - SQL plan baseline SQL_PLAN_13w748wknkcwd8576eb1f used for this statement


25 rows selected.

SQL> select sql_id, sql_text, is_bind_aware, to_char(exact_matching_signature) sig, executions
  2  from   v$sql
  3  where  sql_id in ('731b98a8u0knf');

SQL_ID        I SIG                                      EXECUTIONS
------------- - ---------------------------------------- ----------
731b98a8u0knf Y 1292784087274697613                               1

SQL> 

What happens if we try to create a SQL Patch that conflicts with the instructions of the Baseline?

Well, first note that only one SQL Patch can exist for statement at any one time (in any one category as per sql profile functionality):

SQL> begin
  2   sys.dbms_sqldiag_internal.i_create_patch
  3   (sql_text  => 'select count(*), max(col2) from t1 where flag = :n',
  4    hint_text => 'FULL(t1)',
  5    name      => 'patch_test2'); 
  6  end;
  7  /
begin
*
ERROR at line 1:
ORA-13830: SQL profile with category DEFAULT already exists for this SQL statement
ORA-06512: at "SYS.DBMS_SQLTUNE_INTERNAL", line 16167
ORA-06512: at "SYS.DBMS_SQLDIAG_INTERNAL", line 204
ORA-06512: at line 2

So, let’s delete the existing SQL Patch and check that we’ve still got a baseline working (I’ve recreated the baseline since the demo started so it’s got a different name):

SQL> exec dbms_sqldiag.drop_sql_patch('patch_test');

PL/SQL procedure successfully completed.

SQL> select count(*), max(col2) from t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- -------------------------------------------------------------------
        10 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  731b98a8u0knf, child number 0
-------------------------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3625400295

-------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |       |       |     2 (100)|          |
|   1 |  SORT AGGREGATE              |      |     1 |   104 |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1   |    18 |  1872 |     2   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | I1   |    18 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("FLAG"=:N)

Note
-----
   - SQL plan baseline SQL_PLAN_13w748wknkcwd8576eb1f used for this statement


24 rows selected.

SQL> 

And let’s try to create one that will conflict with the index hints in the Baseline (note that you seem to have to be very particular about the full and proper hint syntax, i.e. just a FULL(t1) wasn’t cutting it):

SQL> begin
  2   sys.dbms_sqldiag_internal.i_create_patch
  3   (sql_text  => 'select count(*), max(col2) from t1 where flag = :n',
  4    hint_text => 'FULL(@"SEL$1" "T1"@"SEL$1")',
  5    name      => 'patch_test2'); 
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> 

And in this scenario, it seems that Patch trumps Baseline:

SQL> select count(*), max(col2) from t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- -------------------------------------------------------------------
        10 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
SQL_ID  731b98a8u0knf, child number 0
-------------------------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   956 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |   104 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   |    18 |  1872 |   956   (2)| 00:00:05 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("FLAG"=:N)

Note
-----
   - SQL patch "patch_test2" used for this statement


23 rows selected.

which is probably what we’d expect.

That’ll do for now.

SQL Patch I

I know I wasn’t the only one to be intrigued by the recent blog article by the Oracle Optimizer team on injecting hints using a SQL Patch.

If you’ve read the article, you’ll know that creating a SQL Patch requires the use of the undocumented package DBMS_SQLDIAG_INTERNAL which is part of the SQL Repair Advisor.

Now, whilst creating a SQL Patch may be undocumented, altering and dropping a SQL Patch are documented in the DBMS_SQLDIAG package reference.

A follow-up post by the Oracle Optimizer team has since started to address some of the questions and promises to reveal more in the next few weeks.

Some of the questions that sprang to my mind on first reading the article were:
Q. Is this supported?
A. From answers so far, this is still unclear but without a doubt the feature is as good as undocumented.

Q. What is the license situation with a SQL Patch?
A. It’s a standard part of the SQL Repair Advisor which is part of 11g Enterprise Edition, no extra licensing like Diagnostic+Tuning pack required.

Q. Why/When would you use a SQL Patch rather than a SQL Plan Baseline?
A. Unanswered.

For me, the latter has been the big question mark.

Below is my opinion based on what I’ve read and played with so far.

I think part of the confusion is related to the title and premise of the original article – “Using SQL Patch to add hints to a packaged application”. The main illustration and link to previous post concerned the BIND_AWARE hint – that, I believe, was the primary use case – injecting a single hint – being illustrated by the article, not that a SQL Patch is necessarily the best mechanism for changing execution plans for packaged application code.

To my mind, in most circumstances SQL Plan Baselines are the prefered mechanism for changing the executions plans without touching the source code. The API DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE is ideal for this as shown in this article.

But… you can’t use a baseline to inject the BIND_AWARE hint (I’ll touch on that a bit in my next post).

It all comes down to slightly different intentions and use cases.

SQL Baselines exist to reproduce a specific plan. In fact, a plan hash exists as part of the baseline. If, on application of the baseline, the Optimizer is unable to reproduce the desired plan, the baseline is rejected outright.

On the other hand, SQL Patches have been developed primarily to get the Optimizer to avoid a particular problem path in an execution plan, specifically to avoid failures due to certain access methods, join methods, etc (and when you see the wealth of wrong results bugs, it’s not surprising that such a feature has been implemented).

If you were in such a crash or wrong results scenario, maybe you can start to see why you might want the lighter touch intervention of a SQL Patch over the more prescriptive SQL Plan Baseline.

In my next post, I hope to touch on some of these internal differences and show how in a very limited set of circumstances you could have a patch and a baseline applied to the same SQL.

For further information on SQL Patch functionality see my other posts:
SQL Patch II
SQL Patch III

Block Corruption II

As mentioned before, we’ve had a block corruption that hadn’t yet caused any problems in production but was causing issues for our 11g upgrade rehearsals with the datapump export/import stage.

Because it wasn’t causing any problems in production, the approach to addressing it was somewhat relaxed!

But what this issue did demonstrate was big concerns over the amount of redo that this system generates in a day – 500 GB – and how that volume of redo restricts the practicalities and ability to respond for a physical recovery of the problem, particularly as we only thing we knew was the recovery had happened at some point since 1st January 2012.

Fortunately we were able to identify that:

  1. Only one block in one partition of one table was affected

And we could say with a high degree of confidence that:

  1. That two rows in this one block were affected by the corruption and
  2. We could get the values from these two row from one of the uat databases and that the data hadn’t changed since.

In other words, we could resolve this corruption with a manual, logical approach.

For example, cross-referencing a block dump to see how the block is corrupted, I could get all the rows in this one block that haven’t been affected using something like this:

SELECT DBMS_ROWID.ROWID_ROW_NUMBER(rowid)
,      t.*
FROM   trade PARTITION (p_01) t
WHERE  rowid >= DBMS_ROWID.ROWID_CREATE (1,98381,24,1093595,0)
AND    rowid <  DBMS_ROWID.ROWID_CREATE (1,98381,24,1093596,0)
AND    rowid NOT IN ('AAAYBNAAYAAEK/bAAR','AAAYBNAAYAAEK/bAAT');

So, the approach for this manual fix was along the lines of:
1. Create an empty copy of the affected table.

CREATE TABLE repaired_table
TABLESPACE ....
AS
SELECT *
FROM   TRADE
WHERE  1 = 2;

2. Copy all the data apart from this one block into the new table

INSERT
INTO   repaired_table
SELECT * 
FROM   trade PARTITION (P_01) 
WHERE  rowid < DBMS_ROWID.ROWID_CREATE (1,98381,24,1093595,0)
UNION ALL
SELECT *
FROM   trade PARTITION (P_01)
WHERE  rowid >= DBMS_ROWID.ROWID_CREATE (1,98381,24,1093596,0);

3. Insert rows from block that were ok (double checked via block dump)

INSERT 
INTO   TRADE
SELECT t.*
FROM   trade PARTITION (p_01) t
WHERE  rowid >= DBMS_ROWID.ROWID_CREATE (1,98381,24,1093595,0)
AND    rowid <  DBMS_ROWID.ROWID_CREATE (1,98381,24,1093596,0)
AND    rowid NOT IN ('AAAYBNAAYAAEK/bAAR','AAAYBNAAYAAEK/bAAT');

4. Insert good versions of the corrupted rows extracted from an older UAT database.
5. Create mirror of local indexes on copy table

6. Partition exchange without validation including indexes

7. Rebuild global indexes on original table

8. Do stuff with stats if you needed to (we recovered stats from a stats backup table)

At this point, table REPAIRED_TRADE now has the corrupt block allocated.

Even if we drop the table, DBVERIFY still reports the corrupt block as it has not been formatted.

This is normal as the block will not be formatted until it is taken off the freelist and used by a new segment.

For completeness, we wanted to get this block formatted.

Because we did not drop table REPAIRED_TRADE we know where this block is.

So:

1. TRUNCATE TABLE repaired_trade REUSE STORAGE;

2. Drop the indexes on this table that we wanted for ease of partition exchange

3. Create trigger to tell us when we format the block in question:

CREATE OR REPLACE TRIGGER corrupt_trigger 
AFTER INSERT ON repaired_trade 
REFERENCING OLD AS p_old NEW AS new_p 
FOR EACH ROW 
DECLARE 
  e_corrupt EXCEPTION; 
BEGIN 
  IF  (DBMS_ROWID.ROWID_BLOCK_NUMBER(:new_p.rowid)=1093595)
  AND (DBMS_ROWID.ROWID_RELATIVE_FNO(:new_p.rowid)=24) 
  THEN 
      RAISE e_corrupt; 
  END IF; 
EXCEPTION 
  WHEN e_corrupt THEN 
     RAISE_APPLICATION_ERROR(-20000, 'Corrupt block has been formatted'); 
END; 
/

4. Reinsert data into this table from original TRADE table

INSERT
INTO   repaired_trade
SELECT * FROM trade;

5. Repeat 4 if necessary, i.e. until we get the exception then

6. DROP TABLE repaired_trade;

At this point, DBVERIFY will no longer report that there is a corrupt block.

Block corruption I

Sod’s law – the penultimate dress rehearsal for our 11gR2 upgrade showed that we have suddenly have a block corruption issue on our 9i production database.

Mind you better to find out now than go-live weekend.

It’s only one block and actually it only seems to be two rows in that one block.

The response to the severity 1 SR was less than impressive.

Anyway… as you might expect, the impact of a block corruption can be unpredictable.

Some observations from this particular case:

1. Can’t access the table block by PK.

SQL> explain plan for
  2  select * from trade where trad_tag = 185263584;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------
| Id  | Operation                          |  Name       | Rows  | Bytes | Cost (%CPU)| Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |             |     1 |   332 |    13   (8)|       |       |
|   1 |  PARTITION LIST ALL                |             |       |       |            |     1 |    11 |
|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| TRADE       |     1 |   332 |    13   (8)|     1 |    11 |
|*  3 |    INDEX RANGE SCAN                | TRADE_IDX   |     1 |       |    24   (5)|     1 |    11 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("TRADE"."TRAD_TAG"=185263584)

14 rows selected.

SQL> select * from trade where trad_tag = 185263584;
select * from trade where trad_tag = 185263584
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel

2. Can access the index with no table access.

SQL> explain plan for
  2  select rowid rd, trad_tag from trade where trad_tag = 185263584;

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------
| Id  | Operation            |  Name       | Rows  | Bytes | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |             |     1 |    14 |    12   (0)|       |       |
|   1 |  PARTITION LIST ALL  |             |       |       |            |     1 |    11 |
|*  2 |   INDEX RANGE SCAN   | TRADE_IDX   |     1 |    14 |    24   (5)|     1 |    11 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("TRADE"."TRAD_TAG"=185263584)

13 rows selected.

SQL> select rowid rd, trad_tag from trade where trad_tag = 185263584;

RD                   TRAD_TAG
------------------ ----------
AAAYBNAAYAAEK/bAAR  185263584

3. But can access the corruption directly via rowid:

SQL> explain plan for
  2  select trad_tag, int_override_ind, length(int_override_ind)
  3  from   trade
  4  where  rowid = 'AAAYBNAAYAAEK/bAAR';

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------
| Id  | Operation                  |  Name       | Rows  | Bytes | Cost (%CPU)| Pstart| Pstop |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |             |     1 |    16 |     2  (50)|       |       |
|   1 |  TABLE ACCESS BY USER ROWID| TRADE       |     1 |    16 |     2  (50)| ROWID | ROW L |
-----------------------------------------------------------------------------------------------

7 rows selected.

SQL> select trad_tag, int_override_ind, length(int_override_ind)
  2  from   trade
  3  where  rowid = 'AAAYBNAAYAAEK/bAAR';

  TRAD_TAG I LENGTH(INT_OVERRIDE_IND)
---------- - ------------------------
           N                        8

SQL> 

4. But an index of one of the corrupted values in the corrupt block is also affected:

SQL> explain plan for 
  2  select route_code, count(*) from trade group by route_code;

Explained.

SQL>  select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------

--------------------------------------------------------------------------------------------
| Id  | Operation              |  Name        | Rows  | Bytes | Cost (%CPU)| Pstart| Pstop |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |              |     3 |     6 | 15098  (57)|       |       |
|   1 |  SORT GROUP BY         |              |     3 |     6 | 15098  (57)|       |       |
|   2 |   PARTITION LIST ALL   |              |       |       |            |     1 |    11 |
|   3 |    INDEX FAST FULL SCAN| TRADE_7_IDX  |    30M|    59M|  7452  (13)|     1 |    11 |
--------------------------------------------------------------------------------------------

9 rows selected.


SQL> select route_code, count(*) from trade group by route_code;
ERROR:
ORA-00600: internal error code, arguments: [%s], [%s], [%s], [%s], [%s], [%s], [%s], [%s]

For one illustration of the corruption, see this VARCHAR2(1) column:

SQL> desc trade
 Name                                                              Null?    Type
 ----------------------------------------------------------------- -------- -----------------
 ...
 INT_OVERRIDE_IND                                                  NOT NULL VARCHAR2(1)
 ...

SQL> select int_override_ind, length(int_override_ind) lngth, dump(int_override_ind) dmp
  2  from trade
  3  where trad_tag = 185263584;

I      LNGTH DMP
- ---------- --------------------------------------------------
N          8 Typ=1 Len=8: 78,1,78,255,255,1,128,1

Back to access by rowid…

5. This works in Toad but not SQL*Plus, i.e. there must be some significant difference in how the
different clients deal with their resultsets:

SQL> select * 
  2  from   trade 
  3  where  rowid IN ( chartorowid('AAAYBNAAYAAEK/bAAR'), 
  5                    chartorowid('AAAYBNAAYAAEK/bAAT'));
ERROR:
ORA-01858: a non-numeric character was found where a numeric was expected


no rows selected

SQL> 

That’s it really.

Obviously we need to fix it.

Upgrade update

Finally, my client’s 11g upgrade is approaching.

It feels like it’s been imminent for ages as we slipped from February 2011 to September then through October to November 2011, February 2012, March 10 2012 and latterly to March 25 2012.

On the whole, these slippages have been more to do with affairs of big, corporate IT rather than any issues with the testing of the application under 11g, for example, red tape, higher priority changes and procurement for the whole 11g development estate.

When we actually go live, I’d be surprised if we don’t see some significant issues and oversights in a relatively short timeframe.

However, before then I just want to go through a quick whirlwind of some of the highlights and challenges undertaken/encountered:

Migration from 9.2.0.8 on Solaris to 11.2.0.3 on Linux via an in situ upgrade to 10g on Solaris as an interim step to faciliate use of datapump as opposed to imp/exp
- Change of hardware.
- Change of OS.
- Change of db version.
- We do not use “incremental change”.

Moving from a 20G sga with 17G buffer cache to a 70G sga and a minimum 25G buffer cache – in reality nearer 60G – using sga_target and db_cache_size.

Not using AMM (memory_target) yet not using hugepages!

Async and concurrent IO.

Moving from a statistics strategy of ‘FOR ALL INDEXED COLUMNS SIZE AUTO’ fixed with the two-monthly release cycle to the default stats job (running in a weekend window).

Removal of a weekend maintenance job that rebuilt lots of indexes for undocumented historical reasons.

Application issues:

No (very limited) use of sql plan baselines.

Trial of cursor_sharing = force for modules with poor shared sql principles, abandonment of trial on 11.2.0.2 due to ORA-07445/ORA-00600 from pro*c code.

Manual tuning of any code that exhibited any performance degradation – plenty of that (not surprising given that the oldest comments in the code date back to 1992).

Bugs/Patches applied initially on 11.2.0.2:
11719151 – crippling sql plan management slowness
9842771 – wrong sreadtim, mreadtim statistics
10269193 – wrong results with outer join and case expression
9877980 – issues with cursor_sharing = force

then because of an XSLT bug (10390389) and because of the number of other bugs listed as fixed
11.2.0.3 (which incorporates patches 11719151 and 9877980 previously applied on top of 11.2.0.2)

Fundamentally Irrelevant

There’s always a risk when you go off testing something that you notice some side-effect or issue that turns out to be irrelevant to the main investigation.

I’ve been investigating a performance problem on an insert .. select statement.

For the last couple of days, I’ve had a physical copy of the production database from the day before and I’ve been running this statement, rolling it back, running it again, rolling it back, etc and I’ve been running it in within a plsql harness doing some before/after runstats calls to supplement the extended trace.

Comparing the first run of the insert statement with subsequent runs, I noticed significant differences between the statistics leaf node splits and leaf node 90-10 splits and wondered how differences in these might tie in to some of the differences in other statistics.

The actual values are unimportant but on the first run there were roughly 3000 leaf node splits and 1200 leaf node 90 10 splits, meaning that 1800 were 50-50.

On any subsequent run, there were 1200 leaf node splits, of which all were leaf node 90-10 splits.

So, why the difference?

With a 50:50 leaf split, I want to insert a new entry somewhere in the middle of my existing index block and so Oracle is taking my full index block, getting a new block and allocating half of the index entries from the full block and updating the various linking references.

Whereas with a 90:10 split, I’ve got a value higher than the current entries in this full block. Typically you get 90:10 splits with sequence inserts, i.e. monotonically increasing values always going into the far right side of the index, each value bigger than the last.

So why the different observations?

I wondered the question on oracle-l but sometimes just being a bit dim doesn’t come translate effectively in an email or forum thread.

There was far too much information in my question when really I should have just asked “what am I not getting?“.

Sometimes it’s difficult to ask the right question when you don’t get what you’re missing.

Sometimes we can look at 2 + 2 and come up with 5.

Sometimes we’re just being vacant and just staring at 2 + 2 not realising we’re expected to add it up?

I understood why my insert would be getting mostly 50:50 splits on these indexes and 90:10 splits on those but didn’t simply get the correlation with the rollback & repeat.

Space management is a recursive operation that is effectively unimpacted by my transaction rollback.

I could understand why there were no 50:50 splits in the subsequent runs.

So, even though my transaction rolled back, the effect of my transaction-that-never-was was still to split some of the index keys across more blocks that they used to occupy.

But then why wouldn’t the same be true for 90-10 splits?

Honestly, this was a painful wait for me for the penny to drop and I prefer not to do my penny drop waiting in public.

Umm… because those new blocks were filled with new data – those hundreds of thousands of new sequence numbers.

… and you rolled back

… so you left behind a bunch of completely empty blocks which went back on the freelist

… ready for you to do it all again when you repeated the insert.

Doh!

Something fundamental just overlooked.

And yet completely irrelevant to what I was investigating, brought about only because of the whole roll-back-and-repeat thing

Materialize

Summary – Note the recursive SQL, the association of an in-memory temporary table with a child cursor, and possible side-effects for distributed transactions.

Prompted by a recent thread on the OTN forums, if you /*+ materialize */ a subquery will you always get IO associated with that materialisation?

Short answer: Yes. For that is what materialisation is all about.

A longer answer is perhaps slightly more interesting.

In terms of materialisation, you can force it with the hint above or it will automatically kick in if you reference it at least twice. I’m not aware of this threshold being documented but Jonathan Lewis mentioned this observation here and it ties in with what I’ve seen.

And it doesn’t seem to matter how small the result set is, it will always be materialised if those materialisation criteria are met.

If we trace a new query, we can see some of the recursive sql involved.

SQL> alter session set events '10046 trace name context forever, level 12';
SQL>
SQL> with x as
  2  (select /*+ materialize */
  3          1 col1
  4   from   dual)
  5  select * from x;

      COL1
----------
         1

SQL> alter session set events '10046 trace name context off';
SQL>

Tracing older versions of Oracle can be more revealing because in 9iR2 for example the trace file explicitly lists the recursive INSERT into the temp table, whereas by the time you get to 11.2 the INSERT has disappeared and the associated waits incorporated into the SELECT.

All versions list the creation of the temporary table (if indeed it needs to be created – see below), the DDL for which includes the specifications IN_MEMORY_METADATA and CURSOR_SPECIFIC_SEGMENT.

CREATE GLOBAL TEMPORARY TABLE "SYS"."SYS_TEMP_0FD9D6610_8A97FC" ("C0" NUMBER )
   IN_MEMORY_METADATA CURSOR_SPECIFIC_SEGMENT STORAGE (OBJNO 4254950928 )
  NOPARALLEL

Note in the creation of table SYS_TEMP_0FD9D6610_8A97FC that 0FD9D6610 is the hex of 4254950928, which is just the sequence-based objno. Not sure of the significance of the last part, e.g.8A97FC.

We can also see that the data is written to temp using a direct path write/direct path write temp depending on version … and selected back via the buffer cache (for efficient use of the data) using a db file sequential read or db file scattered read.

In older versions as mentioned, you should find the recursive INSERT listed separately, e.g. (different database, different version, different temp table name and if you’re interested in the control file sequential read see this post by Timur Akhmadeev):

INSERT /*+ APPEND BYPASS_RECURSIVE_CHECK */ INTO 
  "SYS"."SYS_TEMP_0FD9D662B_671BC5CD" SELECT /*+ */ 1 FROM "SYS"."DUAL" 
  "DUAL"

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  control file sequential read                    3        0.00          0.00
  direct path write                               1        0.00          0.00
********************************************************************************

Otherwise in newer versions, no insert but the waits for the recursive statement listed as part of the main select:

with x as
(select /*+ materialize */
        1 col1
 from   dual)
select * from x 

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        2        0.00          0.00
  direct path write temp                          1        0.00          0.00
  direct path sync                                1        0.02          0.02
  SQL*Net message to client                       2        0.00          0.00
  db file sequential read                         1        0.00          0.00
  SQL*Net message from client                     2       14.45         14.45
********************************************************************************

The temp table exists in memory and our session and other sessions cannot describe it but can select from it:

SQL> desc  sys.SYS_TEMP_0FD9D6610_8A97FC;
ERROR:
ORA-04043: object sys.SYS_TEMP_0FD9D6610_8A97FC does not exist


SQL> select * from sys.SYS_TEMP_0FD9D6610_8A97FC;

no rows selected

SQL> 

Note that the temp table is associated with the child cursor. This can be observed by using multiple sessions and forcing the creation of multiple child cursors – for example by using different optimizer settings – and tracing those sessions.

Subsequent executions of this cursor – by this session or another – can reuse this existing in-memory temporary table with no need to recreate it.

So, if we ran into one of the numerous situations that exist – often caused by bugs – where there are excessive child cursors for sql statements, if these use materialised subqueries then this is something else to be slightly concerned about.

If the cursor ages out or we flush the shared pool, the table will be cleaned up along with the cursor.

SQL> alter system flush shared_pool;
SQL> select * from sys.SYS_TEMP_0FD9D6610_8A97FC;
select * from sys.SYS_TEMP_0FD9D6610_8A97FC
                  *
ERROR at line 1:
ORA-00942: table or view does not exist


SQL> 

This recursive creation of the temp table might raise some interesting questions. For example, how this (recursive DDL) might affect / be affected by transactions?

Short answer: It does and it doesn’t

The longer answer is that it only seems to affect distributed transactions and this effect is apparently a bug or bugs, separately listed in both 10.2 – bug 9399589 – and 11.1/11.2 – bug 9706532.

I’ve not tested the proposed patches to the issue, but certainly what happens in 11.2.0.3 is that if you hard-parse the statement as part of a distributed transaction, then the materialisation is silently bypassed.

SQL> alter system flush shared_pool;
SQL> -- distributed transaction
SQL> insert into t1@test values(1);
SQL> with x as
  2  (select /*+ materialize */
  3          1
  4   from   dual)
  5  select * from x;

         1
----------
         1
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------
SQL_ID  0x0a8cyg22wz7, child number 0
-------------------------------------
with x as (select /*+ materialize */         1  from   dual) select *
from x

Plan hash value: 1388734953

-----------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |     2 (100)|          |
|   1 |  FAST DUAL       |      |     1 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------

SQL> commit;
SQL> -- no distributed transaction
SQL> with x as
  2  (select /*+ materialize */
  3          1
  4   from   dual)
  5  select * from x;

         1
----------
         1
SQL> 
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------
SQL_ID  0x0a8cyg22wz7, child number 0
-------------------------------------
with x as (select /*+ materialize */         1  from   dual) select *
from x

Plan hash value: 1388734953

-----------------------------------------------------------------
| Id  | Operation        | Name | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT |      |       |     2 (100)|          |
|   1 |  FAST DUAL       |      |     1 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------

SQL> 


Whereas if it’s a local transaction that does the hard-parse then materialisation can be used and subsequent executions of that cursor in a distributed transaction can make use of that plan and the existing temp table.

SQL> alter system flush shared_pool;

System altered.

SQL> -- no distributed transaction
SQL> with x as
  2  (select /*+ materialize */
  3          1
  4   from   dual)
  5  select * from x;

         1
----------
         1

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------
SQL_ID  0x0a8cyg22wz7, child number 0
-------------------------------------
with x as (select /*+ materialize */         1  from   dual) select *
from x

Plan hash value: 3267439756

---------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                      | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                           |       |       |     4 (100)|
|   1 |  TEMP TABLE TRANSFORMATION |                           |       |       |            |
|   2 |   LOAD AS SELECT           |                           |       |       |            |
|   3 |    FAST DUAL               |                           |     1 |       |     2   (0)|
|   4 |   VIEW                     |                           |     1 |     3 |     2   (0)|
|   5 |    TABLE ACCESS FULL       | SYS_TEMP_0FD9D6604_8B16D2 |     1 |    13 |     2   (0)|
---------------------------------------------------------------------------------------------


18 rows selected.

SQL> -- distributed transaction
SQL> insert into t1@test values(1);

1 row created.

SQL> with x as
  2  (select /*+ materialize */
  3          1
  4   from   dual)
  5  select * from x;

         1
----------
         1

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------
SQL_ID  0x0a8cyg22wz7, child number 0
-------------------------------------
with x as (select /*+ materialize */         1  from   dual) select *
from x

Plan hash value: 3267439756

---------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                      | Rows  | Bytes | Cost (%CPU)|
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                           |       |       |     4 (100)|
|   1 |  TEMP TABLE TRANSFORMATION |                           |       |       |            |
|   2 |   LOAD AS SELECT           |                           |       |       |            |
|   3 |    FAST DUAL               |                           |     1 |       |     2   (0)|
|   4 |   VIEW                     |                           |     1 |     3 |     2   (0)|
|   5 |    TABLE ACCESS FULL       | SYS_TEMP_0FD9D6604_8B16D2 |     1 |    13 |     2   (0)|
---------------------------------------------------------------------------------------------


18 rows selected.

SQL> 

I saw on Twitter last week and this week that @Boneist had an interesting experience with this sort of thing.

Finally, as a quick related distraction, note that if you try to get a real time sql monitoring report within a distributed transaction – I mean, why would you? but anyway I found this whilst investing the distributed behaviour above – then it will bomb out with ORA-32036: unsupported case for inlining of query name in WITH clause.

ORA-32035 considered a good thing?

I used to think that it was a good thing that this error was no longer raised in 11.2.

Now I’m not so sure.

SQL> select * from v$version;

BANNER
-------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> with t as
  2  (select 1 col2 from dual)
  3  select * from t1;

COL1
-----
XXXXX

SQL> 

Doh!

Note to self: Clear up old Ts, T1s, T2s, etc straight afterwards.

I used to ALWAYS prefix my factored subqueries as subq_*.

This provides a compelling reason to continue that – not that it changes anything of course, just means I’m less likely to make such a typo.

Sql tuning request

Without knowing anything about the problem in advance, I thought it would be good to do a walkthrough post of a sql tuning request.

But now that I’m done I’m unconvinced as it’s probably too long, the query too meaningless and the real time sql monitoring text output too unreadable in a blog post.

I’m always reluctant to post real, specific application issues because I’m never sure how well they translate and illustrate the desired points unless you convert them to a standalone test case.

Let’s see how it goes.

  • The idea is to touch on the broad strategies that I’m going through.
  • The scope of the solution should match the scope of the problem – so for a single problem query, table design and current indexes, stats and histograms should be considered as set in stone.
  • Ideally we want to avoid hinting as much as possible or at least stick to acceptable hints.
  • And if we’re going to manually intervene, we’re not particularly interested in what the current production plan is nor the usage of any plan stability features to preserve it (unless it’s better than we can do ourselves).

This is from an 11gR2 testing environment, 11.2.0.3 to be specific.

Having done the latest merge of a production code release (9.2.0.8) into our 11gR2 environment (upgrade ETA March), a report has been reported as slow, taking about 1.5 minutes in production and some 20 minutes in the 11gR2 environment.

So, having traced the report and found that the driving query is the prime suspect, let’s get some feedback on the performance of the SQL statement.

Here’s the original sql statement to give some context – I’ve commented out most of the columns because they don’t add much other than just length to the post.

SELECT ... some columns ...,
       ... a function call ...,
       ... some more columns ...
FROM   isbk,
       inst,
       bsta,
       isco,
       sdol,
       borg,
       book,
       isdm,
       rule
WHERE  book.book_num = rule.book_num
AND    rule.rule_type_code = 'IBK'
AND    isbk.inst_num = inst.inst_num
AND    inst.inst_num = bsta.inst_num
AND    inst.inst_num = isco.inst_num
AND    isco.inst_num = isdm.inst_num
AND    isco.xcod_code = 'SEDL'
AND    isco.inst_num = sdol.inst_num
AND    sdol.xcod_code = 'ISIN'
AND    borg.borg_num = inst.issuer_num
AND    book.book_num = isbk.book_num
AND    bsta.current_mat_date >= TO_DATE(v_bus_date, 'yyyymmdd')
UNION ALL
SELECT ... some columns ...
FROM   inst, 
       bsta,
       isco,
       sdol,
       borg,
       isdm,
       inix
WHERE  inst.inst_num = bsta.inst_num
AND    inst.inst_num = isco.inst_num
AND    isco.inst_num = isdm.inst_num
AND    isco.xcod_code = 'SEDL'
AND    isco.inst_num = sdol.inst_num
AND    sdol.xcod_code = 'ISIN'
AND    borg.borg_num = inst.issuer_num
AND    bsta.current_mat_date >= TO_DATE(v_bus_date, 'yyyymmdd')
AND    isco.inst_num = inix.inst_num
AND    inix.xcod_code = 'NIX'
AND    NOT EXISTS (SELECT 1
                   FROM   isbk,
                          rule,
                          book,
                          bsta
                   WHERE  rule.rule_type_code = 'IBK'
                   AND    book.book_num = rule.book_num
                   AND    book.book_num = isbk.book_num
                   AND    inst.inst_num = isbk.inst_num
                   AND    inst.inst_num = bsta.inst_num
                   AND    bsta.current_mat_date >= TO_DATE(v_bus_date, 'yyyymmdd'));

Looking at the query and the repetition of tables and joins between the two UNION ALL parts, it looks like a classic case of two sets of disparate driving data that then need to be joined to the same additional tables.

Ironically, expanding ORs out into UNIONs is a common initial tuning step for performance problems with OR predicates.

Here’s what little extra schema knowledge might be useful:

  • INST has a pk of INST_NUM.
  • ISCO has an n:1 relationship with INST.
  • ISDM has an n:1 relationship with INST.
  • BORG has a 1:1 relationship with INST.ISSUER_NUM.
  • BSTA has a 1:1 relationship with INST.
  • ISBK has a n:1 relationship with INST and a n:1 relationship with BOOK.
  • BOOK has a pk of BOOK_NUM.
  • RULE is a table used for generic filtering rules, in this case related to BOOK.BOOK_NUM.

Let’s use Real-Time SQL Monitoring (usual license caveats apply) to see what the current performance story is.

SELECT dbms_sqltune.report_sql_monitor(<sql_id>) FROM DUAL;

If there’s one downside to RTSM, it’s a bit too wide for these blog posts :(
Maybe the RTSM pictures would have been better?

Anyway, I’ve sacrificed some of the columns in the report and tried to shrink the font (there’s a scrollbar at the bottom of the plan window).


Global Stats
===========================================================
| Elapsed |   Cpu   | PL/SQL  |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
===========================================================
|    1094 |    1094 |    0.29 |     0.20 |    60 |   257K |
===========================================================

SQL Plan Monitoring Details (Plan Hash Value=1036829859)
===============================================================================================================================
| Id |                Operation                |       Name         |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                                         |                    | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
===============================================================================================================================
|  0 | SELECT STATEMENT                        |                    |         |     1 |    29225 |          |                 |
|  1 |   NESTED LOOPS                          |                    |       1 |     1 |    29225 |          |                 |
|  2 |    NESTED LOOPS                         |                    |       1 |     1 |    29205 |          |                 |
|  3 |     NESTED LOOPS                        |                    |       1 |     1 |    29205 |          |                 |
|  4 |      VIEW                               | VW_JF_SET$AD8EBC08 |    6147 |     1 |    29205 |          |                 |
|  5 |       UNION-ALL                         |                    |         |     1 |    29205 |          |                 |
|  6 |        NESTED LOOPS                     |                    |       9 |     1 |    28719 |          |                 |
|  7 |         NESTED LOOPS                    |                    |      53 |     1 |    28722 |          |                 |
|  8 |          NESTED LOOPS                   |                    |     370 |     1 |    31674 |          |                 |
|  9 |           NESTED LOOPS                  |                    |     370 |     1 |    31674 |          |                 |
| 10 |            NESTED LOOPS                 |                    |       3 |     1 |        1 |          |                 |
| 11 |             INDEX RANGE SCAN            | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 12 |             TABLE ACCESS BY INDEX ROWID | BOOK               |       1 |     1 |        1 |          |                 |
| 13 |              INDEX UNIQUE SCAN          | BOOK_IDX           |       1 |     1 |        1 |          |                 |
| 14 |            INDEX RANGE SCAN             | ISBK_IDX           |     116 |     1 |    31674 |          |                 |
| 15 |           TABLE ACCESS BY INDEX ROWID   | INST               |       1 | 31674 |    31674 |          |                 |
| 16 |            INDEX UNIQUE SCAN            | ISTR_PK            |       1 | 31674 |    31674 |          |                 |
| 17 |          TABLE ACCESS BY INDEX ROWID    | BSTA               |       1 | 31674 |    28722 |          |                 |
| 18 |           INDEX UNIQUE SCAN             | BSTAC_IDX          |       1 | 31674 |    31654 |          |                 |
| 19 |         INDEX RANGE SCAN                | ISCO_4_IDX         |       1 | 28722 |    28719 |          |                 |
| 20 |        HASH JOIN                        |                    |    6138 |     1 |      486 |          |                 |
| 21 |         HASH JOIN ANTI                  |                    |    5716 |     1 |    54961 |    67.16 | Cpu (732)       |
| 22 |          HASH JOIN                      |                    |   24385 |     1 |    83680 |          |                 |
| 23 |           INDEX RANGE SCAN              | ISCO_4_IDX         |    170K |     1 |     377K |          |                 |
| 24 |           HASH JOIN                     |                    |    144K |     1 |     266K |          |                 |
| 25 |            TABLE ACCESS FULL            | BSTA               |    144K |     1 |     266K |     0.09 | Cpu (1)         |
| 26 |            TABLE ACCESS FULL            | INST               |      1M |     1 |       1M |          |                 |
| 27 |          VIEW                           | VW_SQ_1            |     53M |     1 |       4G |     5.60 | Cpu (61)        |
| 28 |           HASH JOIN                     |                    |     53M |     1 |       4G |    27.06 | Cpu (295)       |
| 29 |            MERGE JOIN CARTESIAN         |                    |    458K |     1 |     266K |          |                 |
| 30 |             NESTED LOOPS                |                    |       3 |     1 |        1 |          |                 |
| 31 |              INDEX RANGE SCAN           | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 32 |              INDEX UNIQUE SCAN          | BOOK_IDX           |       1 |     1 |        1 |          |                 |
| 33 |             BUFFER SORT                 |                    |    144K |     1 |     266K |          |                 |
| 34 |              INDEX FAST FULL SCAN       | BSTA_IDX_2         |    144K |     1 |     266K |          |                 |
| 35 |            TABLE ACCESS FULL            | ISBK               |      2M |     1 |       2M |          |                 |
| 36 |         INDEX RANGE SCAN                | ISCO_4_IDX         |    170K |     1 |    32983 |          |                 |
| 37 |      TABLE ACCESS BY INDEX ROWID        | BORG               |       1 | 29205 |    29205 |          |                 |
| 38 |       INDEX UNIQUE SCAN                 | BORG_PK            |       1 | 29205 |    29205 |          |                 |
| 39 |     INDEX RANGE SCAN                    | ISCO_4_IDX         |       1 | 29205 |    29205 |          |                 |
| 40 |    INDEX RANGE SCAN                     | ISDM_IDX           |       1 | 29205 |    29225 |          |                 |
===============================================================================================================================

There’s a lot in this report.

First thing you might have spotted is the Join Factorisation going on @ step 4 as indicated by the name VW_JF*.

Maybe you’re always wary of the MERGE JOIN CARTESIAN … BUFFER SORT? Not always a problem of course but where there’s trouble you’ll often find her (in real production scenarios, I find that this mechanism is a problem not so much when there are missing join conditions – because these are rarely found in production code – but rather as a valid join mechanism but where the rowsource estimates are significantly inaccurate).

If you look at the SQL, you’ll see a function call as well.

There are also a whole bunch of estimates that are significantly off.

Where to start?

There are four main areas of questioning:

  1. What’s taking all the time? Are there some particular steps in the plan which are more problematic than others?
  2. If estimates are inaccurate, where do they go most wrong or go wrong first?
  3. Which predicates eliminate the most data? i.e. it’s rarely a good thing to join thousands upon thousands of rows only to do a late filter in the plan to reduce it down to a few handfuls. Aka eliminate early.
  4. What is the simplest / quickest / least invasive change that can be made to significantly improve performance? And will it actually be sufficient?

From “Activity %”, I hope it’s clear from the report that all the time is taken up by the bottom half of the UNION ALL.

  • The rowsource cardinality estimates are not that accurate.
  • Plus we find in that second half our old friend the MJC+BS.
  • And we’re just burning CPU down there.

So, I’d like to isolate that bottom half of the UNION ALL and run it standalone.

However, in a clear indication of the issues with it, it won’t run standalone – It’s just blown 30+ gig of temp space.

But it returns only 486 rows so let’s try to get a runnable standalone version.

Before we start looking at possible solutions, let’s start to ask questions about the query logic itself, keeping an eye out for redundant tables and joins and asking ourselves whether this is the best way to word the question we think is being asked.

Best way to start that is

So, we’ve got a few “filters” predicates, a couple of “join predicates” and a NOT EXISTS correlated subquery.

The logic of the subquery is the first thing that jumps out at me.
1. BOOK seems redundant
We join RULE to BOOK and BOOK to ISBK all by BOOK_NUM and ISBK is correlated to the outer INST by INST_NUM.
We do no filtering by any BOOK attribute so it serves no purpose so, let’s remove BOOK, join RULE straight to ISBK.

2. The subquery filtering by BSTA.CURRENT_MAT_DATE is irrelevant.
This is the same filter as in the outer select.
It doesn’t make sense.
Ignoring the actual evaluation order of all these predicates, we should effectively only be checking the NOT EXISTS against INST_NUMS that have passed the outer BSTA filter. So, by definition, this particular predicate in the subquery will always be true. Why repeat it? This is a mistake.
The only thing this subquery should be doing is checking they’re not in the RULE/ISBK combo. So, let’s remove that.

If we run just comment out the tables and joins as per suggestion above, then that bottom query runs in a couple of seconds:


Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    1.12 |    1.12 |     0.00 |     2 |  73879 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=2734204492)
==========================================================================================================
| Id |         Operation          |   Name     |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                            |            | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
==========================================================================================================
|  0 | SELECT STATEMENT           |            |         |     1 |      489 |          |                 |
|  1 |   HASH JOIN RIGHT ANTI     |            |   38162 |     1 |      489 |          |                 |
|  2 |    VIEW                    | VW_SQ_1    |     555 |     1 |    31674 |          |                 |
|  3 |     NESTED LOOPS           |            |     555 |     1 |    31674 |          |                 |
|  4 |      INDEX RANGE SCAN      | RULE_IDX_2 |       3 |     1 |        1 |          |                 |
|  5 |      INDEX RANGE SCAN      | ISBK_IDX   |     176 |     1 |    31674 |          |                 |
|  6 |    HASH JOIN               |            |   38189 |     1 |    28700 |          |                 |
|  7 |     HASH JOIN              |            |   37626 |     1 |    28700 |          |                 |
|  8 |      HASH JOIN             |            |   35036 |     1 |    86956 |   100.00 | Cpu (1)         |
|  9 |       HASH JOIN            |            |   32625 |     1 |    87212 |          |                 |
| 10 |        HASH JOIN           |            |   31565 |     1 |    83593 |          |                 |
| 11 |         INDEX RANGE SCAN   | ISCO_4_IDX |    170K |     1 |     377K |          |                 |
| 12 |         HASH JOIN          |            |    144K |     1 |     266K |          |                 |
| 13 |          TABLE ACCESS FULL | BSTA       |    144K |     1 |     266K |          |                 |
| 14 |          TABLE ACCESS FULL | INST       |    796K |     1 |     796K |          |                 |
| 15 |        TABLE ACCESS FULL   | ISDM       |      1M |     1 |       1M |          |                 |
| 16 |       INDEX RANGE SCAN     | ISCO_4_IDX |    170K |     1 |     861K |          |                 |
| 17 |      INDEX RANGE SCAN      | ISCO_4_IDX |    170K |     1 |    32983 |          |                 |
| 18 |     TABLE ACCESS FULL      | BORG       |    328K |     1 |     328K |          |                 |
==========================================================================================================

Note that I’ve had to hint this with the /*+ monitor */ hint because by default this now executes beneath the default threshold for monitoring.

It looks like we’ve arrived at our quickest/simplest change just by going through the query quickly trying to understand the question it’s asking. No hints required.

So this is would be a good place to stop, thoroughly test the change, validate the original and the changed results and move on to something else.

We should drop it back into the UNION ALL and see what we get:


Global Stats
===========================================================
| Elapsed |   Cpu   | PL/SQL  |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
===========================================================
|    4.68 |    4.68 |    0.38 |     0.00 |    60 |   230K |
===========================================================

SQL Plan Monitoring Details (Plan Hash Value=4171599168)
===============================================================================================================================
| Id |                Operation                |       Name         |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                                         |                    | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
===============================================================================================================================
|  0 | SELECT STATEMENT                        |                    |         |     1 |    29225 |          |                 |
|  1 |   NESTED LOOPS                          |                    |       1 |     1 |    29225 |          |                 |
|  2 |    NESTED LOOPS                         |                    |       1 |     1 |    29205 |          |                 |
|  3 |     HASH JOIN                           |                    |       1 |     1 |    29205 |          |                 |
|  4 |      TABLE ACCESS FULL                  | BORG               |    328K |     1 |     328K |          |                 |
|  5 |      VIEW                               | VW_JF_SET$AD8EBC08 |   26182 |     1 |    29205 |          |                 |
|  6 |       UNION-ALL                         |                    |         |     1 |    29205 |          |                 |
|  7 |        NESTED LOOPS                     |                    |       9 |     1 |    28719 |          |                 |
|  8 |         NESTED LOOPS                    |                    |      53 |     1 |    28722 |          |                 |
|  9 |          NESTED LOOPS                   |                    |     370 |     1 |    31674 |          |                 |
| 10 |           NESTED LOOPS                  |                    |     370 |     1 |    31674 |          |                 |
| 11 |            NESTED LOOPS                 |                    |       3 |     1 |        1 |          |                 |
| 12 |             INDEX RANGE SCAN            | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 13 |             TABLE ACCESS BY INDEX ROWID | BOOK               |       1 |     1 |        1 |          |                 |
| 14 |              INDEX UNIQUE SCAN          | BOOK_IDX           |       1 |     1 |        1 |          |                 |
| 15 |            INDEX RANGE SCAN             | ISBK_IDX           |     116 |     1 |    31674 |          |                 |
| 16 |           TABLE ACCESS BY INDEX ROWID   | INST               |       1 | 31674 |    31674 |          |                 |
| 17 |            INDEX UNIQUE SCAN            | ISTR_PK            |       1 | 31674 |    31674 |          |                 |
| 18 |          TABLE ACCESS BY INDEX ROWID    | BSTA               |       1 | 31674 |    28722 |          |                 |
| 19 |           INDEX UNIQUE SCAN             | BSTA_IDX           |       1 | 31674 |    31654 |          |                 |
| 20 |         INDEX RANGE SCAN                | ISCO_4_IDX         |       1 | 28722 |    28719 |          |                 |
| 21 |        HASH JOIN RIGHT ANTI             |                    |   26173 |     1 |      486 |          |                 |
| 22 |         VIEW                            | VW_SQ_1            |     555 |     1 |    31674 |          |                 |
| 23 |          NESTED LOOPS                   |                    |     555 |     1 |    31674 |          |                 |
| 24 |           INDEX RANGE SCAN              | RULE_IDX_2         |       3 |     1 |        1 |          |                 |
| 25 |           INDEX RANGE SCAN              | ISBK_IDX           |     176 |     1 |    31674 |          |                 |
| 26 |         HASH JOIN                       |                    |   26187 |     1 |    28693 |          |                 |
| 27 |          HASH JOIN                      |                    |   24385 |     1 |    83680 |    33.33 | Cpu (1)         |
| 28 |           INDEX RANGE SCAN              | ISCO_4_IDX         |    170K |     1 |     377K |          |                 |
| 29 |           HASH JOIN                     |                    |    144K |     1 |     266K |          |                 |
| 30 |            TABLE ACCESS FULL            | BSTA               |    144K |     1 |     266K |          |                 |
| 31 |            TABLE ACCESS FULL            | INST               |      1M |     1 |       1M |          |                 |
| 32 |          INDEX RANGE SCAN               | ISCO_4_IDX         |    170K |     1 |    32983 |          |                 |
| 33 |     INDEX RANGE SCAN                    | ISCO_4_IDX         |       1 | 29205 |    29205 |          |                 |
| 34 |    INDEX RANGE SCAN                     | ISDM_IDX           |       1 | 29205 |    29225 |          |                 |
===============================================================================================================================

So we could leave it there.

However, I wouldn’t blame you if you wanted to go further, even if we’re bordering on Compulsive Tuning Disorder:

  • I don’t really like leaving behind a plan that’s got significantly inaccurate estimates – it leaves behind too much of a future threat.
  • And maybe we can also look at the original query and the direction the Join Factorisation was indicating and take it further? How about doing the UNION ALL a bit earlier and then do one lot of joining to the shared tables from the original SQL?
  • Let’s use some dynamic sampling to improve some single table cardinality estimates but then perhaps let’s go too far and add some join estimate adjustments (I say too far because I’m going to use opt_estimate but it’s undocumented and I’m not recommending it but if you do use it, then use it in conjunction with qb_name).
  • And let’s get rid of the function call which, take it from me, in this case is effectively a single table outer join lookup anyway.
SELECT  /*+
          find_me
          monitor
          qb_name(main)
          dynamic_sampling(sdol@main 4)
          dynamic_sampling(isin@main 4)
          */
        ... some columns ...
FROM   (SELECT /*+
                 qb_name(union1)
                 opt_estimate(join(isbk@union1 rule@union1) scale_rows=100)
                 */
               isbk.inst_num  inst_num
        ,      inix.inst_code nix_code
        ,      (SELECT book.name
                FROM   book
                WHERE  book.book_num = rule.book_num) book_name
        FROM   isbk
        ,      rule
        ,      inix
        WHERE  rule.rule_type_code    = 'IBK'
        AND    rule.book_num          = isbk.book_num 
        AND    inix.inst_num      (+) = isbk.inst_num
        AND    inix.xcod_code     (+) = 'NIX'
        UNION ALL
        SELECT /*+
                 qb_name(union2)
                 dynamic_sampling(inix@union2 4)
                 */
               inix.inst_num  inst_num
        ,      inix.inst_code nix_code
        ,      NULL           book_name
        FROM   inix
        WHERE  inix.xcod_code         = 'NIX'
        AND    NOT EXISTS (SELECT /*+ 
                                    qb_name(sub1)
                                    opt_estimate(join(isbk@sub1 rule@sub1) scale_rows=100)
                                    */
                                  1
                           FROM   isbk
                           ,      rule
                           WHERE  rule.rule_type_code    = 'IBK'
                           AND    rule.book_num          = isbk.book_num
                           AND    isbk.inst_num         = inix.inst_num))
       xxxx
,      bsta
,      isin
,      sdol
,      inst
,      borg
,      isdm
WHERE  bsta.inst_num          = xxxx.inst_num
AND    bsta.current_mat_date >= TO_DATE(:v_bus_date, 'yyyymmdd')
AND    isin.inst_num          = xxxx.inst_num 
AND    isin.xcod_code         = 'SEDL'
AND    sdol.inst_num          = xxxx.inst_num
AND    sdol.xcod_code         = 'ISIN'
AND    inst.inst_num          = xxxx.inst_num
AND    borg.borg_num          = inst.issuer_num
AND    isdm.inst_num          = xxxx.inst_num;

Which gives:


Global Stats
=================================================
| Elapsed |   Cpu   |  Other   | Fetch | Buffer |
| Time(s) | Time(s) | Waits(s) | Calls |  Gets  |
=================================================
|    1.40 |    1.40 |     0.00 |    60 |  79273 |
=================================================

SQL Plan Monitoring Details (Plan Hash Value=2392527494)
=======================================================================================================================
| Id |               Operation                |   Name      |  Rows   | Execs |   Rows   | Activity | Activity Detail |
|    |                                        |             | (Estim) |       | (Actual) |   (%)    |   (# samples)   |
=======================================================================================================================
|  0 | SELECT STATEMENT                       |             |         |     1 |    29225 |          |                 |
|  1 |   HASH JOIN                            |             |   94337 |     1 |    29225 |          |                 |
|  2 |    INDEX RANGE SCAN                    | ISCO_4_IDX  |    373K |     1 |     377K |          |                 |
|  3 |    HASH JOIN                           |             |   80747 |     1 |    29629 |          |                 |
|  4 |     HASH JOIN                          |             |   80747 |     1 |    29629 |          |                 |
|  5 |      HASH JOIN                         |             |   80747 |     1 |    29629 |          |                 |
|  6 |       HASH JOIN                        |             |   78122 |     1 |    29604 |          |                 |
|  7 |        MERGE JOIN                      |             |   78003 |     1 |    33947 |          |                 |
|  8 |         INDEX RANGE SCAN               | ISCO_4_IDX  |    818K |     1 |     861K |          |                 |
|  9 |         SORT JOIN                      |             |   55765 |  861K |    33947 |          |                 |
| 10 |          VIEW                          |             |   55765 |     1 |    33960 |          |                 |
| 11 |           UNION-ALL                    |             |         |     1 |    33960 |          |                 |
| 12 |            TABLE ACCESS BY INDEX ROWID | BOOK        |       1 |     1 |        1 |          |                 |
| 13 |             INDEX UNIQUE SCAN          | BOOK_IDX    |       1 |     1 |        1 |          |                 |
| 14 |            HASH JOIN OUTER             |             |   55485 |     1 |    31674 |          |                 |
| 15 |             NESTED LOOPS               |             |   55485 |     1 |    31674 |          |                 |
| 16 |              INDEX RANGE SCAN          | RULE_IDX_2  |       3 |     1 |        1 |          |                 |
| 17 |              INDEX RANGE SCAN          | ISBK_IDX    |   17609 |     1 |    31674 |          |                 |
| 18 |             INDEX RANGE SCAN           | ISCO_4_IDX  |    170K |     1 |    32983 |          |                 |
| 19 |            HASH JOIN ANTI              |             |     280 |     1 |     2286 |          |                 |
| 20 |             INDEX RANGE SCAN           | ISCO_4_IDX  |   28028 |     1 |    32983 |          |                 |
| 21 |             VIEW                       | VW_SQ_1     |   55485 |     1 |    31674 |          |                 |
| 22 |              NESTED LOOPS              |             |   55485 |     1 |    31674 |          |                 |
| 23 |               INDEX RANGE SCAN         | RULE_IDX_2  |       3 |     1 |        1 |          |                 |
| 24 |               INDEX RANGE SCAN         | ISBK_IDX    |   17609 |     1 |    31674 |          |                 |
| 25 |        TABLE ACCESS FULL               | BSTA        |    144K |     1 |     266K |          |                 |
| 26 |       TABLE ACCESS FULL                | ISDM        |      1M |     1 |       1M |          |                 |
| 27 |      TABLE ACCESS FULL                 | INST        |    796K |     1 |     796K |   100.00 | Cpu (1)         |
| 28 |     TABLE ACCESS FULL                  | BORG        |    328K |     1 |     328K |          |                 |
=======================================================================================================================

It might not be perfect but it’s certainly a lot better.

An acceptable compromise might be to refactor the UNION ALL as per above but omit the undocumented opt_estimate hints.

All we need to do is some more testing to validate the results and to also triple check the performance when the data is not cached and we’re done.

If you made it this far then Wow! I probably wouldn’t have done.

So, to summarise, what have we done?

  • Well, we haven’t needed to know much about the original intention of the query nor the schema.
  • We’ve improved the accuracies of some of the estimates.
  • We’ve found some redundant tables and joins.
  • And we’ve reordered the query slightly to better phrase the question that we think was being asked, in the process moving the UNION ALL earlier in the processing so that some of the joining tables only needed to be referenced once (an advantage that might not be preserved if the CBO decides to merge the UNION ALL but that would be the opposite of the recently developed join factorisation mechanism).

And these simple, quick steps effectively reduced the execution time of a query from 20 minutes to a few seconds.

In hindsight, a good example because of the gains realised but a bad example because of the length of query and the associated detail, particularly the redundant tables.

Follow

Get every new post delivered to your Inbox.

Join 62 other followers