Reluctant Local Index

I have a couple of performance issues at the moment which link back to queries against large tables choosing to do a FULL segment scan rather than using a non-prefixed local index.

The problems are not currently reproducible, in or outside of production.

And when I try to reproduce a problem at a later point in time, the system state is not necessarily as it was when the problem was occurring.

My main theory is that somewhere on these large partitioned tables there is intermittently an unusable partition of the index, perhaps due to some aspect of a data load, data movement or other maintenance operation on a different subpartition (where each partition represents a different day of a specific feed and each subpartition represents a version of that daily load).

However I have no evidence currently to support that theory.
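
One way to start gathering that evidence would be to poll the data dictionary for unusable index subpartitions while loads and maintenance are running. A minimal sketch, assuming the index lives in the current schema and is called I1 as in the test case further down; on a real system the index name is a placeholder and the query would need to run at the time the problem occurs:

```sql
-- List any subpartitions of local index I1 that are currently unusable.
-- USER_IND_SUBPARTITIONS only covers the current schema; use
-- DBA_IND_SUBPARTITIONS (with an INDEX_OWNER filter) otherwise.
select index_name
,      partition_name
,      subpartition_name
,      status
from   user_ind_subpartitions
where  index_name = 'I1'
and    status     = 'UNUSABLE'
order by partition_name, subpartition_name;
```

No rows at the moment you look proves little, of course, which is the whole difficulty with an intermittent state.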

If this theory were in the right ballpark, what would it mean?

That a query using bind variables (tick) would not use the index by default, as the optimizer would not be able to guarantee that the shareable plan would never access the unusable subpartition.

A query using literals would not be subject to the same restriction, as the optimizer would know at parse time that any unusable subpartition was not relevant to this specific query (and any DDL which might make a relevant partition unusable would invalidate the shared cursor).

Whilst I was musing over this yesterday, I had a vague sense of familiarity and I found these previous posts to jog my memory:

But just to illustrate my theory…

Let’s start with a roughly representative table:

create table  t1
(pkey         varchar2(24) not null
,spkey        number       not null
,id           varchar2(24) not null
,version      number       not null
,status       varchar2(24) not null
,filler       varchar2(255))
partition by list (pkey)
subpartition by list (spkey)
subpartition template
(subpartition sp_1 values (1))
(partition p_0 values ('X'));

alter table t1 add partition p_abc values ('ABC');
alter table t1 add partition p_def values ('DEF');
alter table t1 add partition p_ghi values ('GHI');

create index i1 on t1 (id) local;

create sequence s1 start with 10000000;
create type o1 is object
(pkey  varchar2(24)
,spkey        number
,id           varchar2(24));
/
create type c1 is table of o1;
/

select object_name, subobject_name, object_type from user_objects where object_name = 'T1' order by object_name, subobject_name nulls first;

OBJECT_NAM SUBOBJECT_ OBJECT_TYPE            
---------- ---------- -----------------------
T1                    TABLE                  
T1         P_0        TABLE PARTITION        
T1         P_0_SP_1   TABLE SUBPARTITION     
T1         P_ABC      TABLE PARTITION        
T1         P_ABC_SP_1 TABLE SUBPARTITION     
T1         P_DEF      TABLE PARTITION        
T1         P_DEF_SP_1 TABLE SUBPARTITION     
T1         P_GHI      TABLE PARTITION        
T1         P_GHI_SP_1 TABLE SUBPARTITION    

select object_name, subobject_name, object_type from user_objects where object_name = 'I1' order by object_name, subobject_name nulls first;

OBJECT_NAM SUBOBJECT_ OBJECT_TYPE            
---------- ---------- -----------------------
I1                    INDEX                  
I1         P_0        INDEX PARTITION        
I1         P_0_SP_1   INDEX SUBPARTITION     
I1         P_ABC      INDEX PARTITION        
I1         P_ABC_SP_1 INDEX SUBPARTITION     
I1         P_DEF      INDEX PARTITION        
I1         P_DEF_SP_1 INDEX SUBPARTITION     
I1         P_GHI      INDEX PARTITION        
I1         P_GHI_SP_1 INDEX SUBPARTITION     

And I’m going to seed some data simply:

declare
 p sys.odcivarchar2list := sys.odcivarchar2list('ABC','DEF','GHI');
begin
 for i in 1 .. p.count
 loop
     for j in 1 .. 10000
     loop
         insert into t1
         values (p(i), 1, 'PB:'||s1.nextval, 1, 'LATEST', rpad('X',255,'X'));
     end loop;
 end loop;
commit;
end;
/

select pkey, spkey, count(*), min(id), max(id) from t1 group by pkey, spkey;

PKEY      SPKEY   COUNT(*) MIN(ID)                  MAX(ID)                 
---- ---------- ---------- ------------------------ ------------------------
ABC           1      10000 PB:10000000              PB:10009999             
DEF           1      10000 PB:10010000              PB:10019999             
GHI           1      10000 PB:10020000              PB:10029999    

And then, just to keep it representative of my real-world problem, I’m going to run a bulk update to set the status of some of the rows to SUPERSEDED. In the real world this is followed by the INSERT of some LATEST versions of those rows supplied by the client, but there is no need to do that here:

declare
 v1 c1 := c1(o1('DEF',1,'PB:10010001'),
             o1('DEF',1,'PB:10010002'),
             o1('DEF',1,'PB:10010003'),
             o1('DEF',1,'PB:10010004'),
             o1('DEF',1,'PB:10010005'));
begin
   forall i in 1 .. v1.count
      update /*+ find_me_dom */
             t1
      set    status = 'SUPERSEDED'
      where  pkey   = v1(i).pkey
      and    spkey  = v1(i).spkey
      and    id     = v1(i).id;
end;
/

And I’m going to look up my sql id from v$sql and plug it into DBMS_XPLAN to see how my UPDATE performed:

select * from table(dbms_xplan.display_cursor('bjddz8c4jrk3y',0));

-------------------------------------------------------------
| Id  | Operation                                    | Name |
-------------------------------------------------------------
|   0 | UPDATE STATEMENT                             |      |
|   1 |  UPDATE                                      | T1   |
|   2 |   PARTITION LIST SINGLE                      |      |
|   3 |    PARTITION LIST SINGLE                     |      |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1   |
|   5 |      INDEX RANGE SCAN                        | I1   |
-------------------------------------------------------------

A range scan of I1, as desired.
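
The sql id lookup itself is not shown above. A sketch of one way to do it, assuming you have select access on v$sql and that the find_me_dom comment survives in the cursor text; the match is made case-insensitive because PL/SQL normalises static SQL before it reaches the shared pool:

```sql
-- Find the cursor(s) for the tagged UPDATE. Anchoring on UPDATE at the
-- start of the text excludes the anonymous PL/SQL block and this lookup.
select sql_id
,      child_number
,      plan_hash_value
,      executions
from   v$sql
where  upper(sql_text) like 'UPDATE%FIND_ME_DOM%'
order by last_active_time;
```

The child_number column is what gets passed as the second argument to dbms_xplan.display_cursor.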

Now I will mark one subpartition of the index unusable (a different subpartition from the one affected by the update) and repeat the update, the lookup of the sql id and the fetching of the plan (different child number):

alter index i1 modify subpartition P_ABC_SP_1 unusable;
--Repeat dml
--lookup sql

select * from table(dbms_xplan.display_cursor('bjddz8c4jrk3y',1));
-- ^ different child cursor number because marking the index subpartition as unusable
-- has invalidated the previous child cursor 0

---------------------------------------------
| Id  | Operation                    | Name |
---------------------------------------------
|   0 | UPDATE STATEMENT             |      |
|   1 |  UPDATE                      | T1   |
|   2 |   PARTITION LIST SINGLE      |      |
|   3 |    PARTITION LIST SINGLE     |      |
|   4 |     TABLE ACCESS STORAGE FULL| T1   |
---------------------------------------------

The access method is no longer an index range scan.

As I covered initially, as part of the parse process the execution plan has to be good for all possible ranges of supplied values in the event that that SQL child cursor is shared for executions with different binds than those initially parsed with. Because an update (or select etc, etc) against rows for table subpartition (ABC,1) cannot use the index because that index subpartition is unusable, the optimizer has to discount it.

We can force the index:

declare
 v1 c1 := c1(o1('DEF',1,'PB:10010001'),
             o1('DEF',1,'PB:10010002'),
             o1('DEF',1,'PB:10010003'),
             o1('DEF',1,'PB:10010004'),
             o1('DEF',1,'PB:10010005'));
begin
   forall i in 1 .. v1.count
      update /*+ find_me_dom index(t1 (id))*/
             t1
      set    status = 'SUPERSEDED'
      where  pkey   = v1(i).pkey
      and    spkey  = v1(i).spkey
      and    id     = v1(i).id;
end;
/

select * from table(dbms_xplan.display_cursor('3m2xyxgruxkpr',0));

-------------------------------------------------------------
| Id  | Operation                                    | Name |
-------------------------------------------------------------
|   0 | UPDATE STATEMENT                             |      |
|   1 |  UPDATE                                      | T1   |
|   2 |   PARTITION LIST SINGLE                      |      |
|   3 |    PARTITION LIST SINGLE                     |      |
|   4 |     TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| T1   |
|   5 |      INDEX RANGE SCAN                        | I1   |
-------------------------------------------------------------

But note, as covered in my earlier link, if we force the index in a situation where it can’t be used, it will error in current versions:

alter index i1 modify subpartition P_DEF_SP_1 unusable;
declare
 v1 c1 := c1(o1('DEF',1,'PB:10010001'),
             o1('DEF',1,'PB:10010002'),
             o1('DEF',1,'PB:10010003'),
             o1('DEF',1,'PB:10010004'),
             o1('DEF',1,'PB:10010005'));
begin
   forall i in 1 .. v1.count
      update /*+ find_me_dom index(t1 (id))*/
             t1
      set    status = 'SUPERSEDED'
      where  pkey   = v1(i).pkey
      and    spkey  = v1(i).spkey
      and    id     = v1(i).id;
end;
/
ORA-01502: index 'DOM.I1' or partition of such index is in unusable state
ORA-06512: at line 8
01502. 00000 -  "index '%s.%s' or partition of such index is in unusable state"
*Cause:    An attempt has been made to access an index or index partition
           that has been marked unusable by a direct load or by a DDL
           operation
*Action:   DROP the specified index, or REBUILD the specified index, or
           REBUILD the unusable index partition

If I rebuild that second subpartition again, then I can look at my second option – using literals not binds, at least for the partition keys, but really we then lose the option of FORALL.

alter index i1 rebuild subpartition P_DEF_SP_1;

declare
 v1 c1 := c1(o1('DEF',1,'PB:10010001'),
             o1('DEF',1,'PB:10010002'),
             o1('DEF',1,'PB:10010003'),
             o1('DEF',1,'PB:10010004'),
             o1('DEF',1,'PB:10010005'));
begin
   for i in 1 .. v1.count
   loop
     execute immediate 
     'update /*+ find_me_dom */
             t1
      set    status = ''SUPERSEDED''
      where  pkey   = '''||v1(i).pkey||''''||'
      and    spkey  = '||v1(i).spkey||'
      and    id     = :1' using v1(i).id;
   end loop;
end;
/


select * from table(dbms_xplan.display_cursor('5a3vcac58x32q',0));

----------------------------------------
| Id  | Operation               | Name |
----------------------------------------
|   0 | UPDATE STATEMENT        |      |
|   1 |  UPDATE                 | T1   |
|   2 |   PARTITION LIST SINGLE |      |
|   3 |    PARTITION LIST SINGLE|      |
|   4 |     INDEX RANGE SCAN    | I1   |
----------------------------------------

At least we have options whilst playing a waiting game to see if we can observe a problem state which might cause such an issue…. or wait for other possibilities to make themselves known…
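
If an unusable subpartition does turn up while waiting, the fix is the same rebuild used earlier. A small sketch that generates the statements from the dictionary, assuming the affected indexes are in the current schema; the output is only text and should be reviewed before any of it is run:

```sql
-- Generate a rebuild statement for every unusable local index
-- subpartition in the current schema.
select 'alter index '||index_name||
       ' rebuild subpartition '||subpartition_name||';' as rebuild_ddl
from   user_ind_subpartitions
where  status = 'UNUSABLE'
order by index_name, subpartition_name;
```

In the test case above this should pick up P_ABC_SP_1, which was marked unusable and never rebuilt.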

Detail! And don’t forget the notes…

…and how sparse tables can mess up your dynamic sampling.

A little lesson to self about paying attention to the detail, the obvious frickin’ detail. Getting older and slower.

We had one of those problems where a multitude of things were slow and yet there were no significant system-level problems (e.g. no general slow IO, no setting of the fast=false parameter, etc).

In such situations, if you check the high-level system metrics – AWR etc – and nothing is obvious, then it makes sense to narrow the scope and drill into a specific problem, distilling that problem/statement down as small as you can. If you can find the issue there then you might hope to find that it is the same as the wider problem, and if not, well, you can just take that approach to every individual area.

I had a problem across a number of reporting modules. I picked on one module, and from there I picked on the statement which was currently running and which was running for much longer than expected. I picked out one of the obvious problem joins within that statement and started looking at a simple two-table join where the cardinality estimates were going wrong.

Most of the specific details of this plan and SQL can be ignored. They’re not important.

What is important is the cardinality difference.

And in fact, in this particular cut down example, the cardinality difference does not have the plan impact it does in the original setting.

 
set pages 999
explain plan for
SELECT /*+ parallel(4)*/
	   count(*)
FROM   psbal
LEFT OUTER JOIN
       prdct
ON    (prdct.d_product_id                        = psbal.d_product_id
AND    prdct.create_ts                           <= TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6'));

select * from table(dbms_xplan.display);

Plan hash value: 3885851338
--------------------------------------------------------------------------------------------------------------------------------------------------
|  Id | Operation                             | Name          | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |  TQ   |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |               |    1 |    72 |   3430K (3)| 01:54:22 |       |       |       |      |            |
|   1 |  SORT AGGREGATE                       |               |    1 |    72 |            |          |       |       |       |      |            |
|   2 |   PX COORDINATOR                      |               |      |       |            |          |       |       |       |      |            |
|   3 |    PX SEND QC (RANDOM)                | :TQ10003      |    1 |    72 |            |          |       |       | Q1,03 | P->S |  QC (RAND) |
|   4 |     SORT AGGREGATE                    |               |    1 |    72 |            |          |       |       | Q1,03 | PCWP |            |
|*  5 |      HASH JOIN                        |               | 7466M|   500G|   3430K (3)| 01:54:22 |       |       | Q1,03 | PCWP |            |
|   6 |       PART JOIN FILTER CREATE         | :BF0000       |  286K|  1680K|     56  (0)| 00:00:01 |       |       | Q1,03 | PCWP |            |
|   7 |        PX RECEIVE                     |               |  286K|  1680K|     56  (0)| 00:00:01 |       |       | Q1,03 | PCWP |            |
|   8 |         PX SEND BROADCAST             | :TQ10000      |  286K|  1680K|     56  (0)| 00:00:01 |       |       | Q1,00 | P->P |  BROADCAST |
|   9 |          PX BLOCK ITERATOR            |               |  286K|  1680K|     56  (0)| 00:00:01 |       |       | Q1,00 | PCWC |            |
|  10 |           INDEX STORAGE FAST FULL SCAN| S_PK          |  286K|  1680K|     56  (0)| 00:00:01 |       |       | Q1,00 | PCWP |            |
|* 11 |       HASH JOIN RIGHT OUTER           |               | 7466M|   458G|   3425K (3)| 01:54:12 |       |       | Q1,03 | PCWP |            |
|  12 |        PX RECEIVE                     |               |    1 |    30 |    306  (0)| 00:00:01 |       |       | Q1,03 | PCWP |            |
|  13 |         PX SEND BROADCAST             | :TQ10001      |    1 |    30 |    306  (0)| 00:00:01 |       |       | Q1,01 | P->P |  BROADCAST |
|  14 |          PX BLOCK ITERATOR            |               |    1 |    30 |    306  (0)| 00:00:01 |     1 |     3 | Q1,01 | PCWC |            |
|* 15 |           TABLE ACCESS STORAGE FULL   | PRODUCT       |    1 |    30 |    306  (0)| 00:00:01 |     1 |     3 | Q1,01 | PCWP |            |
|* 16 |        HASH JOIN                      |               | 7466M|   250G|   3420K (3)| 01:54:01 |       |       | Q1,03 | PCWP |            |
|  17 |         PX RECEIVE                    |               | 6903K|   118M|  16379  (1)| 00:00:33 |       |       | Q1,03 | PCWP |            |
|  18 |          PX SEND BROADCAST            | :TQ10002      | 6903K|   118M|  16379  (1)| 00:00:33 |       |       | Q1,02 | P->P |  BROADCAST |
|  19 |           PX BLOCK ITERATOR           |               | 6903K|   118M|  16379  (1)| 00:00:33 |     1 |     3 | Q1,02 | PCWC |            |
|* 20 |            TABLE ACCESS STORAGE FULL  | BOOK          | 6903K|   118M|  16379  (1)| 00:00:33 |     1 |     3 | Q1,02 | PCWP |            |
|  21 |         PX PARTITION LIST JOIN-FILTER |               | 8064M|   135G|   3398K (3)| 01:53:17 |:BF0000|:BF0000| Q1,03 | PCWC |            |
|  22 |          TABLE ACCESS STORAGE FULL    | BALANCE       | 8064M|   135G|   3398K (3)| 01:53:17 |:BF0000|:BF0000| Q1,03 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  5 - access("FCT"."S_VID"="DNSP"."S_VID")
 11 - access("PRDCT"."PRODUCT_ID"(+)="FCT"."PRODUCT_ID")
 15 - storage(NVL("PRDCT"."MODIFY_TS"(+),TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 08:16:34.822981000' AND 
          "PRDCT"."CREATE_TS"(+)TIMESTAMP' 2020-10-14 08:16:34.822981000' AND 
         "PRDCT"."CREATE_TS"(+)<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
 16 - access("BOOK_ID"="FCT"."BOOK_ID" AND "VERSION"="FCT"."BOOK_VERSION")
 20 - storage("COLUMN1" IS NULL OR "D_BOOK_VID"=(-2) OR "D_BOOK_VID"=(-1))
    filter("COLUMN1" IS NULL OR "D_BOOK_VID"=(-2) OR "D_BOOK_VID"=(-1))
 
Note
-----
  - dynamic sampling used for this statement (level=7)
  - Degree of Parallelism is 4 because of hint

and this:


set pages 999
explain plan for
select count(*)
FROM   prdct
WHERE  prdct.create_ts  <= TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6')
AND    NVL(prdct.modify_ts, DATE '9999-12-31')  > TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6');
   
select * from table(dbms_xplan.display);

Plan hash value: 2639949168
----------------------------------------------------------------------------------------------------
| Id | Operation                 | Name     | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |          |    1 |    14 |   1103  (1)| 00:00:14 |       |       |
|  1 | SORT AGGREGATE            |          |    1 |    14 |            |          |       |       |
|  2 |  PARTITION LIST ALL       |          |  399 |  5586 |   1103  (1)| 00:00:14 |     1 |     3 |
|* 3 |  TABLE ACCESS STORAGE FULL| PRODUCT  |  399 |  5586 |   1103  (1)| 00:00:14 |     1 |     3 |
----------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - storage(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
    filter(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')

Once you see the problem, you can’t unsee it. But for whatever reason, for ages, I was focusing on why there would be such a cardinality difference between the two plans (1 row versus 399) for what are basically single-table cardinality estimates for the same table, without seeing the explanation right there in front of me.


Note
-----
  - dynamic sampling used for this statement (level=7)

I then got distracted and spent some time investigating why dynamic sampling was kicking in. Parallel query has different rules about when it can kick in. And I found reasons why it might.

But then I put that thought to one side.

Why is dynamic sampling saying this will return 1 row?

This is a small table, so the estimate should be accurate. We like dynamic sampling, so we want to show the feature to its full capability whenever and wherever it kicks in.

Sure enough, as we might expect, if I hint dynamic sampling on the single table, the estimate is one.


set pages 999
explain plan for
select /*+ dynamic_sampling(4) */ count(*)
FROM  prdct
WHERE prdct.create_ts                  <= TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6')
AND  NVL(prdct.modify_ts, DATE '9999-12-31')      > TO_TIMESTAMP('14.10.2020 08:16:34.822981', 'DD.MM.YYYY HH24:MI:SS.FF6');
   
select * from table(dbms_xplan.display);

----------------------------------------------------------------------------------------------------
| Id | Operation                 | Name     | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |          |    1 |    14 |   1103  (1)| 00:00:14 |       |       |
|  1 | SORT AGGREGATE            |          |    1 |    14 |            |          |       |       |
|  2 |  PARTITION LIST ALL       |          |    1 |    14 |   1103  (1)| 00:00:14 |     1 |     3 |
|* 3 |  TABLE ACCESS STORAGE FULL| PRODUCT  |    1 |    14 |   1103  (1)| 00:00:14 |     1 |     3 |
----------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - storage(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
    filter(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
 
Note
-----
  - dynamic sampling used for this statement (level=4)

Without access to the sample statements, what suggestive evidence can we find?


select table_name, num_rows, blocks, avg_row_len
from  user_tables
where table_name = 'PRODUCT';

TABLE_NAME                       NUM_ROWS     BLOCKS AVG_ROW_LEN
------------------------------ ---------- ---------- -----------
PRODUCT                               462       4064         193

So that’s not a lot of rows in a lot more blocks than should be necessary.
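
To put a rough number on "more blocks than should be necessary", the same stats can be turned into an order-of-magnitude lower bound. This is only a sketch: it assumes an 8K block size and ignores block overhead, PCTFREE and the fact that each partition needs at least a few blocks of its own:

```sql
-- Rough lower bound on the blocks needed to hold the rows, compared
-- with the blocks recorded in the stats (8K block size assumed).
select table_name
,      num_rows
,      blocks
,      ceil(num_rows * avg_row_len / 8192) as approx_min_blocks
from   user_tables
where  table_name = 'PRODUCT';
```

With the figures above, 462 rows at 193 bytes is 89,166 bytes, which is about 11 blocks' worth of data sitting in 4064 blocks.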

At this point, we should go off and find out what the reasons might be.

For what it’s worth, for this application, my suspicions are historic direct path maintenance code coupled with table partitioning on status (LATEST/SUPERSEDED) and therefore row movement means an update is a delete + insert, coupled with periods of very low rate of change and small row sizes.

So we end up with old rows moving partition – using a new block for the updated version and leaving behind an empty block where the version was before the update and then relatively few new version rows per run which are using new blocks, and so lots of empty blocks.

i.e. sparsely populated table.

And then dynamic sampling kicks in and samples a bunch of blocks, and finds few actual rows from those samples and comes up with a low estimate and then the whole problem starts. And these were shared tables across the reporting modules so it had a reasonably wide impact.
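
Short of the trace files, one cheap way to see how sparse the table really is would be to count the blocks that actually hold a row and compare that with BLOCKS from the stats. A sketch, assuming the table is queried under the prdct name used in the statements above:

```sql
-- Count the distinct blocks that contain at least one row.
-- The data object id is included so that blocks in different
-- partition segments are kept apart.
select count(*) as row_count
,      count(distinct dbms_rowid.rowid_object(rowid)
                      ||'.'||dbms_rowid.rowid_relative_fno(rowid)
                      ||'.'||dbms_rowid.rowid_block_number(rowid)) as blocks_with_rows
from   prdct;
```

If blocks_with_rows is a tiny fraction of the segment's block count, a block sample has a good chance of finding mostly empty blocks, which fits the low estimate.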

It would be nice to back this up with some trace files detailing the dynamic sampling numbers, but I don’t have access by default. Mocking this all up outside of work is always a worthwhile exercise, but it takes up so much time.

Anyway… move some partitions, rebuild some indexes and 4000 blocks becomes just over 40.
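
For illustration only, the kind of commands involved look something like the following. The partition names here are hypothetical (the real ones are not shown in this post), and using update indexes is just one option; the alternative is to rebuild the index partitions left unusable by the move:

```sql
-- Hypothetical partition names: substitute the real ones
-- from USER_TAB_PARTITIONS.
alter table prdct move partition p_latest update indexes;
alter table prdct move partition p_superseded update indexes;

-- Refresh the optimizer statistics so BLOCKS reflects the
-- compacted segments (SQL*Plus exec shorthand).
exec dbms_stats.gather_table_stats(user, 'PRDCT')
```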

Now our dynamic sampling is looking much better:


----------------------------------------------------------------------------------------------------
| Id | Operation                 | Name     | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |          |   1 |     14 |   1103  (1)| 00:00:14 |       |       |
|  1 | SORT AGGREGATE            |          |   1 |     14 |            |          |       |       |
|  2 |  PARTITION LIST ALL       |          | 297 |     14 |   1103  (1)| 00:00:14 |     1 |     3 |
|* 3 |  TABLE ACCESS STORAGE FULL| PRODUCT  | 297 |     14 |   1103  (1)| 00:00:14 |     1 |     3 |
----------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
  3 - storage(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
    filter(NVL("MODIFY_TS",TIMESTAMP' 9999-12-31 00:00:00')>TIMESTAMP' 2020-10-14 
       08:16:34.822981000' AND "CREATE_TS"<=TIMESTAMP' 2020-10-14 08:16:34.822981000')
 
Note
-----
  - dynamic sampling used for this statement (level=4)

Exercise repeated for a whole bunch of tables in a similar situation and everything across the board is a lot healthier.

Materialize cardinality

Blowing the dust off the blog with a couple of trivial observations.

Firstly, this little one about one implication of materialize.

I’ve always been a fan of the simplicity of the materialize hint as a quick fix for certain plan stability/performance issues, but it comes at the clear cost of actually materialising that subquery to temp, and it is often used when a combination of no_merge, no_unnest and/or no_push_pred might be a better choice.

Having recently been working on a platform with a problematic systemic temp addiction, I rarely use it unless I know the cost of materialising that resultset once is cheaper than querying the data the requisite number of times via any alternative method. This is particularly true on Exadata, where the overhead of querying the data twice might be less than you think (note to self: might be helpful to demo this in a future post).

Here is another implication of materialize on the cardinality of a set of data.

This simulates a real world problem observation where the view contained a materialize hint.

Starting with some data – each day of April, five versions per day, between 0 and 5 versions potentially approved each day:

drop table  ref_data_versions;

create table ref_data_versions
(business_date   DATE
,version         NUMBER
,status          VARCHAR2(10));


insert into ref_data_versions
select to_date(20200401,'YYYYMMDD') + days.rn-1
,      versions.rn
,      CASE when versions.rn = round(dbms_random.value(1,5)) then 'APPROVED' ELSE 'UNAPPROVED' END
from   dual
cross join
       (select rownum rn from xmltable('1 to 30')) days
cross join
       (select rownum rn from xmltable('1 to 5')) versions;
       
commit;

select count(*) from ref_data_versions;

The following query represents our view and happens to show the tangential observation that the optimizer does not recognize that the row_number analytic will filter any rows.

explain plan for 
with x as
(select /*+ */ *
 from   (select rdv.*
         ,      row_number() over (partition by business_date order by decode(status,'APPROVED',1,2), version DESC) rnk
         from   ref_data_versions rdv)
 where  rnk = 1)
select * 
from   x;
 
select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
Plan hash value: 2125428461
 
----------------------------------------------------------------------------------------------
| Id  | Operation                | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                   |   150 |  6300 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |                   |   150 |  6300 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|                   |   150 |  4350 |     4  (25)| 00:00:01 |
|   3 |    TABLE ACCESS FULL     | REF_DATA_VERSIONS |   150 |  4350 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("RNK"=1)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "BUSINESS_DATE" ORDER BY 
              DECODE("STATUS",'APPROVED',1,2),INTERNAL_FUNCTION("VERSION") DESC )<=1)
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
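
For comparison with the estimate of 150, the actual row count can be checked by wrapping the same view logic in a count. This is just the query above restated, nothing new is assumed:

```sql
-- row_number() = 1 keeps exactly one row per business_date,
-- so with 30 dates loaded this returns 30.
select count(*)
from   (select rdv.*
        ,      row_number() over (partition by business_date
                                  order by decode(status,'APPROVED',1,2), version desc) rnk
        from   ref_data_versions rdv)
where  rnk = 1;
```

So the estimate is out by a factor of five, which is simply the number of versions per day.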

If we add in a predicate on business date, we get:

explain plan for 
with x as
(select /*+ */ *
 from   (select rdv.*
         ,      row_number() over (partition by business_date order by decode(status,'APPROVED',1,2), version DESC) rnk
         from   ref_data_versions rdv)
 where  rnk = 1)
select * 
from   x
where  business_date = to_date(20200429,'YYYYMMDD');
 
select * from table(dbms_xplan.display);
Plan hash value: 2125428461
 
----------------------------------------------------------------------------------------------
| Id  | Operation                | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |                   |     5 |   210 |     4  (25)| 00:00:01 |
|*  1 |  VIEW                    |                   |     5 |   210 |     4  (25)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|                   |     5 |   145 |     4  (25)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL     | REF_DATA_VERSIONS |     5 |   145 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("RNK"=1)
   2 - filter(ROW_NUMBER() OVER ( PARTITION BY "BUSINESS_DATE" ORDER BY 
              DECODE("STATUS",'APPROVED',1,2),INTERNAL_FUNCTION("VERSION") DESC )<=1)
   3 - filter("RDV"."BUSINESS_DATE"=TO_DATE(' 2020-04-29 00:00:00', 'syyyy-mm-dd 
              hh24:mi:ss'))
 
Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

Note that the query expects to get 5 rows rather than 1, but that’s consistent with what we saw before.

What happens if the subquery uses materialize?

explain plan for 
with x as
(select /*+ materialize */ *
 from   (select rdv.*
         ,      row_number() over (partition by business_date order by decode(status,'APPROVED',1,2), version DESC) rnk
         from   ref_data_versions rdv)
 where  rnk = 1)
select * 
from   x
where  business_date = to_date(20200429,'YYYYMMDD');
 
select * from table(dbms_xplan.display);
Plan hash value: 1377080515
 
------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                             |   150 |  6300 |     6  (17)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION               |                             |       |       |            |          |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D787C_3AB51228 |       |       |            |          |
|*  3 |    VIEW                                  |                             |   150 |  6300 |     4  (25)| 00:00:01 |
|*  4 |     WINDOW SORT PUSHED RANK              |                             |   150 |  4350 |     4  (25)| 00:00:01 |
|   5 |      TABLE ACCESS FULL                   | REF_DATA_VERSIONS           |   150 |  4350 |     3   (0)| 00:00:01 |
|*  6 |   VIEW                                   |                             |   150 |  6300 |     2   (0)| 00:00:01 |
|   7 |    TABLE ACCESS FULL                     | SYS_TEMP_0FD9D787C_3AB51228 |   150 |  6300 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter("RNK"=1)
   4 - filter(ROW_NUMBER() OVER ( PARTITION BY "BUSINESS_DATE" ORDER BY 
              DECODE("STATUS",'APPROVED',1,2),INTERNAL_FUNCTION("VERSION") DESC )<=1)
   6 - filter("BUSINESS_DATE"=TO_DATE(' 2020-04-29 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

The filter at step 6 no longer has any effect on the estimated cardinality: the estimate stays at 150 rows.

There are approaches using cardinality and opt_estimate which you might use to address some of the underlying issues.
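
As a minimal sketch of the first of those, the cardinality hint can be pointed at the factored subquery. Treat this as an assumption-laden illustration: cardinality is an undocumented hint, the value 1 is only a guess at what the estimate ought to be, and the resulting plan has not been checked on either version.

```sql
-- Sketch only: cardinality is an undocumented hint and 1 is an assumed target estimate.
with x as
(select /*+ materialize */ *
 from   (select rdv.*
         ,      row_number() over (partition by business_date order by decode(status,'APPROVED',1,2), version desc) rnk
         from   ref_data_versions rdv)
 where  rnk = 1)
select /*+ cardinality(x 1) */ *
from   x
where  business_date = to_date('20200429','YYYYMMDD');
```

Compare the Rows column at the final VIEW step before and after to see whether the hint was honoured.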

However, this is just another example of why you should think twice about the liberal application of materialize hints (or any hints!).

The system with the problem was 11.2.0.4. Examples above are run on LiveSQL which is currently 19.

Outer Join with OR and Lateral View Decorrelation

Use of ANSI SQL is a personal thing.

Historically I have not been a fan apart from where it makes things easier/possible.

This reticence was mainly due to optimizer bugs and limitations in the earlier days.

Recently I have been using it much more because I find that the developers I interact with prefer it / understand it better.

You might/should be aware that Oracle will rewrite ANSI SQL to an Oracle syntax representation, this transformation being listed in the optimizer trace file.

You might/should also be aware that Oracle outer join syntax does not allow OR or IN:

drop table t1;
drop table t2;

create table t1
as
select floor((rownum+1)/2) col1
,      case when mod(rownum,2) = 0 then 1 else 2 end col2
,      10 col3
from   dual
connect by rownum <= 20;

create table t2
as
select rownum col1
,      case when mod(rownum,2) = 0 then 2 else 1 end col3
from   dual
connect by rownum <= 10;

select *
from   t1
,      t2
where  t1.col1 = t2.col1 (+) 
and  ((t1.col2 = 1
and    t2.col3 (+) > t1.col3)
or    (t1.col2 = 2
and    t2.col3 (+) < t1.col3));

ORA-01719: outer join operator (+) not allowed in operand of OR or IN

ANSI SQL remedies this:

alter session set tracefile_identifier = 'domlg1';
alter session set events 'trace[rdbms.SQL_Optimizer.*]';
select *
from   t1
left join t2
on    t1.col1 = t2.col1
and ((t1.col2 = 1
and   t2.col3 > t1.col3)
or   (t1.col2 = 2
and   t2.col3 < t1.col3));

alter session set events 'trace off';

But it comes at a price.

Note the execution plan:

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    20 |  1300 |    42   (0)| 00:00:01 |
|   1 |  NESTED LOOPS OUTER |      |    20 |  1300 |    42   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL | T1   |    20 |   780 |     2   (0)| 00:00:01 |
|   3 |   VIEW              |      |     1 |    26 |     2   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |     1 |    26 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("T1"."COL1"="T2"."COL1" AND ("T1"."COL2"=1 AND
              "T2"."COL3">"T1"."COL3" OR "T1"."COL2"=2 AND "T2"."COL3"<"T1"."COL3"))   

Now, maybe you will have better luck than me but no matter what I try I cannot change the NESTED LOOPS OUTER operation (I could rewrite it to do two joins rather than one and remove the OR).
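
A minimal sketch of that two-join rewrite, assuming the two branches of the OR stay mutually exclusive (t1.col2 cannot be both 1 and 2 for the same row). The aliases a and b and the output column names are only for illustration, and whether the optimizer then picks a different join method is something to check rather than assume.

```sql
-- One outer join per branch of the OR; at most one of a / b can match a given t1 row.
select t1.*
,      coalesce(a.col1, b.col1) t2_col1
,      coalesce(a.col3, b.col3) t2_col3
from   t1
left join t2 a
on     a.col1  = t1.col1
and    t1.col2 = 1
and    a.col3  > t1.col3
left join t2 b
on     b.col1  = t1.col1
and    t1.col2 = 2
and    b.col3  < t1.col3;
```

Because only one of the two joins can find matches for any t1 row, the COALESCE calls simply pick up whichever side matched, or return NULL when neither did.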

So, if that lateral view involves some full table scans or other significant operations, they might be very expensive as the inner rowsource of a NESTED LOOPS OUTER operation.

The reason is in the optimizer trace.

Query after View Removal
******* UNPARSED QUERY IS ********
SELECT "T1"."COL1" "COL1", "T1"."COL2" "COL2", "T1"."COL3" "COL3", "VW_LAT_AE9E49E8"."ITEM_1_0" "COL1", "VW_LAT_AE9E49E8"."ITEM_2_1" "COL3" FROM "DOM"."T1" "T1", LATERAL( (SELECT "T2"."COL1" "ITEM_1_0", "T2"."COL3" "ITEM_2_1" FROM "DOM"."T2" "T2" WHERE "T1"."COL1"="T2"."COL1" AND ("T1"."COL2"=1 AND "T2"."COL3">"T1"."COL3" OR "T1"."COL2"=2 AND "T2"."COL3" < "T1"."COL3"))) (+) "VW_LAT_AE9E49E8"
DCL:Checking validity of lateral view decorrelation SEL$BCD4421C (#1)
DCL: Bypassed: view has non-well-formed predicate
DCL: Failed decorrelation validity for lateral view block SEL$BCD4421C (#1)

The OR prevents the decorrelation which seems to mean that we’re stuck with a NESTED LOOP for now.

Further Reading on ANSI:
Oracle Optimizer Blog
Jonathan Lewis on ANSI Outer
Jonathan Lewis on ANSI

Did it really fix it 1: OFE & nasty FIRST_ROWS plan on 11.2.0.4

Plan degradations on upgrade are normal.

This one’s no different.

On further investigation, turned out application was setting optimizer_mode = first_rows somewhere.

First point about this is that first_rows really shouldn’t be used either as a hint or as an optimizer_mode.

What does FIRST_ROWS mean?
From 11g doco:

The optimizer uses a mix of costs and heuristics to find a best plan for fast delivery of the first few rows.

If any sort of FIRST_ROWS type intervention is required, then it might be better to be using a more modern, fully-costed FIRST_ROWS_n (where n = 1 | 10 | 100 | 1000).
If it’s even required.

The 12c documentation is clearer about this:

FIRST_ROWS is available for backward compatibility and plan stability; use FIRST_ROWS_n instead.

But I don’t think we should attach too much credence to that statement not being in the 11g documentation.

Here’s the problem plan.

Pre upgrade, snippet of the plan looked as per below.
Please note position of T1.

-------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                             |    34 | 13668 |  3003   (1)| 00:00:22 |       |       |
|   1 |  SORT ORDER BY                              |                             |    34 | 13668 |  3003   (1)| 00:00:22 |       |       |
|   2 |   NESTED LOOPS OUTER                        |                             |    34 | 13668 |  3002   (1)| 00:00:22 |       |       |
|   3 |    NESTED LOOPS OUTER                       |                             |    34 | 12648 |  2968   (1)| 00:00:22 |       |       |
|   4 |     NESTED LOOPS OUTER                      |                             |    34 | 12274 |  2935   (1)| 00:00:22 |       |       |
|   5 |      NESTED LOOPS OUTER                     |                             |    34 | 11900 |  2867   (1)| 00:00:21 |       |       |
|   6 |       NESTED LOOPS OUTER                    |                             |    34 | 11628 |  2833   (1)| 00:00:21 |       |       |
|   7 |        NESTED LOOPS OUTER                   |                             |    34 | 10404 |  2799   (1)| 00:00:21 |       |       |
|   8 |         NESTED LOOPS OUTER                  |                             |    34 |  8228 |  2765   (1)| 00:00:20 |       |       |
|   9 |          NESTED LOOPS                       |                             |    34 |  6732 |  2731   (1)| 00:00:20 |       |       |
|  10 |           NESTED LOOPS                      |                             |    69 | 12282 |  2524   (1)| 00:00:19 |       |       |
|  11 |            NESTED LOOPS                     |                             |    69 |  9591 |  2386   (1)| 00:00:18 |       |       |
|* 12 |             TABLE ACCESS FULL               | T1                          |  1658 |   213K|   725   (3)| 00:00:06 |       |       |
|* 13 |             TABLE ACCESS BY INDEX ROWID     | T2                          |     1 |     7 |     1   (0)| 00:00:01 |       |       |
|* 14 |              INDEX UNIQUE SCAN              | T2_U1                       |     1 |       |     0   (0)| 00:00:01 |       |       |
|  15 |            TABLE ACCESS BY INDEX ROWID      | T3                          |     1 |    39 |     2   (0)| 00:00:01 |       |       |
|* 16 |             INDEX UNIQUE SCAN               | T3_U1                       |     1 |       |     1   (0)| 00:00:01 |       |       |
|  17 |           TABLE ACCESS BY GLOBAL INDEX ROWID| T4                          |     1 |    20 |     3   (0)| 00:00:01 | ROWID | ROWID |
|* 18 |            INDEX RANGE SCAN                 | T4_N1                       |     1 |       |     2   (0)| 00:00:01 |       |       |

Post upgrade, plan has degraded to:

--------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name                        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                             |    69 | 27738 |   508M  (3)|999:59:59 |       |       |
|   1 |  NESTED LOOPS OUTER                          |                             |    69 | 27738 |   508M  (3)|999:59:59 |       |       |
|   2 |   NESTED LOOPS OUTER                         |                             |    69 | 26979 |   508M  (3)|999:59:59 |       |       |
|   3 |    NESTED LOOPS OUTER                        |                             |    69 | 26220 |   508M  (3)|999:59:59 |       |       |
|   4 |     NESTED LOOPS OUTER                       |                             |    69 | 23184 |   508M  (3)|999:59:59 |       |       |
|   5 |      NESTED LOOPS OUTER                      |                             |    69 | 22632 |   508M  (3)|999:59:59 |       |       |
|   6 |       NESTED LOOPS OUTER                     |                             |    69 | 20562 |   508M  (3)|999:59:59 |       |       |
|   7 |        NESTED LOOPS OUTER                    |                             |    69 | 16146 |   508M  (3)|999:59:59 |       |       |
|   8 |         NESTED LOOPS                         |                             |    69 | 13662 |   508M  (3)|999:59:59 |       |       |
|   9 |          NESTED LOOPS                        |                             |  1675 |   312K|   508M  (3)|999:59:59 |       |       |
|  10 |           NESTED LOOPS                       |                             |   703K|    39M|  3476K  (1)| 06:54:55 |       |       |
|* 11 |            TABLE ACCESS BY GLOBAL INDEX ROWID| T4                          |   703K|    13M|  2068K  (1)| 04:06:54 | ROWID | ROWID |
|  12 |             INDEX FULL SCAN                  | T4_N2                       |  2240K|       |  7943   (1)| 00:00:57 |       |       |
|  13 |            TABLE ACCESS BY INDEX ROWID       | T3                          |     1 |    39 |     2   (0)| 00:00:01 |       |       |
|* 14 |             INDEX UNIQUE SCAN                | T3_U1                       |     1 |       |     1   (0)| 00:00:01 |       |       |
|* 15 |           TABLE ACCESS FULL                  | T1                          |     1 |   132 |   718   (3)| 00:00:06 |       |       |

In the post-upgrade plan, the most pertinent part was step 15.

A FULL TABLE SCAN as the inner rowsource of a NESTED LOOP: in other words, for every row in the driving rowsource, do a full table scan.

Standard reaction to such a degradation is to test the SQL with OPTIMIZER_FEATURES_ENABLE (OFE).

With OFE set either via OPT_PARAM hint or via ALTER SESSION then the SQL reverted to pre-upgrade plan.
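
For reference, the two ways of setting OFE look roughly like this. The value '11.2.0.3' is a placeholder, not the actual pre-upgrade release of this system, and the query against dual stands in for the real statement.

```sql
-- Session level: '11.2.0.3' is a placeholder for the release you upgraded from.
alter session set optimizer_features_enable = '11.2.0.3';

-- Statement level, same placeholder value, via the OPT_PARAM hint.
select /*+ opt_param('optimizer_features_enable','11.2.0.3') */ *
from   dual;
```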

This is good!

But… how much confidence can we have that this actually fixed the problem?

It would be great if we could find a bug which was definitively responsible for this change in execution plan.

OFE is an umbrella but it doesn’t turn off ALL fixes/optimizer features in a release.

Many of those fixes covered by the OFE umbrella are represented in V$SYSTEM_FIX_CONTROL.

So, I went looking for anything which sounded related, went back to the default 11.2.0.4 OFE and started turning off the BUGNOs one by one. No difference, still a bad plan.

Then still at OFE 11.2.0.4, I turned off all the BUGNOs…. no difference, still a bad plan.
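
A sketch of what that looks like in practice. The bug number 1234567 is made up purely for illustration (a non-existent number will simply be rejected), and "_fix_control" is a hidden parameter, so this belongs in a test session rather than production.

```sql
-- List the fixes introduced at the post-upgrade optimizer level.
select bugno, value, sql_feature, description
from   v$system_fix_control
where  optimizer_feature_enable = '11.2.0.4'
order by bugno;

-- Turn a single fix off for this session; 1234567 is a hypothetical bug number.
alter session set "_fix_control" = '1234567:OFF';
```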

At this point, my confidence in OFE as a fix for this problem was starting to waver.

So, I tried to reduce the problem SQL to its smallest form representing the problem.

alter session set optimizer_mode = first_rows;
alter session set optimizer_features_enable = '11.2.0.4';

 explain plan for
 SELECT *
 FROM   T1
 ,      T3
 ,      T4
 WHERE  T1.COL1             = T3.COL1
 AND    T3.COL2             = T4.COL2
 AND    T1.DATE_COL        '+OUTLINE'));

This distilled statement reproduced the issue at the heart of the original statement:

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |         |  6029 |  4533K|  1612M  (3)|999:59:59 |       |       |
|   1 |  NESTED LOOPS                        |         |  6029 |  4533K|  1612M  (3)|999:59:59 |       |       |
|   2 |   NESTED LOOPS                       |         |  2237K|   795M|  6551K  (1)| 13:01:53 |       |       |
|   3 |    TABLE ACCESS BY GLOBAL INDEX ROWID| T4      |  2240K|   188M|  2068K  (1)| 04:06:53 | ROWID | ROWID |
|   4 |     INDEX FULL SCAN                  | T4_N2   |  2240K|       |  7943   (1)| 00:00:57 |       |       |
|   5 |    TABLE ACCESS BY INDEX ROWID       | T3      |     1 |   285 |     2   (0)| 00:00:01 |       |       |
|*  6 |     INDEX UNIQUE SCAN                | T3_U1   |     1 |       |     1   (0)| 00:00:01 |       |       |
|*  7 |   TABLE ACCESS FULL                  | T1      |     1 |   397 |   718   (3)| 00:00:06 |       |       |
----------------------------------------------------------------------------------------------------------------

Is this test case a 100% indication that OFE wasn’t applicable to the original problem?
Not sure.
But at this point I had little confidence that OFE provided any sort of reliable fix to the original SQL statement.

I don’t know what the internal heuristics are/were for FIRST_ROWS.

But that is a stupid place for a full table scan regardless of the number of rows expected to return.

If we remove one of the T1 predicates above and the various row estimates go up then the FTS moves to a driving position.

I could/should get a 10053 trace but FIRST_ROWS shouldn’t be in use in the first place.

Solution – hunt down where the app code is doing the FIRST_ROWS thing and exterminate.
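
One possible starting point for the hunt, sketched under the assumption that the offending sessions are still connected: V$SES_OPTIMIZER_ENV exposes the optimizer environment of each session, so it can be joined to V$SESSION to see which programs and modules are running with first_rows.

```sql
-- Sessions currently running with optimizer_mode = first_rows.
select s.sid
,      s.username
,      s.program
,      s.module
,      e.value optimizer_mode
from   v$ses_optimizer_env e
join   v$session           s
on     s.sid = e.sid
where  e.name         = 'optimizer_mode'
and    lower(e.value) = 'first_rows';
```

This only identifies who is affected; finding the ALTER SESSION (or logon trigger, or connection pool setting) responsible still means going through the application code.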

All the other possible OPTIMIZER_MODE modes gave a plan for the original SQL which started like this:

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |      |    35 | 14105 |  1386   (2)| 00:00:10 |       |       |
|   1 |  SORT ORDER BY                              |      |    35 | 14105 |  1386   (2)| 00:00:10 |       |       |
|   2 |   NESTED LOOPS OUTER                        |      |    35 | 14105 |  1385   (2)| 00:00:10 |       |       |
|*  3 |    HASH JOIN OUTER                          |      |    35 | 12565 |  1350   (2)| 00:00:10 |       |       |
|*  4 |     HASH JOIN OUTER                         |      |    35 | 11305 |  1315   (2)| 00:00:10 |       |       |
|   5 |      NESTED LOOPS OUTER                     |      |    35 |  9065 |  1280   (2)| 00:00:10 |       |       |
|   6 |       NESTED LOOPS OUTER                    |      |    35 |  8015 |  1245   (2)| 00:00:09 |       |       |
|   7 |        NESTED LOOPS OUTER                   |      |    35 |  7735 |  1210   (2)| 00:00:09 |       |       |
|   8 |         NESTED LOOPS OUTER                  |      |    35 |  7350 |  1176   (2)| 00:00:09 |       |       |
|   9 |          NESTED LOOPS                       |      |    35 |  6965 |  1106   (2)| 00:00:08 |       |       |
|  10 |           NESTED LOOPS                      |      |    69 | 12351 |   899   (3)| 00:00:07 |       |       |
|* 11 |            HASH JOIN                        |      |    69 |  9660 |   761   (3)| 00:00:06 |       |       |
|* 12 |             TABLE ACCESS FULL               | T2   |   237 |  1659 |    36   (6)| 00:00:01 |       |       |
|* 13 |             TABLE ACCESS FULL               | T1   |  1671 |   217K|   725   (3)| 00:00:06 |       |       |

So…. when optimizer upgrade problems strike, you should always try OFE.

But it’s also worth further investigation to confirm that OFE really is a reliable fix for the problem.

COUNT STOPKEY – NVL

Yesterday I mentioned issues with a recursive delete operation on statistics history. This is a quick illustration of the last points I made on that post regarding the lack of a COUNT STOPKEY optimisation because of the use of NVL. COUNT STOPKEY is an optimisation which allows processing to stop once the target number of rows has been reached. For example:

create table t1 as select * from dba_objects;

alter session set statistics_level = all;

var rn number
exec :rn := 10;

select *
from   t1
where  rownum <= :rn
and    mod(object_id,5) = 0;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

Plan hash value: 3836375644

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |     10 |00:00:00.01 |       4 | 
|*  1 |  COUNT STOPKEY     |      |      1 |        |     10 |00:00:00.01 |       4 |
|*  2 |   TABLE ACCESS FULL| T1   |      1 |  26148 |     10 |00:00:00.01 |       4 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=:RN)
   2 - filter(MOD("OBJECT_ID",5)=0) 

COUNT STOPKEY kicks in after we have fetched the relevant rows and stops any unnecessary further execution. Note in particular A-Rows & Buffers for step 2.

However, if we use NVL around our ROWNUM limitation, then this optimisation is prevented.

var rn number
exec :rn := 10;

select *
from   t1
where  rownum <= nvl(:rn,rownum)
and    mod(object_id,5) = 0;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

Plan hash value: 624922415

--------------------------------------------------------------------------------------
| Id  | Operation           | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |      1 |        |     10 |00:00:00.09 |    2310 |
|   1 |  COUNT              |      |      1 |        |     10 |00:00:00.09 |    2310 |
|*  2 |   FILTER            |      |      1 |        |     10 |00:00:00.09 |    2310 |
|*  3 |    TABLE ACCESS FULL| T1   |      1 |  26148 |  29630 |00:00:00.08 |    2310 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(ROWNUM<=NVL(:RN,ROWNUM))
   3 - filter(MOD("OBJECT_ID",5)=0)
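
Where the statement is under your control, one variation worth trying is to keep ROWNUM out of the right-hand side of the comparison. This is a sketch only: the fallback value 9999999999 is an arbitrary stand-in for "no limit", and I am assuming rather than demonstrating that the optimizer will then use COUNT STOPKEY, so check the plan and the A-Rows / Buffers figures as above.

```sql
-- Assumption: with no ROWNUM inside the NVL, the limit is fixed for the execution.
-- 9999999999 is an arbitrary "effectively unlimited" fallback for a null bind.
select *
from   t1
where  rownum <= nvl(:rn, 9999999999)
and    mod(object_id,5) = 0;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
```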

Understanding vs Resolution – Statistics History Cleanup

Today I helped resolve a problem quickly but to the detriment of my understanding of exactly what was going on and why.

And as part of the quicker resolution, I had to destroy the evidence which would have helped understand better.

So… now need to go back and figure it out if I have time to model it properly, etc.

Here’s what little I know so far.

What happened was that there were a number of application sessions experiencing slowness and contention when executing a recursive stats history cleanup statement.

I verified via ASH (TOP_LEVEL_SQL_ID, PLSQL_ENTRY_OBJECT_ID & USER_ID columns) that this recursive delete was somehow being called by app ETL code. There were four such sessions, each running the statement slowly and, because of the nature of the statement below, also being blocked by mode 6 TX locks from the leading execution.

Version is 11.2.0.3

Statement was sql id 9v9n97qj8z1dg:

delete /*+ dynamic_sampling(4) */ 
from sys.wri$_optstat_histhead_history 
where savtime < :1 
and rownum <= NVL(:2, rownum);

First up, resolution was quick and easy according to, by a quirk of coincidence, my last post:
https://orastory.wordpress.com/2015/02/16/dbms_stats-purge_stats/

We just got rid of all the statistics history using the magic PURGE_ALL truncate flag.
The history is of limited usefulness day-to-day anyway.
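
For context, a rough sketch of that kind of cleanup. The two queries use standard DBMS_STATS functions to show the retention setting and how far back history currently goes; the purge call is destructive and removes all statistics history, so it is shown here as an illustration rather than a recommendation.

```sql
-- How long history is retained (in days) and the oldest point still available.
select dbms_stats.get_stats_history_retention    retention_days
,      dbms_stats.get_stats_history_availability oldest_available
from   dual;

-- Truncate the statistics history tables using the PURGE_ALL flag.
exec dbms_stats.purge_stats(dbms_stats.purge_all);
```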

The slowness was holding up ETL jobs which were just trying to calculate stats for their own partition.

I was brought in towards the end of the incident but here’s an illustration of the slowness:

SQL Text
------------------------------
delete /*+ dynamic_sampling(4) */ from sys.wri$_optstat_histhead_history where savtime < :1 and rownum <= NVL(:2, rownum)

Global Information
------------------------------
 Status              :  EXECUTING               
 Instance ID         :  1                       
 SQL ID              :  9v9n97qj8z1dg           
 SQL Execution ID    :  16777249                
 Execution Started   :  02/24/2015 19:11:25     
 First Refresh Time  :  02/24/2015 19:13:06     
 Last Refresh Time   :  02/25/2015 09:10:35     
 Duration            :  50351s                  
 Module/Action       :  JDBC Thin Client/-      
 Program             :  JDBC Thin Client        

Binds
========================================================================================================================
| Name | Position |  Type  |                                           Value                                           |
========================================================================================================================
| :2   |        2 | NUMBER | 10000                                                                                     |
========================================================================================================================

Global Stats
===============================================================================================
| Elapsed |   Cpu   |    IO    | Application | Concurrency |  Other   | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Waits(s)   | Waits(s) |  Gets  | Reqs | Bytes |
===============================================================================================
|   50359 |   35199 |     0.16 |       14669 |         254 |      237 |     1G |   63 | 504KB |
===============================================================================================

SQL Plan Monitoring Details (Plan Hash Value=2348801730)
=======================================================================================================================================================================================================
| Id   |            Operation             |             Name              |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   | Read | Read  | Activity |            Activity Detail            |
|      |                                  |                               | (Estim) |       | Active(s) | Active |       | (Actual) | Reqs | Bytes |   (%)    |              (# samples)              |
=======================================================================================================================================================================================================
|    0 | DELETE STATEMENT                 |                               |         |       |     50262 |   +101 |  1423 |        1 |      |       |          |                                       |
|    1 |   DELETE                         | WRI$_OPTSTAT_HISTHEAD_HISTORY |         |       |     50361 |     +2 |  1423 |        1 |      |       |    29.74 | enq: TX - row lock contention (14630) |
|      |                                  |                               |         |       |           |        |       |          |      |       |          | buffer busy waits (231)               |
|      |                                  |                               |         |       |           |        |       |          |      |       |          | Cpu (82)                              |
|      |                                  |                               |         |       |           |        |       |          |      |       |          | resmgr:cpu quantum (5)                |
|    2 |    COUNT                         |                               |         |       |     50262 |   +101 |  1423 |      14M |      |       |     0.00 | Cpu (1)                               |
|    3 |     FILTER                       |                               |         |       |     50266 |    +97 |  1423 |      14M |      |       |    30.14 | Cpu (15146)                           |
|    4 |      TABLE ACCESS BY INDEX ROWID | WRI$_OPTSTAT_HISTHEAD_HISTORY |      6M | 23218 |     50253 |   +100 |  1423 |       4G |   22 | 176KB |    23.17 | buffer busy waits (18)                |
|      |                                  |                               |         |       |           |        |       |          |      |       |          | Cpu (11627)                           |
| -> 5 |       INDEX RANGE SCAN           | I_WRI$_OPTSTAT_HH_ST          |    990K |  5827 |     50264 |   +101 |  1423 |       4G |   41 | 328KB |    16.94 | buffer busy waits (3)                 |
|      |                                  |                               |         |       |           |        |       |          |      |       |          | Cpu (8332)                            |
|      |                                  |                               |         |       |           |        |       |          |      |       |          | resmgr:cpu quantum (180)              |
=======================================================================================================================================================================================================
   3 - filter(ROWNUM<=NVL(:2,ROWNUM))
   5 - access("WRI$_OPTSTAT_HISTHEAD_HISTORY"."SYS_NC00024$"<SYS_EXTRACT_UTC(:1))

So, my first thoughts were:

1. I thought that MMON was responsible for purging old stats… clearly not in this case as there were multiple concurrent application connections purging old data as part of their ETL process.

2. The DELETE is deleting any history older than a parameterised date, the first 10000 rows thereof. There is no connection to the object on whose statistics the application ETL jobs were working.

3. I would expect a COUNT STOPKEY operation but then the NVL predicate avoids this optimisation.

4. UPDATE: The real time sql monitoring output is also reporting the metrics for 1423 executions of this statement. The clue is in the 14M rows reported in the COUNT + FILTER operations. 1423 * 10000 rows = 14M. But I requested the output for what I thought was a single execution id – 16777249 – strange… bulk/batched delete ?

More questions than answers…

ORs, IN lists and LNNVL

I’ve previously written about manually rewriting an OR condition into a UNION ALL using LNNVL.

This is a description of a performance issue observed in the real world from the optimizer coming up with a CONCATENATION operation against many child operations including an INLIST operator and other children which then have to use LNNVL as a filter.

Really this is nothing more than an illustration of our old performance killing friend the function call from within SQL, particularly from within predicates, and the problem of context switching.

Ok. So, the real world example is a crazy query.
It contains 1091 OR clauses each stipulating a unique key lookup.
Yes, “1091” OR clauses.
I know, I know.
Probably itself a workaround to avoid the 1000 limit on an IN list, who knows?
Anyway.

I’m going to use a table just to illustrate the execution plan and then use some metrics from the execution of the real world example.

drop table t1;

create table t1
(col1 number
,col2 number
,col3 varchar2(10)
,constraint pk_t1 primary key (col1, col2));

insert into t1
select mod(rownum,10000)+1 col1
,      ceil(rownum/10000)  col2
,      rpad('X',10,'X')    col3
from   dual
connect by rownum <= 100000
order by col1, col2;

commit;

If we use a simple multi-column OR condition, we don’t get the desired LNNVLs.

explain plan for
select *
from   t1
where (col1 = 1
and    col2 = 1)
or    (col1 = 2
and    col2 = 2)
or    (col1 = 3
and    col2 = 3)
or    (col1 = 3
and    col2 = 4);

select * from table(dbms_xplan.display);

The optimizer just uses an INLIST iterator:

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |  1129 | 37257 |     5   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |  1129 | 37257 |     5   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | PK_T1 |     7 |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2 OR "COL1"=3 AND 
              "COL2"=3 OR "COL1"=3 AND "COL2"=4))

However, if we combine an IN list with some of those ORs:

explain plan for
select *
from   t1
where (col1 = 1
and    col2 = 1)
or    (col1 = 2
and    col2 = 2)
or    (col1 = 3
and    col2 IN (3,4));

select * from table(dbms_xplan.display);

Which gives

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |  1140 | 37620 |     2   (0)| 00:00:01 |
|   1 |  CONCATENATION                |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1    |    11 |   363 |     1   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN           | PK_T1 |     1 |       |     2   (0)| 00:00:01 |
|   4 |   INLIST ITERATOR             |       |       |       |            |          |
|   5 |    TABLE ACCESS BY INDEX ROWID| T1    |  1129 | 37257 |     1   (0)| 00:00:01 |
|*  6 |     INDEX UNIQUE SCAN         | PK_T1 |     1 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("COL1"=3)
       filter("COL2"=3 OR "COL2"=4)
   6 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2))
       filter(LNNVL("COL1"=3) OR LNNVL("COL2"=3) AND LNNVL("COL2"=4))

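We get a CONCATENATION of two branches: an INDEX RANGE SCAN for the col1 = 3 predicate with its IN list, and a single INLIST ITERATOR into which the simpler OR conditions are combined and which also applies LNNVLs in its filter predicate to exclude rows already returned by the first branch.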
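To make the role of those LNNVLs concrete, here is a hand-written sketch of a query that is logically equivalent to the concatenated plan above. It is only an illustration of the transformation, not something the optimizer literally generates as SQL text; it assumes the same t1 with col1 and col2 used in the example.

```sql
-- First branch: the predicate containing the IN list (plan operations 2-3)
select *
from   t1
where  col1 = 3
and    col2 in (3,4)
union all
-- Second branch: the simpler ORs (plan operations 4-6), with LNNVL
-- filtering out any row the first branch would already have returned
select *
from   t1
where ((col1 = 1 and col2 = 1)
or     (col1 = 2 and col2 = 2))
and   (lnnvl(col1 = 3)
or    (lnnvl(col2 = 3) and lnnvl(col2 = 4)));
```

LNNVL returns TRUE when its condition is FALSE or UNKNOWN, which is why it is used here rather than a plain NOT: rows with NULLs in the tested columns are still not lost from the second branch.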

Now imagine we have a much larger number of the simpler OR predicates combined with liberal scatterings of such IN clauses throughout our 1091 OR’d predicates.

Perhaps we’d be a bit concerned about the context switching and increased CPU usage?

Looking at my real world example right here… there’s no point me pasting in the thousands of lines.

If I run the real world behemoth:

Client elapsed time 108 seconds

DBMS_XPLAN.DISPLAY_CURSOR tells me it ran in 1.26 seconds:

plan hash value: 4262066066  
 
-------------------------------------------------------------------------------------------------------------------- 
| Id  | Operation                     | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | 
--------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                 |      1 |        |   1628 |00:00:01.26 |    3352 |    418 | 
|   1 |  CONCATENATION                |                 |      1 |        |   1628 |00:00:01.26 |    3352 |    418 |  
|   2 |   INLIST ITERATOR             |                 |      1 |        |    966 |00:00:00.94 |    2735 |    389 | 
|   3 |    TABLE ACCESS BY INDEX ROWID| XXXXXXXXXX      |    966 |    624 |    966 |00:00:00.94 |    2735 |    389 | 
|*  4 |     INDEX RANGE SCAN          | XXXXXXXXXX_UK01 |    966 |    624 |    966 |00:00:00.86 |    1915 |    380 | 
|   5 |   TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX      |      1 |      1 |      2 |00:00:00.01 |       4 |      0 |  
|*  6 |    INDEX RANGE SCAN           | XXXXXXXXXX_UK01 |      1 |      1 |      2 |00:00:00.01 |       3 |      0 |  
....
| 253 |   TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX      |      1 |      2 |     47 |00:00:00.01 |      14 |      1 |  
|*254 |    INDEX RANGE SCAN           | XXXXXXXXXX_UK01 |      1 |      2 |     47 |00:00:00.01 |       3 |      1 |   
--------------------------------------------------------------------------------------------------------------------  

Predicate Information (identified by operation id):
--------------------------------------------------- 
   4 - access(((("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722) OR             
              ("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795) OR ("XXXXXXXXXX"."VERSION"=1 AND                                       
              "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863) OR ("XXXXXXXXXX"."VERSION"=1 AND
              "XXXXXXXXXX"."ID_XXXXXXXXXX"=1612023) OR ("XXXXXXXXXX"."VERSION"=1 AND
 .....
   6 - access("XXXXXXXXXX"."ID_XXXXXXXXXX"=2046939)
       filter(((LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722)) AND 
           (LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795)) AND 
              (LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863)) AND 
.....
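
For reference, columns such as Starts, A-Rows and A-Time only appear when row source statistics were collected. A minimal sketch of requesting output like the above, assuming the statement was run with the gather_plan_statistics hint or with statistics_level set to ALL (the post does not say which), and using the SQL ID reported further down:

```sql
-- Child number passed as null so that all child cursors for the sql_id are shown
select *
from   table(dbms_xplan.display_cursor('5fvt3tfbgmqp3', null, 'ALLSTATS LAST'));
```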

Real-time SQL monitoring (DBMS_SQLTUNE.REPORT_SQL_MONITOR) tells me the query ran in 1.68 seconds:

Global Information
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  2                   
 Session ID          :  1220                
 SQL ID              :  5fvt3tfbgmqp3       
 SQL Execution ID    :  33554432            
 Plan Hash Value     :  4262066066          
 Execution Started   :  04/05/2013 14:53:42 
 First Refresh Time  :  04/05/2013 14:53:42 
 Last Refresh Time   :  04/05/2013 14:53:45 

--------------------------------------------------------------------
| Elapsed |   Cpu   |    IO    | Cluster  | Fetch | Buffer | Reads |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |  Gets  |       |
--------------------------------------------------------------------
|    1.68 |    0.50 |     1.13 |     0.05 |     4 |   3352 |   418 |
--------------------------------------------------------------------
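
A report in this format is what real-time SQL monitoring produces. A minimal sketch of requesting it as text through DBMS_SQLTUNE.REPORT_SQL_MONITOR, assuming the statement is still present in the monitoring views:

```sql
-- Returns the monitoring report for the most recent monitored execution of the sql_id
select dbms_sqltune.report_sql_monitor(
         sql_id => '5fvt3tfbgmqp3',
         type   => 'TEXT') as report
from   dual;
```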

Fortunately V$SQL tells me the real story:

select sql_id, executions, rows_processed, elapsed_time/1000/1000, cpu_time/1000/1000 
from v$sql where sql_id = '5fvt3tfbgmqp3';

Gives

SQL_ID        EXECUTIONS ROWS_PROCESSED ELAPSED_TIME/1000/1000 CPU_TIME/1000/1000
------------- ---------- -------------- ---------------------- ------------------
5fvt3tfbgmqp3          1           1628              105.81511         104.581102
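
The gap between under two seconds of execution and over 100 seconds of elapsed time, almost all of it CPU, is consistent with the time going on parsing and optimizing the statement rather than on executing it, although the figures above do not prove that on their own. One way to check, as a sketch that assumes it is run in the same session immediately before and after the statement so that the two sets of values can be compared:

```sql
-- Values are cumulative for the session and reported in microseconds
select stat_name
,      value/1000/1000 as seconds
from   v$sess_time_model
where  sid = sys_context('userenv','sid')
and    stat_name in ('DB time'
                    ,'DB CPU'
                    ,'parse time elapsed'
                    ,'hard parse elapsed time');
```

If the increase in hard parse elapsed time accounts for most of the increase in DB time, the optimizer's handling of the long list of OR predicates is where the time went.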

Then again, if you write nasty code, nasty things can happen.

Poll: SQL Plan Management

I’m interested in what SQL Plan Management features people are actively using.

ORA-00600 [kkqtSetOp.1] – Join Factorization

Just a quick note about an ORA-00600 that recently occurred following an upgrade to 11.2.0.3.

I’ve not been able to distill an isolated test case from the specific production code generating this error so I’ll skip the application-specific example.

The error message reported was: