Remote DML with DBMS_PARALLEL_EXECUTE

An example of sucking data into a table over a db link using DBMS_PARALLEL_EXECUTE.

This particular example is based on something I needed to do in the real world, copying data from one database into another over a db link. Datapump is not available to me. Tables in question happen to be partitioned by a date-like number (boo!) hence some of the specific actions in the detail.

I think it’s a good example of how to use dbms_parallel_execute but also it might be interesting to see how we might combine that functionality with parallel sessions each operating on a single numeric partition.

For setup, let’s create a suitable source table on a remote db.
In this example, I’m recreating the entries in dba_objects for every day for a couple of years.

CREATE TABLE remote_px_test
(dt,owner,object_name,subobject_name,object_id,data_object_id,object_type,created,last_ddl_time,timestamp,status,temporary,generated,secondary,namespace,edition_name)
PARTITION BY RANGE(dt) INTERVAL(1)
(PARTITION p_default VALUES LESS THAN (20120101))
AS
WITH days AS
(SELECT TO_NUMBER(TO_CHAR(TO_DATE(20120101,'YYYYMMDD') + ROWNUM - 1,'YYYYMMDD')) dt
 FROM   dual
 CONNECT BY ROWNUM <= (TRUNC(SYSDATE) - TO_DATE(20120101,'YYYYMMDD')))
SELECT d.dt, o.*
FROM   dba_objects o
CROSS JOIN days d;

SELECT /*+ parallel(16) */ COUNT(*) FROM remote_px_test;
209957272

SELECT round(sum(bytes)/power(1024,3)) FROM user_segments WHERE segment_name = 'REMOTE_PX_TEST';
31

First step is to see how long it takes to do a parallel INSERT SELECT over a db link.

The benefits of parallelisation in such an operation are severely limited because there is only a single session over the db link.

Back to the target database.

First create an empty destination table, same as remote.

CREATE TABLE remote_px_test
(dt,owner,object_name,subobject_name,object_id,data_object_id,object_type,created,last_ddl_time,timestamp,status,temporary,generated,secondary,namespace,edition_name)
PARTITION BY RANGE(dt) INTERVAL(1)
(PARTITION p_default VALUES LESS THAN (20100101))
AS
WITH days AS
(SELECT TO_NUMBER(TO_CHAR(TO_DATE(20120101,'YYYYMMDD') + ROWNUM - 1,'YYYYMMDD')) dt
 FROM   dual
 WHERE 1=0)
SELECT d.dt, o.*
FROM   dba_objects o
CROSS JOIN days d;

Now, let’s see how long it takes to do an INSERT SELECT over a db link.
Time is often not a good measure but in this case I’m primarily interested in how long it takes to copy a whole bunch of tables from A to B over a db link.

insert /*+ append */ into remote_px_test l
select * 
from   remote_px_test@d1 r;

209,957,272 rows inserted.

commit;

This executed in 20 minutes.

As mentioned, you could parallelise bits of it on either side, but the benefit is limited and it might even make things worse thanks to a BUFFER SORT operation.

Next, let’s compare that to the method using DBMS_PARALLEL_EXECUTE.

We want some parallel threads to work on independent partitions, doing direct path inserts, concurrently.

First I’m just going to create a view on the SOURCE DB to make my chunking on daily partition interval simpler.

I could create this on the TARGET DB with references to the dictionary tables over db link but it could be significantly slower depending on the number of partitioned tables and whether predicates are being pushed.

CREATE OR REPLACE VIEW vw_interval_partitions
AS
SELECT table_name, partition_name, partition_position, hi
FROM   (SELECT table_name, partition_name, partition_position
        ,      to_char(
                 extractvalue(
                   dbms_xmlgen.getxmltype
                  ('select high_value from user_tab_partitions x'
                 ||' where x.table_name   = '''||t.table_name||''''
                 ||' and   x.partition_name = '''|| t.partition_name|| ''''),'//text()')) hi
        FROM   user_tab_partitions t);

Secondly, I’m going to create a little helper package which will generate the dynamic SQL for our inserts into specific partitions (the PARTITION FOR clause cannot use bind variables).

		
CREATE OR REPLACE PACKAGE sid_data_pkg
AS
  --
  PROCEDURE sid_ipt (
    i_table_name                 IN     VARCHAR2,
    i_table_owner                IN     VARCHAR2,
    i_column_name                IN     VARCHAR2,
    i_dblink                     IN     VARCHAR2,
    i_start_id                   IN     NUMBER,
    i_end_id                     IN     NUMBER
  );
  --
END sid_data_pkg;
/

CREATE OR REPLACE PACKAGE BODY sid_data_pkg
AS
  PROCEDURE sid_ipt (
    i_table_name                 IN     VARCHAR2,
    i_table_owner                IN     VARCHAR2,
    i_column_name                IN     VARCHAR2,
    i_dblink                     IN     VARCHAR2,
    i_start_id                   IN     NUMBER,
    i_end_id                     IN     NUMBER
  )
  AS
    --
    l_cmd CLOB;
    --
  BEGIN
     --
     l_cmd :=
     q'{INSERT /*+ APPEND */}'||chr(10)||
     q'{INTO   }'||i_table_name||chr(10)||
     q'{PARTITION FOR (}'||i_start_id||')'||chr(10)||
     q'{SELECT *}'||chr(10)||
     q'{FROM   }'||CASE WHEN i_table_owner IS NOT NULL THEN i_table_owner||'.' END
                 ||i_table_name
                 ||CASE WHEN i_dblink IS NOT NULL THEN '@'||i_dblink END
                 ||chr(10)||
     q'{WHERE  }'||i_column_name||' < '||i_end_id||chr(10)||
     CASE WHEN i_start_id IS NOT NULL THEN q'{AND   }'||i_column_name||' >= '||i_start_id END;
     --
     --DBMS_OUTPUT.PUT_LINE(l_cmd);
     --
     EXECUTE IMMEDIATE l_cmd;
     --
     COMMIT;
     --
  END sid_ipt;
  --
END sid_data_pkg;
/
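
For illustration, here is the statement I would expect the helper to build for a single chunk. This is traced by hand from the concatenation above, not captured output, and it assumes the arguments used in the RUN_TASK call further down ('REMOTE_PX_TEST', 'ME_DBA', 'DT', 'D1') with a chunk of start_id 20120101 and end_id 20120102. The trailing semicolon is only there so it can be pasted into SQL*Plus; the string passed to EXECUTE IMMEDIATE has none.

```sql
-- Hand-traced sketch of l_cmd for chunk (20120101, 20120102)
INSERT /*+ APPEND */
INTO   REMOTE_PX_TEST
PARTITION FOR (20120101)
SELECT *
FROM   ME_DBA.REMOTE_PX_TEST@D1
WHERE  DT < 20120102
AND   DT >= 20120101;
```

Uncommenting the DBMS_OUTPUT.PUT_LINE call in the package body is the simplest way to confirm what is really being generated.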

Next, truncate our target table again.
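
A minimal sketch of that step, assuming the target table is the REMOTE_PX_TEST created earlier in the current schema on the target database:

```sql
-- Run on the TARGET database: empty the table before the second test
TRUNCATE TABLE remote_px_test;
```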

Then create our parallel execute task:

begin
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'PX_TEST_TASK');
end;
/

Create the chunks of work to be executed concurrently:

declare
 l_chunk_sql varchar2(1000);
begin
  l_chunk_sql := q'{select (hi - 1) AS partval, hi }'||chr(10)||
                 q'{from   vw_interval_partitions@d1 v }'||chr(10)||
                 q'{where  table_name = 'REMOTE_PX_TEST' }'||chr(10)||
                 q'{order  by partition_position }';
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_SQL(task_name => 'PX_TEST_TASK',sql_stmt => l_chunk_sql, by_rowid => false);
end;
/

Check our task and our chunks:

select * from dba_parallel_execute_tasks;

TASK_OWNER TASK_NAME    CHUNK_TYPE   STATUS  TABLE_OWNER TABLE_NAME NUMBER_COLUMN TASK_COMMENT JOB_PREFIX SQL_STMT LANGUAGE_FLAG EDITION APPLY_CROSSEDITION_TRIGGER FIRE_APPLY_TRIGGER PARALLEL_LEVEL JOB_CLASS
---------- ------------ ------------ ------- ----------- ---------- ------------- ------------ ---------- -------- ------------- ------- -------------------------- ------------------ -------------- ---------
ME_DBA     PX_TEST_TASK NUMBER_RANGE CHUNKED 

select * from dba_parallel_execute_chunks order by chunk_id;

  CHUNK_ID TASK_OWNER TASK_NAME    STATUS     START_ROWID END_ROWID START_ID END_ID   JOB_NAME START_TS END_TS ERROR_CODE ERROR_MESSAGE
---------- ---------- ------------ ---------- ----------- --------- -------- -------- -------- -------- ------ ---------- -------------
      3053 ME_DBA     PX_TEST_TASK UNASSIGNED                       20120100 20120101 
      3054 ME_DBA     PX_TEST_TASK UNASSIGNED                       20120101 20120102 
        ...
      4017 ME_DBA     PX_TEST_TASK UNASSIGNED                       20140821 20140822 

 965 rows selected 

Then we run our parallel tasks thus, each executing the helper package and working on individual partitions:

set serveroutput on
DECLARE
  l_task     VARCHAR2(24) := 'PX_TEST_TASK';
  l_sql_stmt VARCHAR2(1000);
BEGIN
  --
  l_sql_stmt := q'{begin sid_data_pkg.sid_ipt ('REMOTE_PX_TEST','ME_DBA','DT','D1',:start_id,:end_id); end;}';
  --
  DBMS_PARALLEL_EXECUTE.RUN_TASK(l_task, l_sql_stmt, DBMS_SQL.NATIVE,parallel_level => 16);
  --
  dbms_output.put_line(DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task));
  --
end;
/

This executed in 2 minutes and returned code 6 which is FINISHED (without error).

Status of individual chunks can be checked via DBA_PARALLEL_EXECUTE_CHUNKS.
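
Something along these lines might be a convenient way to check the outcome and tidy up afterwards. It is only a sketch: it assumes the task name PX_TEST_TASK used above and relies on the documented chunk status PROCESSED_WITH_ERROR.

```sql
-- Summarise chunk outcomes for the task
SELECT status, COUNT(*) AS chunks
FROM   dba_parallel_execute_chunks
WHERE  task_name = 'PX_TEST_TASK'
GROUP  BY status
ORDER  BY status;

-- Show the details of any chunks that failed
SELECT chunk_id, start_id, end_id, error_code, error_message
FROM   dba_parallel_execute_chunks
WHERE  task_name = 'PX_TEST_TASK'
AND    status    = 'PROCESSED_WITH_ERROR'
ORDER  BY chunk_id;

-- Once happy with the result, remove the task and its chunk metadata
BEGIN
  DBMS_PARALLEL_EXECUTE.DROP_TASK(task_name => 'PX_TEST_TASK');
END;
/
```

If some chunks did fail, DBMS_PARALLEL_EXECUTE.RESUME_TASK can be used to retry them before dropping the task.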

Remote Surprise

Here is an example of surprising behaviour from a remote DB, taken from an OTN forum thread.

Set up a link to a remote DB (I’ve used an actual remote DB and have not tested a loopback).

Remote DB:

create table t1
(col1  varchar2(1));

Local DB:

create or replace view v1 
as 
select count(*) c1 from t1@l1;

Then alternate variations on this sequence of events:

1. On local DB execute SELECT:

SELECT * FROM v1;

2. On remote DB execute

begin
insert into t1 values ('a');
commit;
end;
/

3. On local DB execute SELECT:

SELECT * FROM v1;

4. On local DB execute SELECT:

SELECT * FROM v1;

OR

1. On local DB execute SELECT:

SELECT * FROM v1;

2. On remote DB execute

begin
delete from t1; 
commit;
end;
/

3. On local DB execute:

SELECT * FROM v1;

4. On local DB execute:

SELECT * FROM v1;

You may have to execute several times to catch the “inconsistency” but between the remote INSERT or DELETE and the local SELECT, you should see the odd “old” result which is corrected on the subsequent execution.

For example, on local DB with nothing happening on remote DB between the two SELECTS:

SQL> select * from v1;

        C1
----------
         6

SQL> select * from v1;

        C1
----------
         7

The behaviour is documented.


no practical way exists to keep SCNs in a distributed system absolutely synchronized: a window always exists in which one node may have an SCN that is somewhat in the past with respect to the SCN of another node.

Because of the SCN gap, you can execute a query that uses a slightly old snapshot, so that the most recent changes to the remote database are not seen. In accordance with read consistency, a query can therefore retrieve consistent, but out-of-date data.

And two workarounds documented:

You can use the following techniques to ensure that the SCNs of the two systems are synchronized just before a query:

Because SCNs are synchronized at the end of a remote query, precede each remote query with a dummy remote query to the same site, for example, SELECT * FROM DUAL@REMOTE.

Because SCNs are synchronized at the start of every remote transaction, commit or roll back the current transaction before issuing the remote query.
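
Applied to the view used in this example, the two documented workarounds might look like the sketch below. It assumes the db link L1 and view V1 from the setup above, and I have not measured how reliably either option removes the stale result.

```sql
-- Option 1: dummy remote query to the same site first,
-- so the SCNs are synchronised when it completes
SELECT * FROM dual@l1;
SELECT * FROM v1;

-- Option 2: end the current transaction before the remote query
COMMIT;
SELECT * FROM v1;
```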

SQL utils using XML

You may have previously seen a short post I did on a SQL statement to identify which statements are using dynamic sampling.

If not, quick recap:

SELECT p.sql_id, t.val
FROM   v$sql_plan p
,      xmltable('for $i in /other_xml/info
                 where $i/@type eq "dynamic_sampling"
                 return $i'
                passing xmltype(p.other_xml)
                columns attr varchar2(50) path '@type',
                        val  varchar2(50) path '/') t
WHERE  p.other_xml IS NOT NULL;

This uses the incredibly powerful XMLTABLE functionality; there’s so much that can be done with it.

Here are a couple of other utilities I used recently which also highlight the powerful convenience of SQL and XML.

First up, I don’t know if this is useful to anyone but I had a crappy refresh script which should have been creating table partitions with SEGMENT CREATION DEFERRED but wasn’t.

So there was a reasonable amount of space wastage caused by empty segments.

How to identify? See below.

Could be combined with DBMS_SPACE_ADMIN.DROP_EMPTY_SEGMENTS to clean up?

WITH subq_pos_empty AS
(SELECT t.table_owner
 ,      t.table_name
 ,      t.partition_name
 ,      x.cnt
 FROM   dba_segments         s
 ,      dba_tab_partitions   t
 ,      xmltable('for $i in /ROWSET/ROW/CNT
                  return $i'
                 passing xmltype(
                          dbms_xmlgen.getxml
                          ('select count(*) cnt '
                         ||'from '||t.table_owner||'.'||t.table_name||' PARTITION ('||t.partition_name||') '
                         --||'SAMPLE(.01)' -- If you want to sample to speed up unexpected large seg counts                  
                           ))
                 columns cnt number path '/') x
 WHERE  s.segment_type       = 'TABLE PARTITION'
 --AND    t.table_owner      LIKE 'XYZ%'
 AND    t.table_owner        = s.owner
 AND    t.table_name         = s.segment_name
 AND    t.partition_name     = s.partition_name
 AND    t.num_rows           = 0
 AND    t.partition_position > 1)
SELECT *
FROM   subq_pos_empty
WHERE  cnt = 0
ORDER BY 
       table_owner
,      table_name
,      partition_name;

For example:

SQL> create table t1
  2  (col1 date
  3  ,col2 number)
  4  partition by range(col1) interval (numtodsinterval(1,'DAY'))
  5  (PARTITION p0 values less than (to_Date(20130101,'YYYYMMDD')) segment creation immediate
  6  ,PARTITION p1 values less than (to_Date(20130102,'YYYYMMDD'))  segment creation immediate)
  7  ;

Table created.

SQL> exec dbms_stats.gather_table_stats(USER,'T1');

PL/SQL procedure successfully completed.

SQL> WITH subq_pos_empty AS
  2  (SELECT t.table_owner
  3   ,      t.table_name
  4   ,      t.partition_name
  5   ,      x.cnt
  6   FROM   dba_segments         s
  7   ,      dba_tab_partitions   t
  8   ,      xmltable('for $i in /ROWSET/ROW/CNT
  9                    return $i'
 10                   passing xmltype(
 11                            dbms_xmlgen.getxml
 12                            ('select count(*) cnt '
 13                           ||'from '||t.table_owner||'.'||t.table_name||' PARTITION ('||t.partition_name||') '
 14                           --||'SAMPLE(.01)' -- If you want to sample to speed up unexpected large seg counts                                      
 15                             ))
 16                   columns cnt number path '/') x
 17   WHERE  s.segment_type       = 'TABLE PARTITION'
 18   --AND    t.table_owner      LIKE 'XYZ%'
 19   AND    t.table_name         = 'T1' -- Comment out
 20   AND    t.table_owner        = s.owner
 21   AND    t.table_name         = s.segment_name
 22   AND    t.partition_name     = s.partition_name
 23   AND    t.num_rows           = 0
 24   AND    t.partition_position > 1)
 25  SELECT *
 26  FROM   subq_pos_empty
 27  WHERE  cnt = 0
 28  ORDER BY
 29         table_owner
 30  ,      table_name
 31  ,      partition_name;

TABLE_OWNER                    TABLE_NAME                     PARTITION_NAME                        CNT
------------------------------ ------------------------------ ------------------------------ ----------
PGPS_UAT1                      T1                             P1                                      0
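
To follow up on the DBMS_SPACE_ADMIN.DROP_EMPTY_SEGMENTS idea, a sketch of the cleanup for the partition found above might look like this. It is untested here and assumes a release where the procedure is available; the owner, table and partition names are simply the ones from the example output.

```sql
-- Drop the segment of the empty partition identified by the query above
BEGIN
  DBMS_SPACE_ADMIN.DROP_EMPTY_SEGMENTS(
    schema_name    => 'PGPS_UAT1',
    table_name     => 'T1',
    partition_name => 'P1');
END;
/

-- Check which partition segments remain for the table
SELECT partition_name, bytes
FROM   dba_segments
WHERE  owner        = 'PGPS_UAT1'
AND    segment_name = 'T1'
AND    segment_type = 'TABLE PARTITION'
ORDER  BY partition_name;
```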

Secondly, a helper for partitions and that nasty LONG column which can be used for partition maintenance to roll off oldest partitions:

SELECT table_name
,      partition_name
,      hi
FROM   (SELECT t.table_name
        ,      t.partition_name
        ,      t.partition_position
        ,      x.hi
        FROM   user_tab_partitions t
        ,      xmltable('for $i in /ROWSET/ROW/HI
                         return $i'
                        passing xmltype(
                                dbms_xmlgen.getxml
                                ('select high_value hi from user_tab_partitions x'
                               ||' where x.table_name     = '''||t.table_name||''''
                               ||' and   x.partition_name = '''|| t.partition_name|| ''''))
                        columns hi number path '/') x
        --WHERE  partition_position > 1
        --AND    table_name  = i_table_name
        )
--WHERE hi       <= i_date_yyyymmdd
;   

This works an awful lot more easily if you have range/interval partitioning on a number – which most people probably don’t have.

For the more normal DATE range partitioning, it’s only slightly more fiddly.

I haven’t spent too long thinking about it so there may be a better way, but I tried to avoid the deprecated EXTRACTVALUE approach:

SELECT t.table_name
,      t.partition_name
,      t.partition_position
,      to_date(x2.dt,'YYYYMMDDHH24MISS') hi
FROM   user_tab_partitions t
,      xmltable('for $i in /ROWSET/ROW/HI
                 return $i'
                passing xmltype(
                          dbms_xmlgen.getxml
                          ('select high_value hi from user_tab_partitions x'
                         ||' where x.table_name     = '''||t.table_name||''''
                         ||' and   x.partition_name = '''|| t.partition_name|| ''''))
                columns dt varchar2(4000) path '/') x
,      xmltable('for $i in /ROWSET/ROW/DT
                 return $i'
                passing xmltype(dbms_xmlgen.getxml(q'[select to_char(]'||x.dt||q'[,'YYYYMMDDHH24MISS') dt from dual]'))
                columns dt varchar2(16) path '/') x2
;

For example:

SQL> alter session set nls_date_format = 'DD-MON-YYYY HH24:MI';

SQL> SELECT t.table_name
  2  ,      t.partition_name
  3  ,      t.partition_position
  4  ,      to_date(x2.dt,'YYYYMMDDHH24MISS') hi
  5  FROM   user_tab_partitions t
  6  ,      xmltable('for $i in /ROWSET/ROW/HI
  7                   return $i'
  8                  passing xmltype(
  9                            dbms_xmlgen.getxml
 10                            ('select high_value hi from user_tab_partitions x'
 11                           ||' where x.table_name     = '''||t.table_name||''''
 12                           ||' and   x.partition_name = '''|| t.partition_name|| ''''))
 13                  columns dt varchar2(4000) path '/') x
 14  ,      xmltable('for $i in /ROWSET/ROW/DT
 15                   return $i'
 16                  passing xmltype(dbms_xmlgen.getxml(q'[select to_char(]'||x.dt||q'[,'YYYYMMDDHH24MISS') dt from dual]'))
 17                  columns dt varchar2(16) path '/') x2
 18  WHERE  t.table_name = 'T1';

TABLE_NAME                     PARTITION_NAME                 PARTITION_POSITION HI
------------------------------ ------------------------------ ------------------ -----------------
T1                             P0                                              1 01-JAN-2013 00:00
T1                             P1                                              2 02-JAN-2013 00:00

SQL>

Warning about the XMLTABLE approach – if running on versions earlier than 11.2, you may occasionally run into some ORA-00600 bugs.

ORs, IN lists and LNNVL

I’ve previously written about manually rewriting an OR condition into a UNION ALL using LNNVL.

This is a description of a performance issue observed in the real world, where the optimizer came up with a CONCATENATION operation over many child operations, one of them an INLIST ITERATOR and the others having to use LNNVL as a filter.

Really this is nothing more than an illustration of our old performance killing friend the function call from within SQL, particularly from within predicates, and the problem of context switching.

Ok. So, the real world example is a crazy query.
It contains 1091 OR clauses each stipulating a unique key lookup.
Yes, “1091” OR clauses.
I know, I know.
Probably itself a workaround to avoid the 1000 limit on an IN list, who knows?
Anyway.

I’m going to use a table just to illustrate the execution plan and then use some metrics from the execution of the real world example.

drop table t1;

create table t1
(col1 number
,col2 number
,col3 varchar2(10)
,constraint pk_t1 primary key (col1, col2));

insert into t1
select mod(rownum,10000)+1 col1
,      ceil(rownum/10000)  col2
,      rpad('X',10,'X')    col3
from   dual
connect by rownum <= 100000
order by col1, col2;

commit;

If we use a simple multi-column OR condition, we don’t get the desired LNNVLs.

explain plan for
select *
from   t1
where (col1 = 1
and    col2 = 1)
or    (col1 = 2
and    col2 = 2)
or    (col1 = 3
and    col2 = 3)
or    (col1 = 3
and    col2 = 4);

select * from table(dbms_xplan.display);

The optimizer just uses an INLIST iterator:

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |  1129 | 37257 |     5   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |  1129 | 37257 |     5   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | PK_T1 |     7 |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2 OR "COL1"=3 AND 
              "COL2"=3 OR "COL1"=3 AND "COL2"=4))

However, if we combine an IN list with some of those ORs:

explain plan for
select *
from   t1
where (col1 = 1
and    col2 = 1)
or    (col1 = 2
and    col2 = 2)
or    (col1 = 3
and    col2 IN (3,4));

select * from table(dbms_xplan.display);

Which gives

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |  1140 | 37620 |     2   (0)| 00:00:01 |
|   1 |  CONCATENATION                |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1    |    11 |   363 |     1   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN           | PK_T1 |     1 |       |     2   (0)| 00:00:01 |
|   4 |   INLIST ITERATOR             |       |       |       |            |          |
|   5 |    TABLE ACCESS BY INDEX ROWID| T1    |  1129 | 37257 |     1   (0)| 00:00:01 |
|*  6 |     INDEX UNIQUE SCAN         | PK_T1 |     1 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("COL1"=3)
       filter("COL2"=3 OR "COL2"=4)
   6 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2))
       filter(LNNVL("COL1"=3) OR LNNVL("COL2"=3) AND LNNVL("COL2"=4))

We get a CONCATENATION of two children: an index range scan for the branch containing the IN list, and a single INLIST ITERATOR operation which combines the simpler OR conditions and applies LNNVLs in its filter predicate.

Now imagine we have a much larger number of the simpler OR predicates combined with liberal scatterings of such IN clauses throughout our 1091 OR’d predicates.

Perhaps we’d be a bit concerned about the context switching and increased cpu usage?

Looking at my real world example right here… there’s no point me pasting in the thousands of lines.

If I run the real world behemoth:

Client elapsed time 108 seconds

DBMS_XPLAN.DISPLAY_CURSOR tells me it ran in 1.26 seconds:

plan hash value: 4262066066  
 
-------------------------------------------------------------------------------------------------------------------- 
| Id  | Operation                     | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | 
--------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                 |      1 |        |   1628 |00:00:01.26 |    3352 |    418 | 
|   1 |  CONCATENATION                |                 |      1 |        |   1628 |00:00:01.26 |    3352 |    418 |  
|   2 |   INLIST ITERATOR             |                 |      1 |        |    966 |00:00:00.94 |    2735 |    389 | 
|   3 |    TABLE ACCESS BY INDEX ROWID| XXXXXXXXXX      |    966 |    624 |    966 |00:00:00.94 |    2735 |    389 | 
|*  4 |     INDEX RANGE SCAN          | XXXXXXXXXX_UK01 |    966 |    624 |    966 |00:00:00.86 |    1915 |    380 | 
|   5 |   TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX      |      1 |      1 |      2 |00:00:00.01 |       4 |      0 |  
|*  6 |    INDEX RANGE SCAN           | XXXXXXXXXX_UK01 |      1 |      1 |      2 |00:00:00.01 |       3 |      0 |  
....
| 253 |   TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX      |      1 |      2 |     47 |00:00:00.01 |      14 |      1 |  
|*254 |    INDEX RANGE SCAN           | XXXXXXXXXX_UK01 |      1 |      2 |     47 |00:00:00.01 |       3 |      1 |   
--------------------------------------------------------------------------------------------------------------------  

Predicate Information (identified by operation id):
--------------------------------------------------- 
   4 - access(((("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722) OR             
              ("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795) OR ("XXXXXXXXXX"."VERSION"=1 AND                                       
              "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863) OR ("XXXXXXXXXX"."VERSION"=1 AND
              "XXXXXXXXXX"."ID_XXXXXXXXXX"=1612023) OR ("XXXXXXXXXX"."VERSION"=1 AND .....
   6 - access("XXXXXXXXXX"."ID_XXXXXXXXXX"=2046939)
       filter(((LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722)) AND 
           (LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795)) AND 
              (LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863)) AND 
.....

Real-time SQL Monitoring (DBMS_SQLTUNE.REPORT_SQL_MONITOR) tells me the query ran in 1.68 seconds:

Global Information
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  2                   
 Session ID          :  1220                
 SQL ID              :  5fvt3tfbgmqp3       
 SQL Execution ID    :  33554432            
 Plan Hash Value     :  4262066066          
 Execution Started   :  04/05/2013 14:53:42 
 First Refresh Time  :  04/05/2013 14:53:42 
 Last Refresh Time   :  04/05/2013 14:53:45 

--------------------------------------------------------------------
| Elapsed |   Cpu   |    IO    | Cluster  | Fetch | Buffer | Reads |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |  Gets  |       |
--------------------------------------------------------------------
|    1.68 |    0.50 |     1.13 |     0.05 |     4 |   3352 |   418 |
--------------------------------------------------------------------

Fortunately V$SQL tells me the real story:

select sql_id, executions, rows_processed, elapsed_time/1000/1000, cpu_time/1000/1000 
from v$sql where sql_id = '5fvt3tfbgmqp3';

Gives

SQL_ID        EXECUTIONS ROWS_PROCESSED ELAPSED_TIME/1000/1000 CPU_TIME/1000/1000
------------- ---------- -------------- ---------------------- ------------------
5fvt3tfbgmqp3          1           1628              105.81511         104.581102

Then again, if you write nasty code, nasty things can happen.

CPU_time much higher than elapsed_time

Of course, one of the first places you should go is Oracle Support, but if you, like me, put off doing that and scratch your head for a while wondering why ELAPSED_TIME in V$SQL might be significantly lower than CPU_TIME when there are no obvious other explanations, then note these two bugs in 11.1.0.7:

BUG 7168259 – Millisecond elapsed_time shows as 0 : Elapsed_time for SQL statements which execute in under one millisecond can show up as 0

BUG 7561762 – Elapsed_time is under-reported : Elapsed_time can be 0 or significantly less than V$SQL.CPU_TIME if there is significant idle wait class time
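
To see whether any cursors currently in the shared pool show this symptom, a simple check against V$SQL along these lines might help. It is a sketch only: it lists candidates and says nothing about whether either bug is actually the cause.

```sql
-- Cursors where reported CPU time exceeds reported elapsed time
SELECT sql_id
,      child_number
,      executions
,      ROUND(elapsed_time/1000/1000, 2) AS elapsed_secs
,      ROUND(cpu_time/1000/1000, 2)     AS cpu_secs
FROM   v$sql
WHERE  cpu_time > elapsed_time
ORDER  BY cpu_time - elapsed_time DESC;
```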

Manual Concatenation and LNNVL

A quick example of things to consider when rewriting a SQL OR condition as a UNION/UNION ALL, as pointed out to me by Jonathan Lewis referencing this post on this recent OTN forum thread.

At the heart of the issue are:

  • how to deal with the possibility of duplicates
  • the potential impact of NULLS

The impact of NULLS is best described by Jonathan in the article already mentioned so I’m going to use a noddy example to illustrate:

  • the problem of duplicates
  • why suggesting a UNION is not an appropriate generic approach (although it may work for specific examples)
  • why that generic approach should be a UNION ALL plus LNNVL

So… if, for whatever reason, we want to rewrite this statement:

select *
from   t2, t1
where  t2.col1 = t1.par
or     t2.col1 = t1.child;

If this is our setup that includes a noddy duplicate:

SQL> create table t1
  2  (par   number
  3  ,child number);

Table created.

SQL> create table t2
  2  (col1 number
  3  ,col2 varchar2(1));

Table created.

SQL> insert into t1 values (1,1);

1 row created.

SQL> insert into t1 values (1,1);

1 row created.

SQL> insert into t2 values (1,'A');

1 row created.

SQL> select * from t1;

       PAR      CHILD
---------- ----------
         1          1
         1          1

2 rows selected.

SQL> select * from t2;

      COL1 C
---------- -
         1 A

1 row selected.

SQL> 

Such that our target statement returns this:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  or     t2.col1 = t1.child;

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1
         1 A          1          1

2 rows selected.

SQL> 

We can see that rewriting with a UNION is not equivalent to the original:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  union
  5  select *
  6  from   t2, t1
  7  where  t2.col1 = t1.child;

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1

1 row selected.

SQL> 

And if we use a UNION ALL:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  union all
  5  select *
  6  from   t2, t1
  7  where  t2.col1 = t1.child;

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1
         1 A          1          1
         1 A          1          1
         1 A          1          1

4 rows selected.

Then we need LNNVL to eliminate the duplicates from the second branch, the lower half of the statement:

SQL> select *
  2  from   t2, t1
  3  where  t2.col1 = t1.par
  4  union all
  5  select *
  6  from   t2, t1
  7  where  t2.col1 = t1.child
  8  and    lnnvl(t1.par=t1.child);

      COL1 C        PAR      CHILD
---------- - ---------- ----------
         1 A          1          1
         1 A          1          1

2 rows selected.

SQL> 
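
As a small aside on the NULL side of things, the sketch below shows why LNNVL(a = b) is not the same as a <> b. The inline data set is made up purely for illustration: LNNVL returns TRUE when its condition is FALSE or UNKNOWN, so the row with the NULL survives the filter.

```sql
-- Hypothetical three-row data set: equal, different, and NULL child
WITH vals AS
(SELECT 1 AS par, 1 AS child                    FROM dual UNION ALL
 SELECT 1,        2                             FROM dual UNION ALL
 SELECT 1,        CAST(NULL AS NUMBER)          FROM dual)
SELECT par, child
FROM   vals
WHERE  LNNVL(par = child);
-- Keeps (1,2) and (1,NULL), in no guaranteed order.
-- A plain "par <> child" predicate would keep only (1,2).
```

That is the behaviour needed in the second branch of the UNION ALL: keep every row the first branch did not already return, including those where the first condition evaluates to UNKNOWN.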

Plan_hash_value and internal temporary table names

Here’s a little thing about plan_hash_2 that’s come via a pointer from Jonathan Lewis to bug 10162430.

From querying DBA_HIST_SQLSTAT I happened to notice that over a period of 65 days a particular query had 63 different plans (PLAN_HASH_VALUE).
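
A query along these lines is one way to spot statements with a suspicious number of plans in AWR. It is a sketch, and the threshold of 10 is an arbitrary assumption of mine rather than anything meaningful.

```sql
-- Statements with many distinct plan hash values recorded in AWR
SELECT sql_id
,      COUNT(DISTINCT plan_hash_value) AS plan_count
,      MIN(snap_id)                    AS first_snap
,      MAX(snap_id)                    AS last_snap
FROM   dba_hist_sqlstat
WHERE  plan_hash_value <> 0
GROUP  BY sql_id
HAVING COUNT(DISTINCT plan_hash_value) > 10
ORDER  BY plan_count DESC;
```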

I found that the differences were down to the different internal temporary table names used for a materialized subquery, i.e. the plans were essentially the same but for the name of SYS_TEMP_0FD9D6786_B6EF87D2, SYS_TEMP_0FD9D6825_BE8671F, SYS_TEMP_0FD9D6684_EAD00827, etc.

Exactly what bug 10162430 addresses is unclear because it is listed as fixed in 11.2.0.3 and my different plan hashes come from an 11.2.0.3 database.

But the key information in the bug description highlighted that PLAN_HASH_2 is not affected by these differing temp table names.

Exactly what plan_hash_2 is and how it differs from plan_hash_value is undocumented but:

  • it is found in V$SQL_PLAN.OTHER_XML;
  • also in the AWR repository DBA_HIST_SQL_PLAN.OTHER_XML;
  • and is used as the enforcing plan hash in sql plan baselines – not exposed in DBA_SQL_PLAN_BASELINES but internally represented in column PLAN_ID in SYS.SQLOBJ$, SYS.SQLOBJ$DATA, SYS.SQLOBJ$AUXDATA.

So, if you regularly look at DBA_HIST_SQLSTAT to look at historic execution statistics of a SQL statement and your queries use materialized subqueries then I recommend adding PLAN_HASH_2 to the output, something like this:

select sn.snap_id
,      sn.end_interval_time
,      st.module
,      st.sql_id
,      st.plan_hash_value
,      (select extractvalue(xmltype(other_xml),'other_xml/info[@type="plan_hash_2"]')
        from   dba_hist_sql_plan hp
        where  hp.sql_id          = st.sql_id
        and    hp.plan_hash_value = st.plan_hash_value
        and    hp.other_xml is not null) plan_hash_2
,      rows_processed_delta rws
,      executions_delta     execs
,      elapsed_time_delta   elp
,      cpu_time_delta       cpu
,      buffer_gets_delta    gets
,      iowait_delta         io
from   dba_hist_snapshot sn
,      dba_hist_sqlstat  st
where  st.snap_id            = sn.snap_id
and    st.sql_id             = '&sql_id'
and    st.elapsed_time_delta > 0
order by sn.snap_id desc; 

And just to show this difference between PHV and PH2:

   SNAP_ID END_INTERVAL_TIME         PLAN_HASH_VALUE PLAN_HASH_2
---------- ------------------------- --------------- ------------
      9131 12/07/2012 00:00               1987570133 3399935153
      9083 11/07/2012 00:00               1567016512 3399935153
      9035 10/07/2012 00:00               3386529786 3399935153
      8891 07/07/2012 00:00               2197008736 3399935153
      8842 05/07/2012 23:30                991904690 3399935153
      8794 04/07/2012 23:30               1331394991 3399935153
      8746 03/07/2012 23:30               2577119670 3399935153
      8699 03/07/2012 00:00               3774955286 3399935153
      8698 02/07/2012 23:30               3774955286 3399935153
      8555 30/06/2012 00:00               3308813978 3399935153
      8507 29/06/2012 00:00                796849390 3399935153
      8459 28/06/2012 00:00                917735929 3399935153
      8410 26/06/2012 23:30                139934889 3399935153
      8363 26/06/2012 00:00               1936665492 3399935153
      8219 23/06/2012 00:00                666306815 3399935153
      8171 22/06/2012 00:00               1053584101 3399935153
      8123 21/06/2012 00:00               1353471101 3399935153
...
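
For cursors still in the shared pool, the same value can be pulled from V$SQL_PLAN.OTHER_XML. The sketch below uses XMLTABLE rather than EXTRACTVALUE; it assumes the same other_xml/info layout that the AWR query above relies on, and the &sql_id substitution variable is just a placeholder.

```sql
-- PLAN_HASH_VALUE alongside PLAN_HASH_2 for child cursors in memory
SELECT p.sql_id
,      p.child_number
,      p.plan_hash_value
,      x.plan_hash_2
FROM   v$sql_plan p
,      XMLTABLE('/other_xml/info[@type="plan_hash_2"]'
                PASSING XMLTYPE(p.other_xml)
                COLUMNS plan_hash_2 NUMBER PATH '.') x
WHERE  p.sql_id    = '&sql_id'
AND    p.other_xml IS NOT NULL;
```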