Remote DML with DBMS_PARALLEL_EXECUTE

An example of sucking data into a table over a db link using DBMS_PARALLEL_EXECUTE.

This particular example is based on something I needed to do in the real world: copying data from one database into another over a db link. Datapump is not available to me. The tables in question happen to be partitioned by a date-like number (boo!), hence some of the specific actions in the detail.

I think it’s a good example of how to use dbms_parallel_execute, and it might also be interesting to see how we can combine that functionality with parallel sessions, each operating on a single numeric partition.

For setup, let’s create a suitable source table on a remote db.
In this example, I’m recreating the entries in dba_objects for every day for a couple of years.

CREATE TABLE remote_px_test
(dt,owner,object_name,subobject_name,object_id,data_object_id,object_type,created,last_ddl_time,timestamp,status,temporary,generated,secondary,namespace,edition_name)
PARTITION BY RANGE(dt) INTERVAL(1)
(PARTITION p_default VALUES LESS THAN (20120101))
AS
WITH days AS
(SELECT TO_NUMBER(TO_CHAR(TO_DATE(20120101,'YYYYMMDD') + ROWNUM - 1,'YYYYMMDD')) dt
 FROM   dual
 CONNECT BY ROWNUM <= (TRUNC(SYSDATE) - TO_DATE(20120101,'YYYYMMDD')))
SELECT d.dt, o.*
FROM   dba_objects o
CROSS JOIN days d;

SELECT /*+ parallel(16) */ COUNT(*) FROM remote_px_test;
209957272

SELECT round(sum(bytes)/power(1024,3)) FROM user_segments WHERE segment_name = 'REMOTE_PX_TEST';
31

The first step is to see how long it takes to do an INSERT SELECT over a db link.

The benefits of parallelisation in such an operation are severely limited because we have a single session over the db link.

Back to the target database.

First create an empty destination table, same as remote.

CREATE TABLE remote_px_test
(dt,owner,object_name,subobject_name,object_id,data_object_id,object_type,created,last_ddl_time,timestamp,status,temporary,generated,secondary,namespace,edition_name)
PARTITION BY RANGE(dt) INTERVAL(1)
(PARTITION p_default VALUES LESS THAN (20120101))
AS
WITH days AS
(SELECT TO_NUMBER(TO_CHAR(TO_DATE(20120101,'YYYYMMDD') + ROWNUM - 1,'YYYYMMDD')) dt
 FROM   dual
 WHERE 1=0)
SELECT d.dt, o.*
FROM   dba_objects o
CROSS JOIN days d;

Now, let’s see how long it takes to do an INSERT SELECT over a db link.
Time is often not a good measure, but in this case I’m primarily interested in how long it takes to copy a whole bunch of tables from A to B.

insert /*+ append */ into remote_px_test l
select * 
from   remote_px_test@d1 r;

209,957,272 rows inserted.

commit;

This executed in 20 minutes.

As mentioned, you could parallelise bits of it on either side, but the benefit is limited, and it might even make things worse thanks to the BUFFER SORT operation.
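
For illustration, a minimal sketch of what that parallelised variant might look like, assuming parallel DML is enabled in the session and a degree of 8 on both sides; the single db link session remains the serial bottleneck regardless.

alter session enable parallel dml;

insert /*+ append parallel(l 8) */ into remote_px_test l
select /*+ parallel(r 8) */ *
from   remote_px_test@d1 r;

commit;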

Next, let’s compare this to the DBMS_PARALLEL_EXECUTE method.

We want some parallel threads to work on independent partitions, doing direct path inserts, concurrently.

First, I’m just going to create a view on the SOURCE DB to make my chunking on the daily partition interval simpler.

I could create this on the TARGET DB with references to the dictionary tables over the db link, but it could be significantly slower depending on the number of partitioned tables and whether predicates are pushed.

CREATE OR REPLACE VIEW vw_interval_partitions
AS
SELECT table_name, partition_name, partition_position, hi
FROM   (SELECT table_name, partition_name, partition_position
        ,      to_char(
                 extractvalue(
                   dbms_xmlgen.getxmltype
                  ('select high_value from user_tab_partitions x'
                 ||' where x.table_name   = '''||t.table_name||''''
                 ||' and   x.partition_name = '''|| t.partition_name|| ''''),'//text()')) hi
        FROM   user_tab_partitions t);
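
Just to show what the view produces (for the source table created above), each daily partition comes back with its numeric high value:

select partition_name, partition_position, hi
from   vw_interval_partitions
where  table_name = 'REMOTE_PX_TEST'
order  by partition_position;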

Secondly, I’m going to create a little helper package which will generate the dynamic SQL for our inserts into specific partitions (the PARTITION FOR clause cannot use bind variables).

CREATE OR REPLACE PACKAGE sid_data_pkg
AS
  --
  PROCEDURE sid_ipt (
    i_table_name                 IN     VARCHAR2,
    i_table_owner                IN     VARCHAR2,
    i_column_name                IN     VARCHAR2,
    i_dblink                     IN     VARCHAR2,
    i_start_id                   IN     NUMBER,
    i_end_id                     IN     NUMBER
  );
  --
END sid_data_pkg;
/

CREATE OR REPLACE PACKAGE BODY sid_data_pkg
AS
  PROCEDURE sid_ipt (
    i_table_name                 IN     VARCHAR2,
    i_table_owner                IN     VARCHAR2,
    i_column_name                IN     VARCHAR2,
    i_dblink                     IN     VARCHAR2,
    i_start_id                   IN     NUMBER,
    i_end_id                     IN     NUMBER
  )
  AS
    --
    l_cmd CLOB;
    --
  BEGIN
     --
     l_cmd :=
     q'{INSERT /*+ APPEND */}'||chr(10)||
     q'{INTO   }'||i_table_name||chr(10)||
     q'{PARTITION FOR (}'||i_start_id||')'||chr(10)||
     q'{SELECT *}'||chr(10)||
     q'{FROM   }'||CASE WHEN i_table_owner IS NOT NULL THEN i_table_owner||'.' END
                 ||i_table_name
                 ||CASE WHEN i_dblink IS NOT NULL THEN '@'||i_dblink END
                 ||chr(10)||
     q'{WHERE  }'||i_column_name||' < '||i_end_id||chr(10)||
     CASE WHEN i_start_id IS NOT NULL THEN q'{AND   }'||i_column_name||' >= '||i_start_id END;
     --
     --DBMS_OUTPUT.PUT_LINE(l_cmd);
     --
     EXECUTE IMMEDIATE l_cmd;
     --
     COMMIT;
     --
  END sid_ipt;
  --
END sid_data_pkg;
/
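
As a sanity check before wiring this into DBMS_PARALLEL_EXECUTE, a hypothetical single-partition call (with the DBMS_OUTPUT line above uncommented, and noting that it really does execute the insert and commit) would generate a statement like this:

exec sid_data_pkg.sid_ipt('REMOTE_PX_TEST','ME_DBA','DT','D1',20120101,20120102)

INSERT /*+ APPEND */
INTO   REMOTE_PX_TEST
PARTITION FOR (20120101)
SELECT *
FROM   ME_DBA.REMOTE_PX_TEST@D1
WHERE  DT < 20120102
AND   DT >= 20120101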

Next, truncate our target table again so we start from empty:
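
TRUNCATE TABLE remote_px_test;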

Then create our parallel execute task:

begin
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'PX_TEST_TASK');
end;
/

Create the chunks of work to be executed concurrently (the chunk SQL must return two number columns, which become the START_ID and END_ID of each chunk):

declare
 l_chunk_sql varchar2(1000);
begin
  l_chunk_sql := q'{select (hi - 1) AS partval, hi }'||chr(10)||
                 q'{from   vw_interval_partitions@d1 v }'||chr(10)||
                 q'{where  table_name = 'REMOTE_PX_TEST' }'||chr(10)||
                 q'{order  by partition_position }';
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_SQL(task_name => 'PX_TEST_TASK',sql_stmt => l_chunk_sql, by_rowid => false);
end;
/

Check our task and our chunks:

select * from dba_parallel_execute_tasks;

TASK_OWNER TASK_NAME    CHUNK_TYPE   STATUS  TABLE_OWNER TABLE_NAME NUMBER_COLUMN TASK_COMMENT JOB_PREFIX SQL_STMT LANGUAGE_FLAG EDITION APPLY_CROSSEDITION_TRIGGER FIRE_APPLY_TRIGGER PARALLEL_LEVEL JOB_CLASS
---------- ------------ ------------ ------- ----------- ---------- ------------- ------------ ---------- -------- ------------- ------- -------------------------- ------------------ -------------- ---------
ME_DBA     PX_TEST_TASK NUMBER_RANGE CHUNKED 

select * from dba_parallel_execute_chunks order by chunk_id;

  CHUNK_ID TASK_OWNER TASK_NAME    STATUS     START_ROWID END_ROWID START_ID END_ID   JOB_NAME START_TS END_TS ERROR_CODE ERROR_MESSAGE
---------- ---------- ------------ ---------- ----------- --------- -------- -------- -------- -------- ------ ---------- -------------
      3053 ME_DBA     PX_TEST_TASK UNASSIGNED                       20120100 20120101 
      3054 ME_DBA     PX_TEST_TASK UNASSIGNED                       20120101 20120102 
        ...
      4017 ME_DBA     PX_TEST_TASK UNASSIGNED                       20140821 20140822 

 965 rows selected 

Then we run our parallel task thus, with each job executing the helper package and working on an individual partition:

set serveroutput on
DECLARE
  l_task     VARCHAR2(24) := 'PX_TEST_TASK';
  l_sql_stmt VARCHAR2(1000);
BEGIN
  --
  l_sql_stmt := q'{begin sid_data_pkg.sid_ipt ('REMOTE_PX_TEST','ME_DBA','DT','D1',:start_id,:end_id); end;}';
  --
  DBMS_PARALLEL_EXECUTE.RUN_TASK(l_task, l_sql_stmt, DBMS_SQL.NATIVE,parallel_level => 16);
  --
  dbms_output.put_line(DBMS_PARALLEL_EXECUTE.TASK_STATUS(l_task));
  --
END;
/

This executed in 2 minutes and returned code 6, which is DBMS_PARALLEL_EXECUTE.FINISHED, i.e. completed without error.

The status of individual chunks can be checked via DBA_PARALLEL_EXECUTE_CHUNKS.
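
For example, a quick summary of chunks by status; and once we’re happy, the task metadata can be cleaned up with DBMS_PARALLEL_EXECUTE.DROP_TASK:

select status, count(*)
from   dba_parallel_execute_chunks
where  task_name = 'PX_TEST_TASK'
group  by status;

exec DBMS_PARALLEL_EXECUTE.DROP_TASK('PX_TEST_TASK')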

PL/SQL Challenge Q4 Playoff

Do you play the PL/SQL Challenge?

The results of the Q4 2010 Playoff have been published.

It was definitely tough.

When the answers were published, I thought I had a good chance of sneaking into the Top 10, but I fluked top spot.

I was so chuffed when I saw the preliminary results at the weekend. Couldn’t believe it. Made my weekend. And I didn’t even realise there was a prize – bonus.

It’s interesting to see the results of the past playoffs as well.
There are quite a few consistent performers in there.

A special mischievous thought, though, goes out to Gary Myers who, at the height of Ashes fever, enjoyed the Q3 playoff results from an Aussie vs UK slant. So now that’s both the Ashes and the Q4 Playoff. Sorry Gary ;)

(Who am I kidding, I’d swap geographic locations any day…)

No doubt mine will be a typical British response though: failure to qualify for the next playoff is virtually guaranteed.

I’m thinking this achievement would look good listed on a CV under Hobbies :)

But it’s good stuff, you should do it.
