Fun with distributed transactions
January 15, 2013
Three items within:
Part 1 – Subquery materialisation not possible (fixed in 12c)
Part 2 – Materialised views
Part 3 – Locking scripts
Part 1 – Subquery materialisation not possible
Yesterday I was reminded of a familiar issue, officially listed as a bug, that can occur when combining subquery materialisation with distributed transactions.
I had previously tuned a query for a colleague, the best approach for which was implicit materialised subquery factoring (implicit because it wasn't hinted with /*+ materialize */ but was referenced multiple times in the main query), and I had provided such a solution without a thought to the possibility that the query might be run in an XA transaction.
As a result, unsurprisingly with the benefit of hindsight, the query did not perform as expected when run from the application in that XA transaction.
Let me provide a refresher on the issue.
create table t1
as
select rownum col1
from   dual
connect by rownum <= 100000;

alter session set statistics_level = all;

with x as (select count(*) col1 from t1)
select x1.*, x2.*
from   x x1, x x2;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));
provides
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------
SQL_ID  aqsc5m0wtj7wn, child number 0
-------------------------------------
with x as (select count(*) col1 from t1) select x1.*, x2.* from x x1, x x2

Plan hash value: 1115620710

---------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                        | Starts | E-Rows | A-Rows |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                             |      1 |        |      1 |
|   1 |  TEMP TABLE TRANSFORMATION |                             |      1 |        |      1 |
|   2 |   LOAD AS SELECT           |                             |      1 |        |      1 |
|   3 |    SORT AGGREGATE          |                             |      1 |      1 |      1 |
|   4 |     TABLE ACCESS FULL      | T1                          |      1 |    122K|    100K|
|   5 |   MERGE JOIN CARTESIAN     |                             |      1 |      1 |      1 |
|   6 |    VIEW                    |                             |      1 |      1 |      1 |
|   7 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6717_BCA95716 |      1 |      1 |      1 |
|   8 |   BUFFER SORT              |                             |      1 |      1 |      1 |
|   9 |    VIEW                    |                             |      1 |      1 |      1 |
|  10 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6717_BCA95716 |      1 |      1 |      1 |
---------------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

27 rows selected
As you might expect, subquery materialisation using a temp table.
But if you execute the query within a distributed transaction, it will probably not use a temporary table (see original post above about why I say probably).
I'm using a simple insert over a db link to start the distributed transaction, but in my real-world occurrence it was the JDBC XA driver.
drop table t2;

create table t2
(col1 number);

alter system flush shared_pool;

select count(*) from gv$sql where sql_text like 'with x%';

alter session set statistics_level = all;

insert into t2@mydba values(1);

with x as (select count(*) col1 from t1)
select x1.*, x2.*
from   x x1, x x2;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

rollback;
Which gives:
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
SQL_ID  aqsc5m0wtj7wn, child number 0
-------------------------------------
with x as (select count(*) col1 from t1) select x1.*, x2.* from x x1, x x2

Plan hash value: 3433669518

---------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |      1 |00:00:00.08 |     314 |
|   1 |  NESTED LOOPS        |      |      1 |      1 |      1 |00:00:00.08 |     314 |
|   2 |   VIEW               |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   3 |    SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   4 |     TABLE ACCESS FULL| T1   |      1 |    122K|    100K|00:00:00.01 |     157 |
|   5 |   VIEW               |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   6 |    SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   7 |     TABLE ACCESS FULL| T1   |      1 |    122K|    100K|00:00:00.01 |     157 |
---------------------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement

24 rows selected
As mentioned, the distributed transaction bypasses materialisation, which can have a very negative effect on the performance of a query for which that materialisation was key.
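It's also easy to check whether explicit hinting makes any difference, reusing the db link setup above. Given the restriction, I wouldn't expect /*+ materialize */ to bring back the TEMP TABLE TRANSFORMATION inside the distributed transaction, but it's worth verifying on your own version:

insert into t2@mydba values(1);

with x as (select /*+ materialize */ count(*) col1 from t1)
select x1.*, x2.*
from   x x1, x x2;

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

rollback;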
Part 2 – Materialised views
This trouble with distributed transactions reminds me of a particular bugbear with materialised views.
This restriction is listed in the documentation:
http://docs.oracle.com/cd/E11882_01/server.112/e10706/repmview.htm
I’m surprised that a bigger deal is not made of it, given the proliferation of XA drivers and distributed transactions in many enterprise applications.
This is demonstrated quite easily, again using a db link to do the distributed transaction.
drop table t2;

create table t2
(col1 number
,constraint pk_t2 primary key(col1));

create materialized view log on t2;

create materialized view mv2
build immediate
refresh fast on commit
as
select * from t2;

insert into t2@mydba values (1);

commit;
Which, on commit, gives:
SQL Error: ORA-02050: transaction 26.8.810414 rolled back, some remote DBs may be in-doubt
ORA-02050: transaction 26.8.810414 rolled back, some remote DBs may be in-doubt
ORA-02051: another session in same transaction failed
ORA-02063: preceding 2 lines from MYDBA
02050. 00000 - "transaction %s rolled back, some remote DBs may be in-doubt"
*Cause:    network or remote failure in 2PC.
*Action:   Notify operations; remote DBs will automatically re-sync when the
           failure is repaired.
I keep meaning to double-check that the same applies when using the XA driver from Java, but there's no reason to think it wouldn't.
Tom Kyte has mentioned this several times on AskTom, “recently” here.
I'm not quite sure I get his comment about OLTP databases and distributed transactions but, not for the first time, I've found myself wanting to use MVs to mitigate certain performance problems and have run into this restriction.
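One workaround, if some staleness is tolerable, is to fall back to fast refresh on demand: the MV log is still used, but the refresh happens outside the distributed commit, driven by you (e.g. from a scheduler job). A sketch, using the same t2 as above:

create materialized view mv2
build immediate
refresh fast on demand
as
select * from t2;

-- refresh when it suits, rather than as part of the committing transaction
exec dbms_mview.refresh(list => 'MV2', method => 'F');

Obviously this changes the semantics – the MV lags t2 until the next refresh – so it's a trade-off, not a fix.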
Part 3 – Locking scripts
There are a fair number of scripts out there on t’internet for showing / diagnosing locking.
During my special distributed transaction day yesterday, I was reminded that anything lock related that starts with V$SESSION or depends on / makes assumptions about SID/SESSION_ID can be misleading.
I initially thought something a little unusual was up because neither V$SESSION nor ASH could tell me who was blocking a bunch of requested mode 6 TX locks. At first I put the missing blocking information down to cross-node RAC complications.
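For example, a typical first-pass query along these lines came back empty, even though sessions were clearly stuck waiting, presumably because the blockers were not sessions at all:

select sid
,      serial#
,      inst_id
,      event
,      blocking_instance
,      blocking_session
from   gv$session
where  blocking_session is not null;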
But it was actually because there were transactions holding locks with no associated sessions.
Admittedly, this was a slightly strange situation, but essentially an XA JDBC transaction, mixing with ActiveMQ, ended up causing what I suppose were orphaned transactions: locks held by transactions that had no associated sessions.
This may be an “exotic” situation, but V$LOCKED_OBJECT, for example, exposes SESSION_ID but not SESSION_SERIAL#, so it can be very misleading if you go joining willy-nilly by SESSION_ID, because SIDs get reused quickly.
Similarly, DBA_BLOCKERS and DBA_WAITERS both showed zilch.
A more reliable starting point was to ignore V$LOCKED_OBJECT.SESSION_ID and pay more attention to the transaction information, joining to V$TRANSACTION and then outer joining to V$SESSION, like so:
select lo.inst_id
,      lo.object_id
,      lo.session_id
,      lo.os_user_name
,      lo.process
,      lo.locked_mode
,      ob.owner
,      ob.object_name
,      tx.addr
,      tx.start_time txn_start_time
,      tx.status
,      s.*
from   gv$locked_object lo
,      dba_objects      ob
,      gv$transaction   tx
,      gv$session       s
where  ob.object_id = lo.object_id
and    tx.xidusn  (+) = lo.xidusn
and    tx.xidslot (+) = lo.xidslot
and    tx.xidsqn  (+) = lo.xidsqn
and    s.taddr    (+) = tx.addr;
Which showed me, as mentioned, a whole bunch of locks in these orphaned, PREPARED distributed transactions dating back a few hours.
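Depending on how the transactions got stuck, the in-doubt/prepared transactions themselves may also show up in DBA_2PC_PENDING, from where – after due diligence and agreement with whoever owns the other resource manager – they can be forced to complete. A sketch (the transaction id below is illustrative; use whatever the query reports):

select local_tran_id
,      global_tran_id
,      state
,      fail_time
from   dba_2pc_pending;

-- then, for a given transaction:
rollback force '26.8.810414';
-- or
commit force '26.8.810414';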
Re: Part II – Materialized Views
It’s worse than that. REFRESH FAST ON COMMIT fails if you access any remote object, even via a SELECT.
Example:
drop table t2;
create table t2
(col1 number
,constraint pk_t2 primary key(col1));
create materialized view log on t2;
create materialized view mv2
build immediate
refresh fast on commit
as
select * from t2;
insert into t2 values (1);  -- insert into local table

select * from t1@mydba;     -- select from remote table
commit;
The COMMIT causes:
> ORA-02050: transaction 3.14.2891669 rolled back, some remote DBs may be in-doubt
> ORA-02051: another session or branch in same transaction failed or finalized
This, in my opinion, makes REFRESH FAST ON COMMIT unusable.
It is a known Oracle bug (#1385495). It was raised in 8.1.6 and, according to Oracle Support, will be fixed in 12c.
Yep, I know it is, Kevin.
I don’t get why a bigger deal is not made of this.
More than once I’ve wanted to use fast refresh on commit MVs to mitigate the performance of reporting in a mixed workload environment and have hit this restriction.
It would be interesting to know what the technical issue is behind this incompatibility.
That bug you mentioned seems to refer to an ORA-00600 in particular circumstances and not the root incompatibility.
Pingback: Fun with Distributed Transactions II « OraStory
Note also that PARALLEL DML is not possible within a distributed transaction.
http://docs.oracle.com/cd/E11882_01/server.112/e25523/parallel003.htm#i1007101
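i.e. even with parallel DML enabled, a DML statement issued within a distributed transaction will, as I understand the documentation, simply be executed serially rather than erroring. Easy enough to see for yourself – check the resulting plan for the absence of PX operations (the parallel hint below is just illustrative):

alter session enable parallel dml;

insert into t2@mydba values(1);  -- starts the distributed transaction

insert /*+ parallel(t2 4) */ into t2
select col1 from t1;

select * from table(dbms_xplan.display_cursor(null,null,'basic'));

rollback;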
Pingback: Blocking Prepared XA Transaction | OraStory