Blocking Prepared XA Transaction

There was an oracle-l thread last month about blocking sessions which could not be identified.

I replied back about a very niche scenario which I’d seen a few times before. And it’s just happened again so I thought it would be useful to just post some evidence on this.

Here are a couple of articles posted previously on XA transactions:

First, a reminder, from the posts above, that a transaction doesn’t have to have a session.

Normal “vanilla” sessions, there’s a 1:1 relationship. But with an XA transaction, a session can attach and detach – but only one session can be attached to a transaction at any one time.

And a bit of context about these XA transactions.

This is a complex Java application with multiple resources participating in the XA transaction. There was HornetQ involved and HornetQ was apparently timing out and was bounced, the taking down being down probably with a kill -9.

And an XA transaction does what is called a 2-phase commit where the first phase involves the transaction co-ordinator instructing the participating resources to prepare to commit and subsequently the second phase is to commit.

What I have seen many times on this application is that something happens to the transaction co-ordinator in between the PREPARE and the COMMIT and we are left we an orphaned transaction.

This transaction will eventually time-out – depending on the properties set on the transaction by the co-ordinator – at which point it will become an in-doubt transaction and be visible in dba_2pc_pending.

Back to reality… there is nothing in DBA_2PC_PENDING:

select * from dba_2pc_pending;

no rows selected

But we can see there is a transaction in prepared state in V$TRANSACTION, a transaction which started yesterday afternoon:

select * from v$transaction where status = 'PREPARED';
ADDR                 XIDUSN    XIDSLOT     XIDSQN     UBAFIL     UBABLK     UBASQN     UBAREC STATUS           START_TIME           START_SCNB START_SCNW START_UEXT START_UBAFIL START_UBABLK START_UBASQN START_UBAREC SES_ADDR               FLAG SPACE RECURSIVE NOUNDO PTX NAME                                                                                                                                                                                                                                                             PRV_XIDUSN PRV_XIDSLT PRV_XIDSQN PTX_XIDUSN PTX_XIDSLT PTX_XIDSQN     DSCN-B     DSCN-W  USED_UBLK  USED_UREC     LOG_IO     PHY_IO     CR_GET  CR_CHANGE START_DATE            DSCN_BASE  DSCN_WRAP  START_SCN DEPENDENT_SCN XID              PRV_XID          PTX_XID        
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------- -------------------- ---------- ---------- ---------- ------------ ------------ ------------ ------------ ---------------- ---------- ----- --------- ------ --- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------------------- ---------- ---------- ---------- ------------- ---------------- ---------------- ----------------
0000004A833D6868        691         32     315541         22    1085802     -28624         18 PREPARED         01/19/18 15:39:45    3454176189       2913         13           22      1085802       -28624            9 0000004B82E584A0    4201987 NO    NO        NO     NO                                                                                                                                                                                                                                                                            0          0          0          0          0          0          0          0          1         12     -40287       -163  -66597824      -1385 19-JAN-2018 15:39:45          0          0    1.3E+13             0 B302200095D00400 0000000000000000 0000000000000000 


And this is the script which I mentioned on the oracle-l thread which is one I use to see what locks transaction are holding, when the transaction started and what sessions are attached:

select s.machine
,      lo.inst_id
,      lo.object_id
,      lo.session_id
,      lo.os_user_name
,      lo.process
,      ob.owner
,      ob.object_name
,      ob.subobject_name
,      tx.addr
,      tx.start_time txn_start_time
,      tx.status
,      tx.xid
,      s.*
from   gv$locked_object lo
,      dba_objects      ob
,      gv$transaction    tx
,      gv$session        s
where  ob.object_id = lo.object_id
and    tx.xidusn    (+) = lo.xidusn
and    tx.xidslot   (+) = lo.xidslot
and    tx.xidsqn    (+) = lo.xidsqn
and    s.taddr      (+) = tx.addr
order by txn_start_time, session_id, object_name;

For privacy reasons and as this is a real-world situation and not an isolated test case, I won’t share the output of the script.

But it shows that:

  • the transaction is holding mode 6 exclusive TX row locks on a number of objects
  • that the transaction is in PREPARED
  • and that there is no session attached to the transaction (although v$locked_object does tell us what the SID was when it was there)

Now, from the application perspective, something has apparently rolled back a message perhaps because now HornetQ has been bounced, everything is back up and running and it seems like the message that resulted in our orphaned transaction is being reattempted and is being blocked by the exclusive locks still being held.

From an ACTIVE SESSION HISTORY perspective, this is what we see from this script from which I’ve removed columns for brevity and privacy but left them in so you can see what I run normally:

select count(*) over (partition by h.sample_time) sess_cnt
--,      h.user_id
--,      (select username from dba_users u where u.user_id = h.user_id) u, h.service_hash
,      xid--, sample_id
, sample_time, session_state, session_id, session_serial#,--sql_id,
sql_exec_id, sql_exec_start, event, --p1,
mod(p1,16), blocking_session,blocking_session_serial#--, current_obj#
--,      (select object_name||' - '||subobject_name from dba_objects where object_id = current_obj#) obj
--,      (select sql_fulltext from v$sql s where s.sql_id = h.sql_id and rownum = 1) sqltxt
--,      (select sql_text from dba_hist_sqltext s where s.sql_id = h.sql_id and rownum = 1) sqltxt
--, h.*
from   v$active_session_history h
where event = 'enq: TX - row lock contention'
order by h.sample_id desc;
XID              SESSION_STATE SESSION_ID SESSION_SERIAL# EVENT                                                            MOD(P1,16) BLOCKING_SESSION BLOCKING_SESSION_SERIAL#
---------------- ------------- ---------- --------------- ---------------------------------------------------------------- ---------- ---------------- ------------------------
4F021700A3C00600 WAITING              232           53035 enq: TX - row lock contention                                             6
FC010B00196E0A00 WAITING              471            5205 enq: TX - row lock contention                                             6
670320004FA50300 WAITING             2652           11791 enq: TX - row lock contention                                             6
640204005BA40500 WAITING             4300           49665 enq: TX - row lock contention                                             6

So, you can see we have four sessions being blocked on exclusive mode 6 row level locks and that the blocking session is null. If I showed you the full history then you would see that these sessions have been repeatedly trying for many hours.

BTW, this is RAC but all these sessions are intentionally on the same instance so there’s none of that jiggery-pokery involved.

I imagine at some point there was an internal conversation in Oracle about whether to report blocking session or blocking transaction.

At this point, it’s just a normal lock held by a “normal” transaction which hasn’t committed yet and actually the application just keeps trying to run the transaction waits for 1 minute until they hit the default distributed transaction timeout:

ORA-02049: timeout: distributed transaction waiting for lock

which will be logged somewhere obscure in the application logs – and there’ll be a brief pause and then it starts all over again.

Anyway at this point what to do?

At this point, the transaction hasn’t timed out.

In a few hours, the transaction will time out and become an in-doubt transaction.

Once it does, the application will receive a message:

ORA-01591: lock held by in-doubt distributed transaction

At which time, it can be manually resolved.

Typically, this is what we usually do:

begin
for txn in (select local_tran_id from dba_2pc_pending)
loop
-- if txn is not already forced rollback
execute immediate 'rollback force '''||txn.local_tran_id||'''';
commit;
dbms_transaction.purge_lost_db_entry(txn.local_tran_id);
commit;
end loop;
end;
/

If we were going to intervene right now and resolve it, what could we do?

We could bounce the database. Then the application would become in-doubt and see above.

But bouncing the database is quite drastic.

A lot of the advice in the manual is about what to do once it is IN-DOUBT. Some of it might work now – not sure.

What I advise is that if we know we want to rollback or commit this transaction now and we don’t want to wait for it to become IN-DOUBT which often we can’t wait for then programatically we can attach to the transaction using DBMS_XA and do something with it.

First of all, we need some information about the transaction.
CAVEAT DBA!!! This isn’t exactly well documented but I have found that what we tend to need is to identify the transactions in V$GLOBAL_TRANSACTION which are in COLLECTING state:

select state
,      UTL_RAW.CAST_TO_BINARY_INTEGER (globalid)
,      UTL_RAW.CAST_TO_BINARY_INTEGER (branchid)
,      t.* 
from v$global_transaction t where state = 'COLLECTING';
STATE                                  UTL_RAW.CAST_TO_BINARY_INTEGER(GLOBALID) UTL_RAW.CAST_TO_BINARY_INTEGER(BRANCHID)   FORMATID GLOBALID                                                                                                                         BRANCHID                                                                                                                           BRANCHES   REFCOUNT PREPARECOUNT STATE                                       FLAGS COUPLING      
-------------------------------------- ---------------------------------------- ---------------------------------------- ---------- -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- ---------- ---------- ------------ -------------------------------------- ---------- ---------------
COLLECTING                                                                   49                                       45     131075 312D2D35363832376566363A393236643A35613562363664363A633738353036                                                                 2D35363832376566363A393236643A35613562363664363A633738353065                                                                              1          1            1 COLLECTING                                      1 TIGHTLY COUPLED 


Then we can replace the formatid, global id and branch id in the script below. Whether you require numbers or raw depends on the signature to DBMS_XA_XID – see documentation.

set serveroutput on
DECLARE
l_xid     DBMS_XA_XID :=
DBMS_XA_XID(131075,
'312D2D35363832376566363A393236643A35613562363664363A633738353036',
'2D35363832376566363A393236643A35613562363664363A633738353065');
l_return  PLS_INTEGER;
BEGIN
l_return := SYS.dbms_xa.xa_rollback(xid =>  l_xid);
dbms_output.put_line(l_return);
END;
/

This approach above comes with no guarantees.
But it has worked for me several times in the past.

Advertisements

Remote Surprise

Here is an example of surprising behaviour from a remote DB from an OTN forum thread

Setup a link to a remote DB (I’ve used an actual remote DB and not tested a loopback)

Remote DB:

create table t1
(col1  varchar2(1));

Local DB:

create or replace view v1 
as 
select count(*) c1 from t1@l1;

Then alternate variations on this sequence of events:

1. On local DB execute SELECT:

SELECT * FROM v1;

2. On remote DB execute

begin
insert into t1 values ('a');
commit;
end;
/

3. On local DB execute SELECT:

SELECT * FROM v1;

4. On local DB execute SELECT:

SELECT * FROM v1;

OR

1. On local DB execute SELECT:

SELECT * FROM v1;

2. On remote DB execute

begin
delete from t1; 
commit;
end;
/

3. On local DB execute:

SELECT * FROM v1;

4. On local DB execute:

SELECT * FROM v1;

You may have to execute several times to catch the “inconsistency” but between the remote INSERT or DELETE and the local SELECT, you should see the odd “old” result which is corrected on the subsequent execution.

For example, on local DB with nothing happening on remote DB between the two SELECTS:

SQL> select * from v1;

        C1
----------
         6

SQL> select * from v1;

        C1
----------
         7

The behaviour is documented.


no practical way exists to keep SCNs in a distributed system absolutely synchronized: a window always exists in which one node may have an SCN that is somewhat in the past with respect to the SCN of another node.

Because of the SCN gap, you can execute a query that uses a slightly old snapshot, so that the most recent changes to the remote database are not seen. In accordance with read consistency, a query can therefore retrieve consistent, but out-of-date data.

And two workarounds documented:

You can use the following techniques to ensure that the SCNs of the two systems are synchronized just before a query:

Because SCNs are synchronized at the end of a remote query, precede each remote query with a dummy remote query to the same site, for example, SELECT * FROM DUAL@REMOTE.

Because SCNs are synchronized at the start of every remote transaction, commit or roll back the current transaction before issuing the remote query.

Fun with Distributed Transactions II

Short on examples, long on words…

Last week I did a quick post about a couple of the more obscure implications of using distributed transactions, in particular:

  • The current impact on subquery materialisations
  • The incompatibility with MVs
  • The unusual situation of transactions with locks but no sessions

Anyway, earlier today I was having a closer look at some “DFS lock handle” wait events in an 11gR1 RAC database (not because there was an obvious problem but because I saw a few of them an wanted to have a closer look).

On a DFS lock handle wait, you have to decode P1 to find out what it’s all about as discussed by Riyaj Shamsudeen.

In my case, these were mostly DX and BB enqueues which are related to my old friend the distributed transaction.

Why am I still banging on about distributed transactions?

Well, you might think that they’re not very common but in the JDBC world they seem to be everywhere.

XA transactions are used (overused) throughout the JDBC world and so they might well be very relevant to any database that you’ve got with a JDBC app sitting atop.

A common pattern seen is to take some message off a queue, do something related to that message in the database, and use an XA transaction so that both stand or fail together.

In my previous post, I mentioned an AskTom thread discussing Materialized Views and distributed transactions.

I added some thoughts to that thread and Tom’s observation was

that the java/jdbc world for some reasons wishes to use an external resource manager so they have to do two phase commits against a single database isn’t what I was talking about. when I’ve seen the XA stuff – there is typically *one* database involved and it makes everything really complex,
hard to understand and slower than it needs to be.

Which might well be a fair point but still doesn’t change the fact that if your insert/update/delete comes in on an XA, then being unable to use a fast refresh on commit MV is really quite a restriction.

Anyway, back to my observations on these enqueues.

There’s not much information related to the BB enqueue but I believe that it’s related to the coordination of global transactions on a RAC cluster.

From 11gR1 distributed transactions can be processed on any instance in the cluster and these BB enqueues seem to be part of that picture. Prior to this change, any branches of a distributed transaction had to execute on the same node.

Whilst we’re on the subject, inevitably there are some bugs around these, in particular the BB enqueue and GTX processes (processes introduced to manage these enhanced cross cluster distributed transaction features).

The DX enqueue is perhaps more interesting.

It looks after “tightly coupled distributed transactions”.

So, WTF are tightly coupled distributed transactions and how do they differ from loosely coupled distributed transactions?

Tightly coupled transactions enable other branches of the transaction – and from our database perspective we’re only talking about multiple sessions in the same database which are part of the same distributed transaction – to:

  • share each others locks and
  • see each others changes.

The DX enqueue helps manage this.

As part of this, what the DX enqueue does is make sure that only one transaction branch is actively executing SQL at any one time.

So that’s potentially a pretty hefty point of serialisation then.

For more information, the Oracle whitepaper “XA and Oracle controlled Distributed Transactions” linked to below is a really good resource particularly the section “Distributed Transactions and Database Locking”.

However… just to emphasise the point …

In a tightly coupled tranasction, the DX enqueue is obtained before executing any statement.

By contrast, loosely coupled tranasctions do not need to get this DX lock before executing a statement, i.e. no serialisation between different transaction branches.

As the developer’s guide below says “loosely coupled transaction branches result in greater concurrency.”

So, not only is there an overhead to XA transactions but there is an additional overhead to tightly coupled transactions.

And, how many applications really use multiple transaction branches in a single database.

Very, very few, I wager!

And if they do, how many of those applications appreciate the serialisation involved anyway?

Now depending on your version, you might not see the DX and BB enqueues in your enqueue statistics.

In 11.1.0.7 I don’t seem to see anything in V$ENQUEUE_STAT for DX or BB.

There are however some relevant session/system statistic buckets:

  • DX/BB enqueue lock foreground requests
  • DX/BB enqueue lock foreground wait time
  • DX/BB enqueue lock background gets
  • DX/BB enqueue lock background get time

For more information see:

Fun with distributed transactions

Three items within:
Part 1 – Subquery materialisation not possible (fixed in 12c)
Part 2 – Materialised views
Part 3 – Lock scripts

Part 1 – Subquery materialisation not possible
Yesterday I was reminder of a familiar issue, officially listed as a bug, that can occur when combining subquery materialisation with distributed transactions.

I had previously tuned a query for a colleague, the best approach for which was using implicit materialised subquery factoring (implicit because it wasn’t hinted as /*+ materalize */ but was referenced multiple times in the main query) and I had provided such a solution, without even a thought to the performance threat that the query might be running in an XA transaction.

As a result, unsurprisingly with the benefit of hindsight, the query was not performing as expected from the application in said XA transaction.

Let me provide a refresher on the issue.

create table t1
as
select rownum col1
from   dual
connect by rownum <= 100000;

alter session set statistics_level = all;

with x as  
(select count(*) col1  
 from   t1)  
select x1.*, x2.*
from x x1, x x2;  

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

provides

PLAN_TABLE_OUTPUT 
---------------------------------------------------------------------------------------------
SQL_ID  aqsc5m0wtj7wn, child number 0
-------------------------------------                              
with x as   (select count(*) col1    from   t1)   select x1.*, x2.*
from x x1, x x2

Plan hash value: 1115620710 

---------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                        | Starts | E-Rows | A-Rows |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                             |      1 |        |      1 |
|   1 |  TEMP TABLE TRANSFORMATION |                             |      1 |        |      1 |
|   2 |   LOAD AS SELECT           |                             |      1 |        |      1 |
|   3 |    SORT AGGREGATE          |                             |      1 |      1 |      1 |
|   4 |     TABLE ACCESS FULL      | T1                          |      1 |    122K|    100K|
|   5 |   MERGE JOIN CARTESIAN     |                             |      1 |      1 |      1 |
|   6 |    VIEW                    |                             |      1 |      1 |      1 |
|   7 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6717_BCA95716 |      1 |      1 |      1 |
|   8 |    BUFFER SORT             |                             |      1 |      1 |      1 |
|   9 |     VIEW                   |                             |      1 |      1 |      1 |
|  10 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6717_BCA95716 |      1 |      1 |      1 |
---------------------------------------------------------------------------------------------

Note
-----  
   - dynamic sampling used for this statement
                  

 27 rows selected 

As you might expect, subquery materialisation using a temp table.

But if you execute the query within a distributed transaction, it will probably not use a temporary table (see original post above about why I say probably).

I’m using a simple insert script over a db link to start the distributed transaction but in my real world occurrence it was the JDBC XA driver.

drop table t2;

create table t2
(col1 number);

alter system flush shared_pool;

select count(*) from gv$sql where sql_text like 'with x%';

alter session set statistics_level = all;

insert into t2@mydba values(1);

with x as  
(select count(*) col1  
 from   t1)  
select x1.*, x2.*
from x x1, x x2;  

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

rollback;

Which gives:

PLAN_TABLE_OUTPUT  
---------------------------------------------------------------------------------------
SQL_ID  aqsc5m0wtj7wn, child number 0
-------------------------------------
with x as   (select count(*) col1    from   t1)   select x1.*, x2.*
from x x1, x x2

Plan hash value: 3433669518

---------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |      1 |00:00:00.08 |     314 |
|   1 |  NESTED LOOPS        |      |      1 |      1 |      1 |00:00:00.08 |     314 |
|   2 |   VIEW               |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   3 |    SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   4 |     TABLE ACCESS FULL| T1   |      1 |    122K|    100K|00:00:00.01 |     157 |
|   5 |   VIEW               |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   6 |    SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   7 |     TABLE ACCESS FULL| T1   |      1 |    122K|    100K|00:00:00.01 |     157 |
--------------------------------------------------------------------------------------- 

Note
-----
   - dynamic sampling used for this statement


 24 rows selected 

As mentioned, the distributed transaction bypasses materialisation which could have a very negative effect on the performance of a particular query where that materialisation was key.

Part II – Materialized views.

This trouble with distributed transactions reminds me of a particular bugbear with materialised views.

This restriction is listed here in the documentation
http://docs.oracle.com/cd/E11882_01/server.112/e10706/repmview.htm

I’m surprised that a bigger deal is not made of it, given the proliferation of XA drivers and distributed transactions in many enterprise applications.

This is demonstrated quite easily, again using a db link to do the distributed transaction.

drop table t2;

create table t2
(col1 number
,constraint pk_t2 primary key(col1));

create materialized view log on t2;

create materialized view mv2
build immediate
refresh fast on commit
as
select * from t2;

insert into t2@mydba values (1);

commit;

Which, on commit, gives:

SQL Error: ORA-02050: transaction 26.8.810414 rolled back, some remote DBs may be in-doubt
ORA-02050: transaction 26.8.810414 rolled back, some remote DBs may be in-doubt
ORA-02051: another session in same transaction failed
ORA-02063: preceding 2 lines from MYDBA
02050. 00000 -  "transaction %s rolled back, some remote DBs may be in-doubt"
*Cause:    network or remote failure in 2PC.
*Action:   Notify operations; remote DBs will automatically re-sync when the
           failure is repaired.

I keep meaning to double check that the same applies when using XA driver and Java but there’s no reason to think it wouldn’t be the same.

Tom Kyte has mentioned this several times on AskTom, “recently” here.

I’m not quite sure I get his comment about OLTP databases and distributed transactions but, not for the first time, I’ve found myself wanting to use MVs to mitigate certain performance problems and have run into this restriction

Part III – Locking scripts
There are a fair number of scripts out there on t’internet for showing / diagnosing locking.

During my special distributed transaction day yesterday, I was reminded that anything lock related that starts with V$SESSION or depends on / makes assumptions about SID/SESSION_ID can be misleading.

I initially thought something a little unusual was up because neither V$SESSION or ASH could tell me who was blocking a bunch of requested mode 6 TX locks. But I thought some of the blocking diagnosis was missing because of cross-node RAC complications.

But it actually was because there were transactions holding locks but no sessions.

Admittedly, this was a slightly strange situation, but essentially what happened was that an XA JDBC transaction that was also mixing with ActiveMQ ended up causing what I suppose would essentially be orphaned transactions, and locks held by transactions that had no associated sessions.

This may be an “exotic” situation but V$LOCKED_OBJECT, for example, exposes SESSION_ID but no SESSION_SERIAL# so itself can be very misleading if you go off joining willy nilly by SESSION_ID because SIDs get reused quickly.

Similarly, DBA_BLOCKERS and DBA_WAITERS both showed zilch.

A more reliable starting point was to ignore V$LOCKED_OBJECT.SESSION_ID and pay more attention to the transction information, joining to V$TRANSACTION and then outer joining to V$SESSION, like so:

select lo.inst_id
,      lo.object_id
,      lo.session_id
,      lo.os_user_name
,      lo.process
,      lo.locked_mode
,      ob.owner
,      ob.object_name
,      tx.addr
,      tx.start_time txn_start_time
,      tx.status
,      s.*
from   gv$locked_object lo
,      dba_objects      ob
,      gv$transaction    tx
,      gv$session        s
where  ob.object_id = lo.object_id
and    tx.xidusn    (+) = lo.xidusn
and    tx.xidslot   (+) = lo.xidslot
and    tx.xidsqn    (+) = lo.xidsqn
and    s.taddr      (+) = tx.addr; 

Which showed me, as mentioned, a whole bunch of locks in these orphaned, PREPARED distributed transactions dating back a few hours.