SQL utils using XML

You may have previously seen a short post I did on a SQL statement to identify which statements are using dynamic sampling.

If not, quick recap:

SELECT p.sql_id, t.val
FROM   v$sql_plan p
,      xmltable('for $i in /other_xml/info
                 where $i/@type eq "dynamic_sampling"
                 return $i'
                passing xmltype(p.other_xml)
                columns attr varchar2(50) path '@type',
                        val  varchar2(50) path '/') t
WHERE  p.other_xml IS NOT NULL;

This uses the incredibly powerful XMLTABLE functionality, there’s so much that can be done with it.

Here are a couple of other utilities I used recently which also highlight the powerful convenience of SQL and XML.

First up, I don’t know if this is useful to anyone but I had a crappy refresh script which should have been creating table partitions with SEGMENT CREATION DEFERRED but wasn’t.

So there was a reasonable amount of space wastage caused by empty segments.

How to identify? See below.

Could be combined with DBMS_SPACE_ADMIN.DROP_EMPTY_SEGMENTS to clean up?

WITH subq_pos_empty AS
(SELECT t.table_owner
 ,      t.table_name
 ,      t.partition_name
 ,      x.cnt
 FROM   dba_segments         s
 ,      dba_tab_partitions   t
 ,      xmltable('for $i in /ROWSET/ROW/CNT
                  return $i'
                 passing xmltype(
                          dbms_xmlgen.getxml
                          ('select count(*) cnt '
                         ||'from '||t.table_owner||'.'||t.table_name||' PARTITION ('||t.partition_name||') '
                         --||'SAMPLE(.01)' -- If you want to sample to speed up unexpected large seg counts                  
                           ))
                 columns cnt number path '/') x
 WHERE  s.segment_type       = 'TABLE PARTITION'
 --AND    t.table_owner      LIKE 'XYZ%'
 AND    t.table_owner        = s.owner
 AND    t.table_name         = s.segment_name
 AND    t.partition_name     = s.partition_name
 AND    t.num_rows           = 0
 AND    t.partition_position > 1)
SELECT *
FROM   subq_pos_empty
WHERE  cnt = 0
ORDER BY 
       table_owner
,      table_name
,      partition_name;

For example:

SQL> create table t1
  2  (col1 date
  3  ,col2 number)
  4  partition by range(col1) interval (numtodsinterval(1,'DAY'))
  5  (PARTITION p0 values less than (to_Date(20130101,'YYYYMMDD')) segment creation immediate
  6  ,PARTITION p1 values less than (to_Date(20130102,'YYYYMMDD'))  segment creation immediate)
  7  ;

Table created.

SQL> exec dbms_stats.gather_table_stats(USER,'T1');

PL/SQL procedure successfully completed.

SQL> WITH subq_pos_empty AS
  2  (SELECT t.table_owner
  3   ,      t.table_name
  4   ,      t.partition_name
  5   ,      x.cnt
  6   FROM   dba_segments         s
  7   ,      dba_tab_partitions   t
  8   ,      xmltable('for $i in /ROWSET/ROW/CNT
  9                    return $i'
 10                   passing xmltype(
 11                            dbms_xmlgen.getxml
 12                            ('select count(*) cnt '
 13                           ||'from '||t.table_owner||'.'||t.table_name||' PARTITION ('||t.partition_name||') '
 14                           --||'SAMPLE(.01)' -- If you want to sample to speed up unexpected large seg counts                                      
 15                             ))
 16                   columns cnt number path '/') x
 17   WHERE  s.segment_type       = 'TABLE PARTITION'
 18   --AND    t.table_owner      LIKE 'XYZ%'
 19   AND    t.table_name         = 'T1' -- Comment out
 20   AND    t.table_owner        = s.owner
 21   AND    t.table_name         = s.segment_name
 22   AND    t.partition_name     = s.partition_name
 23   AND    t.num_rows           = 0
 24   AND    t.partition_position > 1)
 25  SELECT *
 26  FROM   subq_pos_empty
 27  WHERE  cnt = 0
 28  ORDER BY
 29         table_owner
 30  ,      table_name
 31  ,      partition_name;

TABLE_OWNER                    TABLE_NAME                     PARTITION_NAME                        CNT
------------------------------ ------------------------------ ------------------------------ ----------
PGPS_UAT1                      T1                             P1                                      0

Secondly, a helper for partitions and that nasty LONG column which can be used for partition maintenance to roll off oldest partitions:

SELECT table_name
,      partition_name
,      hi
FROM   (SELECT t.table_name
        ,      t.partition_name
        ,      t.partition_position
        ,      x.hi
        FROM   user_tab_partitions t
        ,      xmltable('for $i in /ROWSET/ROW/HI
                         return $i'
                        passing xmltype(
                                dbms_xmlgen.getxml
                                ('select high_value hi from user_tab_partitions x'
                               ||' where x.table_name     = '''||t.table_name||''''
                               ||' and   x.partition_name = '''|| t.partition_name|| ''''))
                        columns hi number path '/') x
        --WHERE  partition_position > 1
        --AND    table_name  = i_table_name
        )
--WHERE hi       <= i_date_yyyymmdd
;   

This works an awful lot more easily if you have range/interval partitioning on a number – which most people probably don’t have.

For the more normal DATE range partitioning, it’s only slightly more fiddly.

I haven’t spent too long thinking about it so there may be a better way, but I tried to avoid the deprecated EXTRACTVALUE approach:

SELECT t.table_name
,      t.partition_name
,      t.partition_position
,      to_date(x2.dt,'YYYYMMDDHH24MISS') hi
FROM   user_tab_partitions t
,      xmltable('for $i in /ROWSET/ROW/HI
                 return $i'
                passing xmltype(
                          dbms_xmlgen.getxml
                          ('select high_value hi from user_tab_partitions x'
                         ||' where x.table_name     = '''||t.table_name||''''
                         ||' and   x.partition_name = '''|| t.partition_name|| ''''))
                columns dt varchar2(4000) path '/') x
,      xmltable('for $i in /ROWSET/ROW/DT
                 return $i'
                passing xmltype(dbms_xmlgen.getxml(q'[select to_char(]'||x.dt||q'[,'YYYYMMDDHH24MISS') dt from dual]'))
                columns dt varchar2(16) path '/') x2
;

For example:

SQL> alter session set nls_date_format = 'DD-MON-YYYY HH24:MI';

SQL> SELECT t.table_name
  2  ,      t.partition_name
  3  ,      t.partition_position
  4  ,      to_date(x2.dt,'YYYYMMDDHH24MISS') hi
  5  FROM   user_tab_partitions t
  6  ,      xmltable('for $i in /ROWSET/ROW/HI
  7                   return $i'
  8                  passing xmltype(
  9                            dbms_xmlgen.getxml
 10                            ('select high_value hi from user_tab_partitions x'
 11                           ||' where x.table_name     = '''||t.table_name||''''
 12                           ||' and   x.partition_name = '''|| t.partition_name|| ''''))
 13                  columns dt varchar2(4000) path '/') x
,      xmltable('for $i in /ROWSET/ROW/DT
 14   15                   return $i'
 16                  passing xmltype(dbms_xmlgen.getxml(q'[select to_char(]'||x.dt||q'[,'YYYYMMDDHH24MISS') dt from dual]'))
 17                  columns dt varchar2(16) path '/') x2
 18  WHERE  t.table_name = 'T1';

TABLE_NAME                     PARTITION_NAME                 PARTITION_POSITION HI
------------------------------ ------------------------------ ------------------ -----------------
T1                             P0                                              1 01-JAN-2013 00:00
T1                             P1                                              2 02-JAN-2013 00:00

SQL>

Warning about the XMLTABLE approach – if running on versions less than 11.2, you may occasionally run into some ORA-00600 bugs.

Failed Logins can cause row cache lock on dc_users

One “feature” to be aware of in old-fashioned Oracle 11g – took me a while to figure it out, so just in case anyone else gets hit by it…

I changed the password of a schema in a test database to stop anything connecting whilst that schema was being logically refreshed.

Following this change, this environment started crawling but, despite being heavily used, no-one was complaining apart from me.

I couldn’t connect, just seemed to be “hanging” and for any existing connections certain statements were very slow.

I could see lots of row cache locks, the p1 of which decoded to dc_users.

Spent a while trying to figure it out, and eventually asked for a couple of second opinions on at the environment – they had no problems connecting, could see the row cache locks and eventually bounced the database.

Nothing got any better.

Another strange thing was that when looking at the sessions via ASH, they were all SYS sessions but with a session module that could not have been SYS – i.e application connections, etc.

So, that pointed at recursive SQL.

select count(*) over (partition by sample_time) sess_cnt
,      (select username from dba_users u where u.user_id = h.user_id) u
,      sample_time, session_state, session_id, sql_id
,      module, event, p1, blocking_session
from   dba_hist_active_sess_history h
where  event = 'row cache lock'
order by sample_time desc;
SESS_CNT U     SAMPLE_TIME                   SESSION_STATE SESSION_ID MODULE                  EVENT             P1 BLOCKING_SESSION
-------- ----- ----------------------------- ------------- ---------- ----------------------- ----------------- -- ----------------
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1828 sqlplus@xx (TNS V1-V3)  row cache lock     7                  
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2172 JDBC Thin Client        row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1837 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1848 JDBC Thin Client        row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1880 sqlplus@xx (TNS V1-V3)  row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1886 JDBC Thin Client        row cache lock     7             1909 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2012 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2026 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2031 JDBC Thin Client        row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2041 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2058 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2063 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2064 SQL Developer           row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2073 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2084 sqlplus@xx (TNS V1-V3)  row cache lock     7                  
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2105 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2123 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2144 JDBC Thin Client        row cache lock     7             2041 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1832 sqlplus@xx (TNS V1-V3)  row cache lock     7             2144 

 19 rows selected 
select trunc(sample_time), count(*), min(sample_time), max(sample_time)
from   dba_hist_active_Sess_history
where  event = 'row cache lock'
group by trunc(sample_time)
order by trunc(sample_time);
TRUNC(SAMPLE_TIME) COUNT(*) MIN(SAMPLE_TIME)              MAX(SAMPLE_TIME)            
------------------ -------- ----------------------------- -----------------------------
02-JUL-13                10 02-JUL-13 04.08.48.955000000  02-JUL-13 16.12.30.128000000  
03-JUL-13                 2 03-JUL-13 11.42.30.367000000  03-JUL-13 22.02.38.886000000  
04-JUL-13                 6 04-JUL-13 04.10.25.568000000  04-JUL-13 22.02.52.536000000  
05-JUL-13                16 05-JUL-13 04.09.00.444000000  05-JUL-13 22.01.13.067000000  
06-JUL-13                 2 06-JUL-13 01.06.31.261000000  06-JUL-13 14.07.15.208000000  
08-JUL-13                26 08-JUL-13 03.10.53.909000000  08-JUL-13 22.00.45.416000000  
09-JUL-13                19 09-JUL-13 04.06.55.191000000  09-JUL-13 20.23.16.801000000  
10-JUL-13             48293 10-JUL-13 04.00.41.732000000  10-JUL-13 19.00.04.750000000  
11-JUL-13              5412 11-JUL-13 09.20.40.833000000  11-JUL-13 10.41.49.610000000  

 9 rows selected 

Eventually the penny dropped …

See bug 7715339 – Logon failures causes “row cache lock” waits – Allow disable of logon delay [ID 7715339.8]

In 11g there is an intentional delay between allowing failed logon
attempts to retry. For some specific application types this can cause
a problem as the row cache entry is locked for the duration of the
delay . This can lead to excessive row cache lock waits for DC_USERS
for specific users / schemas.

After 3 successive failures a sleep delay is introduced starting
at 3 seconds and extending to 10 seconds max. During each delay
the user X row cache lock is held in exclusive mode preventing
any concurrent logon attempt as user X (and preventing any
other operation which would need the row cache lock for user X).

That explains a) the problem and b) why the impact was limited.

ORs, IN lists and LNNVL

I’ve previously written about manually rewriting an OR condition into a UNION ALL using LNNVL.

This is a description of a performance issue observed in the real world from the optimizer coming up with a CONCATENATION operation against many child operations including an INLIST operator and other children which then have to use LNNVL as a filter.

Really this is nothing more than an illustration of our old performance killing friend the function call from within SQL, particularly from within predicates, and the problem of context switching.

Ok. So, the real world example is a crazy query.
It contains 1091 OR clauses each stipulating a unique key lookup.
Yes, “1091” OR clauses.
I know, I know.
Probably itself a workaround to avoid the 1000 limit on an IN list, who knows?
Anyway.

I’m going to use a table just to illustrate the execution plan and then use some metrics from the execution of the real world example.

drop table t1;

create table t1
(col1 number
,col2 number
,col3 varchar2(10)
,constraint pk_t1 primary key (col1, col2));

insert into t1
select mod(rownum,10000)+1 col1
,      ceil(rownum/10000)  col2
,      rpad('X',10,'X')    col3
from   dual
connect by rownum <= 100000
order by col1, col2;

commit;

If we use a simple multi-column OR condition, we don’t get the desired LNNVLs.

explain plan for
select *
from   t1
where (col1 = 1
and    col2 = 1)
or    (col1 = 2
and    col2 = 2)
or    (col1 = 3
and    col2 = 3)
or    (col1 = 3
and    col2 = 4);

select * from table(dbms_xplan.display);

The optimizer just uses an INLIST iterator:

--------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |       |  1129 | 37257 |     5   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |  1129 | 37257 |     5   (0)| 00:00:01 |
|*  3 |    INDEX UNIQUE SCAN         | PK_T1 |     7 |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2 OR "COL1"=3 AND 
              "COL2"=3 OR "COL1"=3 AND "COL2"=4))

However, if we combine an IN list with some of those ORs:

explain plan for
select *
from   t1
where (col1 = 1
and    col2 = 1)
or    (col1 = 2
and    col2 = 2)
or    (col1 = 3
and    col2 IN (3,4));

select * from table(dbms_xplan.display);

Which gives

---------------------------------------------------------------------------------------
| Id  | Operation                     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |       |  1140 | 37620 |     2   (0)| 00:00:01 |
|   1 |  CONCATENATION                |       |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | T1    |    11 |   363 |     1   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN           | PK_T1 |     1 |       |     2   (0)| 00:00:01 |
|   4 |   INLIST ITERATOR             |       |       |       |            |          |
|   5 |    TABLE ACCESS BY INDEX ROWID| T1    |  1129 | 37257 |     1   (0)| 00:00:01 |
|*  6 |     INDEX UNIQUE SCAN         | PK_T1 |     1 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("COL1"=3)
       filter("COL2"=3 OR "COL2"=4)
   6 - access(("COL1"=1 AND "COL2"=1 OR "COL1"=2 AND "COL2"=2))
       filter(LNNVL("COL1"=3) OR LNNVL("COL2"=3) AND LNNVL("COL2"=4))

We get the CONCATENATION of the simpler OR conditions which are combined into a single INLIST ITERATOR operations with another child operation which applies LNNVLs in the filter predicate.

Now imagine we have a much larger number of the simpler OR predicates combined with liberal scatterings of such IN clauses throughout our 1091 OR’d predicates.

Perhaps we’d be a bit concerned about the context switching and increased cpu usage?

Looking at my real world example right here… there’s no point me pasting in the thousands of lines.

If I run the real world behemoth:

Client elapsed time 108 seconds

DBMS_XPLAN.DISPLAY_CURSOR tells me it ran in 1.26 seconds:

plan hash value: 4262066066  
 
-------------------------------------------------------------------------------------------------------------------- 
| Id  | Operation                     | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  | 
--------------------------------------------------------------------------------------------------------------------  
|   0 | SELECT STATEMENT              |                 |      1 |        |   1628 |00:00:01.26 |    3352 |    418 | 
|   1 |  CONCATENATION                |                 |      1 |        |   1628 |00:00:01.26 |    3352 |    418 |  
|   2 |   INLIST ITERATOR             |                 |      1 |        |    966 |00:00:00.94 |    2735 |    389 | 
|   3 |    TABLE ACCESS BY INDEX ROWID| XXXXXXXXXX      |    966 |    624 |    966 |00:00:00.94 |    2735 |    389 | 
|*  4 |     INDEX RANGE SCAN          | XXXXXXXXXX_UK01 |    966 |    624 |    966 |00:00:00.86 |    1915 |    380 | 
|   5 |   TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX      |      1 |      1 |      2 |00:00:00.01 |       4 |      0 |  
|*  6 |    INDEX RANGE SCAN           | XXXXXXXXXX_UK01 |      1 |      1 |      2 |00:00:00.01 |       3 |      0 |  
....
| 253 |   TABLE ACCESS BY INDEX ROWID | XXXXXXXXXX      |      1 |      2 |     47 |00:00:00.01 |      14 |      1 |  
|*254 |    INDEX RANGE SCAN           | XXXXXXXXXX_UK01 |      1 |      2 |     47 |00:00:00.01 |       3 |      1 |   
--------------------------------------------------------------------------------------------------------------------  

Predicate Information (identified by operation id):
--------------------------------------------------- 
   4 - access(((("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722) OR             
              ("XXXXXXXXXX"."VERSION"=1 AND "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795) OR ("XXXXXXXXXX"."VERSION"=1 AND                                       
              "XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863) OR ("XXXXXXXXXX"."VERSION"=1 AND
              "XXXXXXXXXX"."ID_XXXXXXXXXX"=1612023) OR ("XXXXXXXXXX"."VERSION"=1 AND 			  .....
  6 - access "XXXXXXXXXX"."ID_XXXXXXXXXX"=2046939) 
       filter(((LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611722)) AND 
           (LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611795)) AND 
              (LNNVL("XXXXXXXXXX"."VERSION"=1) OR LNNVL("XXXXXXXXXX"."ID_XXXXXXXXXX"=1611863)) AND 
.....

DBMS_MONITOR tells me the query ran in 1.68 seconds:

Global Information
 Status              :  DONE (ALL ROWS)     
 Instance ID         :  2                   
 Session ID          :  1220                
 SQL ID              :  5fvt3tfbgmqp3       
 SQL Execution ID    :  33554432            
 Plan Hash Value     :  4262066066          
 Execution Started   :  04/05/2013 14:53:42 
 First Refresh Time  :  04/05/2013 14:53:42 
 Last Refresh Time   :  04/05/2013 14:53:45 

--------------------------------------------------------------------
| Elapsed |   Cpu   |    IO    | Cluster  | Fetch | Buffer | Reads |
| Time(s) | Time(s) | Waits(s) | Waits(s) | Calls |  Gets  |       |
--------------------------------------------------------------------
|    1.68 |    0.50 |     1.13 |     0.05 |     4 |   3352 |   418 |
--------------------------------------------------------------------

Fortunately V$SQL tells me the real story:

select sql_id, executions, rows_processed, elapsed_time/1000/1000, cpu_time/1000/1000 
from v$sql where sql_id = '5fvt3tfbgmqp3';

Gives

SQL_ID        EXECUTIONS ROWS_PROCESSED ELAPSED_TIME/1000/1000 CPU_TIME/1000/1000
------------- ---------- -------------- ---------------------- ------------------
5fvt3tfbgmqp3          1           1628              105.81511         104.581102

Then again, if you write nasty code, nasty things can happen.

Which of my sql statements are using dynamic sampling?

From a reply I gave on an OTN forum thread, how to see all queries currently in memory which use dynamic sampling?

Dynamic sampling is an attribute in V$SQL_PLAN.OTHER_XML.

In 11gR1, it says “yes”.
In 11gR2, it gives the level.

Using EXTRACTVALUE which is deprecated in 11.2:

select p.sql_id,  extractvalue(h.column_value,'/info') lvl
from   v$sql_plan p
,      table(xmlsequence(extract(xmltype(p.other_xml),'/other_xml/info'))) h
where  p.other_xml is not null
and    extractvalue(h.column_value,'/info/@type') = 'dynamic_sampling';

Another approach using one of the prefered alternatives – XMLTABLE:

select p.sql_id, t.val
from   v$sql_plan p
,      xmltable('for $i in /other_xml/info
                 where $i/@type eq "dynamic_sampling"
                 return $i'
                passing xmltype(p.other_xml)
                columns attr varchar2(50) path '@type',
                        val  varchar2(50) path '/') t
where  p.other_xml is not null;

Warning: I tried a number of approaches with XMLTABLE and whilst they all worked in 11.2.0.3, in 11.1.0.7 they all ran into the error below. It might well just be a local issue but I did not investigate:

ORA-00600: internal error code, arguments: [qctVCO:csform], [0], [0], [0], [0], [112], [2], [224], [], [], [], [] 

Fun with Distributed Transactions II

Short on examples, long on words…

Last week I did a quick post about a couple of the more obscure implications of using distributed transactions, in particular:

  • The current impact on subquery materialisations
  • The incompatibility with MVs
  • The unusual situation of transactions with locks but no sessions

Anyway, earlier today I was having a closer look at some “DFS lock handle” wait events in an 11gR1 RAC database (not because there was an obvious problem but because I saw a few of them an wanted to have a closer look).

On a DFS lock handle wait, you have to decode P1 to find out what it’s all about as discussed by Riyaj Shamsudeen.

In my case, these were mostly DX and BB enqueues which are related to my old friend the distributed transaction.

Why am I still banging on about distributed transactions?

Well, you might think that they’re not very common but in the JDBC world they seem to be everywhere.

XA transactions are used (overused) throughout the JDBC world and so they might well be very relevant to any database that you’ve got with a JDBC app sitting atop.

A common pattern seen is to take some message off a queue, do something related to that message in the database, and use an XA transaction so that both stand or fail together.

In my previous post, I mentioned an AskTom thread discussing Materialized Views and distributed transactions.

I added some thoughts to that thread and Tom’s observation was

that the java/jdbc world for some reasons wishes to use an external resource manager so they have to do two phase commits against a single database isn’t what I was talking about. when I’ve seen the XA stuff – there is typically *one* database involved and it makes everything really complex,
hard to understand and slower than it needs to be.

Which might well be a fair point but still doesn’t change the fact that if your insert/update/delete comes in on an XA, then being unable to use a fast refresh on commit MV is really quite a restriction.

Anyway, back to my observations on these enqueues.

There’s not much information related to the BB enqueue but I believe that it’s related to the coordination of global transactions on a RAC cluster.

From 11gR1 distributed transactions can be processed on any instance in the cluster and these BB enqueues seem to be part of that picture. Prior to this change, any branches of a distributed transaction had to execute on the same node.

Whilst we’re on the subject, inevitably there are some bugs around these, in particular the BB enqueue and GTX processes (processes introduced to manage these enhanced cross cluster distributed transaction features).

The DX enqueue is perhaps more interesting.

It looks after “tightly coupled distributed transactions”.

So, WTF are tightly coupled distributed transactions and how do they differ from loosely coupled distributed transactions?

Tightly coupled transactions enable other branches of the transaction – and from our database perspective we’re only talking about multiple sessions in the same database which are part of the same distributed transaction – to:

  • share each others locks and
  • see each others changes.

The DX enqueue helps manage this.

As part of this, what the DX enqueue does is make sure that only one transaction branch is actively executing SQL at any one time.

So that’s potentially a pretty hefty point of serialisation then.

For more information, the Oracle whitepaper “XA and Oracle controlled Distributed Transactions” linked to below is a really good resource particularly the section “Distributed Transactions and Database Locking”.

However… just to emphasise the point …

In a tightly coupled tranasction, the DX enqueue is obtained before executing any statement.

By contrast, loosely coupled tranasctions do not need to get this DX lock before executing a statement, i.e. no serialisation between different transaction branches.

As the developer’s guide below says “loosely coupled transaction branches result in greater concurrency.”

So, not only is there an overhead to XA transactions but there is an additional overhead to tightly coupled transactions.

And, how many applications really use multiple transaction branches in a single database.

Very, very few, I wager!

And if they do, how many of those applications appreciate the serialisation involved anyway?

Now depending on your version, you might not see the DX and BB enqueues in your enqueue statistics.

In 11.1.0.7 I don’t seem to see anything in V$ENQUEUE_STAT for DX or BB.

There are however some relevant session/system statistic buckets:

  • DX/BB enqueue lock foreground requests
  • DX/BB enqueue lock foreground wait time
  • DX/BB enqueue lock background gets
  • DX/BB enqueue lock background get time

For more information see:

RAC – It’s all about the node affinity

Or what happens when you don’t have node affinity.

Here’s an old, short post that was lying around unpublished.

SQL executed on the 14th on node 2, on 15th on node 1.

DT SQL ID Node Rows Execs Elapsed Cluster Wait Time
15-NOV 7287jpw5vtm8j 1 4129697 7591 32625 32175
14-NOV 7287jpw5vtm8j 2 19703872 30909 78 60

Timings in seconds.

32625 seconds is just over 9 hours, of which 32175 is cluster wait time.

Database services and application datasource config have now been reconfigured to the extent that there is node affinity and the data required for the workload on node 1 is completely segregated from the workload on node 2.

If you use RAC as a black box and you have problems with performance or problems with predictability of performance then look into whether node affinity, or a lack thereof, is your problem.

Fun with distributed transactions

Three items within:
Part 1 – Subquery materialisation not possible (fixed in 12c)
Part 2 – Materialised views
Part 3 – Lock scripts

Part 1 – Subquery materialisation not possible
Yesterday I was reminder of a familiar issue, officially listed as a bug, that can occur when combining subquery materialisation with distributed transactions.

I had previously tuned a query for a colleague, the best approach for which was using implicit materialised subquery factoring (implicit because it wasn’t hinted as /*+ materalize */ but was referenced multiple times in the main query) and I had provided such a solution, without even a thought to the performance threat that the query might be running in an XA transaction.

As a result, unsurprisingly with the benefit of hindsight, the query was not performing as expected from the application in said XA transaction.

Let me provide a refresher on the issue.

create table t1
as
select rownum col1
from   dual
connect by rownum <= 100000;

alter session set statistics_level = all;

with x as  
(select count(*) col1  
 from   t1)  
select x1.*, x2.*
from x x1, x x2;  

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

provides

PLAN_TABLE_OUTPUT 
---------------------------------------------------------------------------------------------
SQL_ID  aqsc5m0wtj7wn, child number 0
-------------------------------------                              
with x as   (select count(*) col1    from   t1)   select x1.*, x2.*
from x x1, x x2

Plan hash value: 1115620710 

---------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                        | Starts | E-Rows | A-Rows |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                             |      1 |        |      1 |
|   1 |  TEMP TABLE TRANSFORMATION |                             |      1 |        |      1 |
|   2 |   LOAD AS SELECT           |                             |      1 |        |      1 |
|   3 |    SORT AGGREGATE          |                             |      1 |      1 |      1 |
|   4 |     TABLE ACCESS FULL      | T1                          |      1 |    122K|    100K|
|   5 |   MERGE JOIN CARTESIAN     |                             |      1 |      1 |      1 |
|   6 |    VIEW                    |                             |      1 |      1 |      1 |
|   7 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6717_BCA95716 |      1 |      1 |      1 |
|   8 |    BUFFER SORT             |                             |      1 |      1 |      1 |
|   9 |     VIEW                   |                             |      1 |      1 |      1 |
|  10 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6717_BCA95716 |      1 |      1 |      1 |
---------------------------------------------------------------------------------------------

Note
-----  
   - dynamic sampling used for this statement
                  

 27 rows selected 

As you might expect, subquery materialisation using a temp table.

But if you execute the query within a distributed transaction, it will probably not use a temporary table (see original post above about why I say probably).

I’m using a simple insert script over a db link to start the distributed transaction but in my real world occurrence it was the JDBC XA driver.

drop table t2;

create table t2
(col1 number);

alter system flush shared_pool;

select count(*) from gv$sql where sql_text like 'with x%';

alter session set statistics_level = all;

insert into t2@mydba values(1);

with x as  
(select count(*) col1  
 from   t1)  
select x1.*, x2.*
from x x1, x x2;  

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

rollback;

Which gives:

PLAN_TABLE_OUTPUT  
---------------------------------------------------------------------------------------
SQL_ID  aqsc5m0wtj7wn, child number 0
-------------------------------------
with x as   (select count(*) col1    from   t1)   select x1.*, x2.*
from x x1, x x2

Plan hash value: 3433669518

---------------------------------------------------------------------------------------
| Id  | Operation            | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |      1 |        |      1 |00:00:00.08 |     314 |
|   1 |  NESTED LOOPS        |      |      1 |      1 |      1 |00:00:00.08 |     314 |
|   2 |   VIEW               |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   3 |    SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   4 |     TABLE ACCESS FULL| T1   |      1 |    122K|    100K|00:00:00.01 |     157 |
|   5 |   VIEW               |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   6 |    SORT AGGREGATE    |      |      1 |      1 |      1 |00:00:00.04 |     157 |
|   7 |     TABLE ACCESS FULL| T1   |      1 |    122K|    100K|00:00:00.01 |     157 |
--------------------------------------------------------------------------------------- 

Note
-----
   - dynamic sampling used for this statement


 24 rows selected 

As mentioned, the distributed transaction bypasses materialisation which could have a very negative effect on the performance of a particular query where that materialisation was key.

Part II – Materialized views.

This trouble with distributed transactions reminds me of a particular bugbear with materialised views.

This restriction is listed here in the documentation

http://docs.oracle.com/cd/E11882_01/server.112/e10706/repmview.htm

I’m surprised that a bigger deal is not made of it, given the proliferation of XA drivers and distributed transactions in many enterprise applications.

This is demonstrated quite easily, again using a db link to do the distributed transaction.

drop table t2;

create table t2
(col1 number
,constraint pk_t2 primary key(col1));

create materialized view log on t2;

create materialized view mv2
build immediate
refresh fast on commit
as
select * from t2;

insert into t2@mydba values (1);

commit;

Which, on commit, gives:

SQL Error: ORA-02050: transaction 26.8.810414 rolled back, some remote DBs may be in-doubt
ORA-02050: transaction 26.8.810414 rolled back, some remote DBs may be in-doubt
ORA-02051: another session in same transaction failed
ORA-02063: preceding 2 lines from MYDBA
02050. 00000 -  "transaction %s rolled back, some remote DBs may be in-doubt"
*Cause:    network or remote failure in 2PC.
*Action:   Notify operations; remote DBs will automatically re-sync when the
           failure is repaired.

I keep meaning to double check that the same applies when using XA driver and Java but there’s no reason to think it wouldn’t be the same.

Tom Kyte has mentioned this several times on AskTom, “recently” here.

I’m not quite sure I get his comment about OLTP databases and distributed transactions but, not for the first time, I’ve found myself wanting to use MVs to mitigate certain performance problems and have run into this restriction

Part III – Locking scripts
There are a fair number of scripts out there on t’internet for showing / diagnosing locking.

During my special distributed transaction day yesterday, I was reminded that anything lock related that starts with V$SESSION or depends on / makes assumptions about SID/SESSION_ID can be misleading.

I initially thought something a little unusual was up because neither V$SESSION or ASH could tell me who was blocking a bunch of requested mode 6 TX locks. But I thought some of the blocking diagnosis was missing because of cross-node RAC complications.

But it actually was because there were transactions holding locks but no sessions.

Admittedly, this was a slightly strange situation, but essentially what happened was that an XA JDBC transaction that was also mixing with ActiveMQ ended up causing what I suppose would essentially be orphaned transactions, and locks held by transactions that had no associated sessions.

This may be an “exotic” situation but V$LOCKED_OBJECT, for example, exposes SESSION_ID but no SESSION_SERIAL# so itself can be very misleading if you go off joining willy nilly by SESSION_ID because SIDs get reused quickly.

Similarly, DBA_BLOCKERS and DBA_WAITERS both showed zilch.

A more reliable starting point was to ignore V$LOCKED_OBJECT.SESSION_ID and pay more attention to the transction information, joining to V$TRANSACTION and then outer joining to V$SESSION, like so:

select lo.inst_id
,      lo.object_id
,      lo.session_id
,      lo.os_user_name
,      lo.process
,      lo.locked_mode
,      ob.owner
,      ob.object_name
,      tx.addr
,      tx.start_time txn_start_time
,      tx.status
,      s.*
from   gv$locked_object lo
,      dba_objects      ob
,      gv$transaction    tx
,      gv$session        s
where  ob.object_id = lo.object_id
and    tx.xidusn    (+) = lo.xidusn
and    tx.xidslot   (+) = lo.xidslot
and    tx.xidsqn    (+) = lo.xidsqn
and    s.taddr      (+) = tx.addr; 

Which showed me, as mentioned, a whole bunch of locks in these orphaned, PREPARED distributed transactions dating back a few hours.

Follow

Get every new post delivered to your Inbox.

Join 75 other followers