Real Time SQL Monitoring

I’m a sucker for convenience.

I’ve always liked the /*+ GATHER_PLAN_STATISTICS */ hint when followed by

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL,NULL,'ALLSTATS LAST'));

Yes, I know you can get the same information via other means but I don’t care, I love it (having regressed to a 9i site for nearly a year and only back to a civilised Oracle version for a couple of weeks, I’m probably guilty of a little overexuberance – it’s not often that can be said about me).

There’s always been a slight conundrum though in that you need to wait for your SQL to complete before you get your execution plan and runtime statistics back.

But, in 11g, wait no more.

You’ve got REAL TIME SQL MONITORING.

It’s cool, it’s convenient. What’s not to like?

I know I’ve been back off 11g for about a year now, but how did I miss this previously?

This is old news – but how come more people don’t use it? (Other than they’re not on 11, of course).

Obviously, there are GUIs and buttons and HTML versions, etc.

Me – I like my command line and text reports :)

Anyway, just to demo an example.

I’ve pasted the text output of the report into Excel because

  1. WordPress only allows me upload a small set of filetypes and
  2. Excel was the only one that could cope nicely with the width (“never mind the quality, feel the width” as they say).

This demo was done at the end of the SQL – i.e. it wasn’t real-time monitoring but if you do it whilst the SQL is running, there’s extra progress information which comes from V$SESSION_LONGOPS – i.e. roughly how far through that HASH JOIN are we?

Also, this statement was run in parallel and it produced a really nice summary of parallel execution details.

SELECT dbms_sqltune.report_sql_monitor('8dwq85mf22bs2') 
FROM   dual;

can produce a report a bit like this
forecast_sql_monitoring

Other links:
Performance Tuning Guide
Kerry Osborne – Real time SQL Monitoring
Greg Rahn – Real-time SQl Monitoring Using…

IO Calibration

I’ve mentioned elsewhere that I’m in the early stages of an upgrade – 9.2.0.8 on 32 bit Solaris to 11.2.0.2 on 64 bit Linux.
(So, not exactly an illustration in implementing change incrementally. :) )

Initial run of application code was slow, very slow. Across the board. Some plans had changed. But average IO seems much slower than the benchmarks using for comparison.

Statspack shows that a single block read takes on average 4 ms in 9.2.0.8 production.

For 11gR2 on the new kit (which isn’t destined for production but which is at least meant to be used to provide a ballpark comparison), AWR shows average single block read times of more like 14ms.

But we’re not really comparing apples with apples. So many things are different, it’s more like apples and oranges – massive jump in Oracle version, new kit, different OS, different configurations, direct IO, etc, etc.

I have been advocating simplicity when approaching problems, especially with so many changes – divide & conquer, distill & simplify.

For example, let’s say that IO looks slow.

For now, let’s not use the application batch to test kernel settings, ODM, etc
(Don’t get me wrong – it will be important to use a workload absolutely representative of the the application, I just don’t think it might not be now).

Let’s benchmark using, for example, Oracle ORIONto see if we can achieve IO rates that someone, somewhere must know we need – they’ve purchased the kit after all – and then use that to test changes, use that to compare new environments against a benchmark, etc.
(Then, when we’re happy with simpler things, we can start adding back in and piecing things together again)

Anyway, today I came across an alternative to Orion that I can’t recall being previously aware of.

DBMS_RESOURCE_MANAGER.CALIBRATE_IO

I like – it seems pretty neat to me.

Yet another feature which seems to have sneaked under my radar somehow (along with real time SQL monitoring – a quick blog post mention which I’ve half written but not published yet).

What does this CALIBRATE_IO do?

It’s like an easier way to do the ORION thing.

From the CALIBRATE_IO.PDF available via Metalink Doc Id 727062.1 “Configuring and using Calibrate I/O”:

When Calibrate I/O is invoked it will generate I/O intensive read-only random I/O (db_block_size)
and large-block (1MByte) sequential I/O workloads. Unlike various external I/O calibration tools, this
tool uses the Oracle code stack and runs in the database, issuing I/O against blocks stored in the
database. The results, therefore, much more closely match the actual database performance.
Once the workload execution is completed, a summary of the results is provided.

I ran it like this (DBA for env is away, SysAdmin sort of suggested 15 as NUM_DISKS but when I ran with 15 initially, it ran forever and it looked like the waits weren’t instrumented properly.)

SQL> declare
  2    l_latency   integer;
  3    l_iops      integer;     
  4    l_mbps      integer; 
  5  begin    
  6    dbms_resource_manager.calibrate_io
  7    (5,10,l_iops,l_mbps,l_latency);
  8    dbms_output.put_line ('I/O Ops/sec = '||l_iops);
  9    dbms_output.put_line ('Actual Latency = '||l_latency);
 10    dbms_output.put_line ('MB/sec = '||l_mbps);
 11  end;
 12  /

And got the following output:

I/O Ops/sec = 52
Actual Latency = 18
MB/sec = 93

You can also get this information by querying DBA_RSRC_IO_CALIBRATE:

SQL> select * from dba_rsrc_io_calibrate;

START_TIME                        END_TIME                            MAX_IOPS   MAX_MBPS  MAX_PMBPS    LATENCY NUM_PHYSICAL_DISKS
--------------------------------- --------------------------------- ---------- ---------- ----------
04-MAR-11 12.55.31.610306 PM      04-MAR-11 01.08.30.898526 PM              52         93          7         18                  5

To be honest, I have no feel for some of these numbers – no frame of reference.
But it’s not great, right?

Some other articles on CALIBRATE_IO:
Jose Valerio – DBMS_RESOURCE_MANAGER.CALIBRATE_IO 11gR2
Tim Hall – Measuring Storage Performance for Oracle Systems
Arup Nanda – Resource Manager IO Callibration in 11g

11g extended SQL_TRACE event

Not for the first time, I’ve gone looking for more detail on this extended functionality in 11g.
And not for the first time, I’ve started off looking for dbms_monitor extended functionality and not found it and then wasted time trying to remember what I’m looking for:

Note to self – what I’m looking for is not dbms_monitor but simply:

alter session|system set events 'sql_trace[SQL:<sql_id>]...';

For further details, see:
http://oraclue.com/2009/03/24/oracle-event-sql_trace-in-11g/
http://tech.e2sn.com/oracle/troubleshooting/oradebug-doc

Upgrade experiences

I’m planning for an Oracle upgrade, 9i to 11gR2. Cross-platform – Sun Solaris to Linux. Different Endian.

It’s just under 1TB so not big these days.

Downtime shouldn’t be too much of an issue. It’s not like it needs to be done in an hour or anything demanding like that. As long as it can be done within a day.

Within the client, there’s no official build for 11g on Solaris, so I think that might rule out solutions like an in situ upgrade to 11g on Solaris followed immediately by the move & conversion to Linux.

Imp/Exp seems to be the recommended approach given all of the above.

There are lots of resources out there to help with an upgrade.

So, no shortage of advice and guidance.

But I’m also interested in hearing some real world experiences.

So, if you’ve been through something similar, please let me know.

Tell me how you did it and how it went.

Not waiting and not on SQL*Net Message from client

I’m in the early stages of investigating a performance degradation on 11.1.0.6, just tracing the session at the moment and waiting for it to complete.

This is distilled from a POC for an application persisting generic messages off a message bus.

At the moment, there’s no requirement to query/inspect the messages themselves so they’re being passed down as a BLOB.

I’ve reproduced the problem with a standalone piece of PL/SQL which removes any distractions from the rest of the application tech stack. This will be useful to raise an SR if necessary.

Performance degradation for a collection of 1000 blobs goes from roughly 1-2 seconds on a 9.2.0.8 corporate DB or on 10.2.0.4 on my laptop to currently over 1 hour on 11.1.0.6 corporate DB.

Obviously there are a lot of variations between environments and versions that might account for reasonable differences in performance.
I’ve not yet started investigating any of these differences in settings, parameters, etc but there’s something very bad about this sort of degradation.

Current session wait says that I’ve been waiting on SQL*Net message from client for pretty much all this time. So, a good illustration that this is wait is not always an idle event.

Edit 1
I’ve been a bit dim on this one. I’m NOT waiting on SQL*Net message from client. But I am in a strange place somewhere.

I was being lazy, using Toad to monitor and not looking at the information properly – stupid, stupid, stupid (not that Toad doesn’t display the requisite information properly, just that when you look at things from different perspectives sometimes you miss something that you would have seen when looking at it in a different way).

My session is active. But my wait state is WAITED_KNOWN_TIME so that wait was the last known wait, WAIT_TIME is > 0, SECONDS_IN_WAIT is ticking up to 20000.

Session statistics show that session is doing stuff.

Need to look at this from OS upwards – oradebug, pstack, etc. Or get someone to do it because I don’t have access :(
End Edit 1

I’m currently gathering the evidence. It’s a bit slow because I don’t have full access to the 11g environment.

SQL to reproduce is below.

DROP TABLE sr_payload;
DROP TYPE tt_sr_payload;
DROP TYPE to_sr_payload;

CREATE TABLE sr_payload
(id         RAW(16) NOT NULL
,lastupdate DATE    NOT NULL
,payload    BLOB    NOT NULL)
LOB (payload) 
STORE AS sr_payload_payload
 (NOCACHE);
 
ALTER TABLE sr_payload ADD CONSTRAINT pk_sr_payload PRIMARY KEY (id);

CREATE OR REPLACE TYPE to_sr_payload AS OBJECT
(id         RAW(16)
,lastupdate DATE
,payload    BLOB);
/

CREATE OR REPLACE TYPE tt_sr_payload AS TABLE OF to_sr_payload;
/

CREATE OR REPLACE PACKAGE pkg_payload_sr
AS
  --
  PROCEDURE p_upsert_payload_mtc (
    i_payload                  IN     tt_sr_payload
  );
  -- 
  PROCEDURE p_upsert_payload_itc (
    i_payload                  IN     tt_sr_payload
  );
  --
END;
/

CREATE OR REPLACE PACKAGE BODY pkg_payload_sr
AS
  --
  PROCEDURE p_upsert_payload_mtc (
    i_payload                  IN     tt_sr_payload
  )
  AS
  BEGIN
     --
     MERGE
     INTO  sr_payload p
     USING (SELECT id         id
            ,      lastupdate lastupdate
            ,      payload    payload
            FROM   TABLE ( CAST ( i_payload AS tt_sr_payload) ) )x
     ON    (p.id = x.id)
     WHEN MATCHED THEN
            UPDATE
            SET    payload    = empty_blob()
            ,      lastupdate = SYSDATE
     WHEN NOT MATCHED THEN
          INSERT
          (      id
          ,      lastupdate
          ,      payload)
          VALUES
          (      x.id
          ,      x.lastupdate
          ,      x.payload);
     --
  END p_upsert_payload_mtc;
  -- 
  PROCEDURE p_upsert_payload_itc (
    i_payload                  IN     tt_sr_payload
  )
  AS
  BEGIN
     --
     INSERT
     INTO   sr_payload
     (      id
     ,      lastupdate
     ,      payload)
     SELECT x.id         id
     ,      x.lastupdate lastupdate
     ,      x.payload    payload
     FROM   TABLE ( CAST ( i_payload AS tt_sr_payload) ) x
     WHERE  NOT EXISTS (SELECT 1
                        FROM   sr_payload p
                        WHERE  p.id = x.id);  
     --
  END p_upsert_payload_itc;
  --
END pkg_payload_sr;
/

DECLARE  
 l_start NUMBER;  
 l_stop  NUMBER;  
 l_pl    tt_sr_payload := tt_sr_payload(); 
BEGIN  
  FOR i IN 1 .. 1000 
  LOOP
      l_pl.EXTEND();      
      l_pl(l_pl.LAST) := to_sr_payload(SYS_GUID,SYSDATE,TO_BLOB('12312312312'));  
  END LOOP;
  dbms_output.put_line('  ');
  l_start := dbms_utility.get_time(); 
  pkg_payload_sr.p_upsert_payload_itc(l_pl); 
  l_stop  := dbms_utility.get_time(); 
  dbms_output.put_line('Start pl: '||l_start); 
  dbms_output.put_line('Stop  pl: '||l_stop); 
  COMMIT;
END;
/

DROP PACKAGE pkg_payload_sr;
DROP TABLE sr_payload;
DROP TYPE tt_sr_payload;
drop type to_sr_payload;

Edit 2
Some results from ps and pstack** shows:

XXXXXXX$ ps -o user,pid,s,pcpu,time,etime,wchan,comm -p 27974

USER       PID S %CPU     TIME     ELAPSED WCHAN  COMMAND
oracle   27974 R 99.2 05:36:56    05:39:35 -      oracle

XXXXXXXX$ pstack 27974
#0  0x000000000718ab58 in kolrgrfc ()
#1  0x00000000019fa69f in kolrrdl ()
#2  0x00000000019fa7ce in kolradc ()
#3  0x0000000000e0df3f in __PGOSF350_kokegPromoteLob ()
#4  0x0000000000e0decb in kokegApplyOp ()
#5  0x0000000000e0de8e in kokegOpnGarbagePromote ()
#6  0x0000000000e0dc14 in kokegGarbageCollectBinds ()
#7  0x000000000705edb3 in kxsiis ()
#8  0x00000000070d0074 in __PGOSF445_qerltcNoKdtBufferedInsRowCBK ()
#9  0x000000000151924b in qerltcLoadStateMachine ()
#10 0x000000000151b64f in __PGOSF455_qerltcInsertSelectRop ()
#11 0x0000000003668205 in qerstRowP ()
#12 0x0000000003668205 in qerstRowP ()
#13 0x00000000014cd178 in qerjoFetch ()
#14 0x00000000036683f6 in qerstFetch ()
#15 0x0000000006f60b39 in rwsfcd ()
#16 0x00000000036683f6 in qerstFetch ()
#17 0x00000000070ce5d2 in qerltcFetch ()
#18 0x00000000036683f6 in qerstFetch ()
#19 0x000000000702a1db in insexe ()
#20 0x00000000070dc89a in opiexe ()
#21 0x00000000070e4513 in opipls ()
#22 0x0000000006fc979a in opiodr ()
#23 0x00000000070671b9 in __PGOSF123_rpidrus ()
#24 0x0000000007189474 in skgmstack ()
#25 0x0000000007067525 in rpidru ()
#26 0x00000000070667c0 in rpiswu2 ()
#27 0x0000000007065de3 in rpidrv ()
#28 0x000000000704e8c3 in psddr0 ()
#29 0x000000000704e594 in psdnal ()
#30 0x00000000071ce501 in pevm_EXECC ()
#31 0x00000000071c74ca in pfrinstr_EXECC ()
#32 0x00000000071c6721 in pfrrun_no_tool ()
#33 0x00000000071c51bd in pfrrun ()
#34 0x00000000071cb2ba in plsql_run ()
#35 0x00000000071c1f54 in peicnt ()
#36 0x00000000071c11fa in kkxexe ()
#37 0x00000000070dee11 in opiexe ()
#38 0x000000000154dbc1 in kpoal8 ()
#39 0x0000000006fc979a in opiodr ()
#40 0x000000000713cd4b in ttcpip ()
#41 0x0000000000e1630f in opitsk ()
#42 0x0000000000e18c50 in opiino ()
#43 0x0000000006fc979a in opiodr ()
#44 0x0000000000e12244 in opidrv ()
#45 0x00000000011b393a in sou2o ()
#46 0x0000000000920183 in opimai_real ()
#47 0x00000000011b7871 in ssthrdmain ()
#48 0x00000000009200af in main ()

Reading from bottom to top and using Metalink note 175982.1, the kol functions deal with “support for object Lob buffering , object lob evaluation and object Language/runtime functions for Opaque types” – sounds about right for my test case.

* I’m not saying that sending down a collection of 1000 blobs is necessarily a good thing but I’m just using it to compare good and bad.

** This is just a test case on a dev box. Note what Tanel has to say particularly about using pstack on Linux on production.

Documentation in need of confidence booster

Some people attach a lot of weight to what the Oracle documentation says.

Rightly so, most of the time. It is a hugely significant resource.

But there are some documentation bugs / errors.

I was answering a character set posting on the OTN forums and when providing some links to the doco, I noticed the quote below in the official documentation:


I think these secions that describe encoding are out of date. For example, do Java and Microsoft Windows still use UCS-2? Or do they use UTF-16? I think UNIX uses UTF-32. So I am confused about what is the most current information for these sections.

How reassuring ;)

Beware the compound trigger

In 11g, Oracle introduced a new feature in the compound trigger.

Through many years and versions, the solution to the mutating table / ora-04091 problem was often a clumsy combination of packaged global, before statement trigger, after row trigger and after statement trigger.

However, in 11g, the introduction of the compound trigger provided a much more compact solution, where you could declare your global, your statement level logic and any row level code in a single trigger.

Examples from the Oracle documentation:

Trouble is, under certain simple circumstances, they don’t work.

Take this illustration abstracted from a proposed VPD labelling solution on 11.1.0.6.

The illustration is a little artificial because all distractions have been removed but it’s taken from a real world example.

Essentially, three updates via a package cause an ORA-07445 but three direct update statements do not. Take a look:


dominic@DOM11G>prompt ---------- cleanup previous creates
---------- cleanup previous creates
dominic@DOM11G>drop table compound_trigger_test_tab1;

Table dropped.

Elapsed: 00:00:01.02
dominic@DOM11G>drop package compound_trigger_test_pkg;

Package dropped.

Elapsed: 00:00:00.00
dominic@DOM11G>
dominic@DOM11G>prompt ---------- create table
---------- create table
dominic@DOM11G>
dominic@DOM11G>create table compound_trigger_test_tab1
  2  (col1 number
  3  ,col2 number);

Table created.

Elapsed: 00:00:00.00
dominic@DOM11G>
dominic@DOM11G>prompt ---------- insert data
---------- insert data
dominic@DOM11G>
dominic@DOM11G>insert into compound_trigger_test_tab1 values (1,1);

1 row created.

Elapsed: 00:00:00.00
dominic@DOM11G>
dominic@DOM11G>
dominic@DOM11G>prompt ---------- create compound trigger
---------- create compound trigger
dominic@DOM11G>
dominic@DOM11G>CREATE OR REPLACE TRIGGER compound_trigger_test_tri1
  2  FOR INSERT OR UPDATE ON compound_trigger_test_tab1
  3  REFERENCING NEW AS NEW OLD AS OLD
  4  COMPOUND TRIGGER
  5   --
  6   TYPE r_ctt IS RECORD 
  7   (col1    NUMBER
  8   ,col2    NUMBER);
  9   --
 10   TYPE t_ctt IS TABLE OF r_ctt
 11   INDEX BY PLS_INTEGER;
 12   --
 13   vt_ctt t_ctt;
 14   --
 15  BEFORE STATEMENT
 16  IS
 17  BEGIN
 18     --
 19     DBMS_OUTPUT.PUT_LINE('compound_trigger_test_tri1::before statement');
 20     NULL;
 21     --
 22  END BEFORE STATEMENT;
 23  --
 24  BEFORE EACH ROW
 25  IS
 26  BEGIN
 27      --
 28      DBMS_OUTPUT.PUT_LINE('compound_trigger_test_tri1::before row');
 29      NULL;
 30      --
 31  END BEFORE EACH ROW;
 32  --
 33  AFTER EACH ROW
 34  IS
 35  BEGIN
 36     --
 37     DBMS_OUTPUT.PUT_LINE('compound_trigger_test_tri1::after row');
 38     NULL;
 39     --
 40  END AFTER EACH ROW;
 41  --
 42  AFTER STATEMENT
 43  IS
 44    i PLS_INTEGER;
 45  BEGIN
 46     --
 47     DBMS_OUTPUT.PUT_LINE('compound_trigger_test_tri1::after statement');
 48     i := vt_ctt.FIRST;
 49     --
 50     WHILE (i IS NOT NULL)
 51     LOOP
 52         --
 53         NULL;
 54         --
 55         i := vt_ctt.NEXT(i);
 56         --
 57     END LOOP;
 58     --
 59  END AFTER STATEMENT;
 60  --
 61  END compound_trigger_test_tri1;
 62  /

Trigger created.

Elapsed: 00:00:00.00
dominic@DOM11G>
dominic@DOM11G>show errors
No errors.
dominic@DOM11G>
dominic@DOM11G>
dominic@DOM11G>prompt ---------- create package header
---------- create package header
dominic@DOM11G>
dominic@DOM11G>CREATE OR REPLACE PACKAGE compound_trigger_test_pkg
  2  AS
  3    --
  4    PROCEDURE p_update_tab1 (
  5      i_col1                     in     compound_trigger_test_tab1.col1%TYPE
  6    );
  7    --
  8  END;
  9  /

Package created.

Elapsed: 00:00:00.00
dominic@DOM11G>
dominic@DOM11G>
dominic@DOM11G>prompt ---------- create package body
---------- create package body
dominic@DOM11G>
dominic@DOM11G>CREATE OR REPLACE PACKAGE BODY compound_trigger_test_pkg
  2  AS
  3    --
  4    PROCEDURE p_update_tab1 (
  5      i_col1                     in     compound_trigger_test_tab1.col1%TYPE
  6    )
  7    AS
  8    BEGIN
  9       --
 10       DBMS_OUTPUT.PUT_LINE('compound_trigger_test_pkg.p_update_tab1');
 11       UPDATE compound_trigger_test_tab1
 12       SET    col2 = col2
 13       WHERE  col1 = i_col1;
 14       --
 15    END p_update_tab1;
 16  END;
 17  /

Package body created.

Elapsed: 00:00:00.00
dominic@DOM11G>
dominic@DOM11G>
dominic@DOM11G>prompt ---------- run update via package test 1 - should be ok
---------- run update via package test 1 - should be ok
dominic@DOM11G>exec compound_trigger_test_pkg.p_update_tab1(1);

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
dominic@DOM11G>
dominic@DOM11G>prompt ---------- run update via package 2 - should be ok
---------- run update via package 2 - should be ok
dominic@DOM11G>exec compound_trigger_test_pkg.p_update_tab1(2);

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
dominic@DOM11G>
dominic@DOM11G>prompt ---------- run update via package 3 - should go bang
---------- run update via package 3 - should go bang
dominic@DOM11G>exec compound_trigger_test_pkg.p_update_tab1(3);
BEGIN compound_trigger_test_pkg.p_update_tab1(3); END;

*
ERROR at line 1:
ORA-03113: end-of-file on communication channel


Elapsed: 00:00:00.05
dominic@DOM11G>conn dominic@dom11g
Enter password: *******
Connected.
dominic@DOM11G>UPDATE compound_trigger_test_tab1
  2  SET    col2 = col2
  3  WHERE  col1 = 1;

1 row updated.

Elapsed: 00:00:00.00
dominic@DOM11G>
dominic@DOM11G>UPDATE compound_trigger_test_tab1
  2  SET    col2 = col2
  3  WHERE  col1 = 2;

0 rows updated.

Elapsed: 00:00:00.00
dominic@DOM11G>
dominic@DOM11G>UPDATE compound_trigger_test_tab1
  2  SET    col2 = col2
  3  WHERE  col1 = 3;

0 rows updated.

Elapsed: 00:00:00.00
dominic@DOM11G>

Alert log and trace file reveal the cause to be an ORA-07445:


Exception [type: SIGSEGV, Address not mapped to object] [ADDR:0x448] [PC:0x1B6A3B8, kkxtexe()+988]
Errors in file /app/oracle/diag/rdbms/dom11g/DOM11G/trace/dom11G_ora_4860.trc  (incident=845912):
ORA-07445: exception encountered: core dump [kkxtexe()+988] [SIGSEGV] [ADDR:0x448] [PC:0x1B6A3B8] [Address not mapped to object] []
ORA-21779: duration not active

So, for the moment, I would not go near them.
I’ve got an SR raised (it’s all I seem to doing these days is find bugs in 11.1.0.6 (more on the incompatibility between VPD and Materialized Views another time) and 11.1.0.7)
It’s back to multiple row and statement level triggers.

Follow

Get every new post delivered to your Inbox.

Join 72 other followers