Thoughts on TM_DELTA_TIME

What are these columns in V$ACTIVE_SESSION_HISTORY – TM_DELTA_TIME, TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME?

What does the doco say?

  • TM_DELTA_TIME – Time interval (in microseconds) over which TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME are accumulated
  • TM_DELTA_CPU_TIME – Amount of time this session spent on CPU over the last TM_DELTA_TIME microseconds
  • TM_DELTA_DB_TIME – Amount of time spent by this session in database calls over the last TM_DELTA_TIME microseconds

Need a reminder on TIME MODEL STATISTICS?

The most important of the time model statistics is DB time.

This statistics represents the total time spent in database calls and is an indicator of the total instance workload.

It is calculated by aggregating the CPU and wait times of all sessions not waiting on idle wait events (non-idle user sessions).

i.e. DB TIME is time spent either actively working or actively waiting

What this effectively means is that TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME have no context without TM_DELTA_TIME.

So when looking at TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME, all other sample times/periods are pretty much irrelevant apart from TM_DELTA_TIME – the latter is the only value that gives context to the other two.

TM_DELTA_DB_TIME is time spent actively working or actively waiting, some of this might be time spent on CPU – TM_DELTA_CPU_TIME – the rest will be time spent actively waiting on some non-idle wait events plus some overheads no doubt.

And TM_DELTA_DB_TIME should never exceed TM_DELTA_TIME, right?

The difference between TM_DELTA_TIME and TM_DELTA_DB_TIME should be idle time.

Sounds like something that might be quite useful particularly to counterbalance some of the pitfalls of looking at sampled data?

How might we investigate this further?

Shall we start with a plsql loop that should do nothing but sit on CPU for a short while?

SQL> select * from v$version;

BANNER
-------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL> select distinct sid from v$mystat;

       SID
----------
       642

SQL> alter session set plsql_optimize_level = 0;

Session altered.

SQL> set timing on
SQL> DECLARE
  2  BEGIN
  3   FOR i IN 1 .. 1000000000 LOOP
  4     NULL;
  5   END LOOP;
  6  END;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:16.46
SQL> 

What does ASH say?

SQL> select to_char(sample_time,'HH24:MI:SS')
  2  ,      is_awr_sample
  3  ,      event
  4  ,      session_state
  5  ,      to_char(tm_delta_time) tm_delta_time
  6  ,      to_char(tm_delta_cpu_time) tm_delta_cpu_time
  7  ,      to_char(tm_delta_db_time) tm_delta_db_time
  8  from   v$active_session_history
  9  where  session_id = 642
 10  and    sql_id     = '6b5drmtrvs6vv'
 11  order by sample_id desc;

TO_CHAR( I EVENT                     SESSION TM_DELTA_TIME   TM_DELTA_CPU_TIME TM_DELTA_DB_TIME
-------- - ------------------------- ------- --------------- ----------------- ----------------
14:45:25 N                           ON CPU
14:45:24 N                           ON CPU  2019831         2016694           2019831
14:45:23 N                           ON CPU
14:45:22 N                           ON CPU  2019978         2017693           2019978
14:45:21 N                           ON CPU
14:45:20 N                           ON CPU  2019848         2017694           2019848
14:45:19 N                           ON CPU
14:45:18 Y                           ON CPU
14:45:17 N                           ON CPU  2019934         2017693           2019934
14:45:16 N                           ON CPU
14:45:15 N                           ON CPU  2019803         2017693           2019803
14:45:14 N                           ON CPU
14:45:13 N                           ON CPU  2019925         2017693           2019925
14:45:12 N                           ON CPU
14:45:11 N                           ON CPU  4711327         1921708           1926278
14:45:10 N                           ON CPU  40504273        130979            134479

16 rows selected.

SQL> 

So, in this example, roughly every two seconds I get feedback of what my session was doing for that DELTA TIME period.

Unsurprisingly, given what we were doing, the session mainly spent 100% of it’s time in TM_DELTA_DB_TIME actively working, and TM_DELTA_CPU_TIME is the major component of that.

The difference between CPU_TIME – DB_TIME would be the time spent actively waiting plus some overhead.

If did more waiting on a non-idle wait event – IO for example – our DB_TIME would be similar but the element of CPU_TIME would be smaller, e.g.

SQL> DECLARE
  2  BEGIN
  3   FOR i IN (SELECT * FROM all_source) LOOP
  4     NULL;
  5   END LOOP;
  6  END;
  7  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:37.82

which gives

SQL> select to_char(sample_time,'HH24:MI:SS')
  2  ,      is_awr_sample
  3  ,      event
  4  ,      session_state
  5  ,      to_char(tm_delta_time) tm_delta_time
  6  ,      to_char(tm_delta_cpu_time) tm_delta_cpu_time
  7  ,      to_char(tm_delta_db_time) tm_delta_db_time
  8  from   v$active_session_history
  9  where  session_id = 642
 10  and    sql_id     IN ('2nj4dm3qwmkqh','34y8jhfh5w4q9')
 11  order by sample_id desc;

TO_CHAR( I EVENT                     SESSION TM_DELTA_TIME   TM_DELTA_CPU_TIME TM_DELTA_DB_TIME
-------- - ------------------------- ------- --------------- ----------------- ----------------
15:05:43 N                           ON CPU  2019899         1089835           2019899
15:05:42 Y db file scattered read    WAITING
15:05:41 N db file scattered read    WAITING 2020879         1359793           2020879
15:05:40 N                           ON CPU
15:05:39 N db file scattered read    WAITING 2022891         1487774           2022891
15:05:38 N db file scattered read    WAITING
15:05:37 N db file scattered read    WAITING 2021913         1123828           2021913
15:05:36 N                           ON CPU
15:05:35 N db file scattered read    WAITING 2036914         953856            2036914
15:05:34 N                           ON CPU
15:05:33 N db file scattered read    WAITING 1017936         686895            1017936
15:05:32 Y db file scattered read    WAITING 2039867         897864            2039867
15:05:31 N db file scattered read    WAITING
15:05:30 N db file scattered read    WAITING 1928599         1035843           1928599
15:05:29 N                           ON CPU
15:05:28 N                           ON CPU  2069218         1185819           2069218
15:05:27 N                           ON CPU
15:05:26 N                           ON CPU  2092901         840872            2092901
15:05:25 N                           ON CPU
15:05:24 N db file scattered read    WAITING 1998272         1044841           1998272
15:05:23 N db file scattered read    WAITING
15:05:22 Y db file scattered read    WAITING 2052570         1131828           2052570
15:05:21 N                           ON CPU
15:05:20 N db file scattered read    WAITING 2021863         1411786           2021863
15:05:19 N                           ON CPU
15:05:18 N db file scattered read    WAITING 1674137         797879            1674137
15:05:17 N db file scattered read    WAITING
15:05:16 N                           ON CPU  2350338         270958            2350338
15:05:15 N                           ON CPU
15:05:14 N db file scattered read    WAITING 1944057         165975            1944057
15:05:13 N db file scattered read    WAITING
15:05:12 Y db file scattered read    WAITING 2078929         1194819           2078929
15:05:11 N db file scattered read    WAITING
15:05:10 N                           ON CPU  2052015         1266807           2052015
15:05:09 N                           ON CPU
15:05:08 N db file scattered read    WAITING 985873021       915860            1846906
15:05:07 N                           ON CPU
15:05:06 N                           ON CPU  6746807         836873            843056

82 rows selected.

SQL> 

As expected, we see from TM_DELTA_DB_TIME that we are mainly spending 100% of our time actively working or actively waiting.

But, because we’ve had to do some IO, the proportion of that time which is CPU has reduced and the difference between TM_DELTA_DB_TIME and TM_DELTA_CPU_TIME should be roughly the time spent actively waiting during that TM_DELTA_TIME.

But in this output immediately above, do we start to see some of the limitations in this data?

What about this?

TO_CHAR( I EVENT                     SESSION TM_DELTA_TIME   TM_DELTA_CPU_TIME TM_DELTA_DB_TIME
-------- - ------------------------- ------- --------------- ----------------- ----------------
15:05:08 N db file scattered read    WAITING 985873021       915860            1846906

Why the big TM_DELTA_TIME all of a sudden? What’s that all about?

I’ve no idea and if anyone does then please comment.

I would presume that it’s a bug in the accounting.

And if it is a bug, it seriously limits the usefulness of this data for now and the ability to build diagnostic scripts on top of it.

So, moving quickly to how this data is stored down into the repository – DBA_HIST_ACTIVE_SESS_HISTORY.

You probably already know that every 10th second’s worth of the ASH buffer is marked for storage in the repository, as marked by V$ACTIVE_SESSION_HISTORY.IS_AWR_SAMPLE (included in the output above).

From the last buffer output above executing the loop of cpu+io workload, it should be these rows:

TO_CHAR( I EVENT                     SESSION TM_DELTA_TIME   TM_DELTA_CPU_TIME TM_DELTA_DB_TIME
-------- - ------------------------- ------- --------------- ----------------- ----------------
15:05:42 Y db file scattered read    WAITING
15:05:32 Y db file scattered read    WAITING 2039867         897864            2039867
15:05:22 Y db file scattered read    WAITING 2052570         1131828           2052570
15:05:12 Y db file scattered read    WAITING 2078929         1194819           2078929

My initial thoughts were to wonder whether these TM_DELTA figures are persisted “as is”.
In which case, their value would be seriously limited.

Don’t want to wait for the flushing of ASH buffer to the repository?
We can force the flush by creating an AWR snapshot.

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> 

And wham, bam, we can check out what has actually been persisted to DBA_HIST_ACTIVE_SESSION_HISTORY:

SQL> select to_char(sample_time,'HH24:MI:SS')
  2  ,      event
  3  ,      session_state
  4  ,      to_char(tm_delta_time) tm_delta_time
  5  ,      to_char(tm_delta_cpu_time) tm_delta_cpu_time
  6  ,      to_char(tm_delta_db_time) tm_delta_db_time
  7  from   dba_hist_active_sess_history
  8  where  session_id = 642
  9  and    sql_id     IN ('2nj4dm3qwmkqh','34y8jhfh5w4q9')
 10  order by sample_id desc;

TO_CHAR( EVENT                     SESSION TM_DELTA_TIME   TM_DELTA_CPU_TIME TM_DELTA_DB_TIME
-------- ------------------------- ------- --------------- ----------------- ----------------
15:05:42 db file scattered read    WAITING 9120533         5612146           703679604684133
15:05:32 db file scattered read    WAITING 10128857        7202905           281471853489246
15:05:22 db file scattered read    WAITING 10042965        7493861           20148167
15:05:12 db file scattered read    WAITING 1194282071      20199931          31190611

9 rows selected.

SQL> 

So, the good news would be that the values from the buffer are not just copied down but seem to be an aggregation, presumably of the sampled rows. I’m not sure because some of the numbers add up, some don’t.

But the big, bad news is that there are similar accounting bugs – note the values in the first two rows for TM_DELTA_DB_TIME.

It should not be possible for TM_DELTA_DB_TIME to be greater than TM_DELTA_TIME.

As a result, this data seems occasionally unreliable and this unreliability undermines it’s usefulness.

Unless of course someone else has another explanation?

Buggy overhead of SQL Plan Baseline Capture

There’s at least one bug related to the performance overhead of SQL Baseline Capture and ASH is less than clear with recursive SQL

First up – apologies, I’ve not got a reproducible test case because I flushed the shared pool between tests and the whole performance degradation scenario disappeared. I should have known when one of the Metalink notes talked about the problem disappearing when bouncing the instance that I shouldn’t have touched the shared pool. Now I can’t reproduce. I’m going to keep trying, but for now, let me tell you what I’ve got (I am trying to be more precise in my blogging but yet again it’s something woolly or nothing).

With that out the way, we’re in the initial stages of an upgrade from 9.2.0.8 on 32-bit Solaris to 11.2.0.2 on 64-bit Linux, a one-step all-or-nothing change-everything at once upgrade.

Plan was to run with optimizer_features_enable set to 9.2.0, various other parameters set to current production settings and to capture 9i plans into a baseline.

Initial runs of “The Batch” started, all processes reported as slow. The initial major contributor turns out to be recursive SQL related to SQL Plan Baseline Capture (see bug#11600319, also bug #11719151 and doc id 1295054.1 – thanks to Timur Akhmadeev for the pointers)

I haven’t yet raised this with Oracle Support because a) I can’t face the wait for a response and b) local red tape means that patches need to be approved by the DB Engineering team and approval for non-production environments is difficult (I’m not going there, not now).

Bug # 11719151 – “RECURSIVE MERGE WHEN optimizer_capture_sql_plan_baseline=true DOING FTS” – is certainly relevant.
The MERGE statement into SQLOBJ$AUXDATA does use a FULL TABLE SCAN. But if I were to show you the AWR reports then you’d see the MERGE statement into SQLOBJ$DATA was the bigger problem and the MERGE statement into SQLOBJ$ also not insignificant.

Pick a metric, any metric:
3 out of the top 10 by Elapsed Time, 2 out of the top 5, 1 of the top 1.

        Elapsed                  Elapsed Time
        Time (s)    Executions  per Exec (s)  %Total   %CPU    %IO    SQL Id
---------------- -------------- ------------- ------ ------ ------ -------------
         6,481.5        242,111          0.03   43.2   22.6   56.1 1vxm21mhmgy07
MERGE INTO sqlobj$data USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET comp_data = :2 WHERE

         2,403.4        242,108          0.01   16.0   95.4    3.9 44z7snw61qx9x
MERGE INTO sqlobj$auxdata USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET description = :2,

           898.7        242,111          0.00    6.0   98.6     .1 7xm5j53mxbtpn
MERGE INTO sqlobj$ USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET name = :2,

By CPU Time:

    CPU                   CPU per           Elapsed
  Time (s)  Executions    Exec (s) %Total   Time (s)   %CPU    %IO    SQL Id
---------- ------------ ---------- ------ ---------- ------ ------ -------------
   2,292.8      242,108       0.01   39.9    2,403.4   95.4    3.9 44z7snw61qx9x
MERGE INTO sqlobj$auxdata USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET description = :2,


   1,464.0      242,111       0.01   25.5    6,481.5   22.6   56.1 1vxm21mhmgy07
MERGE INTO sqlobj$data USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET comp_data = :2 WHERE

     885.7      242,111       0.00   15.4      898.7   98.6     .1 7xm5j53mxbtpn
MERGE INTO sqlobj$ USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET name = :2,

By Gets:

     Buffer                 Gets              Elapsed
      Gets   Executions   per Exec   %Total   Time (s)   %CPU    %IO    SQL Id
----------- ----------- ------------ ------ ---------- ------ ------ -----------
 49,599,141     242,108        204.9   46.9    2,403.4   95.4    3.9 44z7snw61qx
MERGE INTO sqlobj$auxdata USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET description = :2,

 29,107,449     242,111        120.2   27.5    6,481.5   22.6   56.1 1vxm21mhmgy
MERGE INTO sqlobj$data USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET comp_data = :2 WHERE

 16,671,818     242,111         68.9   15.7      898.7   98.6     .1 7xm5j53mxbt
MERGE INTO sqlobj$ USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET name = :2,

By executions:

 Executions   Rows Processed  Rows per Exec   Time (s)   %CPU    %IO    SQL Id
------------ --------------- -------------- ---------- ------ ------ -----------
     242,111         242,111            1.0    6,481.5   22.6   56.1 1vxm21mhmgy
MERGE INTO sqlobj$data USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET comp_data = :2 WHERE

     242,111         242,111            1.0      898.7   98.6     .1 7xm5j53mxbt
MERGE INTO sqlobj$ USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET name = :2,


     242,111         242,074            1.0       34.6   95.7     .2 9zr2sbbp1d4
SELECT /*+ INDEX(sqlobj$ (name obj_type)) */ signature, category 
FROM sqlobj$ WHERE name = :1 AND obj_type = :2

     242,108         242,108            1.0    2,403.4   95.4    3.9 44z7snw61qx
MERGE INTO sqlobj$auxdata USING dual ON (:1 IS NULL) WHEN MATCHED THEN UPDATE SET description = :2,

So, you get the message.

Furthermore, at the top of every Segment Statistic by X Metric, it’s either one of the LOBs or the LOB Index related to these three tables.

By the way, what’s in these SQLOBJ$ tables – no surprises it’s the baseline data.
See also:
Kerry Osborne – Do SQL Baselines Use Hints/
Christian Antognini – SQL Profiles in the Data Dictionary
Christian Antognini – Troubleshooting Oracle Performance

If I move over to the ASH data and give an example of what that was showing, then we can go and look at those SQLOBJ$ tables for the SQL concerned.

The ASH data – probably my most favorite feature of the last few years – I’m only displaying a little bit of in terms of height and width. But it illustrates the second point I wanted to highlight – that recursive SQL is not properly reported. Whilst the SQL_ID column differs from the TOP_LEVEL_SQL_ID (which I’ve omitted to keep it slim), this is still the SQL Id of a nested lookup, not the SQL Id of the recursive MERGE which you might have expected – I can see pros and cons of both I suppose.

  1  select to_char(sample_time,'HH24:MI:SS') sample_time, session_id, sql_id
  2  ,      sql_plan_operation, event
  3  from   v$active_session_history
  4  where  session_id      = 1388
  5  and    session_serial# = 147
  6  and    sample_id between 14596749 and 14596800
  7* order by sample_id
SQL> /

SAMPLE_T SESSION_ID SQL_ID        SQL_PLAN_OPERATION             EVENT
-------- ---------- ------------- ------------------------------ -----------------------------------
10:57:04       1388 d99anrdgbb4jf SELECT STATEMENT
10:57:05       1388                                              log file sync
10:57:06       1388 d99anrdgbb4jf
10:57:07       1388 d99anrdgbb4jf MERGE                          direct path write
10:57:08       1388 d99anrdgbb4jf MERGE                          flashback log file sync
10:57:09       1388 d99anrdgbb4jf                                direct path read
10:57:10       1388 d99anrdgbb4jf MERGE                          buffer busy waits
10:57:11       1388 d99anrdgbb4jf SELECT STATEMENT               latch: shared pool
10:57:12       1388 d99anrdgbb4jf MERGE                          direct path read
10:57:13       1388 d99anrdgbb4jf TABLE ACCESS
10:57:14       1388 d99anrdgbb4jf INDEX
10:57:15       1388 d99anrdgbb4jf MERGE                          direct path write
10:57:16       1388 d99anrdgbb4jf TABLE ACCESS
10:57:17       1388 d99anrdgbb4jf TABLE ACCESS
10:57:18       1388 d99anrdgbb4jf                                direct path read
10:57:19       1388 d99anrdgbb4jf TABLE ACCESS
10:57:20       1388 d99anrdgbb4jf MERGE
10:57:21       1388                                              log file sync
10:57:22       1388                                              log file sync
10:57:23       1388 d99anrdgbb4jf MERGE                          direct path write
10:57:24       1388 d99anrdgbb4jf MERGE                          direct path write
10:57:25       1388 d99anrdgbb4jf TABLE ACCESS
10:57:26       1388 d99anrdgbb4jf                                direct path read
10:57:27       1388 d99anrdgbb4jf                                direct path read
10:57:28       1388 d99anrdgbb4jf MERGE                          db file sequential read
10:57:29       1388 d99anrdgbb4jf MERGE
10:57:30       1388                                              log file sync
10:57:31       1388 d99anrdgbb4jf MERGE                          db file sequential read
10:57:32       1388 d99anrdgbb4jf MERGE
10:57:33       1388 d99anrdgbb4jf INDEX
10:57:34       1388 d99anrdgbb4jf MERGE                          buffer busy waits
10:57:35       1388 d99anrdgbb4jf MERGE                          buffer busy waits
10:57:36       1388 d99anrdgbb4jf MERGE                          flashback log file sync
10:57:37       1388 d99anrdgbb4jf MERGE                          direct path write
10:57:38       1388 d99anrdgbb4jf MERGE                          db file sequential read
10:57:39       1388 d99anrdgbb4jf MERGE                          db file sequential read
10:57:40       1388 d99anrdgbb4jf MERGE                          buffer busy waits
10:57:41       1388 d99anrdgbb4jf MERGE                          enq: TX - index contention
10:57:42       1388 d99anrdgbb4jf MERGE                          buffer busy waits
10:57:43       1388 d99anrdgbb4jf MERGE                          db file sequential read
10:57:44       1388 d99anrdgbb4jf MERGE                          db file sequential read
10:57:45       1388                                              log file sync
10:57:46       1388 d99anrdgbb4jf TABLE ACCESS
10:57:47       1388 d99anrdgbb4jf MERGE                          flashback log file sync
10:57:48       1388 d99anrdgbb4jf MERGE JOIN
10:57:49       1388 d99anrdgbb4jf MERGE                          db file sequential read
10:57:50       1388 d99anrdgbb4jf MERGE                          db file sequential read
10:57:51       1388 d99anrdgbb4jf TABLE ACCESS
10:57:52       1388 d99anrdgbb4jf MERGE                          direct path write
10:57:53       1388 d99anrdgbb4jf TABLE ACCESS
10:57:54       1388 d99anrdgbb4jf MERGE                          db file sequential read
10:57:55       1388 d99anrdgbb4jf SELECT STATEMENT

52 rows selected.

SQL> 

By the way, I was also a little surprised to see the direct path read/write operations in the ASH data but these SQLOBJ$ tables that we’re constantly MERGING into recursively have LOBS and further investigation revealed these events to be associated with those LOB segments.

As you can see from above, lots of MERGE operations which have nothing directly to do with SQL_ID d99anrdgbb4jf which looks like the statement below. By the time I’ve got to some of this data for the purposes of this, it’s gone from V$SQL so I’m fetched it from AWR.

SQL> select * from table(dbms_xplan.display_awr('d99anrdgbb4jf'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------
SQL_ID d99anrdgbb4jf
--------------------
SELECT XXXX_XXX FROM XXXX_XXXX WHERE XXXX_XXXX = 'MSD' AND XXXX_XXXX =:B1

Plan hash value: 878181930

-------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |       |       |     2 |
|   1 |  TABLE ACCESS BY INDEX ROWID| XXXX_XXXX       |     1 |    18 |     2 |
|   2 |   INDEX RANGE SCAN          | XXXX_XXXX_IDX_1 |     1 |       |     2 |
-------------------------------------------------------------------------------

Note
-----
   - cpu costing is off (consider enabling it)

19 rows selected.

SQL> 

Baselines are stored according to signature. SQL statements have hash_values, sql ids and also exact/force matching signatures. By the way, the latter can be translated from the text of the sql statement using DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE using the FORCE_MATCH parameter for either the exact or force matching signature.

In AWR, we’ve only got the FORCE_MATCHING_SIGNATURE of this sql statement sql_id ‘d99anrdgbb4jf’ but take it was the same as the EXACT_MATCHING_SIGNATURE.

SQL> select distinct sql_id, force_matching_signature, to_char(force_matching_signature)
  2  from dba_hist_sqlstat 
  3  where sql_id = 'd99anrdgbb4jf'; 

SQL_ID        FORCE_MATCHING_SIGNATURE TO_CHAR(FORCE_MATCHING_SIGNATURE)
------------- ------------------------ ----------------------------------------
d99anrdgbb4jf               3.9882E+17 398819076860959259

SQL> 

I used this signature to see what was happening with the baseline data.

There’s not much point in me pasting the outputs of those tables. I could see that the SYS.SQLOBJ$.LAST_EXECUTED, SYS.SQLOBJ$AUXDATA.CREATED (!?!) and SYS.SQLOBJ$AUXDATA.LAST_MODIFIED were being updated although not the additional metrics like BUFFER_GETS, EXECUTIONS, etc in SYS.SQLOBJ$DATA.

But then I flushed the cache and like that … the performance degradation was gone and not reproducible… for now.

That’s all there is really.

Oh, by the way, I did trace one session while it was reproducing. Below are some of the relevant bits.
I was using a simple slow-by-slow lookup because I was then going to use concurrent sessions which initially were causing me problems with SMO rowcache lock contention – quite possibly related again to the baselines, but I never got that far into the investigation before we flashed it all back and turned off capture.
Also, looping lookups were relevant to one of the bug notes above and are very much representative of some of the legacy Pro*C codebase.

...
BEGIN
 DBMS_OUTPUT.DISABLE;
 FOR r IN (SELECT * FROM t1)
 LOOP
     FOR x IN (SELECT id FROM t1 WHERE code = r.code)
     LOOP
       DBMS_OUTPUT.PUT_LINE(x.id);
     END LOOP;
 END LOOP;
END;

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute      1      1.07       1.10          0          0          0           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      1.08       1.11          0          0          0           0

Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 4106  

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net break/reset to client                   2        0.00          0.00
  SQL*Net message to client                       1        0.00          0.00
  SQL*Net message from client                     1        0.00          0.00
  log file sync                                   1        0.06          0.06
********************************************************************************

....

SQL ID: 0jjc60pmrntdv
Plan Hash: 3617692013
SELECT * 
FROM
 T1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          1           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       66      0.01       0.01          0        279          0        6600
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       68      0.01       0.02          0        279          1        6600

Misses in library cache during parse: 1
Optimizer mode: CHOOSE
Parsing user id: 4106     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
   6600  TABLE ACCESS FULL T1 (cr=279 pr=0 pw=0 time=13232 us cost=498 size=22700000 card=100000)

********************************************************************************

SQL ID: 33kddu0j5ykq6
Plan Hash: 2946553998
SELECT ID 
FROM
 T1 WHERE CODE = :B1 


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute   6507     18.15      37.59       6504      54980       6523           0
Fetch     6506      0.28       0.28          0      26024          0        6506
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    13014     18.43      37.88       6504      81004       6523        6506

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: CHOOSE
Parsing user id: 4106     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID T1 (cr=4 pr=0 pw=0 time=233 us cost=2 size=26 card=1)
      1   INDEX UNIQUE SCAN I1 (cr=3 pr=0 pw=0 time=211 us cost=1 size=0 card=1)(object id 97386)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  latch: cache buffers chains                    22        0.00          0.00
  library cache: mutex X                          4        0.00          0.00
  latch: shared pool                              8        0.00          0.00
********************************************************************************

SQL ID: 9zr2sbbp1d4fs
Plan Hash: 2462756431
SELECT /*+ INDEX(sqlobj$ (name obj_type)) */ signature, category              
FROM
 sqlobj$                                                                  
  WHERE name = :1                                                             
          AND obj_type = :2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse      435      0.01       0.01          0          0          0           0
Execute    435      0.06       0.06          0          0          0           0
Fetch      435      0.01       0.02          1        871          0         434
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total     1305      0.09       0.10          1        871          0         434

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  INDEX RANGE SCAN I_SQLOBJ$NAME_TYPE (cr=3 pr=1 pw=0 time=10195 us cost=1 size=54 card=1)(object id 199)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        1        0.00          0.00
  db file sequential read                         1        0.00          0.00
********************************************************************************

SQL ID: 47y3mqvyhpkvs
Plan Hash: 1888265482
SELECT obj_type, plan_id, name, flags, last_executed                          
FROM
 sqlobj$                                                                  
  WHERE signature = :1                                                        
          AND category = :2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.00       0.00          0          2          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.00       0.00          0          2          0           0

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  INDEX RANGE SCAN SQLOBJ$_PKEY (cr=2 pr=0 pw=0 time=51 us cost=1 size=74 card=1)(object id 198)

********************************************************************************

....

SQL ID: 7xm5j53mxbtpn
Plan Hash: 2110601055
MERGE INTO sqlobj$                                                            
  USING dual ON (:1 IS NULL)                                                  
    WHEN MATCHED THEN                                                         
        UPDATE SET name = :2,                                                 
                     flags = :3,                                              
                       last_executed = :4                                     
              WHERE signature = :5                                            
                      AND category = :6                                       
                        AND obj_type = :7                                     
                          AND plan_id = :8                                    
                    WHEN NOT MATCHED THEN                                     
                        INSERT (signature, category, obj_type, plan_id, name, 
                                  flags, last_executed)                       
                            VALUES (:9, :10, :11, :12, :13, :14, :15)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     6505      0.35       0.37          0          0          0           0
Execute   6505     29.78      31.78         52     430259      13126        6505
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    13010     30.13      32.16         52     430259      13126        6505

Misses in library cache during parse: 1
Misses in library cache during execute: 2
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  MERGE  SQLOBJ$ (cr=2 pr=0 pw=0 time=748 us)
      1   VIEW  (cr=2 pr=0 pw=0 time=202 us)
      1    MERGE JOIN OUTER (cr=2 pr=0 pw=0 time=198 us cost=7 size=162488 card=2138)
      1     TABLE ACCESS FULL DUAL (cr=2 pr=0 pw=0 time=169 us cost=2 size=2 card=1)
      0     VIEW  (cr=0 pr=0 pw=0 time=7 us cost=5 size=158212 card=2138)
      0      FILTER  (cr=0 pr=0 pw=0 time=3 us)
      0       INDEX FAST FULL SCAN SQLOBJ$_PKEY (cr=0 pr=0 pw=0 time=0 us cost=9 size=158212 card=2138)(object id 198)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  latch: cache buffers chains                    23        0.00          0.00
  db file sequential read                        52        0.89          1.03
  buffer busy waits                              65        0.00          0.00
  log file switch completion                      4        0.24          0.44
  library cache: mutex X                          2        0.00          0.00
********************************************************************************

SQL ID: 44z7snw61qx9x
Plan Hash: 1133963577
MERGE INTO sqlobj$auxdata                                                     
  USING dual ON (:1 IS NULL)                                                  
    WHEN MATCHED THEN                                                         
        UPDATE SET description = :2,                                          
                     creator = nvl(:3, creator),                              
                       origin = :4,                                           
                         version = :5,                                        
                           created = :6,                                      
                             last_modified = :7,                              
                               last_verified = nvl(:8, last_verified),        
                                 parse_cpu_time = null,                       
                                   optimizer_cost = nvl(:9, optimizer_cost),  
                                     module = nvl(:10, module),               
                                       action = nvl(:11, action),             
                                         priority = nvl(:12, priority),       
                                           optimizer_env = nvl(:13, 
  optimizer_env),                                      bind_data = nvl(:14, 
  bind_data),                                              
  parsing_schema_name = nvl(:15, parsing_schema_name),                        
    executions = nvl(:16, executions),                                        
      elapsed_time = nvl(:17, elapsed_time),                                  
        cpu_time = nvl(:18, cpu_time),                                        
          buffer_gets = nvl(:19, buffer_gets),                                
            disk_reads = nvl(:20, disk_reads),                                
              direct_writes = nvl(:21, direct_writes),                        
                rows_processed = nvl(:22, rows_processed),                    
                  fetches = nvl(:23, fetches),                                
                    end_of_fetch_count = nvl(:24, end_of_fetch_count),        
                      task_id = nvl(:25, task_id),                            
                        task_exec_name = nvl(:26, task_exec_name),            
                          task_obj_id = nvl(:27, task_obj_id),                
                            task_fnd_id = nvl(:28, task_fnd_id),              
                              task_rec_id = nvl(:29, task_rec_id),            
                                flags = 0,                                    
                                  spare1 = null,                              
                                    spare2 = null                             
                           WHERE signature = :30                              
                                   AND category = :31                         
                                     AND obj_type = :32                       
                                       AND plan_id = :33                      
                                 WHEN NOT MATCHED THEN                        
                                     INSERT (signature, category, obj_type, 
  plan_id,                                       description, creator, origin,
   version,                                        created, last_modified, 
  last_verified, parse_cpu_time,                        optimizer_cost, 
  module, action, priority,                                     optimizer_env,
   bind_data, parsing_schema_name, executions,                    
  elapsed_time, cpu_time, buffer_gets, disk_reads,                            
    direct_writes, rows_processed, fetches,end_of_fetch_count,                
      task_id, task_exec_name, task_obj_id, task_fnd_id,                      
        task_rec_id, flags, spare1, spare2)                                   
  VALUES (:34, :35, :36, :37,                                                 
            :38, :39, :40, :41,                                               
              :42, :43, null, null,                                           
                :44, :45, :46, :47,                                           
                  :48, :49, :50, :51,                                         
                    :52, :53, :54, :55,                                       
                      :56, :57, :58, :59,                                     
                        :60, :61, :62, :63,                                   
                          :64, 0, null, null)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     6504      0.47       0.49          0          0          0           0
Execute   6504     66.69      70.05        429    1631914      14302        6504
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    13008     67.16      70.55        429    1631914      14302        6504

Misses in library cache during parse: 1
Misses in library cache during execute: 2
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  MERGE  SQLOBJ$AUXDATA (cr=3 pr=0 pw=0 time=810 us)
      1   VIEW  (cr=2 pr=0 pw=0 time=134 us)
      1    MERGE JOIN OUTER (cr=2 pr=0 pw=0 time=122 us cost=18 size=784646 card=2138)
      1     TABLE ACCESS FULL DUAL (cr=2 pr=0 pw=0 time=91 us cost=2 size=2 card=1)
      0     VIEW  (cr=0 pr=0 pw=0 time=15 us cost=16 size=780370 card=2138)
      0      FILTER  (cr=0 pr=0 pw=0 time=2 us)
      0       TABLE ACCESS FULL SQLOBJ$AUXDATA (cr=0 pr=0 pw=0 time=0 us cost=20 size=780370 card=2138)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                        2        0.00          0.00
  db file sequential read                       429        0.29          2.56
  buffer busy waits                              58        0.00          0.00
  latch: cache buffers chains                    25        0.00          0.00
  log file switch completion                      1        0.05          0.05
  latch: object queue header operation            1        0.00          0.00
  library cache: mutex X                          2        0.00          0.00
********************************************************************************

SQL ID: 1vxm21mhmgy07
Plan Hash: 984916430
MERGE INTO sqlobj$data                                                        
  USING dual ON (:1 IS NULL)                                                  
    WHEN MATCHED THEN                                                         
        UPDATE SET comp_data = :2                                             
          WHERE signature = :3                                                
                  AND category = :4                                           
                    AND obj_type = :5                                         
                      AND plan_id = :6                                        
                WHEN NOT MATCHED THEN                                         
                    INSERT (signature, category, obj_type, plan_id, comp_data,
                              spare1, spare2)                                 
                        VALUES (:7, :8, :9, :10, :11, null, null)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     6053      0.35       0.36          0          0          0           0
Execute   6053     38.42     192.12      14420     500403     235414        6053
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    12106     38.78     192.48      14420     500403     235414        6053

Misses in library cache during parse: 1
Misses in library cache during execute: 2
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  MERGE  SQLOBJ$DATA (cr=43 pr=2 pw=1 time=8478 us)
      1   VIEW  (cr=2 pr=0 pw=0 time=140 us)
      1    MERGE JOIN OUTER (cr=2 pr=0 pw=0 time=131 us cost=9 size=303596 card=2138)
      1     TABLE ACCESS FULL DUAL (cr=2 pr=0 pw=0 time=109 us cost=2 size=2 card=1)
      0     VIEW  (cr=0 pr=0 pw=0 time=5 us cost=7 size=299320 card=2138)
      0      FILTER  (cr=0 pr=0 pw=0 time=3 us)
      0       INDEX FAST FULL SCAN SQLOBJ$DATA_PKEY (cr=0 pr=0 pw=0 time=0 us cost=6 size=299320 card=2138)(object id 205)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                      8367        1.04         43.76
  flashback log file sync                      7828        0.26         25.34
  direct path write                            6052        0.47         36.62
  direct path read                             6045        0.42         16.34
  enq: HW - contention                         1138        0.89         24.42
  Disk file operations I/O                       12        0.00          0.00
  latch: cache buffers chains                   860        0.00          0.03
  buffer busy waits                             764        1.00          6.00
  read by other session                          24        0.03          0.34
  latch: enqueue hash chains                      2        0.00          0.00
  enq: TX - index contention                     57        0.05          0.21
  buffer deadlock                                11        0.00          0.00
  latch free                                      2        0.00          0.00
  library cache: mutex X                          8        0.00          0.00
  latch: cache buffers lru chain                  4        0.00          0.00
  log file switch (checkpoint incomplete)         3        0.14          0.28
  enq: TX - allocate ITL entry                    1        0.00          0.00
  latch: undo global data                         1        0.00          0.00

.....


SQL ID: 1zatw56z32vcg
Plan Hash: 1469453527
SELECT obj_type, plan_id, description, creator, origin,                       
         created, last_modified                                               
    
FROM
 sqlobj$auxdata                                                           
  WHERE signature = :1                                                        
          AND category = :2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     6505      0.10       0.12          0          0          0           0
Execute   6505      0.91       0.96          0          0          0           0
Fetch    13010      0.30       0.29          0      26020          0        6505
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    26020      1.32       1.38          0      26020          0        6505

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  TABLE ACCESS BY INDEX ROWID SQLOBJ$AUXDATA (cr=4 pr=0 pw=0 time=87 us cost=2 size=315 card=1)
      1   INDEX RANGE SCAN I_SQLOBJ$AUXDATA_PKEY (cr=3 pr=0 pw=0 time=51 us cost=1 size=0 card=1)(object id 209)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  latch: cache buffers chains                     5        0.00          0.00
********************************************************************************

SQL ID: bt51szvfrcdpn
Plan Hash: 1876815449
SELECT obj_type, plan_id, comp_data                                           
FROM
 sqlobj$data                                                              
  WHERE signature = :1                                                        
          AND category = :2


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     6504      0.13       0.12          0          0          0           0
Execute   6504      0.82       0.82          0          0          0           0
Fetch    13008      0.28       0.32          0      19516          0        6504
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    26016      1.24       1.27          0      19516          0        6504

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  INDEX RANGE SCAN SQLOBJ$DATA_PKEY (cr=3 pr=0 pw=0 time=89 us cost=1 size=140 card=1)(object id 205)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  latch: cache buffers chains                    17        0.00          0.00
********************************************************************************

SQL ID: 1vxm21mhmgy07
Plan Hash: 2462756431
MERGE INTO sqlobj$data                                                        
  USING dual ON (:1 IS NULL)                                                  
    WHEN MATCHED THEN                                                         
        UPDATE SET comp_data = :2                                             
          WHERE signature = :3                                                
                  AND category = :4                                           
                    AND obj_type = :5                                         
                      AND plan_id = :6                                        
                WHEN NOT MATCHED THEN                                         
                    INSERT (signature, category, obj_type, plan_id, comp_data,
                              spare1, spare2)                                 
                        VALUES (:7, :8, :9, :10, :11, null, null)


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse     6523      0.18       0.19          0          0          0           0
Execute   6523      3.15       8.90       1059      34474      17486         452
Fetch     6071      0.22       0.20          0      12142          0        6071
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    19117      3.55       9.30       1059      46616      17486        6523

Misses in library cache during parse: 0
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  MERGE  SQLOBJ$DATA (cr=79 pr=3 pw=1 time=16359 us)
   2440   VIEW  (cr=43 pr=0 pw=0 time=15141 us)
   2440    MERGE JOIN OUTER (cr=43 pr=0 pw=0 time=13065 us cost=9 size=303596 card=2138)
      1     TABLE ACCESS FULL DUAL (cr=2 pr=0 pw=0 time=59 us cost=2 size=2 card=1)
   2440     VIEW  (cr=41 pr=0 pw=0 time=7394 us cost=7 size=299320 card=2138)
   2440      FILTER  (cr=41 pr=0 pw=0 time=6050 us)
   2440       INDEX FAST FULL SCAN SQLOBJ$DATA_PKEY (cr=41 pr=0 pw=0 time=3971 us cost=6 size=299320 card=2138)(object id 205)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  db file sequential read                       607        0.09          1.96
  direct path read                              450        0.03          0.51
  flashback log file sync                       592        0.01          1.51
  direct path write                             452        0.02          1.02
  latch: cache buffers chains                    72        0.00          0.00
  buffer busy waits                              54        0.00          0.01
  enq: HW - contention                           70        0.04          0.61
  enq: TX - index contention                      8        0.00          0.00
  read by other session                           1        0.01          0.01
********************************************************************************

SQL ID: 44z7snw61qx9x
Plan Hash: 1133963577

*** 2011-03-01 11:38:27.933
MERGE INTO sqlobj$auxdata                        
                               USING dual ON (:1 IS NULL)                     
                                 WHEN MATCHED THEN                            
                                     UPDATE SET description = :2,             
                                                  creator = nvl(:3, creator), 
                                                    origin = :4,              
                                                      version = :5,           
                                                        created = :6,         
                                                          last_modified = :7, 
                                                            last_verified = 
  nvl(:8, last_verified),                                       
  parse_cpu_time = null,                                                      
    optimizer_cost = nvl(:9, optimizer_cost),                                 
      module = nvl(:10, module),                                              
        action = nvl(:11, action),                                            
          priority = nvl(:12, priority),                                      
            optimizer_env = nvl(:13, optimizer_env),                          
              bind_data = nvl(:14, bind_data),                                
                parsing_schema_name = nvl(:15, parsing_schema_name),          
                  executions = nvl(:16, executions),                          
                    elapsed_time = nvl(:17, elapsed_time),                    
                      cpu_time = nvl(:18, cpu_time),                          
                        buffer_gets = nvl(:19, buffer_gets),                  
                          disk_reads = nvl(:20, disk_reads),                  
                            direct_writes = nvl(:21, direct_writes),          
                              rows_processed = nvl(:22, rows_processed),      
                                fetches = nvl(:23, fetches),                  
                                  end_of_fetch_count = nvl(:24, 
  end_of_fetch_count),                            task_id = nvl(:25, task_id),
                                                    task_exec_name = nvl(:26, 
  task_exec_name),                                    task_obj_id = nvl(:27, 
  task_obj_id),                                          task_fnd_id = 
  nvl(:28, task_fnd_id),                                          task_rec_id 
  = nvl(:29, task_rec_id),                                          flags = 0,
                                                                      spare1 =
   null,                                                                
  spare2 = null                                                      WHERE 
  signature = :30                                                             
    AND category = :31                                                        
      AND obj_type = :32                                                      
        AND plan_id = :33                                                     
  WHEN NOT MATCHED THEN                                                       
      INSERT (signature, category, obj_type, plan_id,                         
                description, creator, origin, version,                        
                  created, last_modified, last_verified, parse_cpu_time,      
                    optimizer_cost, module, action, priority,                 
                      optimizer_env, bind_data, parsing_schema_name, 
  executions,                    elapsed_time, cpu_time, buffer_gets, 
  disk_reads,                              direct_writes, rows_processed, 
  fetches,end_of_fetch_count,                    task_id, task_exec_name, 
  task_obj_id, task_fnd_id,                            task_rec_id, flags, 
  spare1, spare2)                                   VALUES (:34, :35, :36, 
  :37,                                                           :38, :39, 
  :40, :41,                                                           :42, 
  :43, null, null,                                                         
  :44, :45, :46, :47,                                                         
    :48, :49, :50, :51,                                                       
      :52, :53, :54, :55,                                                     
        :56, :57, :58, :59,                                                   
          :60, :61, :62, :63,                                                 
            :64

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0        252          2           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2      0.00       0.00          0        252          2           1

Misses in library cache during parse: 0
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 2)

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  MERGE  SQLOBJ$AUXDATA (cr=252 pr=0 pw=0 time=4892 us)
   2440   VIEW  (cr=252 pr=0 pw=0 time=7181 us)
   2440    MERGE JOIN OUTER (cr=252 pr=0 pw=0 time=5835 us cost=18 size=784646 card=2138)
      1     TABLE ACCESS FULL DUAL (cr=2 pr=0 pw=0 time=71 us cost=2 size=2 card=1)
   2440     VIEW  (cr=250 pr=0 pw=0 time=4912 us cost=16 size=780370 card=2138)
   2440      FILTER  (cr=250 pr=0 pw=0 time=3808 us)
   2440       TABLE ACCESS FULL SQLOBJ$AUXDATA (cr=250 pr=0 pw=0 time=2953 us cost=20 size=780370 card=2138)

********************************************************************************

SQL ID: bsa0wjtftg3uw
Plan Hash: 2020579421
select file# 
from
 file$ where ts#=:1


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch       15      0.00       0.00          0         29          0          14
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       17      0.00       0.00          0         29          0          14

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: CHOOSE
Parsing user id: SYS   (recursive depth: 3)

Rows     Row Source Operation
-------  ---------------------------------------------------
     14  TABLE ACCESS BY INDEX ROWID FILE$ (cr=29 pr=0 pw=0 time=88 us cost=1 size=35 card=5)
     14   INDEX RANGE SCAN I_FILE2 (cr=15 pr=0 pw=0 time=151 us cost=1 size=0 card=5)(object id 44)




********************************************************************************

OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        2      0.01       0.01          0          0          0           0
Execute      2      1.08       1.14          0          0          1           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      1.09       1.15          0          0          1           0

Misses in library cache during parse: 1

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  log file sync                                   3        0.06          0.07
  SQL*Net message to client                       6        0.00          0.00
  SQL*Net message from client                     6       27.68         47.53
  reliable message                                1        0.00          0.00
  Disk file operations I/O                        3        0.00          0.00
  Parameter File I/O                              8        0.01          0.02
  SQL*Net break/reset to client                   2        0.00          0.00
  direct path read                             6503        0.75         18.96
  latch: cache buffers chains                    78        0.00          0.00
  buffer busy waits                             103        0.00          0.00
  library cache: mutex X                         17        0.00          0.00


OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse    39047      1.64       1.70          0          0          1           0
Execute  45549    158.03     342.53      22465    2652285     286877       19521
Fetch    39115      1.12       1.16          1      84889          0       32634
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total   123711    160.80     345.40      22466    2737174     286878       52155

Misses in library cache during parse: 21
Misses in library cache during execute: 17

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  Disk file operations I/O                       15        0.00          0.00
  db file sequential read                      9457        1.04         49.35
  flashback log file sync                      8420        0.26         26.86
  direct path write                            6504        0.47         37.64
  direct path read                             6495        0.42         16.86
  enq: HW - contention                         1208        0.89         25.03
  latch: cache buffers chains                  1024        0.00          0.04
  buffer busy waits                             941        1.00          6.02
  read by other session                          25        0.03          0.36
  latch: enqueue hash chains                      2        0.00          0.00
  library cache: mutex X                         16        0.00          0.00
  enq: TX - index contention                     65        0.05          0.21
  buffer deadlock                                11        0.00          0.00
  log file switch completion                      5        0.24          0.50
  latch free                                      2        0.00          0.00
  latch: shared pool                              8        0.00          0.00
  latch: object queue header operation            1        0.00          0.00
  latch: cache buffers lru chain                  4        0.00          0.00
  log file switch (checkpoint incomplete)         3        0.14          0.28
  enq: TX - allocate ITL entry                    1        0.00          0.00
  latch: undo global data                         1        0.00          0.00

    4  user  SQL statements in session.
19540  internal SQL statements in session.
19544  SQL statements in session.
********************************************************************************


....

       1  session in tracefile.
       4  user  SQL statements in trace file.
   19540  internal SQL statements in trace file.
   19544  SQL statements in trace file.
      20  unique SQL statements in trace file.
 3288358  lines in trace file.
     394  elapsed seconds in trace file.

Plans gone AWRy

This is a follow-up to a database performance problem which happened a couple of weeks back, first briefly mentioned in this previous article of mine.

You’ve probably heard it a thousand times before and there’s no significant variance in this case nor revelation in resolution:

  • The Business have reported a performance problem.
  • Some reports which normally take a couple of minutes have been taking closer to one hour.
  • Some DBAs / managers / developers know that when this problem has happened previously, if they flushed the shared pool, the problem mysteriously went away – so that’s what they’ve already done.

The first area of comment is the flushing of the shared pool.

Ok, so it’s a bit of a case of using a sledgehammer to crack a nut but the business have deadlines. With production performance problems and business deadlines, sometimes you’ve got no choice.

Then again, nobody knew it would work – fortunately, from some perspectives, it did work. But, until the root cause is identified and fixed, it’s always going to be uncertain. And “flush the shared pool” is always going to be the battle cry of the panicked, like “reboot your PC”.

Clearly, this is a system that can cope with a flushing of the shared pool. Not true of many.

So, the shared pool has been flushed and we’ve thrown the baby – the evidence & some of the causes – out with the bath water.

However, if you’re appropriately licensed on 10g onwards, you’ve got AWR and ASH to fall back on.

Quickly though, how long should your AWR retention be?

The default is 7 days.
But, as Doug points out in one of his comments, in some organisations by the time you’ve got access five or six days might have passed.

35 and 42 days are oft-mentioned sensible suggestions, mostly so that a) you’ve got a decent time buffer to react and b) you’ve got significant cover to allow for most business cycles of activity.

It’s set to 7 here which is a bit of pain.
Fortunately, I got access to the data within a couple of hours of requesting an account. But even so, I was working on other stuff at the same time and would occasionally think of things to double check but it was too late – the AWR and ASH data was long gone (also I’m writing this article about two weeks after I originally intended to and I wanted to post the queries and data from the actual problem but what with the disruption of the snow and Christmas, the data to match the queries has gone).

Anyway, you’ve been told something has been slow, the problem’s over now, the pool’s flushed, but people want an explanation – there’s an invASHtigation to be done.

In general, when I get access to production, I get data dictionary access. No access to underlying objects, no access to GUIs.

This is part of what I call a blind investigation.
I’m new to the system so I don’t know much about it, I’m not familiar with the SQL or objects, etc.
But I might have looked at some Statspack or AWR reports to get a general picture of activity.

DBA_HIST_ACTIVE_SESS_HISTORY is nearly always my starting place, or V$ACTIVE_SESSION_HISTORY if the data has not been flushed yet, to look for non-standard waiting.

First up though, you need to have asked some basic questions:

  • Is/Was the problem system-wide or limited to a few sessions?
  • If limited, limited to who? Connecting as which user? Connecting via which application?
  • What was the timeframe of the problem

If the problem is system-wide, identifying it from looking at the ASH data for particular sessions might be tricky.

If the database is not the source of the problem, then using AWR and ASH might not give any clues.

It’s not terribly scientific, but paging quickly through and aggregating the ASH data can often give a good “feel” for whether the database was experiencing any non-standard extra time in any particular timeframe (although comparing system-level pictures of activity like statspack and AWR reports might be better).

In this case, I opened up with a query a bit like this, targetting a small timeframe when the problem was almost certainly to have been occurring for the end-user:

select to_char(sample_time,'HH24:MI:SS') time, session_id sid
, session_serial# s#, user_id usr, session_state state
, time_waited waited, blocking_session blk_by, blocking_session_status blk_stat
, sql_id, sql_child_number child_id, sql_plan_hash_value hash
, sql_plan_operation plan_op
, event, p1text, p1, p2text, p2, p3text p3, wait_class
, top_level_sql_id top_id, sql_exec_start ex_start
, plsql_entry_object_id pl_obj, plsql_entry_subprogram_id pl_prog
from   dba_hist_active_sess_history
where sample_time between <start_time> and <end_time>
order by time desc;

Then adding a predicate to run a similar query targetted on the particular user/s experiencing the problem:

and user_id = <user_id>

Followed by zooming in on a particular session if one stands out in the data above:

and session_id = <session_id>
and  session_serial# = <serial#>

With this particular performance problem, it was quickly obvious that this was a case of good plan gone bad with the affected sessions not blocked or waiting but mostly ON CPU running the same SQL_ID and the same SQL_PLAN_HASH_VALUE.

By changing the query to look at counts of a particular hash, it might be possible to find if it’s a particular plan hash that is causing a problem and when it came along:

select sql_plan_hash_value, COUNT(*), MIN(sql_exec_start), MAX(sql_exec_start)
from   dba_hist_active_sess_history
where  sql_id = <sql_id>
group by sql_plan_hash_value;

This query revealed a big skew towards a particular plan which had first appeared in the sampled data at 6 in the morning that day.

Another view which may reinforce your thinking here is DBA_HIST_SQL_STAT. For example, you can get a rough idea of the relative work of one plan over another using a query like this:

select plan_hash_value, optimizer_cost, optimizer_mode, optimizer_env_hash_value, buffer_gets_delta/executions_delta
from   dba_hist_sqlstat
where sql_id = <sql_id>
and   executions_delta > 0;

The plans were gone out of memory but there’s a handy function to pull the plan from AWR:

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_AWR(<sql_id>,<plan_hash_value>,<db_id>,<format>));

Which works really well as long as you don’t hit the display issues you can get with large plans.

If you do hit these display issues then you might prefer to pull your plans from dba_hist_sql_plan for example using a variation of Tanel’s script here.

select
    id      xms_id,
    lpad(' ',depth*1,' ')||operation || ' ' || options xms_plan_step, 
    object_name     xms_object_name,
    cost        xms_opt_cost,
    cardinality xms_opt_card,
    bytes       xms_opt_bytes,
    optimizer   xms_optimizer
from 
    dba_hist_sql_plan 
where 
    plan_hash_value = <plan_hash>
order by id

One big, big downside of pulling the plan from AWR and not from v$sql_plan is that the access_predicates and filter_predicates columns are not populated. The columns are there but the information has not been copied across. A big oversight in the implementation surely.

Having got a bad plan and some not-so-bad plans (same sql id, different plan hash values), then we can start to look at the differences and what might have caused them.

I’m having trouble with uploading and/or presenting these plans in a way that works. WordPress won’t let me upload text files, the plans are 3000-odd lines long and the various options that I’ve tried so far do not give a satisfactory result.

Actual plans aside, suffice to say that there are significant differences between good and bad plans with the bad plans generally employing full table scans against a big table rather than index lookups but there can be an awful lot of distractions in a big plan.

So … so far… good plan, bad plan, same sql, obvious differences. Why? Where else can we look?

There’s one more place which might help – DBA_HIST_SQLBIND.
This should have captured the peeked binds used when our bad plan was parsed along with any binds captured at regular intervals. Unfortunately, we can only tie in with the SQL_ID as there’s no sql plan hash value here so we’re really looking at LAST_CAPTURED in conjunction with the dates/times from the other views.

So. In conclusion…

The details are a bit vague and I apologise for this. I took too long before starting this article and ended up with a good illustration of why 7 days is too short for your AWR retention. However, AWR & ASH can be a very good way of identifying the causes of plan changes.

In this case, it seems highly likely that a particular bit of sql got a bad plan.
It might not have been bad for the particular combination of binds which were peeked at at parse time, but at least bad for the majority of uses that day.

There is, sadly, still just an element of theory about the causes of this particular performance problem.
I can see the bad plan. I can see the plans which work better in that the performance is acceptable for everyone (better does not necessarily mean good). I can see the binds which produced the better plans. I can see the binds which produced the band plan. But, as yet, running the query with the binds which produced the bad plan is not producing the bad plan :(.

And that’s the outstanding problem – if you can’t reproduce something at will, how do you know if you’ve fixed something? Can you prove you’ve understood the whole problem? You can’t know. Not for sure.

I hope to figure out the best way to upload these plans in another article.

This particular problem is almost certainly a case of bind variable peeking – arguably where bind variables shouldn’t be used. On a reporting system where the reports typically take some minutes, the overhead of hard parsing shouldn’t really be a consideration – this seems to be a system where the mantra “you must use bind variables” has been applied inappropriately. The reports take in date parameters which might be historic date ranges or might just be the latest month – a case of skew, a case of one plan does not suit all, a case for literals.

Any ideas, thoughts, mistakes spotted, how to post big plans, let me know.

Some other ASH / AWR resources:
Tracking the bind variable – Dion Cho
Doug’s AWR tagged posts
Doug’s ASH tagged posts
Tracking Parameter Changes – Kerry Osborne

Follow

Get every new post delivered to your Inbox.

Join 75 other followers