Did it really fix it 1: OFE & nasty FIRST_ROWS plan on 11.2.0.4

Plan degradations on upgrade are normal.

This one’s no different.

On further investigation, it turned out the application was setting optimizer_mode = first_rows somewhere.

First point about this is that first_rows really shouldn’t be used either as a hint or as an optimizer_mode.

What does FIRST_ROWS mean?
From 11g doco:

The optimizer uses a mix of costs and heuristics to find a best plan for fast delivery of the first few rows.

If any sort of FIRST_ROWS-type intervention is required, then it would be better to use a more modern, fully-costed FIRST_ROWS_n (where n = 1 | 10 | 100 | 1000).
If it’s even required.
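
For illustration, a rough sketch of those more modern alternatives (the value 100, the table and the predicate here are just examples, not from the problem system):

-- fully costed, at session level (100 is just an example value)
alter session set optimizer_mode = first_rows_100;

-- or per statement (t1.col1 / :b1 are illustrative)
select /*+ first_rows(100) */ *
from   t1
where  t1.col1 = :b1;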

The 12c documentation is clearer about this:

FIRST_ROWS is available for backward compatibility and plan stability; use FIRST_ROWS_n instead.

But I don’t think we should read too much into that statement not being in the 11g documentation.

Here’s the problem plan.

Pre upgrade, a snippet of the plan looked as below.
Please note the position of T1.

-------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                             |    34 | 13668 |  3003   (1)| 00:00:22 |       |       |
|   1 |  SORT ORDER BY                              |                             |    34 | 13668 |  3003   (1)| 00:00:22 |       |       |
|   2 |   NESTED LOOPS OUTER                        |                             |    34 | 13668 |  3002   (1)| 00:00:22 |       |       |
|   3 |    NESTED LOOPS OUTER                       |                             |    34 | 12648 |  2968   (1)| 00:00:22 |       |       |
|   4 |     NESTED LOOPS OUTER                      |                             |    34 | 12274 |  2935   (1)| 00:00:22 |       |       |
|   5 |      NESTED LOOPS OUTER                     |                             |    34 | 11900 |  2867   (1)| 00:00:21 |       |       |
|   6 |       NESTED LOOPS OUTER                    |                             |    34 | 11628 |  2833   (1)| 00:00:21 |       |       |
|   7 |        NESTED LOOPS OUTER                   |                             |    34 | 10404 |  2799   (1)| 00:00:21 |       |       |
|   8 |         NESTED LOOPS OUTER                  |                             |    34 |  8228 |  2765   (1)| 00:00:20 |       |       |
|   9 |          NESTED LOOPS                       |                             |    34 |  6732 |  2731   (1)| 00:00:20 |       |       |
|  10 |           NESTED LOOPS                      |                             |    69 | 12282 |  2524   (1)| 00:00:19 |       |       |
|  11 |            NESTED LOOPS                     |                             |    69 |  9591 |  2386   (1)| 00:00:18 |       |       |
|* 12 |             TABLE ACCESS FULL               | T1                          |  1658 |   213K|   725   (3)| 00:00:06 |       |       |
|* 13 |             TABLE ACCESS BY INDEX ROWID     | T2                          |     1 |     7 |     1   (0)| 00:00:01 |       |       |
|* 14 |              INDEX UNIQUE SCAN              | T2_U1                       |     1 |       |     0   (0)| 00:00:01 |       |       |
|  15 |            TABLE ACCESS BY INDEX ROWID      | T3                          |     1 |    39 |     2   (0)| 00:00:01 |       |       |
|* 16 |             INDEX UNIQUE SCAN               | T3_U1                       |     1 |       |     1   (0)| 00:00:01 |       |       |
|  17 |           TABLE ACCESS BY GLOBAL INDEX ROWID| T4                          |     1 |    20 |     3   (0)| 00:00:01 | ROWID | ROWID |
|* 18 |            INDEX RANGE SCAN                 | T4_N1                       |     1 |       |     2   (0)| 00:00:01 |       |       |

Post upgrade, the plan had degraded to:

--------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name                        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                             |    69 | 27738 |   508M  (3)|999:59:59 |       |       |
|   1 |  NESTED LOOPS OUTER                          |                             |    69 | 27738 |   508M  (3)|999:59:59 |       |       |
|   2 |   NESTED LOOPS OUTER                         |                             |    69 | 26979 |   508M  (3)|999:59:59 |       |       |
|   3 |    NESTED LOOPS OUTER                        |                             |    69 | 26220 |   508M  (3)|999:59:59 |       |       |
|   4 |     NESTED LOOPS OUTER                       |                             |    69 | 23184 |   508M  (3)|999:59:59 |       |       |
|   5 |      NESTED LOOPS OUTER                      |                             |    69 | 22632 |   508M  (3)|999:59:59 |       |       |
|   6 |       NESTED LOOPS OUTER                     |                             |    69 | 20562 |   508M  (3)|999:59:59 |       |       |
|   7 |        NESTED LOOPS OUTER                    |                             |    69 | 16146 |   508M  (3)|999:59:59 |       |       |
|   8 |         NESTED LOOPS                         |                             |    69 | 13662 |   508M  (3)|999:59:59 |       |       |
|   9 |          NESTED LOOPS                        |                             |  1675 |   312K|   508M  (3)|999:59:59 |       |       |
|  10 |           NESTED LOOPS                       |                             |   703K|    39M|  3476K  (1)| 06:54:55 |       |       |
|* 11 |            TABLE ACCESS BY GLOBAL INDEX ROWID| T4                          |   703K|    13M|  2068K  (1)| 04:06:54 | ROWID | ROWID |
|  12 |             INDEX FULL SCAN                  | T4_N2                       |  2240K|       |  7943   (1)| 00:00:57 |       |       |
|  13 |            TABLE ACCESS BY INDEX ROWID       | T3                          |     1 |    39 |     2   (0)| 00:00:01 |       |       |
|* 14 |             INDEX UNIQUE SCAN                | T3_U1                       |     1 |       |     1   (0)| 00:00:01 |       |       |
|* 15 |           TABLE ACCESS FULL                  | T1                          |     1 |   132 |   718   (3)| 00:00:06 |       |       |

In the post-upgrade plan, the most pertinent part was step 15.

A FULL TABLE SCAN as the inner rowsource of a NESTED LOOP; in other words, for every row in the driving rowsource, do a full table scan.

The standard reaction to such a degradation is to test the SQL with OPTIMIZER_FEATURES_ENABLE (OFE).

With OFE set back to the pre-upgrade version, either via the OPT_PARAM hint or via ALTER SESSION, the SQL reverted to the pre-upgrade plan.
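
For reference, the two ways of setting it (the version value and the predicate below are illustrative only):

-- session level ('11.2.0.3' is an illustrative pre-upgrade value)
alter session set optimizer_features_enable = '11.2.0.3';

-- statement level via OPT_PARAM
select /*+ opt_param('optimizer_features_enable' '11.2.0.3') */ *
from   t1
where  t1.col1 = :b1;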

This is good!

But… how much confidence can we have that this actually fixed the problem?

It would be great if we could find a bug which was definitively responsible for this change in execution plan.

OFE is an umbrella but it doesn’t turn off ALL fixes/optimizer features in a release.

Many of the fixes covered by the OFE umbrella are represented in V$SYSTEM_FIX_CONTROL.

So, I went looking for anything which sounded related, went back to the default 11.2.0.4 OFE, and started turning off the BUGNOs one-by-one – no difference, still a bad plan.

Then, still at OFE 11.2.0.4, I turned off all the BUGNOs…. no difference, still a bad plan.
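
The checking and toggling was along these lines, where BUGNO stands in for the actual bug numbers listed for the release:

-- candidate fixes introduced under the 11.2.0.4 OFE umbrella
select bugno, value, description
from   v$system_fix_control
where  optimizer_feature_enable = '11.2.0.4';

-- turning an individual fix off at session level (BUGNO is a placeholder)
alter session set "_fix_control" = 'BUGNO:OFF';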

At this point, my confidence in OFE as a fix for this problem was starting to waver.

So, I tried to reduce the problem SQL to its smallest form which still represented the problem.

alter session set optimizer_mode = first_rows;
alter session set optimizer_features_enable = '11.2.0.4';

 explain plan for
 SELECT *
 FROM   T1
 ,      T3
 ,      T4
 WHERE  T1.COL1             = T3.COL1
 AND    T3.COL2             = T4.COL2
 AND    T1.DATE_COL         ... ;

 select * from table(dbms_xplan.display(format=>'+OUTLINE'));

This distilled statement reproduced the issue at the heart of the original statement:

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |         |  6029 |  4533K|  1612M  (3)|999:59:59 |       |       |
|   1 |  NESTED LOOPS                        |         |  6029 |  4533K|  1612M  (3)|999:59:59 |       |       |
|   2 |   NESTED LOOPS                       |         |  2237K|   795M|  6551K  (1)| 13:01:53 |       |       |
|   3 |    TABLE ACCESS BY GLOBAL INDEX ROWID| T4      |  2240K|   188M|  2068K  (1)| 04:06:53 | ROWID | ROWID |
|   4 |     INDEX FULL SCAN                  | T4_N2   |  2240K|       |  7943   (1)| 00:00:57 |       |       |
|   5 |    TABLE ACCESS BY INDEX ROWID       | T3      |     1 |   285 |     2   (0)| 00:00:01 |       |       |
|*  6 |     INDEX UNIQUE SCAN                | T3_U1   |     1 |       |     1   (0)| 00:00:01 |       |       |
|*  7 |   TABLE ACCESS FULL                  | T1      |     1 |   397 |   718   (3)| 00:00:06 |       |       |
----------------------------------------------------------------------------------------------------------------

Is this test case a 100% indication that OFE wasn’t applicable to the original problem?
Not sure.
But at this point I had little confidence that OFE provided any sort of reliable fix to the original SQL statement.

I don’t know what the internal heuristics are/were for FIRST_ROWS.

But that is a stupid place for a full table scan regardless of the number of rows expected to be returned.

If we remove one of the T1 predicates above, the various row estimates go up and the FTS moves to a driving position.

I could/should get a 10053 trace but FIRST_ROWS shouldn’t be in use in the first place.
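
(For the record, if that trace were wanted, it would just be a sketch like this:)

alter session set events '10053 trace name context forever, level 1';
-- re-run or re-explain the problem statement here
alter session set events '10053 trace name context off';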

Solution – hunt down where the app code is doing the FIRST_ROWS thing and exterminate.
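
A rough starting point for that hunt is to look for cursors parsed under the legacy mode and where they come from:

select sql_id, optimizer_mode, parsing_schema_name, module, executions
from   v$sql
where  optimizer_mode = 'FIRST_ROWS'
order by executions desc;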

All the other possible OPTIMIZER_MODE modes gave a plan for the original SQL which started like this:

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |      |    35 | 14105 |  1386   (2)| 00:00:10 |       |       |
|   1 |  SORT ORDER BY                              |      |    35 | 14105 |  1386   (2)| 00:00:10 |       |       |
|   2 |   NESTED LOOPS OUTER                        |      |    35 | 14105 |  1385   (2)| 00:00:10 |       |       |
|*  3 |    HASH JOIN OUTER                          |      |    35 | 12565 |  1350   (2)| 00:00:10 |       |       |
|*  4 |     HASH JOIN OUTER                         |      |    35 | 11305 |  1315   (2)| 00:00:10 |       |       |
|   5 |      NESTED LOOPS OUTER                     |      |    35 |  9065 |  1280   (2)| 00:00:10 |       |       |
|   6 |       NESTED LOOPS OUTER                    |      |    35 |  8015 |  1245   (2)| 00:00:09 |       |       |
|   7 |        NESTED LOOPS OUTER                   |      |    35 |  7735 |  1210   (2)| 00:00:09 |       |       |
|   8 |         NESTED LOOPS OUTER                  |      |    35 |  7350 |  1176   (2)| 00:00:09 |       |       |
|   9 |          NESTED LOOPS                       |      |    35 |  6965 |  1106   (2)| 00:00:08 |       |       |
|  10 |           NESTED LOOPS                      |      |    69 | 12351 |   899   (3)| 00:00:07 |       |       |
|* 11 |            HASH JOIN                        |      |    69 |  9660 |   761   (3)| 00:00:06 |       |       |
|* 12 |             TABLE ACCESS FULL               | T2   |   237 |  1659 |    36   (6)| 00:00:01 |       |       |
|* 13 |             TABLE ACCESS FULL               | T1   |  1671 |   217K|   725   (3)| 00:00:06 |       |       |

So…. when optimizer upgrade problems strike, you should always try OFE.

But it’s also worth further investigation to confirm that OFE really is a reliable fix for the problem.

SQL Tuning: Thinking in Sets / How and When to be Bushy

Below is a SQL statement from a performance problem I was looking at the other day.

This is a real-world bit of SQL which has been slightly simplified and sanitised but, I hope, without losing the real-worldliness of it or the points driving this article.

You don’t really need to be familiar with the data or table structures (I wasn’t) as this is a commentary on SQL structure and why sometimes a rewrite is the best option.

SELECT bd.trade_id
,      bdp.portfolio_id
,      bd.deal_id 
,      bd.book_id
,      pd.deal_status   prev_deal_status
FROM   deals            bd
,      portfolios       bdp
,      deals            pd
,      portfolios       pdp
-- today's data
WHERE  bd.business_date         = :t_date
AND    bd.src_bus_date          < :t_date
AND    bd.type                  = 'Trade'
AND    bdp.ref_portfolio_id     = bd.book_id
-- yesterday's data
AND    pd.business_date         = :y_date
AND    pd.type                  = 'Trade'
AND    pdp.ref_portfolio_id     = pd.book_id
-- some join columns
AND    bd.trade_id              = pd.trade_id
AND    bdp.portfolio_id         = pdp.portfolio_id;

There is no particular problem with how the SQL statement is written per se.

It is written in what seems to be a standard developer way.

Call it the “lay everything on the table” approach.

This is a common developer attitude:

“Let’s just write a flat SQL structure and let Oracle figure the best way out.”

Hmmm… Let’s look at why this can be a problem.

First, what is the essential requirement of the SQL?

Compare the information (deal status) that we have today with the information that we had yesterday, for a subset of deals/trades.

Something like that anyway…

So … What is the problem?

The Optimizer tends to rewrite and transform any SQL we give it anyway and tries to flatten it out.

The SQL above is already flat so isn’t that a good thing? Shouldn’t there be less work for the optimizer to do?

No, not necessarily. Flat SQL immediately restricts our permutations.

The problem comes with how Oracle can take this flat SQL and join the relevant row sources to efficiently get to the relevant data.

Driving Rowsource

Let’s assume that we should drive from today’s deal statuses (where we actually drive from will depend on what the optimizer estimates / costs).

SELECT ...
FROM   deals            bd
,      portfolios       bdp
...
-- today's data
WHERE  bd.business_date         = :t_date
AND    bd.src_bus_date          < :t_date
AND    bd.type                  = 'Trade'
AND    bdp.ref_portfolio_id     = bd.book_id
....

Where do we go from here?

We want to join from today’s deals to yesterday’s deals.

But each set of deals data is established via a two-table join (DEALS & PORTFOLIOS).

thinking_in_Sets_venn

We want to join on TRADE_ID which comes from the two DEALS tables and PORTFOLIO_ID which comes from the two PORTFOLIOS tables.

SELECT ...
FROM   ...
,      deals            pd
,      portfolios       pdp
WHERE  ...
-- yesterday's data
AND    pd.business_date         = :y_date
AND    pd.type                  = 'Trade'
AND    pdp.ref_portfolio_id     = pd.book_id
...

And the two sets are joined via:

AND    bd.trade_id              = pd.trade_id
AND    bdp.portfolio_id         = pdp.portfolio_id

So from our starting point of today’s business deals, we can either go to PD or to PDP, but not to both at the same time.

Hang on? What do you mean not to both at the same time?

For any multi-table join involving more than two tables, the Optimizer evaluates the different join tree permutations.

Left-Deep Tree

Oracle has a tendency to choose what is called a left-deep tree.

If you think about a join between two rowsources (left and right), a left-deep tree is one where the second child (the right input) is always a table.

NESTED LOOPS are always left-deep.

HASH JOINS can be left-deep or right-deep (normally left-deep, as already mentioned).

Zigzags are also possible, a mixture of left-deep and right-deep.

Below is an image of a left-deep tree based on the four-table join above.

left_tree

Here is an execution plan which that left-deep tree might represent:

----------------------------------------------------------
| Id  | Operation                        | Name          |
----------------------------------------------------------
|   0 | SELECT STATEMENT                 |               |
|   1 |  NESTED LOOPS                    |               |
|   2 |   NESTED LOOPS                   |               |
|   3 |    NESTED LOOPS                  |               |
|   4 |     NESTED LOOPS                 |               |
|*  5 |      TABLE ACCESS BY ROWID       | DEALS         |
|*  6 |       INDEX RANGE SCAN           | DEALS_IDX01   |
|   7 |      TABLE ACCESS BY INDEX ROWID | PORTFOLIOS    |
|*  8 |       INDEX UNIQUE SCAN          | PK_PORTFOLIOS |
|*  9 |     TABLE ACCESS BY INDEX ROWID  | DEALS         |
|* 10 |      INDEX RANGE SCAN            | DEALS_IDX01   |
|* 11 |    INDEX UNIQUE SCAN             | PK_PORTFOLIOS |
|* 12 |   TABLE ACCESS BY INDEX ROWID    | PORTFOLIOS    |
----------------------------------------------------------

Predicate Information (identified by operation id): 
--------------------------------------------------- 
 
   5 - filter("BD"."TYPE"='Trade' AND "BD"."SRC_BUS_DATE"<:t_date) 
   6 - access("BD"."BUSINESS_DATE"=:t_date) 
   8 - access("BD"."BOOK_ID"="BDP"."REF_PORTFOLIO_ID") 
   9 - filter(("BD"."TYPE"='Trade' AND "BD"."TRADE_ID"="PD"."TRADE_ID")) 
  10 - access("PD"."BUSINESS_DATE"=:y_date) 
  11 - access("PD"."BOOK_ID"="PDP"."REF_PORTFOLIO_ID") 
  12 - filter("BDP"."PORTFOLIO_ID"="PDP"."PORTFOLIO_ID")

Right-Deep Tree

A right-deep tree is one where the first child, the left input, is a table.

Illustration not specific to the SQL above:

right_tree

Bushy Tree

For this particular SQL, this is more what we are looking for:

bushy_tree

The essence of the problem is that we cannot get what is called a bushy join, not with the original flat SQL.

The Optimizer cannot do this by default. And this isn’t an approach that we can get at by hinting (nor would we want to hint it even if we could, of course!).

Rewrite Required

To get this bushy plan, we need to rewrite our SQL to be more explicit around the set-based approach required.

WITH subq_curr_deal AS
     (SELECT /*+ no_merge */
             bd.trade_id
      ,      bd.deal_id
      ,      bd.book_id
      ,      bdp.portfolio_id
      FROM   deals      bd
      ,      portfolios bdp
      WHERE  bd.business_date         = :t_date
      AND    bd.src_bus_date          < :t_date
      AND    bd.type                  = 'Trade'
      AND    bdp.ref_portfolio_id     = bd.book_id)
,    subq_prev_deal AS
     (SELECT /*+ no_merge */
             pd.trade_id 
      ,      pd.deal_status
      ,      pdp.portfolio_id
      FROM   deals      pd
      ,      portfolios pdp
      WHERE  pd.business_date         = :y_date
      AND    pd.type                  = 'Trade'
      AND    pdp.ref_portfolio_id     = pd.book_id)
SELECT cd.trade_id
,      cd.portfolio_id
,      cd.deal_id
,      cd.book_id 
,      pd.deal_status prev_deal_status
FROM   subq_curr_deal cd
,      subq_prev_deal pd
WHERE  cd.trade_id             = pd.trade_id
AND    cd.portfolio_id         = pd.portfolio_id;

How exactly does the rewrite help?

By writing the SQL with this structure, using WITH to create the subqueries in conjunction with the no_merge hint, we deliberately force the bushy join.

This is an example execution plan that this bushy tree might represent.

----------------------------------------------------------
| Id  | Operation                        | Name          |
----------------------------------------------------------
|   0 | SELECT STATEMENT                 |               |
|*  1 |  HASH JOIN                       |               |
|   2 |   VIEW                           |               |
|   3 |    NESTED LOOPS                  |               |
|   4 |     NESTED LOOPS                 |               |
|*  5 |      TABLE ACCESS BY INDEX ROWID | DEALS         |
|*  6 |       INDEX RANGE SCAN           | DEALS_IDX01   |
|*  7 |      INDEX UNIQUE SCAN           | PK_PORTFOLIOS |
|   8 |     TABLE ACCESS BY INDEX ROWID  | PORTFOLIOS    |
|   9 |   VIEW                           |               |
|  10 |    NESTED LOOPS                  |               |
|  11 |     NESTED LOOPS                 |               |
|* 12 |      TABLE ACCESS BY INDEX ROWID | DEALS         |
|* 13 |       INDEX RANGE SCAN           | DEALS_IDX01   |
|* 14 |      INDEX UNIQUE SCAN           | PK_PORTFOLIOS |
|  15 |     TABLE ACCESS BY INDEX ROWID  | PORTFOLIOS    |
----------------------------------------------------------
 
Predicate Information (identified by operation id): 
--------------------------------------------------- 
                                 
   1 - access("CD"."TRADE_ID"="PD"."TRADE_ID" AND "CD"."PORTFOLIO_ID"="PD"."PORTFOLIO_ID")
   5 - filter(("BD"."TYPE"='Trade' AND "BD"."SRC_BUS_DATE"<:t_date)) 
   6 - access("BD"."BUSINESS_DATE"=:t_date ) 
   7 - access("BD"."BOOK_ID"="BDP"."REF_PORTFOLIO_ID") 
  12 - filter("PD"."TYPE"='Trade') 
  13 - access("PD"."BUSINESS_DATE"=:y_date) 
  14 - access("PDP"."REF_PORTFOLIO_ID"="PD"."BOOK_ID")

Is this a recommendation to go use WITH everywhere?

No.

What about the no_merge hint?

No.
The no_merge hint is a tricky one. This is not necessarily a recommendation, but its usage here prevents the Optimizer from merging the subqueries back into a single flat statement. I often find it goes hand-in-hand with this sort of deliberately structured SQL for that reason, and similar goes for push_pred.

Do developers need to know about left deep, right deep and bushy?

No, not at all.

Takeaways?

It helps to think in sets, to think about which sets of data you are joining, and to recognise when SQL should be deliberately structured.

Further Reading

http://www.oaktable.net/content/right-deep-left-deep-and-bushy-joins
https://tonyhasler.wordpress.com/2008/12/27/bushy-joins/
https://www.toadworld.com/platforms/oracle/b/weblog/archive/2014/06/16/hitchhikers-guide-to-explain-plan-8
https://jonathanlewis.wordpress.com/2007/01/24/left-deep-trees/
https://antognini.ch/top/

What AWR isn’t telling you

It is well-known that AWR, and Statspack before it, takes snapshots of V$ views (or rather the underlying objects) to produce the data in AWR.

It is also well-known that, when considering sql and its statistics in the shared pool, if something big-hitting happens but the big-hitter is no longer in the shared pool by the time of the snapshot, then it can’t be recorded in your AWR picture of activity.

But like many things supposedly well-known, it can still come back and remind you that you’ve forgotten or overlooked this behaviour.

Here is a little illustration which happened earlier this week.

This week I’ve been looking closely at activity which happens on a particular DB between 16:00 and 17:00 because the IO subsystem is showing signs of stress and the timings for ‘Average Synchronous Single-Block Read Latency’ (V$SYSMETRIC_SUMMARY / DBA_HIST_SYSMETRIC_SUMMARY) have pushed above our amber warning levels.

Don’t get me started, but our amber level for a single block read is 20ms!
Way too high for my liking for such a sustained average, but that is apparently the SLA from the SAN.
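
For a quick look at the current value of that metric, something like this does the job:

select begin_time, end_time, round(average,2) average, round(maxval,2) maxval, metric_unit
from   v$sysmetric_summary
where  metric_name = 'Average Synchronous Single-Block Read Latency';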

Why do we have such an alert?
The purpose is two-fold.

  1. It can sometimes be a warning that some of our IO-sensitive batch work might show some performance degradations, threatening SLAs
  2. In the past, it has been an effective warning that a sql statement executed concurrently in multiple threads from the applications has switched to a "bad" execution plan whose impact is system-wide (normally heavy concurrent direct path reads).

So… I was hovering around this system during the relevant timeframe looking at what was going on.

And I observed some heavy hitting queries running in parallel originating from another system (i.e. not my application).

There were two executions of one sql id (0cu2sm062jutc) executing concurrently with a single execution of another sql statement.

The top-level RTSM execution metrics for these three executions were all very similar to this:

Each was executing for about 10 minutes, in parallel, doing a lot of physical direct path reads (38GB) and accumulating over 1 hour of parallel processing time.

 Status              :  DONE (ALL ROWS)          
 Instance ID         :  2                        
 SQL ID              :  0cu2sm062jutc            
 SQL Execution ID    :  33554432                 
 Execution Started   :  10/13/2015 16:03:35      
 First Refresh Time  :  10/13/2015 16:03:37      
 Last Refresh Time   :  10/13/2015 16:13:19      
 Duration            :  584s                     
 Module/Action       :  JDBC Thin Client/-       
 Program             :  JDBC Thin Client         
 Fetch Calls         :  257                      

Global Stats
===================================================================================================================================
| Elapsed |   Cpu   |    IO    | Application | Concurrency | Cluster  |  Other   | Fetch | Buffer | Read  | Read  | Write | Write |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   |  Waits(s)   | Waits(s) | Waits(s) | Calls |  Gets  | Reqs  | Bytes | Reqs  | Bytes |
===================================================================================================================================
|    4041 |     252 |     3782 |        0.06 |        1.75 |     0.97 |     3.95 |   257 |     6M | 75051 |  38GB |  1070 | 125MB |
===================================================================================================================================
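
(For reference, a report like the one above can be pulled with something along these lines, given the relevant licensing; the sql_id and sql_exec_id are taken from the output above:)

select dbms_sqltune.report_sql_monitor
       (sql_id      => '0cu2sm062jutc'
       ,sql_exec_id => 33554432
       ,type        => 'TEXT') rpt
from   dual;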

And then whilst I was fiddling with various queries against V$SQL the queries in question disappeared from V$SQL.

No big surprise of course, stuff is liable to be released from the library cache whenever it sees fit.

But then the remembrance penny hit home and I was left thinking that it was no wonder that I couldn’t see this regularly when looking retrospectively at this timeframe via AWR.

But this cumulative execution time above is larger than that of the number 3 SQL statement in AWR ordered by elapsed time … so I’m not making this up. It’s significant.

This got me thinking – how can we try to identify heavy-hitting SQL which is being missed by AWR – ASH surely? – and what can we do to help – COLORED SQL?

Here is my first attempt at such an ASH query, trying to find queries which have taken longer than 10 minutes (the 600-second filter below) in a single execution within a single snapshot period but which aren’t in AWR.

SELECT * 
FROM   (SELECT h.dbid
        ,      h.instance_number
        ,      h.snap_id
        ,      h.sql_id
        ,      h.sql_exec_id
        ,      h.sql_exec_start
        ,      h.qc_instance_id||'::'||h.qc_session_id||'::'||h.qc_session_serial# px_details
        ,      COUNT(*)
        ,      COUNT(DISTINCT h.session_id)
        --,      MIN(h.snap_id) min_snap
        --,      MAX(h.snap_id) max_snap
        ,      MIN(h.sample_time)
        ,      MAX(h.sample_time)
        ,    ((EXTRACT(DAY    FROM (MAX(h.sample_time) - MIN(h.sample_time)))*86400)
            + (EXTRACT(HOUR   FROM (MAX(h.sample_time) - MIN(h.sample_time)))*3600)
            + (EXTRACT(MINUTE FROM (MAX(h.sample_time) - MIN(h.sample_time)))*60)
            + (EXTRACT(SECOND FROM (MAX(h.sample_time) - MIN(h.sample_time))))) duration
        FROM   dba_hist_active_sess_history h
        ,      dba_hist_snapshot            s
        WHERE  h.dbid            = s.dbid
        AND    h.snap_id         = s.snap_id
        AND    h.instance_number = s.instance_number
        AND    sql_id           IS NOT NULL
        AND    sql_exec_id      IS NOT NULL
        AND    NOT EXISTS (SELECT 1
                           FROM   dba_hist_sqlstat st
                           WHERE  st.dbid            = h.dbid
                           AND    st.snap_id         = h.snap_id --BETWEEN xxx.min_snap AND xxx.max_snap
                           AND    st.instance_number = h.instance_number
                           AND    st.sql_id          = h.sql_id)
        GROUP BY 
               h.dbid
        --
        ,      h.snap_id
        --
        ,      h.instance_number
        ,      h.sql_id
        ,      h.sql_exec_id
        ,      h.sql_exec_start
        ,      h.qc_instance_id
        ,      h.qc_session_id
        ,      h.qc_session_serial#) xxx
WHERE  duration > 600
ORDER BY snap_id DESC, duration DESC;

The main problem with this query is that DURATION does not tell us how long the query has been active in the database.

We could have a query which has been active all that time but was doing relatively little in the DB.

I had a quick look (again) at TM_DELTA_TIME and TM_DELTA_DB_TIME but they are still unreliable – in some of the rows it looks ok, in others not.

Let’s forget that and pretend that the SQL above works ok.

Perhaps we can add a filter WHERE the COUNT(*) is above some threshold, because otherwise this currently gives me 100s of statements in my production database.

That gives me a token 166 statements over the past 30 days.

And that’s a problem because my intention was to use this to identify SQL statements to add to AWR via ADD_COLORED_SQL.
And that has a limit of 100 statements.
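
For reference, the colouring itself is just a one-liner, e.g. for the sql_id from earlier:

exec dbms_workload_repository.add_colored_sql(sql_id => '0cu2sm062jutc');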

Ah… but no, the other penny has just dropped whilst writing this…

If I add colour to this SQL, that will make no difference. That just means AWR will record that SQL if it’s there.

But if it’s not in the shared pool at snapshot time, it’s not in the shared pool so it’s not in AWR.

Duh!

It’s almost as if we need a mechanism whereby, if a sql statement has breached a certain threshold, it won’t be flushed until MMON has written it to AWR.

No, that’s not right. That would present other problems.

I could shorten the AWR snapshot interval but I’m not convinced.
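
(For completeness, that would just be something like the following, interval in minutes; a sketch, not a recommendation:)

-- 30 minutes is just an example value
exec dbms_workload_repository.modify_snapshot_settings(interval => 30);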

I think this is an oft-overlooked yet significant shortcoming.

So, for the moment at least, I’m left with what I think is a problem and I can’t think of a good solution… can you?

Or are you thinking mountain vs molehill?

Quick Cross-Period AWR Comparison

Here’s a query which I find useful in order to have a very quick comparison across AWR snapshots of the high level time model statistics.
The numbers should match those in the associated section in the AWR report.

If you have an obsessive compulsion to tune then you may see some blips here and there which encourage you to dive into the AWR detail for that snapshot.

Or quite often I get in in the morning and there might be an email about slow overnight processes and generic “database has been slow” enquiries, and before I start hunting around to prove or disprove the allegations, this can be a useful frame of reference.

WITH subq_snaps AS
(SELECT dbid                dbid
 ,      instance_number     inst
 ,      snap_id             e_snap
 ,      lag(snap_id) over (partition by instance_number, startup_time order by snap_id) b_snap
 ,      TO_CHAR(begin_interval_time,'D') b_day
 ,      TO_CHAR(begin_interval_time,'DD-MON-YYYY HH24:MI') b_time
 ,      TO_CHAR(end_interval_time,'HH24:MI')   e_time
 ,    ((extract(day    from (end_interval_time - begin_interval_time))*86400)
     + (extract(hour   from (end_interval_time - begin_interval_time))*3600)
     + (extract(minute from (end_interval_time - begin_interval_time))*60)
     + (extract(second from (end_interval_time - begin_interval_time)))) duration
 FROM   dba_hist_snapshot)
SELECT ss.inst
,      ss.b_snap
,      ss.e_snap
,      ss.b_time
,      ss.e_time
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'DB time' THEN em.value - bm.value END)/1000000/60,2),'999999990.99')                                  db_time
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'DB time' THEN em.value - bm.value END)/(ss.duration*1000000),1),'999999990.99')        aas
,      (SELECT round(average,2)
        FROM   dba_hist_sysmetric_summary sm
        WHERE  sm.dbid            = ss.dbid
        AND    sm.snap_id         = ss.e_snap
        AND    sm.instance_number = ss.inst
        AND    sm.metric_name     = 'Average Synchronous Single-Block Read Latency'
        AND    sm.group_id        = 2)                                                                                                                   assbl
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'DB CPU' THEN em.value - bm.value END)/1000000,2),'999999990.99')                                      db_cpu
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'sql execute elapsed time' THEN em.value - bm.value END)/1000000,2),'999999990.99')                    sql_time
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'PL/SQL execution elapsed time' THEN em.value - bm.value END)/1000000,2),'999999990.99')               plsql_time
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'parse time elapsed' THEN em.value - bm.value END)/1000000,2),'999999990.00')                          parse_time
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'failed parse elapsed time' THEN em.value - bm.value END)/1000000,2),'999999990.99')                   failed_parse
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'hard parse (sharing criteria) elapsed time' THEN em.value - bm.value END)/1000000,2),'999999990.99')  hard_parse_sharing
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'RMAN cpu time (backup/restore)' THEN em.value - bm.value END)/1000000,2),'999999990.99')              rman_cpu
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'connection management call elapsed time' THEN em.value - bm.value END)/1000000,2),'999999990.99')     connection_mgmt
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'sequence load elapsed time' THEN em.value - bm.value END)/1000000,2),'999999990.99')                  sequence_load
,      TO_CHAR(ROUND(100*MAX(CASE WHEN bm.stat_name = 'DB CPU' THEN em.value - bm.value END) 
           / NULLIF(MAX(CASE WHEN bm.stat_name = 'DB time' THEN em.value - bm.value END),0),2),'999999990.99')                                           db_cpu_perc
,      TO_CHAR(ROUND(100*MAX(CASE WHEN bm.stat_name = 'sql execute elapsed time' THEN em.value - bm.value END)
           / NULLIF(MAX(CASE WHEN bm.stat_name = 'DB time' THEN em.value - bm.value END),0),2),'999999990.99')                                           sql_time_perc
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'PL/SQL execution elapsed time' THEN em.value - bm.value END)
           / NULLIF(MAX(CASE WHEN bm.stat_name = 'DB time' THEN em.value - bm.value END),0),2),'999999990.99')                                           plsql_time_perc
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'parse time elapsed' THEN em.value - bm.value END)
           / NULLIF(MAX(CASE WHEN bm.stat_name = 'DB time' THEN em.value - bm.value END),0),2),'999999990.99')                                           parse_time_perc
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'failed parse elapsed time' THEN em.value - bm.value END)
           / NULLIF(MAX(CASE WHEN bm.stat_name = 'DB time' THEN em.value - bm.value END),0),2),'999999990.99')                                           failed_parse_perc
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'hard parse (sharing criteria) elapsed time' THEN em.value - bm.value END)
           / NULLIF(MAX(CASE WHEN bm.stat_name = 'DB time' THEN em.value - bm.value END),0),2),'999999990.99')                                           hard_parse_sharing_perc
,      TO_CHAR(ROUND(MAX(CASE WHEN bm.stat_name = 'RMAN cpu time (backup/restore)' THEN em.value - bm.value END)
           / NULLIF(MAX(CASE WHEN bm.stat_name = 'DB time' THEN em.value - bm.value END),0),2),'999999990.99')                                           rman_cpu_perc
FROM  subq_snaps              ss
,     dba_hist_sys_time_model em  
,     dba_hist_sys_time_model bm  
WHERE bm.dbid                   = ss.dbid
AND   bm.snap_id                = ss.b_snap  
AND   bm.instance_number        = ss.inst
AND   em.dbid                   = ss.dbid
AND   em.snap_id                = ss.e_snap
AND   em.instance_number        = ss.inst
AND   bm.stat_id                = em.stat_id  
GROUP BY 
       ss.dbid
,      ss.inst
,      ss.b_day
,      ss.b_snap
,      ss.e_snap
,      ss.b_time
,      ss.e_time
,      ss.duration
--HAVING b_day NOT IN (6,7)
--AND    inst = 2 
--AND b_snap = 18673
--AND    e_time = '17:00'
ORDER BY b_snap DESC;

Also, here is a similar query for comparing the different IO metrics.
Again, the numbers should match those in AWR.

WITH subq_snaps AS
(SELECT dbid                dbid
 ,      instance_number     inst
 ,      snap_id             e_snap
 ,      lag(snap_id) over (partition by instance_number, startup_time order by snap_id) b_snap
 ,      TO_CHAR(begin_interval_time,'D') b_day
 ,      TO_CHAR(begin_interval_time,'DD-MON-YYYY HH24:MI') b_time
 ,      TO_CHAR(end_interval_time,'HH24:MI')   e_time
 ,    ((extract(day    from (end_interval_time - begin_interval_time))*86400)
     + (extract(hour   from (end_interval_time - begin_interval_time))*3600)
     + (extract(minute from (end_interval_time - begin_interval_time))*60)
     + (extract(second from (end_interval_time - begin_interval_time)))) duration
 FROM   dba_hist_snapshot)
,    io_stats AS
(SELECT ss.*
 ,      bv.event_name
 ,      ev.time_waited_micro_fg - bv.time_waited_micro_fg time_waited_micro
 ,      ev.total_waits_fg       - bv.total_waits_fg       waits
 FROM   subq_snaps            ss
 ,      dba_hist_system_event bv
 ,      dba_hist_system_event ev
 WHERE  bv.dbid                   = ss.dbid
 AND    bv.snap_id                = ss.b_snap  
 AND    bv.instance_number        = ss.inst
 AND    bv.event_name            IN ('db file sequential read','direct path read','direct path read temp','db file scattered read','db file parallel read')
 AND    ev.dbid                   = ss.dbid
 AND    ev.snap_id                = ss.e_snap
 AND    ev.instance_number        = ss.inst
 AND    ev.event_id               = bv.event_id)
SELECT io.dbid
,      io.inst
,      io.b_snap
,      io.e_snap
,      io.b_time
,      io.e_time
,      (SELECT ROUND(average,2)
        FROM   dba_hist_sysmetric_summary sm
        WHERE  sm.dbid            = io.dbid
        AND    sm.snap_id         = io.e_snap
        AND    sm.instance_number = io.inst
        AND    sm.metric_name     = 'Average Synchronous Single-Block Read Latency'
        AND    sm.group_id        = 2) assbl
,      MAX(CASE WHEN event_name = 'db file sequential read' THEN ROUND(CASE WHEN waits < 0 THEN NULL ELSE waits END) END) single_waits
,      MAX(CASE WHEN event_name = 'db file scattered read'  THEN ROUND(CASE WHEN waits < 0 THEN NULL ELSE waits END) END) multi_waits
,      MAX(CASE WHEN event_name = 'db file parallel read'   THEN ROUND(CASE WHEN waits < 0 THEN NULL ELSE waits END) END) prefch_wait
,      MAX(CASE WHEN event_name = 'direct path read'        THEN ROUND(CASE WHEN waits < 0 THEN NULL ELSE waits  END) END) direct_waits
,      MAX(CASE WHEN event_name = 'direct path read temp'   THEN ROUND(CASE WHEN waits < 0 THEN NULL ELSE waits END) END)  temp_waits
,      MAX(CASE WHEN event_name = 'db file sequential read' THEN ROUND(CASE WHEN waits < 0 THEN NULL ELSE waits END/duration) END) iops_single
,      MAX(CASE WHEN event_name = 'db file sequential read' THEN ROUND(CASE WHEN time_waited_micro/1000/1000 < 0 THEN NULL ELSE time_waited_micro/1000/1000 END) END) single_secs_total
,      MAX(CASE WHEN event_name = 'db file sequential read' THEN ROUND((time_waited_micro/1000)/NULLif(waits,0)) END) single_avg
,      MAX(CASE WHEN event_name = 'db file scattered read'  THEN ROUND(CASE WHEN waits < 0 THEN NULL ELSE waits END/duration) END) iops_multi
,      MAX(CASE WHEN event_name = 'db file scattered read'  THEN ROUND(CASE WHEN time_waited_micro/1000/1000 < 0 THEN NULL ELSE time_waited_micro/1000/1000 END) END) multi_secs_total
,      MAX(CASE WHEN event_name = 'db file scattered read'  THEN ROUND((time_waited_micro/1000)/NULLif(waits,0)) END) multi_avg
,      MAX(CASE WHEN event_name = 'db file parallel read'   THEN ROUND(CASE WHEN waits < 0 THEN NULL ELSE waits END/duration) END) iops_prefch
,      MAX(CASE WHEN event_name = 'db file parallel read'   THEN ROUND(CASE WHEN time_waited_micro/1000/1000 < 0 THEN NULL ELSE time_waited_micro/1000/1000 END) END) prefch_secs_total
,      MAX(CASE WHEN event_name = 'db file parallel read'   THEN ROUND((time_waited_micro/1000)/NULLif(waits,0)) END) prefch_avg
,      MAX(CASE WHEN event_name = 'direct path read'        THEN ROUND(CASE WHEN waits < 0 THEN NULL ELSE waits  END/duration) END) iops_direct
,      MAX(CASE WHEN event_name = 'direct path read'        THEN ROUND(CASE WHEN time_waited_micro/1000/1000 < 0 THEN NULL ELSE time_waited_micro/1000/1000 END) END) direct_secs_total
,      MAX(CASE WHEN event_name = 'direct path read'        THEN ROUND((time_waited_micro/1000)/NULLif(waits,0)) END) direct_avg
,      MAX(CASE WHEN event_name = 'direct path read temp'   THEN ROUND(CASE WHEN waits < 0 THEN NULL ELSE waits END/duration) END) iops_temp
,      MAX(CASE WHEN event_name = 'direct path read temp'   THEN ROUND(CASE WHEN time_waited_micro/1000/1000 < 0 THEN NULL ELSE time_waited_micro/1000/1000 END) END) temp_secs_total
,      MAX(CASE WHEN event_name = 'direct path read temp'   THEN ROUND((time_waited_micro/1000)/NULLif(waits,0)) END) temp_avg
FROM   io_stats io
GROUP BY 
       io.dbid
,      io.inst
,      io.b_day
,      io.b_snap
,      io.e_snap
,      io.b_time
,      io.e_time
,      io.duration
HAVING b_day NOT IN (6,7)
AND    inst = 2 
--AND b_snap = 18673
--AND    e_time = '17:00'
ORDER BY b_snap DESC;

Finally, one of the queries I used the most for quick checks – comparing top SQL in AWR across periods:

SELECT x.*, (SELECT sql_text from dba_hist_sqltext t where t.sql_id = x.sql_id and rownum = 1) txt
FROM (
SELECT sn.snap_id
,      TO_CHAR(sn.end_interval_time,'DD-MON-YYYY HH24:MI') dt
,      st.sql_id
,      st.instance_number
,      st.parsing_schema_name
,      st.plan_hash_value
,      SUM(st.fetches_delta) fch
,      SUM(rows_processed_delta) rws
,      SUM(executions_delta)     execs
,      ROUND(SUM(elapsed_time_delta)/1000/1000)   elp
,      ROUND(SUM(elapsed_time_delta)/1000/1000/NVL(NULLIF(SUM(executions_delta),0),1),2)   elpe
,      ROUND(SUM(cpu_time_delta)/1000/1000)       cpu
,      SUM(buffer_gets_delta)    gets
,      ROUND(SUM(iowait_delta)/1000/1000)         io
,      ROUND(SUM(clwait_delta)/1000/1000)         cl
,      ROUND(SUM(ccwait_delta)/1000/1000)         cc
,      ROUND(SUM(apwait_delta)/1000/1000)         ap
,      ROUND(SUM(plsexec_time_delta)/1000/1000)   pl
,      ROUND(SUM(disk_reads_delta))         disk_reads
,      ROUND(SUM(direct_writes_delta))        direct_writes
,      ROW_NUMBER() over (PARTITION BY sn.dbid, sn.snap_id, st.instance_number
                          ORDER BY SUM(elapsed_time_delta) desc) rn
FROM   dba_hist_snapshot sn
,      dba_hist_sqlstat  st
WHERE  st.dbid            = sn.dbid
AND    st.snap_id         = sn.snap_id
AND    sn.instance_number = st.instance_number
GROUP BY 
       sn.dbid
,      sn.snap_id
,      sn.end_interval_time
,      st.sql_id
,      st.instance_number
,      st.parsing_schema_name
,      st.plan_hash_value
) x
WHERE rn <= 5
ORDER by snap_id DESC, instance_number, rn;

Let me know if you spot any errors or anomalies or obvious improvements / additions.

SQL Plan Management Choices

My thoughts on SQL plan management decision points are summarised in the flowchart above. SPM SQL Patches are also available, primarily to avoid a specific problem rather than to enforce a particular plan, and are not covered in the flowchart.

Epoch

Note to self because it’s just one of those date/timezone-related topics which just doesn’t seem to stick…

Epoch/Unix time – See https://en.wikipedia.org/wiki/Unix_time

Unix time (also known as POSIX time or erroneously as Epoch time) is a system for describing instants in time, defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970,

Firstly, when converting from an Oracle date or timestamp, we need to work from UTC, not local time.

select systimestamp
,      ((extract( day    from systimestamp - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*24*60*60)
      + (extract( hour   from systimestamp - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*60*60)
      + (extract( minute from systimestamp - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*60)
      + (round(extract( second from systimestamp - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY')))))*1000
        now_epoch
from dual;

SYSTIMESTAMP                         NOW_EPOCH
----------------------------------- -------------
26-JUN-15 11.57.09.634813000 +01:00 1435319830000 

If we plug that epoch into somewhere like http://www.epochconverter.com/ , we can see it’s wrong – it’s 1 hour ahead of where it should be:

Assuming that this timestamp is in milliseconds:
GMT: Fri, 26 Jun 2015 11:57:10 GMT
Your time zone: 26 June 2015 12:57:10 GMT+1:00 DST

That’s because we need to work in UTC using SYS_EXTRACT_UTC, i.e.

with now as
(select systimestamp                  now_ts
 ,      sys_extract_utc(systimestamp) now_utc
 ,      ((extract( day    from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*24*60*60)
       + (extract( hour   from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*60*60)
       + (extract( minute from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*60)
       + (round(extract( second from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY')))))*1000
         now_epoch
 from dual)
select *
from   now;

NOW_TS                              NOW_UTC             NOW_EPOCH
----------------------------------- ------------------ -------------
26-JUN-15 12.03.35.231688000 +01:00 26-JUN-15 11.03.35 1435316626000 

Better!

This came about because there is a table storing epoch/unix time values which originate from Java code, and the developer said that the conversion was losing 1 hour and speculated that the DB might be “unsafe” when dealing with epoch time.

So, let’s convert this number back to a date or a timestamp and crush that notion.

with now as
(select systimestamp                  now_ts
 ,      sys_extract_utc(systimestamp) now_utc
 ,      ((extract( day    from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*24*60*60)
       + (extract( hour   from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*60*60)
       + (extract( minute from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*60)
       + (round(extract( second from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY')))))*1000
         now_epoch
 from dual)
select now_ts
,      now_utc
,      now_epoch
,      TO_TIMESTAMP('01-01-1970','DD-MM-YYYY') + NUMTODSINTERVAL(now_epoch/1000,'SECOND') 
       epoch_back_to_ts
from   now;

NOW_TS                              NOW_UTC             NOW_EPOCH    EPOCH_BACK_TO_TS 
----------------------------------- ------------------ ------------- ------------------
26-JUN-15 12.09.45.671605000 +01:00 26-JUN-15 11.09.45 1435316986000 26-JUN-15 11.09.46 

Our conversion back is still in UTC so we need to convert, and there are numerous ways that we might want to convert that back:

with now as
(select systimestamp                  now_ts
 ,      sys_extract_utc(systimestamp) now_utc
 ,      ((extract( day    from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*24*60*60)
       + (extract( hour   from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*60*60)
       + (extract( minute from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY'))*60)
       + (round(extract( second from sys_extract_utc(systimestamp) - TO_TIMESTAMP('01/01/1970', 'MM/DD/YYYY')))))*1000
         now_epoch
 from dual)
select now_ts
,      now_utc
,      now_epoch
,      TO_TIMESTAMP('01-01-1970','DD-MM-YYYY') + NUMTODSINTERVAL(now_epoch/1000,'SECOND') 
       epoch_back_to_utc
,      CAST(
       FROM_TZ(TO_TIMESTAMP('01-01-1970','DD-MM-YYYY') + NUMTODSINTERVAL(now_epoch/1000,'SECOND'),'UTC')
            AT TIME ZONE 'Europe/London' AS TIMESTAMP) 
       back_to_name
,      CAST(
       FROM_TZ(TO_TIMESTAMP('01-01-1970','DD-MM-YYYY') + NUMTODSINTERVAL(now_epoch/1000,'SECOND'),'UTC')
            AT LOCAL AS TIMESTAMP) 
       back_to_local
,      CAST(
       FROM_TZ(TO_TIMESTAMP('01-01-1970','DD-MM-YYYY') + NUMTODSINTERVAL(now_epoch/1000,'SECOND'),'UTC')
            AT TIME ZONE DBTIMEZONE AS TIMESTAMP) 
       back_to_dblocal
FROM   now;

NOW_TS                              NOW_UTC             NOW_EPOCH     EPOCH_BACK_TO_UTC  BACK_TO_NAME       BACK_TO_LOCAL      BACK_TO_DBLOCAL  
----------------------------------- ------------------ -------------- ------------------ ------------------ ------------------ ------------------
26-JUN-15 12.12.23.936868000 +01:00 26-JUN-15 11.12.23 1435317144000  26-JUN-15 11.12.24 26-JUN-15 12.12.24 26-JUN-15 12.12.24 26-JUN-15 12.12.24 

SQL Plan Baseline Manual Evolution

I’ve never really spent time on evolution because most of the time I use baselines for emergency sql plan management and not with the added value of controlled evolution with which the feature was conceived.

But some observations on sql plan baseline evolution originating from the questions:

  • What does evolution actually do?
  • What information does it use?
  • What happens with DML?

Starting point – one baselined plan

1. Get an FTS plan in memory from a SQL statement which should do a FTS, and put it in a baseline

drop table t1;
drop table t2;

create table t1 
(col1  number 
,col2  varchar2(50) 
,flag  varchar2(2));
 
insert into t1 
select rownum 
,      lpad('X',50,'X') 
,      case when rownum = 1 
            then 'Y1'
            when rownum = 2 
            then 'Y2' 
            when mod(rownum,2) = 0
            then 'N1'
            else 'N2' 
       end 
from   dual 
connect by rownum <= 100000; 

exec dbms_stats.gather_table_stats(USER,'T1',method_opt=>'FOR COLUMNS flag SIZE 5',no_invalidate=>FALSE); 
 
create index i1 on t1 (flag);

var n varchar2(2) 
exec :n := 'N1';
select /*+ domtest */ count(*), max(col2) from t1 where flag = :n;
select * from table(dbms_xplan.display_cursor);

declare
  l_op pls_integer;
begin
  l_op :=
  dbms_spm.load_plans_from_cursor_cache('45sygvgu8ccnz');
end;
/

select to_char(signature), sql_handle, sql_text, origin, last_verified, enabled, accepted from dba_sql_plan_baselines;

TO_CHAR(SIGNATURE)  SQL_HANDLE           SQL_TEXT                                                          ORIGIN       LAST_VERIFIED ENABLED ACCEPTED
------------------- -------------------- ----------------------------------------------------------------- ------------ ------------- ------- --------
8384366354684364262 SQL_745b40c6fdb2c5e6 select /*+ domtest */ count(*), max(col2) from t1 where flag = :n MANUAL-LOAD                YES     YES      

We have a full table scan baselined.

Step 2 – an alternative plan

Next, let’s execute a SQL statement which would rather do an index scan for a different set of binds.

When there is a baselined plan, the optimizer still costs its own best plan at hard parse time.

If a lower-cost plan is generated, it is stored for future evaluation / evolution.

We need to flush or invalidate the existing cursor to get that hard parse though.

Then we get an AUTO-CAPTURE plan in DBA_SQL_PLAN_BASELINES with ACCEPTED = NO.

comment on table t1 is 'flushme';
var n varchar2(2) 
exec :n := 'Y1';
select /*+ domtest */ count(*), max(col2) from t1 where flag = :n;
select * from table(dbms_xplan.display_cursor);

select to_char(signature), sql_handle, sql_text, origin, last_verified, enabled, accepted from dba_sql_plan_baselines;

TO_CHAR(SIGNATURE)  SQL_HANDLE           SQL_TEXT                                                          ORIGIN       LAST_VERIFIED ENABLED ACCEPTED
------------------- -------------------- ----------------------------------------------------------------- ------------ ------------- ------- --------
8384366354684364262 SQL_745b40c6fdb2c5e6 select /*+ domtest */ count(*), max(col2) from t1 where flag = :n MANUAL-LOAD                YES     YES      
8384366354684364262 SQL_745b40c6fdb2c5e6 select /*+ domtest */ count(*), max(col2) from t1 where flag = :n AUTO-CAPTURE               YES     NO   

Step 3 – Manual Evolution

Now let’s do an evolution.
And I’m going to set MODULE so we can track some sql.
(Perhaps we could/should do a sql trace but, having done it, it doesn’t really tell us much more in this case).

exec dbms_application_info.set_module('DOMTEST','SPM');

set serveroutput on
set long 10000
DECLARE
    r CLOB;
BEGIN
    r := 
    dbms_spm.evolve_sql_plan_baseline
    (sql_handle => 'SQL_745b40c6fdb2c5e6',verify=>'YES',commit=>'NO');
    dbms_output.put_line(r);
END;
/

Why would you do VERIFY => ‘NO’?

The only real use case is if you’ve already evaluated the plan yourself and want to just get it accepted.

The combination of VERIFY=> NO and COMMIT=> NO does nothing.
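
So, for that accept-without-verifying use case, a rough sketch would be (same handle as above):

DECLARE
    r CLOB;
BEGIN
    r :=
    dbms_spm.evolve_sql_plan_baseline
    (sql_handle => 'SQL_745b40c6fdb2c5e6',verify=>'NO',commit=>'YES');
    dbms_output.put_line(r);
END;
/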

Back to the VERIFY => 'YES' evolution; our report looks like this:

------------------------------------------------------------------------------
                        Evolve SQL Plan Baseline Report
-------------------------------------------------------------------------------

Inputs:
-------
  SQL_HANDLE = SQL_745b40c6fdb2c5e6
  PLAN_NAME  = 
  TIME_LIMIT = DBMS_SPM.AUTO_LIMIT
  VERIFY     = YES
  COMMIT     = NO

Plan: SQL_PLAN_78qu0svyv5jg68576eb1f
------------------------------------
  Plan was verified: Time used .1 seconds.
  Plan passed performance criterion: 314.79 times better than baseline plan.

                            Baseline Plan      Test Plan       Stats Ratio
                            -------------      ---------       -----------
  Execution Status:              COMPLETE       COMPLETE
  Rows Processed:                       1              1
  Elapsed Time(ms):                 3.708           .047             78.89
  CPU Time(ms):                     3.777              0
  Buffer Gets:                        944              3            314.67
  Physical Read Requests:               0              0
  Physical Write Requests:              0              0
  Physical Read Bytes:                  0              0
  Physical Write Bytes:                 0              0
  Executions:                           1              1

-------------------------------------------------------------------------------
                                 Report Summary
-------------------------------------------------------------------------------
Number of plans verified: 1
Number of plans accepted: 0

I’ve also run this on 12.1.0.2 and there doesn’t seem to be a massive difference in behaviour, although there are a few extra columns and stats in the output (including the two plans, which aren’t in the 11g output because they’re not stored, although they could be derived from the analysis sql).

What? Where? How?

What has been run behind the scenes to do this?

What information has been used to do the evaluation?

select * from v$sql where module = 'DOMTEST' order by first_load_time;

Statements of interest:

/* SQL Analyze(686,0) */ select /*+ domtest */ count(*), max(col2) from t1 where flag = :n

/* SQL Analyze(686,0) */ select /*+ domtest */ count(*), max(col2) from t1 where flag = :n

SELECT obj_type, plan_id, comp_data FROM sqlobj$data WHERE signature = :1 AND category = :2;

SELECT st.sql_text, sod.comp_data, sox.optimizer_env, sox.bind_data, sox.parsing_schema_name, sox.creator
FROM   sql$text st, sqlobj$data sod, sqlobj$auxdata sox
WHERE  sod.signature = st.signature 
AND    sod.signature = :1 
AND    sod.category  = :2 
AND    sod.obj_type  = :3 
AND    sod.plan_id   = :4 
AND    sod.signature = sox.signature 
AND    sod.category  = sox.category
AND    sod.obj_type  = sox.obj_type
AND    sod.plan_id   = sox.plan_id;

What was run to evaluate the performance of the relative plans?

See the first two /* SQL Analyze */ statements.

We can look more closely at v$sql for these:

select s.sql_text, s.child_number, s.plan_hash_value phv
,      (select extractvalue(xmltype(other_xml),'other_xml/info[@type="plan_hash_2"]')
        from   v$sql_plan p
        where  p.sql_id          = s.sql_id
        and    p.plan_hash_value = s.plan_hash_value
        and    p.other_xml is not null) plan_hash_2
, s.executions, s.elapsed_time/1000/s.executions elapsed_ms, s.cpu_time/1000/s.executions cpu_ms, s.buffer_gets/s.executions, b.value_string
from   v$sql s, table(dbms_sqltune.extract_binds(s.bind_data)) b
where  s.module = 'DOMTEST' 
and    s.sql_id = '8rnh80j2b09kt';

SQL_TEXT                                                                                    CHILD_NUMBER PHV        PLAN_HASH_2 EXECUTIONS ELAPSED_MS CPU_MS S.BUFFER_GETS/S.EXECUTIONS VALUE_STRING
------------------------------------------------------------------------------------------- ------------ ---------- ----------- ---------- ---------- ------ -------------------------- ------------
/* SQL Analyze(686,0) */ select /*+ domtest */ count(*), max(col2) from t1 where flag = :n  0            1634389831 1634389831  10         4.0923     4.0994 944                        Y1
/* SQL Analyze(686,0) */ select /*+ domtest */ count(*), max(col2) from t1 where flag = :n  1            2239163167 2239163167  10         1.1728     0.1999 3                          Y1

We can see where some of the stats from the report have come from – they look to be an average over a number of executions in this example. The timings aren’t an exact match.

Note that in this case PHV = PHV2, but PHV2 is the plan_id in the underlying baselined tables as I have discussed previously elsewhere.
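
A quick sanity check of that against the SPM tables, using the signature from earlier (requires access to the SYS-owned objects):

select plan_id, obj_type, category
from   sys.sqlobj$data
where  signature = 8384366354684364262;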

Where did the SQL statement come from?

Where did the bind data come from?

That’s the last statement in my “statements of interest” above.

SELECT st.sql_text, sod.comp_data, sox.optimizer_env, sox.bind_data, sox.parsing_schema_name, sox.creator
FROM   sql$text st, sqlobj$data sod, sqlobj$auxdata sox
WHERE  sod.signature = st.signature 
AND    sod.signature = :1 
AND    sod.category  = :2 
AND    sod.obj_type  = :3 
AND    sod.plan_id   = :4 
AND    sod.signature = sox.signature 
AND    sod.category  = sox.category
AND    sod.obj_type  = sox.obj_type
AND    sod.plan_id   = sox.plan_id;

The basic sql plan baseline mechanism is by signature (V$SQL.EXACT_MATCHING_SIGNATURE) but SQL$TEXT stores the actual sql text.
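
As an aside, the signature can also be computed directly from the text if you want to cross-check (exact matching, so the text has to match exactly):

select dbms_sqltune.sqltext_to_signature
       ('select /*+ domtest */ count(*), max(col2) from t1 where flag = :n') sig
from   dual;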

How else could it evaluate the baselined plans during evolution, given that the sql might not be in memory or in AWR!

The bind data used to execute the statement comes from the hard parse which generated the lower-cost plan of interest and is stored in SYS.SQLOBJ$AUXDATA.BIND_DATA.

Makes sense, right?

What happens with DML?

There’s nothing I can find which spells it out in black and white, but only the SELECT part of the DML is executed, be it INSERT SELECT, UPDATE or MERGE.

Even a sql trace doesn’t make this crystal clear.

But using a similar method to above:

create table t2
(cnt   number 
,col2  varchar2(50) 
,flag  varchar2(2));

var n varchar2(2) 
exec :n := 'N1';
insert into t2
select /*+ domtest */ count(*), max(col2), :n from t1 where flag = :n;
select * from table(dbms_xplan.display_cursor);

select sql_id, child_number, is_bind_aware, is_bind_sensitive, is_shareable, to_char(exact_matching_signature) sig
,      executions, plan_hash_value
from   v$sql 
where  sql_id = '08ravsws1s6bn';

declare
  l_op pls_integer;
begin
  l_op :=
  dbms_spm.load_plans_from_cursor_cache('08ravsws1s6bn');
end;
/

select to_char(signature), sql_handle, sql_text, origin, last_verified, enabled, accepted from dba_sql_plan_baselines;

comment on table t2 is 'flushme';

var n varchar2(2) 
exec :n := 'Y1';
insert into t2
select /*+ domtest */ count(*), max(col2), :n from t1 where flag = :n;
select * from table(dbms_xplan.display_cursor);

select to_char(signature), sql_handle, sql_text, origin, last_verified, enabled, accepted from dba_sql_plan_baselines;

exec dbms_application_info.set_module('DOMTEST2','SPM');
select * from v$sql where module = 'DOMTEST2' order by first_load_time;

set serveroutput on
set long 10000
DECLARE
    r CLOB;
BEGIN
    r := 
    dbms_spm.evolve_sql_plan_baseline
    (sql_handle => 'SQL_4ed45b79c88f3392',verify=>'YES',commit=>'NO');
    dbms_output.put_line(r);
END;
/

We get a similar report and similar evaluation sql.
We can see from ROWS_PROCESSED that no rows were actually inserted.

select s.sql_text, s.child_number, s.plan_hash_value phv
,      (select extractvalue(xmltype(other_xml),'other_xml/info[@type="plan_hash_2"]')
        from   v$sql_plan p
        where  p.sql_id          = s.sql_id
        and    p.plan_hash_value = s.plan_hash_value
        and    p.other_xml is not null) plan_hash_2
, s.rows_processed,s.executions, s.elapsed_time/1000/s.executions elapsed_ms, s.cpu_time/1000/s.executions cpu_ms, s.buffer_gets/s.executions
from   v$sql s
where  s.module = 'DOMTEST2' 
and    s.sql_id = 'b04dpwja6smx7';

SQL_TEXT                                                                                                       CHILD_NUMBER PHV        PLAN_HASH_2 ROWS_PROCESSED EXECUTIONS ELAPSED_MS CPU_MS S.BUFFER_GETS/S.EXECUTIONS
-------------------------------------------------------------------------------------------                    ------------ ---------- ----------- -------------- ---------- ---------- ------ --------------------------
/* SQL Analyze(771,0) */ insert into t2 select /*+ domtest */ count(*), max(col2), :n from t1 where flag = :n  0            1634389831 1634389831  0              10         4.1668     4.1993 944                       
/* SQL Analyze(771,0) */ insert into t2 select /*+ domtest */ count(*), max(col2), :n from t1 where flag = :n  1            2239163167 2239163167  0              10         0.1512     0.1  3                         

I also tried with the addition of a PK just to double check that nothing was actually being inserted. Nothing to see.

Finally…

One note about evolution behaviour and the adjustment of the threshold for plan verification, i.e. the bit which results in this:

Plan was verified: Time used .1 seconds.
Plan passed performance criterion: 314.79 times better than baseline plan.

You can use parameter _plan_verify_improvement_margin, e.g.

ALTER SYSTEM SET "_plan_verify_improvement_margin" = 120;

Unfortunately the valid range for the parameter is between 100 and 10000, so not high enough for me to show its influence on the results above (> 300x improvement).
