Strategies for Minimising SQL Execution Plan Instability

Execution Plan Instability – What is the problem?

The Oracle Optimizer is a complex piece of software and with every release it becomes more complex.

RBO

In the beginning, the Optimizer was rule-based.

The Optimizer had a ranked list of heuristics used to optimize a query, picking the lowest ranked rule available.

This rule-based mode, whilst still in use with some internal Oracle dictionary queries, has been unsupported since version 10.1.

This means that no code changes have been officially made to the RBO and no bug fixes are provided. There are many features that the RBO is unaware of.

Applications should not still be using the rule-based optimizer.

CBO

The Cost-Based Optimizer is designed to evaluate a number of execution plans and pick the one estimated to be the fastest.

Many of the CBO features are designed to combat common optimization problems and many of these problems occur where development teams are not necessarily aware of the full implications of the features that they are using.

These built-in default behaviours roughly conform to the infamous 80%:20% rule in that most of the time they do a good job but they are not infallible.

Bind Variables, Literals and Histograms

Most of the features which are deliberately designed such that plan instability is difficult to avoid stem from the decision to use bind variables or literals (NOT that they should be mutually exclusive) and the interaction of the former with histograms.

In article 1 of his Philosophy series, Jonathan Lewis neatly sums this up:

Histograms and bind variables exist for diametrically opposed reasons – they won’t work well together without help

Fundamentally, bind variables exist to provide shareable plans.

Bind variables should be used where we are not interested in getting specific optimizations for differing parameters.

Literals should be used where we want the Optimizer to pay particular attention to data distribution and skew for the specific parameters supplied.

A SQL statement will/should often have some parameters which should be literals and some which should be binds.

From this point onwards, there has been a whole raft of features designed to treat the misuse of one or the other.

In Oracle 8.1.6, histograms were introduced.

Histograms exist to provide specific data distribution information, particularly relevant to specific parameters.

Also, in that version, we got CURSOR_SHARING – targeted at applications using literals instead of binds such that SQL which was identical part from the use of binds was rewritten to use sytem-generated bind variables.

Then in 9.2, we got bind variable peeking.

This feature was introduced so that the optimizer could peek at the values supplied at parse time and use data distribution information specific to these parse-time values to generate an execution plan which suited those values.

In addition and at the same time, through these various versions to present day, we have had the default behaviour of DBMS_STATS statistic gathering to let the database decide which columns it will create histograms on, based on the SQL which has been running.

This means that new histograms can suddenly spring up – or existing histograms unexpectedly disappear – on all sorts of columns. This can be problematic on columns with large numbers of distinct values AND particularly so on join cardinalities where there may be a mismatch of histograms on both sides of the join.

Ever since this point, we have had a conflict of interest in feature usage and an ever increasing number of additional functionality to battle against this conflict – adaptive cursor sharing, cardinality feedback, etc, etc

Finally, the education message got blurred or lost somewhere along the line to the extent that a lot of systems blindly overuse bind variables because of the perceived performance impact of using literals.

This situation is not helped by the fact PL/SQL is designed to encourage bind variables.

Using supplied parameters as literals means using some construct of dynamic SQL, not difficult but nevertheless an added complexity and also another feature which is often blindly discouraged.

SQL Execution Plan Instability – Is SPM a viable approach?

SPM Overview

In Oracle 11g, Oracle introduced SQL Plan Baselines as a SQL Plan Management feature.

The core attraction of this functionality is that you can “lock in” a particular plan or plans for a SQL statement. This stores a set of outline hints and a specific plan hash value in relation to a specific SQL signature.

The Optimizer then uses that set of hints to try to reproduce the desired plan. If it can it will, if it can’t it will reject the hintset.

Additionally, the Optimizer completes its best-cost execution plan optimization anyway so that it can provide the ability to do a controlled evolution of baselined plans in the event that the lower-cost plan that it would have used is different and better performing than the baselined plan.

To use this the database parameters (session or system) just need to be configured to capture plans into a baseline and then use them.

There is flexibility to this capture/use. You can capture everything as it is generated; you could capture everything from memory now and/or at regular intervals and import into a baseline; you could capture everything from AWR into a SQL Tuning Set and then import into a baseline; or you could capture everything from another environment and export/import it into another.

And at any point, you can turn off capture and continue to use those which you currently have – this usage continues to capture any lower cost plans that the optimizer would have generated for any existing baselined plans

For a more detailed look at SPM and a comparison with SQL Profiles, see documentation.

Sounds good – why isn’t it the silver bullet for system-wide stability?

This approach might be good enough, but it is not a cast-iron guarantee of system-wide stability.

There are a number of reasons why not.

New & Changed SQL

Firstly, you need to have captured all your plans already. If you get brand new SQL, then there will be nothing in SPM.

Depending on your application, this may be a significant concern.

For example, consider an application making heavy usage of Hibernate which generates SQL statements.

A minor change in the domain model can mean a change of system-generated table alias in many statements.

As a result, you may suddenly get a lot of brand new SQL and significant numbers of baselined statements which you will now never see again.

What are the baselined plans based on? Are they the best plans? The only plans ever used?

If you suddenly baseline the plans for a large number of SQL statements, you are dictating which plan is to be used.

The plans will be based on the parsed plans in memory or in AWR at the time.
Are these the best plans?

Does/should this SQL statement always use this plan?

Are there normally multiple plans for different bindsets?

What if you normally get benefit from adapative cursor sharing?

ACS and baselines

What if your application benefits from adapative cursor sharing?

Sure, you can baseline multiple plans but these plans have no ACS information.

As soon as that ACS information is no longer in memory (as happens), there is no shortcut in a baseline to regain that, you still have to have the multiple executions required for the optimizer to recognize that which plans to use for which bindsets.

Parsing overhead

Widespread usage of baselines might, depending on your system performance profile, have a significant impact on the parsing resources.

This is because it always generates a best-cost plan anyway.

Then if that is not the baselined plan, it will use the baselined hintset to try to generate the specific plan hash.

In addition, it that is not possible, it will use just the optimizer_features_enable hint to try to generate the required plan.

So, you might in a heavily-baselined system to be doing 2x the parse work of a non-baselined system.

This might well be easily accommodated but there are systems where this would cause a problem.

Integration with development and testing processes

A SQL Plan Baseline is tied to a SQL statement based on the statement’s EXACT_MATCHING_SIGNATURE – a hash of the SQL statement which has been case and space normalized.

If a SQL statement materially changes, the baseline no longer applies.

How aware are developers of the presence of baselines?

And how to integrate with the development process?

How will our release process deal with baselines?

And if baselining large numbers of plans is being considered, then we have to think about where these will be generated.

The natural implication (of widespread baseline usage) is that new functionality being promoted to Prod would have a set of tested, baselined plans accompanying it and these would presumably have to be generated in a Prod-like environment which included Prod-like volumes.

SQL Execution Plan Instability – Decision Time

There is a decision to be made.

(And/or perhaps there is often a conceptual over-simplification by senior management to combat? Or at least a lack of deep understanding of the beast that we’re dealing with here?)

Do you want the Optimizer to try to get a better execution plan sometimes?

If the answer is yes, then you have to accept that it will get it wrong from time to time.

In particular, the various feedback and adaptive mechanisms are designed to recognize that they have got it wrong.

BUT they need that problematic execution in the the first place – sometimes more than one – to recognize that fact.

That one problematic execution could be your next Priority 1 incident.

In addition, the feedback mechanism is not perfect and it still can make subsequent executions worse in some circumstances.

SQL Execution Plan Instability – Turn it off?

IF your primary goal is plan stability – and I think many teams would claim this is this their goal but they do not embrace the implications of this – then perhaps a better decision is to turn off the various features which cause or combine to cause most of the problems of instability.

Appropriate usage of binds, appropriate usage of literals

Choose whether to use a bind variable or a literal as is appropriate for the value / column / SQL statement.

A SQL statement might have a mix of both.

DBMS_STATS defaults

A METHOD_OPT of FOR ALL INDEXED COLUMNS SIZE AUTO is an immediate red flag. This is never a good setting.

FOR ALL COLUMNS SIZE AUTO without table-level preferences (SET_TABLE_PREFS) is another red flag.

As an interim step, consider use FOR ALL COLUMNS SIZE REPEAT to lock in the current histogram usage.

The end goal should be to have table level preferences set for all tables.

This relies on knowing your data, your data distribution, your code, and knowing which histograms make sense (i.e. for skewed columns) – it will be far fewer than gathered by default.

For columns with significant numbers of distinct skew, it may be necessary to manually craft the histograms.

Volatile tables

Volatile tables should have stats set to an appropriate setting to generate appropriate plans for all situations and then those stats should be locked.

Stats which are gathered at any point during the volatility cycle may be good or may be problematic.

Similarly dynamic sampling can only see the data at the time of hard parse – you might be lucky and this is better than stats which say 0 rows but it can be a time bomb.

Turn off optimizer features

Turning off optimizer features might be best done via a LOGON trigger and turning such off for a subset of application users. These features include:

  • Bind Variable Peeking – off via _optim_peek_user_binds = false
  • Cardinality Feedback and ACS – should be disabled by turning off bind variable peeking but off via _optimizer_use_feedback = false, _optimizer_adaptive_cursor_sharing = false, _optimizer_extended_cursor_sharing_rel = “none”
  • Dynamic Sampling – optimizer_dynamic_sampling to 0
  • 12c additional adaptive features – e.g. adaptive execution plans

Additionally it probably makes sense to turn off the adaptive direct path read behaviour or anything with the word adaptive or dynamic in it or associated to it

This functionality decides on whether to do full segment scans via the buffer cache or not and the behaviour is a runtime decision depending on the size of the object relative to the buffer cache AND depending on how much of the segment is currently in the cache.

  • Adaptive direct path reads – _serial_direct_read = always

All too often I’ve seen a concurrently executed SQL statement switch to a “bad” plan involving a full table scan delivered via direct path reads stress out the IO subsystem because of the number of concurrent executions of that query which then affects performance across the DB.

Caveat

The actions above are still not sufficient to guarantee plan stability but, for this goal above all else, this is likely to be the most appropriate action.

However, to further guarantee stability it is still likely that some degree of hinting – whether via manual hints, sql profiles or baselines – might be necessary for small numbers of SQL statements where the intial cost-based plan is not appropriate e.g. GTTs and other situations but it should be small number of statements.

SQL Execution Plan Instability – Summary & Opinion

The actions discussed above are made on the basis that we want to minimise the chance of execution plan instability at all costs.

By making this decision, we are prioritizing stability over all the features within Oracle designed to generate better plans for specific situations, sets of binds, etc.

Personally, I always recommend going with the default behaviour until such time as it causes a significant problem.

I also always recommend matching the scope of a solution to the scope of a problem.

For example, if we have a problem with one or two SQL statements, the potential solutions should be limited to those SQL statements.

We should never be making changes with a potential system-wide impact for the sake of a couple of SQL statements.

And even parameter changes can be injected to specific SQL statements either via a SQL Patch or via manual hints.

In my opinion, having multiple plans for a SQL statement is expected.

But what is under-appreciated is the extent to which this is normal.

These are normally only noticed when they cause a problem and the significant number of plans which regularly change plans without impact tend to go unnoticed.

It is also my opinion that SQL execution issues occur mostly when the SQL is badly written or when incompatible features are combined – e.g. peeked binds together with histograms – and I prefer to tackle that root cause rather than a generic concept of plan instability being inherently problematic.

A future path?

Might it be a good idea for there to be a built-in choice of optimizer “umbrella” modes – maximum performance vs maximum stability which would turn on/off some of the features above and/or adjust their sensitivity/aggressiveness?

AWR: Was a baselined plan used?

Sometimes a simple question turns out to be harder than expected.

“Can we see if a particular SQL execution in AWR used a baselined plan?”

Initial thoughts might be:

Q: Does DBMS_XPLAN.DISPLAY_AWR tell us this?
A: Apparently not. See below. This question could also be rephrased as two other possible questions:

Q:Isn’t there a column on DBA_HIST_SQLSTAT which tell us this?
A: No. You’d think there should be. There is a SQL_PROFILE column. There isn’t a SQL_PLAN_BASELINE column.
There also isn’t an EXACT_MATCHING_SIGNATURE although there is a FORCE_MATCHING_SIGNATURE.

Q: Is it in DBA_HIST_SQL_PLAN.OTHER_XML?
A. No although this is where DBMS_XPLAN.DISPLAY_AWR gets it’s notes about cardinality feedback and dynamic sampling from.

First of all, Let’s see that it’s not shown in DBMS_XPLAN.DISPLAY_AWR.
Let’s get a sql statement baselined, in AWR and not in memory.

The usual setup

SQL> create table t1
  2  (col1  number
  3  ,col2  varchar2(50)
  4  ,flag  varchar2(2));

Table created.

SQL> insert into t1
  2  select rownum
  3  ,      lpad('X',50,'X')
  4  ,      case when rownum = 1
  5              then 'Y1'
  6              when rownum = 2
  7              then 'Y2'
  8              when mod(rownum,2) = 0
  9              then 'N1'
 10              else 'N2'
 11         end
 12  from   dual
 13* connect by rownum <= 100000
SQL> /

100000 rows created.

SQL> commit;

Commit complete.

Get a plan and put it in a baseline:

SQL> var n varchar2(2);
SQL> exec :n := 'N1';

PL/SQL procedure successfully completed.

SQL> select count(*), max(col2) from t1 where flag = :n;
select * from table(dbms_xplan.display_cursor);
  COUNT(*) MAX(COL2)
---------- --------------------------------------------------
     49999 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

SQL> select * from table(dbms_xplan.display_cursor);


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  731b98a8u0knf, child number 0
-------------------------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   221 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    30 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 50254 |  1472K|   221   (1)| 00:00:03 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("FLAG"=:N)

Note
-----
   - dynamic sampling used for this statement (level=2)

SQL> declare
  2   l_op pls_integer;
  3  begin
  4    l_op :=
  5    dbms_spm.load_plans_from_cursor_cache('731b98a8u0knf');
  6* end;
SQL> /

PL/SQL procedure successfully completed.

Verify that DBMS_XPLAN.DISPLAY_CURSOR reports baseline usage:

SQL> select count(*), max(col2) from t1 where flag = :n;

  COUNT(*) MAX(COL2)
---------- --------------------------------------------------
     49999 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  731b98a8u0knf, child number 0
-------------------------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   221 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    30 |            |          |
|*  2 |   TABLE ACCESS FULL| T1   | 50254 |  1472K|   221   (1)| 00:00:03 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("FLAG"=:N)

Note
-----
   - dynamic sampling used for this statement (level=2)
   - SQL plan baseline SQL_PLAN_13w748wknkcwd616acf47 used for this statement

See that dynamic_sampling note most likely comes from OTHER_XML and Baseline note doesn’t, it comes from V$SQL:

SQL>select sql_id
  2  ,      to_char(exact_matching_signature) sig
  3  ,      plan_hash_value
  4  ,      sql_plan_baseline
  5  from   v$sql
  6* where  sql_id = '731b98a8u0knf'
SQL> /

SQL_ID        SIG                                      PLAN_HASH_VALUE SQL_PLAN_BASELINE
------------- ---------------------------------------- --------------- ------------------------------
731b98a8u0knf 1292784087274697613                           3724264953 SQL_PLAN_13w748wknkcwd616acf47

SQL> set long 10000
SQL> select other_xml
  2  from   v$sql_plan
  3  where  sql_id = '731b98a8u0knf'
  4* and    other_xml is not null
SQL> /

OTHER_XML
--------------------------------------------------------------------------------
<other_xml><info type="db_version">11.2.0.3</info><info type="parse_schema"><![C
DATA["E668983_DBA"]]></info><info type="dynamic_sampling">2</info><info type="pl
an_hash">3724264953</info><info type="plan_hash_2">1634389831</info><peeked_bind
s><bind nam=":N" pos="1" dty="1" csi="178" frm="1" mxl="32">4e31</bind></peeked_
binds><outline_data><hint><![CDATA[IGNORE_OPTIM_EMBEDDED_HINTS]]></hint><hint><!
[CDATA[OPTIMIZER_FEATURES_ENABLE('11.2.0.3')]]></hint><hint><![CDATA[DB_VERSION(
'11.2.0.3')]]></hint><hint><![CDATA[OPT_PARAM('query_rewrite_enabled' 'false')]]
></hint><hint><![CDATA[ALL_ROWS]]></hint><hint><![CDATA[OUTLINE_LEAF(@"SEL$1")]]
></hint><hint><![CDATA[FULL(@"SEL$1" "T1"@"SEL$1")]]></hint></outline_data></oth
er_xml>

Now, let’s add statement to AWR, do an AWR snap and flush the SP:

SQL> exec dbms_workload_repository.add_colored_sql('731b98a8u0knf');

PL/SQL procedure successfully completed.

SQL> exec dbms_workload_repository.create_snapshot;

PL/SQL procedure successfully completed.

SQL> alter system flush shared_pool;

System altered.

Now if we look back at the plan in memory, there’s nothing there.

SQL> select * from v$sql where sql_id = '731b98a8u0knf';

no rows selected

SQL> select * from table(dbms_xplan.display_cursor('731b98a8u0knf'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------
SQL_ID: 731b98a8u0knf, child number: 0 cannot be found

But the SQL is in AWR and let’s see that DBMS_XPLAN.DISPLAY_AWR does not tells us that a baseline is used:

SQL> select * from table(dbms_xplan.display_awr('731b98a8u0knf'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------
SQL_ID 731b98a8u0knf
--------------------
select count(*), max(col2) from t1 where flag = :n

Plan hash value: 3724264953

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   221 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    30 |            |          |
|   2 |   TABLE ACCESS FULL| T1   | 50238 |  1471K|   221   (1)| 00:00:03 |
---------------------------------------------------------------------------

Note
-----
   - dynamic sampling used for this statement (level=2)

But it will tell us if a sql profile is used because it can via DBA_HIST_SQLSTAT.SQL_PROFILE – I leave that exercise to the reader.

So, back to the original question:

“Can we see if a particular SQL execution in AWR used a baselined plan?”

I’ve never needed to do this but as an exercise we can infer this connection by the presence of a baselined plan with the same phv2 but it’s convoluted.

It may be that there are simpler ways but it seems to me that we need to:

  1. Get the exact_matching_signature from the sql text
  2. Get the phv2 out of DBA_HIST_SQL_PLAN.OTHER_XML
  3. Match that phv2 with planid which is not exposed in DBA_SQL_PLAN_BASELINES but is in the underlying SYS.SQLOBJ$ tables
SQL> with subq_mysql as
  2  (select sql_id
  3   ,      (select dbms_sqltune.sqltext_to_signature(ht.sql_text)
  4           from dual) sig
  5   from   dba_hist_sqltext       ht
  6   where  sql_id = '731b98a8u0knf')
  7  ,    subq_baselines as
  8  (select b.signature
  9   ,      b.plan_name
 10   ,      b.accepted
 11   ,      b.created
 12   ,      o.plan_id
 13   from   subq_mysql             ms
 14   ,      dba_sql_plan_baselines b
 15   ,      sys.sqlobj$            o
 16   where  b.signature   = ms.sig
 17   and    o.signature   = b.signature
 18   and    o.name        = b.plan_name)
 19  ,    subq_awr_plans as
 20  (select  sn.snap_id
 21   ,       to_char(sn.end_interval_time,'DD-MON-YYYY HH24:MI') dt
 22   ,       hs.sql_id
 23   ,       hs.plan_hash_value
 24   ,       t.phv2
 25   ,       ms.sig
 26   from    subq_mysql        ms
 27   ,       dba_hist_sqlstat  hs
 28   ,       dba_hist_snapshot sn
 29   ,       dba_hist_sql_plan hp
 30   ,       xmltable('for $i in /other_xml/info
 31                     where $i/@type eq "plan_hash_2"
 32                     return $i'
 33                    passing xmltype(hp.other_xml)
 34                    columns phv2 number path '/') t
 35   where   hs.sql_id          = ms.sql_id
 36   and     sn.snap_id         = hs.snap_id
 37   and     sn.instance_number = hs.instance_number
 38   and     hp.sql_id          = hs.sql_id
 39   and     hp.plan_hash_value = hs.plan_hash_value
 40   and     hp.other_xml      is not null)
 41  select awr.*
 42  ,       nvl((select max('Y')
 43               from   subq_baselines b
 44               where  b.signature = awr.sig
 45               and    b.accepted  = 'YES'),'N') does_baseline_exist
 46  ,      nvl2(b.plan_id,'Y','N') is_baselined_plan
 47  ,      to_char(b.created,'DD-MON-YYYY HH24:MI')  when_baseline_created
 48  from   subq_awr_plans awr
 49  ,      subq_baselines b
 50  where  b.signature (+) = awr.sig
 51  and    b.plan_id   (+) = awr.phv2
 52* order by awr.snap_id
SQL> /

   SNAP_ID DT                SQL_ID        PLAN_HASH_VALUE       PHV2        SIG D I WHEN_BASELINE_CRE
---------- ----------------- ------------- --------------- ---------- ---------- - - -----------------
      8703 05-FEB-2014 15:18 731b98a8u0knf      3724264953 1634389831 1.2928E+18 Y Y 05-FEB-2014 15:08

SQL>

Surprising.
Unless, of course, I’ve overlooked something bleeding obvious.

Failed Logins can cause row cache lock on dc_users

One “feature” to be aware of in old-fashioned Oracle 11g – took me a while to figure it out, so just in case anyone else gets hit by it…

I changed the password of a schema in a test database to stop anything connecting whilst that schema was being logically refreshed.

Following this change, this environment started crawling but, despite being heavily used, no-one was complaining apart from me.

I couldn’t connect, just seemed to be “hanging” and for any existing connections certain statements were very slow.

I could see lots of row cache locks, the p1 of which decoded to dc_users.

Spent a while trying to figure it out, and eventually asked for a couple of second opinions on at the environment – they had no problems connecting, could see the row cache locks and eventually bounced the database.

Nothing got any better.

Another strange thing was that when looking at the sessions via ASH, they were all SYS sessions but with a session module that could not have been SYS – i.e application connections, etc.

So, that pointed at recursive SQL.

select count(*) over (partition by sample_time) sess_cnt
,      (select username from dba_users u where u.user_id = h.user_id) u
,      sample_time, session_state, session_id, sql_id
,      module, event, p1, blocking_session
from   dba_hist_active_sess_history h
where  event = 'row cache lock'
order by sample_time desc;
SESS_CNT U     SAMPLE_TIME                   SESSION_STATE SESSION_ID MODULE                  EVENT             P1 BLOCKING_SESSION
-------- ----- ----------------------------- ------------- ---------- ----------------------- ----------------- -- ----------------
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1828 sqlplus@xx (TNS V1-V3)  row cache lock     7                  
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2172 JDBC Thin Client        row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1837 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1848 JDBC Thin Client        row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1880 sqlplus@xx (TNS V1-V3)  row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1886 JDBC Thin Client        row cache lock     7             1909 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2012 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2026 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2031 JDBC Thin Client        row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2041 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2058 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2063 JDBC Thin Client        row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2064 SQL Developer           row cache lock     7             2144 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2073 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2084 sqlplus@xx (TNS V1-V3)  row cache lock     7                  
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2105 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2123 sqlplus@xx (TNS V1-V3)  row cache lock     7             1898 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             2144 JDBC Thin Client        row cache lock     7             2041 
      19 SYS   11-JUL-13 10.19.15.935000000  WAITING             1832 sqlplus@xx (TNS V1-V3)  row cache lock     7             2144 

 19 rows selected 
select trunc(sample_time), count(*), min(sample_time), max(sample_time)
from   dba_hist_active_Sess_history
where  event = 'row cache lock'
group by trunc(sample_time)
order by trunc(sample_time);
TRUNC(SAMPLE_TIME) COUNT(*) MIN(SAMPLE_TIME)              MAX(SAMPLE_TIME)            
------------------ -------- ----------------------------- -----------------------------
02-JUL-13                10 02-JUL-13 04.08.48.955000000  02-JUL-13 16.12.30.128000000  
03-JUL-13                 2 03-JUL-13 11.42.30.367000000  03-JUL-13 22.02.38.886000000  
04-JUL-13                 6 04-JUL-13 04.10.25.568000000  04-JUL-13 22.02.52.536000000  
05-JUL-13                16 05-JUL-13 04.09.00.444000000  05-JUL-13 22.01.13.067000000  
06-JUL-13                 2 06-JUL-13 01.06.31.261000000  06-JUL-13 14.07.15.208000000  
08-JUL-13                26 08-JUL-13 03.10.53.909000000  08-JUL-13 22.00.45.416000000  
09-JUL-13                19 09-JUL-13 04.06.55.191000000  09-JUL-13 20.23.16.801000000  
10-JUL-13             48293 10-JUL-13 04.00.41.732000000  10-JUL-13 19.00.04.750000000  
11-JUL-13              5412 11-JUL-13 09.20.40.833000000  11-JUL-13 10.41.49.610000000  

 9 rows selected 

Eventually the penny dropped …

See bug 7715339 – Logon failures causes “row cache lock” waits – Allow disable of logon delay [ID 7715339.8]

In 11g there is an intentional delay between allowing failed logon
attempts to retry. For some specific application types this can cause
a problem as the row cache entry is locked for the duration of the
delay . This can lead to excessive row cache lock waits for DC_USERS
for specific users / schemas.

After 3 successive failures a sleep delay is introduced starting
at 3 seconds and extending to 10 seconds max. During each delay
the user X row cache lock is held in exclusive mode preventing
any concurrent logon attempt as user X (and preventing any
other operation which would need the row cache lock for user X).

That explains a) the problem and b) why the impact was limited.

RAC – It’s all about the node affinity

Or what happens when you don’t have node affinity.

Here’s an old, short post that was lying around unpublished.

SQL executed on the 14th on node 2, on 15th on node 1.

DT SQL ID Node Rows Execs Elapsed Cluster Wait Time
15-NOV 7287jpw5vtm8j 1 4129697 7591 32625 32175
14-NOV 7287jpw5vtm8j 2 19703872 30909 78 60

Timings in seconds.

32625 seconds is just over 9 hours, of which 32175 is cluster wait time.

Database services and application datasource config have now been reconfigured to the extent that there is node affinity and the data required for the workload on node 1 is completely segregated from the workload on node 2.

If you use RAC as a black box and you have problems with performance or problems with predictability of performance then look into whether node affinity, or a lack thereof, is your problem.

CPU_time much higher than elapsed_time

Of course, one of the first places you should go is Oracle Support, but if you, like me, put off doing that and scratch your head for a while wondering why ELAPSED_TIME in V$SQL might be significantly lower than CPU_TIME when there are no obvious other explanations, then note these two bugs in 11.1.0.7:

BUG 7168259 – Millisecond elapsed_time shows as 0 : Elapsed_time for SQL statements which execute in under one millisecond can show up as 0

BUG 7561762 – Elapsed_time is under-reported : Elapsed_time can be 0 or significantly less than V$SQL.CPU_TIME if there is significant idle wait class time

Flip Flop Hooray, Ho

(Title reference for any old skool hip hop fans?)

An investigation into a flip flopping sql statement – the downside of dynamic sampling – why it can be a problem for fast changing tables and frequently executed SQL.

Yesterday morning a sql statement which processes outstanding messages was having performance problems.

There were two child cursors being executed by the four concurrent sessions, depending on their optimizer settings.

I’m not going to detail the optimzer environment differences because it’s mainly irrelvant other than, because of the child cursors, there was a hard parse at time T0 and a hard parse at time T1 and then these cursors were executed repeatedly / regularly over a number of hours.

SQL statistics show this:

SQL> select to_char(sn.end_interval_time,'DD/MM/YYYY HH24:MI') dt
  2  ,      st.plan_hash_value plan_hash
  3  ,      rows_processed_delta rws
  4  ,      executions_delta     execs
  5  ,      elapsed_time_delta   elp
  6  ,      cpu_time_delta       cpu
  7  ,      buffer_gets_delta    gets
  8  ,      iowait_delta         io
  9  from   dba_hist_snapshot sn
 10  ,      dba_hist_sqlstat  st
 11  where  st.snap_id            = sn.snap_id
 12  and    st.sql_id             = '1s1vgw24g2u1k'
 13  and    st.elapsed_time_delta > 0
 14  and    sn.end_interval_time between to_timestamp('18/06/2012 09:00','DD/MM/YYYY HH24:MI')
 15                              and     to_timestamp('18/06/2012 10:00','DD/MM/YYYY HH24:MI')
 16  order by sn.snap_id, st.plan_hash_value; 

DT                PLAN_HASH        RWS      EXECS        ELP        CPU       GETS         IO
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
18/06/2012 09:00 1081606725       1525        182 2163175676  101353586    3938349 2062115102
18/06/2012 09:00 3791039348         48        299     449781     380944      71472      68103
18/06/2012 09:30 1081606725       2032         24 3019992733  146890665   12995474 2886335201
18/06/2012 09:30 3791039348         43        289   73670959   73468837   56740610      24170

SQL> 

Whereas this is representative of “normal”:

DT                PLAN_HASH        RWS      EXECS        ELP        CPU       GETS         IO
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
15/06/2012 09:00 3791039348       1289       1077  174434637   16030542    5470115     280865

SQL> 

I mentioned that this code was processing outstanding messages.
These different processes aren’t balanced and aren’t expected to be.

But plan hash 3791039348 is good for everyone and in general plan hash 1081606725 suits no-one.

Here’s the sanitised SQL – remember this isn’t so much a SQL tuning/rewriting exercise so much as a root cause of slowness exercise (where the root cause is not initially meant to be “because the SQL is not brilliant”)

      SELECT  ....
      FROM    cmsg,
              sint,
              setm,
              comp,
              trga,
		          seti,
		          trad,
              (SELECT COUNT(*) count,
                      setm_il.sett_mesg_tag sett_mesg_tag
               FROM   prco prco_il,
                      cmsg cmsg_il,
                      setm setm_il,
                      seti seti_il
               WHERE  cmsg_il.unacknowledged = prco_il.unack_code
               AND    cmsg_il.evnt_type      = 'SI'
               AND   (cmsg_il.evnt_tag       = seti_il.sett_istr_tag 
               OR     cmsg_il.evnt_tag       = seti_il.amending_si_tag)
               AND    setm_il.sett_istr_tag  = seti_il.sett_istr_tag
               GROUP BY setm_il.sett_mesg_tag) cmsg_si
      WHERE   cmsg.rowid IN (SELECT row_id
                             FROM (SELECT rowid row_id
                                   FROM   cmsg  cmsg2
                                   WHERE  unacknowledged = :p_unack_code
                                   AND    evnt_type     != 'SOD'
                                   ORDER BY seq_num)
                             WHERE rownum <= :p_batch_lim)
      AND     sint.sint_tag         (+) = cmsg.evnt_tag
      AND     sint.stus_ind         (+) = 'C'
      AND     setm.sett_mesg_tag    (+) = sint.sett_mesg_tag
      AND     comp.cmpy_num         (+) = setm.cmpy_num
      AND     trga.trga_code        (+) = comp.trga_code
      AND     seti.sett_istr_tag    (+) = setm.sett_istr_tag
      AND     trad.trad_tag         (+) = seti.trad_tag
      AND     trad.cmpy_num         (+) = seti.cmpy_num
      AND     cmsg_si.sett_mesg_tag (+) = sint.sett_mesg_tag
      ORDER by cmsg.seq_num;

Here is the relevant sections of the two execution plans:

The good:

SQL_ID 1s1vgw24g2u1k
 
Plan hash value: 3791039348
 
------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name       | Rows  | Cost  | Time  |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                  |            |       |  9192 |       |
|   1 |  FOR UPDATE                                       |            |       |       |       |
|   2 |   SORT ORDER BY                                   |            |     1 |  9192 | 00:46 |
|   3 |    NESTED LOOPS OUTER                             |            |     1 |  9191 | 00:46 |
|   4 |     NESTED LOOPS OUTER                            |            |     1 |  9190 | 00:46 |
|   5 |      NESTED LOOPS OUTER                           |            |     1 |  9188 | 00:46 |
|   6 |       NESTED LOOPS OUTER                          |            |     1 |  9187 | 00:46 |
|   7 |        NESTED LOOPS OUTER                         |            |     1 |  9186 | 00:46 |
|   8 |         HASH JOIN OUTER                           |            |     1 |  9184 | 00:46 |
|   9 |          NESTED LOOPS OUTER                       |            |     1 |    10 | 00:01 |
|  10 |           NESTED LOOPS                            |            |     1 |     7 | 00:01 |
|  11 |            VIEW                                   | VW_NSO_1   |     3 |     5 | 00:01 |
|  12 |             SORT UNIQUE                           |            |     1 |       |       |
|  13 |              COUNT STOPKEY                        |            |       |       |       |
|  14 |               VIEW                                |            |     3 |     5 | 00:01 |
|  15 |                SORT ORDER BY STOPKEY              |            |     3 |     5 | 00:01 |
|  16 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |     3 |     4 | 00:01 |
|  17 |                  INDEX RANGE SCAN                 | CMSG3_IDX  |     4 |     3 | 00:01 |
|  18 |            TABLE ACCESS BY USER ROWID             | CMSG       |     1 |     1 | 00:01 |
|  19 |           TABLE ACCESS BY INDEX ROWID             | SETI       |     1 |     3 | 00:01 |
|  20 |            INDEX RANGE SCAN                       | SETI_1     |     1 |     2 | 00:01 |
|  21 |          VIEW                                     |            | 14490 |  9173 | 00:46 |
|  22 |           SORT GROUP BY                           |            | 14490 |       |       |
|  23 |            CONCATENATION                          |            |       |       |       |
|  24 |             NESTED LOOPS                          |            |       |       |       |
|  25 |              NESTED LOOPS                         |            |   562 |  8472 | 00:43 |
|  26 |               NESTED LOOPS                        |            |   428 |  6757 | 00:34 |
|  27 |                NESTED LOOPS                       |            |   304 |    30 | 00:01 |
|  28 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30 | 00:01 |
|  29 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29 | 00:01 |
|  30 |                 INDEX UNIQUE SCAN                 | PRCO_2_IDX |     1 |     0 |       |
|  31 |                PARTITION LIST ALL                 |            |     1 |    24 | 00:01 |
|  32 |                 TABLE ACCESS BY LOCAL INDEX ROWID | SETI       |     1 |    24 | 00:01 |
|  33 |                  INDEX RANGE SCAN                 | SETI_3_IDX |    36 |    22 | 00:01 |
|  34 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     2 | 00:01 |
|  35 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     4 | 00:01 |
|  36 |             NESTED LOOPS                          |            |       |       |       |
|  37 |              NESTED LOOPS                         |            |    20 |   700 | 00:04 |
|  38 |               NESTED LOOPS                        |            |    15 |   640 | 00:04 |
|  39 |                NESTED LOOPS                       |            |    15 |   640 | 00:04 |
|  40 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30 | 00:01 |
|  41 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29 | 00:01 |
|  42 |                 TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2 | 00:01 |
|  43 |                  INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1 | 00:01 |
|  44 |                INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0 |       |
|  45 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     2 | 00:01 |
|  46 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     4 | 00:01 |
|  47 |         TABLE ACCESS BY INDEX ROWID               | SETM       |     1 |     2 | 00:01 |
|  48 |          INDEX UNIQUE SCAN                        | SETM_IDX   |     1 |     1 | 00:01 |
|  49 |        TABLE ACCESS BY INDEX ROWID                | COMP       |     1 |     1 | 00:01 |
|  50 |         INDEX UNIQUE SCAN                         | COMP_PK    |     1 |     0 |       |
|  51 |       TABLE ACCESS BY INDEX ROWID                 | TRGA       |     1 |     1 | 00:01 |
|  52 |        INDEX UNIQUE SCAN                          | TRGA_IDX   |     1 |     0 |       |
|  53 |      TABLE ACCESS BY GLOBAL INDEX ROWID           | SETI       |     1 |     2 | 00:01 |
|  54 |       INDEX UNIQUE SCAN                           | SETI_IDX   |     1 |     1 | 00:01 |
|  55 |     PARTITION LIST ITERATOR                       |            |     1 |     1 | 00:01 |
|  56 |      INDEX UNIQUE SCAN                            | TRAD_IDX   |     1 |     1 | 00:01 |
------------------------------------------------------------------------------------------------

And the downright bad and ugly:

SQL_ID 1s1vgw24g2u1k
 
Plan hash value: 1081606725
 
------------------------------------------------------------------------------------------------
| Id  | Operation                                         | Name       | Rows  | Cost  | Time  |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                                  |            |       |   436K|       |
|   1 |  FOR UPDATE                                       |            |       |       |       |
|   2 |   SORT ORDER BY                                   |            |     1 |   436K| 36:24 |
|   3 |    NESTED LOOPS OUTER                             |            |     1 |   436K| 36:24 |
|   4 |     NESTED LOOPS OUTER                            |            |     1 |   436K| 36:24 |
|   5 |      NESTED LOOPS OUTER                           |            |     1 |   436K| 36:23 |
|   6 |       NESTED LOOPS OUTER                          |            |     1 |   436K| 36:23 |
|   7 |        NESTED LOOPS OUTER                         |            |     1 |   436K| 36:23 |
|   8 |         HASH JOIN OUTER                           |            |     1 |   436K| 36:23 |
|   9 |          NESTED LOOPS OUTER                       |            |     1 |    10 | 00:01 |
|  10 |           NESTED LOOPS                            |            |     1 |     7 | 00:01 |
|  11 |            VIEW                                   | VW_NSO_1   |     2 |     5 | 00:01 |
|  12 |             SORT UNIQUE                           |            |     1 |       |       |
|  13 |              COUNT STOPKEY                        |            |       |       |       |
|  14 |               VIEW                                |            |     2 |     5 | 00:01 |
|  15 |                SORT ORDER BY STOPKEY              |            |     2 |     5 | 00:01 |
|  16 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       |     2 |     4 | 00:01 |
|  17 |                  INDEX RANGE SCAN                 | CMSG3_IDX  |     2 |     3 | 00:01 |
|  18 |            TABLE ACCESS BY USER ROWID             | CMSG       |     1 |     1 | 00:01 |
|  19 |           TABLE ACCESS BY INDEX ROWID             | SETI       |     1 |     3 | 00:01 |
|  20 |            INDEX RANGE SCAN                       | SETI_1     |     1 |     2 | 00:01 |
|  21 |          VIEW                                     |            |   719K|   436K| 36:23 |
|  22 |           SORT GROUP BY                           |            |   719K|       |       |
|  23 |            CONCATENATION                          |            |       |       |       |
|  24 |             NESTED LOOPS                          |            |       |       |       |
|  25 |              NESTED LOOPS                         |            | 27212 |   403K| 33:40 |
|  26 |               HASH JOIN                           |            | 20520 |   301K| 25:06 |
|  27 |                HASH JOIN                          |            | 14473 |    35 | 00:01 |
|  28 |                 INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1 | 00:01 |
|  29 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33 | 00:01 |
|  30 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32 | 00:01 |
|  31 |                PARTITION LIST ALL                 |            |    30M|   300K| 25:02 |
|  32 |                 TABLE ACCESS FULL                 | SETI       |    30M|   300K| 25:02 |
|  33 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3 | 00:01 |
|  34 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5 | 00:01 |
|  35 |             NESTED LOOPS                          |            |       |       |       |
|  36 |              NESTED LOOPS                         |            |   960 | 32673 | 02:44 |
|  37 |               NESTED LOOPS                        |            |   724 | 29046 | 02:26 |
|  38 |                NESTED LOOPS                       |            |   724 | 29045 | 02:26 |
|  39 |                 TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33 | 00:01 |
|  40 |                  INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32 | 00:01 |
|  41 |                 TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2 | 00:01 |
|  42 |                  INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1 | 00:01 |
|  43 |                INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0 |       |
|  44 |               INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3 | 00:01 |
|  45 |              TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5 | 00:01 |
|  46 |         TABLE ACCESS BY INDEX ROWID               | SETM       |     1 |     2 | 00:01 |
|  47 |          INDEX UNIQUE SCAN                        | SETM_IDX   |     1 |     1 | 00:01 |
|  48 |        TABLE ACCESS BY INDEX ROWID                | COMP       |     1 |     2 | 00:01 |
|  49 |         INDEX UNIQUE SCAN                         | COMP_PK    |     1 |     1 | 00:01 |
|  50 |       TABLE ACCESS BY INDEX ROWID                 | TRGA       |     1 |     1 | 00:01 |
|  51 |        INDEX UNIQUE SCAN                          | TRGA_IDX   |     1 |     0 |       |
|  52 |      TABLE ACCESS BY GLOBAL INDEX ROWID           | SETI       |     1 |     2 | 00:01 |
|  53 |       INDEX UNIQUE SCAN                           | SETI_IDX   |     1 |     1 | 00:01 |
|  54 |     PARTITION LIST ITERATOR                       |            |     1 |     1 | 00:01 |
|  55 |      INDEX UNIQUE SCAN                            | TRAD_IDX   |     1 |     1 | 00:01 |
------------------------------------------------------------------------------------------------
 
Note
-----
   - dynamic sampling used for this statement (level=4)
 

The trouble with specific examples is that sometimes they’re a bit longer than ideal (as well as being a bit out of context).

Let me highlight what I see as the problem:

  • There is a non-merged view with an OR (which is expanded to a CONCATENATION)
  • In the good plan, @ line 33, one part of the OR condition is dealt with via an INDEX RANGE SCAN on SETI_3_IDX.
  • In the bad plan, @ line 32, this is a TABLE ACCESS FULL on SETI.
  • There’s a dynamic sampling note in the bad plan

Good:

|  21 | VIEW                                    |            | 14490 |  9173   (1)| 00:00:46 |
...
|  26 |     NESTED LOOPS                        |            |   428 |  6757   (1)| 00:00:34 |
|  27 |      NESTED LOOPS                       |            |   304 |    30   (0)| 00:00:01 |
|  28 |       TABLE ACCESS BY INDEX ROWID       | CMSG       |   304 |    30   (0)| 00:00:01 |
|  29 |        INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    29   (0)| 00:00:01 |
|  30 |       INDEX UNIQUE SCAN                 | PRCO_2_IDX |     1 |     0   (0)|          |
|  31 |      PARTITION LIST ALL                 |            |     1 |    24   (0)| 00:00:01 |
|  32 |       TABLE ACCESS BY LOCAL INDEX ROWID | SETI       |     1 |    24   (0)| 00:00:01 |
|  33 |        INDEX RANGE SCAN                 | SETI_3_IDX |    36 |    22   (0)| 00:00:01 |
...

Bad:

|  21 | VIEW                                    |            |   719K|   436K  (3)| 00:36:23 |
...
|  26 |     HASH JOIN                           |            | 20520 |   301K  (4)| 00:25:06 |
|  27 |      HASH JOIN                          |            | 14473 |    35   (3)| 00:00:01 |
|  28 |       INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1   (0)| 00:00:01 |
|  29 |       TABLE ACCESS BY INDEX ROWID       | CMSG       | 14473 |    33   (0)| 00:00:01 |
|  30 |        INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|  31 |      PARTITION LIST ALL                 |            |    30M|   300K  (4)| 00:25:02 |
|  32 |       TABLE ACCESS FULL                 | SETI       |    30M|   300K  (4)| 00:25:02 |
...
Note
-----
   - dynamic sampling used for this statement (level=4)

It’s always nice to simplify a situation and because the problem seems to be in this non-merged view, we can actually eliminate half the sql and then see if we can just reproduce the problem by adjusting the estimates.

Which we can, and via trial and error, we see that as we approach an estimate of 14000 rows from CMSG, the plan changes to the FULL TABLE SCAN of SETI:

SQL> explain plan for
  2                 SELECT /*+
  3                          opt_estimate(table cmsg_il rows=14000)
  4                          */
  5                        COUNT(*) count,
  6                        setm_il.sett_mesg_tag sett_mesg_tag
  7                 FROM   prco    prco_il,
  8                        cmsg    cmsg_il,
  9                        setm    setm_il,
 10                        seti    seti_il
 11                 WHERE  cmsg_il.unacknowledged = prco_il.unack_code
 12                 AND    cmsg_il.evnt_type = 'SI'
 13                 AND   (cmsg_il.evnt_tag = seti_il.sett_istr_tag OR
 14                        cmsg_il.evnt_tag = seti_il.amending_si_tag)
 15                 AND    setm_il.sett_istr_tag = seti_il.sett_istr_tag
 16                 GROUP BY setm_il.sett_mesg_tag;

Explained.

SQL> select * from table(dbms_xplan.display);
Plan hash value: 273475595

-----------------------------------------------------------------------------------------------
| Id  | Operation                                | Name       | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |            |   695K|   432K  (3)| 00:36:01 |
|   1 |  HASH GROUP BY                           |            |   695K|            |          |
|   2 |   CONCATENATION                          |            |       |            |          |
|   3 |    NESTED LOOPS                          |            |       |            |          |
|   4 |     NESTED LOOPS                         |            | 26323 |   400K  (3)| 00:33:23 |
|*  5 |      HASH JOIN                           |            | 19850 |   301K  (4)| 00:25:06 |
|*  6 |       HASH JOIN                          |            | 14000 |    35   (3)| 00:00:01 |
|   7 |        INDEX FULL SCAN                   | PRCO_2_IDX |   502 |     1   (0)| 00:00:01 |
|*  8 |        TABLE ACCESS BY INDEX ROWID       | CMSG       | 14000 |    33   (0)| 00:00:01 |
|*  9 |         INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|  10 |       PARTITION LIST ALL                 |            |    30M|   300K  (4)| 00:25:02 |
|  11 |        TABLE ACCESS FULL                 | SETI       |    30M|   300K  (4)| 00:25:02 |
|* 12 |      INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3   (0)| 00:00:01 |
|  13 |     TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5   (0)| 00:00:01 |
|  14 |    NESTED LOOPS                          |            |       |            |          |
|  15 |     NESTED LOOPS                         |            |   928 | 31605   (1)| 00:02:39 |
|  16 |      NESTED LOOPS                        |            |   700 | 28097   (1)| 00:02:21 |
|  17 |       NESTED LOOPS                       |            |   700 | 28097   (1)| 00:02:21 |
|* 18 |        TABLE ACCESS BY INDEX ROWID       | CMSG       | 14000 |    33   (0)| 00:00:01 |
|* 19 |         INDEX FULL SCAN                  | CMSG4_IDX  |     1 |    32   (0)| 00:00:01 |
|* 20 |        TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |     1 |     2   (0)| 00:00:01 |
|* 21 |         INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1   (0)| 00:00:01 |
|* 22 |       INDEX UNIQUE SCAN                  | PRCO_2_IDX |     1 |     0   (0)| 00:00:01 |
|* 23 |      INDEX RANGE SCAN                    | SETM_2_IDX |     1 |     3   (0)| 00:00:01 |
|  24 |     TABLE ACCESS BY INDEX ROWID          | SETM       |     1 |     5   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("CMSG_IL"."EVNT_TAG"="SETI_IL"."AMENDING_SI_TAG")
   6 - access("CMSG_IL"."UNACKNOWLEDGED"="PRCO_IL"."UNACK_CODE")
   8 - filter("CMSG_IL"."EVNT_TYPE"='SI')
   9 - filter("CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL)
  12 - access("SETM_IL"."SETT_ISTR_TAG"="SETI_IL"."SETT_ISTR_TAG")
  18 - filter("CMSG_IL"."EVNT_TYPE"='SI')
  19 - filter("CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL)
  20 - filter(LNNVL("CMSG_IL"."EVNT_TAG"="SETI_IL"."AMENDING_SI_TAG"))
  21 - access("CMSG_IL"."EVNT_TAG"="SETI_IL"."SETT_ISTR_TAG")
  22 - access("CMSG_IL"."UNACKNOWLEDGED"="PRCO_IL"."UNACK_CODE")
  23 - access("SETM_IL"."SETT_ISTR_TAG"="SETI_IL"."SETT_ISTR_TAG")

46 rows selected.

SQL> 

So, optimizer environment differences and bind variable peeking are pretty much ruled out as influencing factors.

What remains is that the bad plan has a higher estimate of rows than the good plan and the bad plan has dynamic sampling.

Both SQL statements are the same.

Dynamic sampling level is the same for both (4 set at system level).

Dynamic sampling should be being applied because of the parameter setting and because there are two single table access predicates
– one from the predicate on CMSG_IL.EVENT_TYPE and one implicit IS NOT NULL predicate on the join column UNACKNOWLEDGED.

So why would we see DS in one plan and not anohther?

If we do a 10053 trace of the distilled non-mergeable view, we see:

Access path analysis for CMSG
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for CMSG[CMSG_IL]

*** 2012-06-19 14:32:04.576
** Performing dynamic sampling initial checks. **
  Column (#4):
    NewDensity:0.000062, OldDensity:0.000000 BktCnt:8090, PopBktCnt:8084, PopValCnt:46, NDV:80
  Column (#4): EVNT_TYPE(
    AvgLen: 4 NDV: 80 Nulls: 0 Density: 0.000062
    Histogram: Freq  #Bkts: 52  UncompBkts: 8090  EndPtVals: 52
** Dynamic sampling initial checks returning TRUE (level = 4).

*** 2012-06-19 14:32:04.576
** Generated dynamic sampling query:
    query text :
SELECT /* OPT_DYN_SAMP */ 
       /*+ 
         ALL_ROWS 
         IGNORE_WHERE_CLAUSE 
         NO_PARALLEL(SAMPLESUB) 
         opt_param('parallel_execution_enabled', 'false') 
         NO_PARALLEL_INDEX(SAMPLESUB) 
         NO_SQL_TUNE 
         */ 
       NVL(SUM(C1),0), 
       NVL(SUM(C2),0) 
FROM (SELECT /*+ 
               IGNORE_WHERE_CLAUSE NO_PARALLEL("CMSG_IL") 
               FULL("CMSG_IL") 
               NO_PARALLEL_INDEX("CMSG_IL") 
               */ 
             1 AS C1, 
             CASE WHEN "CMSG_IL"."EVNT_TYPE"=:B1 
                  AND  "CMSG_IL"."UNACKNOWLEDGED" IS NOT NULL 
                  THEN 1 
                  ELSE 0 
             END AS C2 
FROM "CMSG" SAMPLE BLOCK (0.002711 , 1) SEED (1) "CMSG_IL") SAMPLESUB

*** 2012-06-19 14:32:04.612
** Executed dynamic sampling query:
    level : 4
    sample pct. : 0.002711
    actual sample size : 1756
    filtered sample card. : 0
    orig. card. : 75899119
    block cnt. table stat. : 1118909
    block cnt. for sampling: 1143461
    max. sample block cnt. : 32
    sample block cnt. : 31
    min. sel. est. : 0.00000001
** Not using dynamic sampling for single table sel. or cardinality.

So, the results of dynamic sampling are discarded as unreliable and the estimates fall back on using the stored object stats.

But at some point (one of the hard parses) the data returned in the sample must have found some relevant data and been considered reliable which led to the “bad” plan being executed repeatedly in inappropriate circumstances.

Two footnotes to this investigation.

Firstly, when you have tables with fast-changing data and row counts, there are usually two choices in approach for statistics
– either find a set of stats that gives the best (balance of) plans and lock them.
– or use dynamic sampling

The scenario above shows the downside of the dynamic sampling approach.
It may well be that the plan hash 1081606725 was good for the initial execution(s) of the statement when there maybe were a large number of outstanding messages but it very quickly became unsuitable given the frequent executions.

Secondly, I mentioned that this was not a SQL rewriting exercise, but, if it was, I’m not a great fan of the outer-joined inline view. It’s not merged but it seems sensible that a predicate be pushed into view.
Pushing the predicate, whether happening by default or hinted explicitly, would give a subsection of the plan like:

|  20 | VIEW PUSHED PREDICATE                  |            |     1 |    38   (0)| 00:00:01 |
|  21 |  SORT GROUP BY                         |            |     1 |    38   (0)| 00:00:01 |
|  22 |   NESTED LOOPS                         |            |     1 |    38   (0)| 00:00:01 |
|  23 |    NESTED LOOPS                        |            |     1 |    38   (0)| 00:00:01 |
|  24 |     NESTED LOOPS                       |            |     1 |     5   (0)| 00:00:01 |
|  25 |      TABLE ACCESS BY INDEX ROWID       | SETM       |     1 |     3   (0)| 00:00:01 |
|* 26 |       INDEX UNIQUE SCAN                | SETM_IDX   |     1 |     2   (0)| 00:00:01 |
|  27 |      TABLE ACCESS BY GLOBAL INDEX ROWID| SETI       |    30M|     2   (0)| 00:00:01 |
|* 28 |       INDEX UNIQUE SCAN                | SETI_IDX   |     1 |     1   (0)| 00:00:01 |
|* 29 |     TABLE ACCESS BY INDEX ROWID        | CMSG       |     1 |    33   (0)| 00:00:01 |
|* 30 |      INDEX FULL SCAN                   | CMSG_IDX   |     1 |    32   (0)| 00:00:01 |
|* 31 |    INDEX UNIQUE SCAN                   | PRCO_2_IDX |     1 |     0   (0)| 00:00:01 |

However, I’m not keen on explicitly hinting this nor leaving it to chance and as it’s outer joined, I would strongly consider using a scalar subselect:

      WITH cmsg AS
           (SELECT *
            FROM   (SELECT cmsg.evnt_tag,
                           cmsg.evnt_type,
                           cmsg.seq_num
                    FROM   cmsg
                    WHERE  unacknowledged = :p_unack_code
                    AND    evnt_type != 'SOD'
                    ORDER BY seq_num)
            WHERE rownum <= :p_batch_lim)
      SELECT  ...,
              (SELECT COUNT(*)
               FROM   proc    prco_il,
                      cmsg    cmsg_il,
                      setm    setm_il,
                      seti    seti_il
               WHERE  cmsg_il.unacknowledged = prco_il.unack_code
               AND    cmsg_il.evnt_type      = 'SI'
               AND   (cmsg_il.evnt_tag       = seti_il.sett_istr_tag
               OR     cmsg_il.evnt_tag       = seti_il.amending_si_tag)
               AND    setm_il.sett_istr_tag  = seti_il.sett_istr_tag
               AND    setm_il.sett_mesg_tag  = sint.sett_mesg_tag) count
      FROM    cmsg,
              sint,
              setm,
              comp,
              trga,
              seti,
              trad
      WHERE   sint.sint_tag      (+) = cmsg.evnt_tag
      AND     sint.stus_ind      (+) = 'C'
      AND     setm.sett_mesg_tag (+) = sint.sett_mesg_tag
      AND     comp.cmpy_num      (+) = setm.cmpy_num
      AND     trga.trga_code     (+) = comp.trga_code
      AND     seti.sett_istr_tag (+) = setm.sett_istr_tag
      AND     trad.trad_tag      (+) = seti.trad_tag
      AND     trad.cmpy_num      (+) = seti.cmpy_num
      ORDER by cmsg.seq_num;

Flippin’ Witch Hunt – adaptive direct path read

Plan flips… here seems to be general impression that most application performance issues are going to be caused by some sort of “plan flip” and that “plan flips” are bad.

  • If you don’t use plan stability features, should you really expect plan stability?
  • Even outside your stats job window whenever that is?
  • What if you’re using bind variables with histograms?
  • And you’ve got features like dynamic sampling which will probably get a slightly different sample for each hard parse?
  • And if you’ve also got cardinality feedback kicking in?

The old battle between stability and optimisation, I suppose.

I made a throwaway comment on a forum thread about having seen a performance problem caused by a sql statement which changed to doing direct path IO.

I thought it might be vaguely interesting to do a quick blog post and back that up with a tiny bit of “evidence”.

This was getting on for three weeks ago now (29th May) and there were bigger fish to fry at the time so some of the evidence and some of the accuracy of the evidence might wont a little but I love having a long AWR retention and the outputs below are what ASH & AWR can tell us now about what happened back then.

Problem:
A particular sql statment was running slowly.
Initial analysis had identified a change in the plan and that change was caused by cardinality feedback and changes in plans are bad of course ;)

Looking at DBA_HIST_SQLSTAT, I could see that the SQL regularly flip-flopped between two plans, one of which was the plan normally received after a hard parse, the other a change brought on by cardinality feedback.

But there really wasn’t much to choose between the plans in terms of statistics and averages.

In the execution plan, we can also skip specifics, the only relevant detail that there was a FULL TABLE SCAN which was the inner part (i.e. executed multiple times for each row in the driving / outer rowsource) of a NESTED LOOP – something which is rarely going to be a good thing, i.e.

NESTED LOOP
  some row source
  FULL TABLE SCAN

However, if we cross-reference DBA_HIST_ACTIVE_SESS_HISTORY, then we see the critical differences between the executions of this FULL TABLE SCAN on the 28th / 29th May and other days.

If you don’t like averages then I don’t blame you but you won’t like this query:

with ash as
(select trunc(ash.sample_time) snap_day
 ,      min(ash.snap_id) min_snap_id
 ,      max(ash.snap_id) max_snap_id
 ,      ash.current_obj# obj_id
 ,      ash.sql_id
 ,      ash.sql_plan_hash_value
 ,      ash.sql_plan_operation
 ,      ash.sql_plan_options
 ,      ash.current_obj#
 ,      ash.event
 ,      count(distinct ash.sql_exec_id) ash_cnt
 from   dba_hist_active_sess_history ash
 where  ash.sql_id              = 'drbkfkfxg6a6a'
 and    ash.sql_plan_operation  = 'TABLE ACCESS'
 and    ash.sql_plan_options    = 'FULL'
 and    ash.current_obj#        = 169748
 and    ash.sample_time        >= to_date('01/05/2012','DD/MM/YYYY')
 and    ash.event              is not null
 group by
        trunc(ash.sample_time)
 ,      ash.sql_plan_hash_value
 ,      ash.sql_id
 ,      ash.sql_plan_operation
 ,      ash.sql_plan_options
 ,      ash.current_obj#
 ,      ash.event)
select ash.snap_day
,      ash.event
,      ash.ash_cnt
,      (select round(sum(elapsed_time_delta) / sum(executions_delta) /1000/1000)
        from   dba_hist_sqlstat   st
        where  st.sql_id        = ash.sql_id
        and    st.snap_id between ash.min_snap_id
                          and     ash.max_snap_id) avg_elp
,      (select round(sum(rows_processed_delta) / sum(executions_delta))
        from   dba_hist_sqlstat   st
        where  st.sql_id        = ash.sql_id
        and    st.snap_id between ash.min_snap_id
                          and     ash.max_snap_id) avg_rws
,      round((1275068416 -
       (select sum(space_allocated_delta)
        from   dba_hist_seg_stat st
        where  st.obj#          = ash.obj_id
        and    st.snap_id      >= ash.min_snap_id))
      / 1024/1024) "SPACE_USED (ISH)"
,      (select min(bytes)/1024/1024
        from   dba_hist_sgastat st
        where  name             = 'buffer_cache'
        and    st.snap_id between ash.min_snap_id
                          and     ash.max_snap_id) min_bc
from   ash
order by snap_day;

What am I trying to see:

  • Historic running of a particular sql statment
  • Focusing on this FULL TABLE SCAN operation which was a problem on the 28th / 29th
  • Occurences in ASH (be very careful of reading too much into samples of samples and aggregates and averages)
  • What the statement tended to wait on and a feel for how much waiting (beware of sampled samples)
  • And finally a rough approximation of how big the object was relative to the buffer cache (we know what size it is now (which I’ve plugged into the query) so work back through the growth in segment stats. It’s using space allocated because I don’t have access to dbms_space to get space used and space allocated is going to be a high estimate of actual space used – it certainly shouldn’t be more than that)
SNAP_DAY  EVENT                               ASH_CNT    AVG_ELP SPACE_USED (ISH)     MIN_BC
--------- -------------------------------- ---------- ---------- ---------------- -------
01-MAY-12 db file scattered read                    1         19             1152      61696
01-MAY-12 db file sequential read                   1         19             1152      61696
01-MAY-12 db file sequential read                   1         25             1152      61696
02-MAY-12 db file sequential read                   1          8             1152      61696
02-MAY-12 db file scattered read                    1          8             1152      61696
07-MAY-12 db file scattered read                    1          7             1152      61696
08-MAY-12 db file scattered read                    1         24             1152      61696
08-MAY-12 db file sequential read                   1         24             1152      61696
09-MAY-12 db file sequential read                   1        100             1152      61696
09-MAY-12 db file scattered read                    1        100             1152      61696
10-MAY-12 db file sequential read                   1          6             1152      61696
10-MAY-12 db file scattered read                    1          6             1152      61696
11-MAY-12 db file sequential read                   1          7             1152      61696
11-MAY-12 db file scattered read                    1         13             1152      61696
11-MAY-12 db file sequential read                   1         13             1152      61696
15-MAY-12 db file scattered read                    1        148             1152      62208
16-MAY-12 db file scattered read                    1          8             1152      62464
16-MAY-12 db file sequential read                   2          5             1152      62464
17-MAY-12 db file scattered read                    1         17             1152      62208
17-MAY-12 db file sequential read                   1         17             1152      62208
18-MAY-12 db file sequential read                   2         17             1152      62208
18-MAY-12 db file scattered read                    2         17             1152      62208
21-MAY-12 db file scattered read                    1         16             1216      62208
21-MAY-12 db file scattered read                    1          6             1216      62208
22-MAY-12 db file scattered read                    1         19             1216      62208
23-MAY-12 db file scattered read                    3         17             1216      62208
23-MAY-12 read by other session                     1         15             1216      62208
24-MAY-12 db file scattered read                    1         28             1216      61696
25-MAY-12 db file scattered read                    1         67             1216      62208
25-MAY-12 db file sequential read                   1         67             1216      62208
28-MAY-12 direct path read                          3         45             1216      62464
28-MAY-12 direct path read                         96         70             1216      62464
29-MAY-12 direct path read                         10        484             1216      62464
29-MAY-12 direct path read                        137        643             1216      62464
29-MAY-12 enq: KO - fast object checkpoint          1        545             1216      62464

35 rows selected.

SQL> 

The FULL TABLE SCAN as the outer rowsource of a NESTED LOOP is a performance threat anyway but in summary, a performance problem, irrelevant plan stability and a key switch to direct path IO, not obviously caused by a significant growth in object size.

Follow

Get every new post delivered to your Inbox.

Join 85 other followers