SQL Plan Baseline Manual Evolution

I’ve never really spent much time on evolution because most of the time I use baselines for emergency SQL plan management, not for the controlled evolution with which the feature was conceived.

But here are some observations on SQL plan baseline evolution, originating from these questions:

  • What does evolution actually do?
  • What information does it use?
  • What happens with DML?

Starting point – one baselined plan

1. Get a full table scan (FTS) plan in memory from a SQL statement which should do a FTS, and load it into a baseline

drop table t1;
drop table t2;

create table t1 
(col1  number 
,col2  varchar2(50) 
,flag  varchar2(2));
 
insert into t1 
select rownum 
,      lpad('X',50,'X') 
,      case when rownum = 1 
            then 'Y1'
            when rownum = 2 
            then 'Y2' 
            when mod(rownum,2) = 0
            then 'N1'
            else 'N2' 
       end 
from   dual 
connect by rownum <= 100000; 

exec dbms_stats.gather_table_stats(USER,'T1',method_opt=>'FOR COLUMNS flag SIZE 5',no_invalidate=>FALSE); 
 
create index i1 on t1 (flag);

var n varchar2(2) 
exec :n := 'N1';
select /*+ domtest */ count(*), max(col2) from t1 where flag = :n;
select * from table(dbms_xplan.display_cursor);

declare
  l_op pls_integer;
begin
  l_op :=
  dbms_spm.load_plans_from_cursor_cache('45sygvgu8ccnz');
end;
/

select to_char(signature), sql_handle, sql_text, origin, last_verified, enabled, accepted from dba_sql_plan_baselines;

TO_CHAR(SIGNATURE)  SQL_HANDLE           SQL_TEXT                                                          ORIGIN       LAST_VERIFIED ENABLED ACCEPTED
------------------- -------------------- ----------------------------------------------------------------- ------------ ------------- ------- --------
8384366354684364262 SQL_745b40c6fdb2c5e6 select /*+ domtest */ count(*), max(col2) from t1 where flag = :n MANUAL-LOAD                YES     YES      

We have a full table scan baselined.

Step 2 – an alternative plan

Next, let’s execute the same SQL statement with a different set of binds for which it would rather do an index scan.

When there is a baselined plan, the optimizer does this check automatically at hard parse time: if it generates a lower-cost plan than the baselined one, it stores it for future evaluation / evolution.

We need to flush or invalidate the existing cursor to get that hard parse though.

Then we get an AUTO-CAPTURE plan in DBA_SQL_PLAN_BASELINES with ACCEPTED = NO

comment on table t1 is 'flushme';
var n varchar2(2) 
exec :n := 'Y1';
select /*+ domtest */ count(*), max(col2) from t1 where flag = :n;
select * from table(dbms_xplan.display_cursor);

select to_char(signature), sql_handle, sql_text, origin, last_verified, enabled, accepted from dba_sql_plan_baselines;

TO_CHAR(SIGNATURE)  SQL_HANDLE           SQL_TEXT                                                          ORIGIN       LAST_VERIFIED ENABLED ACCEPTED
------------------- -------------------- ----------------------------------------------------------------- ------------ ------------- ------- --------
8384366354684364262 SQL_745b40c6fdb2c5e6 select /*+ domtest */ count(*), max(col2) from t1 where flag = :n MANUAL-LOAD                YES     YES      
8384366354684364262 SQL_745b40c6fdb2c5e6 select /*+ domtest */ count(*), max(col2) from t1 where flag = :n AUTO-CAPTURE               YES     NO   

Step 3 – Manual Evolution

Now let’s do an evolution.
And I’m going to set MODULE so we can track some sql.
(Perhaps we could/should do a sql trace but, having done it, it doesn’t really tell us much more in this case).

exec dbms_application_info.set_module('DOMTEST','SPM');

set serveroutput on
set long 10000
DECLARE
    r CLOB;
BEGIN
    r := 
    dbms_spm.evolve_sql_plan_baseline
    (sql_handle => 'SQL_745b40c6fdb2c5e6',verify=>'YES',commit=>'NO');
    dbms_output.put_line(r);
END;
/

Why would you do VERIFY => 'NO'?

The only real use case is if you’ve already evaluated the plan yourself and just want to get it accepted.

The combination of VERIFY => 'NO' and COMMIT => 'NO' does nothing.
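
For completeness, the other useful combination – accepting without verification when you have already proved the plan yourself – would look something like this (a sketch):

DECLARE
    r CLOB;
BEGIN
    -- accept any unaccepted plans for this handle without test-executing them
    r :=
    dbms_spm.evolve_sql_plan_baseline
    (sql_handle => 'SQL_745b40c6fdb2c5e6',verify=>'NO',commit=>'YES');
    dbms_output.put_line(r);
END;
/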

Our report looks like this:

------------------------------------------------------------------------------
                        Evolve SQL Plan Baseline Report
-------------------------------------------------------------------------------

Inputs:
-------
  SQL_HANDLE = SQL_745b40c6fdb2c5e6
  PLAN_NAME  = 
  TIME_LIMIT = DBMS_SPM.AUTO_LIMIT
  VERIFY     = YES
  COMMIT     = NO

Plan: SQL_PLAN_78qu0svyv5jg68576eb1f
------------------------------------
  Plan was verified: Time used .1 seconds.
  Plan passed performance criterion: 314.79 times better than baseline plan.

                            Baseline Plan      Test Plan       Stats Ratio
                            -------------      ---------       -----------
  Execution Status:              COMPLETE       COMPLETE
  Rows Processed:                       1              1
  Elapsed Time(ms):                 3.708           .047             78.89
  CPU Time(ms):                     3.777              0
  Buffer Gets:                        944              3            314.67
  Physical Read Requests:               0              0
  Physical Write Requests:              0              0
  Physical Read Bytes:                  0              0
  Physical Write Bytes:                 0              0
  Executions:                           1              1

-------------------------------------------------------------------------------
                                 Report Summary
-------------------------------------------------------------------------------
Number of plans verified: 1
Number of plans accepted: 0

I’ve also run this on 12.1.0.2 and there doesn’t seem to be a massive difference in behaviour, although there are a few extra columns and stats in the output (including the two plans, which aren’t in the 11g output because they’re not stored there, although they could be derived from the analysis SQL).

What? Where? How?

What has been run behind the scenes to do this?

What information has been used to do the evaluation?

select * from v$sql where module = 'DOMTEST' order by first_load_time;

Statements of interest:

/* SQL Analyze(686,0) */ select /*+ domtest */ count(*), max(col2) from t1 where flag = :n

/* SQL Analyze(686,0) */ select /*+ domtest */ count(*), max(col2) from t1 where flag = :n

SELECT obj_type, plan_id, comp_data FROM sqlobj$data WHERE signature = :1 AND category = :2;

SELECT st.sql_text, sod.comp_data, sox.optimizer_env, sox.bind_data, sox.parsing_schema_name, sox.creator
FROM   sql$text st, sqlobj$data sod, sqlobj$auxdata sox
WHERE  sod.signature = st.signature 
AND    sod.signature = :1 
AND    sod.category  = :2 
AND    sod.obj_type  = :3 
AND    sod.plan_id   = :4 
AND    sod.signature = sox.signature 
AND    sod.category  = sox.category
AND    sod.obj_type  = sox.obj_type
AND    sod.plan_id   = sox.plan_id;

What was run to evaluate the performance of the relative plans?

See the first two /* SQL Analyze */ statements.

We can look more closely at v$sql for these:

select s.sql_text, s.child_number, s.plan_hash_value phv
,      (select extractvalue(xmltype(other_xml),'other_xml/info[@type="plan_hash_2"]')
        from   v$sql_plan p
        where  p.sql_id          = s.sql_id
        and    p.plan_hash_value = s.plan_hash_value
        and    p.other_xml is not null) plan_hash_2
, s.executions, s.elapsed_time/1000/s.executions elapsed_ms, s.cpu_time/1000/s.executions cpu_ms, s.buffer_gets/s.executions, b.value_string
from   v$sql s, table(dbms_sqltune.extract_binds(s.bind_data)) b
where  s.module = 'DOMTEST' 
and    s.sql_id = '8rnh80j2b09kt';

SQL_TEXT                                                                                    CHILD_NUMBER PHV        PLAN_HASH_2 EXECUTIONS ELAPSED_MS CPU_MS S.BUFFER_GETS/S.EXECUTIONS VALUE_STRING
------------------------------------------------------------------------------------------- ------------ ---------- ----------- ---------- ---------- ------ -------------------------- ------------
/* SQL Analyze(686,0) */ select /*+ domtest */ count(*), max(col2) from t1 where flag = :n  0            1634389831 1634389831  10         4.0923     4.0994 944                        Y1
/* SQL Analyze(686,0) */ select /*+ domtest */ count(*), max(col2) from t1 where flag = :n  1            2239163167 2239163167  10         1.1728     0.1999 3                          Y1

We can see where some of the stats from the report have come from – they look to be an average over a number of executions in this example. The timings aren’t an exact match.

Note that in this case PHV = PHV2, but PHV2 is the plan_id in the underlying baselined tables as I have discussed previously elsewhere.
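
We can check that mapping directly (a sketch, assuming access to the SYS-owned SPM tables) – the plan_id values should match the PHV2 values above:

select plan_id, obj_type
from   sys.sqlobj$data
where  signature = 8384366354684364262;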

Where did the SQL statement come from?

Where did the bind data come from?

That’s the last statement in my “statements of interest” above.

SELECT st.sql_text, sod.comp_data, sox.optimizer_env, sox.bind_data, sox.parsing_schema_name, sox.creator
FROM   sql$text st, sqlobj$data sod, sqlobj$auxdata sox
WHERE  sod.signature = st.signature 
AND    sod.signature = :1 
AND    sod.category  = :2 
AND    sod.obj_type  = :3 
AND    sod.plan_id   = :4 
AND    sod.signature = sox.signature 
AND    sod.category  = sox.category
AND    sod.obj_type  = sox.obj_type
AND    sod.plan_id   = sox.plan_id;

The basic sql plan baseline mechanism is by signature (V$SQL.EXACT_MATCHING_SIGNATURE) but SQL$TEXT stores the actual sql text.

How else could it evaluate the baselined plans during evolution? The SQL might not be in memory or in AWR!

The bind data used to execute the statement comes from the hard parse which generated the lower-cost plan of interest and is stored in SYS.SQLOBJ$AUXDATA.BIND_DATA.

Makes sense, right?
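
Just as an illustration, we could unpack that stored bind data the same way we unpacked V$SQL.BIND_DATA earlier – a sketch, assuming SYS access and that EXTRACT_BINDS accepts the stored RAW:

select sox.plan_id, b.name, b.position, b.value_string
from   sys.sqlobj$auxdata sox
,      table(dbms_sqltune.extract_binds(sox.bind_data)) b
where  sox.signature = 8384366354684364262;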

What happens with DML?

There’s nothing I can find which spells it out in black and white, but only the SELECT part of the DML is executed during verification, be it INSERT ... SELECT, UPDATE or MERGE.

Even a sql trace doesn’t make this crystal clear.

But using a similar method to above:

create table t2
(cnt   number 
,col2  varchar2(50) 
,flag  varchar2(2));

var n varchar2(2) 
exec :n := 'N1';
insert into t2
select /*+ domtest */ count(*), max(col2), :n from t1 where flag = :n;
select * from table(dbms_xplan.display_cursor);

select sql_id, child_number, is_bind_aware, is_bind_sensitive, is_shareable, to_char(exact_matching_signature) sig
,      executions, plan_hash_value
from   v$sql 
where  sql_id = '08ravsws1s6bn';

declare
  l_op pls_integer;
begin
  l_op :=
  dbms_spm.load_plans_from_cursor_cache('08ravsws1s6bn');
end;
/

select to_char(signature), sql_handle, sql_text, origin, last_verified, enabled, accepted from dba_sql_plan_baselines;

comment on table t2 is 'flushme';

var n varchar2(2) 
exec :n := 'Y1';
insert into t2
select /*+ domtest */ count(*), max(col2), :n from t1 where flag = :n;
select * from table(dbms_xplan.display_cursor);

select to_char(signature), sql_handle, sql_text, origin, last_verified, enabled, accepted from dba_sql_plan_baselines;

exec dbms_application_info.set_module('DOMTEST2','SPM');
select * from v$sql where module = 'DOMTEST2' order by first_load_time;

set serveroutput on
set long 10000
DECLARE
    r CLOB;
BEGIN
    r := 
    dbms_spm.evolve_sql_plan_baseline
    (sql_handle => 'SQL_4ed45b79c88f3392',verify=>'YES',commit=>'NO');
    dbms_output.put_line(r);
END;
/

We get a similar report and similar evaluation sql.
We can see from ROWS_PROCESSED that no rows were actually inserted.

select s.sql_text, s.child_number, s.plan_hash_value phv
,      (select extractvalue(xmltype(other_xml),'other_xml/info[@type="plan_hash_2"]')
        from   v$sql_plan p
        where  p.sql_id          = s.sql_id
        and    p.plan_hash_value = s.plan_hash_value
        and    p.other_xml is not null) plan_hash_2
, s.rows_processed,s.executions, s.elapsed_time/1000/s.executions elapsed_ms, s.cpu_time/1000/s.executions cpu_ms, s.buffer_gets/s.executions
from   v$sql s
where  s.module = 'DOMTEST2' 
and    s.sql_id = 'b04dpwja6smx7';

SQL_TEXT                                                                                                       CHILD_NUMBER PHV        PLAN_HASH_2 ROWS_PROCESSED EXECUTIONS ELAPSED_MS CPU_MS S.BUFFER_GETS/S.EXECUTIONS
-------------------------------------------------------------------------------------------                    ------------ ---------- ----------- -------------- ---------- ---------- ------ --------------------------
/* SQL Analyze(771,0) */ insert into t2 select /*+ domtest */ count(*), max(col2), :n from t1 where flag = :n  0            1634389831 1634389831  0              10         4.1668     4.1993 944                       
/* SQL Analyze(771,0) */ insert into t2 select /*+ domtest */ count(*), max(col2), :n from t1 where flag = :n  1            2239163167 2239163167  0              10         0.1512     0.1  3                         

I also tried with the addition of a PK, just to double check that rows really couldn’t be inserted. Nothing to see.

Finally…

One note about evolution behaviour and the adjustment of the threshold for plan verification, i.e. the bit which results in this:

Plan was verified: Time used .1 seconds.
Plan passed performance criterion: 314.79 times better than baseline plan.

You can use parameter _plan_verify_improvement_margin, e.g.

ALTER SYSTEM SET "_plan_verify_improvement_margin"=120;

Unfortunately the valid range for the parameter is between 100 and 10000, so not high enough for me to show its influence on the results above (> 300x improvement).

Strategies for Minimising SQL Execution Plan Instability

Execution Plan Instability – What is the problem?

The Oracle Optimizer is a complex piece of software and with every release it becomes more complex.

RBO

In the beginning, the Optimizer was rule-based.

The Optimizer had a ranked list of heuristics used to optimize a query, picking the lowest ranked rule available.

This rule-based mode, whilst still in use with some internal Oracle dictionary queries, has been unsupported since version 10.1.

This means that no code changes have been officially made to the RBO and no bug fixes are provided. There are many features that the RBO is unaware of.

Applications should not still be using the rule-based optimizer.

CBO

The Cost-Based Optimizer is designed to evaluate a number of execution plans and pick the one estimated to be the fastest.

Many of the CBO features are designed to combat common optimization problems and many of these problems occur where development teams are not necessarily aware of the full implications of the features that they are using.

These built-in default behaviours roughly conform to the infamous 80%:20% rule in that most of the time they do a good job but they are not infallible.

Bind Variables, Literals and Histograms

Most of the features which make plan instability difficult to avoid stem from the decision to use bind variables or literals (NOT that they should be mutually exclusive) and from the interaction of the former with histograms.

In article 1 of his Philosophy series, Jonathan Lewis neatly sums this up:

Histograms and bind variables exist for diametrically opposed reasons – they won’t work well together without help

Fundamentally, bind variables exist to provide shareable plans.

Bind variables should be used where we are not interested in getting specific optimizations for differing parameters.

Literals should be used where we want the Optimizer to pay particular attention to data distribution and skew for the specific parameters supplied.

A SQL statement will/should often have some parameters which should be literals and some which should be binds.

From this point onwards, there has been a whole raft of features designed to treat the misuse of one or the other.

In Oracle 8.1.6, histograms were introduced.

Histograms exist to provide specific data distribution information, particularly relevant to specific parameters.

Also, in that version, we got CURSOR_SHARING – targeted at applications using literals instead of binds, such that SQL which was identical apart from the literals was rewritten to use system-generated bind variables.

Then in 9.2, we got bind variable peeking.

This feature was introduced so that the optimizer could peek at the values supplied at parse time and use data distribution information specific to these parse-time values to generate an execution plan which suited those values.

In addition, and at the same time, through these various versions to the present day, we have had the default behaviour of DBMS_STATS statistics gathering letting the database decide which columns to create histograms on, based on the SQL which has been running.

This means that new histograms can suddenly spring up – or existing histograms unexpectedly disappear – on all sorts of columns. This can be problematic on columns with large numbers of distinct values AND particularly so on join cardinalities where there may be a mismatch of histograms on both sides of the join.

Ever since this point, we have had a conflict of interest in feature usage and an ever-increasing amount of additional functionality to battle against this conflict – adaptive cursor sharing, cardinality feedback, etc.

Finally, the education message got blurred or lost somewhere along the line to the extent that a lot of systems blindly overuse bind variables because of the perceived performance impact of using literals.

This situation is not helped by the fact PL/SQL is designed to encourage bind variables.

Using supplied parameters as literals means using some construct of dynamic SQL, not difficult but nevertheless an added complexity and also another feature which is often blindly discouraged.

SQL Execution Plan Instability – Is SPM a viable approach?

SPM Overview

In Oracle 11g, Oracle introduced SQL Plan Baselines as a SQL Plan Management feature.

The core attraction of this functionality is that you can “lock in” a particular plan or plans for a SQL statement. This stores a set of outline hints and a specific plan hash value in relation to a specific SQL signature.

The Optimizer then uses that set of hints to try to reproduce the desired plan. If it can, it will; if it can’t, it will reject the hint set.

Additionally, the Optimizer completes its best-cost execution plan optimization anyway so that it can provide the ability to do a controlled evolution of baselined plans in the event that the lower-cost plan that it would have used is different and better performing than the baselined plan.

To use this, the database parameters (session or system) just need to be configured to capture plans into a baseline and then to use them.
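
A minimal sketch of that configuration (shown system-wide; the same parameters can be set at session level):

-- capture repeated statements into baselines
alter system set optimizer_capture_sql_plan_baselines = true;
-- use accepted baselined plans at optimization time (the default)
alter system set optimizer_use_sql_plan_baselines = true;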

There is flexibility to this capture/use. You can capture everything as it is generated; you could capture everything from memory now and/or at regular intervals and import into a baseline; you could capture everything from AWR into a SQL Tuning Set and then import into a baseline; or you could capture everything from another environment and export/import it into another.
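
For example, a sketch of the AWR route – the snapshot range, filter and SQL Tuning Set name are made up for illustration:

DECLARE
  c dbms_sqltune.sqlset_cursor;
  l pls_integer;
BEGIN
  dbms_sqltune.create_sqlset(sqlset_name => 'MY_STS');
  OPEN c FOR
    SELECT VALUE(p)
    FROM   table(dbms_sqltune.select_workload_repository
                 (begin_snap   => 1000
                 ,end_snap     => 1010
                 ,basic_filter => q'[parsing_schema_name = 'MYAPP']')) p;
  dbms_sqltune.load_sqlset(sqlset_name => 'MY_STS', populate_cursor => c);
  l := dbms_spm.load_plans_from_sqlset(sqlset_name => 'MY_STS');
  dbms_output.put_line('Plans loaded: '||l);
END;
/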

And at any point, you can turn off capture and continue to use those which you currently have – this usage continues to capture any lower-cost plans that the optimizer would have generated for any existing baselined plans.

For a more detailed look at SPM and a comparison with SQL Profiles, see the documentation.

Sounds good – why isn’t it the silver bullet for system-wide stability?

This approach might be good enough, but it is not a cast-iron guarantee of system-wide stability.

There are a number of reasons why not.

New & Changed SQL

Firstly, you need to have captured all your plans already. If you get brand new SQL, then there will be nothing in SPM.

Depending on your application, this may be a significant concern.

For example, consider an application making heavy usage of Hibernate which generates SQL statements.

A minor change in the domain model can mean a change of system-generated table alias in many statements.

As a result, you may suddenly get a lot of brand new SQL and significant numbers of baselined statements which you will now never see again.

What are the baselined plans based on? Are they the best plans? The only plans ever used?

If you suddenly baseline the plans for a large number of SQL statements, you are dictating which plan is to be used.

The plans will be based on the parsed plans in memory or in AWR at the time.
Are these the best plans?

Does/should this SQL statement always use this plan?

Are there normally multiple plans for different bindsets?

What if you normally get benefit from adaptive cursor sharing?

ACS and baselines

What if your application benefits from adaptive cursor sharing?

Sure, you can baseline multiple plans but these plans have no ACS information.

As soon as that ACS information is no longer in memory (as happens), there is no shortcut in a baseline to regain it; you still have to have the multiple executions required for the optimizer to recognize which plans to use for which bind sets.

Parsing overhead

Widespread usage of baselines might, depending on your system performance profile, have a significant impact on the parsing resources.

This is because it always generates a best-cost plan anyway.

Then, if that is not the baselined plan, it will use the baselined hint set to try to reproduce the plan with the specific plan hash.

In addition, if that is not possible, it will use just the OPTIMIZER_FEATURES_ENABLE hint to try to generate the required plan.

So, in a heavily-baselined system, you might be doing 2x the parse work of a non-baselined system.

This might well be easily accommodated but there are systems where this would cause a problem.

Integration with development and testing processes

A SQL Plan Baseline is tied to a SQL statement based on the statement’s EXACT_MATCHING_SIGNATURE – a hash of the SQL statement which has been case and space normalized.

If a SQL statement materially changes, the baseline no longer applies.
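
You can see the normalization – and its limits – with DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE; the statements below are made up purely for illustration:

select dbms_sqltune.sqltext_to_signature('select * from t1 where flag = :n')       sig_1
,      dbms_sqltune.sqltext_to_signature('SELECT  *  FROM t1 WHERE flag = :n')     sig_2 -- same as sig_1
,      dbms_sqltune.sqltext_to_signature('select * from t1 x1 where x1.flag = :n') sig_3 -- different signature
from   dual;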

How aware are developers of the presence of baselines?

And how to integrate with the development process?

How will our release process deal with baselines?

And if baselining large numbers of plans is being considered, then we have to think about where these will be generated.

The natural implication (of widespread baseline usage) is that new functionality being promoted to Prod would have a set of tested, baselined plans accompanying it and these would presumably have to be generated in a Prod-like environment which included Prod-like volumes.

SQL Execution Plan Instability – Decision Time

There is a decision to be made.

(And/or perhaps there is often a conceptual over-simplification by senior management to combat? Or at least a lack of deep understanding of the beast that we’re dealing with here?)

Do you want the Optimizer to try to get a better execution plan sometimes?

If the answer is yes, then you have to accept that it will get it wrong from time to time.

In particular, the various feedback and adaptive mechanisms are designed to recognize that they have got it wrong.

BUT they need that problematic execution in the first place – sometimes more than one – to recognize that fact.

That one problematic execution could be your next Priority 1 incident.

In addition, the feedback mechanism is not perfect and it still can make subsequent executions worse in some circumstances.

SQL Execution Plan Instability – Turn it off?

IF your primary goal is plan stability – and I think many teams would claim this is their goal without embracing its implications – then perhaps a better decision is to turn off the various features which cause, or combine to cause, most of the problems of instability.

Appropriate usage of binds, appropriate usage of literals

Choose whether to use a bind variable or a literal as is appropriate for the value / column / SQL statement.

A SQL statement might have a mix of both.
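
Purely as an illustration, along the lines of the earlier example table, a statement might use a literal for a skewed low-cardinality column and a bind elsewhere:

select count(*), max(col2)
from   t1
where  flag = 'Y1'      -- literal: skewed column with a histogram, plan should reflect this value
and    col1 > :min_id;  -- bind: evenly distributed column, one shared plan is fine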

DBMS_STATS defaults

A METHOD_OPT of FOR ALL INDEXED COLUMNS SIZE AUTO is an immediate red flag. This is never a good setting.

FOR ALL COLUMNS SIZE AUTO without table-level preferences (SET_TABLE_PREFS) is another red flag.

As an interim step, consider using FOR ALL COLUMNS SIZE REPEAT to lock in the current histogram usage.

The end goal should be to have table level preferences set for all tables.

This relies on knowing your data, your data distribution, your code, and knowing which histograms make sense (i.e. for skewed columns) – it will be far fewer than gathered by default.

For columns with significant skew across a large number of distinct values, it may be necessary to manually craft the histograms.
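
A sketch of setting such a table-level preference (the column list is made up for illustration):

begin
  dbms_stats.set_table_prefs
  (ownname => user
  ,tabname => 'T1'
  ,pname   => 'METHOD_OPT'
  ,pvalue  => 'FOR ALL COLUMNS SIZE 1 FOR COLUMNS flag SIZE 5');
end;
/
-- subsequent default gathers then honour the preference
exec dbms_stats.gather_table_stats(user,'T1');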

Volatile tables

Volatile tables should have stats set to an appropriate setting to generate appropriate plans for all situations and then those stats should be locked.

Stats which are gathered at any point during the volatility cycle may be good or may be problematic.

Similarly, dynamic sampling can only see the data at the time of hard parse – you might be lucky and this might be better than stats which say 0 rows, but it can be a time bomb.
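
A sketch of that approach for a hypothetical volatile table VOLATILE_T – set representative stats manually and lock them:

begin
  -- representative "busy" figures chosen manually
  dbms_stats.set_table_stats
  (ownname => user
  ,tabname => 'VOLATILE_T'
  ,numrows => 100000
  ,numblks => 1000);
  dbms_stats.lock_table_stats(user,'VOLATILE_T');
end;
/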

Turn off optimizer features

Turning off optimizer features might be best done via a LOGON trigger, turning them off just for a subset of application users (a sketch of such a trigger follows at the end of this section). These features include:

  • Bind Variable Peeking – off via _optim_peek_user_binds = false
  • Cardinality Feedback and ACS – largely disabled by turning off bind variable peeking, but explicitly off via _optimizer_use_feedback = false, _optimizer_adaptive_cursor_sharing = false, _optimizer_extended_cursor_sharing_rel = none
  • Dynamic Sampling – optimizer_dynamic_sampling to 0
  • 12c additional adaptive features – e.g. adaptive execution plans

Additionally, it probably makes sense to turn off the adaptive direct path read behaviour, or anything with the word adaptive or dynamic in it or associated with it.

This functionality decides whether or not to do full segment scans via the buffer cache; the behaviour is a runtime decision depending on the size of the object relative to the buffer cache and on how much of the segment is currently in the cache.

  • Adaptive direct path reads – _serial_direct_read = always

All too often I’ve seen a concurrently executed SQL statement switch to a “bad” plan involving a full table scan delivered via direct path reads, stressing the IO subsystem because of the number of concurrent executions of that query, which then affects performance across the DB.
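
A sketch of such a LOGON trigger, with a made-up list of application accounts and using the parameters listed above:

create or replace trigger trg_stability_logon
after logon on database
begin
  if user in ('APP_USER1','APP_USER2') then  -- hypothetical application accounts
    execute immediate 'alter session set "_optim_peek_user_binds" = false';
    execute immediate 'alter session set "_optimizer_use_feedback" = false';
    execute immediate 'alter session set "_optimizer_adaptive_cursor_sharing" = false';
    execute immediate 'alter session set "_optimizer_extended_cursor_sharing_rel" = none';
    execute immediate 'alter session set optimizer_dynamic_sampling = 0';
    execute immediate 'alter session set "_serial_direct_read" = always';
  end if;
end;
/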

Caveat

The actions above are still not sufficient to guarantee plan stability but, for this goal above all else, this is likely to be the most appropriate action.

However, to further guarantee stability, it is still likely that some degree of hinting – whether via manual hints, SQL profiles or baselines – will be necessary for the small number of SQL statements where the initial cost-based plan is not appropriate, e.g. GTTs and other such situations.

SQL Execution Plan Instability – Summary & Opinion

The actions discussed above are made on the basis that we want to minimise the chance of execution plan instability at all costs.

By making this decision, we are prioritizing stability over all the features within Oracle designed to generate better plans for specific situations, sets of binds, etc.

Personally, I always recommend going with the default behaviour until such time as it causes a significant problem.

I also always recommend matching the scope of a solution to the scope of a problem.

For example, if we have a problem with one or two SQL statements, the potential solutions should be limited to those SQL statements.

We should never be making changes with a potential system-wide impact for the sake of a couple of SQL statements.

And even parameter changes can be injected into specific SQL statements, either via a SQL Patch or via manual hints.
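
For example, a sketch of injecting a parameter change into one statement via a SQL Patch, using the undocumented 11g/12.1 interface (12.2 onwards has the documented DBMS_SQLDIAG.CREATE_SQL_PATCH); the hint and patch name are made up:

begin
  sys.dbms_sqldiag_internal.i_create_patch
  (sql_text  => 'select /*+ domtest */ count(*), max(col2) from t1 where flag = :n'
  ,hint_text => 'OPT_PARAM(''_optimizer_use_feedback'' ''false'')'
  ,name      => 'domtest_patch');
end;
/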

In my opinion, having multiple plans for a SQL statement is expected.

But what is under-appreciated is the extent to which this is normal.

Plan changes are normally only noticed when they cause a problem; the significant number of statements which regularly change plans without impact tend to go unnoticed.

It is also my opinion that SQL execution issues occur mostly when the SQL is badly written or when incompatible features are combined – e.g. peeked binds together with histograms – and I prefer to tackle that root cause rather than a generic concept of plan instability being inherently problematic.

A future path?

Might it be a good idea for there to be a built-in choice of optimizer “umbrella” modes – maximum performance vs maximum stability which would turn on/off some of the features above and/or adjust their sensitivity/aggressiveness?

DBMS_STATS.PURGE_STATS

Prior to 11.2.0.4, the optimizer stats history tables are unpartitioned and DBMS_STATS.PURGE_STATS has little choice but to do a slow delete of stats before the parameterised input timestamp.

Why might you be purging? Here’s one such illustration:
https://jhdba.wordpress.com/tag/dbms_stats-purge_stats/

This delete can be slow if these tables are large, and there are a number of reasons why they might be large – notably if MMON cannot complete the purge within its permitted timeframe.
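
It is worth checking what you are retaining before purging – a sketch:

select dbms_stats.get_stats_history_retention    retention_days
,      dbms_stats.get_stats_history_availability oldest_available
from   dual;

-- reduce the retention (days) so there is less for MMON to keep on top of
exec dbms_stats.alter_stats_history_retention(14);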

But note that if you’re happy to purge all history, there is a special TRUNCATE option if you make the call with a magic timestamp:

exec DBMS_STATS.PURGE_STATS(DBMS_STATS.PURGE_ALL);

but Oracle Support emphasises that:

This option is planned to be used as a workaround on urgent cases and under the advice of Support…

Ah… the old magic value pattern / antipattern!

PURGE_ALL CONSTANT TIMESTAMP WITH TIME ZONE :=
 TO_TIMESTAMP_TZ('1001-01-0101:00:00-00:00','YYYY-MM-DDHH:MI:SSTZH:TZM');

As part of the upgrade to 11.2.0.4, one gotcha is that these history tables become partitioned.

I don’t have a copy of 11.2.0.4 to hand but I do have 12.1.0.2 and the tables here are daily interval partitioned so I presume this is the same.

One plus side of this newly partitioned table is that PURGE_STATS can now drop old partitions, which is quicker than deleting; a minor downside is that the tables have global indexes, so the recursive/internal operations have to be done with UPDATE GLOBAL INDEXES.

One curiosity in the trace file from this operation was this statement:

delete /*+ dynamic_sampling(4) */ 
from   sys.wri$_optstat_histhead_history
where  savtime_date < to_date('01-01-1900', 'dd-mm-yyyy') 
and    savtime not in (select timestamp '0000-01-01 00:00:00 -0:0' + sid + serial#/86400
                       from   gv$session 
                       where  status = 'ACTIVE' 
                       and    con_id in (0, sys_context('userenv', 'con_id')))       
and    rownum <= NVL(:1, rownum)

This is deleting from the P_PERMANENT default partition but why is this necessary and what is that subquery all about, particularly the timestamp '0000-01-01 00:00:00 -0:0' + sid + serial#/86400 bit?

Call vs Exec

Just a reference to a really simple difference between CALL and EXEC.
I thought I had mentioned this before but couldn’t find it so…

EXEC/EXECUTE is a SQL*Plus command which wraps the proc call in an anonymous BEGIN … END; block.

CALL is a SQL command hence it is limited to SQL data types and there are other restrictions which the documentation sums up pretty well.

Because CALL is SQL, there is one key behavioural difference which caused a bug on a project a few years ago when Java code was calling a stored proc not with BEGIN … END; but with CALL and ended up swallowing certain exceptions:

SQL> l
  1  create or replace procedure p1
  2  as
  3  begin
  4    raise no_data_found;
  5* end;
SQL> /

Procedure created.

SQL> exec p1;
BEGIN p1; END;

*
ERROR at line 1:
ORA-01403: no data found
ORA-06512: at "PGPS_UAT1.P1", line 4
ORA-06512: at line 1


SQL> call p1();

Call completed.

SQL>

SQL expects and handles certain exceptions.

So always use BEGIN ... END; blocks in application code rather than CALL.

OLTP Compression, Drop Column, Partition Exchange

With Basic Compression, you cannot drop a column.

create table t1
(col1 number
,col2 number)
compress;
 
table T1 created.

alter table t1 drop column col2;

SQL Error: ORA-39726: unsupported add/drop column operation on compressed tables
39726. 00000 -  "unsupported add/drop column operation on compressed tables"
*Cause:    An unsupported add/drop column operation for compressed table
           was attemped.
*Action:   When adding a column, do not specify a default value.
           DROP column is only supported in the form of SET UNUSED column
           (meta-data drop column).

But with Basic Compression, you can set as unused.

alter table t1 set unused column col2;

table T1 altered

With OLTP Compression, on the face of it, you can drop a column.

drop table t1;

create table t1
(col1 number
,col2 number)
compress for oltp;

select table_name, column_name, hidden_column 
from   user_tab_cols 
where  table_name = 'T1' order by column_id;

TABLE_NAME COLUMN_NAME HIDDEN_COLUMN
---------- ----------- -------------
T1         COL1        NO
T1         COL2        NO

alter table t1 drop column col2;

table T1 altered

But this is a lie/misdirection.
In reality, the dropped column is just renamed.
We see the column is not dropped but hidden:

select table_name, column_name, hidden_column 
from   user_tab_cols 
where  table_name = 'T1' order by column_id;

TABLE_NAME COLUMN_NAME                HIDDEN_COLUMN
---------- -------------------------- -------------
T1         COL1                       NO
T1         SYS_C00002_14111103:30:14$ YES

Note
1. The hidden column name is system generated and suffixed with YYMMDDHH24:MI:SS$
2. Hidden columns are not exposed via USER_TAB_COLUMNS hence usage of USER_TAB_COLS.
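
A quick way to spot such leftover hidden columns across a schema (a sketch; the name pattern is based on the example above):

select table_name, column_name
from   user_tab_cols
where  hidden_column = 'YES'
and    column_name like 'SYS\_C%$' escape '\'
order  by table_name, column_name;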

I’m not a fan of this behaviour.
It is intended and documented.

Not in OLTP Compression master note 1223705.1.
https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=1223705.1

But in doc id 1068820.1
https://support.oracle.com/CSP/main/article?cmd=show&type=NOT&id=1068820.1

Only advanced compression, which requires the Advanced Compression license, supports drop column on compressed tables, but even in this situation there is no real drop: internally the database sets the column UNUSED to avoid long-running decompression and recompression operations.

Why might this matter?

I can think of at least one time when this might matter – partition exchange.
Ok, the circumstances are probably niche and more relevant for a DEV environment than PROD.
But imagine you add a column to one side of the partition exchange operation and then think better of it and immediately drop it.

drop table t1;
drop table t2;

create table t1
(col1 number
,col2 number)
compress for oltp;
 
create table t2
(col1 number
,col2 number)
partition by range (col1) interval(1)
(partition p0 values less than (0))
compress for oltp;

Show that partition exchange works initially:

lock table t2 partition for (1) in exclusive mode;

lock succeeded.

alter table t2 exchange partition for (1) with table t1;

table T2 altered.

But if I add and drop a column

alter table t1 add col3 number;

table T1 altered.

alter table t1 drop column col3;

table T1 altered.

Then my partition exchange no longer works:

alter table t2 exchange partition for (1) with table t1;

SQL Error: ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION
14097. 00000 -  "column type or size mismatch in ALTER TABLE EXCHANGE PARTITION"
*Cause:    The corresponding columns in the tables specified in the
           ALTER TABLE EXCHANGE PARTITION are of different type or size
*Action:   Ensure that the two tables have the same number of columns
           with the same type and size.

Until I recreate the table or use DBMS_REDEFINITION.
Or until I do the same to the other side of the exchange operation:

alter table t2 add col345 number;

table T2 altered.

alter table t2 drop column col345;

table T2 altered.

Then

alter table t2 exchange partition for (1) with table t1;

table T2 altered.

Oracle 12c Multitenant – Inmemory admin basics

A couple of very, very basic observations on getting going with 12c Inmemory in a multitenant database.

1. When trying to set inmemory_size within a PDB when inmemory_size is 0 in the CDB
ORA-02096: specified initialization parameter is not modifiable with this option

SQL> alter session set container = cdb$root;

Session altered.

SQL> select value from v$parameter where name = 'inmemory_size';

VALUE
--------------------------------------------------------------------------------
0

SQL> alter session set container = orcl;

Session altered.

SQL> alter session set inmemory_size=100M;

alter session set inmemory_size=100M
                  *
ERROR at line 1:
ORA-02096: specified initialization parameter is not modifiable with this option

2. You have to use scope=spfile when setting inmemory_size on the CDB, and it requires a restart to take effect

SQL> alter session set container = cdb$root;

Session altered.

SQL> alter system set inmemory_size = 500M;

alter system set inmemory_size = 500M
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-02095: specified initialization parameter cannot be modified

SQL> alter system set inmemory_size = 500M scope=spfile;

System altered.

SQL> select * from v$sga;

NAME                 VALUE      CON_ID
-------------------- ---------- ----------
Fixed Size           2926472    0
Variable Size        1224738936 0
Database Buffers     905969664  0
Redo Buffers         13848576   0

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup
ORACLE instance started.

Total System Global Area 2147483648 bytes
Fixed Size               2926472 bytes
Variable Size            1275070584 bytes
Database Buffers         318767104 bytes
Redo Buffers             13848576 bytes
In-Memory Area           536870912 bytes
Database mounted.
Database opened.

3. You don’t use scope=spfile when setting it in a PDB, otherwise you get
ORA-02096: specified initialization parameter is not modifiable with this option

SQL> alter session set container = orcl;

Session altered.

SQL> alter system set inmemory_size=400M scope=spfile;

alter system set inmemory_size=400M scope=spfile
                                               *
ERROR at line 1:
ORA-02096: specified initialization parameter is not modifiable with this option

SQL> alter system set inmemory_size=400M;

System altered.

4. If you try to set the PDB inmemory_size larger than the CDB then you get
ORA-02097: parameter cannot be modified because specified value is invalid

SQL> alter session set container = orcl;

Session altered.

SQL> alter system set inmemory_size=600M;

alter system set inmemory_size=600M
*
ERROR at line 1:
ORA-02097: parameter cannot be modified because specified value is invalid
ORA-02095: specified initialization parameter cannot be modified

Added:

5. Similar to point (2), if we change the inmemory size of the CDB, we have to restart for it to take effect

SQL> select sys_context('USERENV','CON_NAME') from dual;

SYS_CONTEXT('USERENV','CON_NAME')
--------------------------------------------------------------------------------
CDB$ROOT

SQL> alter system set inmemory_size=500M scope=spfile;

System altered.

SQL> select value from v$parameter where name = 'inmemory_size';

VALUE
--------------------------------------------------------------------------------
218103808

SQL> shutdown immediate  
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup
ORACLE instance started.

Total System Global Area 2147483648 bytes
Fixed Size               2926472 bytes
Variable Size            1157630072 bytes
Database Buffers         436207616 bytes
Redo Buffers             13848576 bytes
In-Memory Area           536870912 bytes
Database mounted.
Database opened.
SQL> 

6. If we modify inmemory_size for a PDB, it takes effect immediately (presumably because it is really just a maximum quota on the actual memory structure established in the CDB?):

SQL> alter session set container = orcl;

Session altered.

SQL> alter system set inmemory_size=200M;

System altered.

SQL> select value from v$parameter where name = 'inmemory_size';

VALUE
--------------------------------------------------------------------------------
209715200

SQL> alter system set inmemory_size=400M;

System altered.

SQL> select value from v$parameter where name = 'inmemory_size';

VALUE
--------------------------------------------------------------------------------
419430400

7. While we showed in point (4) that a single PDB’s inmemory_size cannot exceed the CDB size, we can overallocate such “maximum quotas” across all PDBs – i.e. the sum of inmemory_size across all PDBs can exceed the CDB size (which prompts future investigation):

SQL> alter session set container=cdb$root;

Session altered.

SQL> select value from v$parameter where name = 'inmemory_size';

VALUE
--------------------------------------------------------------------------------
536870912

SQL> alter session set container=orcl;

Session altered.

SQL> select value from v$parameter where name = 'inmemory_size';

VALUE
--------------------------------------------------------------------------------
419430400

SQL> alter session set container=cdb$root;

Session altered.

SQL> create pluggable database dom from orcl file_name_convert=
 2  ('/home/oracle/app/oracle/oradata/cdb1/orcl',
 3  '/home/oracle/app/oracle/oradata/cdb1/dom');

Pluggable database created.

SQL> alter pluggable database dom open;

Pluggable database altered.

SQL> alter session set container = dom;

Session altered.

SQL> alter system set inmemory_size = 400M;

System altered.

SQL> select value from v$parameter where name = 'inmemory_size';

VALUE
--------------------------------------------------------------------------------
419430400
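
A sketch of one way to see the setting for all containers from the root (assuming the per-PDB values have been set as above):

select con_id, name, value
from   v$system_parameter
where  name = 'inmemory_size'
order  by con_id;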

Plan Instability

There seems to me to be a relatively simple choice.

Either you accept that the Oracle Optimizer has a wealth of complicated strategies and, in this complex effort to get the best executions it can, will inevitably get it wrong sometimes (or take some extra executions to realise it’s wrong).

Or you stick your head in the sand and raise a bug for every unexpected poor execution or plan flip.

But let’s say that above else you wanted Plan Stability.

This is a widespread desire.

What would be the best strategy?
And to what lengths would you take it?

SQL Plan Management features are designed to give you some degree of stability.

You could baseline statements which have given you a problem in the past.

What if that is not deemed adequate?

So, on the one hand, you could baseline everything you could find.

Taking a long history of AWR and regular snapshots of V$SQL, you could put them all in a SQL Plan Baseline.

But, depending on your application, you might regularly get new SQL appear unprotected by a baseline.
In one Hibernate application I know, a change in the application domain model often results in the table aliases changing, meaning that there is lots of brand new, but very familiar, SQL.
So, you then become dependent on having a reliable testing infrastructure which will generate nearly all the SQL you’re likely to get, in advance of production.

In addition, you might want multiple plans in your baseline – for ACS for example – and then, once that extra bind-set awareness is flushed out of memory, you need a couple of executions to rediscover it (unless you use a SQL Patch to inject BIND_AWARE into the specific SQL statements).
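
That SQL Patch approach might look something like this – a sketch using the undocumented 11g/12.1 interface, with a made-up statement and patch name:

begin
  sys.dbms_sqldiag_internal.i_create_patch
  (sql_text  => 'select count(*), max(col2) from t1 where flag = :n'
  ,hint_text => 'BIND_AWARE'
  ,name      => 'bind_aware_patch');
end;
/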

It’s certainly no guarantee of stability.

What is the alternative?

I like to argue that most problematic executions stem from two sources:

  1. Badly written SQL
  2. Using poor combinations of features – e.g. bind variable peeking + histograms

The other day I made a comment to someone that “if you’re heavily reliant on ACS, perhaps you should rethink whether you should be using literals rather than bind variables”.

So, you might then take the position that plan instability stems from increasing complexity in the optimizer.

In which case, maybe a viable strategy might be to turn off much of the complexity:

  • Bind Variable Peeking
  • Cardinality feedback
  • Adaptive Cursor Sharing
  • Dynamic Sampling
  • Adaptive direct path reads
  • 12c adaptive execution plans
  • Anything with the word “adaptive” in it?
  • Default Stats job and/or default method_opt histogram behaviour

This seems quite radical to many. Perhaps justifiably.

Perhaps it would be nice if there were new optimizer strategies – OPTIMIZER_MODE? – perhaps “AGGRESSIVE_PERFORMANCE” (maybe “ADAPTIVE_PERFORMANCE_WITH_THE_ODD_HICCUP” ?) vs “PERFORMANCE_STABILITY” which were an umbrella of many of the features above.

Thoughts?

To what extent are you concerned with performance stability, above all else?

How many of the optimizer features in each release are actually aligned with that widespread concern?
