Developing for the Oracle Database

1 Coding Standards
What this document does not claim to offer is a prescriptive guide on the minutiae of coding standards.

Coding standards and naming conventions, particularly in SQL and PL/SQL, are an emotive and divisive subject.

This is largely the fault of a historical absence of mature IDEs and a lack of any rigorous, generally accepted standards as seen with other languages.

Many developers still hand-craft scripts and routines and the various tools available often have differences, subtle or otherwise, in what built-in formatting they can offer.

Good developers can adapt, good standards can adapt.

The most important objectives for coding standards are to:
• Make development faster and debugging easier
• Make other people’s code easier to understand
• Limit bugs

The following expectations support these objectives:
• Where possible use SQL before resorting to PL/SQL
• Code will be reasonably formatted and legible, preferably with a consistent style within the module at least.
• It is preferred but not mandated that Oracle keywords be capitalized – e.g. CREATE OR REPLACE PACKAGE, SELECT, UPDATE, DELETE, FROM – and lowercase used for objects and columns, etc
• In SELECT statements, tables should be aliased – this provides a very small benefit to the optimizer when parsing but also prevents bugs particularly in subqueries.
• Procedures, cursors and variables, etc should be properly scoped – e.g. public vs private, global vs local, parameter scope, etc
• Meaningful names will be given for code items
• Reasonable naming might include some sort of prefixed or suffixed indicator of purpose, e.g. k_ for constants, l_ or v_ for local variables, g_ for global variables, p_ for procedure, f_ for function, _pkg for package, i_ for in parameters, o_ for out parameters, io_ for in out parameters (see the sketch after this list).
• Package and procedure level comments should indicate why/when a particular program was changed but SVN, or other code repositories, are the appropriate mechanism for code control.
• Code comments should be used when they add value.
• Excessive commenting and stating the obvious should be avoided – these are often more effective when refactoring the code concerned into a private routine (procedure/function) which is named appropriately (e.g. function f_format_swift_string).
• CamelCase is not considered appropriate for database code as identifiers are stored uppercase in the data dictionary (unless quoted), so any case distinction is lost.
• Package headers and bodies should be checked into separate files for clarity and to prevent unnecessary recompilation of unchanged code and dependencies (version dependent)
• Performance should be built in and evidence of such documented.
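
As an illustration of the naming conventions above, here is a minimal sketch (the package, procedure and variable names are all hypothetical):

CREATE OR REPLACE PACKAGE demo_pkg
AS
  g_run_count PLS_INTEGER := 0;                      -- g_ : global (package-level) variable

  PROCEDURE p_next_run (i_run_name IN  VARCHAR2,     -- i_ : IN parameter
                        o_run_id   OUT PLS_INTEGER); -- o_ : OUT parameter
END demo_pkg;
/

CREATE OR REPLACE PACKAGE BODY demo_pkg
AS
  k_max_runs CONSTANT PLS_INTEGER := 100;            -- k_ : constant (private to the body)

  PROCEDURE p_next_run (i_run_name IN  VARCHAR2,
                        o_run_id   OUT PLS_INTEGER)
  AS
    l_next_id PLS_INTEGER;                           -- l_ : local variable
  BEGIN
    g_run_count := g_run_count + 1;
    l_next_id   := LEAST(g_run_count, k_max_runs);
    o_run_id    := l_next_id;
  END p_next_run;
END demo_pkg;
/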

2 Writing Optimal SQL

2.1 Key points
Writing Optimal SQL should be relatively simple but many people struggle, particularly when making the transition from an object/attribute language to a set-based language like SQL.

The key tenets of performant database code in Oracle are:
• Think in sets.
• Think about how the database might be able to process and apply logic to a set of data with great efficiency.
• Ask the right question in the best way.
• Know your data.

In support, when thinking about database code and SQL operations:

• If the query needs to be long/big, make it long/big.
• Bulk operations are critical; row-by-row operations are a cardinal performance sin (see the sketch after this list).
• Eliminate data at the earliest opportunity.
• Sort on the smallest possible set – if possible avoid aggregations, sorting and distinct operations on the largest sets of data.
• Use bind variables when you require shareable SQL and when bind variables make sense
• Use literals when literals make sense.
• Use a mix of binds and literals if appropriate.
• Avoid PL/SQL in SQL.
• Be careful of applying functions (TRUNC, etc) to columns in the WHERE clause.
• User-defined functions which are called from SQL and which themselves contain SQL are, almost without exception, unacceptable.
• Never rely on implicit datatype conversion. Use the correct datatypes for parameters and where possible convert parameters NOT columns.
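
To illustrate the bulk-operations point above, a minimal sketch (the orders table and its columns are hypothetical):

-- Row-by-row: one execution and one context switch per row; a cardinal sin at volume
BEGIN
  FOR r IN (SELECT order_id FROM orders WHERE status = 'NEW')
  LOOP
    UPDATE orders SET status = 'QUEUED' WHERE order_id = r.order_id;
  END LOOP;
END;
/

-- Set-based: one statement does the same work in a single pass
UPDATE orders
SET    status = 'QUEUED'
WHERE  status = 'NEW';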

2.2 Thinking in Sets

Crucial.

For further reading on thinking in sets, see:
http://explainextended.com/2009/07/12/double-thinking-in-sql/

2.3 What’s the question?

When writing SQL, focus on the question being asked by the SQL statement.

If you put the question into words as a comment before a complex SQL statement, then this can often add value to the next developer.

Often the most performant version of a SQL statement is the one which asks the question at hand in the most natural way.

To this end, proper consideration needs to be given to:
• Subqueries – EXISTS / IN / NOT EXISTS / NOT IN
• Set-operators – MINUS, UNION, UNION ALL, INTERSECT
• Use of DISTINCT is often an indication of an incorrectly formulated SQL statement or poor design
• Common Table Expressions:

Often it can help to use Common Table Expressions (CTEs), aka the WITH clause, to separate the main logic of the query from the subsequent fetching of additional data/attributes, e.g.

WITH main_logic AS
(SELECT …
FROM …
WHERE …)
SELECT ml.*, x.this, y.that, z.something_else
FROM main_logic ml
, …
WHERE …;

2.4 To ANSI or Not To ANSI

Another divisive subject is ANSI SQL vs Oracle Join syntax.

Again, this guide should not seek to be prescriptive on the preference of one over the other.

The bottom line should be that if a developer finds it easier to write a correct and optimal SQL statement using one rather than the other, then that is most important.

There are some SQL statement constructs which are more conveniently written in ANSI – the FULL OUTER JOIN for example.

It is also true that the optimizer always transforms ANSI SQL to the equivalent Oracle syntax and there are some limitations to the optimizer’s other complex query transformations when using ANSI SQL.

And unfortunately there are bugs in both.

2.5 Eliminate Early

Where there are predicates (WHERE clauses) which can significantly reduce the dataset early, check that they are being applied early enough in the execution plan (more on execution plans to follow). If they are not, check whether the SQL statement might be rephrased or reconstructed (e.g. via CTE/WITH) to make sure they are applied at an appropriate stage.

2.6 Sort / Aggregate on the smallest possible dataset

Similar to eliminate early. Sorting and aggregating requires memory and under certain conditions can spill to expensive (unscalable) disk operations.
Wherever possible, do the sort or aggregation on the smallest set of rows (not necessarily applicable to the order by clause of a query).

2.7 What’s the big deal with PL/SQL functions called from SQL?

The bottom line is that it’s about performance.

We could get into a whole argument about reusability vs performance, but performance eventually wins.

Often the correct mechanism for reusability in the Oracle database is not a function but a view joined to appropriately in the main SQL.

Functions cause a relatively expensive context switch between SQL and PLSQL engines.

In the name of reusability, functions encourage row-by-row operations and discourage thinking in sets.

If the function itself contains SQL, then this SQL will not be part of the read consistency mechanism of the calling statements which can be potentially problematic.

If you absolutely have to, have to, have to use functions in SQL, then think again.

Then if you really, really do then please look at deterministic functions and consider wrapping the function call in a (select from dual) to expose the potential benefits of subquery caching for functions called with repeated parameters.
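
For example, a sketch of that wrapping, reusing the hypothetical formatting function from section 1 (the transactions table is likewise hypothetical):

-- Direct call: the function may be executed once per row
SELECT f_format_swift_string(t.swift_code)
FROM   transactions t;

-- Wrapped in a scalar subquery: repeated values of swift_code can hit the subquery cache
SELECT (SELECT f_format_swift_string(t.swift_code) FROM dual)
FROM   transactions t;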

2.8 What about simple functions like TRUNC in the WHERE clause?

Using functions on columns in the WHERE clause can prevent the optimizer from using an index or from pruning a partition unless a function-based index is in place on the column.

For this reason, it is often best to avoid this sort of construct:
WHERE TRUNC(some_date_column) = TO_DATE('01-NOV-2013','DD-MON-YYYY')

In favour of this:
WHERE some_date_column >= TO_DATE('01-NOV-2013','DD-MON-YYYY')
AND   some_date_column <  TO_DATE('02-NOV-2013','DD-MON-YYYY')
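
If the expression really is needed in the predicate, a function-based index on the same expression can restore index access; a sketch, with assumed table and index names:

CREATE INDEX some_table_trunc_dt_ix ON some_table (TRUNC(some_date_column));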

2.9 Using the correct datatype, be explicit

Performance problems related to using the incorrect datatypes are common.

The optimizer will implicitly add functions to make sure the datatypes on both sides of the predicate match.

Always convert date-like parameters to DATEs where the column datatype is also DATE.

Never rely on implicit datatype conversion.
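
For example, a sketch of the classic trap (the orders table and its VARCHAR2 order_ref column are hypothetical):

-- order_ref is VARCHAR2; comparing it to a number makes Oracle apply
-- TO_NUMBER(order_ref) = 12345 under the covers, disabling a normal index on order_ref
SELECT * FROM orders WHERE order_ref = 12345;

-- Explicit and correct: the parameter has the right datatype, the column is untouched
SELECT * FROM orders WHERE order_ref = '12345';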

3 Execution Plans & Metrics – What we want, why we want it and how to get it
We have flown through some aspects of how to have a better chance of writing an optimal SQL statement.

3.1 How can we tell if it’s optimal?

Run it.

Run it twice to rule out the effect of uncontrollable factors like OS caching, SAN caching, etc.

Run it on representative data.

Run it on current volumes.

Run it on expected future volumes.

Then what?

In order to validate that our SQL statement is likely to have effective performance, what we want is the actual execution plan used and preferably the actual rowsource metrics.

3.2 How?

3.2.1 Serial (i.e. non-parallel) Execution Plans

In general, the following is usually a good approach across a variety of tools – SQL Developer, Toad and SQL*Plus for example:
alter session set statistics_level = all;

-- bind setup
var bind1 number
exec :bind1 := …;

-- run target sql statement
select ….

-- fetch execution plan and metrics
select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

Then run as a script.

Firstly, for getting actual execution metrics we can do one of two things prior to running the SQL statement concerned:
1. Add the /*+ gather_plan_statistics */ hint to the SQL or
2. In the same session, run alter session set statistics_level = all;

Then run the target SQL and immediately afterwards run this select:

select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

This is a convenient wrapper to get the execution plan and metrics from V$SQL_PLAN.

The first parameter is SQL_ID and by passing in NULL, we default to the previous SQL_ID run in this session.

The second parameter is CURSOR_CHILD_NO; with a NULL SQL_ID this can also be left as NULL to pick up the child cursor of the statement just run.
The third parameter is the FORMAT and the 'ALLSTATS LAST' format says to get all statistics for the last execution.

If this works, it should produce an output which is examined in more detail in section 4.

3.2.2 What if this doesn’t work?

If you find you don’t have the privileges to run these commands – you need access to V$SESSION, for example, to use DBMS_XPLAN.DISPLAY_CURSOR – then request them. There is no good reason for these privileges not to be granted.

Otherwise the approach above is effective 90% of the time.

For parallel execution plans, see section 3.2.3 below.

However, specifically in SQL Developer, there are some recursive operations run by the tool itself, which can mean that by the time our DBMS_XPLAN statement runs, the previous SQL ID in the session is no longer our target SQL statement.

There is one such example in SQL Developer 3 related to timestamp columns which affects the test script when running everything as a script (F5). In this case, there are two alternatives. Firstly, run the individual commands in SQL Developer as Run Statement (F9 / Ctrl + Enter). Alternatively, just comment out the timestamp columns in the SELECT part of the statement, for the purposes of this exercise.

Furthermore, in SQL Developer 4 there are further changes to recursive operations which seem to affect some statements.

In all such cases, if the output of the DBMS_XPLAN.DISPLAY_CURSOR is not the execution plan of the statement being profiled then the approach should be to identify the SQL statement in the shared pool (Look for matching SQL_TEXT in V$SQL) and plug the specific SQL_ID into the first argument of the DBMS_XPLAN call (no need to rerun the target SQL statement).
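
For example, a sketch of that lookup (the LIKE pattern is whatever uniquely identifies your statement):

select sql_id, child_number, sql_text
from   v$sql
where  sql_text like '%my distinctive query text%';

select * from table(dbms_xplan.display_cursor('&sql_id', null, 'ALLSTATS LAST'));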

3.2.3 Parallel Execution Plans

For parallel execution plans, the approach of using DBMS_XPLAN.DISPLAY_CURSOR with the format of ‘ALLSTATS LAST’ is not appropriate because it fetches the execution metrics from the last execution of the statement – which is the Query Coordinator (QC) and does not include the metrics of the parallel slaves.

A better approach for parallel execution plans is to use real time sql monitoring and the easiest way to do this is to run the following and capture the output report:
select dbms_sqltune.report_sql_monitor('<sql_id>') from dual;

This requires you to identify the SQL_ID of the target sql statement from V$SQL (seek matching text in the SQL_TEXT/SQL_FULLTEXT columns).

It may also require you to add the /*+ monitor */ hint to your SQL statement because, by default, monitoring only kicks in for executions which last longer than a default number of seconds (5) and for statements whose execution plans are shorter than a certain length.
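
A sketch of the whole sequence, from hint to report (the table name is hypothetical; DBMS_SQLTUNE.REPORT_SQL_MONITOR accepts a type of 'TEXT', 'HTML' or 'ACTIVE'):

select /*+ monitor parallel(t 4) */ t.col1, count(*)
from   big_table t
group by t.col1;

select dbms_sqltune.report_sql_monitor(sql_id => '&sql_id', type => 'TEXT')
from   dual;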

3.2.4 When all else fails

Fall back on SQL Trace.
Alter session set sql_trace = true;
SELECT….
Alter session set sql_trace = false;

This produces a trace file on the database server and the trace file name can be identified by:
select * from v$diag_info where name = 'Default Trace File';

This can be run through TKPROF to get the execution metrics but TKPROF can also lie about the execution plan so this should be double checked in V$SQL_PLAN or by using DBMS_XPLAN.
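
A sketch of the TKPROF invocation on the database server (the trace and output file names are hypothetical):

tkprof DB1_ora_12345.trc db1_sql_trace.txt sys=no sort=prsela,exeela,fchela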

In rare circumstances and if all the above alternatives are unavailable or impractical for some reason, only then might EXPLAIN PLAN or AUTOTRACE be acceptable.

For example, in any modern version of Oracle, you can do the following
explain plan for select…;
select * from table(dbms_xplan.display);

Now this is not useless but, for numerous reasons, EXPLAIN PLAN cannot be trusted and is not sufficient for our purposes.

AUTOTRACE also does not tell the truth (because it itself relies on EXPLAIN PLAN).
EXPLAIN PLAN is an estimate of what the execution plan for a SQL statement will be.

It doesn’t peek at binds.

It assumes all binds are VARCHAR2.

3.3 Why are we doing this?

We want this information documented as part of the change, attached to the Jira (or whatever tool is used for change management) and included in any code review.

The most effective mechanism for tuning SQL is “Tuning by Cardinality Feedback”: http://www.centrexcc.com/Tuning%20by%20Cardinality%20Feedback.pdf

This follows the principle that:
“if an access plan is not optimal it is because the cardinality estimate for one or more of the row sources is grossly incorrect”
and
“the cbo (cost-based optimizer) does an excellent job of finding the best access plan for a given sql provided it is able to accurately estimate the cardinalities of the row sources in the plan”

By gathering the actual execution plan and the actual execution metrics, we can show whether the optimizer was accurate in its estimations. If it was accurate then, from a developer perspective and for the purposes of most code reviews, there is a good likelihood that the SQL is good enough for the optimizer to do a good job with.

4 Interpretation of Execution Plans and Execution Metrics

If we’ve been lucky we should have the actual execution plan and the execution metrics.

4.1 What are we looking for? How do we interpret it?

Providing a thorough guide on how to interpret most variations of execution plans is beyond the scope of this guide, although we will provide a basic guide in Appendix A.

Essentially, what we want to see in the execution metrics is that the optimizer’s estimates are broadly accurate.

How accurate?

In general, we shouldn’t necessarily be overly concerned until we get to a factor of 10x or even more.

Estimated 100K rows, Actual 1M rows – probably not too bothered.

Estimate of 1 row, Actual 10000 rows – likely to be significant inefficiencies in either join order, join mechanism and/or access path.

And when we are looking at estimates vs actual, we need to consider the “Starts” so what we are looking for is that “Starts * E-rows” is in the right ballpark compared to “A-rows”. For more information, please see Appendix A.

Here are a couple of examples:
SQL_ID fst03j2p1czpb, child number 0
-------------------------------------
select * from t1 , t2 where t1.col1 = t2.col1

Plan hash value: 1838229974

-------------------------------------------------------------------------------------
| Id  | Operation          | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |      1 |        |  10000 |00:00:00.03 |    1172 |
|*  1 |  HASH JOIN         |      |      1 |  10000 |  10000 |00:00:00.03 |    1172 |
|   2 |   TABLE ACCESS FULL| T1   |      1 |  10000 |  10000 |00:00:00.01 |     576 |
|   3 |   TABLE ACCESS FULL| T2   |      1 |  10000 |  10000 |00:00:00.01 |     596 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."COL1"="T2"."COL1")

In the above, the estimates are accurate so there is a very good chance that this is a good plan.

Here’s another, this time not so good because the estimate of rows in T1 was 1 whereas the actual was 10000.

This led the optimizer to choose an index access path over a full table scan and a NESTED LOOP rather than a HASH JOIN.
SQL_ID 9quvuvkf8tzwj, child number 0
-------------------------------------
select /*+ cardinality(t1 1) */ * from t1 , t2 where t1.col1 =
t2.col1

Plan hash value: 931243032

---------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name          | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |      1 |        |  10000 |00:00:00.04 |   12640 |
|   1 |  NESTED LOOPS                 |               |      1 |        |  10000 |00:00:00.04 |   12640 |
|   2 |   NESTED LOOPS                |               |      1 |      1 |  10000 |00:00:00.02 |    2640 |
|   3 |    TABLE ACCESS FULL          | T1            |      1 |      1 |  10000 |00:00:00.01 |     596 |
|*  4 |    INDEX UNIQUE SCAN          | SYS_C00446778 |  10000 |      1 |  10000 |00:00:00.01 |    2044 |
|   5 |   TABLE ACCESS BY INDEX ROWID | T2            |  10000 |      1 |  10000 |00:00:00.02 |   10000 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."COL1"="T2"."COL1")

4.2 Do I need to worry about things like NESTED LOOPS vs HASH JOINS?

For the purposes of this exercise, no, but the more knowledge the better.

Accuracy of estimates should be sufficient.

The remainder of the information should be attached to the change tool for review.

5 Appendix A: Basic guide to reading an execution plan

Using the following execution plan from a two table join:
SQL_ID 9quvuvkf8tzwj, child number 0
-------------------------------------
select /*+ cardinality(t1 1) */ * from t1 , t2 where t1.col1 =
t2.col1

Plan hash value: 931243032

----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Starts | E-Rows | A-Rows |   A-Time   |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |      1 |        |  10000 |00:00:00.04 |
|   1 |  NESTED LOOPS                 |         |      1 |        |  10000 |00:00:00.04 |
|   2 |   NESTED LOOPS                |         |      1 |      1 |  10000 |00:00:00.02 |
|   3 |    TABLE ACCESS FULL          | T1      |      1 |      1 |  10000 |00:00:00.01 |
|*  4 |    INDEX UNIQUE SCAN          | SYS_C.. |  10000 |      1 |  10000 |00:00:00.01 |
|   5 |   TABLE ACCESS BY INDEX ROWID | T2      |  10000 |      1 |  10000 |00:00:00.02 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("T1"."COL1"="T2"."COL1")

There are four key elements:
• The SQL statement
• The SQL ID – a hash value of the sql statement, usually consistent between databases and even across versions
• The execution plan
• The predicate section – not to be overlooked. Can highlight issues with implicit functions and datatype conversions amongst other things

For the execution plan itself there are a number of elements to be concerned with:
• Optimizer – all modern versions of Oracle use the Cost-Based Optimizer (CBO). This uses statistics and cost calculations to choose a best-cost plan for execution.

• Cost – Cost is an estimated indicator of time which the optimizer uses to compare execution plan possibilities, usually choosing the lowest cost plan. However, to all intents and purposes, developers should ignore it.

• Cardinality – An estimate of the number of rows for a particular rowsource, for a particular join, etc. Exposed in the execution plan as E-Rows for estimate and A-Rows for actuals. When comparing E-Rows to A-Rows it is important to take Starts into account, i.e to compare “Starts * E-Rows” to A-Rows. The Nested loop operations for example will have multiple starts for the inner/probed rowsource.

• Parent:child operations – An execution plan is generally a succession of parent:child operations – follow and match the indentation. A join mechanism should have two children.

• Join mechanism – A join mechanism joins two rowsources. There are a variety of mechanisms but in general there are two main methods depending on the cardinalities:

o NESTED LOOP – Essentially a FOR LOOP – For each row in the outer/driving rowsource, probe the inner/probed rowsource. Generally used for low cardinality rowsources.

o HASH JOIN – Hash all the outer/driving rowsource based on the join key(s) then hash all the inner rowsource. Generally used for high cardinality rowsources. If the cardinality estimate is too low, the work area sizes used for hashing may be too small and spill to temp space on disk – slow/unscalable

• Join order – Depending on the cardinalities, the optimizer can choose to join T1 to T2 or T2 to T1. The number of permutations for join order is N! where N is the number of tables being joined. The optimizer will limit itself to a maximum number of permutations to evaluate.

• Access path – how the data is fetched from the table, i.e. by index via various different index access mechanisms or by tablescan, etc.

• Buffers – A measure of logical IO. See below.

5.1 What happens first in the execution plan?
----------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Starts | E-Rows | A-Rows |   A-Time   |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |      1 |        |  10000 |00:00:00.04 |
|   1 |  NESTED LOOPS                 |         |      1 |        |  10000 |00:00:00.04 |
|   2 |   NESTED LOOPS                |         |      1 |      1 |  10000 |00:00:00.02 |
|   3 |    TABLE ACCESS FULL          | T1      |      1 |      1 |  10000 |00:00:00.01 |
|*  4 |    INDEX UNIQUE SCAN          | SYS_C.. |  10000 |      1 |  10000 |00:00:00.01 |
|   5 |   TABLE ACCESS BY INDEX ROWID | T2      |  10000 |      1 |  10000 |00:00:00.02 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("T1"."COL1"="T2"."COL1")

There are a couple of exceptions but in general the execution plan starts at the first operation without a child.

So, following the indentation, the first operation without a child is:
| 3 | TABLE ACCESS FULL | T1 | 1 | 1 | 10000 |00:00:00.01 |

This is considered to be the outer/driving rowsource of the parent operation at:
| 2 | NESTED LOOPS | | 1 | 1 | 10000 |00:00:00.02 |

For each row in this rowsource, we probe the inner rowsource:
|* 4 | INDEX UNIQUE SCAN | SYS_C.. | 10000 | 1 | 10000 |00:00:00.01 |

Which is actually an index lookup on the primary key using the predicate:
4 - access("T1"."COL1"="T2"."COL1")

The data produced by this join is then used in the parent operation:
| 1 | NESTED LOOPS | | 1 | | 10000 |00:00:00.04 |

Which uses the rowids from the unique index/primary key for table T2 to get the actual table data from T2:
| 5 | TABLE ACCESS BY INDEX ROWID| T2 | 10000 | 1 | 10000 |00:00:00.02 |

5.2 Red flags?

• Row estimates of 1:
o The minimum row estimate is 1 and in some cases this actually means 0.
o If this is not a primary key access and there really isn’t 0/1, then are there any statistics for this object?
o Row estimates of 0/1 where the actual number of rows is significantly more than 1 can cause significant performance problems

• MERGE JOIN CARTESIAN + BUFFER SORT – particularly where the estimate is 1. Can be particularly detrimental if the actual rows are greater than 1. Rarely a good operation but can be symptomatic of a missing join.

• Implicit datatype conversions

• Nested loop operations where the inner/probed table/rowsource is a FULL segment scan.

• VIEW operations – symptomatic of a non-mergeable view which may or may not be a problem

• FILTER operations where the row-by-row operation is significant

5.3 Is there anything else to look out for?

Yes, that buffers column is a measure of logical IO.

When comparing different ways of doing things, when tuning SQL, one of the key measures that should be targeted is a reduction in logical IO.

If one approach uses significantly less logical IO compared to another approach then that is significant. The statement with the lower IO is likely to be better, is more likely to benefit from having more of the data it’s interested in cached and is less likely to impact other queries and the caching of other data.

There should probably be a rule of thumb about the ratio of logical IO to rows fetched. The difficulty is picking the right indicators.

If a query selects 100 rows from 100 million buffer gets and those all-important estimates are reasonably accurate, this should be a strong signal that perhaps the indexes are not optimal for that particular query.

As a rule of thumb, a ratio of a couple of consistent gets or less per row is damn good. 100,000s or millions may well be an indicator of significant inefficiencies.

But, as always, it depends.

It also significantly depends on whether the query itself is fast enough for the business requirement and whether it has the potential to impact other users of the database.

Furthermore, one lone query is unlikely to justify a new index but that is beyond the scope of this guide.

5.4 Further Reading

A 10053 Trace provides a very detailed walkthrough of the optimizer’s process of coming up with the execution plan. Not for the faint-hearted.
http://www.oracle.com/technetwork/database/bi-datawarehousing/twp-explain-the-explain-plan-052011-393674.pdf
Jonathan Lewis: Cost-based Oracle Fundamentals: http://www.apress.com/9781590596364

6 Appendix B Interacting with the database
The Oracle database is a very powerful piece of software.

It’s also likely to be one of the most expensive pieces of software in an application’s tech stack.

The keys to a performant Oracle database also differ significantly from other vendors.

How you best approach something in Sybase or SQL Server is not necessarily how you should approach it in Oracle.

One classic example is the use of temporary tables.

Developers should know how to get the best out of a particular database.

To treat it like a bit bucket or a “slow, dumb backup datastore” is to waste money and resources.

6.1 Vendor-specific Database features

Application developers should not be overly wary of using a feature particular to the Oracle database. Some tools can make it difficult to use vendor-specific features or optimizations but an investment in time and effort to do so can reap significant performance benefits.

Whilst this attitude might be relevant for third-party product developers who have to write software that can be installed on different database vendors, this is largely not true of enterprises writing internal software systems.

It is unlikely that the Oracle database on a particular system will be replaced by another vendor database.

It is far more likely that a Java component interacting with the database will eventually be replaced by a C# component or that the usage of the Oracle database will be deprecated in favour of caching and NOSQL technologies, so if you’re going to use SQL, use Oracle-specific features where they offer benefit.

6.2 Round-tripping

The default fetchsize for JDBC is 10 (and SQL*Plus’s equivalent, arraysize, defaults to 15). These defaults are almost never appropriate for general usage as many SQL statements can be expected to fetch significantly more than 10 rows, and therefore significant gains can be made by increasing this setting beyond the default.

The issue is not only about roundtrips across the network, it’s also related to the logical IO that a query needs to do. If you ask for just 10 rows, the database will do all the IO it needs to do to fetch the first ten rows. When you ask for the next 10 rows, the server process on the database might well have to do a logical read of some of the same blocks as the previous fetch which can lead to significant inefficiencies compared to a larger fetchsize.
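
For example, in SQL*Plus (the JDBC equivalent is java.sql.Statement.setFetchSize):

set arraysize 500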

6.3 Abstraction & Interfacing with the database

Abstraction is a principle that is put on a pedestal in the middle tier and yet often abandoned when interacting with the database.

Put simply, if SQL is embedded in Java code then this introduces unnecessary dependencies on the database model and limits the ability to make subtle changes to the SQL, or to the model, without changing the application server code and doing an app server release.

Views, procedures and packages can all provide an interface to the database and the data model.
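
For example, a minimal sketch of a view as such an interface (all names hypothetical). The application selects from the view; the join or the underlying model can then change without an app server release:

CREATE OR REPLACE VIEW customer_orders_v
AS
SELECT c.customer_id
,      c.customer_name
,      o.order_id
,      o.order_date
FROM   customers c
,      orders    o
WHERE  o.customer_id = c.customer_id;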

6.4 It’s all about the data.

Interacting with data appropriately, regardless of database vendor, is crucial.

Think in Sets.

Also consider the success of Engineered Systems like Oracle Exadata.

One of the things that Exadata focuses on, for example, is being able to eliminate redundant data as early as possible.

This means that the logic in the storage cells can eliminate the data before it even gets to the traditional database memory, before it goes anywhere near the network, long before it goes up to the application.

And it can do this with significant degrees of parallelism, usually with far more efficiency than similar processing in application threads.

Why is this relevant?

Eliminate early.

Let the database do the work it was designed to do.

Applications should let the database give them the smallest set of data that they need and should not bring excessive amounts of data into the middle tier for elimination and aggregation there.

Volumes of data are exploding. The best chances of scaling efficiently to deal with these volumes of data are to interact with the data appropriately.


Blocking Prepared XA Transaction

There was an oracle-l thread last month about blocking sessions which could not be identified.

I replied back about a very niche scenario which I’d seen a few times before. And it’s just happened again so I thought it would be useful to just post some evidence on this.

Here are a couple of articles posted previously on XA transactions:

First, a reminder, from the posts above, that a transaction doesn’t have to have a session.

For normal “vanilla” sessions, there’s a 1:1 relationship between session and transaction. But with an XA transaction, a session can attach and detach – though only one session can be attached to a transaction at any one time.

And a bit of context about these XA transactions.

This is a complex Java application with multiple resources participating in the XA transaction. There was HornetQ involved; HornetQ was apparently timing out and was bounced, the taking down probably being done with a kill -9.

An XA transaction does what is called a 2-phase commit, where the first phase involves the transaction co-ordinator instructing the participating resources to prepare to commit, and the second phase is then the commit itself.

What I have seen many times on this application is that something happens to the transaction co-ordinator in between the PREPARE and the COMMIT and we are left with an orphaned transaction.

This transaction will eventually time-out – depending on the properties set on the transaction by the co-ordinator – at which point it will become an in-doubt transaction and be visible in dba_2pc_pending.

Back to reality… there is nothing in DBA_2PC_PENDING:

select * from dba_2pc_pending;

no rows selected

But we can see there is a transaction in prepared state in V$TRANSACTION, a transaction which started yesterday afternoon:

select * from v$transaction where status = 'PREPARED';
ADDR                 XIDUSN    XIDSLOT     XIDSQN     UBAFIL     UBABLK     UBASQN     UBAREC STATUS           START_TIME           START_SCNB START_SCNW START_UEXT START_UBAFIL START_UBABLK START_UBASQN START_UBAREC SES_ADDR               FLAG SPACE RECURSIVE NOUNDO PTX NAME                                                                                                                                                                                                                                                             PRV_XIDUSN PRV_XIDSLT PRV_XIDSQN PTX_XIDUSN PTX_XIDSLT PTX_XIDSQN     DSCN-B     DSCN-W  USED_UBLK  USED_UREC     LOG_IO     PHY_IO     CR_GET  CR_CHANGE START_DATE            DSCN_BASE  DSCN_WRAP  START_SCN DEPENDENT_SCN XID              PRV_XID          PTX_XID        
---------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------------- -------------------- ---------- ---------- ---------- ------------ ------------ ------------ ------------ ---------------- ---------- ----- --------- ------ --- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- -------------------- ---------- ---------- ---------- ------------- ---------------- ---------------- ----------------
0000004A833D6868        691         32     315541         22    1085802     -28624         18 PREPARED         01/19/18 15:39:45    3454176189       2913         13           22      1085802       -28624            9 0000004B82E584A0    4201987 NO    NO        NO     NO                                                                                                                                                                                                                                                                            0          0          0          0          0          0          0          0          1         12     -40287       -163  -66597824      -1385 19-JAN-2018 15:39:45          0          0    1.3E+13             0 B302200095D00400 0000000000000000 0000000000000000 


And this is the script which I mentioned on the oracle-l thread, one I use to see what locks transactions are holding, when the transactions started and what sessions are attached:

select s.machine
,      lo.inst_id
,      lo.object_id
,      lo.session_id
,      lo.os_user_name
,      lo.process
,      ob.owner
,      ob.object_name
,      ob.subobject_name
,      tx.addr
,      tx.start_time txn_start_time
,      tx.status
,      tx.xid
,      s.*
from   gv$locked_object lo
,      dba_objects      ob
,      gv$transaction    tx
,      gv$session        s
where  ob.object_id = lo.object_id
and    tx.xidusn    (+) = lo.xidusn
and    tx.xidslot   (+) = lo.xidslot
and    tx.xidsqn    (+) = lo.xidsqn
and    s.taddr      (+) = tx.addr
order by txn_start_time, session_id, object_name;

For privacy reasons and as this is a real-world situation and not an isolated test case, I won’t share the output of the script.

But it shows that:

  • the transaction is holding mode 6 exclusive TX row locks on a number of objects
  • the transaction is in PREPARED state
  • there is no session attached to the transaction (although v$locked_object does tell us what the SID was when it was there)

Now, from the application perspective, HornetQ has been bounced and everything is back up and running; the message that resulted in our orphaned transaction is apparently being reattempted, and it is being blocked by the exclusive locks still held.

From an ACTIVE SESSION HISTORY perspective, this is what we see from this script (I’ve commented out some columns for brevity and privacy but left them in so you can see what I run normally):

select count(*) over (partition by h.sample_time) sess_cnt
--,      h.user_id
--,      (select username from dba_users u where u.user_id = h.user_id) u, h.service_hash
,      xid--, sample_id
, sample_time, session_state, session_id, session_serial#,--sql_id,
sql_exec_id, sql_exec_start, event, --p1,
mod(p1,16), blocking_session,blocking_session_serial#--, current_obj#
--,      (select object_name||' - '||subobject_name from dba_objects where object_id = current_obj#) obj
--,      (select sql_fulltext from v$sql s where s.sql_id = h.sql_id and rownum = 1) sqltxt
--,      (select sql_text from dba_hist_sqltext s where s.sql_id = h.sql_id and rownum = 1) sqltxt
--, h.*
from   v$active_session_history h
where event = 'enq: TX - row lock contention'
order by h.sample_id desc;
XID              SESSION_STATE SESSION_ID SESSION_SERIAL# EVENT                                                            MOD(P1,16) BLOCKING_SESSION BLOCKING_SESSION_SERIAL#
---------------- ------------- ---------- --------------- ---------------------------------------------------------------- ---------- ---------------- ------------------------
4F021700A3C00600 WAITING              232           53035 enq: TX - row lock contention                                             6
FC010B00196E0A00 WAITING              471            5205 enq: TX - row lock contention                                             6
670320004FA50300 WAITING             2652           11791 enq: TX - row lock contention                                             6
640204005BA40500 WAITING             4300           49665 enq: TX - row lock contention                                             6

So, you can see we have four sessions being blocked on exclusive mode 6 row level locks and that the blocking session is null. If I showed you the full history then you would see that these sessions have been repeatedly trying for many hours.

BTW, this is RAC but all these sessions are intentionally on the same instance so there’s none of that jiggery-pokery involved.

I imagine at some point there was an internal conversation in Oracle about whether to report blocking session or blocking transaction.

At this point, it’s just a normal lock held by a “normal” transaction which hasn’t committed yet. The application just keeps retrying the transaction; each attempt waits for 1 minute until it hits the default distributed transaction timeout:

ORA-02049: timeout: distributed transaction waiting for lock

which will be logged somewhere obscure in the application logs – and there’ll be a brief pause and then it starts all over again.

Anyway at this point what to do?

At this point, the transaction hasn’t timed out.

In a few hours, the transaction will time out and become an in-doubt transaction.

Once it does, the application will receive a message:

ORA-01591: lock held by in-doubt distributed transaction

At which time, it can be manually resolved.

Typically, this is what we usually do:

begin
  for txn in (select local_tran_id from dba_2pc_pending)
  loop
    -- if txn is not already forced rollback
    execute immediate 'rollback force '''||txn.local_tran_id||'''';
    commit;
    dbms_transaction.purge_lost_db_entry(txn.local_tran_id);
    commit;
  end loop;
end;
/

If we were going to intervene right now and resolve it, what could we do?

We could bounce the database. Then the application would become in-doubt and see above.

But bouncing the database is quite drastic.

A lot of the advice in the manual is about what to do once it is IN-DOUBT. Some of it might work now – not sure.

What I advise is that if we know we want to rollback or commit this transaction now, and we don’t want to (or can’t) wait for it to become IN-DOUBT, then we can programmatically attach to the transaction using DBMS_XA and do something with it.

First of all, we need some information about the transaction.
CAVEAT DBA!!! This isn’t exactly well documented but I have found that what we tend to need is to identify the transactions in V$GLOBAL_TRANSACTION which are in COLLECTING state:

select state
,      UTL_RAW.CAST_TO_BINARY_INTEGER (globalid)
,      UTL_RAW.CAST_TO_BINARY_INTEGER (branchid)
,      t.* 
from v$global_transaction t where state = 'COLLECTING';
STATE                                  UTL_RAW.CAST_TO_BINARY_INTEGER(GLOBALID) UTL_RAW.CAST_TO_BINARY_INTEGER(BRANCHID)   FORMATID GLOBALID                                                                                                                         BRANCHID                                                                                                                           BRANCHES   REFCOUNT PREPARECOUNT STATE                                       FLAGS COUPLING      
-------------------------------------- ---------------------------------------- ---------------------------------------- ---------- -------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------- ---------- ---------- ------------ -------------------------------------- ---------- ---------------
COLLECTING                                                                   49                                       45     131075 312D2D35363832376566363A393236643A35613562363664363A633738353036                                                                 2D35363832376566363A393236643A35613562363664363A633738353065                                                                              1          1            1 COLLECTING                                      1 TIGHTLY COUPLED 


Then we can replace the formatid, global id and branch id in the script below. Whether you require numbers or raw depends on the signature to DBMS_XA_XID – see documentation.

set serveroutput on
DECLARE
  l_xid    DBMS_XA_XID :=
             DBMS_XA_XID(131075,
                         '312D2D35363832376566363A393236643A35613562363664363A633738353036',
                         '2D35363832376566363A393236643A35613562363664363A633738353065');
  l_return PLS_INTEGER;
BEGIN
  l_return := SYS.dbms_xa.xa_rollback(xid => l_xid);
  dbms_output.put_line(l_return);
END;
/

This approach above comes with no guarantees.
But it has worked for me several times in the past.

Outer Join with OR and Lateral View Decorrelation

Use of ANSI SQL is a personal thing.

Historically I have not been a fan apart from where it makes things easier/possible.

This reticence was mainly due to optimizer bugs and limitations in the earlier days.

Recently I have been using it much more because I find that the developers I interact with prefer it / understand it better.

You might/should be aware that Oracle will rewrite ANSI SQL to an Oracle syntax representation, this transformation being listed in the optimizer trace file.

You might/should also be aware that Oracle outer join syntax does not allow OR or IN:

drop table t1;
drop table t2;

create table t1
as
select floor((rownum+1)/2) col1
,      case when mod(rownum,2) = 0 then 1 else 2 end col2
,      10 col3
from   dual
connect by rownum <= 20;

create table t2
as
select rownum col1
,      case when mod(rownum,2) = 0 then 2 else 1 end col3
from   dual
connect by rownum <= 10;

select *
from   t1
,      t2
where  t1.col1 = t2.col1 (+) 
and  ((t1.col2 = 1
and    t2.col3 (+) > t1.col3)
or    (t1.col2 = 2
and    t2.col3 (+) < t1.col3));

ORA-01719: outer join operator (+) not allowed in operand of OR or IN

ANSI SQL remedies this:

alter session set tracefile_identifier = 'domlg1';
alter session set events 'trace[rdbms.SQL_Optimizer.*]';
select *
from   t1
left join t2
on    t1.col1 = t2.col1
and ((t1.col2 = 1
and   t2.col3 > t1.col3)
or   (t1.col2 = 2
and   t2.col3 < t1.col3));

alter session set events 'trace off';

But it comes at a price.

Note the execution plan:

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    20 |  1300 |    42   (0)| 00:00:01 |
|   1 |  NESTED LOOPS OUTER |      |    20 |  1300 |    42   (0)| 00:00:01 |
|   2 |   TABLE ACCESS FULL | T1   |    20 |   780 |     2   (0)| 00:00:01 |
|   3 |   VIEW              |      |     1 |    26 |     2   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS FULL| T2   |     1 |    26 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("T1"."COL1"="T2"."COL1" AND ("T1"."COL2"=1 AND
              "T2"."COL3">"T1"."COL3" OR "T1"."COL2"=2 AND "T2"."COL3"<"T1"."COL3"))   

Now, maybe you will have better luck than me but no matter what I try I cannot change the NESTED LOOPS OUTER operation (I could rewrite it to do two joins rather than one and remove the OR).

So, if that lateral view involves some full table scans or other significant operations, they might be very expensive when executed repeatedly as the probed side of the NESTED LOOPS OUTER.

The reason is in the optimizer trace.

Query after View Removal
******* UNPARSED QUERY IS ********
SELECT "T1."COL1" "COL1", "T1."COL2" "COL2", "T1."COL3" "COL3", "VW_LAT_AE9E49E8"."ITEM_1_0" "COL1", "VW_LAT_AE9E49E8"."ITEM_2_1" "COL3" FROM "DOM"."T1" "T1", LATERAL( (SELECT "T2"."COL1" "ITEM_1_0", "T2"."COL3" "ITEM_2_1" FROM "DOM"."T2" "T2" WHERE "T1"."COL1"="T2"."COL1" AND ("T1"."COL2"=1 AND "T2"."COL3">"T1"."COL3" OR "T1"."COL2"=2 AND "T2"."COL3" < "T1"."COL3"))) (+) "VW_LAT_AE9E49E8"
DCL:Checking validity of lateral view decorrelation SEL$BCD4421C (#1)
DCL: Bypassed: view has non-well-formed predicate
DCL: Failed decorrelation validity for lateral view block SEL$BCD4421C (#1)

The OR prevents the decorrelation which seems to mean that we’re stuck with a NESTED LOOP for now.

Further Reading on ANSI:
Oracle Optimizer Blog
Jonathan Lewis on ANSI Outer
Jonathan Lewis on ANSI

INSERT ALL caveat

Why you might want to think twice about using INSERT ALL.

One of those things I knew and then forgot.

So, let’s say you’ve got three tables or a partitioned table or something like that.

Let’s use regional tables for simplicity.

drop table t1_r1;
drop table t1_r2;
drop table t1_r3;

create table t1_r1
(col1 varchar2(2) not null
,col2 number not null
,check( col1 in ('R1')));

create table t1_r2
(col1 varchar2(2) not null
,col2 number not null
,check( col1 in ('R2')));

create table t1_r3
(col1 varchar2(2) not null
,col2 number not null
,check( col1 in ('R3')));

insert into t1_r1 values ('R1',1);
insert into t1_r2 values ('R2',1);
insert into t1_r3 values ('R3',1);

commit;

And you want a routine that will insert into one of those tables depending on region.

And you’re a simple fellow, so you go with an IF statement:

create or replace procedure p1 (
  col1 in varchar2, 
  col2 in number
)
as
begin
  if col1 = 'R1'
  then
      insert into t1_r1 values(col1,col2);
  elsif col1 = 'R2'
  then
      insert into t1_r2 values(col1,col2);
  else 
      insert into t1_r3 values(col1,col2);
  end if;
end p1;
/

Procedure P1 compiled

And then in the same session you run this uncommitted:

exec p1('R1',2);

PL/SQL procedure successfully completed.

And then in another session you decide to truncate table T1_R3:

truncate table t1_r3;

Table T1_R3 truncated.

No problem.
None was expected.

However…

Let’s say that we decide to tidy up that procedure and get rid of some of the repetition by using an INSERT ALL statement.
I will use a standalone sql statement just to demonstrate a further minor aspect rather than using a procedure with a bound parameter.

insert all
when col1 = 'R1' then into t1_r1
when col1 = 'R2' then into t1_r2
when col1 = 'R3' then into t1_r3
select 'R1' col1,2 col2
from dual;

1 row inserted.

Let’s revisit the truncate:

truncate table t1_r3;

SQL Error: ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
00054. 00000 -  "resource busy and acquire with NOWAIT specified or timeout expired"
*Cause:    Interested resource is busy.
*Action:   Retry if necessary or increase timeout.

TM share locks from the INSERT ALL on all three possible targets prevent the TRUNCATE.

So, a simple/simplistic illustration of why you might want to think twice about whether INSERT ALL is the best feature for your use case, based on a real life problem.

Side-effect of ALTER SESSION SET CURRENT_SCHEMA

For most of our database set-ups we use a different TEMP space for application users than for end-user/support/developer/reporting usage.
The intention is to minimise the potential impact of a rogue ad-hoc query on the main application.

However, it turns out this is ineffective IF you use:

ALTER SESSION SET CURRENT_SCHEMA = main_app_schema;

This is documented but I wasn’t previously aware.
No surprise that this learning opportunity was presented as a direct result of a rogue query on a prod DB blowing out the main TEMP space.

Deduction quickly suggested that the above must be the case, and it was quickly proven by a test case and also supported by the documentation:

“the database uses the temporary tablespace of the specified schema for sorts, joins, and storage of temporary database objects”

Obviously… not using ALTER SESSION and using fully qualified object names prevents this “vulnerability”… but that’s not good enough unfortunately.
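
A quick way to verify this sort of thing, sketched here using V$SORT_USAGE (aka V$TEMPSEG_USAGE) joined to V$SESSION to show which TEMP tablespace each session is actually consuming:

select s.sid
,      s.username
,      u.tablespace
,      u.blocks
from   v$session    s
,      v$sort_usage u
where  u.session_addr = s.saddr
order by u.blocks desc;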

Did it really fix it 2: Plan flip, unprofileable, baseline OFE round 2 and “stability”

Snappy title, huh?

Aka: Why a sql plan baseline may be no guarantee of stability.

The other day, a problematic plan flip…

Tangent 1:
Cue much discussion about plan flips etc.
My thoughts on stability are that the priority for most application owners is stability and predictability, but that does not tally with the default CBO behaviour, and potentially you have to turn off a helluva lot to even get close.

Tangent 2:
I have commented before about the common sensitivity of many databases to concurrent direct path reads (or rather the sensitivity is on the underlying SAN).

Here is one such illustration caused by this particular plan flip.

See that our single block read times go out from a “normal” range of 5 – 20ms to an average of 50+ ms. At this point anything doing physical IO starts to notice.
(Needless to say, I’m not a fan of these “normal” IO times – I think they’re awful but the SLA of the tier 2 (non flash) SAN storage is 20ms so we don’t alert until we breach that.)
[chart: Average Synchronous Single-Block Read Latency, rising from the 5-20ms range to 50+ ms]

Source:

select snap_id, begin_time, end_time, round(average,2)
from   dba_hist_sysmetric_summary
where  metric_name     = 'Average Synchronous Single-Block Read Latency'
order by snap_id;

Back to reality
Anyway, standard immediate-term fix for a plan flip is to hunt down a better plan from AWR and fix it with a profile or a baseline.

Not a problem to find one for this SQL as it nearly always executed with the same decent plan.
(AWR history is much longer than this but it gives the picture)

select to_char(trunc(sn.end_interval_time),'DD-MON-YYYY') dt
,      st.sql_id
,      st.plan_hash_value
,      sum(rows_processed_delta) rws
,      sum(executions_delta)     execs
,      round(sum(elapsed_time_delta)/1000/1000)   elp
,      round(sum(elapsed_time_delta)/1000/1000/nvl(nullif(sum(executions_delta),0),1),2)   elpe
,      round(sum(iowait_delta)/1000/1000)         io
,      round(sum(cpu_time_delta)/1000/1000)       cpu
,      sum(buffer_gets_delta)    gets
,      round(sum(disk_reads_delta))         disk_reads
from   dba_hist_snapshot sn
,      dba_hist_sqlstat  st
where  st.snap_id            = sn.snap_id
and    sn.instance_number = st.instance_number
and    st.sql_id             IN ('6xrx006fmqbq2')
group by trunc(sn.end_interval_time), st.sql_id, st.plan_hash_value
order by trunc(sn.end_interval_time) desc, elp desc; 
DT          SQL_ID        PLAN_HASH_VALUE        RWS      EXECS        ELP       ELPE         IO        CPU       GETS DISK_READS
----------- ------------- --------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
10-NOV-2015 6xrx006fmqbq2      3871506300          5       1815     236211     130.14     226340       9993  757853211  745588219 
09-NOV-2015 6xrx006fmqbq2      3871506300          1       1232     156412     126.96     149660       6846  507571690  499305166 
09-NOV-2015 6xrx006fmqbq2      3803735407         53       6835        743        .11          3        723  190488567         41 
07-NOV-2015 6xrx006fmqbq2      3803735407         31       3079        230        .07          3        218   37385043         56 
06-NOV-2015 6xrx006fmqbq2      3803735407        166      18931       1200        .06         11       1128  180142678        484 
05-NOV-2015 6xrx006fmqbq2      3803735407        305       9904        553        .06          5        508   75239139         93 
04-NOV-2015 6xrx006fmqbq2      3803735407        160      18900       1089        .06          9       1027  150523728        204 
03-NOV-2015 6xrx006fmqbq2      3803735407        226      12048        859        .07         58        794  154111239        424 
02-NOV-2015 6xrx006fmqbq2      3803735407       1081      13327       1276         .1         66       1200  278129234        465 

Problem plan is obvious, right?

Started during the day on the 9th, continued overnight until eventually noticed.

PHV 3871506300 does a FTS and gets direct path reads which causes our IO stress.

So, as mentioned, no problem, hint it with a sql profile via COE_XFR_SQL_PROFILE.sql (see Metalink note 215187.1)

But it didn’t work.

Not properly.

See top line.

DT          SQL_ID        PLAN_HASH_VALUE        RWS      EXECS        ELP       ELPE         IO        CPU       GETS DISK_READS
----------- ------------- --------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
10-NOV-2015 6xrx006fmqbq2      3353364157         20       4485     114084      25.44          4     111246 2907305774        216
10-NOV-2015 6xrx006fmqbq2      3871506300          5       1815     236211     130.14     226340       9993  757853211  745588219 
09-NOV-2015 6xrx006fmqbq2      3871506300          1       1232     156412     126.96     149660       6846  507571690  499305166 
09-NOV-2015 6xrx006fmqbq2      3803735407         53       6835        743        .11          3        723  190488567         41 
07-NOV-2015 6xrx006fmqbq2      3803735407         31       3079        230        .07          3        218   37385043         56 
06-NOV-2015 6xrx006fmqbq2      3803735407        166      18931       1200        .06         11       1128  180142678        484 
05-NOV-2015 6xrx006fmqbq2      3803735407        305       9904        553        .06          5        508   75239139         93 
04-NOV-2015 6xrx006fmqbq2      3803735407        160      18900       1089        .06          9       1027  150523728        204 
03-NOV-2015 6xrx006fmqbq2      3803735407        226      12048        859        .07         58        794  154111239        424 
02-NOV-2015 6xrx006fmqbq2      3803735407       1081      13327       1276         .1         66       1200  278129234        465 

We didn’t get “good” plan PHV 3803735407, we avoided “nasty” plan PHV 3871506300 but we got not-quite-so-nasty-but-still-not-nice PHV 3353364157.

The SQL Profile was not able to reproduce the desired PHV.

But at least this PHV was not doing the problematic direct path reads!

I took the hints from PHV 3803735407 via

select * from table(dbms_xplan.display_awr('6xrx006fmqbq2',3803735407,format=>'+OUTLINE'));

I double checked they matched the hints in the SQL Profile – they did.

I tried using the hints manually in a SQL statement and the result was consistent with the SQL Profile – PHV 3353364157.

At this point, the DBA suggested using a SQL Plan Baseline.

I resisted but eventually gave in.

Why would it make any difference?

When a plan cannot be reproduced then there are good reasons why a baseline is much, much safer than a profile.

A baseline must reproduce the desired PHV or it will be rejected completely.

A profile has no knowledge of PHV so it is possible that under certain circumstances the hints in a profile may be only partially applied.

Profile was dropped and baseline was created.

This was the immediate aftermath (top line)… Success!

And V$SQL was showing that the baseline was being used.

DT          SQL_ID        PLAN_HASH_VALUE        RWS      EXECS        ELP       ELPE         IO        CPU       GETS DISK_READS
----------- ------------- --------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
10-NOV-2015 6xrx006fmqbq2      3803735407          4       1866        120        .06          1        116   17401768         30 
10-NOV-2015 6xrx006fmqbq2      3353364157         20       4485     114084      25.44          4     111246 2907305774        216
10-NOV-2015 6xrx006fmqbq2      3871506300          5       1815     236211     130.14     226340       9993  757853211  745588219 
09-NOV-2015 6xrx006fmqbq2      3871506300          1       1232     156412     126.96     149660       6846  507571690  499305166 
09-NOV-2015 6xrx006fmqbq2      3803735407         53       6835        743        .11          3        723  190488567         41 
07-NOV-2015 6xrx006fmqbq2      3803735407         31       3079        230        .07          3        218   37385043         56 
06-NOV-2015 6xrx006fmqbq2      3803735407        166      18931       1200        .06         11       1128  180142678        484 
05-NOV-2015 6xrx006fmqbq2      3803735407        305       9904        553        .06          5        508   75239139         93 
04-NOV-2015 6xrx006fmqbq2      3803735407        160      18900       1089        .06          9       1027  150523728        204 
03-NOV-2015 6xrx006fmqbq2      3803735407        226      12048        859        .07         58        794  154111239        424 
02-NOV-2015 6xrx006fmqbq2      3803735407       1081      13327       1276         .1         66       1200  278129234        465 

I maintained that this didn’t make sense but the ticket was closed so I couldn’t get an SPM trace.

At this point, a copy of the statement without the profile, without the baseline and without the manual full set of hints was returning the original PHV 3803735407.

I went away that night and on the train I had a thought: “what if it wasn’t round 1 of the baseline reproduction which worked, but round 2?”

A reminder of the baseline reproduction mechanism: in round 1, the optimizer tries to reproduce the accepted plan by applying the full set of hints stored with the baseline; if that fails, round 2 tries again applying OFE only.

It doesn’t matter whether it is round 1 or round 2 – if the plan is reproduced, it is reproduced.

But if it was only being reproduced under OFE alone, then that was no guarantee of stability.

Then the following night, on a late-night call, the baseline was no longer reproducing PHV 3803735407 but was back to the big problem, PHV 3871506300!

Although I didn’t have an SPM trace to prove it, I’m convinced this bears out my theory that only round 2 had been reproducing the plan.

DT          SQL_ID        PLAN_HASH_VALUE        RWS      EXECS        ELP       ELPE         IO        CPU       GETS DISK_READS
----------- ------------- --------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
11-NOV-2015 6xrx006fmqbq2      3871506300          0        471      80513     170.94      77936       2543  193096137  188992511 
11-NOV-2015 6xrx006fmqbq2      3803735407        212       9933        690        .07         14        586   97253038        154 
10-NOV-2015 6xrx006fmqbq2      3803735407          4       1866        120        .06          1        116   17401768         30 
10-NOV-2015 6xrx006fmqbq2      3353364157         20       4485     114084      25.44          4     111246 2907305774        216
10-NOV-2015 6xrx006fmqbq2      3871506300          5       1815     236211     130.14     226340       9993  757853211  745588219 
09-NOV-2015 6xrx006fmqbq2      3871506300          1       1232     156412     126.96     149660       6846  507571690  499305166 
09-NOV-2015 6xrx006fmqbq2      3803735407         53       6835        743        .11          3        723  190488567         41 
07-NOV-2015 6xrx006fmqbq2      3803735407         31       3079        230        .07          3        218   37385043         56 
06-NOV-2015 6xrx006fmqbq2      3803735407        166      18931       1200        .06         11       1128  180142678        484 
05-NOV-2015 6xrx006fmqbq2      3803735407        305       9904        553        .06          5        508   75239139         93 
04-NOV-2015 6xrx006fmqbq2      3803735407        160      18900       1089        .06          9       1027  150523728        204 
03-NOV-2015 6xrx006fmqbq2      3803735407        226      12048        859        .07         58        794  154111239        424 
02-NOV-2015 6xrx006fmqbq2      3803735407       1081      13327       1276         .1         66       1200  278129234        465 

Note that at the times of change no database structures were being altered.
Stats will have been gathered, sure – no doubt they were the agent of change.
But no index maintenance was happening, nothing was going unusable, no partition maintenance was happening.
Nothing was changing which should have made a particular plan impossible.

This suggests that the set of hints for this particular plan was not deterministic – or rather that the hintset was incomplete in some way, leaving the optimizer free to apply certain transformations under which the hintset was then invalid. Not 100% sure; still to be looked at in more detail…

At this point we dropped the baseline and went with a cut-down SQL Profile which applied only two hints – politely requesting index scans in place of the two problematic FTSs in PHV 3871506300 and PHV 3353364157.

And this worked, and has been OK since (so far), but it will not be the longer-term solution.

DT          SQL_ID        PLAN_HASH_VALUE        RWS      EXECS        ELP       ELPE         IO        CPU       GETS DISK_READS
----------- ------------- --------------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
16-NOV-2015 6xrx006fmqbq2      3803735407         80       6887        348        .05         11        310   44574494         86 
13-NOV-2015 6xrx006fmqbq2      3803735407        114      13054        842        .06         19        713  117260543        482 
12-NOV-2015 6xrx006fmqbq2      3803735407        585      28074       1854        .07         12       1823  321890748       1983 
11-NOV-2015 6xrx006fmqbq2      3871506300          0        471      80513     170.94      77936       2543  193096137  188992511 
11-NOV-2015 6xrx006fmqbq2      3803735407        212       9933        690        .07         14        586   97253038        154 
10-NOV-2015 6xrx006fmqbq2      3871506300          5       1815     236211     130.14     226340       9993  757853211  745588219 
10-NOV-2015 6xrx006fmqbq2      3353364157         20       4485     114084      25.44          4     111246 2907305774        216 
10-NOV-2015 6xrx006fmqbq2      3803735407          4       1866        120        .06          1        116   17401768         30 
09-NOV-2015 6xrx006fmqbq2      3871506300          1       1232     156412     126.96     149660       6846  507571690  499305166 
09-NOV-2015 6xrx006fmqbq2      3803735407         53       6835        743        .11          3        723  190488567         41 
07-NOV-2015 6xrx006fmqbq2      3803735407         31       3079        230        .07          3        218   37385043         56 
06-NOV-2015 6xrx006fmqbq2      3803735407        166      18931       1200        .06         11       1128  180142678        484 
05-NOV-2015 6xrx006fmqbq2      3803735407        305       9904        553        .06          5        508   75239139         93 
04-NOV-2015 6xrx006fmqbq2      3803735407        160      18900       1089        .06          9       1027  150523728        204 
03-NOV-2015 6xrx006fmqbq2      3803735407        226      12048        859        .07         58        794  154111239        424 
02-NOV-2015 6xrx006fmqbq2      3803735407       1081      13327       1276         .1         66       1200  278129234        465
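For reference, a cut-down profile like this can be created manually with DBMS_SQLTUNE.IMPORT_SQL_PROFILE – a minimal sketch, where the query blocks, aliases and index names are purely illustrative (the real values would come from the outline of the desired plan):

DECLARE
  l_sql CLOB;
BEGIN
  -- grab the statement text (assumes it is still in the shared pool)
  SELECT sql_fulltext INTO l_sql
  FROM   v$sqlarea
  WHERE  sql_id = '6xrx006fmqbq2';

  -- create a profile carrying just the two index hints
  DBMS_SQLTUNE.IMPORT_SQL_PROFILE(
    sql_text    => l_sql,
    profile     => sys.sqlprof_attr('INDEX(@"SEL$1" "A"@"SEL$1" "A_IDX1")',
                                    'INDEX(@"SEL$2" "B"@"SEL$2" "B_IDX1")'),
    name        => 'PROFILE_6xrx006fmqbq2',
    force_match => FALSE);
END;
/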

I have never before seen an execution plan which had quite this problem interacting with SPM.

A test case to get to the root of the problem is still on the TODO list.

Something to do with a statement involving a WITH subquery and NO_MERGE, e.g.

WITH x AS (SELECT /*+ no_merge */ ...
           FROM   ...)
SELECT ...
FROM   ...
WHERE  EXISTS (SELECT ...
               FROM   x
               WHERE  ...);

Did it really fix it 1: OFE & nasty FIRST_ROWS plan on 11.2.0.4

Plan degradations on upgrade are normal.

This one’s no different.

On further investigation, it turned out that the application was setting optimizer_mode = first_rows somewhere.

The first point about this is that FIRST_ROWS really shouldn’t be used, either as a hint or as an optimizer_mode.

What does FIRST_ROWS mean?
From 11g doco:

The optimizer uses a mix of costs and heuristics to find a best plan for fast delivery of the first few rows.

If any sort of FIRST_ROWS-type intervention is required at all, then it might be better to use a more modern, fully-costed FIRST_ROWS_n (where n = 1 | 10 | 100 | 1000).
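E.g. either of the below (table and column names illustrative):

-- fully-costed, at session level
alter session set optimizer_mode = first_rows_10;

-- or per statement
SELECT /*+ first_rows(10) */ *
FROM   some_table
ORDER BY some_col;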

The 12c documentation is clearer about this:

FIRST_ROWS is available for backward compatibility and plan stability; use FIRST_ROWS_n instead.

But I don’t think we should read too much into the fact that this statement is absent from the 11g documentation.

Here’s the problem plan.

Pre-upgrade, a snippet of the plan looked as per below.
Please note the position of T1.

-------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |                             |    34 | 13668 |  3003   (1)| 00:00:22 |       |       |
|   1 |  SORT ORDER BY                              |                             |    34 | 13668 |  3003   (1)| 00:00:22 |       |       |
|   2 |   NESTED LOOPS OUTER                        |                             |    34 | 13668 |  3002   (1)| 00:00:22 |       |       |
|   3 |    NESTED LOOPS OUTER                       |                             |    34 | 12648 |  2968   (1)| 00:00:22 |       |       |
|   4 |     NESTED LOOPS OUTER                      |                             |    34 | 12274 |  2935   (1)| 00:00:22 |       |       |
|   5 |      NESTED LOOPS OUTER                     |                             |    34 | 11900 |  2867   (1)| 00:00:21 |       |       |
|   6 |       NESTED LOOPS OUTER                    |                             |    34 | 11628 |  2833   (1)| 00:00:21 |       |       |
|   7 |        NESTED LOOPS OUTER                   |                             |    34 | 10404 |  2799   (1)| 00:00:21 |       |       |
|   8 |         NESTED LOOPS OUTER                  |                             |    34 |  8228 |  2765   (1)| 00:00:20 |       |       |
|   9 |          NESTED LOOPS                       |                             |    34 |  6732 |  2731   (1)| 00:00:20 |       |       |
|  10 |           NESTED LOOPS                      |                             |    69 | 12282 |  2524   (1)| 00:00:19 |       |       |
|  11 |            NESTED LOOPS                     |                             |    69 |  9591 |  2386   (1)| 00:00:18 |       |       |
|* 12 |             TABLE ACCESS FULL               | T1                          |  1658 |   213K|   725   (3)| 00:00:06 |       |       |
|* 13 |             TABLE ACCESS BY INDEX ROWID     | T2                          |     1 |     7 |     1   (0)| 00:00:01 |       |       |
|* 14 |              INDEX UNIQUE SCAN              | T2_U1                       |     1 |       |     0   (0)| 00:00:01 |       |       |
|  15 |            TABLE ACCESS BY INDEX ROWID      | T3                          |     1 |    39 |     2   (0)| 00:00:01 |       |       |
|* 16 |             INDEX UNIQUE SCAN               | T3_U1                       |     1 |       |     1   (0)| 00:00:01 |       |       |
|  17 |           TABLE ACCESS BY GLOBAL INDEX ROWID| T4                          |     1 |    20 |     3   (0)| 00:00:01 | ROWID | ROWID |
|* 18 |            INDEX RANGE SCAN                 | T4_N1                       |     1 |       |     2   (0)| 00:00:01 |       |       |

Post-upgrade, the plan had degraded to:

--------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name                        | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                             |    69 | 27738 |   508M  (3)|999:59:59 |       |       |
|   1 |  NESTED LOOPS OUTER                          |                             |    69 | 27738 |   508M  (3)|999:59:59 |       |       |
|   2 |   NESTED LOOPS OUTER                         |                             |    69 | 26979 |   508M  (3)|999:59:59 |       |       |
|   3 |    NESTED LOOPS OUTER                        |                             |    69 | 26220 |   508M  (3)|999:59:59 |       |       |
|   4 |     NESTED LOOPS OUTER                       |                             |    69 | 23184 |   508M  (3)|999:59:59 |       |       |
|   5 |      NESTED LOOPS OUTER                      |                             |    69 | 22632 |   508M  (3)|999:59:59 |       |       |
|   6 |       NESTED LOOPS OUTER                     |                             |    69 | 20562 |   508M  (3)|999:59:59 |       |       |
|   7 |        NESTED LOOPS OUTER                    |                             |    69 | 16146 |   508M  (3)|999:59:59 |       |       |
|   8 |         NESTED LOOPS                         |                             |    69 | 13662 |   508M  (3)|999:59:59 |       |       |
|   9 |          NESTED LOOPS                        |                             |  1675 |   312K|   508M  (3)|999:59:59 |       |       |
|  10 |           NESTED LOOPS                       |                             |   703K|    39M|  3476K  (1)| 06:54:55 |       |       |
|* 11 |            TABLE ACCESS BY GLOBAL INDEX ROWID| T4                          |   703K|    13M|  2068K  (1)| 04:06:54 | ROWID | ROWID |
|  12 |             INDEX FULL SCAN                  | T4_N2                       |  2240K|       |  7943   (1)| 00:00:57 |       |       |
|  13 |            TABLE ACCESS BY INDEX ROWID       | T3                          |     1 |    39 |     2   (0)| 00:00:01 |       |       |
|* 14 |             INDEX UNIQUE SCAN                | T3_U1                       |     1 |       |     1   (0)| 00:00:01 |       |       |
|* 15 |           TABLE ACCESS FULL                  | T1                          |     1 |   132 |   718   (3)| 00:00:06 |       |       |

In the post-upgrade plan, the most pertinent part is step 15: a FULL TABLE SCAN as the inner rowsource of a NESTED LOOP – in other words, for every row in the driving rowsource, do a full table scan.

The standard reaction to such a degradation is to test the SQL with OPTIMIZER_FEATURES_ENABLE (OFE).

With OFE set back to the pre-upgrade version – whether via hint or via ALTER SESSION – the SQL reverted to the pre-upgrade plan.
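I.e. something like the below, where '11.2.0.3' stands in for whatever the pre-upgrade version actually was:

-- at session level
alter session set optimizer_features_enable = '11.2.0.3';

-- or per statement, via the dedicated hint
SELECT /*+ optimizer_features_enable('11.2.0.3') */ *
FROM   t1;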

This is good!

But… how much confidence can we have that this actually fixed the problem?

It would be great if we could find a bug which was definitively responsible for this change in execution plan.

OFE is an umbrella but it doesn’t turn off ALL fixes/optimizer features in a release.

Many of those fixes covered by the OFE umbrella are represented in V$SYSTEM_FIX_CONTROL.

So I went looking for anything which sounded related, went back to the default 11.2.0.4 OFE, and started turning off the relevant BUGNOs one by one – no difference, still a bad plan.

Then, still at OFE 11.2.0.4, I turned off all the BUGNOs… no difference, still a bad plan.
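For reference, the mechanics are along these lines (the bug number below is illustrative only):

-- which optimizer fixes are attributed to 11.2.0.4?
SELECT bugno, value, description
FROM   v$system_fix_control
WHERE  optimizer_feature_enable = '11.2.0.4';

-- switch an individual fix off at session level
alter session set "_fix_control" = '14826303:OFF';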

At this point, my confidence in OFE as a fix for this problem was starting to waver.

So I tried to reduce the problem SQL to its smallest form which still reproduced the problem.

alter session set optimizer_mode = first_rows;
alter session set optimizer_features_enable = '11.2.0.4';

explain plan for
SELECT *
FROM   T1
,      T3
,      T4
WHERE  T1.COL1     = T3.COL1
AND    T3.COL2     = T4.COL2
AND    T1.DATE_COL < ...;

select * from table(dbms_xplan.display(format=>'+OUTLINE'));

This distilled statement reproduced the issue at the heart of the original statement:

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |         |  6029 |  4533K|  1612M  (3)|999:59:59 |       |       |
|   1 |  NESTED LOOPS                        |         |  6029 |  4533K|  1612M  (3)|999:59:59 |       |       |
|   2 |   NESTED LOOPS                       |         |  2237K|   795M|  6551K  (1)| 13:01:53 |       |       |
|   3 |    TABLE ACCESS BY GLOBAL INDEX ROWID| T4      |  2240K|   188M|  2068K  (1)| 04:06:53 | ROWID | ROWID |
|   4 |     INDEX FULL SCAN                  | T4_N2   |  2240K|       |  7943   (1)| 00:00:57 |       |       |
|   5 |    TABLE ACCESS BY INDEX ROWID       | T3      |     1 |   285 |     2   (0)| 00:00:01 |       |       |
|*  6 |     INDEX UNIQUE SCAN                | T3_U1   |     1 |       |     1   (0)| 00:00:01 |       |       |
|*  7 |   TABLE ACCESS FULL                  | T1      |     1 |   397 |   718   (3)| 00:00:06 |       |       |
----------------------------------------------------------------------------------------------------------------

Is this test case a 100% indication that OFE wasn’t applicable to the original problem?
Not sure.
But at this point I had little confidence that OFE provided any sort of reliable fix for the original SQL statement.

I don’t know what the internal heuristics are/were for FIRST_ROWS.

But that is a stupid place for a full table scan, regardless of the number of rows expected to be returned.

If we remove one of the T1 predicates above, so that the various row estimates go up, then the FTS moves to a driving position.

I could/should get a 10053 trace, but FIRST_ROWS shouldn’t be in use in the first place.

Solution – hunt down where the app code is doing the FIRST_ROWS thing and exterminate it.
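V$SQL_OPTIMIZER_ENV can help with the hunt – e.g. which cached cursors were optimized under first_rows:

SELECT sql_id, child_number, value
FROM   v$sql_optimizer_env
WHERE  name  = 'optimizer_mode'
AND    value = 'first_rows';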

All the other possible OPTIMIZER_MODE settings gave a plan for the original SQL which started like this:

--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                            |      |    35 | 14105 |  1386   (2)| 00:00:10 |       |       |
|   1 |  SORT ORDER BY                              |      |    35 | 14105 |  1386   (2)| 00:00:10 |       |       |
|   2 |   NESTED LOOPS OUTER                        |      |    35 | 14105 |  1385   (2)| 00:00:10 |       |       |
|*  3 |    HASH JOIN OUTER                          |      |    35 | 12565 |  1350   (2)| 00:00:10 |       |       |
|*  4 |     HASH JOIN OUTER                         |      |    35 | 11305 |  1315   (2)| 00:00:10 |       |       |
|   5 |      NESTED LOOPS OUTER                     |      |    35 |  9065 |  1280   (2)| 00:00:10 |       |       |
|   6 |       NESTED LOOPS OUTER                    |      |    35 |  8015 |  1245   (2)| 00:00:09 |       |       |
|   7 |        NESTED LOOPS OUTER                   |      |    35 |  7735 |  1210   (2)| 00:00:09 |       |       |
|   8 |         NESTED LOOPS OUTER                  |      |    35 |  7350 |  1176   (2)| 00:00:09 |       |       |
|   9 |          NESTED LOOPS                       |      |    35 |  6965 |  1106   (2)| 00:00:08 |       |       |
|  10 |           NESTED LOOPS                      |      |    69 | 12351 |   899   (3)| 00:00:07 |       |       |
|* 11 |            HASH JOIN                        |      |    69 |  9660 |   761   (3)| 00:00:06 |       |       |
|* 12 |             TABLE ACCESS FULL               | T2   |   237 |  1659 |    36   (6)| 00:00:01 |       |       |
|* 13 |             TABLE ACCESS FULL               | T1   |  1671 |   217K|   725   (3)| 00:00:06 |       |       |

So… when optimizer upgrade problems strike, you should always try OFE.

But it’s also worth further investigation to confirm that OFE really is a reliable fix for the problem.