Scalar Subselect Costing

This issue is an oldie but deserving of a quick post to stop me going off on a tangent in another post.

It is an oddity of scalar subselects/subqueries that their cost is not taken into account in the top level cost of a query.

In older versions of Oracle, it used to be the case that you didn’t even see the scalar subquery in the execution plan.

However, even in the latest versions, the cost still isn’t accounted for.

Always something to keep in mind.

For example:

SQL> create table t1
  2  (col1 number not null);

Table created.

SQL> 
SQL> insert into t1
  2  select rownum
  3  from   dual
  4  connect by rownum <= 10000;

10000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 
SQL> create table t2
  2  (col1 number not null primary key);

Table created.

SQL> 
SQL> 
SQL> insert into t2
  2  select rownum
  3  from   dual
  4  connect by rownum <= 10000;

10000 rows created.

SQL> 
SQL> commit;

Commit complete.

SQL> 

Let’s do a scalar subselect to do an index lookup on t2 for every row in t1:

SQL> explain plan for
  2  select t1.col1
  3  ,      (select t2.col1 from t2 where t2.col1 = t1.col1)
  4  from   t1;

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------
Plan hash value: 2339000913

----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              | 10000 |   126K|     8   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| SYS_C0078310 |     1 |    13 |     1   (0)| 00:00:01 |
|   2 |  TABLE ACCESS FULL| T1           | 10000 |   126K|     8   (0)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T2"."COL1"=:B1)

You can see that the cost of the scalar subquery is 1 per execution and it’s not accounted for at the top level.

Let’s force a full table scan of the row-by-row lookup:

SQL> explain plan for
  2  select t1.col1
  3  ,      (select /*+ full(t2) */ t2.col1 from t2 where t2.col1 = t1.col1)
  4  from   t1;

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
Plan hash value: 637946564

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 10000 |   126K|     8   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T2   |     1 |    13 |     2   (0)| 00:00:01 |
|   2 |  TABLE ACCESS FULL| T1   | 10000 |   126K|     8   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T2"."COL1"=:B1)

Obviously a much more expensive operation but, again, not properly accounted for in the overall costing.

Wouldn’t it be preferable that as the optimizer has estimated the number of rows in the top level select:

|   0 | SELECT STATEMENT  |      | 10000 |   126K|     8   (0)| 00:00:01 |

and it has estimated the cost per execution of the scalar subselect:

|*  1 |  TABLE ACCESS FULL| T2   |     1 |    13 |     2   (0)| 00:00:01 |

that the top level cost include to some degree the cost of scalar subselect per execution * estimated executions?

For example, if we code a join roughly equivalent to the scalar subselect then:

SQL> explain plan for
  2  select /*+ 
  3           full(t2) 
  4           use_nl(t2)
  5           */
  6         t1.col1
  7  ,      t2.col1
  8  from   t1
  9  ,      t2
 10  where t2.col1 (+) = t1.col1;

Explained.

SQL> 
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------
Plan hash value: 2453408398

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      | 10000 |   253K| 66919   (7)| 00:05:35 |
|   1 |  NESTED LOOPS OUTER|      | 10000 |   253K| 66919   (7)| 00:05:35 |
|   2 |   TABLE ACCESS FULL| T1   | 10000 |   126K|     8   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   |     1 |    13 |     7  (15)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("T2"."COL1"(+)="T1"."COL1")

Also see:
http://jonathanlewis.wordpress.com/2007/10/12/scalar-subqueries/

http://oracle-randolf.blogspot.com/2010/01/when-your-projection-is-not-cost-free.html

http://blog.sydoracle.com/2005/09/explain-plans-and-scalar-subqueries.html

About these ads

One Response to Scalar Subselect Costing

  1. lkafle says:

    cool Oracle subselect article on Subselect

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 75 other followers

%d bloggers like this: