Index Skip Scan skipping specified column

Everything I can remember reading about the Index Skip Scan access path has always talked about it being used in queries where the leading column in the index is not specified in a query AND where that leading column has a relatively low number of distinct values in that leading column.

For example:

SQL> drop table t1;

Table dropped.

SQL> 
SQL> create table t1
  2  as
  3  select case when mod(rownum,2) = 0
  4              then 'Y'
  5              else 'N'
  6         end flagyn
  7  ,      rownum id
  8  ,      rpad(rownum,200,'0') col2
  9  from   dual 
 10  connect by rownum <=100000;

Table created.

SQL> 
SQL> create index i1 on t1 (flagyn, id);

Index created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(USER,'T1');

PL/SQL procedure successfully completed.

SQL> explain plan for
  2  select *
  3  from   t1
  4  where  id   = 3222;

Explained.

SQL>         
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------

-------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)|
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |   208 |     4   (0)|
|   1 |  TABLE ACCESS BY INDEX ROWID| T1   |     1 |   208 |     4   (0)|
|*  2 |   INDEX SKIP SCAN           | I1   |     1 |       |     3   (0)|
-------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------

   2 - access("ID"=3222)
       filter("ID"=3222)


18 rows selected.

SQL> 

So, I was slightly surprised to the see this access path being used on a query where the leading column WAS specified. I can’t remember having noticed this before in similar circumstances.

Here is the output from the actual situation (test case not supplied).

SQL> explain plan for
  2  SELECT  *
  3  FROM   daily_accr_pos accr
  4  WHERE  accr.inst_num       = 243961
  5  AND    accr.accr_date      >= to_date(20110221120000,'YYYYMMDDHHMISS')
  6  AND    accr.accr_date      <= to_date(20110303120000,'YYYYMMDDHHMISS');

Explained.

SQL>         
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name                 | Rows  | Bytes | Cost (%CPU)|
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                      |     1 |    86 |    11  (19)|
|   1 |  TABLE ACCESS BY INDEX ROWID| DAILY_ACCR_POS       |     1 |    86 |    11  (19)|
|*  2 |   INDEX SKIP SCAN           | DAILY_ACCR_POS_8_IDX |     1 |       |    10  (20)|
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------

   2 - access("ACCR"."ACCR_DATE">=TO_DATE(' 2011-02-21 12:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "ACCR"."INST_NUM"=243961 AND "ACCR"."ACCR_DATE"<=TO_DATE('
              2011-03-03 12:00:00', 'syyyy-mm-dd hh24:mi:ss'))
       filter("ACCR"."INST_NUM"=243961)

20 rows selected.

Note that this index is defined on (ACCR_DATE, INST_NUM), both of which are supplied literals in the query above and also noted as ACCESS PREDICATES in the predicate section.

Some extra metadata about the index:

SQL> select index_name iname
  2  ,      uniqueness uq
  3  ,      blevel
  4  ,      leaf_blocks num_lf
  5  ,      distinct_keys dst_ky
  6  ,      avg_leaf_blocks_per_key avg_lf
  7  ,      avg_data_blocks_per_key avg_dt 
  8  ,      clustering_factor cf
  9  ,      num_rows  
 10  from   dba_indexes 
 11  where  index_name = 'DAILY_ACCR_POS_8_IDX';

INAME                UQ            BLEVEL     NUM_LF     DST_KY     AVG_LF     AVG_DT         CF
-------------------- --------- ---------- ---------- ---------- ---------- ---------- ----------
  NUM_ROWS
----------
DAILY_ACCR_POS_8_IDX NONUNIQUE          3     167844   30386593          1          1   44902638
  50323152


SQL> 
SQL> select column_name
  2  ,      column_position
  3  from   dba_ind_columns 
  4  where  index_name = 'DAILY_ACCR_POS_8_IDX';

COLUMN_NAME     COLUMN_POSITION
--------------- ---------------
ACCR_DATE                     1
INST_NUM                      2

SQL> 
SQL> select column_name
  2  ,      data_type
  3  ,      num_distinct num_dis
  4  ,      low_value, display_raw(low_value, data_type) lo_val
  5  ,      high_value, display_raw(high_value, data_type) hi_val
  6  ,      density
  7  ,      num_nulls
  8  from   dba_tab_columns 
  9  where  table_name = 'DAILY_ACCR_POS' 
 10  and    column_name in ('ACCR_DATE','INST_NUM');

COLUMN_NAME     DATA_TYP    NUM_DIS LOW_VALUE       LO_VAL          HIGH_VALUE      HI_VAL
--------------- -------- ---------- --------------- --------------- --------------- ---------------
   DENSITY  NUM_NULLS
---------- ----------
INST_NUM        NUMBER        91824 C112            17              C3646453        999982
 .00001089          0

ACCR_DATE       DATE           5693 77C30B07010101  07-NOV-1995     78950C1F010101  31-DEC-2049
.000175654          0


SQL> 
About these ads

4 Responses to Index Skip Scan skipping specified column

  1. Nigel Thomas says:

    This plan is skipping all the entries in the index which DO match the accr_date range but DON’T match the inst_num supplied value. The CBO has decided that when it runs out of index entries for ,243961 it’s quicker to skip to ,243961 than to scan serially through the index for all higher values of inst_num for day1 and all lower values for day2.

    Cheers Nigel

  2. That it expects to retrieve only one single row for the query predicates means that it expects to read only one index block.
    What happens when you actually run the query and trace it ? Or run it with gather_plan_statistics and then display_cursor ? How much real effort does it spend on the index ?

    • Dom Brooks says:

      Hi Hemant,

      Good points, which I would look into if I was pursuing this further.
      Estimates of 1 row are really a special case – all sorts of stranage things can happen, strange things which are particularly suboptimal when 1 is not the case.

      I was looking at adding a column to an existing index.

      I just thought I’d mention it because the orthodox explanations for when a skip scan are used are not necessarily the whole story – i.e. here’s a not particularly low cardinality leading column AND it’s specified in the predicates.

      Thanks for the input.

      Cheers,
      Dominic

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 75 other followers

%d bloggers like this: