Copy_table_stats

An observation on DBMS_STATS.COPY_TABLE_STATS in 11.2.0.3

Prior to 10.2.0.5, 11.1.0.7 (I think) and 11.2, COPY_TABLE_STATS copied the high/low value of the partition key from the source to the destination rather than use the implicit boundaries of that partition. Doug mentions it in his series on statistics on partitioned objects.

Looking at it in 11.2.0.3, this issue is still addressed in 11.2.0.3 but there’s a “but” if we copy at the SUBPARTITION level:

Setup:

SQL> create table t1
  2  (dt    DATE
  3  ,col1  VARCHAR2(10)
  4  ,stuff VARCHAR2(100))
  5  PARTITION BY RANGE (DT)
  6  SUBPARTITION BY LIST (COL1)
  7  SUBPARTITION TEMPLATE
  8  (SUBPARTITION A VALUES ('A'),
  9   SUBPARTITION B VALUES ('B'),
 10   SUBPARTITION C VALUES ('C'))
 11  (PARTITION T1_WK1 VALUES LESS THAN (TO_DATE('07-01-2012','DD-MM-YYYY')),
 12   PARTITION T1_WK2 VALUES LESS THAN (TO_DATE('14-01-2012','DD-MM-YYYY')));

Table created.

SQL> insert into t1
  2  values (TO_DATE('01-01-2012','DD-MM-YYYY'),'A',RPAD('X',100,'X'));

1 row created.

SQL> insert into t1
  2  values (TO_DATE('02-01-2012','DD-MM-YYYY'),'B',RPAD('X',100,'X'));

1 row created.

SQL> insert into t1
  2  values (TO_DATE('03-01-2012','DD-MM-YYYY'),'C',RPAD('X',100,'X'));

1 row created.

SQL> exec dbms_stats.gather_table_stats(USER,'T1','T1_WK1');

PL/SQL procedure successfully completed.

SQL> 

So, in one partition, we have 1 row per subpartition. The other partition is empty.

And we’ve gathered stats on one partition.
(Why just one partition? It’s not really important but I’ll touch on it at the end).

You should see the following statistics:

SQL> select global_stats
  2  ,      column_name
  3  ,      case
  4              when column_name = 'DT'   then display_raw(low_value,'DATE')
  5              when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  6         end lo
  7  ,      case
  8              when column_name = 'DT'   then display_raw(high_value,'DATE')
  9              when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 10         end hi
 11  from   user_tab_col_statistics
 12  where  table_name = 'T1'
 13  and    column_name IN ('DT','COL1');

GLO COLUMN_NAME          LO         HI
--- -------------------- ---------- ----------
YES DT                   01-JAN-12  03-JAN-12
YES COL1                 A          C

SQL> select global_stats
  2  ,      partition_name
  3  ,      column_name
  4  ,      case
  5              when column_name = 'DT'   then display_raw(low_value,'DATE')
  6              when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  7         end lo
  8  ,      case
  9              when column_name = 'DT'   then display_raw(high_value,'DATE')
 10              when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 11         end hi
 12  from   user_part_col_statistics
 13  where  table_name  = 'T1'
 14  and    column_name IN ('DT','COL1')
 15  order by partition_name, column_name;

GLO PARTITION_NAME       COLUMN_NAME          LO         HI
--- -------------------- -------------------- ---------- ----------
YES T1_WK1               COL1                 A          C
YES T1_WK1               DT                   01-JAN-12  03-JAN-12
NO  T1_WK2               COL1
NO  T1_WK2               DT

SQL> select subpartition_name
  2  ,      column_name
  3  ,      case
  4              when column_name = 'DT'   then display_raw(low_value,'DATE')
  5              when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  6         end lo
  7  ,      case
  8              when column_name = 'DT'   then display_raw(high_value,'DATE')
  9              when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 10         end hi
 11  from   user_subpart_col_statistics
 12  where  table_name  = 'T1'
 13  and    column_name IN ('DT','COL1')
 14  order by subpartition_name, column_name;

SUBPARTITION_NAME    COLUMN_NAME          LO         HI
-------------------- -------------------- ---------- ----------
T1_WK1_A             COL1                 A          A
T1_WK1_A             DT                   01-JAN-12  01-JAN-12
T1_WK1_B             COL1                 B          B
T1_WK1_B             DT                   02-JAN-12  02-JAN-12
T1_WK1_C             COL1                 C          C
T1_WK1_C             DT                   03-JAN-12  03-JAN-12
T1_WK2_A             COL1
T1_WK2_A             DT
T1_WK2_B             COL1
T1_WK2_B             DT
T1_WK2_C             COL1
T1_WK2_C             DT

12 rows selected.

SQL> 

Hopefully no surprises there.

BTW: display_raw

Now, if we use COPY_TABLE_STATS at a partition level:

SQL> exec dbms_stats.copy_table_stats(USER,'T1','T1_WK1','T1_WK2');

PL/SQL procedure successfully completed.

SQL>

the table level stats are unaffected as you might expect:

SQL> select global_stats
  2  ,      column_name
  3  ,      case
  4              when column_name = 'DT'   then display_raw(low_value,'DATE')
  5              when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  6         end lo
  7  ,      case
  8              when column_name = 'DT'   then display_raw(high_value,'DATE')
  9              when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 10         end hi
 11  from   user_tab_col_statistics
 12  where  table_name = 'T1'
 13  and    column_name IN ('DT','COL1');

GLO COLUMN_NAME          LO         HI
--- -------------------- ---------- ----------
YES DT                   01-JAN-12  03-JAN-12
YES COL1                 A          C

SQL> 

and, most importantly, the partition key high/low values are no longer just copied:

SQL> select global_stats
  2  ,      partition_name
  3  ,      column_name
  4  ,      case
  5              when column_name = 'DT'   then display_raw(low_value,'DATE')
  6              when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  7         end lo
  8  ,      case
  9              when column_name = 'DT'   then display_raw(high_value,'DATE')
 10              when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 11         end hi
 12  from   user_part_col_statistics
 13  where  table_name  = 'T1'
 14  and    column_name IN ('DT','COL1')
 15  order by partition_name, column_name;

GLO PARTITION_NAME       COLUMN_NAME          LO         HI
--- -------------------- -------------------- ---------- ----------
YES T1_WK1               COL1                 A          C
YES T1_WK1               DT                   01-JAN-12  03-JAN-12
YES T1_WK2               COL1                 A          C
YES T1_WK2               DT                   07-JAN-12  14-JAN-12

The statistics above show that the high/low of the destination partition match the partition boundaries which nearly makes sense but note the high value is the LESS THAN limit of the partition.
(Robin Moffat also mentioned this in a post)

Now copying the statistics at a PARTITION level, does nothing to the SUBPARTITION statistics, never has done.
That’s not changed nor was it expected to:

SQL>  select subpartition_name
  2   ,      column_name
  3   ,      case
  4               when column_name = 'DT'   then display_raw(low_value,'DATE')
  5               when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  6          end lo
  7   ,      case
  8               when column_name = 'DT'   then display_raw(high_value,'DATE')
  9               when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 10          end hi
 11   from   user_subpart_col_statistics
 12   where  table_name  = 'T1'
 13   and    column_name IN ('DT','COL1')
 14   order by subpartition_name, column_name;

SUBPARTITION_NAME    COLUMN_NAME          LO                   HI
-------------------- -------------------- -------------------- --------------------
T1_WK1_A             COL1                 A                    A
T1_WK1_A             DT                   01-JAN-12            01-JAN-12
T1_WK1_B             COL1                 B                    B
T1_WK1_B             DT                   02-JAN-12            02-JAN-12
T1_WK1_C             COL1                 C                    C
T1_WK1_C             DT                   03-JAN-12            03-JAN-12
T1_WK2_A             COL1
T1_WK2_A             DT
T1_WK2_B             COL1
T1_WK2_B             DT
T1_WK2_C             COL1
T1_WK2_C             DT

12 rows selected.

Perhaps then, if we value our subpartition statitics, we might think it’s a good idea to do our operations at a subpartition level instead.

What happens if we reset these statistics and do a COPY_TABLE_STATS at a subpartition level?

Reset:

SQL> exec dbms_stats.delete_table_stats(USER,'T1');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(USER,'T1','T1_WK1');

PL/SQL procedure successfully completed.

And COPY_TABLE_STATS on the subpartitions:

SQL>  begin
  2    dbms_stats.copy_table_stats(USER,'T1','T1_WK1_A','T1_WK2_A');
  3    dbms_stats.copy_table_stats(USER,'T1','T1_WK1_B','T1_WK2_B');
  4    dbms_stats.copy_table_stats(USER,'T1','T1_WK1_C','T1_WK2_C');
  5   end;
  6  /

PL/SQL procedure successfully completed.

The table level stats are still unaffected.

But at the subpartition level, the RANGE PARTITION KEY (DT) is back to just being copied, not set to the boundary of the PARTITION KEY:

SQL>  select subpartition_name
  2   ,      column_name
  3   ,      case
  4               when column_name = 'DT'   then display_raw(low_value,'DATE')
  5               when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  6          end lo
  7   ,      case
  8               when column_name = 'DT'   then display_raw(high_value,'DATE')
  9               when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 10          end hi
 11   from   user_subpart_col_statistics
 12   where  table_name  = 'T1'
 13   and    column_name IN ('DT','COL1')
 14   order by subpartition_name, column_name;

SUBPARTITION_NAME    COLUMN_NAME          LO                   HI
-------------------- -------------------- -------------------- --------------------
T1_WK1_A             COL1                 A                    A
T1_WK1_A             DT                   01-JAN-12            01-JAN-12
T1_WK1_B             COL1                 B                    B
T1_WK1_B             DT                   02-JAN-12            02-JAN-12
T1_WK1_C             COL1                 C                    C
T1_WK1_C             DT                   03-JAN-12            03-JAN-12
T1_WK2_A             COL1                 A                    A
T1_WK2_A             DT                   01-JAN-12            01-JAN-12
T1_WK2_B             COL1                 B                    B
T1_WK2_B             DT                   02-JAN-12            02-JAN-12
T1_WK2_C             COL1                 C                    C
T1_WK2_C             DT                   03-JAN-12            03-JAN-12

12 rows selected.

And, at the partition level, any aggregation of these incorrect values depends on whether we already have GLOBAL STATS or not (hence the initial stats gathering on just the partition).

If no global stats then the error is propagated up from the subpartition:

SQL>  select global_stats
  2   ,      partition_name
  3   ,      column_name
  4   ,      case
  5               when column_name = 'DT'   then display_raw(low_value,'DATE')
  6               when column_name = 'COL1' then display_raw(low_value,'VARCHAR2')
  7          end lo
  8   ,      case
  9               when column_name = 'DT'   then display_raw(high_value,'DATE')
 10               when column_name = 'COL1' then display_raw(high_value,'VARCHAR2')
 11          end hi
 12   from   user_part_col_statistics
 13   where  table_name  = 'T1'
 14   and    column_name IN ('DT','COL1')
 15   order by partition_name, column_name;

GLO PARTITION_NAME       COLUMN_NAME          LO                   HI
--- -------------------- -------------------- -------------------- --------------------
YES T1_WK1               COL1                 A                    C
YES T1_WK1               DT                   01-JAN-12            03-JAN-12
NO  T1_WK2               COL1                 A                    C
NO  T1_WK2               DT                   01-JAN-12            03-JAN-12

Or if instead of gathering statistics at the partition level initially (and at reset), we do a table level gather, then
the PARTITION stats are not aggregated, i.e.

GLO PARTITION_NAME       COLUMN_NAME          LO         HI
--- -------------------- -------------------- ---------- ----------
YES T1_WK1               COL1                 A          C
YES T1_WK1               DT                   01-JAN-12  03-JAN-12
YES T1_WK2               COL1
YES T1_WK2               DT

So, in summary, I just wanted to show that whilst the issue with COPY_TABLE_STATS and HIGH/LOW has been addressed at the PARTITION level, there still seem to be issues for the PARTITION KEY at the SUBPARTITION LEVEL.

Partitioned table statistics are more complicated than you might expect.

Here are some of the excellent resources that can guide you through:

Leave a comment