Determining Deterministic IV – Again Not So Simple

In the second part, we skipped quickly over some simple performance benefits of a DETERMINISTIC function whilst in the third part, we touched every so briefly on the surprising influence of fetch sizing.

Obviously there are going to be variations in behaviour with different limits to the caching depending on the number and order of rows, the number of distinct values and the length and datatypes in and out of the function.

However, here are some interesting trial and error observations based on variations of the code used in the other articles. These observations have been made after messing with the sizing of the column PADDED_VAL1 (a column designed to give the deterministic mechanism a bit of width to deal with for a change), keeping the length of the data to that column limit and also by changing the number of distinct values across the 10000 rows.

I’ve ordered the rows by padded_val1 to try to get the best gains from DETERMINISTIC.

I was expecting to see some sweet spots but there are some surprising variations.

I’ve kept ARRAYSIZE consistent in all tests.

So, just following this script and modifying where indicated:

set arraysize 1000

drop table det1;

create table det1
(id          number
,val1        number
,padded_val1 varchar2(64)
,constraint pk_det1 primary key(id));

insert into det1
select rownum
,      mod(rownum, 8 )
,      rpad(mod(rownum, 8 ) , 64 )
from   dual
connect by rownum <= 10000;

exec dbms_stats.gather_table_stats(USER,'DET1');

exec dbms_application_info.set_client_info(0);

select sys_context( 'userenv', 'client_info') from dual;

select id, f2(padded_val1)
from   (select /*+ no_merge */
               id, padded_val1
        from   det1
        order by padded_val1);

select sys_context( 'userenv', 'client_info') from dual;

The observations were as follows:

VARCHAR2 length of padded_val1 col No of distinct values # fn calls when padded with ‘X‘ # fn calls when padded with ‘ ‘
64 4 12 12
128 4 511 12
256 4 511 12
512 4 12 12
64 8 16 16
64 9 18 18
64 10 10 10
64 16 24 648
16 8 1514 16
32 8 16 515
64 8 16 16
128 8 765 16
256 8 764 515
512 8 16 16

I could play around plugging in different values for ages.

This is something I might have to come back to another time, but for now if anyone has any thoughts on this behaviour they’d certainly be appreciated.

Determining Deterministic III – Not So Simple Performance Benefits

In the second part, I hopefully showed a simple example of the benefits of the DETERMINISTIC assertion in a function.

However, as mentioned, there was one influencing factor in the test results which I was deliberately not explicit about.

I was not expecting it to be a factor – I hadn’t even thought about it.

I only came upon it when investigating a difference in behaviour between tools.

Setting up as before:

SQL> drop table det1;

Table dropped.

SQL> 
SQL> create table det1
  2  (id          number
  3  ,val1        number
  4  ,padded_val1 varchar2(100)
  5  ,constraint pk_det1 primary key(id));

Table created.

SQL> 
SQL> insert into det1
  2  select rownum
  3  ,      mod(rownum, 10)
  4  ,      rpad(mod(rownum, 10),100)
  5  from   dual
  6  connect by rownum <= 1000;

1000 rows created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(USER,'DET1');

PL/SQL procedure successfully completed.

SQL> 
SQL> create or replace function f1 (
  2    i1 in number
  3  )
  4  return number
  5  deterministic
  6  as
  7  begin
  8   dbms_application_info.set_client_info( sys_context( 'userenv', 'client_info')+1 );
  9   return i1;
 10  end;
 11  /

Function created.

SQL> 

PL/SQL procedure successfully completed.

What’s missing here?

SQL> XXXXXXXXXXXXXXXXXXX                      <---------------- What's this?
SQL> exec dbms_application_info.set_client_info(0);

PL/SQL procedure successfully completed.

SQL> 
SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
-------------------------------------------------------------------------------
0

SQL> 
SQL> select id, val1, f1(val1)
  2  from   det1;

-- Cut rows

1000 rows selected.

SQL> 
SQL> 
SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
-------------------------------------------------------------------
11

SQL> 
SQL> XXXXXXXXXXXXXXXXXXX                      <---------------- And what's this?
SQL> exec dbms_application_info.set_client_info(0);

PL/SQL procedure successfully completed.

SQL> 
SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
---------------------------------------------------------------------------------
0

SQL> 
SQL> select id, val1, f1(val1)
  2  from   det1;

-- Cut rows
1000 rows selected.

SQL> 
SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
------------------------------------------------------------------------------------
1000

SQL> 

Any ideas?































































It’s the fetch size – ARRAYSIZE in the case of SQL*PLUS.

Does it seem obvious that it would be factor?

Scalar subquery caching (a comparison with which I was leaving for a later article) is not similarly affected.

SQL> set arraysize 1
SQL> create or replace function f1 (
  2    i1 in number
  3  )
  4  return number
  5  --deterministic
  6  as
  7  begin
  8   dbms_application_info.set_client_info( sys_context( 'userenv', 'client_info')+1 );
  9   return i1;
 10  end;
 11  /

Function created.

SQL> exec dbms_application_info.set_client_info(0);

PL/SQL procedure successfully completed.

SQL> 
SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
---------------------------------------------------------------------
0

SQL> 
SQL> select id, val1, (select f1(val1) from dual)
  2  from   det1;
-- Cut rows

1000 rows selected.

SQL> 
SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
---------------------------------------------------------------------------------
10

Determining Deterministic II – Simple Performance Benefits

In the first part I said that determinism was initially implemented as a required keyword for the function of a function-based index and that any other documented performance benefits were not (fully) implemented until at least 10.2.

As stated in the first part, the intention is to keep the examples restricted to a single version – 11.2.0.2 in this case – as the intention is not to compare how things have changed over time, just to make some observations and to draw some not too spurious conclusions.

So, what does the 11.2 documentation now tell us about DETERMINISTIC?

From the Advanced Application Developer’s Guide:

Certain performance optimizations occur on invocations of functions that are marked DETERMINISTIC without any other action being required. These features require that any function used with them be declared DETERMINISTIC:

  • Any user-defined function used in a function-based index.
  • Any function used in a materialized view, if that view is to qualify for Fast Refresh or is marked ENABLE QUERY REWRITE.

The preceding functions features attempt to use previously calculated results rather than invoking the function when it is possible to do so.

It is good programming practice to make functions that fall into these categories DETERMINISTIC:

  • Functions used in a WHERE, ORDER BY, or GROUP BY clause
  • Functions that MAP or ORDER methods of a SQL type
  • Functions that help determine whether or where a row appears in a result set

Keep these points in mind when you create DETERMINISTIC functions:

  • The database cannot recognize if the action of the function is indeed deterministic. If the DETERMINISTIC keyword is applied to a function whose action is not truly deterministic, then the result of queries involving that function is unpredictable.
  • If you change the semantics of a DETERMINISTIC function and recompile it, then existing function-based indexes and materialized views report results for the prior version of the function. Thus, if you change the semantics of a function, you must manually rebuild any dependent function-based indexes and materialized views.

I started off in part one looking at the only way DETERMINISTIC made any difference when first implemented – in function-based indexes.

But I want to rewind and yet go forward, leaving the indexing until a future part of this series, and focusing on the benefits of DETERMINISTIC for a simple unindexed function call.

So, the supposed performance benefits of a DETERMINISTIC function come from repeated function calls within a statement.

Set Up

SQL> drop table det1;

Table dropped.

SQL> 
SQL> create table det1
  2  (id          number
  3  ,val1        number
  4  ,padded_val1 varchar2(100)
  5  ,constraint pk_det1 primary key(id));

Table created.

SQL> 
SQL> insert into det1
  2  select rownum
  3  ,      mod(rownum, 10)
  4  ,      rpad(mod(rownum, 10),100)
  5  from   dual
  6  connect by rownum <= 1000;

1000 rows created.

SQL> 
SQL> exec dbms_stats.gather_table_stats(USER,'DET1');

PL/SQL procedure successfully completed.

SQL> 

Run without DETERMINISTIC

SQL> 
SQL> create or replace function f1 (
  2    i1 in number
  3  )
  4  return number
  5  --deterministic
  6  as
  7  begin
  8   dbms_application_info.set_client_info( sys_context( 'userenv', 'client_info')+1 );
  9   return i1;
 10  end;
 11  /

Function created.

SQL> exec dbms_application_info.set_client_info(0);

PL/SQL procedure successfully completed.

SQL> 
SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
----------------------------------------------------------------------------------------------
0

SQL> 
SQL> select id, val1, f1(val1)
  2  from   det1;
-- Cut output --

SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
-----------------------------------------------------------------------
1000

SQL> 

So, as expected, from the initial run without the DETERMINISTIC assertion, 10000 rows, 10000 function calls.

Next run with DETERMINISTIC function:

SQL> create or replace function f1 (
  2    i1 in number
  3  )
  4  return number
  5  deterministic
  6  as
  7  begin
  8   dbms_application_info.set_client_info( sys_context( 'userenv', 'client_info')+1 );
  9   return i1;
 10  end;
 11  /

Function created.

SQL> exec dbms_application_info.set_client_info(0);

PL/SQL procedure successfully completed.

SQL> 
SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
--------------------------------------------------------------------------------------------
0

SQL> 
SQL> select id, val1, f1(val1)
  2  from   det1;

-- Cut output

SQL> 
SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
-----------------------------------------------------------------------------------
11

SQL> 

So, in this example, we’ve saved 989 function calls over our 1000 rows.

I’ve not touched on the actual performance benefits other than the number of calls to the function. The main benefit is usually going to be the reduction in context switching plus then any significant work that the function itself implements.

Obviously there are going to be variations in behaviour and different limits based on number of rows, number of distinct values and datatype.

Of course, the order of the data can make a difference.

This is all based on a simple example still, but if we state an order then we can make a further saving of one function call:

SQL> exec dbms_application_info.set_client_info(0);

PL/SQL procedure successfully completed.

SQL> 
SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
------------------------------------------------------------------------
0

SQL> 
SQL> select id, f1(val1)
  2  from   (select /*+ no_merge */
  3                 id, val1
  4          from   det1
  5          order by val1);

-- cut output

SQL> select sys_context( 'userenv', 'client_info') from dual;

SYS_CONTEXT('USERENV','CLIENT_INFO')
---------------------------------------------------------------------------
10

SQL> 

This all seems mostly straightforward, but there’s one factor I’ve not been explicit about in these examples.

Any ideas?
(See next part)

Finally, there is no benefit to a DETERMINISTIC function between calls to the outer SQL – i.e. if I call the main SELECT above once and then again, there is no evidence that the second execution gets any benefit from the DETERMINISTIC function itseld. This is really the domain of the result cache.

And it should go without saying that if your DETERMINISTIC function involves a SELECT from a table, then it’s not really DETERMINISTIC is it? I may touch on another aspect of such misuse in a future article.

Determining Deterministic I – Introduction

There have been a couple of threads recently on the Oracle forums about deterministic functions and the performance benefits thereof.

These keep nagging me that I have been meaning to write something up about this DETERMINISTIC assertion and to also investigate a couple of aspects of this further.

Put simply, the DETERMINISTIC keyword asserts that a function when called with a certain input will always return the same output.

By the way, although you might declare your function as DETERMINISTIC, Oracle does not verify this assertion.

A brief history

The DETERMINISTIC keyword in the CREATE FUNCTION statement appeared in Oracle 8i.

Here’s what the Oracle 8.1.7 documentation said about the DETERMINISTIC assertion:

DETERMINISTIC is an optimization hint that allows the system to use a saved copy of the function’s return result (if such a copy is available). The saved copy could come from a materialized view, a function-based index, or a redundant call to the same function in the same SQL statement. The query optimizer can choose whether to use the saved copy or re-call the function.

From what I can remember, the first such optimisations were not available until about 10.2.

Nobody Move – FBI

So, prior to any of these optimisations, the main purpose of the DETERMINISTIC keyword was to enforce that any function used in a function-based index was declared DETERMINISTIC (although not necessarily implemented as such as mentioned above – an attempt otherwise would result in an error ORA-30553: The function is not deterministic.

And just to prove it:

SQL> create or replace function f1 (
  2    i1 in number
  3  )
  4  return number
  5  as
  6  begin
  7   return i1;
  8  end;
  9  /

Function created.

SQL> create table det1
  2  (id          number
  3  ,val1        number
  4  ,padded_val1 varchar2(128)
  5  ,constraint pk_det1 primary key(id));

Table created.

SQL> insert into det1
  2  select rownum
  3  ,      mod(rownum, 10)
  4  ,      rpad(mod(rownum, 10),128)
  5  from   dual
  6  connect by rownum <= 10000;

10000 rows created.

SQL> create index i_det1_f1_id on det1(f1(id));
create index i_det1_f1_id on det1(f1(id))
                                  *
ERROR at line 1:
ORA-30553: The function is not deterministic


SQL>                             

If we address this oversight, it should all be fine:

SQL> create or replace function f1 (
  2    i1 in number
  3  )
  4  return number
  5  deterministic
  6  as
  7  begin
  8   return i1;
  9  end;
 10  /

Function created.

SQL> create index i_det1_f1_id on det1(f1(id));

Index created.

SQL> select index_name, table_name, index_type, status, funcidx_status
  2  from   dba_indexes
  3  where  index_name = 'I_DET1_F1_ID';

INDEX_NAME                     TABLE_NAME                     INDEX_TYPE                  STATUS   FUNCIDX_
------------------------------ ------------------------------ --------------------------- -------- -
I_DET1_F1_ID                   DET1                           FUNCTION-BASED NORMAL       VALID    ENABLED

SQL> 

Just for completeness – and this is really function-based index 101 rather than determinism – what happens if we change the function after the index has been created?

… And here is where I get hit by a distraction. I was trying to restrict all my examples in this series to 11gR2 for simplicity (and because of the environments I have available currently).

On 11.2.0.2:

SQL> create or replace function f1 (
  2    i1 in number
  3  )
  4  return number
  5  --deterministic
  6  as
  7  begin
  8   return i1;
  9  end;
 10  /

Function created.

SQL> select index_name, table_name, index_type, status, funcidx_status
  2  from   dba_indexes
  3  where  index_name = 'I_DET1_F1_ID';

INDEX_NAME                     TABLE_NAME                     INDEX_TYPE                  STATUS   FUNCIDX_
------------------------------ ------------------------------ --------------------------- -------- -
I_DET1_F1_ID                   DET1                           FUNCTION-BASED NORMAL       VALID    ENABLED

SQL> 

That’s not quite what I expected to see in the column FUNCIDX_STATUS but it’s been a long time since I’ve deliberately messed around with FBIs.

If you were to do this in 9i, then as this output from 9.2.0.8 shows:

SQL> select index_name, table_name, index_type, status, funcidx_status
  2  from   dba_indexes
  3  where  index_name = 'I_DET1_ID';

INDEX_NAME                     TABLE_NAME                     INDEX_TYPE                  STATUS   FUNCIDX_
------------------------------ ------------------------------ --------------------------- -------- -
I_DET1_F1_ID                   DET1                           FUNCTION-BASED NORMAL       VALID    DISABLED

SQL> 

So, not a lot happens in 11gR2

I’m 90% sure that the behaviour in 11.1.0.6 was the same as 9.2.0.8 – FUNCIDX_STATUS gets set to DISABLED – because I’ve previously built release scripts based on that status column which worked in 11.1 but I’ve not got access to such an environment to compare right now.

I was thinking bug perhaps but perhaps this quote from the 11gR2 documentation justifies this behaviour:

If you change the semantics of a DETERMINISTIC function and recompile it, then existing function-based indexes and materialized views report results for the prior version of the function. Thus, if you change the semantics of a function, you must manually rebuild any dependent function-based indexes and materialized views.

I’m not sure – but it’s a change certainly.

Next part – some benefits of determinism.

Determining Deterministic

This is a bit weird.

I’m putting together a couple of articles on DETERMINISTIC functions – this is a landing page for those – but it’s a feature I never use unless I’ve got a function-based index.

Hopefully, I’ll go over some of the reasons why.

But by crickey it’s really highlighted to me again how hard it is to come up with isolated, consistent, simple demonstrations. It’s like opening Pandora’s box as you come across little oddities and influencing factors.

Anyway… for what it’s worth:

Follow

Get every new post delivered to your Inbox.

Join 69 other followers