Thoughts on TM_DELTA_TIME
July 6, 2012 7 Comments
What are these columns in V$ACTIVE_SESSION_HISTORY – TM_DELTA_TIME, TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME?
- TM_DELTA_TIME – Time interval (in microseconds) over which TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME are accumulated
- TM_DELTA_CPU_TIME – Amount of time this session spent on CPU over the last TM_DELTA_TIME microseconds
- TM_DELTA_DB_TIME – Amount of time spent by this session in database calls over the last TM_DELTA_TIME microseconds
Need a reminder on TIME MODEL STATISTICS?
The most important of the time model statistics is DB time.
This statistics represents the total time spent in database calls and is an indicator of the total instance workload.
It is calculated by aggregating the CPU and wait times of all sessions not waiting on idle wait events (non-idle user sessions).
i.e. DB TIME is time spent either actively working or actively waiting
What this effectively means is that TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME have no context without TM_DELTA_TIME.
So when looking at TM_DELTA_CPU_TIME and TM_DELTA_DB_TIME, all other sample times/periods are pretty much irrelevant apart from TM_DELTA_TIME – the latter is the only value that gives context to the other two.
TM_DELTA_DB_TIME is time spent actively working or actively waiting, some of this might be time spent on CPU – TM_DELTA_CPU_TIME – the rest will be time spent actively waiting on some non-idle wait events plus some overheads no doubt.
And TM_DELTA_DB_TIME should never exceed TM_DELTA_TIME, right?
The difference between TM_DELTA_TIME and TM_DELTA_DB_TIME should be idle time.
Sounds like something that might be quite useful particularly to counterbalance some of the pitfalls of looking at sampled data?
How might we investigate this further?
Shall we start with a plsql loop that should do nothing but sit on CPU for a short while?
SQL> select * from v$version; BANNER ------------------------------------------------------------------------------- Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production PL/SQL Release 11.2.0.3.0 - Production CORE 11.2.0.3.0 Production TNS for Linux: Version 11.2.0.3.0 - Production NLSRTL Version 11.2.0.3.0 - Production SQL> select distinct sid from v$mystat; SID ---------- 642 SQL> alter session set plsql_optimize_level = 0; Session altered. SQL> set timing on SQL> DECLARE 2 BEGIN 3 FOR i IN 1 .. 1000000000 LOOP 4 NULL; 5 END LOOP; 6 END; 7 / PL/SQL procedure successfully completed. Elapsed: 00:00:16.46 SQL>
What does ASH say?
SQL> select to_char(sample_time,'HH24:MI:SS') 2 , is_awr_sample 3 , event 4 , session_state 5 , to_char(tm_delta_time) tm_delta_time 6 , to_char(tm_delta_cpu_time) tm_delta_cpu_time 7 , to_char(tm_delta_db_time) tm_delta_db_time 8 from v$active_session_history 9 where session_id = 642 10 and sql_id = '6b5drmtrvs6vv' 11 order by sample_id desc; TO_CHAR( I EVENT SESSION TM_DELTA_TIME TM_DELTA_CPU_TIME TM_DELTA_DB_TIME -------- - ------------------------- ------- --------------- ----------------- ---------------- 14:45:25 N ON CPU 14:45:24 N ON CPU 2019831 2016694 2019831 14:45:23 N ON CPU 14:45:22 N ON CPU 2019978 2017693 2019978 14:45:21 N ON CPU 14:45:20 N ON CPU 2019848 2017694 2019848 14:45:19 N ON CPU 14:45:18 Y ON CPU 14:45:17 N ON CPU 2019934 2017693 2019934 14:45:16 N ON CPU 14:45:15 N ON CPU 2019803 2017693 2019803 14:45:14 N ON CPU 14:45:13 N ON CPU 2019925 2017693 2019925 14:45:12 N ON CPU 14:45:11 N ON CPU 4711327 1921708 1926278 14:45:10 N ON CPU 40504273 130979 134479 16 rows selected. SQL>
So, in this example, roughly every two seconds I get feedback of what my session was doing for that DELTA TIME period.
Unsurprisingly, given what we were doing, the session mainly spent 100% of it’s time in TM_DELTA_DB_TIME actively working, and TM_DELTA_CPU_TIME is the major component of that.
The difference between CPU_TIME – DB_TIME would be the time spent actively waiting plus some overhead.
If did more waiting on a non-idle wait event – IO for example – our DB_TIME would be similar but the element of CPU_TIME would be smaller, e.g.
SQL> DECLARE 2 BEGIN 3 FOR i IN (SELECT * FROM all_source) LOOP 4 NULL; 5 END LOOP; 6 END; 7 / PL/SQL procedure successfully completed. Elapsed: 00:00:37.82
which gives
SQL> select to_char(sample_time,'HH24:MI:SS') 2 , is_awr_sample 3 , event 4 , session_state 5 , to_char(tm_delta_time) tm_delta_time 6 , to_char(tm_delta_cpu_time) tm_delta_cpu_time 7 , to_char(tm_delta_db_time) tm_delta_db_time 8 from v$active_session_history 9 where session_id = 642 10 and sql_id IN ('2nj4dm3qwmkqh','34y8jhfh5w4q9') 11 order by sample_id desc; TO_CHAR( I EVENT SESSION TM_DELTA_TIME TM_DELTA_CPU_TIME TM_DELTA_DB_TIME -------- - ------------------------- ------- --------------- ----------------- ---------------- 15:05:43 N ON CPU 2019899 1089835 2019899 15:05:42 Y db file scattered read WAITING 15:05:41 N db file scattered read WAITING 2020879 1359793 2020879 15:05:40 N ON CPU 15:05:39 N db file scattered read WAITING 2022891 1487774 2022891 15:05:38 N db file scattered read WAITING 15:05:37 N db file scattered read WAITING 2021913 1123828 2021913 15:05:36 N ON CPU 15:05:35 N db file scattered read WAITING 2036914 953856 2036914 15:05:34 N ON CPU 15:05:33 N db file scattered read WAITING 1017936 686895 1017936 15:05:32 Y db file scattered read WAITING 2039867 897864 2039867 15:05:31 N db file scattered read WAITING 15:05:30 N db file scattered read WAITING 1928599 1035843 1928599 15:05:29 N ON CPU 15:05:28 N ON CPU 2069218 1185819 2069218 15:05:27 N ON CPU 15:05:26 N ON CPU 2092901 840872 2092901 15:05:25 N ON CPU 15:05:24 N db file scattered read WAITING 1998272 1044841 1998272 15:05:23 N db file scattered read WAITING 15:05:22 Y db file scattered read WAITING 2052570 1131828 2052570 15:05:21 N ON CPU 15:05:20 N db file scattered read WAITING 2021863 1411786 2021863 15:05:19 N ON CPU 15:05:18 N db file scattered read WAITING 1674137 797879 1674137 15:05:17 N db file scattered read WAITING 15:05:16 N ON CPU 2350338 270958 2350338 15:05:15 N ON CPU 15:05:14 N db file scattered read WAITING 1944057 165975 1944057 15:05:13 N db file scattered read WAITING 15:05:12 Y db file scattered read WAITING 2078929 1194819 2078929 15:05:11 N db file scattered read WAITING 15:05:10 N ON CPU 2052015 1266807 2052015 15:05:09 N ON CPU 15:05:08 N db file scattered read WAITING 985873021 915860 1846906 15:05:07 N ON CPU 15:05:06 N ON CPU 6746807 836873 843056 82 rows selected. SQL>
As expected, we see from TM_DELTA_DB_TIME that we are mainly spending 100% of our time actively working or actively waiting.
But, because we’ve had to do some IO, the proportion of that time which is CPU has reduced and the difference between TM_DELTA_DB_TIME and TM_DELTA_CPU_TIME should be roughly the time spent actively waiting during that TM_DELTA_TIME.
But in this output immediately above, do we start to see some of the limitations in this data?
What about this?
TO_CHAR( I EVENT SESSION TM_DELTA_TIME TM_DELTA_CPU_TIME TM_DELTA_DB_TIME -------- - ------------------------- ------- --------------- ----------------- ---------------- 15:05:08 N db file scattered read WAITING 985873021 915860 1846906
Why the big TM_DELTA_TIME all of a sudden? What’s that all about?
I’ve no idea and if anyone does then please comment.
I would presume that it’s a bug in the accounting.
And if it is a bug, it seriously limits the usefulness of this data for now and the ability to build diagnostic scripts on top of it.
So, moving quickly to how this data is stored down into the repository – DBA_HIST_ACTIVE_SESS_HISTORY.
You probably already know that every 10th second’s worth of the ASH buffer is marked for storage in the repository, as marked by V$ACTIVE_SESSION_HISTORY.IS_AWR_SAMPLE (included in the output above).
From the last buffer output above executing the loop of cpu+io workload, it should be these rows:
TO_CHAR( I EVENT SESSION TM_DELTA_TIME TM_DELTA_CPU_TIME TM_DELTA_DB_TIME -------- - ------------------------- ------- --------------- ----------------- ---------------- 15:05:42 Y db file scattered read WAITING 15:05:32 Y db file scattered read WAITING 2039867 897864 2039867 15:05:22 Y db file scattered read WAITING 2052570 1131828 2052570 15:05:12 Y db file scattered read WAITING 2078929 1194819 2078929
My initial thoughts were to wonder whether these TM_DELTA figures are persisted “as is”.
In which case, their value would be seriously limited.
Don’t want to wait for the flushing of ASH buffer to the repository?
We can force the flush by creating an AWR snapshot.
SQL> exec dbms_workload_repository.create_snapshot; PL/SQL procedure successfully completed. SQL>
And wham, bam, we can check out what has actually been persisted to DBA_HIST_ACTIVE_SESSION_HISTORY:
SQL> select to_char(sample_time,'HH24:MI:SS') 2 , event 3 , session_state 4 , to_char(tm_delta_time) tm_delta_time 5 , to_char(tm_delta_cpu_time) tm_delta_cpu_time 6 , to_char(tm_delta_db_time) tm_delta_db_time 7 from dba_hist_active_sess_history 8 where session_id = 642 9 and sql_id IN ('2nj4dm3qwmkqh','34y8jhfh5w4q9') 10 order by sample_id desc; TO_CHAR( EVENT SESSION TM_DELTA_TIME TM_DELTA_CPU_TIME TM_DELTA_DB_TIME -------- ------------------------- ------- --------------- ----------------- ---------------- 15:05:42 db file scattered read WAITING 9120533 5612146 703679604684133 15:05:32 db file scattered read WAITING 10128857 7202905 281471853489246 15:05:22 db file scattered read WAITING 10042965 7493861 20148167 15:05:12 db file scattered read WAITING 1194282071 20199931 31190611 9 rows selected. SQL>
So, the good news would be that the values from the buffer are not just copied down but seem to be an aggregation, presumably of the sampled rows. I’m not sure because some of the numbers add up, some don’t.
But the big, bad news is that there are similar accounting bugs – note the values in the first two rows for TM_DELTA_DB_TIME.
It should not be possible for TM_DELTA_DB_TIME to be greater than TM_DELTA_TIME.
As a result, this data seems occasionally unreliable and this unreliability undermines it’s usefulness.
Unless of course someone else has another explanation?
I have tried to use these three columns and find excessively large TM_DELTA_CPU_TIME appearing in both V$ACTIVE_SESSION_HISTORY and DBA_HIST_ACTIVE_SESS_HISTORY (naturally following from the source). This is in 11.2.0.2 A search on support.oracle.com doesn’t yield any useful notes.
Hemant – Thanks for the feedback and confirmation. So it seems that it’s not really possible to use these columns reliably yet.
Ran my queries on 11.2.0.3 and got similar excessively large values.
Hi Dominic,
just to give some more thoughts on this: All these columns added in 11.2 are actually used by Real-Time SQL Monitoring to populate the “Metrics” tab in the “EM/Active” (flash-based) version of the report.
There is an interesting query run by the DBMS_SQLTUNE.BUILD_STASH_XML function as part of generating the final XML embedded into the HTML document representing the document.
This uses all these newly added fields to come up with some bucket-based aggregation of that data which in turn seems to be used by the function PL/SQL code to generate the corresponding XML data the “Metrics” tab content is based on.
The query also seems to filter out those “outliers” that you’ve identified.
Note that my XPLAN_ASH tool in “experimental” mode also tries to analyze that data (the I/O related part) but it looks like that at least the I/O figures are not really corresponding to the actual amount of I/O – quite often they seem to fall short of the actual amount.
I haven’t investigated if a common cause for this can be found, but I have test cases where I can see that the I/O related graphs in the “Metrics” tab show the same inconsistency, so it doesn’t look a problem of my code handling the ASH data.
Randolf
If I am doing the math correctly, it looks like there 985 seconds in your first session record in ASH at 15:05. This is about 16 minutes. Your previous record was 2:45PM, or about 20 minutes prior. I am wondering if you didn’t do something else in your session at about 2:49PM, and then nothing for 16 minutes. Could there be a bug where the time is recorded since the previous measurement? I know that defeats the entire purpose of tracking “active” sessions, but the math at least makes me ask the question.
Pingback: What AWR isn’t telling you | OraStory
Hi, TM_DELTA_DB_TIME greater than TM_DELTA_TIME isn’t explained by the fact that are more cpus when TM_DELTA_TIME is merely microseconds passed?