Correlation / causation

On Tuesday morning I walked into a right sh*t storm.

I hadn’t been called but on Monday night performance had been absolutely catastrophic – SLAs missed left, right and centre.

Straight into a meeting where the following observations were made:

  • Performance was somewhat suboptimal the previous night
  • Some long-running queries were observed that aren’t normally top of the lists, these were “uncontrolled” access via Excel or Access, not via the reporting Front End or other application processes
  • When these queries stopped, performance returned to normal

So:

  • Why had these queries brought the database to its knees?
  • Was it stats and the stats job on Sunday night?
  • Was it “plan flips”? (I hear a lot about “plan flips” and them being “bad” – do people really expect plan stability without plan stability features?)
  • Was it cardinality feedback? (Admittedly this does seem to have caused us more positives than negatives particularly when working in conjunction with dynamic sampling when I’m not sure it should be).

At this point, I still hadn’t seen any evidence or symptoms of the issue, just hearsay.

So, what did the problem look like?

Below is a snippet from an AWR report from a “normal” period.
(It’s longer than usually advisable – I just want to show the symptoms over the problem period)

Platform                         CPUs Cores Sockets Memory(GB)
-------------------------------- ---- ----- ------- ----------
Linux x86 64-bit                   24    12       2     141.44

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:      6678 21-May-12 21:30:26       584      22.8
  End Snap:      6682 21-May-12 23:30:33       565      24.6
   Elapsed:              120.13 (mins)
   DB Time:              582.50 (mins)

Foreground Wait Events
                                                             Avg
                                        %Time Total Wait    wait    Waits   % DB
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
db file sequential read       2,664,063     0      7,050       3      2.5   20.2
log file sync                 1,027,185     0      3,501       3      1.0   10.0
direct path read                159,706     0      1,191       7      0.2    3.4
db file scattered read           61,649     0        284       5      0.1     .8

Background Wait Events
                                                             Avg
                                        %Time Total Wait    wait    Waits   % bg
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
log file parallel write         839,190     0      1,465       2      0.8   23.6
db file parallel write          551,198     0        674       1      0.5   10.9
Log archive I/O                  21,277     0        311      15      0.0    5.0
log file sequential read          6,714     0        150      22      0.0    2.4
control file parallel writ       14,311     0         48       3      0.0     .8

and this was what had happened:

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:      7014 28-May-12 21:30:28       560      22.8
  End Snap:      7018 28-May-12 23:30:01       538      23.7
   Elapsed:              119.54 (mins)
   DB Time:            2,067.54 (mins)

Foreground Wait Events
                                                             Avg
                                        %Time Total Wait    wait    Waits   % DB
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
log file sync                   399,560     0     83,938     210      1.0   67.7
direct path read                123,631     0     15,639     126      0.3   12.6
db file sequential read         363,158     0      9,406      26      0.9    7.6
db file scattered read           93,140     0      4,083      44      0.2    3.3

Background Wait Events
                                                             Avg
                                        %Time Total Wait    wait    Waits   % bg
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
db file parallel write          394,366     0      8,048      20      1.0   37.6
log file parallel write         101,489     0      6,197      61      0.2   28.9
Log archive I/O                  12,026     0      1,814     151      0.0    8.5
control file parallel writ       11,240     0      1,793     160      0.0    8.4
log file sequential read          9,587     0      1,179     123      0.0    5.5

So, hang on a second.

We think that a couple of long-running queries caused this devastation – log file syncs out to 70x our chosen “normal” above and no CPU issues? All IO basically choked.

Doesn’t make sense – wouldn’t that take some serious misconfiguration?

And yet, how could this be a storage issue if it all resolved itself when those long-running queries finished? It was this doubt that meant we spent longer than needed checking and double checking this box and this database before contacting the storage team.

Then to cut a long story short and to miss out a whole bunch of toing and froing and whys and wherefores… the storage team put the focus on another of our databases that uses the same disk groups.

That database didn’t seem to be doing very much:

              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:      8271 28-May-12 21:00:33        71        .9
  End Snap:      8273 28-May-12 23:00:17        72        .9
   Elapsed:              119.74 (mins)
   DB Time:              155.20 (mins)

Although the few things it was doing were similarly slow:

                                                           Avg
                                                          wait   % DB
Event                                 Waits     Time(s)   (ms)   time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
control file sequential read        272,003       6,778     25   72.8 System I/O
db file sequential read              36,346       1,409     39   15.1 User I/O
db file parallel read                 1,127         797    707    8.6 User I/O
log file sync                           300         253    843    2.7 Commit
DB CPU                                               72            .8

But, hang on a sec. What’s that:

Time Model Statistics

Statistic Name                                       Time (s) % of DB Time
------------------------------------------ ------------------ ------------
RMAN cpu time (backup/restore)                       42,305.3        454.3

and

Background Wait Events
                                                             Avg
                                        %Time Total Wait    wait    Waits   % bg
Event                             Waits -outs   Time (s)    (ms)     /txn   time
-------------------------- ------------ ----- ---------- ------- -------- ------
RMAN backup & recovery I/O    1,166,261     0     72,242      62 2.65E+04   60.2

Turns out that it was an RMAN backup, running when it shouldn’t have been, on a 3.5T archive database on one of the other nodes of the cluster and which uses the same disk groups.

SQL> select to_char(start_time,'DD-MON-YYYY HH24:MI')
  2  ,      to_char(end_time,'DD-MON-YYYY HH24:MI')
  3  ,      input_bytes_display
  4  ,      output_bytes_display
  5  ,      time_taken_display  
  6  from v$rman_backup_job_details  
  7  where session_stamp = 784494089  
  8  order by start_time desc  
  9  /

TO_CHAR(START_TIM TO_CHAR(END_TIME, INPUT_BYTES_DISPLAY  OUTPUT_BYTES_DISPLAY TIME_TAKEN_DISPLAY
----------------- ----------------- -------------------- -------------------- --------------------
28-MAY-2012 19:01 28-MAY-2012 23:47     2.99T              532.82G            04:46:30

SQL> 

That correlation was the causation, not those queries and this backup finished just before those long-running queries finished and then everything was ok.

But the long-running queries were just a red-herring and another symptom of everything that was running on the live database was choking.

Sometimes it’s difficult to distinguish between cause and symptom.

HugePages – Overhead of Not Using

I’ve read a fair bit about HugePages and their importance especially for Oracle databases with large SGAs and a large number of dedicated connections.

Admittedly, some of what I read is over my head – I’m not a production DBA and quite often I don’t even have access to the box unfortunately – but I know who the experts are and I trust them (doesn’t mean that an expert’s opinion shouldn’t be tested of course):

Now, the title in Kevin’s posts asks ( rhetorically) “Configuring Linux Hugepages for Oracle Database Is Just Too Difficult! Isn’t It?”

And I think a lot of the time, it is. Not because of the topic but because of the structure of modern IT departments.

When I tried to tackle the subject before at a client, it was awkward – it wasn’t a subject I could talk about with any great authority although I could refer them to the expert material above and … well, to cut a short story shorter and keep me out of trouble… we didn’t. I was overruled.  No-one else at the client was using HugePages on 11g. So no HugePages and no chance to test it.

Well, in the end, we didn’t go with AMM, because for our applications, according to our tests, we were better suited to ASMM (damn these acronyms).

Statements like this from Kevin’s articles above are easy to understand:

Reasons for Using Hugepages

  1. Use hugepages if OLTP or ERP. Full stop.
  2. Use hugepages if DW/BI with large numbers of dedicated connections or a large SGA. Full stop.
  3. Use hugepages if you don’t like the amount of memory page tables are costing you (/proc/meminfo). Full stop.

and

“Large number of dedicated connections” and “large SGA” in the context of hugepages can only be quantified by the amount of memory wasted in page tables and whether the administrator is satisfied with that cost.”

Those are pretty clear cut.

So, if I understand everything correctly, I should not like this, right?:

$ grep "PageTables" /proc/meminfo
PageTables:   41622228 kB
$

That’s 40 gig of overhead for an SGA of 60 gig and a whole bunch of connections.
So, don’t need HugePages, eh?
Bit late now.

Why did that report run slow? ASH says log file sequential read

“Dump Online Redo for Buffers in Pin History”

I’ve just been investigating why a certain report ran slow last night.
Read more of this post

Regexp hungry for CPU? Real time sql monitoring shows just how

Not exactly a snappy blog post title…

First up, an execution plan showing the old problem of how the costs of a scalar subquery are not properly accounted for, even in the latest 11gR2 releases.
Read more of this post

Poll: SQL Plan Management

I’m interested in what SQL Plan Management features people are actively using.
Read more of this post

ORA-00600 [kkqtSetOp.1] – Join Factorization

Just a quick note about an ORA-00600 that recently occurred following an upgrade to 11.2.0.3.

I’ve not been able to distill an isolated test case from the specific production code generating this error so I’ll skip the application-specific example.

The error message reported was:
Read more of this post

Timeslot SQL

A reminder about sql logic required to deal with express datetimes to the nearest x mins – for example, this particular question come up on an internal forum last week where the poster wanted to aggregate data per quarter hour.

There are two approaches depending on the exact requirement.

The first option is to generate the timeslots using a data generator and then join to the source data.

SQL>  alter session set nls_date_format = 'DD-MON-YYYY Hh24:MI';

Session altered.

SQL> with times as
  2  (select trunc(sysdate) + ((rownum-1)*15/60/24) slot
  3   from   dual
  4   connect by rownum <= 10)  --<-- just to keep the output short
  5  select * from times;

SLOT
-----------------
30-APR-2012 00:00
30-APR-2012 00:15
30-APR-2012 00:30
30-APR-2012 00:45
30-APR-2012 01:00
30-APR-2012 01:15
30-APR-2012 01:30
30-APR-2012 01:45
30-APR-2012 02:00
30-APR-2012 02:15

10 rows selected.

The second options is to round/floor/ceil the relevant existing date/timestamp in the table to the nearest x minutes.

For this solution, there is a generic approach as follows, demonstrated using sysdate:

SQL> select sysdate,
  2         trunc(sysdate)
  3       + floor(to_char(sysdate,'sssss') / 900) / 96 floor_slot,
  4         trunc(sysdate)
  5       + round(to_char(sysdate,'sssss') / 900) / 96 round_slot,
  6         trunc(sysdate)
  7       + ceil(to_char(sysdate,'sssss') / 900) / 96 ceil_slot
  8  from   dual;

SYSDATE           FLOOR_SLOT        ROUND_SLOT        CEIL_SLOT
----------------- ----------------- ----------------- -----------------
30-APR-2012 09:58 30-APR-2012 09:45 30-APR-2012 10:00 30-APR-2012 10:00

SQL> 

Where the basic forumla is:

 trunc(sysdate)
+ floor(to_char(sysdate,'sssss') / x ) / y 
  1. The date – example uses sysdate but this would probably be a column – truncated to start of day.
  2. The usage of FLOOR / ROUND / CEIL depends on the requirement.
  3. to_char(sysdate,’sssss’) is the number of seconds since midnight
  4. x is the number of seconds in the timeslot length we’re interested in – for the example, 15 minutes * 60 seconds per minute = 900
  5. Because date arithmetic is done in days, y is the number of timeslots in a day – for this example 4 timeslots per hour * 24 hours per day = 96

So, if we want five minute timeslots, x is 5*60=300; y is 12*24=288:

SQL> select sysdate,
  2         trunc(sysdate)
  3       + floor(to_char(sysdate,'sssss') / 300) / 288 floor_stamp,
  4         trunc(sysdate)
  5       + round(to_char(sysdate,'sssss') / 300) / 288 round_stamp,
  6         trunc(sysdate)
  7       + ceil(to_char(sysdate,'sssss') / 300) / 288 ceil_stamp
  8  from   dual;

SYSDATE           FLOOR_STAMP       ROUND_STAMP       CEIL_STAMP
----------------- ----------------- ----------------- -----------------
30-APR-2012 10:13 30-APR-2012 10:10 30-APR-2012 10:15 30-APR-2012 10:15

SQL> 

Or, for half hour timeslots, x is 60*30=1800; y is 2*24=48:

SQL> select sysdate,
  2         trunc(sysdate)
  3       + floor(to_char(sysdate,'sssss') / 1800) / 48 floor_stamp,
  4         trunc(sysdate)
  5       + round(to_char(sysdate,'sssss') / 1800) / 48 round_stamp,
  6         trunc(sysdate)
  7       + ceil(to_char(sysdate,'sssss') / 1800) / 48 ceil_stamp
  8  from   dual;

SYSDATE           FLOOR_STAMP       ROUND_STAMP       CEIL_STAMP
----------------- ----------------- ----------------- -----------------
30-APR-2012 10:19 30-APR-2012 10:00 30-APR-2012 10:30 30-APR-2012 10:30

SQL> 
Follow

Get every new post delivered to your Inbox.

Join 72 other followers