Parallel Window Consolidator Calls The Stops
November 23, 2020 Leave a comment
Sharing observations of performance issue arising out of testing 19.6 upgrade from 11.2.0.4 (bug possibly from 12+ looking at the fixes which work).
Sharing this one in particular as it doesn’t seem that common from anecdotal evidence (forums, blogs, support), it doesn’t appear fixed yet, progress with support has been disappointing, however circumstances don’t seem particularly niche.
We’ve had a few issues with a number of parallel queries just “hanging” forever.
Issue seems to be related to WINDOW CONSOLIDATOR and the parallel distribution method.
Haven’t been able to find a good matching bug via Oracle Support (nor happy with the progress of SR with them) but found a good match on a Lothar Flatz blog post which led me to the PQ_DISTRIBUTE_WINDOW as cause.
SQL contains parallel + join to view, view contains analytic
SQL Executes in under a second “normally”.
Without a fix, the SQL will just hang for a long time. The longest I’ve left it is 3 days but given the wait states of the sessions involved, there’s no reason to think it would stop of its own accord.
Jumping ahead to the workarounds which work, any of the following:
- Turn off fix_control 13345888 via alter session or opt_param
- Turn off “_adaptive_window_consolidator_enabled” via alter session/system or opt_param
- Hint a PQ distribution method other than 2. 3 doesn’t work here so 1 is the other option – PQ_DISTRIBUTE_WINDOW(@query_block 1)
Here we see some evidence from just now, running for nearly 30 minutes:
select s.inst_id i,s.sid,s.serial#,s.module
, s.blocking_session blks,s.blocking_instance bi,s.final_blocking_session f_blk_s, s.final_blocking_instance fbi
, CASE WHEN state != 'WAITING' THEN 'WORKING' ELSE 'WAITING' END AS state
, CASE WHEN state != 'WAITING' THEN 'On CPU / runqueue' ELSE event END AS sw_event, s.seconds_in_wait wait
, s.sql_id,s.sql_exec_id,to_char(s.sql_exec_start,'DD-MON-YYYY HH24:MI:SS') exec_start
from gv$session s
where sql_id = '5xyrp7v4djagv';
I SID SERIAL# MODULE BLKS BI F_BLK_S FBI STATE SW_EVENT WAIT SQL_ID SQL_EXEC_ID EXEC_START
---------- ---------- ---------- -------------------- ---------- ---------- ---------- ---------- ------- ------------------------- ---------- ------------- ----------- -----------------------------
2 2805 15947 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 3095 3608 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 3367 28066 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 5610 8452 SQL Developer WAITING PX Deq: Table Q Normal 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 5885 54481 SQL Developer 9672 2 9672 2 WAITING PX Deq: Execute Reply 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 8828 57832 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 9111 37143 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 9383 50792 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
2 9672 29993 SQL Developer WAITING PX Deq: Execution Msg 1492 5xyrp7v4djagv 33554434 26-OCT-2020 12:18:59
Real-time SQL Monitoring report looks per below (note Duration compared to any other progress/metrics)… note that in later versions of Oracle, RTSM has a habit of timing out and reporting DONE(ERROR) even though the SQL is still going:
Global Information
------------------------------
Status : EXECUTING
Instance ID : 2
SQL ID : 5xyrp7v4djagv
SQL Execution ID : 33554434
Execution Started : 10/26/2020 12:18:59
First Refresh Time : 10/26/2020 12:18:59
Last Refresh Time : 10/26/2020 12:18:59
Duration : 1592s
Module/Action : SQL Developer/-
Service : LNTDH8U.UK.DB.COM
Program : SQL Developer
Global Stats
=========================================
| Elapsed | Cpu | Other | Buffer |
| Time(s) | Time(s) | Waits(s) | Gets |
=========================================
| 0.00 | 0.00 | 0.00 | 3 |
=========================================
Parallel Execution Details (DOP=4 , Servers Allocated=8)
==========================================================================================
| Name | Type | Server# | Elapsed | Cpu | Other | Buffer | Wait Events |
| | | | Time(s) | Time(s) | Waits(s) | Gets | (sample #) |
==========================================================================================
| PX Coordinator | QC | | 0.00 | 0.00 | 0.00 | 3 | |
| p000 | Set 1 | 1 | | | | | |
| p001 | Set 1 | 2 | | | | | |
| p002 | Set 1 | 3 | | | | | |
| p003 | Set 1 | 4 | | | | | |
| p004 | Set 2 | 1 | | | | | |
| p005 | Set 2 | 2 | | | | | |
| p006 | Set 2 | 3 | | | | | |
| p007 | Set 2 | 4 | | | | | |
==========================================================================================
SQL Plan Monitoring Details (Plan Hash Value=3844894891)
===============================================================================================================================================================================================
| Id | Operation | Name | Rows | Cost | Time | Start | Execs | Rows | Activity | Activity Detail |
| | | | (Estim) | | Active(s) | Active | | (Actual) | (%) | (# samples) |
===============================================================================================================================================================================================
| 0 | SELECT STATEMENT | | | | | | 1 | | | |
| -> 1 | PX COORDINATOR | | | | 1 | +0 | 1 | 0 | 100.00 | Cpu (1) |
| 2 | PX SEND QC (RANDOM) | :TQ20006 | 2M | 445 | | | | | | |
| 3 | HASH JOIN | | 2M | 445 | | | | | | |
| 4 | PART JOIN FILTER CREATE | :BF0000 | 232 | 441 | | | | | | |
| 5 | PX RECEIVE | | 232 | 441 | | | | | | |
| 6 | PX SEND BROADCAST | :TQ20005 | 232 | 441 | | | | | | |
| 7 | VIEW | | 232 | 441 | | | | | | |
| 8 | WINDOW CONSOLIDATOR BUFFER | | 232 | 441 | | | | | | |
| 9 | PX RECEIVE | | 232 | 441 | | | | | | |
| 10 | PX SEND HASH | :TQ20004 | 232 | 441 | | | | | | |
| 11 | WINDOW BUFFER | | 232 | 441 | | | | | | |
| 12 | HASH JOIN | | 232 | 441 | | | | | | |
| 13 | NESTED LOOPS | | 505 | 4 | | | | | | |
| 14 | TABLE ACCESS BY GLOBAL INDEX ROWID | D_B | 1 | 3 | | | | | | |
| 15 | PX RECEIVE | | 1 | 2 | | | | | | |
| 16 | PX SEND HASH (BLOCK ADDRESS) | :TQ20002 | 1 | 2 | | | | | | |
| 17 | PX SELECTOR | | | | | | | | | |
| 18 | INDEX UNIQUE SCAN | D_B_PK | 1 | 2 | | | | | | |
| 19 | INDEX FULL SCAN | D_R_UK01 | 505 | 1 | | | | | | |
| 20 | PX RECEIVE | | 108 | 437 | | | | | | |
| 21 | PX SEND BROADCAST | :TQ20003 | 108 | 437 | | | | | | |
| 22 | VIEW | V_V2 | 108 | 437 | | | | | | |
| 23 | WINDOW CONSOLIDATOR BUFFER | | 108 | 437 | | | | | | |
| 24 | PX RECEIVE | | 108 | 437 | | | | | | |
| 25 | PX SEND HASH | :TQ20001 | 108 | 437 | | | | | | |
| 26 | WINDOW SORT | | 108 | 437 | | | | | | |
| 27 | NESTED LOOPS | | 108 | 436 | | | | | | |
| 28 | NESTED LOOPS | | 108 | 4 | | | | | | |
| 29 | TABLE ACCESS BY GLOBAL INDEX ROWID | D_B | 1 | 3 | | | | | | |
| 30 | PX RECEIVE | | 1 | 2 | | | | | | |
| 31 | PX SEND HASH (BLOCK ADDRESS) | :TQ20000 | 1 | 2 | | | | | | |
| 32 | PX SELECTOR | | | | | | | | | |
| 33 | INDEX UNIQUE SCAN | D_B_PK | 1 | 2 | | | | | | |
| 34 | INDEX RANGE SCAN | D_BS_UK01 | 108 | 1 | | | | | | |
| 35 | PX COORDINATOR | | | | | | | | | |
| 36 | PX SEND QC (RANDOM) | :TQ10002 | 1 | 4 | | | | | | |
| 37 | BUFFER SORT | | 2M | | | | | | | |
| 38 | VIEW | V_V1 | 1 | 4 | | | | | | |
| 39 | UNION ALL PUSHED PREDICATE | | | | | | | | | |
| 40 | TABLE ACCESS BY GLOBAL INDEX ROWID | D_S | 1 | 2 | | | | | | |
| 41 | BUFFER SORT | | | | | | | | | |
| 42 | PX RECEIVE | | 1 | 1 | | | | | | |
| 43 | PX SEND HASH (BLOCK ADDRESS) | :TQ10000 | 1 | 1 | | | | | | |
| 44 | PX SELECTOR | | | | | | | | | |
| 45 | INDEX UNIQUE SCAN | D_S_PK | 1 | 1 | | | | | | |
| 46 | TABLE ACCESS BY INDEX ROWID | D_S | 1 | 2 | | | | | | |
| 47 | BUFFER SORT | | | | | | | | | |
| 48 | PX RECEIVE | | 1 | 1 | | | | | | |
| 49 | PX SEND HASH (BLOCK ADDRESS) | :TQ10001 | 1 | 1 | | | | | | |
| 50 | PX SELECTOR | | | | | | | | | |
| 51 | INDEX UNIQUE SCAN | D_S | 1 | 1 | | | | | | |
| 52 | PX BLOCK ITERATOR ADAPTIVE | | 8255 | 3 | | | | | | |
| 53 | TABLE ACCESS STORAGE FULL | F_FACT | 8255 | 3 | | | | | | |
===============================================================================================================================================================================================
ASH says it’s not doing anything (as we’d expect it to, being consistent with RTSM report above).
select count(*), min(sample_time), max(sample_time), sysdate from gv$active_session_history where sql_id = '5xyrp7v4djagv';
COUNT(*) MIN(SAMPLE_TIME) MAX(SAMPLE_TIME) SYSDATE
---------- ---------------------------- ---------------------------- --------------------
1 26-OCT-20 12.18.59.828000000 26-OCT-20 12.18.59.828000000 26-OCT-2020 12:50:29
Miscellaneous Slave Info
select
decode(px.qcinst_id,NULL,'QC',
' - '||lower(substr(pp.SERVER_NAME,
length(pp.SERVER_NAME)-4,4) ) )"Username",
decode(px.qcinst_id,NULL, 'QC', '(Slave)') "QC/Slave" ,
to_char( px.server_set) "SlaveSet",
to_char(s.sid) "SID",
to_char(px.inst_id) "Slave INST",
decode(px.qcinst_id, NULL ,to_char(s.sid) ,px.qcsid) "QC SID",
to_char(px.qcinst_id) "QC INST",
px.req_degree "Req. DOP",
px.degree "Actual DOP"
from gv$px_session px,
gv$session s ,
gv$px_process pp
where px.sid=s.sid (+)
and px.serial#=s.serial#(+)
and px.inst_id = s.inst_id(+)
and px.sid = pp.sid (+)
and px.serial#=pp.serial#(+)
and s.sql_id = '5xyrp7v4djagv'
order by 6 , 1 desc
/
Usernam QC/Slav SlaveSet SID Slave INST QC SID QC INST Req. DOP Actual DOP
------- ------- ---------- --------- ---------- ---------- -------- ------------- -----------
QC QC 5885 2 5885
- p007 (Slave) 2 9672 2 5885 2 4 4
- p006 (Slave) 2 3367 2 5885 2 4 4
- p005 (Slave) 2 9395 2 5885 2 4 4
- p004 (Slave) 2 3095 2 5885 2 4 4
- p003 (Slave) 1 9111 2 5885 2 4 4
- p002 (Slave) 1 2805 2 5885 2 4 4
- p001 (Slave) 1 8828 2 5885 2 4 4
- p000 (Slave) 1 5610 2 5885 2 4 4
Recent Comments