
Oracle parallel query hints reference - part 4: GBY_PUSHDOWN

Welcome to part 4 of the series. GBY_PUSHDOWN is a very interesting hint. Introduced in 10.2, it doesn't carry the PQ prefix, but it nevertheless plays a crucial role in parallel query plans.

Purpose: when GBY_PUSHDOWN is activated, an extra "aggregation" step is added to the plan before the PQ workers that read the data re-shuffle it and send it to the PQ receivers.

Example without the pushdown:

[code]
select /*+PARALLEL(4) NO_GBY_PUSHDOWN */ mod5_id, count(*) from tlarge group by mod5_id;

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |          |      5 |    15 |   640   (1)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR                    |          |        |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)              | :TQ10001 |      5 |    15 |   640   (1)| 00:00:01 |  Q1,01 | P->S | QC (RAND)  |
|   3 |    HASH GROUP BY                   |          |      5 |    15 |   640   (1)| 00:00:01 |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE                     |          |    100K|   292K|   638   (0)| 00:00:01 |  Q1,01 | PCWP |            |
|   5 |      PX SEND HASH                  | :TQ10000 |    100K|   292K|   638   (0)| 00:00:01 |  Q1,00 | P->P | HASH       |
|   6 |       PX BLOCK ITERATOR            |          |    100K|   292K|   638   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|   7 |        TABLE ACCESS STORAGE FULL   | TLARGE   |    100K|   292K|   638   (0)| 00:00:01 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------------------------
[/code]

Example with the pushdown:

[code]
select /*+PARALLEL(4) GBY_PUSHDOWN */ mod5_id, count(*) from tlarge group by mod5_id;

-----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name     | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |          |      5 |    15 |   640   (1)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR                    |          |        |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)              | :TQ10001 |      5 |    15 |   640   (1)| 00:00:01 |  Q1,01 | P->S | QC (RAND)  |
|   3 |    HASH GROUP BY                   |          |      5 |    15 |   640   (1)| 00:00:01 |  Q1,01 | PCWP |            |
|   4 |     PX RECEIVE                     |          |      5 |    15 |   640   (1)| 00:00:01 |  Q1,01 | PCWP |            |
|   5 |      PX SEND HASH                  | :TQ10000 |      5 |    15 |   640   (1)| 00:00:01 |  Q1,00 | P->P | HASH       |
|   6 |       HASH GROUP BY                |          |      5 |    15 |   640   (1)| 00:00:01 |  Q1,00 | PCWP |            |
|   7 |        PX BLOCK ITERATOR           |          |    100K|   292K|   638   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|   8 |         TABLE ACCESS STORAGE FULL  | TLARGE   |    100K|   292K|   638   (0)| 00:00:01 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------------------------
[/code]

Note the extra step, Id 6 HASH GROUP BY. Each PQ reader reads a set of blocks and aggregates the data by mod5_id in its local memory space. As a reminder, there are only five distinct values of mod5_id, so the memory needed for that aggregation is very small. Once a PQ reader has no more work assigned to it, it distributes its already-aggregated rows based on the hash key for the final aggregation.
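If you want to try this yourself, a table along the following lines should produce similar plans. The exact DDL for TLARGE isn't shown here, so the column list, data distribution, and row count below are assumptions inferred from the plans (100K rows, five distinct values of mod5_id):

[code]
-- Hypothetical test table: an approximation matching the cardinalities in the plans above
-- (100K rows, 5 groups). The real TLARGE definition may differ.
create table tlarge as
select rownum              as id,
       mod(rownum, 5)      as mod5_id,
       rpad('x', 100, 'x') as padding
  from dual
connect by level <= 100000;

exec dbms_stats.gather_table_stats(user, 'TLARGE')
[/code]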
This is best illustrated with the communication report for each query, taken from V$PQ_TQSTAT. Note that V$PQ_TQSTAT can only be queried from the session that ran the SQL. Here is the query I use against it:

[code]
-- V$PQ_TQSTAT query
select dfo_number "d",
       tq_id as "t",
       server_type,
       num_rows,
       rpad('x', round(num_rows*10/nullif(max(num_rows) over (partition by dfo_number, tq_id, server_type), 0)), 'x') as "pr",
       round(bytes/1024/1024) mb,
       process,
       instance i,
       round(ratio_to_report(num_rows) over (partition by dfo_number, tq_id, server_type)*100) as "%",
       open_time,
       avg_latency,
       waits,
       timeouts,
       round(bytes/nullif(num_rows, 0)) as "b/r"
  from v$pq_tqstat
 order by dfo_number, tq_id, server_type desc, process;
[/code]

[code]
select /*+PARALLEL(2) NO_GBY_PUSHDOWN */ mod5_id, count(*) from tlarge t group by mod5_id;

d  t  SERVER_TYPE    NUM_ROWS  pr          MB  PROCESS  I    %  OPEN_TIME  AVG_LATENCY  WAITS  TIMEOUTS  b/r
1  0  Producer          53608  xxxxxxxxxx   0  P002     1   54          0            0     37        18    4
1  0  Producer          46392  xxxxxxxxx    0  P003     1   46          0            0     36        18    4
1  0  Consumer          60000  xxxxxxxxxx   0  P000     1   60          0            0    188       185    4
1  0  Consumer          40000  xxxxxxx      0  P001     1   40          0            0    188       185    4
1  1  Producer              3  xxxxxxxxxx   0  P000     1   60          0            0     24        12   16
1  1  Producer              2  xxxxxxx      0  P001     1   40          0            0     16         8   20
1  1  Consumer              5  xxxxxxxxxx   0  QC       1  100          0            0      3         0   17
[/code]

[code]
select /*+PARALLEL(2) GBY_PUSHDOWN */ mod5_id, count(*) from tlarge t group by mod5_id;

d  t  SERVER_TYPE    NUM_ROWS  pr          MB  PROCESS  I    %  OPEN_TIME  AVG_LATENCY  WAITS  TIMEOUTS  b/r
1  0  Producer              5  xxxxxxxxxx   0  P002     1   50          0            0      0         0   25
1  0  Producer              5  xxxxxxxxxx   0  P003     1   50          0            0      0         0   25
1  0  Consumer              6  xxxxxxxxxx   0  P000     1   60          0            0     84        81   24
1  0  Consumer              4  xxxxxxx      0  P001     1   40          0            0     84        81   28
1  1  Producer              3  xxxxxxxxxx   0  P000     1   60          0            0      2         1   16
1  1  Producer              2  xxxxxxx      0  P001     1   40          0            0      2         1   20
1  1  Consumer              5  xxxxxxxxxx   0  QC       1  100          0            0      3         0   17
[/code]

Notice how, for the query WITHOUT the pushdown, the two producers sent 53,608 and 46,392 rows respectively. Since there are only five distinct values of mod5_id and two consumers, one consumer received three of the five groups (60,000 rows) while the other received two (40,000 rows). For the query WITH the pushdown, the producers sent only five rows in total, dramatically reducing the amount of data exchanged between the parallel query processes.

The drawback of this optimization appears when the number of unique groups identified by the GROUP BY columns is very high (i.e. each group has very few members): the extra aggregation step then does additional work without reducing the data volume much. In most cases, though, the savings are quite significant.

NOTE: This is a cost-based optimization, so it's entirely possible for the optimizer to choose NOT to use it - when in fact it should.
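If you want to see which way the optimizer went without reading through the whole plan, one option (a sketch, assuming you run it in the same session right after the statement) is to dump the cursor's outline with DBMS_XPLAN and look for a GBY_PUSHDOWN entry:

[code]
-- Sketch: check the optimizer's decision for the last statement run in this session.
-- (Assumes SERVEROUTPUT is off, so the cursor of interest is still the "last" one.)
select * from table(dbms_xplan.display_cursor(format => '+OUTLINE'));

-- When the pushdown was chosen, the Outline Data section typically contains a line like
--   GBY_PUSHDOWN(@"SEL$1")
-- and, of course, the plan itself shows the extra HASH GROUP BY below PX SEND HASH.
[/code]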
