Pythian Blog: Technical Track

Reading execution plans for parallel DML

Parallel execution in Oracle seems to be a topic of myths, mysteries and misunderstandings. In this post I hope to clear up at least one of them by diving into a discussion about parallel DML. Actually, I want to show you two things that can go wrong with it. My demo setup and case are pretty simple, I just create two copies of dba_objects:
SQL> create table a as select * from dba_objects;
 
 Table created.
 
 SQL> create table b as select * from dba_objects;
 
 Table created.
Now let's delete from one of the tables with a subselect to the other one, and let's make it parallel. When, if and how it makes sense to actually use parallel DML for a speedup is not the point of this post. In fact, this is a poor example as the overhead of extra communication and spinning up the processes just made the actual delete run slower.
SQL> delete /*+ PARALLEL(a 42) */ from a where object_id in (select object_id from b);
 
 91277 rows deleted.
 
 SQL> rollback;
And at first glance the execution plan for it seems to confirm that the operation was performed in parallel:
PLAN_TABLE_OUTPUT
 -------------------------------------
 SQL_ID 1q2m84vq594pj, child number 0
 -------------------------------------
 delete /*+ PARALLEL(a 42) */ from a where object_id in (select
 object_id from b)
 
 Plan hash value: 1475376193
 
 --------------------------------------------------------------------------------------------------------------------
 | Id | Operation  | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
 --------------------------------------------------------------------------------------------------------------------
 | 0 | DELETE STATEMENT  |  | | | 855 (100)|  | | |  |
 | 1 | DELETE  | A  | | |  |  | | |  |
 | 2 | PX COORDINATOR  |  | | |  |  | | |  |
 | 3 | PX SEND QC (RANDOM)  | :TQ10002 | 91277 | 891K| 855 (1)| 00:00:01 | Q1,02 | P->S | QC (RAND) |
 |* 4 | HASH JOIN SEMI BUFFERED|  | 91277 | 891K| 855 (1)| 00:00:01 | Q1,02 | PCWP |  |
 | 5 | PX RECEIVE  |  | 91277 | 445K| 427 (1)| 00:00:01 | Q1,02 | PCWP |  |
 | 6 | PX SEND HASH  | :TQ10000 | 91277 | 445K| 427 (1)| 00:00:01 | Q1,00 | P->P | HASH  |
 | 7 | PX BLOCK ITERATOR |  | 91277 | 445K| 427 (1)| 00:00:01 | Q1,00 | PCWC |  |
 |* 8 | TABLE ACCESS FULL | A  | 91277 | 445K| 427 (1)| 00:00:01 | Q1,00 | PCWP |  |
 | 9 | PX RECEIVE  |  | 91278 | 445K| 427 (1)| 00:00:01 | Q1,02 | PCWP |  |
 | 10 | PX SEND HASH  | :TQ10001 | 91278 | 445K| 427 (1)| 00:00:01 | Q1,01 | S->P | HASH  |
 | 11 | PX SELECTOR  |  | | |  |  | Q1,01 | SCWC |  |
 | 12 | TABLE ACCESS FULL | B  | 91278 | 445K| 427 (1)| 00:00:01 | Q1,01 | SCWP |  |
 --------------------------------------------------------------------------------------------------------------------
But a closer look reveals that the actual delete that we wanted to run in parallel is performed only after (above and to the left in the execution plan) the parallel part coordinator is done with the query. Also, because we only specified parallelism for Table A, the scan of Table B is being performed sequentially. The results are then distributed to parallel workers, and hash joined against the parallel query of A. Those results are buffered and returned to the coordinator which then serially performs the actual deletes. Sounds silly? It is. And the second issue is that even though the hint gets ignored for the DELETE part, it is still applied to the query and join, forcing a half-parallel operation.

Enable parallel DML mode

The issue to be aware of with parallel DML in Oracle is that it is disabled by default and you have to explicitly enable it in the session before running parallel DML operations. And at least DBMS_XPLAN in 12c will also warn you about not using parallel DML in it's output:
Note
 -----
  - Degree of Parallelism is 42 because of hint
  - PDML is disabled in current session
So let's try the same thing again, but enable parallel DML this time:
SQL> alter session enable parallel dml;
 
 Session altered.
 
 SQL> delete /*+ PARALLEL(a 42) */ from a where object_id in (select object_id from b);
 
 91277 rows deleted.
 
 SQL> rollback;
 
 PLAN_TABLE_OUTPUT
 -------------------------------------
 SQL_ID 1q2m84vq594pj, child number 1
 -------------------------------------
 delete /*+ PARALLEL(a 42) */ from a where object_id in (select
 object_id from b)
 
 Plan hash value: 2691386263
 
 --------------------------------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
 --------------------------------------------------------------------------------------------------------------------------
 | 0 | DELETE STATEMENT |  |  |  | 438 (100)|  |  |  | |
 | 1 | PX COORDINATOR |  |  |  | |  |  |  | |
 | 2 | PX SEND QC (RANDOM) | :TQ10003 | 91277 | 891K| 438 (1)| 00:00:01 | Q1,03 | P->S | QC (RAND) |
 | 3 | DELETE | A  |  |  | |  | Q1,03 | PCWP | |
 | 4 | PX RECEIVE |  | 91277 | 891K| 438 (1)| 00:00:01 | Q1,03 | PCWP | |
 | 5 | PX SEND HASH (BLOCK ADDRESS)| :TQ10002 | 91277 | 891K| 438 (1)| 00:00:01 | Q1,02 | P->P | HASH (BLOCK|
 |* 6 | HASH JOIN SEMI BUFFERED |  | 91277 | 891K| 438 (1)| 00:00:01 | Q1,02 | PCWP | |
 | 7 | PX RECEIVE |  | 91277 | 445K| 11 (0)| 00:00:01 | Q1,02 | PCWP | |
 | 8 | PX SEND HASH | :TQ10000 | 91277 | 445K| 11 (0)| 00:00:01 | Q1,00 | P->P | HASH |
 | 9 | PX BLOCK ITERATOR |  | 91277 | 445K| 11 (0)| 00:00:01 | Q1,00 | PCWC | |
 |* 10 |  TABLE ACCESS FULL | A  | 91277 | 445K| 11 (0)| 00:00:01 | Q1,00 | PCWP | |
 | 11 | PX RECEIVE |  | 91278 | 445K| 427 (1)| 00:00:01 | Q1,02 | PCWP | |
 | 12 | PX SEND HASH | :TQ10001 | 91278 | 445K| 427 (1)| 00:00:01 | Q1,01 | S->P | HASH |
 | 13 | PX SELECTOR |  |  |  | |  | Q1,01 | SCWC | |
 | 14 |  TABLE ACCESS FULL | B  | 91278 | 445K| 427 (1)| 00:00:01 | Q1,01 | SCWP | |
 --------------------------------------------------------------------------------------------------------------------------
Now we see the DELETE operation below the PX COORDINATOR, which means it gets executed in parallel by the parallel workers. B is still read serially because we only specified table A in the hint. Let me just add this 100% parallel plan for the sake of completeness...
PLAN_TABLE_OUTPUT
 -------------------------------------
 SQL_ID 02w21a0rgz02m, child number 1
 -------------------------------------
 delete /*+ PARALLEL(42) */ from a where object_id in (select object_id
 from b)
 
 Plan hash value: 149866034
 
 --------------------------------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
 --------------------------------------------------------------------------------------------------------------------------
 | 0 | DELETE STATEMENT |  |  |  | 23 (100)|  |  |  | |
 | 1 | PX COORDINATOR |  |  |  | |  |  |  | |
 | 2 | PX SEND QC (RANDOM) | :TQ10003 | 91277 | 891K| 23 (0)| 00:00:01 | Q1,03 | P->S | QC (RAND) |
 | 3 | DELETE | A  |  |  | |  | Q1,03 | PCWP | |
 | 4 | PX RECEIVE |  | 91277 | 891K| 23 (0)| 00:00:01 | Q1,03 | PCWP | |
 | 5 | PX SEND HASH (BLOCK ADDRESS)| :TQ10002 | 91277 | 891K| 23 (0)| 00:00:01 | Q1,02 | P->P | HASH (BLOCK|
 |* 6 | HASH JOIN SEMI BUFFERED |  | 91277 | 891K| 23 (0)| 00:00:01 | Q1,02 | PCWP | |
 | 7 | PX RECEIVE |  | 91277 | 445K| 11 (0)| 00:00:01 | Q1,02 | PCWP | |
 | 8 | PX SEND HASH | :TQ10000 | 91277 | 445K| 11 (0)| 00:00:01 | Q1,00 | P->P | HASH |
 | 9 | PX BLOCK ITERATOR |  | 91277 | 445K| 11 (0)| 00:00:01 | Q1,00 | PCWC | |
 |* 10 |  TABLE ACCESS FULL | A  | 91277 | 445K| 11 (0)| 00:00:01 | Q1,00 | PCWP | |
 | 11 | PX RECEIVE |  | 91278 | 445K| 11 (0)| 00:00:01 | Q1,02 | PCWP | |
 | 12 | PX SEND HASH | :TQ10001 | 91278 | 445K| 11 (0)| 00:00:01 | Q1,01 | P->P | HASH |
 | 13 | PX BLOCK ITERATOR |  | 91278 | 445K| 11 (0)| 00:00:01 | Q1,01 | PCWC | |
 |* 14 |  TABLE ACCESS FULL | B  | 91278 | 445K| 11 (0)| 00:00:01 | Q1,01 | PCWP | |
 --------------------------------------------------------------------------------------------------------------------------

Conclusion

Keep in mind that parallel DML has to be enabled at the session level specifically. And when reviewing execution plans with parallel DML, make sure the update, insert or delete operation is below the PX coordinator or else you will serialize again before performing the actual operation.

No Comments Yet

Let us know what you think

Subscribe by email