Pythian Blog: Technical Track

MERGE Without an INSERT -- It's Not Always Like an UPDATE

Before you proceed with reading this post, I strongly encourage you to read Tom Kyte's trilogy about write consistency, since I'll do only a brief introduction to the subject. The way Oracle ensures UPDATE write consistency is through a mechanism called restart. The restart takes place when UPDATE detects that the current version of the column data used in the WHERE clause no longer matches the consistent version. That is, the data were changed after the UPDATE was started. Let's take a look at an example before we proceed with the main topic of this blog post.
SQL> create table t
  2 (
  3 n number,
  4 m number
  5 );
 
 Table created.
 
 SQL> insert into t values (1,1);
 
 1 row created.
 
 SQL> insert into t values (1,2);
 
 1 row created.
 
 SQL> commit;
 
 Commit complete.
The simplest way to demonstrate that the restart is actually happening is to use the before update row trigger.
SQL> create trigger bu_t before update on t for each row
  2 begin
  3 dbms_output.put_line(rowidtochar(:old.rowid));
  4 end;
  5 /
 
 Trigger created.
All we need now is two sessions doing a concurrent update with our table data:
SQL> select * from t;
 
  N M
 ---------- ----------
  1 1
  1 2
The "order" of rows is important since we would like to block our UPDATE on the second row, after it has updated the first row. Session One
SQL> update t set n=m*3 where m=2;
 
 1 row updated.
Session Two
SQL> set serveroutput on
 SQL> update t set m=n*3 where n=1;
 
 --this session is blocked on the first session
 --commit the first session now and you should see:
 
 AAAMLCAABAAANL6AAA
 AAAMLCAABAAANL6AAB
 AAAMLCAABAAANL6AAA
 
 1 row updated.
Our trigger fired three times for a two-row table and we updated only one row. The trigger started, updated the first row ( one), and got blocked on the second row. After the first session issued a commit, the update proceeded with the second row ( two), but discovered that the data had changed. So UPDATE did a rollback and started the whole process from scratch, assuming the new start time. It passed through first row again ( three), and skipped the second row since it was filtered out by the where clause. The final result from those two transactions should be:
SQL> select * from t;
 
  N M
 ---------- ----------
  1 3
  6 2
Now the question, continuing my idea, " simple things can be not-so-simple". Will there be any difference if we substitute the following MERGE for the last UPDATE?
merge into t
 using (select null from dual)
 on (n=1)
 when matched then update set m=n*3;
This substitution is often used to get rid of an ORA-01779 error when you know you have the right data but can't proceed with an UPDATE due to the data model or some other restrictions. (There is a hint for an update to bypass the ORA-01779 check, but it's undocumented and I would recommend against using it.) Restore the table data, run update from the first session, and then substitute MERGE for UPDATE in a second session:
SQL> merge into t
  2 using (select null from dual)
  3 on (n=1)
  4 when matched then update set m=n*3;
 
 --this session is blocked by on first session
 --commit the first session now and you should see:
 
 AAAMHXAABAAANL6AAA
 AAAMHXAABAAANL6AAB
 
 2 rows merged.
Ok, so where is our restart? And if we look at the data…
SQL> select * from t;
 
  N M
 ---------- ----------
  1 3
  6 3
…it looks different. The MERGE actually did an update to a second row using the consistent value of N. I crafted an update as "multiply the other column by three" on purpose -- you should not have N being equal to M*2 for updated rows. Does that all mean that there is no restart for MERGE at all? No, it doesn't. Revert to the original data set but change the UPDATE in the first session to: Session One
SQL> update t set m=n*3 where m=2;
 
 1 row updated.
Now run the same MERGE in the second session: Session Two
SQL> merge into t
  2 using (select null from dual)
  3 on (n=1)
  4 when matched then update set m=n*3;
 
 --this session is blocked on the first session
 --commit the first session now and you should see:
 
 AAAMHXAABAAANL6AAA
 AAAMHXAABAAANL6AAB
 AAAMHXAABAAANL6AAA
 AAAMHXAABAAANL6AAB
 
 2 rows merged.
MERGE has a restart, but not for columns listed in the on part of the statement. So we cannot substitute our original update with this MERGE. Is there a way to write a MERGE that behave the same as our update? Yes there is, and the previous statement can give us some clue. Let's revert to original data set and try again. Session One
SQL> update t set n=m*3 where m=2;
 
 1 row updated.
Session Two
SQL> merge into t
  2 using (select null from dual)
  3 on (1=1)
  4 when matched then update set m=n*3 where n=1;
 
 --this session is blocked on the first session
 --commit the first session now and you should see:
 
 AAAMHXAABAAANL6AAA
 AAAMHXAABAAANL6AAB
 AAAMHXAABAAANL6AAA
 
 1 row merged.
Looks familiar, right? And our data set…
SQL> select * from t;
 
  N M
 ---------- ----------
  1 3
  6 2
…is what we need. Apparently, the MERGE restart happens it two cases. The only thing I don't like about this MERGE is the on (1=1) clause, since it will always lead to an FTS against our table even if a more optimal execution plan exists. To overcome this limitation, we need to combine the best of both worlds and rewrite our MERGE as:
merge into t
 using (select null from dual)
 on (n=1)
 when matched then update set m=n*3 where n=1;
This version can use N as an access predicate, and it has a restart in the desired situation: making MERGE behave as an UPDATE. Now you know that adding a where n=1 (redundant at first sight), in addition to an on (n=1) changes how MERGE handles write consistency. P.S.: All tests were performed on 10.2.0.3.

No Comments Yet

Let us know what you think

Subscribe by email