Pythian Blog: Technical Track

Resolving High Water Enqueue Contention

Recently, I had a few email exchanges — on the Oracle-l list and offline — about HW enqueue contention. A few interesting observations emerged from test cases I created during that discussion.

HW Enqueue

When a session needs access to a resource, it requests a lock on that resource in a specific mode. Internally, lock and resource structures are used to control access to a resource. Enqueues, as the name suggests, have a First In/First Out queuing mechanism. You can find more information about the internals of locks in my paper, printed in 2001, Internals of locks.

Segments have a High Water Mark (HWM) indicating that blocks below that HWM have been formatted. New tables or truncated tables (that is, truncated without a reuse storage clause), have the HWM value set to the segment header block — meaning, there are zero blocks below the HWM. As new rows are inserted or existing rows updated (increasing row length), more blocks are added to the free lists, and the HWM is bumped up to reflect these new blocks. HW enqueues are acquired in Exclusive mode before updating the HWM, and essentially, HW enqueues operate as a serializing mechanism for HWM updates.

In non-ASSM tablespaces, the HWM is bumped up by five blocks at a time. (Actually, the undocumented parameter _bump_highwater_mark_count controls this behavior, and it defaults to five.) Heavy inserts into a table can result in increased HWM activity leading to HW enqueue contention. This issue is prevalent if the table has LOB columns or if the row length is large.

Measuring HW Enqueue Contention

We will use a few test cases to see how the underlying extent size and table structures are affecting HW enqueue contention. But first we need to find a way to measure the total number of gets on HW enqueue. If the total number of gets on the HW enqueue is reduced, enqueue contention can be relieved.

The fixed table x$ksqst stores statistics about the total number of enqueue gets, and the successes and failures of those gets at the instance level. For example, to see total number of gets on HW enqueue, the following query can be used. (The ksqstreq column indicates total number of gets; ksqstwat shows the total number of waits.)

SQL> select ksqstreq, ksqstwat from x$ksqst where ksqsttyp='HW';

  KSQSTREQ   KSQSTWAT
---------- ----------
    546953         50

From Oracle version 10g and above, x$ksqst is externalized as v$enqueue_statistics.

These statistics, however, show activity at the instance level. While we can use these statistics to measure HW enqueue activity, we need to make sure that there is no other session acquiring the HW enqueue. Event 10704 can be used to trace enqueues, and every call to get an enqueue prints a few lines in the trace file. The SQL statement to dump this information to trace file is:

alter session set events '10704 trace name context forever, level 15';

Event 10704 is documented as below:

10704, 00000, "Print out information about what enqueues are being obtained"
// *Cause:  When enabled, prints out arguments to calls to ksqcmi and
//          ksqlrl and the return values.
// *Action: Level indicates details:
//   Level: 1-4: print out basic info for ksqlrl, ksqcmi
//          5-9: also print out stuff in callbacks:  ksqlac, ksqlop
//          10+: also print out time for each line

A few lines from the trace files are printed below. ksq is the internal Oracle module name for enqueues, and ksqgtl is to get locks on a resource. From the lines below, we can see that the HW enqueue is acquired in mode 6, exclusive mode. A timestamp is also printed since we enabled this event at level 15. If we need count the number of HW enqueue gets, we need to count the occurrences of the string HW- in the trace file.

*** 2008-05-04 10:08:35.734
ksqgtl *** HW-00000007-01800014 mode=6 flags=0x11 timeout=21474836 ***
ksqgtl: xcb=0x1E283158, ktcdix=2147483647, topxcb=0x1E283158
	ktcipt(topxcb)=0x0

*** 2008-05-04 10:08:35.734
ksucti: init session DID from txn DID:
ksqgtl:
	ksqlkdid: 0001-0014-00000016

*** 2008-05-04 10:08:35.734
*** ksudidTrace: ksqgtl
	ktcmydid(): 0001-0014-00000016
	ksusesdi:   0000-0000-00000000
	ksusetxn:   0001-0014-00000016
ksqgtl: RETURNS 0

*** 2008-05-04 10:08:35.750
ksqrcl: HW,7,1800014
ksqrcl: returns 0

Now, we can grep for HW- in the trace file, count it, and match that against v$enqueue_statistics. The following test case illustrates this method using an example table:

SQL> select ksqstreq, ksqstwat from x$ksqst where ksqsttyp='HW';

  KSQSTREQ   KSQSTWAT
---------- ----------
    546198         50

SQL> insert into test_hw
     select n, lpad(n, 4000,'a') v1 from
     (select level n from dual connect by level  SQL> SQL> commit;

Commit complete.

SQL> select ksqstreq, ksqstwat from x$ksqst where ksqsttyp='HW';

  KSQSTREQ   KSQSTWAT
---------- ----------
    546953         50

So there is a difference of 755 (546953-546198) total gets to the HW enqueue. Searching for HW- enqueues in the trace file also prints 755.

/oracle/app/oracle/admin/TEST1/udump> grep 'HW-' test1_ora_26668.trc |wc -l
755

So, in my test database, my session is the only session, and we could query v$enqueue_statistics for our test cases.

Test Case

The following test case will be used to see the impact of extent size and segment management on HW enqueue gets. If the total number of HW enqueue gets can be reduced, contention can be relieved.

The script creates a tablespace, creates a table with a LOB column in that tablespace, inserts 9999 rows into that table and prints the difference in total enqueue gets before and after the insert statement.

-- Script enq_hw1.sql -----------
variable v_begin_cnt number
variable v_end_cnt number

prompt
prompt  Tablespace: Locally managed with \&1
prompt
drop tablespace TS_LMT_HW including contents and datafiles;

create tablespace TS_LMT_HW datafile 'D:\ORACLE\ORADATA\ORCL11G\TS_LMT_AUTO_1M_01.DBF'
 size 200M
 extent management local
\&1;

create table test_hw (n1 number ,  c1 clob )     tablespace TS_LMT_HW;

begin
  select total_req# into :v_begin_cnt from v$enqueue_statistics where 	eq_type ='HW';
end;
/
insert into test_hw
 select n, lpad(n, 4000,'a') v1 from
   (select level n from dual connect by level <10000);
commit;
select total_req#, succ_req#, failed_req# from v$enqueue_statistics where 	eq_type ='HW';

begin
  select total_req# into :v_end_cnt from v$enqueue_statistics where 	eq_type ='HW';
end;
/

select :v_end_cnt - :v_begin_cnt diff from dual;

--- script end enq_hw1.sql -----------

This script is called by the following script passing various tablespace attributes, essentially running same test for different types of tablespaces:

spool call_eng_hw1.lst
REM Test cases #1 through #6
@enq_hw1 "uniform size 5M segment space management manual"
@enq_hw1 "uniform size 5M segment space management auto"
@enq_hw1 "uniform size 40K segment space management manual"
@enq_hw1 "uniform size 40K segment space management auto"
@enq_hw1 "autoallocate segment space management manual"
@enq_hw1 "autoallocate segment space management auto"
spool off

Test Results

I tested above script for tablespace in versions 10.2.0.4 and 11.1.0.6. Here are the test results:

test Extent management Segment space management 10.2.0.4 11.1.0.6
#1 uniform size 5M manual 2049 64
#2 uniform size 5M auto 50 48
#3 uniform size 40K manual 6604 6045
#4 uniform size 40K auto 7552 7554
#5 autoallocate manual 2231 279
#6 autoallocate auto 269 291

There are a few key points here.

  1. In test cases #1 and #5, there is a dramatic decrease in enqueue gets between 10g and 11g for a tablespace with segment space management set to “manual”. It looks like a new feature — faster LOB — is kicking in. I tested the above script for a table without LOB column. There is virtually no difference between 10g and 11g if the table has no LOB column. Row length is kept around 3000 bytes for this test.
      create table test_hw (n1 number ,  v1 varchar2(1000), v2 varchar2(1000), v3 varchar2(1000) )     tablespace TS_LMT_HW;
    ...
    declare
       i number;
    begin
      for i in 1 .. 10000
       loop
        insert into test_hw  values ( i, lpad(i, 1000, 'a'), lpad(i, 1000, 'a'), lpad(i, 1000, 'a')  );
    	commit;
       end loop;
    end;
    /
    ...
    test Extent management Segment space management 10.2.0.4 11.1.0.6
    #1 uniform size 5M manual 1020 1019
    #2 uniform size 5M auto 33 31
    #3 uniform size 40K manual 3004 3004
    #4 uniform size 40K auto 3055 3055
    #5 autoallocate manual 1132 1132
    #6 autoallocate auto 163 164
  2. If the tablespace uniform size is too low, then the number of HW enqueue gets increases sharply. Compare test cases #1 & #3 for 11g. The enqueue gets decreased ten times for a tablespace with an appropriate extent size.
  3. In all test cases, automatic segment space management tablespaces performed fewer enqueue gets. In ASSM tablespaces, the HWM is bumped up by a much higher number. I am not suggesting the use of ASSM, since space-waste can increase in ASSM tablespaces. However, if resolving HW contention is a primary issue, then consider ASSM.
  4. Allocating an additional extent with an instance keyword seems to help in non-ASSM tablespace (The behavior is different for an ASSM tablespace; it needs more research before I understand that.) In test case #1 above, the HWM for the table was set at file_id 6, block_id 80, and that extent starting at file 6, block_id 9. That is, 70 blocks were initially below the HWM. After allocating an extent with the “instance 1” keyword, the HWM was bumped up by 1209 blocks. (Yong has a test case for this using the dbms_space package.)From the segment header block, before allocating extent: HWM: 0x01800050 (dba for file 6,block 80).
    alter table test_hw allocate extent (instance 1);

    From the segment header block, after allocating extent: HWM: 0x01802f89 (dba for file 6,block 12169).

    Extent Map
    -----------------------------------------------------------------
    0x0180000a  length: 639
    0x01802d09  length: 640
  5. High HW enqueue contention is also prevalent during upgrades of Oracle applications. During one of our recent upgrades from 11.5.8 to 11.5.0, there was a heavy HW enqueue contention on the sys.source$ table due to the many packages being compiled/added during the upgrade. The only option was to increase _bump_highwater_mark_count during the upgrade to relieve excessive waits on HW enqueue contention. Of course, Oracle support must be contacted before adding any underscore parameters in a production environment. The side effect of setting this parameter is that, for smaller tables, more blocks might be added below the HWM, leading to a full table scan reading more blocks than is necessary.

No Comments Yet

Let us know what you think

Subscribe by email