Pythian Blog: Technical Track

Standby Log Apply Elapsed Time

This has been discussed before by my colleague Paul Moen in his article on Oracle standby recovery rate monitoring, but I recently made a discovery that makes it easier to generate both statistics on log apply performance, and more useful stats too.

First, let me say that this discovery is based on my observations and has not been verified with Oracle Support nor by any insider. If you know one who can confirm this, please ask him or her.

The discovery is a hidden column in the x$kcclh table, which is the underlying table for gv$log_history (and therefore v$log_history). It’s the only column not exposed in the fixed views. Why? I don’t know but it sure would be nice to have it exposed and clearly (officially) defined. The column is x$kcclh.lhtsm. It stores the timestamp of when the log began to be applied. The same timestamp as the one printed in the alert log for the “Media Recovery Log…” message and the same timestamp as the one recorded in the v$dataguard_status.message text column.

The advantage of accessing this information from the x$kcclh table is that you get access to more historical data (the number of days will vary based on your redo log switch rate and the size of the control files and possibly on other factors).

The time difference between two consecutive logs, say 1 and 2, gives us the time spent applying the log 1. At least that’s the theory. Some factors that we can’t really take into account are the log delay set in the log_archive_dest_n parameter on the primary db (it’s an optional setting), and any unexpected events, such as an MRP0 process getting hung or a FAL server having trouble to send a log, etc.

Nevertheless, I feel it is very useful information. Any anomaly can be easily identified as it would come up with an unusual elapse time.

Here’s the query I use to get the log apply elapse time:

set linesize 80
select b.lhseq sequence#, b.lhtsm fromtime, a.lhtsm totime,
 ((to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') - 
   to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS')) * 1440) minutes_to_apply
  from sys.x$kcclh a, sys.x$kcclh b
 where a.lhrid = b.lhrid + 1
 order by a.lhrid
;

Sample output:

SEQUENCE# FROMTIME            TOTIME              MINUTES_TO_APPLY
--------- ------------------- ------------------- ----------------
    97077 09/18/2008 03:28:32 09/18/2008 03:43:31       14.9833333
    97078 09/18/2008 03:43:31 09/18/2008 03:58:31               15
    97079 09/18/2008 03:58:31 09/18/2008 04:13:31               15
    97080 09/18/2008 04:13:31 09/18/2008 04:28:31               15
    97081 09/18/2008 04:28:31 09/18/2008 04:43:33       15.0333333
    97082 09/18/2008 04:43:33 09/18/2008 04:58:33               15
    97083 09/18/2008 04:58:33 09/18/2008 05:13:36            15.05
    97084 09/18/2008 05:13:36 09/18/2008 05:28:33            14.95
    97085 09/18/2008 05:28:33 09/18/2008 05:43:31       14.9666667
    97086 09/18/2008 05:43:31 09/18/2008 05:58:31               15

Here’s a query to show the min, max and average elapse time:

set linesize 140
col min_log_date newline
col max_log_date newline
col max_log_apply_duration for a30 newline
col min_log_apply_duration for a30 newline
col avg_log_apply_duration for a30 newline
select min(lhlot) min_log_date, max(lhlot) max_log_date,
       numtodsinterval(max(to_number(a_lhtsm - b_lhtsm)),'DAY') max_log_apply_duration,
       numtodsinterval(min(to_number(a_lhtsm - b_lhtsm)),'DAY') min_log_apply_duration,
       numtodsinterval(avg(to_number(a_lhtsm - b_lhtsm)),'DAY') avg_log_apply_duration
  from (
        select a.lhlot lhlot, 
               to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') a_lhtsm, 
               to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS') b_lhtsm,
               a.lhrid
          from sys.x$kcclh a, sys.x$kcclh b
         where a.lhrid = b.lhrid + 1
       )
 order by lhrid;

Sample output:

MIN_LOG_DATE        
--------------------
MAX_LOG_DATE
--------------------
MAX_LOG_APPLY_DURATION
------------------------------
MIN_LOG_APPLY_DURATION
------------------------------
AVG_LOG_APPLY_DURATION
------------------------------
09/18/2008 04:28:30  
10/21/2008 15:06:56  
+000000000 04:16:55.999999999  
+000000000 00:00:05.999999999  
+000000000 00:11:47.666911225

The following query shows logs that took over 30 minutes to apply:

select a.lhseq, b.lhtsm fromtime, a.lhtsm totime,
       ((to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') -
         to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS')) * 1440) minutes_to_apply
  from sys.x$kcclh a, sys.x$kcclh b
 where to_date(a.lhlot,'MM/DD/RR HH24:MI:SS') > sysdate - 60
   and a.lhrid = b.lhrid + 1
   and  ((to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') -
          to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS')) * 1440) > 30
order by a.lhrid
;

Sample output:

 LHSEQ FROMTIME             TOTIME               MINUTES_TO_APPLY
------ -------------------- -------------------- ----------------
 98276 09/28/2008 08:04:41  09/28/2008 11:04:41               180
 98278 09/28/2008 11:25:24  09/28/2008 12:07:06              41.7
 98280 09/28/2008 12:15:35  09/28/2008 14:23:01        127.433333
 98294 09/28/2008 14:33:29  09/28/2008 15:13:33        40.0666667
 98297 09/28/2008 15:51:18  09/28/2008 16:31:42              40.4
 98423 09/28/2008 19:36:32  09/28/2008 20:20:56              44.4
 98533 09/28/2008 22:16:22  09/28/2008 22:54:11        37.8166667
 98662 09/29/2008 02:29:11  09/29/2008 03:37:19        68.1333333
 98663 09/29/2008 03:37:19  09/29/2008 04:13:37              36.3
 98664 09/29/2008 04:13:37  09/29/2008 04:44:14        30.6166667
 98669 09/29/2008 05:54:33  09/29/2008 07:19:25        84.8666667
100220 10/13/2008 07:50:41  10/13/2008 08:50:27        59.7666667
100422 10/15/2008 06:10:58  10/15/2008 10:27:54        256.933333
100537 10/16/2008 08:40:24  10/16/2008 10:05:32        85.1333333

I hope you find these queries useful for monitoring your standby database log apply rate and efficiency.

Enjoy!
Marc.

No Comments Yet

Let us know what you think

Subscribe by email