Adaptive Log File Sync: Oracle, Please Don't Do That Again
Disclaimer: Much of what follows is pure speculation on my part. It could be completely wrong, and I'm putting it out there in the hopes that it'll eventually be proven one way or the other.
The Summary
- Underscore parameter _use_adaptive_log_file_sync
- Default value changed in 11.2.0.3 from FALSE to TRUE
- Dynamic parameter
- Enables a new method of communication for LGWR to notify foreground processes of commit
- Old method used semaphores, LGWR had to explicitly "post" every waiting process
- New method has the FG processes sleep and "poll" to see if commit is complete
- Advantage is to free LGWR from CPU work required to inform lots of processes about commits
- LGWR dynamically switches between old and new method based on load and responsiveness
- Method can switch frequently at runtime, max frequency is 3 switches per minute (configurable)
- Switch is logged in LGWR tracefile, we have seen several switches per day
- Few problems in general, possible issues seem to be in RAC and/or the switching process itself
The Story
We're working with a customer who had a very successful Exadata go-live last weekend. They moved a large application mostly unchanged from 9i to Exadata - and we've generally had very positive feedback. (!!) It might be that just getting on modern hardware and software accounts for this, but it still gives everyone a good feeling. Nevertheless, there are always a few little adventures. One of these was a mysterious orange glob on the Cloud Control radar Monday at 1am. Orange is the "commit" class - so this was a little surprising! (And reminiscent of Karl Arao's famous halloween monster!) A few drill-down clicks and we're looking at "log file sync" events. To make a long story short, it wasn't any of the usual suspects, and the necessary diagnostics info wasn't available after the fact to say what happened with certainty. We've done a fairly comprehensive incident report and it hasn't recurred... So I'm happy for now. Anyway, I haven't gotten to the really fun part yet. The really fun part is the #everydaylearning as Yury would say. In the process of analyzing the orange glob, I ended up investigating whether it might be related to a new feature called Adaptive Log File Sync. I'd never heard of this before so I had quite a bit of fun learning, and it seemed worthwhile to share a few things I found. Before I dive in, one other important thing: I have to mention Christo Kutrovsky. We were both on-site and digging into this feature at the same time. There was a constant exchange of competing theories and different ideas about how things might work. Together we assembled a strong, in-depth, and working theory very, very quickly - I love working on a great team.Log File Sync
First off, there are a few basic starting points for anyone troubleshooting time spent in the log file sync event.- For past problems, use AWR reports and ASH data.
- For live problems, use sql trace and Tanel's snapper.
- Read Riyaj Shamsudeen's excellent article about Tuning 'log file sync' Event Waits.
- Read Kevin Closson's excellent article about LGWR processing.
- Note 34592.1 is a generic reference note explaining the wait event.
- Note 137696.1 is a useful troubleshooting guide.
- Note 1064487.1 is a useful script to collect some diagnostic info for LFS troubleshooting.
Adaptive Log File Sync
Note 1462942.1 describes a feature whereby LGWR can switch between log write methods. This feature is enabled through an underscore parameter called _use_adaptive_log_file_sync and the description of this parameter is: adaptively switch between post/wait and polling. Now I'll be honest: searching on the internet and searching in Oracle Support's KnowledgeBase for "adaptive log file sync" yields almost nothing. I do believe in the principle of BAAG - but this is a case where some guesswork might be useful, especially to guide experimentation that could nail down more concrete answers. Hence the disclaimer at the top of this article. The words "post" and "wait" indicate that we're talking about semaphores. For some general background on semaphores, check out the Wikipedia article and for more detail about the unix post() and wait() calls, a good resource is chapter 30 from Andrea Arpaci-Dusseau's (UW Madison) textbook on Operating Systems. Oracle uses semaphores extensively. In fact, LGWR and commits are specifically mentioned as an example in the Performance Tuning Guide - and Oracle even has an API to replace semaphore use with a third-party driver for lightweight post-wait implementations. (Like an ODM for your OS kernel.) If you look for references to "commit" in the Oracle docs, you'll find the word "post" everywhere when they talk about communication between the foreground processes and LGWR. Now, remember that a COMMIT has two options: first, IMMEDIATE or BATCH and second, WAIT or NOWAIT. It looks like this to me:- Immediate: FG process will post to LGWR, triggering I/O (default)
- Batch: FG process will not post LGWR
- Wait: LGWR will post FG process when I/O is complete (default)
- NoWait: LGWR will not post FG process when I/O is complete
"LGWR is unable to post the processes fast enough, due to excessive commits. It is quite possible that there is no starvation for CPU or memory, and that I/O performance is decent enough. Still, if there are excessive commits, LGWR has to perform many writes/semctl calls, and this can increase ‘log file sync’ waits. This can also result in sharp increase in redo wastage statistics."Maybe Oracle is reading Riyaj's blog? It appears that they came up with a new algorithm where LGWR doesn't post. My guess: foreground processes can probably still post LGWR - but LGWR never posts back. Instead, foreground processes in WAIT mode "poll" either a memory structure or LGWR itself. It could be an in-house implementation by Oracle, it could still use semaphores, or it could use message queues somehow [seems like a long shot but the unix poll() call is found there]. There's one interesting challenge that I can think of in implementing this. With the semaphore approach, all the active commits are *sleeping* (off the CPU) while LGWR flushes the log buffer. There could be dozens or hundreds of foreground processes simultaneously commiting on a very busy system. If we switch to a polling method, how do we ensure that these hundreds of processes don't start spinning and steal CPU from LGWR, making the whole system even worse than it was in the beginning? The answer might lie in a quick search for "adaptive_log_file_sync" from the underscore parameters. There are five more hidden parameters with that string:
select a.ksppinm name, b.ksppstvl value, a.ksppdesc description from sys.x$ksppi a, sys.x$ksppcv b where a.indx = b.indx and a.ksppinm like '\_%adaptive\_log%' escape '\' order by name
Name | Default | Description |
---|---|---|
_adaptive_log_file_sync_high_switch_freq_threshold | 3 | Threshold for frequent log file sync mode switches (per minute) |
_adaptive_log_file_sync_poll_aggressiveness | 0 | Polling interval selection bias (conservative=0, aggressive=100) |
_adaptive_log_file_sync_sched_delay_window | 60 | Window (in seconds) for measuring average scheduling delay |
_adaptive_log_file_sync_use_polling_threshold | 200 | Ratio of redo synch time to expected poll time as a percentage |
_adaptive_log_file_sync_use_postwait_threshold | 50 | Percentage of foreground load from when post/wait was last used |
Name | Default | Description |
---|---|---|
_fg_sync_sleep_usecs | 0 | Log file sync via usleep |
*** 2012-10-16 01:47:50.289 kcrfw_update_adaptive_sync_mode: poll->post current_sched_delay=0 switch_sched_delay=1 current_sync_count_delta=8 switch_sync_count_delta=59 *** 2012-10-16 01:47:50.289 Log file sync switching to post/wait Current approximate redo synch write rate is 2 per sec *** 2012-10-16 02:51:19.285 kcrfw_update_adaptive_sync_mode: post->poll long#=51 sync#=352 sync=4600 poll=1061 rw=500 rw+=500 ack=7 min_sleep=1061 *** 2012-10-16 02:51:19.285 Log file sync switching to polling Current scheduling delay is 1 usec Current approximate redo synch write rate is 117 per secI mentioned that the underscore parameters were introduced in 11g; however, my current understanding is that only in 11.2.0.3 did the default value of _use_adaptive_log_file_sync change to true!! Is it comforting that they waited so long, or is it scary that they'd make this change with no available documentation or troubleshooting information? I guess it depends on your perspective. Here's mine:
Explanation Please?
- This is a change of critical, core code that impacted every single database installation and upgrade.
- I think it is a strange departure from Oracle’s normal practices to change such important code with absolutely no announcement, documentation, or technical troubleshooting information even in their customer-only support database.
- In my opinion this requires explanation; they can do (and have done) much better.
Share this
You May Also Like
These Related Stories
Merge Replication Identity Range Management - An Identity Disorder? Part 1
Merge Replication Identity Range Management - An Identity Disorder? Part 1
Jan 24, 2014
10
min read
Pythian Postcards: Selamat Datang Ke Singapore!
Pythian Postcards: Selamat Datang Ke Singapore!
Nov 4, 2020
9
min read
Getting started with Orchestrator
Getting started with Orchestrator
Jun 22, 2018
5
min read
No Comments Yet
Let us know what you think