Log Write Methods can Cause 'log file sync' Waits (文件 ID 1462942.1)

yyp2009發表於2015-09-14

Applies to:

Oracle Database - Enterprise Edition - Version 11.2.0.3 and later
Information in this document applies to any platform.

Symptoms

  • The database version is 11.2.0.3
  • The log writer process on the database is blocking most of the other sessions and taking an average wait time of 10ms to commit.
  • On checking the hardware, the I/O wait time is very small from the hardware side
  • This means that all the time is on CPU processing the request.
  • Database is OLTP-based and has high commits/second
  • The same issue did not occur in 10g.
  • The problem seems to 'go away' temporarily after the database is restarted but ultimately returns after a few days.
  • AWR may show 'log file sync' waits as the highest wait:

    Top 5 Timed Foreground Events
    Event            Waits        Time(s)  Avg wait (ms) % DB time Wait Class
    log file sync   845,096        8,747         10          88.05        Commit
  • Hang analysis may show wait on log file sync as well:
    Chains most likely to have caused the hang:
    [a] Chain 1 Signature: 'rdbms ipc message'<='log file sync'
        Chain 1 Signature Hash: 0xcfe404d3
    [b] Chain 2 Signature: 'rdbms ipc message'<='log file sync'
        Chain 2 Signature Hash: 0xcfe404d3
    [c] Chain 3 Signature: 'rdbms ipc message'<='log file sync'
        Chain 3 Signature Hash: 0xcfe404d3
  • The stack of the LGWR process may look like following:

    STACK TRACE:
    ------------
    ksedsts()+380<-ksdxfstk()+52<-ksdxcb()+3524<-sspuser()+140<-__sighndlr()+12<-c
    all_user_handler()+992 <-sigacthandler()+104<-_syscall6()+32<-sskgpwwait()+236<-ksliwat()+1752<-kslwa
    it()+240<-ksarcv()+212 <-ksbabs()+764<-ksbrdp()+1616<-opirip()+1680<-opidrv()+748<-sou2o()+88<-opimai
    _real()+276<-ssthrdmain()+316<-main()+316<-_start()+380


Changes

 The database was upgraded from 10.2.0.4 to 11.2.0.3.1.

Cause

The following bug was filed for this issue:
Bug:13930580 Bug 13930580: LGWR IS BLOCKING SESSIONS

One of the findings in the bug indicated frequent switching back and forth between post/wait and polling as shown in the LGWR trace file:

*** 2012-04-24 19:07:09.319
kcrfw_update_adaptive_sync_mode: poll->post current_sched_delay=0
switch_sched_delay=154 current_sync_count_delta=887
switch_sync_count_delta=837

*** 2012-04-24 19:07:09.319
Log file sync switching to post/wait
Current approximate redo synch write rate is 295 per sec
.
.
.
*** 2012-04-24 22:36:36.917
Warning: log write elapsed time 828ms, size 11KB
kcrfw_update_adaptive_sync_mode: post->poll long#=79 sync#=569 sync=531877
poll=10012 rw=406 rw+=409 ack=0 min_sleep=10012

 
In this case the bug recommended disabling the _use_adaptive_log_file_sync by changing it from its default setting of true to false as a possible solution.
This parameter was introduced in Oracle 11g. However, it was not turned on to true until 11.2.0.3.
The bug describes the adaptive switching controlled by the parameter in the following way:

"
With " _use_adaptive_log_file_sync " = true in 11.2.0.3 ,adaptive switching between post/wait and polling for log file sync is implemented. When polling is adaptively selected, the polling interval is also dynamically adjusted. If post/wait is selected and the foreground processes fail to receive a post from LGWR, an incident is recorded, diagnostic traces are performed, and polling is used instead of post/wait.
"

This implies that as there are no other symptoms of issues with I/O or in other areas, the problem could be with excessive switching between post/wait and polling wait methods.

Solution

Set the parameter _use_adaptive_log_file_sync = false and restart the database:

SQL> ALTER SYSTEM SET "_use_adaptive_log_file_sync" = FALSE;


In cases where a restart is not feasible, then you can set in memory and also in the SP file for when a restart does occur:

SQL> ALTER SYSTEM SET "_use_adaptive_log_file_sync"=FALSE; -- to set in memory
SQL> ALTER SYSTEM SET "_use_adaptive_log_file_sync"=FALSE scope=sfile sid='*'; -- to set on spfile, so parameter is used at next restart.



This will disable the adaptive switching.
If this solution does not alleviate the problem then see:

Document 1376916.1 Troubleshooting: "Log File Sync" Waits

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/13750068/viewspace-1799368/,如需轉載,請註明出處,否則將追究法律責任。

相關文章