Troubleshooting I/O-related waits [ID 223117.1]

Posted by rongshiyuan on 2013-06-08

In this Document

Purpose
Scope and Application
Troubleshooting Steps
Troubleshooting I/O-related waits
Determining the Real Significance of I/O Wait Events
Misinterpreting the Impact of Wait Events: Examples
Example 1: Statspack before Oracle9i Release 2
Example 2: AWR after Oracle10g Release 2
General Approaches for Handling I/O Problems
Reduce the I/O requirements of the database by tuning SQL:
Reduce the I/O requirements of the database by tuning instance parameters:
Optimizing I/O at the Operating System level
Balancing the database I/O by usage of Oracle ASM (Automatic Storage Manager)
Balancing the database I/O by usage of Striping, RAID, SAN or NAS
Redistribute database I/O by manual placement of database files across different filesystems, controllers and physical devices
Reducing the data volumes of the current database by moving older data out.
Investing in more and/or faster hardware.
Datafile I/O related Wait Events
'db file sequential read'
'db file scattered read'
'db file parallel read'
Direct Path Reads and Writes
Controlfile I/O related Wait Events
'control file parallel write'
'control file sequential read' and 'control file single write'
Redo Logging I/O-Related Wait Events
'log file parallel write'
'log file sync'
'log file sequential read' and 'log file single write'
'switch logfile command', 'log file switch completion' and 'log file switch (clearing log file)'
'log file switch (checkpoint incomplete)'
'log switch/archive' and 'log file switch (archiving needed)'
Buffer Cache I/O-Related Wait Events
'db file parallel write', 'db file single write', 'write complete waits', 'free buffer waits'
Footnote
Who to contact for more information?
References

Applies to:

Oracle Database - Enterprise Edition - Version 7.3.4.0 and later
Oracle Database - Personal Edition - Version 7.3.4.0 and later
Oracle Database - Standard Edition - Version 7.3.4.0 and later
Information in this document applies to any platform.

Purpose

This article provides guidelines for tuning an Oracle database when the main source of contention is I/O-related.

Scope and Application

The techniques described here can be followed when:

  • Statspack or AWR reports show I/O wait events in the "Top 5 Wait/Timed Events" section.
  • SQL Tracing with wait events of a database session shows it is limited mainly by I/O wait events.
  • Operating System tools show very high utilization or saturation of disks used for storage of database files.

The article should be of use to Database Administrators, Support Engineers, Consultants and Database Performance Analysts.

Troubleshooting Steps

Troubleshooting I/O-related waits

A critical activity in Database Performance Tuning is Response Time Analysis. This consists of finding out where time is being spent in a database.

Time is the most important property in Performance Tuning. Users perceive the performance of a system through the response time they experience for their transactions or batch jobs.

Response Time Analysis for an Oracle Database is done using the following equation:

Response Time = Service Time + Wait Time

'Service Time' is measured using the statistic 'CPU used by this session'
'Wait Time' is measured by summing up time spent on Wait Events

Note: Although similar in appearance, this equation is not the fundamental equation of Queueing Theory.

Performance Tuning methods using tools such as AWR and statspack work by evaluating the relative impact of the various components of overall Response Time and direct the tuning effort to those components having the most impact in terms of time consumed.

For a detailed discussion of this subject please refer to

Document 190124.1 THE COE PERFORMANCE METHOD

Starting with Oracle10g the above process is carried out automatically by the Automatic Database Diagnostic Monitor (ADDM). See

Document 250655.1 How to use the Automatic Database Diagnostic Monitor

Determining the Real Significance of I/O Wait Events

Many tools including AWR and Statspack produce listings of the most significant Wait Events. Before Oracle9i Release 2, Statspack reports contain this information in a section called "Top 5 Wait Events".

When presented with such a list of top Wait Events, it is tempting to start working on the listed Wait Events straight away and to forget to evaluate their impact on overall Response Time first.

In situations where 'Service Time' i.e. CPU usage is much more significant than 'Wait Time', it is very likely that investigating Wait Events will not produce significant savings in 'Response Time'.

Therefore, one should always compare the time taken by the top wait events to the 'CPU used by this session' and direct the tuning effort to the biggest consumers.
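
This comparison can also be made directly against the dynamic performance views. The following is a minimal sketch, assuming TIMED_STATISTICS is enabled and Oracle10g or later (the WAIT_CLASS column used here to exclude idle events does not exist in earlier releases). Both figures are in centiseconds and cumulative since instance startup, so for a specific interval two samples have to be taken and subtracted, which is what Statspack and AWR do between snapshots.

```sql
-- Service Time: CPU consumed by all sessions since instance startup (centiseconds)
SELECT name, value AS time_cs
  FROM v$sysstat
 WHERE name = 'CPU used by this session';

-- Wait Time: the ten largest non-idle Wait Events since instance startup (centiseconds)
SELECT *
  FROM (SELECT event, total_waits, time_waited AS time_cs
          FROM v$system_event
         WHERE wait_class <> 'Idle'   -- 10g+ only; on 8i/9i idle events must be filtered by name
         ORDER BY time_waited DESC)
 WHERE ROWNUM <= 10;
```

If the CPU figure dwarfs the sum of the wait times, the Wait Events are unlikely to be the place where Response Time can be recovered.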

Starting with Oracle9i Release 2, the "Top 5 Wait Events" section has been renamed to "Top 5 Timed Events" where 'Service Time' as measured by the statistic 'CPU used by this session' is listed as 'CPU time'. This means that it is now easier to accurately measure the impact of Wait Events in overall 'Response Time' and to correctly target the subsequent tuning effort.

Misinterpreting the Impact of Wait Events: Examples

The following two real-life examples show why it is important to look at both 'Wait Time' and 'Service Time' when investigating database performance.

Example 1: Statspack before Oracle9i Release 2

Following is the "Top 5 Wait Events" section of a Statspack report generated from two snapshots 46 minutes apart:

Top 5 Wait Events                                                             
~~~~~~~~~~~~~~~~~                                             Wait     % Total
Event                                               Waits  Time (cs)   Wt Time
-------------------------------------------- ------------ ------------ -------
direct path read                                    4,232       10,827   52.01
db file scattered read                              6,105        6,264   30.09
direct path write                                   1,992        3,268   15.70
control file parallel write                           893          198     .95
db file parallel write                                 40          131     .63
         -------------------------------------------------------------   


Based on this listing we may be tempted to immediately start looking at the causes of the 'direct path read' and 'db file scattered read' waits and to try to tune them. This approach would not take into account 'Service Time'.

Following is the statistic that measures 'Service Time' from the same report:

Statistic                                    Total   per Second    per Trans  
--------------------------------- ---------------- ------------ ------------  
CPU used by this session                   358,806        130.5     12,372.6   


Let's do some simple maths from these figures:
'Wait Time' = 10,827 x 100% / 52.01% = 20,817 cs
'Service Time' = 358,806 cs
'Response Time' = 358,806 + 20,817 = 379,623 cs

If we now calculate percentages for all the 'Response Time' components:

CPU time                    = 94.52%
direct path read            =  2.85%
db file scattered read      =  1.65%
direct path write           =  0.86%
control file parallel write =  0.05%
db file parallel write      =  0.03%


It is now obvious that the I/O-related Wait Events are not really a significant component of the overall Response Time (less than 6%) and that subsequent tuning should be directed to the Service Time component i.e. CPU consumption.
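
The same arithmetic can be expressed as a query. This is only an illustrative sketch: the figures are typed in from the report above rather than read from any view, and the 'other waits' row is an assumption derived as the remainder between the total 'Wait Time' (20,817 cs) and the five listed events. RATIO_TO_REPORT is an Oracle analytic function that returns each row's share of the total.

```sql
-- Figures (centiseconds) copied from the Statspack report in Example 1.
-- 'other waits' = 20,817 cs total wait time minus the five listed events.
SELECT component,
       time_cs,
       ROUND(100 * RATIO_TO_REPORT(time_cs) OVER (), 2) AS pct_of_response_time
  FROM (SELECT 'CPU time' AS component, 358806 AS time_cs FROM dual UNION ALL
        SELECT 'direct path read',             10827 FROM dual UNION ALL
        SELECT 'db file scattered read',        6264 FROM dual UNION ALL
        SELECT 'direct path write',             3268 FROM dual UNION ALL
        SELECT 'control file parallel write',    198 FROM dual UNION ALL
        SELECT 'db file parallel write',         131 FROM dual UNION ALL
        SELECT 'other waits',                    129 FROM dual)
 ORDER BY time_cs DESC;
```

The percentages returned should match the list above, with 'CPU time' at the top.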

Example 2: AWR after Oracle10g Release 2

Note: Similar information is displayed in Statspack Report from Oracle 9i Release 2 onwards.

Top 5 Timed Foreground Events 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
                                                          Avg  
                                                         wait   % DB 
Event                                 Waits     Time(s)   (ms)   time Wait Class 
------------------------------ ------------ ----------- ------ ------ ----------
DB CPU                                           33,615          82.0           
db file sequential read           3,101,013       7,359      2   18.0 User I/O  
log file sync                       472,958         484      1    1.2 Commit    
read by other session                46,134         291      6     .7 User I/O  
db file parallel read                91,982         257      3     .6 User I/O  

In AWR it is much easier to see that CPU is a significant proportion of the time because a CPU component is included in the "Top 5 Timed Foreground Events" section. In the above example we see again that the wait events account for only about 20% of the total time and that subsequent tuning should be directed to the Service Time component i.e. CPU consumption.

General Approaches for Handling I/O Problems

After an analysis of the database's Response Time using, for example, Statspack has shown that performance is limited by I/O-related Wait Events, a number of possible approaches can be followed.

Refer to the next section for the approaches to follow for each Wait Event.

Some of the approaches can be used regardless of the particular Wait Event.
In this section we present and explain the concepts and rationale behind each approach.

  • Reduce the I/O requirements of the database by tuning SQL:

    A database with no user SQL being run generates little or no I/O. Ultimately all I/O generated by a database is directly or indirectly due to the nature and amount of user SQL being submitted for execution.

    This means that it is possible to limit the I/O requirements of a database by controlling the amount of I/O generated by individual SQL statements. This can be accomplished by tuning SQL statements so that their execution plans result in a minimum number of I/O operations.
    Typically in a problematic situation there will only be a few SQL statements with suboptimal execution plans generating a lot more physical I/O than necessary and degrading the overall performance for the database.

    Starting with Oracle10g, ADDM aids the SQL tuning process by automatically
    identifying the SQL statements with most impact. The SQL Tuning Advisor can
    then be used to automatically tune these statements and reduce their I/O
    resource consumption. For more information please see

    Document 262687.1 How to use the Sql Tuning Advisor
  • Reduce the I/O requirements of the database by tuning instance parameters:

    1. Using memory caching to limit I/O:
      The amount of I/O required by the database is limited by the use of a number of memory caches e.g. the Buffer Cache, the Log Buffer, various Sort Areas etc. Increasing the Buffer Cache, up to a point, results in more buffer accesses by database processes (logical I/Os) being satisfied from memory instead of having to go to disk (physical I/Os). With larger Sort Areas in memory, the likelihood of them being exhausted during a sorting operation and having to use a temporary tablespace on disk is reduced.
      The other caches also work according to similar concepts.
    2. Tuning the size of multiblock I/O (relevant to pre 10g)
      The size of individual multiblock I/O operations can be controlled by instance parameters. Up to a limit, multiblock I/Os are executed faster when there are fewer larger I/Os than when there are more smaller I/Os. For example, transferring 100 MB of data will complete faster if it is done in 100 requests of 1 MB each than if it is done in 1,000 requests of 100 KB each or 10,000 requests of 10 KB each. After this limit is reached, the difference is no longer important: transferring 1 GB of data in 100 requests of 10 MB each (if allowed by the Operating System's limit on maximum I/O transfer size) would be almost as efficient as a single transfer of 1 GB. This is because the time taken to service an I/O involves two main components:

      • I/O Setup Time: tends to be fairly constant across different I/O sizes and for small I/O sizes tends to dominate the total service time.
      • I/O Transfer Time: tends to increase in proportion to the size of the I/O and for small I/O sizes is usually less than the I/O Setup Time.

      The consequence of the above is that before 10g Release 2 it was usually better to configure the instance so that the database issues fewer, larger multiblock I/Os by setting DB_FILE_MULTIBLOCK_READ_COUNT.

      From 10g Release 2 onwards, this parameter is tuned automatically when it is not set explicitly, and setting it manually is not recommended. See:

      Document 841444.1 How To Set DB_FILE_MULTIBLOCK_READ_COUNT in 10g
  • Optimizing I/O at the Operating System level

    This involves making use of I/O capabilities such as Asynchronous I/O or using Filesystems with advanced capabilities such as Direct I/O (bypassing the Operating System's File Caches). Another possible action is to raise the limit of maximum I/O size per transfer (referred to as max_io_size in this article).
  • Balancing the database I/O by usage of Oracle ASM (Automatic Storage Manager)

    ASM was introduced with Oracle10g. It is a file system and volume manager built into the database kernel. It automatically does load balancing in parallel across all available disk drives to prevent hot spots and maximize performance, even with rapidly changing data usage patterns. It prevents fragmentation so that there is never a need to relocate data to reclaim space. Data is well balanced and striped over all disks.

    For details please see

    Document 249992.1 New Feature on ASM (Automatic Storage Manager)
  • Balancing the database I/O by usage of Striping, RAID, SAN or NAS

    This approach relies on storage technologies such as Striping, RAID, Storage Area Networks (SAN) and Network Attached Storage (NAS) to automatically load balance database I/O across multiple available physical disks in order to avoid disk contention and I/O bottlenecks when there is still available unused disk throughput in the storage hardware.

    For more detailed discussions on these technologies please refer to

    "Optimal Storage Configuration Made Easy" by J. Loaiza
    Document 30286.1 I/O Tuning with Different RAID Configurations
  • Redistribute database I/O by manual placement of database files across different filesystems, controllers and physical devices

    This is an approach used in the absence of advanced modern storage technologies. Again the aim is to distribute the database I/O so that no single set of disks or controller becomes saturated from I/O requests when there is still unused disk throughput. It is harder to get right than the previous approach and most often less successful.

    It is important to remember that some I/O will always exist in most databases. After all the guidelines above have been considered, if performance is still not satisfactory on the existing system, you can consider:
    • Reducing the data volumes of the current database by moving older data out.

    • Investing in more and/or faster hardware.

Datafile I/O related Wait Events

The following Wait Events occur on I/O operations to datafiles.

'db file sequential read'

Document 34559.1 WAITEVENT: "db file sequential read" Reference Note


This is one of the most common I/O-related waits. It is in most cases a single block read e.g. for index data blocks or for table data blocks accessed through an index but can also be seen for reads on datafile header blocks. In earlier versions it could be a multiblock read from Sort segments on disk to contiguous ('sequential') buffers in the Buffer Cache.

If this Wait Event is a significant portion of Wait Time then a number of approaches are possible:

  • Find the Top SQL statements in Physical Reads (from a Statspack or AWR report
    in the section titled "SQL ordered by Reads" or from the view V$SQL)
    and tune them in order to reduce their I/O requirements:
    • If Index Range scans are involved, more blocks than necessary could be being visited if the index is unselective.
      By forcing or enabling the use of a more selective index, we can access the same table data by visiting fewer index blocks (and doing fewer physical I/Os).
    • If indexes are fragmented, again we have to visit more blocks because there is less index data per block.
      In this case, rebuilding the index will compact its contents into fewer blocks.
    • If the index being used has a large Clustering Factor, then more table data blocks have to be visited in order to get the rows in each index block: by rebuilding the table with its rows sorted by the particular index columns we can reduce the Clustering Factor and hence the number of table data blocks that we have to visit for each index block.
      For example, if the table has columns A, B, C & D and the index is on B, D
      then we can rebuild the table as:
      CREATE TABLE new AS SELECT * FROM old ORDER BY b,d;

      Document 39836.1 Clustering Factor
    • Use Partitioning to reduce the number of index and table data blocks to be visited for each SQL statement by usage of Partition Pruning.
  • If there are no particular SQL statements with bad execution plans doing more physical I/Os than necessary, then one of the following may be happening:

    • I/Os on particular datafiles may be being serviced slower due to excessive activity on their disks. In this case, looking at the Statspack "File I/O Statistics" section (or V$FILESTAT) will help us find such hot disks and spread out the I/O by manually moving datafiles to other storage or by making use of Striping, RAID and other technologies to automatically perform I/O load balancing for us.
    • Starting with Oracle 9.2, we can also find which segments (tables or indexes) have the most Physical Reads being performed against them by using the new Segment Statistics data from view V$SEGMENT_STATISTICS. We can then look in detail at such segments and see if e.g. indexes should be rebuilt or Partitioning could be used to reduce I/O on them.
      Statspack also generates a "Segment Statistics" report starting at level 7.
  • If there is no SQL with suboptimal execution plans and I/O is evenly spread out with similar response times from all disks then a larger Buffer Cache may help:

    • In Oracle8i experiment with gradual increments of DB_BLOCK_BUFFERS followed by measurements of the Buffer Cache Hit Ratio from Statspack until there is no further improvement to it.
    • In Oracle9i and above use the Buffer Cache Advisory facility (also available in the Statspack report) to tune the size of the Buffer Cache.
      For details please refer to the manual:

      Oracle9i Database Performance Guide and Reference
      Ch. 14 Memory Configuration and Use, Configuring and Using the Buffer Cache
    • In Oracle10g and above Automatic Shared Memory Management (ASMM) can be used to enable the database to automatically determine the optimal size for the Buffer Cache according to recent workload. For more information see

      Document 257643.1 Oracle Database 10g Automated SGA Memory Tuning
    • For hot segments, usage of Multiple Buffer Pools can be explored: place
      such hot indexes and tables in the KEEP Buffer Pool. For details refer to

      Document 76374.1 Multiple Buffer Pools
  • Finally, you can consider reducing the data held in the most frequently accessed segments (by moving older unneeded data out of the database) or moving these segments to new faster disks to reduce the response time on their I/Os.
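
The views mentioned in the guidelines above can be queried along the following lines. This is a sketch rather than a complete diagnostic script: it assumes Oracle 9.2 or later with STATISTICS_LEVEL at TYPICAL or higher (needed for the segment statistics and the Buffer Cache Advisory), all figures are cumulative since instance startup, and the limit of ten rows is an arbitrary choice.

```sql
-- 1. Top 10 cursors by physical reads among those still in the shared pool
SELECT *
  FROM (SELECT hash_value, disk_reads, executions, sql_text
          FROM v$sql
         ORDER BY disk_reads DESC)
 WHERE ROWNUM <= 10;

-- 2. Top 10 segments by physical reads (Oracle 9.2+)
SELECT *
  FROM (SELECT owner, object_name, object_type, value AS physical_reads
          FROM v$segment_statistics
         WHERE statistic_name = 'physical reads'
         ORDER BY value DESC)
 WHERE ROWNUM <= 10;

-- 3. Buffer Cache Advisory (Oracle9i+): estimated physical reads for each
--    candidate size of the DEFAULT pool with the standard block size
SELECT size_for_estimate AS cache_mb,
       estd_physical_read_factor,
       estd_physical_reads
  FROM v$db_cache_advice
 WHERE name = 'DEFAULT'
   AND advice_status = 'ON'
   AND block_size = (SELECT TO_NUMBER(value)
                       FROM v$parameter
                      WHERE name = 'db_block_size')
 ORDER BY size_for_estimate;
```

In the third query, a larger Buffer Cache is only likely to help while ESTD_PHYSICAL_READ_FACTOR keeps dropping noticeably below 1 as the estimated size grows.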

'db file scattered read'

Document 34558.1 WAITEVENT: "db file scattered read" Reference Note

This is another very common Wait Event. It occurs when Oracle performs multiblock reads from disk into non-contiguous ('scattered') buffers in the Buffer Cache. Such reads are issued for up to DB_FILE_MULTIBLOCK_READ_COUNT blocks at a time. These typically happen for Full Table Scans and for Fast Full Index scans.

If this Wait Event is a significant portion of Wait Time then a number of approaches are possible:

  • Find which SQL statements perform Full Table or Fast Full Index scans and tune them to make sure these scans are necessary and not the result of a suboptimal plan.

    Starting with Oracle9i the new view V$SQL_PLAN can help: (ignore data dictionary SQL in the output of these queries)

    For Full Table scans:

    select sql_text from v$sqltext t, v$sql_plan p
    where t.hash_value=p.hash_value and p.operation='TABLE ACCESS'
    and p.options='FULL'
    order by p.hash_value, t.piece;

    For Fast Full Index scans:

    select sql_text from v$sqltext t, v$sql_plan p
    where t.hash_value=p.hash_value and p.operation='INDEX'
    and p.options='FAST FULL SCAN'
    order by p.hash_value, t.piece;

    In Oracle8i a possible approach is to find sessions performing multiblock reads by querying V$SESSION_EVENT for this Wait Event and then SQL Tracing them. Alternatively, the Top SQL statements for Physical Reads can be investigated to see if their execution plans contain Full Table or Fast Full Index scans.
  • In cases where such multiblock scans occur from optimal execution plans it is possible to tune the size of multiblock I/Os issued by Oracle by setting the instance parameter DB_FILE_MULTIBLOCK_READ_COUNT so that
    DB_BLOCK_SIZE x DB_FILE_MULTIBLOCK_READ_COUNT = max_io_size of system

    For more information refer to

    Document 30712.1 Init.ora Parameter "DB_FILE_MULTIBLOCK_READ_COUNT" Reference
    Document 1037322.6 WHAT IS THE DB_FILE_MULTIBLOCK_READ_COUNT PARAMETER?

    As stated previously, starting with Oracle10g Release 2 the DB_FILE_MULTIBLOCK_READ_COUNT initialization parameter is automatically tuned to a default value when it is not set explicitly. This default value corresponds to the maximum I/O size that can be performed efficiently. This value is platform-dependent and is 1 MB for most platforms. Because the parameter is expressed in blocks, it will be set to a value that is equal to the maximum I/O size that can be performed efficiently divided by the standard block size.
  • As blocks read using Full Table and Fast Full Index scans are placed on the least recently used end of the Buffer Cache replacement lists, sometimes it may help to use Multiple Buffer Pools and place such segments in the KEEP pool.
    For more information please refer to

    Document 76374.1 Multiple Buffer Pools
  • Partitioning can also be used to reduce the amount of data to be scanned as Partition Pruning can restrict the scan to a subset of the segment's partitions.
  • Finally, you can consider reducing the data held in the most frequently accessed segments (by moving older unneeded data out of the database) or moving these segments to new faster disks to reduce the response time on their I/Os.
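
To relate these guidelines to a running instance, the current multiblock read size and the sessions waiting most on this event can be checked as sketched below. The queries use only standard V$ views; the max_io_size of the system itself is not visible from inside the database and has to be obtained from the Operating System or storage documentation.

```sql
-- Size in bytes of the largest multiblock read Oracle will request
-- for tablespaces using the standard block size
SELECT bs.value   AS db_block_size,
       mbrc.value AS db_file_multiblock_read_count,
       TO_NUMBER(bs.value) * TO_NUMBER(mbrc.value) AS multiblock_io_bytes
  FROM v$parameter bs, v$parameter mbrc
 WHERE bs.name   = 'db_block_size'
   AND mbrc.name = 'db_file_multiblock_read_count';

-- Sessions that have waited longest on multiblock reads since they connected
-- (TIME_WAITED is in centiseconds); candidates for SQL Tracing
SELECT *
  FROM (SELECT sid, total_waits, time_waited AS time_cs
          FROM v$session_event
         WHERE event = 'db file scattered read'
         ORDER BY time_waited DESC)
 WHERE ROWNUM <= 10;
```

With an 8 KB block size and a platform maximum of 1 MB, for example, the first query would be expected to show a read count of 128 when the parameter is left to its automatic default.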

'db file parallel read'



This Wait Event is used when Oracle performs reads in parallel from multiple datafiles to non-contiguous buffers in memory (PGA or Buffer Cache). This is done during recovery operations or when buffer prefetching is being used as an optimization i.e. instead of performing multiple single-block reads.

If this wait is an important component of Wait Time, follow the same guidelines as 'db file sequential read'.

Direct Path Reads and Writes

'direct path read'

Document 50415.1 WAITEVENT: "direct path read" Reference Note

'direct path write'

Document 50416.1 WAITEVENT: "direct path write" Reference Note

'direct path read (lob)'

'direct path write (lob)'

These occur when database processes perform special types of multiblock I/Os between the disk and process PGA memory, thus bypassing the Buffer Cache. Such I/Os may be performed both synchronously and asynchronously.

Examples where they may be used are:
  • Sort I/Os when memory Sort areas are exhausted and temporary tablespaces are used to perform the sort
  • Parallel Execution (Query and DML)
  • Readahead operations (buffer prefetching)
  • Direct Load operations
  • I/O to LOB segments (which are not cached in the Buffer Cache)

Due to the way in which time for these waits is recorded (it does not measure the time taken to perform the I/O), their relative position in listings such as Statspack's "Top 5 Wait/Timed Events" cannot be used to evaluate their true impact.

Guidelines for tuning:

  • Usage of Asynchronous I/O is recommended where available.
  • In Oracle8i, minimize the number of I/O requests by setting the DB_FILE_DIRECT_IO_COUNT instance parameter so that DB_BLOCK_SIZE x DB_FILE_DIRECT_IO_COUNT = max_io_size of system

    In Oracle8i the default for this is 64 blocks.

    (In Oracle9i, it is replaced by _DB_FILE_DIRECT_IO_COUNT which governs the size of direct I/Os in BYTES (not blocks). The default is 1 MB but will be sized down if the max_io_size of the system is smaller.)

    Document 47324.1 Init.ora Parameter "DB_FILE_DIRECT_IO_COUNT" Reference Note
  • Tune memory Sort areas so that disk I/O for Sorting is minimized:
    In 9i and above use Automated SQL Execution Memory Management.
    In 8i tune the various Sort areas manually.

    Document 147806.1 Automated SQL Execution Memory Management
    Document 109907.1 How to Determine an Optimal SORT_AREA_SIZE
  • For LOB segments, store them on filesystems where an Operating System File
    Buffer Cache can provide some memory caching.
  • Identify sessions performing direct I/Os by querying V$SESSION_EVENT for these Wait Events or V$SESSTAT for statistics:
    'physical reads direct', 'physical reads direct (lob)',
    'physical writes direct' & 'physical writes direct (lob)'
    and tune their SQL statements.
  • Use V$FILESTAT or Statspack's "File IO Statistics" section to identify datafiles on bottlenecked disk storage and move them elsewhere.
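
The last two guidelines can be sketched as the following queries. They assume TIMED_STATISTICS is enabled (otherwise READTIM stays at zero); session statistics are cumulative since each session connected and file statistics since instance startup.

```sql
-- Sessions performing direct path I/O, largest consumers first
SELECT s.sid, n.name, s.value
  FROM v$sesstat s, v$statname n
 WHERE s.statistic# = n.statistic#
   AND n.name IN ('physical reads direct', 'physical reads direct (lob)',
                  'physical writes direct', 'physical writes direct (lob)')
   AND s.value > 0
 ORDER BY s.value DESC;

-- Average read time per datafile in milliseconds (READTIM is in centiseconds)
SELECT d.name, f.phyrds, f.phywrts,
       ROUND(10 * f.readtim / f.phyrds, 1) AS avg_read_ms
  FROM v$filestat f, v$datafile d
 WHERE f.file# = d.file#
   AND f.phyrds > 0
 ORDER BY avg_read_ms DESC;
```

Temporary files, which receive the Sort I/O, are reported separately; the same second query can be pointed at V$TEMPSTAT and V$TEMPFILE to cover them.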

Controlfile I/O related Wait Events

These Wait Events occur during I/O to one or all copies of the controlfile. Frequency of Controlfile access is governed by activities such as redo logfile switching and checkpointing. Therefore it can only be influenced indirectly by tuning these activities.

'control file parallel write'

This occurs when a server process is updating all copies of the controlfile. If it is significant, check for bottlenecks on the I/O paths (controllers, physical disks) of all of the copies of the controlfile.

Possible solutions:

  • Reduce the number of controlfile copies to the minimum that ensures that not all copies can be lost at the same time.
  • Use Asynchronous I/O if available on your platform.
  • Move the controlfile copies to less saturated storage locations.
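
Before changing anything, it helps to know how many controlfile copies exist, where they are stored, and how much time the controlfile waits actually account for. A minimal sketch using standard views (times are in centiseconds, cumulative since instance startup):

```sql
-- Current controlfile copies and their locations
SELECT name FROM v$controlfile;

-- Cumulative controlfile waits since instance startup
SELECT event, total_waits, time_waited AS time_cs
  FROM v$system_event
 WHERE event LIKE 'control file%'
 ORDER BY time_waited DESC;
```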

'control file sequential read' and 'control file single write'

These wait events occur on I/O to a single copy of the controlfile. If they are significant, find out whether the waits are on a particular copy of the controlfile and, if so, whether its I/O path is saturated.

The following query can be used to find which controlfile is being accessed. It has to be run while the problem is occurring:

select P1 from V$SESSION_WAIT
where EVENT like 'control file%' and STATE='WAITING';


Possible solutions:

  • Move the problematic controlfile copy to a less saturated storage location.
  • Use Asynchronous I/O if available on your platform.

Redo Logging I/O-Related Wait Events

There are a number of Wait Events that happen during Redo Logging activities and most of them are I/O-related. The two most important ones are 'log file sync' and 'log file parallel write'.
Oracle foreground processes wait for 'log file sync' whereas the LGWR process waits for 'log file parallel write'.

Although we usually find 'log file sync' in the "Top 5 Wait/Timed Events" section of the Statspack report, in order to understand it we will first look at 'log file parallel write':

'log file parallel write'

Document 34583.1 WAITEVENT: "log file parallel write" Reference Note


The LGWR background process waits for this event while it is copying redo records from the memory Log Buffer cache to the current redo group's member logfiles on disk. Asynchronous I/O will be used if available to make the write parallel, otherwise these writes will be done sequentially one member after the other.

However, LGWR has to wait until the I/Os to all member logfiles are complete before the wait is completed. Hence, the factor that determines the length of this wait is the speed with which the I/O subsystem can perform the writes to the logfile members.

To reduce the time waited for this event, one approach is to reduce the amount of redo generated by the database:

  • Make use of UNRECOVERABLE/NOLOGGING options.
  • Reduce the number of redo group members to the minimum necessary to ensure
    not all members can be lost at the same time.
  • Do not leave tablespaces in BACKUP mode for longer than necessary.
  • Only use the minimal level of Supplemental Logging required to achieve the required functionality e.g. in LogMiner, Logical Standby or Streams.

Another approach is to tune the I/O itself:

  • Place redo group members on storage locations so that parallel writes do not contend with each other.
  • Do not use RAID-5 for redo logfiles.
  • Use Raw Devices for redo logfiles.
  • Use faster disks for redo logfiles.
  • If archiving is being used, set up redo storage so that writes for the current redo group members do not contend with reads for the group(s) currently being archived.
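
To review the current layout against these guidelines, the redo groups and the location of each member can be listed with a query such as the following sketch, which relies only on V$LOG and V$LOGFILE:

```sql
-- Redo groups with their size and state, and the location of every member
SELECT l.group#, l.status, l.archived,
       l.bytes / 1024 / 1024 AS size_mb,
       f.member
  FROM v$log l, v$logfile f
 WHERE l.group# = f.group#
 ORDER BY l.group#, f.member;
```

Members of the same group that share a device with each other, or with the group that is being archived while the CURRENT group is written, are the first candidates for relocation.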

'log file sync'

Document 34592.1 WAITEVENT: "log file sync" Reference Note

This Wait Event occurs in Oracle foreground processes when they have issued a COMMIT or ROLLBACK operation and are waiting for it to complete. Part (but not all) of this wait includes waiting for LGWR to copy the redo records for the session's transaction from Log Buffer memory to disk.

So, in the time that a foreground process is waiting for 'log file sync', LGWR will also wait for a portion of this time on 'log file parallel write'.

The key to understanding what is delaying 'log file sync' is to compare average times waited for 'log file sync' and 'log file parallel write':

  • If they are similar, then redo logfile I/O is causing the delay and the guidelines for tuning it should be followed.
  • If 'log file parallel write' is significantly smaller, then the delay is caused by the other parts of the Redo Logging mechanism that occur during a COMMIT/ROLLBACK (and are not I/O-related).
    Sometimes there will be latch contention on redo latches, evidenced by 'latch free' or 'LGWR wait for redo copy' wait events.
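
The comparison described above can be made with a query along these lines. It is a sketch that uses instance-wide figures accumulated since startup; for a specific period the same averages should be taken from a Statspack or AWR report covering that interval.

```sql
-- Average wait in milliseconds for the two events (TIME_WAITED is in centiseconds)
SELECT event,
       total_waits,
       ROUND(10 * time_waited / total_waits, 2) AS avg_wait_ms
  FROM v$system_event
 WHERE event IN ('log file sync', 'log file parallel write')
   AND total_waits > 0;
```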

'log file sequential read' and 'log file single write'


Both these Wait Events are I/O-related so they are likely to appear together with 'log file parallel write' if there is I/O contention on the redo logs. Follow the same guidelines for tuning them.

'switch logfile command', 'log file switch completion' and 'log file switch (clearing log file)'


These are further LGWR I/O-related Wait Events; tune them as described above.

'log file switch (checkpoint incomplete)'

This Wait Event occurs when checkpointing activities are not occurring quickly enough.
For guidelines on tuning checkpoint operations please refer to:

Document 147468.1 Checkpoint Tuning and Troubleshooting Guide
Document 76713.1 8i Parameters that Influence Checkpoints
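
Very frequent log switches are one common reason for checkpoints not completing in time, so it can be useful to check how often the instance has been switching. The following sketch counts switches per hour over the last day from V$LOG_HISTORY; the one-day window is an arbitrary choice.

```sql
-- Number of redo log switches per hour during the last 24 hours
SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS switch_hour,
       COUNT(*)                               AS log_switches
  FROM v$log_history
 WHERE first_time > SYSDATE - 1
 GROUP BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
 ORDER BY 1;
```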

'log switch/archive' and 'log file switch (archiving needed)'

These Wait Events occur when archiving is enabled and indicate that archiving is not performing fast enough.
For guidelines on tuning archiving operations please refer to:

Document 45042.1 Archiver Best Practices

Buffer Cache I/O-Related Wait Events

These Wait Events occur because of Buffer Cache operations involving the DBWR process(es) and I/O Slaves.

'db file parallel write', 'db file single write', 'write complete waits', 'free buffer waits'

Document 34416.1 WAITEVENT: "db file parallel write" Reference Note

For guidelines on tuning these waits please refer to the following articles:

Document 62172.1 Understanding and Tuning Buffer Cache and DBWR
Document 76713.1 8i Parameters that Influence Checkpoints
Document 147468.1 Checkpoint Tuning and Troubleshooting Guide

Footnote

As a final note in this article, whenever I/O performance is poor it is worth checking for related errors in Operating System logs. There is little point in investigating I/O performance at the Oracle database level if the I/O subsystem is malfunctioning. If this is the case your Hardware, Operating System or Filesystem vendor should be contacted for assistance.

Please ensure that all steps described in Oracle Installation manuals and Administrator's Reference guides involving Operating System patches, Kernel parameters & related configuration tasks have been performed on systems hosting Oracle databases.

Who to contact for more information?

If you have a specific question, why not open a thread in the MOS Database Tuning Community:

Document 1383594.1 Collaborate With MOS Database Tuning Community Members

Following notes dealing with I/O may also prove useful:

Document 1275596.1 How to Tell if the IO of the Database is Slow
Document 432854.1 Asynchronous I/O Support on OCFS/OCFS2 and Related Settings: filesystemio_options, disk_asynch_io
Document 30286.1 I/O Tuning with Different RAID Configurations

References

NOTE:841444.1 - How To Set DB_FILE_MULTIBLOCK_READ_COUNT in 10g
NOTE:1037322.6 - WHAT IS THE DB_FILE_MULTIBLOCK_READ_COUNT PARAMETER?
NOTE:109907.1 - How to Determine an Optimal SORT_AREA_SIZE
NOTE:257643.1 - Oracle Database Automated SGA Memory Tuning
NOTE:147806.1 - Oracle9i New Feature: Automated SQL Execution Memory Management
NOTE:262687.1 - Using the DBMS_SQLTUNE package to Run the Sql Tuning Advisor
NOTE:190124.1 - THE COE PERFORMANCE METHOD
NOTE:30286.1 - I/O Tuning with Different RAID Configurations
NOTE:34558.1 - WAITEVENT: "db file scattered read" Reference Note
NOTE:34559.1 - WAITEVENT: "db file sequential read" Reference Note
NOTE:34583.1 - WAITEVENT: "log file parallel write" Reference Note
NOTE:34592.1 - WAITEVENT: "log file sync" Reference Note
NOTE:39836.1 - Clustering Factor
NOTE:432854.1 - Asynchronous I/O Support on OCFS/OCFS2 and Related Settings: filesystemio_options, disk_asynch_io
NOTE:45042.1 - Archiver Best Practices
NOTE:47324.1 - Init.ora Parameter "DB_FILE_DIRECT_IO_COUNT" Reference Note
NOTE:249992.1 - New Feature on ASM (Automatic Storage Manager)
NOTE:147468.1 - Checkpoint Tuning and Troubleshooting Guide
NOTE:250655.1 - How to use the Automatic Database Diagnostic Monitor
NOTE:30712.1 - Init.ora Parameter "DB_FILE_MULTIBLOCK_READ_COUNT" Reference Note
NOTE:34412.1 - WAITEVENT: "control file parallel write" Reference Note
NOTE:34416.1 - WAITEVENT: "db file parallel write" Reference Note
NOTE:1275596.1 - How to Tell if the IO of the Database is Slow
NOTE:50415.1 - WAITEVENT: "direct path read" Reference Note
NOTE:50416.1 - WAITEVENT: "direct path write" Reference Note
NOTE:62172.1 - Understanding and Tuning Buffer Cache and DBWR
NOTE:76374.1 - Multiple Buffer Pools
NOTE:76713.1 - 8i Parameters that Influence Checkpoints

Source: "ITPUB Blog", link: http://blog.itpub.net/17252115/viewspace-763560/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
