Wait Events (Part 2): Excerpted from the Official Documentation, Copyright Oracle

Posted by urgel_babay on 2016-02-29

10.3.1 SQL*Net Events

The following events signify that the database process is waiting for acknowledgment from a database link or a client process:

  • SQL*Net break/reset to client

  • SQL*Net break/reset to dblink

  • SQL*Net message from client

  • SQL*Net message from dblink

  • SQL*Net message to client

  • SQL*Net message to dblink

  • SQL*Net more data from client

  • SQL*Net more data from dblink

  • SQL*Net more data to client

  • SQL*Net more data to dblink

If these waits constitute a significant portion of the wait time on the system or for a user experiencing response time issues, then the network or the middle-tier could be a bottleneck.

Events that are client-related should be diagnosed as described for the event SQL*Net message from client. Events that are dblink-related should be diagnosed as described for the event SQL*Net message from dblink.

10.3.1.1 SQL*Net message from client

Although this is an idle event, it is important to explain when this event can be used to diagnose what is not the problem. This event indicates that a server process is waiting for work from the client process. However, there are several situations where this event could accrue most of the wait time for a user experiencing poor response time. The cause could be either a network bottleneck or a resource bottleneck on the client process.

10.3.1.1.1 Network Bottleneck

A network bottleneck can occur if the application causes a lot of traffic between server and client and the network latency (time for a round-trip) is high. Symptoms include the following:

  • Large number of waits for this event

  • Both the database and client process are idle (waiting for network traffic) most of the time

To alleviate network bottlenecks, try the following:

  • Tune the application to reduce round trips.

  • Explore options to reduce latency (for example, terrestrial lines as opposed to VSAT links).

  • Change system configuration to move higher traffic components to lower latency links.

10.3.1.1.2 Resource Bottleneck on the Client Process

If the client process is using most of the resources, then there is nothing that can be done in the database. Symptoms include the following:

  • Number of waits might not be large, but the time waited might be significant

  • Client process has a high resource usage

In some cases, you can see the wait time for a waiting user tracking closely with the amount of CPU used by the client process. The term client here refers to any process other than the database process (middle-tier, desktop client) in the n-tier architecture.
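The two symptom patterns above (many short waits versus few long waits) can be compared per session. The following is a minimal sketch, assuming the standard V$SESSION_EVENT columns; the AVG_WAIT_MICRO alias is only illustrative.

```sql
-- Many waits with a small average suggest round-trip (network) overhead;
-- few waits with a large average suggest a busy or idle client.
SELECT sid, event, total_waits, time_waited_micro,
       ROUND(time_waited_micro / NULLIF(total_waits, 0)) AS avg_wait_micro
  FROM V$SESSION_EVENT
 WHERE event LIKE 'SQL*Net%'
 ORDER BY sid, time_waited_micro DESC;
```

Because SQL*Net message from client is an idle event, interpret the totals only for a session that is known to be part of a slow user operation.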

10.3.1.2 SQL*Net message from dblink

This event signifies that the session has sent a message to the remote node and is waiting for a response from the database link. This time could go up because of the following:

  • Network bottleneck

    For information, see "SQL*Net message from client".

  • Time taken to execute the SQL on the remote node

    It is useful to see the SQL being run on the remote node. Log in to the remote database, find the session created by the database link, and examine the SQL statement being run by it.

  • Number of round trip messages

    Each message between the session and the remote node adds latency time and processing overhead. To reduce the number of messages exchanged, use array fetches and array inserts.
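To judge whether round trips are a factor, the session statistics for SQL*Net round trips can be inspected. This is a sketch, assuming the statistic names below exist under these names in V$STATNAME on your release.

```sql
-- Round-trip counts per session, for database links and for clients.
SELECT s.sid, n.name, s.value
  FROM V$SESSTAT s, V$STATNAME n
 WHERE s.statistic# = n.statistic#
   AND n.name IN ('SQL*Net roundtrips to/from dblink',
                  'SQL*Net roundtrips to/from client')
   AND s.value > 0
 ORDER BY s.value DESC;
```

A session whose round-trip count is close to the number of rows it moves is a likely candidate for array fetches or array inserts.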

10.3.1.3 SQL*Net more data to client

The server process is sending more data or messages to the client. The previous operation to the client was also a send.

10.3.2 buffer busy waits

This wait indicates that there are some buffers in the buffer cache that multiple processes are attempting to access concurrently. Query V$WAITSTAT for the wait statistics for each class of buffer. Common buffer classes that have buffer busy waits include data block, segment header, undo header, and undo block.

Check the following V$SESSION_WAIT parameter columns:

  • P1 - File ID

  • P2 - Block ID

  • P3 - Class ID
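The two checks described above can be sketched as follows. The column aliases are illustrative; the views and columns are the ones named in the text.

```sql
-- Which class of buffer accounts for the waits, instance-wide?
SELECT class, count, time
  FROM V$WAITSTAT
 WHERE count > 0
 ORDER BY time DESC;

-- Which file, block, and class are sessions waiting on right now?
SELECT sid, p1 AS file_id, p2 AS block_id, p3 AS class_id
  FROM V$SESSION_WAIT
 WHERE event = 'buffer busy waits';
```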

10.3.2.1 Causes

To determine the possible causes, first query V$SESSION to identify the value of ROW_WAIT_OBJ# when the session waits for buffer busy waits. For example:

SELECT row_wait_obj# 
  FROM V$SESSION 
 WHERE EVENT = 'buffer busy waits';

To identify the object and object type contended for, query DBA_OBJECTS using the value for ROW_WAIT_OBJ# that is returned from V$SESSION. For example:

SELECT owner, object_name, subobject_name, object_type
  FROM DBA_OBJECTS
 WHERE data_object_id = &row_wait_obj;

10.3.2.2 Actions

The action required depends on the class of block contended for and the actual segment.

10.3.2.2.1 segment header

If the contention is on the segment header, then this is most likely free list contention.

Automatic segment-space management in locally managed tablespaces eliminates the need to specify the PCTUSED, FREELISTS, and FREELIST GROUPS parameters. If possible, switch from manual space management to automatic segment-space management (ASSM).
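To see which tablespaces already use automatic segment-space management, a query along these lines can help. It is a sketch that assumes access to DBA_TABLESPACES.

```sql
-- SEGMENT_SPACE_MANAGEMENT is AUTO for ASSM and MANUAL for free-list management.
SELECT tablespace_name, extent_management, segment_space_management
  FROM DBA_TABLESPACES
 ORDER BY tablespace_name;
```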

The following information is relevant if you are unable to use automatic segment-space management (for example, because the tablespace uses dictionary space management).

A free list is a list of free data blocks that usually includes blocks existing in a number of different extents within the segment. Free lists are composed of blocks in which free space has not yet reached PCTFREE or used space has shrunk below PCTUSED. Specify the number of process free lists with the FREELISTS parameter. The default value of FREELISTS is one. The maximum value depends on the data block size.

To find the current setting for free lists for that segment, run the following:

SELECT SEGMENT_NAME, FREELISTS
  FROM DBA_SEGMENTS
 WHERE SEGMENT_NAME = '&segment_name' AND SEGMENT_TYPE = '&segment_type';

Set free lists, or increase the number of free lists. If adding more free lists does not alleviate the problem, then use free list groups (even in single instance this can make a difference). If using Oracle Real Application Clusters, then ensure that each instance has its own free list group(s).
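As an illustration of increasing the number of free lists, the statement below is a sketch for a segment in a tablespace with manual segment-space management. The table name app.orders and the value 4 are hypothetical.

```sql
-- Hypothetical table; FREELISTS applies only with manual segment-space management.
ALTER TABLE app.orders STORAGE (FREELISTS 4);
```

Re-run the DBA_SEGMENTS query afterward to confirm the new setting.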

10.3.2.2.2 data block

If the contention is on tables or indexes (not the segment header):

  • Check for right-hand indexes. These are indexes that are inserted into at the same point by many processes. For example, those that use sequence number generators for the key values.

  • Consider using automatic segment-space management (ASSM), global hash partitioned indexes, or increasing free lists to avoid multiple processes attempting to insert into the same block.
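A global hash partitioned index spreads inserts on a sequence-generated key across several index segments instead of one right-hand leaf block. The following is a sketch only; the table, column, index name, and partition count are hypothetical.

```sql
-- Hypothetical names; 8 partitions chosen only for illustration.
CREATE INDEX orders_order_id_ix ON orders (order_id)
  GLOBAL PARTITION BY HASH (order_id) PARTITIONS 8;
```

Range scans on the key must then visit every partition, so this trade-off suits keys that are mostly looked up by equality.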

10.3.2.2.3 undo header

For contention on rollback segment header:

  • If you are not using automatic undo management, then add more rollback segments.

10.3.2.2.4 undo block

For contention on rollback segment block:

  • If you are not using automatic undo management, then consider making rollback segment sizes larger.

10.3.3 db file scattered read

This event signifies that the user process is reading buffers into the SGA buffer cache and is waiting for a physical I/O call to return. A db file scattered read issues a scattered read to read the data into multiple discontinuous memory locations. A scattered read is usually a multiblock read. It can occur for a fast full scan (of an index) in addition to a full table scan.

The db file scattered read wait event identifies that a full scan is occurring. When performing a full scan into the buffer cache, the blocks read are read into memory locations that are not physically adjacent to each other. Such reads are called scattered read calls, because the blocks are scattered throughout memory. This is why the corresponding wait event is called 'db file scattered read'. Multiblock reads (up to DB_FILE_MULTIBLOCK_READ_COUNT blocks) due to full scans into the buffer cache show up as waits for 'db file scattered read'.

Check the following V$SESSION_WAIT parameter columns:

  • P1 - The absolute file number

  • P2 - The block being read

  • P3 - The number of blocks (should be greater than 1)
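Given P1 and P2 from V$SESSION_WAIT, the segment being read can be looked up in DBA_EXTENTS. This is a sketch; &file_id and &block_id are substitution variables standing for the P1 and P2 values, and the query can be slow on databases with many extents.

```sql
-- Map an absolute file number and block number to the owning segment.
SELECT owner, segment_name, segment_type
  FROM DBA_EXTENTS
 WHERE file_id = &file_id
   AND &block_id BETWEEN block_id AND block_id + blocks - 1;
```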

10.3.3.1 Actions

On a healthy system, physical read waits should be the biggest waits after the idle waits. However, also consider whether there are direct read waits (signifying full table scans with parallel query) or db file scattered read waits on an operational (OLTP) system that should be doing small indexed accesses.

Other things that could indicate excessive I/O load on the system include the following:

  • Poor buffer cache hit ratio

  • These wait events accruing most of the wait time for a user experiencing poor response time

10.3.3.2 Managing Excessive I/O

There are several ways to handle excessive I/O waits. In order of effectiveness, these are as follows:

  1. Reduce the I/O activity by SQL tuning

  2. Reduce the need to do I/O by managing the workload

  3. Gather system statistics with DBMS_STATS package, allowing the query optimizer to accurately cost possible access paths that use full scans

  4. Use Automatic Storage Management

  5. Add more disks to reduce the number of I/Os for each disk

  6. Alleviate I/O hot spots by redistributing I/O across existing disks
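For item 3 in the list above, system statistics can be gathered over a representative workload window. The following is a sketch using the START and STOP gathering modes of DBMS_STATS.GATHER_SYSTEM_STATS; how long to wait between the two calls is an assumption you must choose for your workload.

```sql
-- Begin collecting workload system statistics.
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'START');
END;
/

-- ... let a representative workload run, then stop the collection.
BEGIN
  DBMS_STATS.GATHER_SYSTEM_STATS(gathering_mode => 'STOP');
END;
/
```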

    The first course of action should be to find opportunities to reduce I/O. Examine the SQL statements being run by sessions waiting for these events, as well as statements causing high physical I/Os from V$SQLAREA. Factors that can adversely affect the execution plans causing excessive I/O include the following:

    • Improperly optimized SQL

    • Missing indexes

    • High degree of parallelism for the table (skewing the optimizer toward scans)

    • Lack of accurate statistics for the optimizer

    • Setting the value for DB_FILE_MULTIBLOCK_READ_COUNT initialization parameter too high which favors full scans
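One way to find the statements causing high physical I/O in V$SQLAREA is to rank them by DISK_READS. This is a sketch; the cutoff of 10 rows and the READS_PER_EXEC alias are arbitrary choices.

```sql
-- Top 10 statements by physical reads, with reads per execution.
SELECT *
  FROM (SELECT sql_text, disk_reads, executions,
               ROUND(disk_reads / NULLIF(executions, 0)) AS reads_per_exec
          FROM V$SQLAREA
         ORDER BY disk_reads DESC)
 WHERE ROWNUM <= 10;
```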

    10.3.3.3 Inadequate I/O Distribution

    Besides reducing I/O, also examine the I/O distribution of files across the disks. Is I/O distributed uniformly across the disks, or are there hot spots on some disks? Is the number of disks sufficient to meet the I/O needs of the database?

    See the total I/O operations (reads and writes) by the database, and compare those with the number of disks used. Remember to include the I/O activity of LGWR and ARCH processes.
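Per-file read and write counts for data files can be obtained from V$FILESTAT, as in the sketch below. It covers data files only; the LGWR and ARCH activity mentioned above is not included in this view and has to be accounted for separately.

```sql
-- Physical reads and writes per data file, busiest first.
SELECT d.name, f.phyrds, f.phywrts, f.readtim, f.writetim
  FROM V$FILESTAT f, V$DATAFILE d
 WHERE f.file# = d.file#
 ORDER BY f.phyrds + f.phywrts DESC;
```

Mapping file names to physical disks is outside the database and depends on the storage layout.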

    10.3.3.4 Finding the SQL Statement Executed by Sessions Waiting for I/O

    Use the following query to determine, at a point in time, which sessions are waiting for I/O:

    SELECT SQL_ADDRESS, SQL_HASH_VALUE
      FROM V$SESSION 
     WHERE EVENT LIKE 'db file%read';  
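The address and hash value can be joined to V$SQLAREA to show the statement text directly. This is a sketch; it reports only the statement a session is running at the moment of the query.

```sql
-- SQL text for sessions currently waiting on db file reads.
SELECT s.sid, s.event, q.sql_text
  FROM V$SESSION s, V$SQLAREA q
 WHERE s.sql_address = q.address
   AND s.sql_hash_value = q.hash_value
   AND s.event LIKE 'db file%read';
```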
    

    10.3.3.5 Finding the Object Requiring I/O

    To determine the possible causes, first query V$SESSION to identify the value of ROW_WAIT_OBJ# when the session waits for db file scattered read. For example:

    SELECT row_wait_obj# 
      FROM V$SESSION 
     WHERE EVENT = 'db file scattered read';
    

    To identify the object and object type contended for, query DBA_OBJECTS using the value for ROW_WAIT_OBJ# that is returned from V$SESSION. For example:

    SELECT owner, object_name, subobject_name, object_type
      FROM DBA_OBJECTS
     WHERE data_object_id = &row_wait_obj;
    

    10.3.4 db file sequential read

    This event signifies that the user process is reading a buffer into the SGA buffer cache and is waiting for a physical I/O call to return. A sequential read is a single-block read.

    Single block I/Os are usually the result of using indexes. Rarely, full table scan calls could get truncated to a single block call due to extent boundaries, or buffers already present in the buffer cache. These waits would also show up as 'db file sequential read'.

    Check the following V$SESSION_WAIT parameter columns:

    • P1 - The absolute file number

    • P2 - The block being read

    • P3 - The number of blocks (should be 1)


    10.3.4.1 Actions

    On a healthy system, physical read waits should be the biggest waits after the idle waits. However, also consider whether there are db file sequential reads on a large data warehouse that should be seeing mostly full table scans with parallel query.

    Figure 10-1 depicts the differences between the following wait events:

    • db file sequential read (single block read into one SGA buffer)

    • db file scattered read (multiblock read into many discontinuous SGA buffers)

    • direct read (single or multiblock read into the PGA, bypassing the SGA)


    10.3.5 direct path read and direct path read temp

    When a session is reading buffers from disk directly into the PGA (as opposed to the buffer cache in the SGA), it waits on this event. If the I/O subsystem does not support asynchronous I/Os, then each wait corresponds to a physical read request.

    If the I/O subsystem supports asynchronous I/O, then the process is able to overlap issuing read requests with processing the blocks already existing in the PGA. When the process attempts to access a block in the PGA that has not yet been read from disk, it then issues a wait call and updates the statistics for this event. Hence, the number of waits is not necessarily the same as the number of read requests (unlike db file scattered read and db file sequential read).

    Check the following V$SESSION_WAIT parameter columns:

    • P1 - File_id for the read call

    • P2 - Start block_id for the read call

    • P3 - Number of blocks in the read call

    10.3.5.1 Causes

    This happens in the following situations:

    • The sorts are too large to fit in memory and some of the sort data is written out directly to disk. This data is later read back in, using direct reads.

    • Parallel slaves are used for scanning data.

    • The server process is processing buffers faster than the I/O system can return the buffers. This can indicate an overloaded I/O system.

    10.3.5.2 Actions

    The file_id shows if the reads are for an object in TEMP tablespace (sorts to disk) or full table scans by parallel slaves. This is the biggest wait for large data warehouse sites. However, if the workload is not a DSS workload, then examine why this is happening.

    10.3.5.2.1 Sorts to Disk

    Examine the SQL statement currently being run by the session experiencing waits to see what is causing the sorts. Query V$TEMPSEG_USAGE to find the SQL statement that is generating the sort. Also query the statistics from V$SESSTAT for the session to determine the size of the sort. See if it is possible to reduce the sorting by tuning the SQL statement. If WORKAREA_SIZE_POLICY is MANUAL, then consider increasing the SORT_AREA_SIZE for the system (if the sorts are not too big) or for individual processes. If WORKAREA_SIZE_POLICY is AUTO, then investigate whether to increase PGA_AGGREGATE_TARGET. See "PGA Memory Management".
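The checks described above might look like the following sketch. It joins V$TEMPSEG_USAGE to the owning session and shows that session's current statement, which is not necessarily the statement that created the temporary segment.

```sql
-- Sessions using temporary space, largest first, with their current SQL.
SELECT s.sid, u.tablespace, u.segtype, u.blocks, q.sql_text
  FROM V$TEMPSEG_USAGE u, V$SESSION s, V$SQLAREA q
 WHERE u.session_addr = s.saddr
   AND s.sql_address = q.address
   AND s.sql_hash_value = q.hash_value
 ORDER BY u.blocks DESC;

-- Current work area settings.
SELECT name, value
  FROM V$PARAMETER
 WHERE name IN ('workarea_size_policy', 'sort_area_size',
                'hash_area_size', 'pga_aggregate_target');
```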

    10.3.5.2.2 Full Table Scans

    If tables are defined with a high degree of parallelism, then this could skew the optimizer to use full table scans with parallel slaves. Check the object being read into using the direct path reads. If the full table scans are a valid part of the workload, then ensure that the I/O subsystem is configured adequately for the degree of parallelism. Consider using disk striping if you are not already using it or Automatic Storage Management (ASM).

    10.3.5.2.3 Hash Area Size

    For query plans that call for a hash join, excessive I/O could result from having HASH_AREA_SIZE too small. If WORKAREA_SIZE_POLICY is MANUAL, then consider increasing the HASH_AREA_SIZE for the system or for individual processes. If WORKAREA_SIZE_POLICY is AUTO, then investigate whether to increase PGA_AGGREGATE_TARGET.
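When WORKAREA_SIZE_POLICY is AUTO, the V$PGA_TARGET_ADVICE view can help judge whether a larger PGA_AGGREGATE_TARGET would reduce the extra I/O. This is a sketch that assumes the advisory is populated on your instance.

```sql
-- Estimated effect of different PGA_AGGREGATE_TARGET values.
SELECT ROUND(pga_target_for_estimate / 1024 / 1024) AS target_mb,
       estd_pga_cache_hit_percentage AS cache_hit_pct,
       estd_overalloc_count
  FROM V$PGA_TARGET_ADVICE
 ORDER BY pga_target_for_estimate;
```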

    10.3.6 direct path write and direct path write temp

    When a process is writing buffers directly from PGA (as opposed to the DBWR writing them from the buffer cache), the process waits on this event for the write call to complete. Operations that could perform direct path writes include when a sort goes to disk, during parallel DML operations, direct-path INSERTs, parallel create table as select, and some LOB operations.

    Like direct path reads, the number of waits is not the same as the number of write calls issued if the I/O subsystem supports asynchronous writes. The session waits if it has processed all buffers in the PGA and is unable to continue work until an I/O request completes.


    Check the following V$SESSION_WAIT parameter columns:

    • P1 - File_id for the write call

    • P2 - Start block_id for the write call

    • P3 - Number of blocks in the write call

    10.3.6.1 Causes

    This happens in the following situations:

    • Sorts are too large to fit in memory and are written to disk

    • Parallel DML statements are issued to create/populate objects

    • Direct path loads

    10.3.6.2 Actions

    For large sorts see "Sorts to Disk".

    For parallel DML, check the I/O distribution across disks and make sure that the I/O subsystem is adequately configured for the degree of parallelism.

    10.3.7 enqueue (enq:) waits

    Enqueues are locks that coordinate access to database resources. This event indicates that the session is waiting for a lock that is held by another session.

    The name of the enqueue is included as part of the wait event name, in the form enq: enqueue_type - related_details. In some cases, the same enqueue type can be held for different purposes, such as the following related TX types:

    • enq: TX - allocate ITL entry

    • enq: TX - contention

    • enq: TX - index contention

    • enq: TX - row lock contention

    The V$EVENT_NAME view provides a complete list of all the enq: wait events.

    You can check the following V$SESSION_WAIT parameter columns for additional information:

    • P1 - Lock TYPE (or name) and MODE

    • P2 - Resource identifier ID1 for the lock

    • P3 - Resource identifier ID2 for the lock
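P1 packs the two-character lock type into its upper bytes and the requested mode into its lower bytes. The following sketch decodes it with BITAND; the column aliases are illustrative.

```sql
-- Decode lock type and requested mode from P1 for sessions in enqueue waits.
SELECT sid,
       CHR(BITAND(p1, -16777216) / 16777216) ||
       CHR(BITAND(p1, 16711680) / 65536) AS lock_type,
       BITAND(p1, 65535) AS lock_mode,
       p2 AS id1,
       p3 AS id2
  FROM V$SESSION_WAIT
 WHERE event LIKE 'enq:%';
```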

    10.3.7.1 Finding Locks and Lock Holders

    Query V$LOCK to find the sessions holding the lock. For every session waiting for the event enqueue, there is a row in V$LOCK with REQUEST <> 0. Use one of the following two queries to find the sessions holding the locks and waiting for the locks.

    If there are enqueue waits, you can see these using the following statement:

    SELECT * FROM V$LOCK WHERE request > 0;
    

    To show only holders and waiters for locks being waited on, use the following:

    SELECT DECODE(request,0,'Holder: ','Waiter: ') || 
              sid sess, id1, id2, lmode, request, type
       FROM V$LOCK
     WHERE (id1, id2, type) IN (SELECT id1, id2, type FROM V$LOCK WHERE request > 0)
       ORDER BY id1, request;
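On releases where V$SESSION exposes the BLOCKING_SESSION column, the waiter-to-holder relationship can also be read directly. This is a sketch that assumes the column is available in your version.

```sql
-- Each blocked session with the session that blocks it.
SELECT sid, blocking_session, event, seconds_in_wait
  FROM V$SESSION
 WHERE blocking_session IS NOT NULL
 ORDER BY blocking_session, sid;
```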
    

    10.3.7.2 Actions

    The appropriate action depends on the type of enqueue.

    10.3.7.2.1 ST enqueue

    If the contended-for enqueue is the ST enqueue, then the problem is most likely to be dynamic space allocation. Oracle dynamically allocates an extent to a segment when there is no more free space available in the segment. This enqueue is only used for dictionary managed tablespaces.

    To solve contention on this resource:

    • Check to see whether the temporary (that is, sort) tablespace uses TEMPFILES. If not, then switch to using TEMPFILES.

    • Switch to using locally managed tablespaces if the tablespace that contains segments that are growing dynamically is dictionary managed.

      See Also:

      Oracle Database Concepts for detailed information on TEMPFILEs and locally managed tablespaces

    • If it is not possible to switch to locally managed tablespaces, then ST enqueue resource usage can be decreased by changing the next extent sizes of the growing objects to be large enough to avoid constant space allocation. To determine which segments are growing constantly, monitor the EXTENTS column of the DBA_SEGMENTS view for all SEGMENT_NAMEs. See Oracle Database Administrator's Guide for information about displaying information about space usage.

    • Preallocate space in the segment, for example, by allocating extents using the ALTER TABLE ALLOCATE EXTENT SQL statement.
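The monitoring and preallocation steps in the list above can be sketched as follows. The owner APP, the table name, and the extent size are hypothetical.

```sql
-- Watch which segments keep adding extents (compare results over time).
SELECT owner, segment_name, segment_type, extents
  FROM DBA_SEGMENTS
 WHERE owner = 'APP'
 ORDER BY extents DESC;

-- Preallocate space for a growing table (hypothetical name and size).
ALTER TABLE app.orders ALLOCATE EXTENT (SIZE 100M);
```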

    10.3.7.2.2 HW enqueue

    The HW enqueue is used to serialize the allocation of space beyond the high water mark of a segment.

    • V$SESSION_WAIT.P2 / V$LOCK.ID1 is the tablespace number.

    • V$SESSION_WAIT.P3 / V$LOCK.ID2 is the relative dba of segment header of the object for which space is being allocated.

    If this is a point of contention for an object, then manual allocation of extents solves the problem.

    10.3.7.2.3 TM enqueue

    The most common reason for waits on TM locks involves foreign key constraints where the constrained columns are not indexed. Index the foreign key columns to avoid this problem.
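A rough way to find candidate foreign keys is to list those whose leading column is not the leading column of any index. The query below is an approximation under that assumption (it does not verify multicolumn keys in full), and the index created afterward uses hypothetical names.

```sql
-- Foreign keys whose first column does not lead any index on the child table.
SELECT c.owner, c.table_name, c.constraint_name, cc.column_name
  FROM DBA_CONSTRAINTS c, DBA_CONS_COLUMNS cc
 WHERE c.owner = cc.owner
   AND c.constraint_name = cc.constraint_name
   AND c.constraint_type = 'R'
   AND cc.position = 1
   AND NOT EXISTS (SELECT 1
                     FROM DBA_IND_COLUMNS ic
                    WHERE ic.table_owner = c.owner
                      AND ic.table_name = c.table_name
                      AND ic.column_name = cc.column_name
                      AND ic.column_position = 1);

-- Index the foreign key column of a hypothetical child table.
CREATE INDEX order_items_order_id_ix ON order_items (order_id);
```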

    10.3.7.2.4 TX enqueue

    These are acquired exclusive when a transaction initiates its first change and held until the transaction does a COMMIT or ROLLBACK.

    • Waits for TX in mode 6: occurs when a session is waiting for a row level lock that is already held by another session. This occurs when one user is updating or deleting a row, which another session wishes to update or delete. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.

      The solution is to have the first session already holding the lock perform a COMMIT or ROLLBACK.

    • Waits for TX in mode 4 can occur if the session is waiting for an ITL (interested transaction list) slot in a block. This happens when the session wants to lock a row in the block but one or more other sessions have rows locked in the same block, and there is no free ITL slot in the block. Usually, Oracle dynamically adds another ITL slot. This may not be possible if there is insufficient free space in the block to add an ITL. If so, the session waits for a slot with a TX enqueue in mode 4. This type of TX enqueue wait corresponds to the wait event enq: TX - allocate ITL entry.

      The solution is to increase the number of ITLs available by raising INITRANS or MAXTRANS for the table, either with an ALTER statement or by re-creating the table with the higher values.

    • Waits for TX in mode 4 can also occur if a session is waiting due to potential duplicates in a UNIQUE index. If two sessions try to insert the same key value, the second session has to wait to see whether an ORA-00001 should be raised. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.

      The solution is to have the first session already holding the lock perform a COMMIT or ROLLBACK.

    • Waits for TX in mode 4 are also possible if the session is waiting due to a shared bitmap index fragment. Bitmap indexes index key values and a range of ROWIDs. Each 'entry' in a bitmap index can cover many rows in the actual table. If two sessions want to update rows covered by the same bitmap index fragment, then the second session waits for the first transaction to either COMMIT or ROLLBACK by waiting for the TX lock in mode 4. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.

    • Waits for TX in mode 4 can also occur while waiting for a PREPARED transaction.

    • Waits for TX in mode 4 also occur when a transaction inserting a row in an index has to wait for the end of an index block split being done by another transaction. This type of TX enqueue wait corresponds to the wait event enq: TX - index contention.
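For the ITL case in the list above (enq: TX - allocate ITL entry), raising INITRANS might look like the sketch below. The table name and the value 8 are hypothetical. A plain ALTER affects only blocks formatted afterward, so existing blocks keep their current ITL count unless the segment is rebuilt.

```sql
-- Applies to newly formatted blocks only.
ALTER TABLE app.orders INITRANS 8;

-- Rebuilding the table applies the setting to all blocks;
-- indexes on the table become unusable and must be rebuilt afterward.
ALTER TABLE app.orders MOVE INITRANS 8;
```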

    10.3.8 events in wait class other

    This event belongs to the Other wait class and typically should not occur on a system. This event is an aggregate of all other events in the Other wait class, such as latch free, and is used in the V$SESSION_EVENT and V$SERVICE_EVENT views only. In these views, the events in the Other wait class will not be maintained individually in every session. Instead, these events will be rolled up into this single event to reduce the memory used for maintaining statistics on events in the Other wait class.

    10.3.9 free buffer waits

    This wait event indicates that a server process was unable to find a free buffer and has posted the database writer to make free buffers by writing out dirty buffers. A dirty buffer is a buffer whose contents have been modified. Dirty buffers are freed for reuse when DBWR has written the blocks to disk.

    10.3.9.1 Causes

    DBWR may not be keeping up with writing dirty buffers in the following situations:

    • The I/O system is slow.

    • There are resources it is waiting for, such as latches.

    • The buffer cache is so small that DBWR spends most of its time cleaning out buffers for server processes.

    • The buffer cache is so big that one DBWR process is not enough to free enough buffers in the cache to satisfy requests.

    10.3.9.2 Actions

    If this event occurs frequently, then examine the session waits for DBWR to see whether there is anything delaying DBWR.
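The session waits for the database writer processes can be listed by joining V$BGPROCESS to V$SESSION_EVENT. This is a sketch, assuming the background process names start with DBW on your release.

```sql
-- What have the database writer processes been waiting for?
SELECT b.name, e.event, e.total_waits, e.time_waited_micro
  FROM V$BGPROCESS b, V$SESSION s, V$SESSION_EVENT e
 WHERE b.paddr = s.paddr
   AND s.sid = e.sid
   AND b.name LIKE 'DBW%'
 ORDER BY b.name, e.time_waited_micro DESC;
```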

    10.3.9.2.1 Writes

    If it is waiting for writes, then determine what is delaying the writes and fix it. Check the following:

    • Examine V$FILESTAT to see where most of the writes are happening.

    • Examine the host operating system statistics for the I/O system. Are the write times acceptable?

    If I/O is slow:

    • Consider using faster I/O alternatives to speed up write times.

    • Spread the I/O activity across a large number of spindles (disks) and controllers. See Chapter 8, "I/O Configuration and Design" for information on balancing I/O.

    10.3.9.2.2 Cache is Too Small

    It is possible DBWR is very active because the cache is too small. Investigate whether this is a probable cause by looking to see if the buffer cache hit ratio is low. Also use the V$DB_CACHE_ADVICE view to determine whether a larger cache size would be advantageous. See "Sizing the Buffer Cache".
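A typical way to read V$DB_CACHE_ADVICE for the default pool is sketched below. It assumes the buffer cache advisory is enabled (ADVICE_STATUS is ON).

```sql
-- Predicted physical reads for different cache sizes (default pool, default block size).
SELECT size_for_estimate, buffers_for_estimate,
       estd_physical_read_factor, estd_physical_reads
  FROM V$DB_CACHE_ADVICE
 WHERE name = 'DEFAULT'
   AND block_size = (SELECT value FROM V$PARAMETER WHERE name = 'db_block_size')
   AND advice_status = 'ON'
 ORDER BY size_for_estimate;
```

If ESTD_PHYSICAL_READ_FACTOR drops noticeably at sizes above the current one, a larger cache is likely to help.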

    10.3.9.2.3 Cache Is Too Big for One DBWR

    If the cache size is adequate and the I/O is already evenly spread, then you can potentially modify the behavior of DBWR by using asynchronous I/O or by using multiple database writers.

    10.3.9.3 Consider Multiple Database Writer (DBWR) Processes or I/O Slaves

    Configuring multiple database writer processes, or using I/O slaves, is useful when the transaction rates are high or when the buffer cache size is so large that a single DBWn process cannot keep up with the load.

    10.3.9.3.1 DB_WRITER_PROCESSES

    The DB_WRITER_PROCESSES initialization parameter lets you configure multiple database writer processes (from DBW0 to DBW9 and from DBWa to DBWj). Configuring multiple DBWR processes distributes the work required to identify buffers to be written, and it also distributes the I/O load over these processes. Multiple db writer processes are highly recommended for systems with multiple CPUs (at least one db writer for every 8 CPUs) or multiple processor groups (at least as many db writers as processor groups).

    Based upon the number of CPUs and the number of processor groups, Oracle either selects an appropriate default setting for DB_WRITER_PROCESSES or adjusts a user-specified setting.

    10.3.9.3.2 DBWR_IO_SLAVES

    If it is not practical to use multiple DBWR processes, then Oracle provides a facility whereby the I/O load can be distributed over multiple slave processes. The DBWR process is the only process that scans the buffer cache LRU list for blocks to be written out. However, the I/O for those blocks is performed by the I/O slaves. The number of I/O slaves is determined by the parameter DBWR_IO_SLAVES.

    DBWR_IO_SLAVES is intended for scenarios where you cannot use multiple DB_WRITER_PROCESSES (for example, where you have a single CPU). I/O slaves are also useful when asynchronous I/O is not available, because the multiple I/O slaves simulate nonblocking, asynchronous requests by freeing DBWR to continue identifying blocks in the cache to be written. Asynchronous I/O at the operating system level, if you have it, is generally preferred.

    DBWR I/O slaves are allocated immediately following database open when the first I/O request is made. The DBWR continues to perform all of the DBWR-related work, apart from performing I/O. I/O slaves simply perform the I/O on behalf of DBWR. The writing of the batch is parallelized between the I/O slaves.

    Note:

    Implementing DBWR_IO_SLAVES requires that extra shared memory be allocated for I/O buffers and request queues. Multiple DBWR processes cannot be used with I/O slaves. Configuring I/O slaves forces only one DBWR process to start.

    10.3.9.3.3 Choosing Between Multiple DBWR Processes and I/O Slaves

    Configuring multiple DBWR processes benefits performance when a single DBWR process is unable to keep up with the required workload. However, before configuring multiple DBWR processes, check whether asynchronous I/O is available and configured on the system. If the system supports asynchronous I/O but it is not currently used, then enable asynchronous I/O to see if this alleviates the problem. If the system does not support asynchronous I/O, or if asynchronous I/O is already configured and there is still a DBWR bottleneck, then configure multiple DBWR processes.

    Note:

    If asynchronous I/O is not available on your platform, then asynchronous I/O can be disabled by setting the DISK_ASYNCH_IO initialization parameter to FALSE.

    Using multiple DBWRs parallelizes the gathering and writing of buffers. Therefore, multiple DBWn processes should deliver more throughput than one DBWR process with the same number of I/O slaves. For this reason, the use of I/O slaves has been deprecated in favor of multiple DBWR processes. I/O slaves should only be used if multiple DBWR processes cannot be configured.
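Before changing anything, the current settings of the parameters discussed above can be checked. This is a sketch; the value 2 in the ALTER SYSTEM statement is purely illustrative, and the change takes effect only after the instance is restarted.

```sql
-- Current database writer and asynchronous I/O settings.
SELECT name, value
  FROM V$PARAMETER
 WHERE name IN ('db_writer_processes', 'dbwr_io_slaves',
                'disk_asynch_io', 'cpu_count');

-- Illustrative value; requires an spfile and an instance restart.
ALTER SYSTEM SET db_writer_processes = 2 SCOPE = SPFILE;
```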

    10.3.10 latch events

    A latch is a low-level internal lock used by Oracle to protect memory structures. The latch free event is updated when a server process attempts to get a latch, and the latch is unavailable on the first attempt.

    There is a dedicated latch-related wait event for the more popular latches that often generate significant contention. For those events, the name of the latch appears in the name of the wait event, such as latch: library cache or latch: cache buffers chains. This enables you to quickly figure out if a particular type of latch is responsible for most of the latch-related contention. Waits for all other latches are grouped in the generic latch free wait event.

    See Also:

    Oracle Database Concepts for more information on latches and internal locks

    10.3.10.1 Actions

    This event should only be a concern if latch waits are a significant portion of the wait time on the system as a whole, or for individual users experiencing problems.

    • Examine the resource usage for related resources. For example, if the library cache latch is heavily contended for, then examine the hard and soft parse rates.

    • Examine the SQL statements for the sessions experiencing latch contention to see if there is any commonality.
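For the library cache example above, the parse rates can be derived from cumulative counters in V$SYSSTAT. The sketch below only reads the counters; turning them into rates requires sampling twice and dividing the difference by the elapsed time.

```sql
-- Cumulative parse and execute counts since instance startup.
SELECT name, value
  FROM V$SYSSTAT
 WHERE name IN ('parse count (total)', 'parse count (hard)', 'execute count');
```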

    Check the following V$SESSION_WAIT parameter columns:

    • P1 - Address of the latch

    • P2 - Latch number

    • P3 - Number of times process has already slept, waiting for the latch
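For the generic latch free event, the latch number in P2 can be translated to a name through V$LATCH. This is a sketch; &latch_number stands for the P2 value observed in V$SESSION_WAIT.

```sql
-- Name and contention counters for the latch identified by P2.
SELECT latch#, name, gets, misses, sleeps
  FROM V$LATCH
 WHERE latch# = &latch_number;
```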

    10.3.10.2 Example: Find Latches Currently Waited For

    SELECT EVENT, SUM(P3) SLEEPS, SUM(SECONDS_IN_WAIT) SECONDS_IN_WAIT
      FROM V$SESSION_WAIT
     WHERE EVENT LIKE 'latch%'
      GROUP BY EVENT;
    

    A problem with the previous query is that it tells more about session tuning or instant instance tuning than instance or long-duration instance tuning.

    The following query provides more information about long duration instance tuning, showing whether the latch waits are significant in the overall database time.

    SELECT EVENT, TIME_WAITED_MICRO, 
           ROUND(TIME_WAITED_MICRO*100/S.DBTIME,1) PCT_DB_TIME 
      FROM V$SYSTEM_EVENT, 
       (SELECT VALUE DBTIME FROM V$SYS_TIME_MODEL WHERE STAT_NAME = 'DB time') S
     WHERE EVENT LIKE 'latch%'
     ORDER BY PCT_DB_TIME ASC;
    

    A more general query that is not specific to latch waits is the following:

    SELECT E.EVENT, N.WAIT_CLASS,
           E.TIME_WAITED_MICRO, ROUND(E.TIME_WAITED_MICRO*100/S.DBTIME,1) PCT_DB_TIME
      FROM V$SYSTEM_EVENT E, V$EVENT_NAME N,
        (SELECT VALUE DBTIME FROM V$SYS_TIME_MODEL WHERE STAT_NAME = 'DB time') S
     WHERE E.EVENT_ID = N.EVENT_ID
       AND N.WAIT_CLASS NOT IN ('Idle', 'System I/O')
     ORDER BY PCT_DB_TIME ASC;


    10.3.10.3 Shared Pool and Library Cache Latch Contention

    A main cause of shared pool or library cache latch contention is parsing. There are a number of techniques that can be used to identify unnecessary parsing and a number of types of unnecessary parsing:

    10.3.10.3.1 Unshared SQL

    This method identifies similar SQL statements that could be shared if literals were replaced with bind variables. The idea is to either:

    • Manually inspect SQL statements that have only a few executions (fewer than four in this query) to see whether they are similar:

      SELECT SQL_TEXT
        FROM V$SQLSTATS
       WHERE EXECUTIONS < 4
       ORDER BY SQL_TEXT;
      
    • Or, automate this process by grouping together what may be similar statements. Do this by estimating the number of bytes of a SQL statement that will likely be the same, and group the SQL statements by that many bytes. For example, the following query groups together statements that differ only after the first 60 bytes.

      SELECT SUBSTR(SQL_TEXT, 1, 60), COUNT(*)
        FROM V$SQLSTATS
       WHERE EXECUTIONS < 4 
       GROUP BY SUBSTR(SQL_TEXT, 1, 60)
       HAVING COUNT(*) > 1;
      
    • Or report distinct SQL statements that have the same execution plan. The following query selects distinct SQL statements whose execution plan is shared by more than four statements. These SQL statements are likely to be using literals instead of bind variables.

      SELECT SQL_TEXT FROM V$SQLSTATS WHERE PLAN_HASH_VALUE IN
        (SELECT PLAN_HASH_VALUE 
           FROM V$SQLSTATS 
          GROUP BY PLAN_HASH_VALUE HAVING COUNT(*) > 4)
        ORDER BY PLAN_HASH_VALUE;
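
    To illustrate what the queries above are looking for, here is a minimal sketch of the same lookup written with literals and then with a bind variable. The table orders and the column order_id are hypothetical, and the VARIABLE and EXEC commands assume a SQL*Plus session.

```sql
-- With literals, each distinct value yields a different statement text,
-- so each one is parsed and cached as a separate cursor.
SELECT * FROM orders WHERE order_id = 1001;
SELECT * FROM orders WHERE order_id = 1002;

-- With a bind variable, the statement text stays identical across
-- executions and the cursor can be shared.
VARIABLE oid NUMBER
EXEC :oid := 1001
SELECT * FROM orders WHERE order_id = :oid;
```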
      
    10.3.10.3.2 Reparsed Sharable SQL

    Check the V$SQLSTATS view. Enter the following query:

    SELECT SQL_TEXT, PARSE_CALLS, EXECUTIONS 
      FROM V$SQLSTATS
    ORDER BY PARSE_CALLS;
    

    When the PARSE_CALLS value is close to the EXECUTIONS value for a given statement, you might be continually reparsing that statement. Tune the statements with the higher numbers of parse calls.
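
    One way to narrow the list to likely offenders is to filter on the ratio directly. The thresholds below (more than 100 executions, parse calls at least 90 percent of executions) are arbitrary assumptions for this sketch and should be adjusted to the workload.

```sql
-- Statements that are parsed on (almost) every execution.
-- The 100 and 0.9 thresholds are illustrative, not recommended values.
SELECT SQL_ID,
       PARSE_CALLS,
       EXECUTIONS,
       ROUND(PARSE_CALLS / EXECUTIONS, 2) parse_ratio
  FROM V$SQLSTATS
 WHERE EXECUTIONS > 100
   AND PARSE_CALLS >= 0.9 * EXECUTIONS
 ORDER BY PARSE_CALLS DESC;
```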

    10.3.10.3.3 By Session

    Identify unnecessary parse calls by identifying the session in which they occur. It might be that particular batch programs or certain types of applications do most of the reparsing. To do this, run the following query:

    SELECT pa.SID, pa.VALUE "Hard Parses", ex.VALUE "Execute Count" 
      FROM V$SESSTAT pa, V$SESSTAT ex 
     WHERE pa.SID = ex.SID 
       AND pa.STATISTIC#=(SELECT STATISTIC# 
           FROM V$STATNAME WHERE NAME = 'parse count (hard)') 
       AND ex.STATISTIC#=(SELECT STATISTIC# 
           FROM V$STATNAME WHERE NAME = 'execute count') 
       AND pa.VALUE > 0; 
    

    The result is a list of all sessions and the amount of reparsing they do. For each session identifier (SID), go to V$SESSION to find the name of the program that causes the reparsing.

    Note:

    Because this query counts all parse calls since each session connected, it is best to look for sessions with high rates of parsing. For example, a connection that has been up for 50 days might show a high parse figure, but a second connection might have been up for 10 minutes and be parsing at a much faster rate.

    The output is similar to the following:

       SID  Hard Parses  Execute Count
    ------  -----------  -------------
         7            1             20
         8            3          12690
         6           26            325
        11           84           1619
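
    Since rates matter more than totals, a variant of the query can divide the hard parse count by the session's connected time and pick up the program name from V$SESSION in the same pass. This is a sketch only; the one-minute floor on connected time is an assumption added to avoid dividing by a near-zero interval.

```sql
-- Hard parses per hour of connected time, with the program name.
SELECT s.SID,
       s.PROGRAM,
       pa.VALUE hard_parses,
       ROUND(pa.VALUE /
             GREATEST((SYSDATE - s.LOGON_TIME) * 24, 1/60), 1) hard_parses_per_hour
  FROM V$SESSION s, V$SESSTAT pa, V$STATNAME n
 WHERE s.SID = pa.SID
   AND pa.STATISTIC# = n.STATISTIC#
   AND n.NAME = 'parse count (hard)'
   AND pa.VALUE > 0
 ORDER BY hard_parses_per_hour DESC;
```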
    
    10.3.10.3.4 cache buffers lru chain

    The cache buffers lru chain latches protect the lists of buffers in the cache. When adding, moving, or removing a buffer from a list, a latch must be obtained.

    For symmetric multiprocessor (SMP) systems, Oracle automatically sets the number of LRU latches to a value equal to one half the number of CPUs on the system. For non-SMP systems, one LRU latch is sufficient.

    Contention for the LRU latch can impede performance on SMP systems with a large number of CPUs. LRU latch contention is detected by querying V$LATCH, V$SESSION_EVENT, and V$SYSTEM_EVENT. To avoid contention, consider tuning the application, bypassing the buffer cache for DSS jobs, or redesigning the application.
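
    As a starting point for the V$LATCH check mentioned above, a query along the following lines shows how often the LRU latch is missed relative to how often it is requested. The miss_pct column is a derived value introduced for this sketch.

```sql
-- Summary statistics for the LRU latch.
SELECT NAME,
       GETS,
       MISSES,
       SLEEPS,
       ROUND(MISSES * 100 / NULLIF(GETS, 0), 2) miss_pct
  FROM V$LATCH
 WHERE NAME = 'cache buffers lru chain';
```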

    10.3.10.3.5 cache buffers chains

    The cache buffers chains latches are used to protect a buffer list in the buffer cache. These latches are used when searching for, adding, or removing a buffer from the buffer cache. Contention on this latch usually means that there is a block that is greatly contended for (known as a hot block).

    To identify the heavily accessed buffer chain, and hence the contended for block, look at latch statistics for the cache buffers chains latches using the view V$LATCH_CHILDREN. If there is a specific cache buffers chains child latch that has many more GETS, MISSES, and SLEEPS when compared with the other child latches, then this is the contended for child latch.
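
    The comparison of child latches described above might be sketched as follows. The limit of ten rows and the ordering by SLEEPS and then MISSES are choices made for this example.

```sql
-- Ten cache buffers chains child latches with the most sleeps.
SELECT *
  FROM (SELECT ADDR, CHILD#, GETS, MISSES, SLEEPS
          FROM V$LATCH_CHILDREN
         WHERE NAME = 'cache buffers chains'
         ORDER BY SLEEPS DESC, MISSES DESC)
 WHERE ROWNUM <= 10;
```

    A child latch whose counts stand well apart from the rest of this list is the candidate whose ADDR value is worth investigating further.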

    This latch has a memory address, identified by the ADDR column. Use the value in the ADDR column joined with the X$BH table to identify the blocks protected by this latch. For example, given the address (V$LATCH_CHILDREN.ADDR) of a heavily contended latch, this queries the file and block numbers:

    SELECT OBJ data_object_id, FILE#, DBABLK, CLASS, STATE, TCH
      FROM X$BH
     WHERE HLADDR = 'address of latch'
      ORDER BY TCH;
    

    X$BH.TCH is a touch count for the buffer. A high value for X$BH.TCH indicates a hot block.

    Many blocks are protected by each latch. One of these buffers will probably be the hot block. Any block with a high TCH value is a potential hot block. Perform this query a number of times, and identify the block that consistently appears in the output. After you have identified the hot block, query DBA_EXTENTS using the file number and block number, to identify the segment.

    Alternatively, identify the segment that the hot block belongs to from its data object ID with the following query:

    SELECT OBJECT_NAME, SUBOBJECT_NAME
      FROM DBA_OBJECTS
     WHERE DATA_OBJECT_ID = &obj;
    

    In the query, &obj is the value of the OBJ column in the previous query on X$BH.

    10.3.10.3.6 row cache objects

    The row cache objects latches protect the data dictionary.

    10.3.11 log file parallel write

    This event involves writing redo records to the redo log files from the log buffer.

    10.3.12 library cache pin

    This event manages library cache concurrency. Pinning an object causes the heaps to be loaded into memory. If a client wants to modify or examine the object, the client must acquire a pin after the lock.

    10.3.13 library cache lock

    This event controls the concurrency between clients of the library cache. It acquires a lock on the object handle so that either:

    • One client can prevent other clients from accessing the same object

    • The client can maintain a dependency for a long time which does not allow another client to change the object

    This lock is also obtained to locate an object in the library cache.

    10.3.14 log buffer space

    This event occurs when server processes are waiting for free space in the log buffer, because redo is generated faster than LGWR can write it out.

    Actions

    Modify the redo log buffer size. If the size of the log buffer is already reasonable, then ensure that the disks on which the online redo logs reside do not suffer from I/O contention. The log buffer space wait event could be indicative of either disk I/O contention on the disks where the redo logs reside, or of a too-small log buffer. Check the I/O profile of the disks containing the redo logs to investigate whether the I/O system is the bottleneck. If the I/O system is not a problem, then the redo log buffer could be too small. Increase the size of the redo log buffer until this event is no longer significant.
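
    Before changing anything, it can help to see how much time the event has accumulated and what the current buffer size is. The two queries below are a minimal sketch of that check.

```sql
-- Accumulated waits for free space in the log buffer since instance startup.
SELECT EVENT, TOTAL_WAITS, TIME_WAITED_MICRO
  FROM V$SYSTEM_EVENT
 WHERE EVENT = 'log buffer space';

-- Current redo log buffer size in bytes.
SELECT NAME, VALUE
  FROM V$PARAMETER
 WHERE NAME = 'log_buffer';
```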

    10.3.15 log file switch

    There are two wait events commonly encountered:

    • log file switch (archiving needed)

    • log file switch (checkpoint incomplete)

    In both of the events, the LGWR is unable to switch into the next online redo log, and all the commit requests wait for this event.

    10.3.15.1 Actions

    For the log file switch (archiving needed) event, examine why the archiver is unable to archive the logs in a timely fashion. It could be due to the following:

    • Archive destination is running out of free space.

    • Archiver is not able to read redo logs fast enough (contention with the LGWR).

    • Archiver is not able to write fast enough (contention on the archive destination, or not enough ARCH processes). If you have ruled out other possibilities (such as slow disks or a full archive destination) consider increasing the number of ARCn processes. The default is 2.

    • If you have mandatory remote shipped archive logs, check whether this process is slowing down because of network delays or the write is not completing because of errors.

    Depending on the nature of the bottleneck, you might need to redistribute I/O or add more space to the archive destination to alleviate the problem.

    For the log file switch (checkpoint incomplete) event:

    • Check if DBWR is slow, possibly due to an overloaded or slow I/O system. Check the DBWR write times, check the I/O system, and distribute I/O if necessary. See Chapter 8, "I/O Configuration and Design".

    • Check if there are too few, or too small redo logs. If you have a few redo logs or small redo logs (for example two x 100k logs), and your system produces enough redo to cycle through all of the logs before DBWR has been able to complete the checkpoint, then increase the size or number of redo logs. See "Sizing Redo Log Files".
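
    To judge whether the redo logs are too few or too small, one option is to look at their size and state and at how often the instance has switched logs recently. The following is a sketch; the one-day window and the switch_hour alias are assumptions for this example.

```sql
-- Size, archive state, and status of each online redo log group.
SELECT GROUP#, THREAD#, BYTES / 1024 / 1024 size_mb, MEMBERS, ARCHIVED, STATUS
  FROM V$LOG
 ORDER BY GROUP#;

-- Number of log switches per hour over the last day.
SELECT TRUNC(FIRST_TIME, 'HH24') switch_hour, COUNT(*) switches
  FROM V$LOG_HISTORY
 WHERE FIRST_TIME > SYSDATE - 1
 GROUP BY TRUNC(FIRST_TIME, 'HH24')
 ORDER BY switch_hour;
```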

    10.3.16 log file sync

    When a user session commits (or rolls back), the session's redo information must be flushed to the redo log file by LGWR. The server process performing the COMMIT or ROLLBACK waits under this event for the write to the redo log to complete.

    Actions

    If this event's waits constitute a significant portion of the wait time on the system, or a significant amount of the time waited by a user experiencing response time issues, then examine the average time waited.
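
    The average time waited can be derived from V$SYSTEM_EVENT, for example as in the sketch below. Including log file parallel write alongside log file sync is a choice made here so that the log writer's own write time appears next to the commit wait; the avg_wait_ms column is a derived value.

```sql
-- Average wait in milliseconds for commit-related events.
SELECT EVENT,
       TOTAL_WAITS,
       TIME_WAITED_MICRO,
       ROUND(TIME_WAITED_MICRO / NULLIF(TOTAL_WAITS, 0) / 1000, 2) avg_wait_ms
  FROM V$SYSTEM_EVENT
 WHERE EVENT IN ('log file sync', 'log file parallel write');
```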

    If the average time waited is low, but the number of waits is high, then the application might be committing after every INSERT, rather than batching COMMITs. Applications can reduce the wait by committing after 50 rows, rather than every row.

    If the average time waited is high, then examine the session waits for the log writer and see what it is spending most of its time doing and waiting for. If the waits are because of slow I/O, then try the following:

    • Reduce other I/O activity on the disks containing the redo logs, or use dedicated disks.

    • Alternate redo logs on different disks to minimize the effect of the archiver on the log writer.

    • Move the redo logs to faster disks or a faster I/O subsystem (for example, switch from RAID 5 to RAID 1).

    • Consider using raw devices (or simulated raw devices provided by disk vendors) to speed up the writes.

    • Depending on the type of application, it might be possible to batch COMMITs by committing every N rows, rather than every row, so that fewer log file syncs are needed.
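
    Batching COMMITs might look like the following PL/SQL sketch. The table commit_demo is hypothetical, and the batch size of 50 simply mirrors the figure used earlier in this section; whether batching is acceptable depends on the application's transaction requirements.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE commit_demo (id NUMBER);

BEGIN
  FOR i IN 1 .. 1000 LOOP
    INSERT INTO commit_demo (id) VALUES (i);
    -- Commit once per 50 rows instead of once per row.
    IF MOD(i, 50) = 0 THEN
      COMMIT;
    END IF;
  END LOOP;
  COMMIT;  -- pick up any rows left over from the last partial batch
END;
/
```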

    10.3.17 rdbms ipc reply

    This event is used to wait for a reply from one of the background processes.

    10.4 Idle Wait Events

    These events belong to the Idle wait class and indicate that the server process is waiting because it has no work. This usually implies that if there is a bottleneck, then the bottleneck is not for database resources. The majority of the idle events should be ignored when tuning, because they do not indicate the nature of the performance bottleneck. Some idle events can be useful in indicating what the bottleneck is not. An example of this type of event is the most commonly encountered idle wait event, SQL*Net message from client. This and other idle events (and their categories) are listed in Table 10-3.

    Table 10-3 Idle Wait Events






From the ITPUB blog, link: http://blog.itpub.net/30936525/viewspace-2016738/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
