Common Oracle Wait Event Descriptions (repost)

Posted by gangyaocn on 2010-11-05

[Summary] Oracle wait events are an important basis and indicator for judging how an Oracle database is running. This article describes the common Oracle wait events in detail.
The wait event concept was introduced in Oracle 7.0.1.2, with roughly 100 wait events. The number grew to about 150 in Oracle 8.0, about 200 in Oracle8i, and about 360 in Oracle9i. There are two main types of wait events: idle and non-idle.

Idle events mean that Oracle is waiting for work to do. When diagnosing and tuning a database, we need not pay much attention to them.

Common idle events are:

• dispatcher timer

• lock element cleanup

• Null event

• parallel query dequeue wait

• parallel query idle wait - Slaves

• pipe get

• PL/SQL lock timer

• pmon timer

• rdbms ipc message

• slave wait

• smon timer

• SQL*Net break/reset to client

• SQL*Net message from client

• SQL*Net message to client

• SQL*Net more data to client

• virtual circuit status

• client message

Non-idle wait events are specific to Oracle's activity: they are waits that occur while a database task or application is running. These are the events we should focus on and study when tuning the database.

Some common non-idle wait events:

• db file scattered read

• db file sequential read

• buffer busy waits

• free buffer waits

• enqueue

• latch free

• log file parallel write

• log file sync
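As a rough illustration of how the idle/non-idle split is used in practice, the Python sketch below drops idle events from a list of (event, time waited) pairs and ranks what is left. The idle names come from the list in this article; the sample figures and the function name top_non_idle are invented for the example.

```python
# Idle event names taken from the list in this article.
IDLE_EVENTS = {
    "dispatcher timer", "lock element cleanup", "Null event",
    "parallel query dequeue wait", "parallel query idle wait - Slaves",
    "pipe get", "PL/SQL lock timer", "pmon timer", "rdbms ipc message",
    "slave wait", "smon timer", "SQL*Net break/reset to client",
    "SQL*Net message from client", "SQL*Net message to client",
    "SQL*Net more data to client", "virtual circuit status", "client message",
}

def top_non_idle(waits, n=5):
    """Return the n non-idle events with the largest time waited."""
    busy = [(event, t) for event, t in waits if event not in IDLE_EVENTS]
    return sorted(busy, key=lambda row: row[1], reverse=True)[:n]

# Hypothetical sample data: (event name, time waited).
sample = [
    ("SQL*Net message from client", 912345),
    ("db file sequential read", 4200),
    ("rdbms ipc message", 500000),
    ("log file sync", 1800),
    ("db file scattered read", 9100),
]

for event, waited in top_non_idle(sample, n=3):
    print(f"{event:30s} {waited}")
```

Although the two idle events have by far the largest totals in the sample, they are filtered out before ranking, so only the waits worth tuning remain.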

1. db file scattered read

This wait is usually associated with full table scans. When the database performs a full table scan, for performance reasons the blocks are read into scattered (non-contiguous) buffers in the buffer cache. If this wait event is significant, it may indicate that some tables being fully scanned have no index, or no suitable index, and we may need to check whether those tables are indexed correctly.

However, this wait event does not necessarily mean poor performance. Under some conditions Oracle deliberately chooses a full table scan over an index scan to improve performance, depending on the amount of data accessed. Under the CBO Oracle makes this choice more intelligently; under the RBO Oracle is more inclined to use indexes.

Because blocks read by a full table scan are placed at the cold end of the LRU (Least Recently Used) list, small tables that are accessed frequently can be cached in memory to avoid repeated reads.

When this wait event is significant, the dynamic performance view v$session_longops can help with diagnosis. The view records operations that run for a long time (more than 6 seconds), many of which may be full table scans. In any case, this information deserves attention.
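The kind of filtering one might apply to v$session_longops can be sketched in Python as follows. The sketch assumes the rows have already been fetched into dictionaries keyed by the view's OPNAME, TARGET and ELAPSED_SECONDS columns, and that full table scans are reported with OPNAME "Table Scan" (worth verifying on your version). The sample rows and the function name long_table_scans are invented.

```python
def long_table_scans(rows, min_seconds=6):
    """Pick out long-running full table scans, slowest first."""
    scans = [
        r for r in rows
        if r["opname"] == "Table Scan" and r["elapsed_seconds"] > min_seconds
    ]
    return sorted(scans, key=lambda r: r["elapsed_seconds"], reverse=True)

# Hypothetical rows, shaped like a fetch from v$session_longops.
rows = [
    {"opname": "Table Scan", "target": "SCOTT.ORDERS", "elapsed_seconds": 95},
    {"opname": "Sort Output", "target": "", "elapsed_seconds": 40},
    {"opname": "Table Scan", "target": "SCOTT.EMP", "elapsed_seconds": 12},
]

for r in long_table_scans(rows):
    print(r["target"], r["elapsed_seconds"])
```

The tables that surface here are the natural candidates for the index check described above.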

2. db file sequential read

This event usually indicates single-block reads (such as index reads). If it is significant, it may indicate a join order problem in a multi-table join, where the wrong driving table is used, or it may indicate an unselective index.

In most cases an index gets to the rows faster, so in a well-coded, well-tuned database a large number of these waits is normal. In many cases, however, an index is not the best choice: when reading a large share of the rows in a big table, a full table scan can be significantly faster than an index scan, so index scans should be avoided for such queries during development.

3. free buffer waits

This wait event indicates that the system is waiting for free memory in the buffer cache, that is, there are currently no free buffers. If the application is well designed, the SQL is well written and bind variables are used throughout, this wait may indicate that the buffer cache is too small and you may need to increase it (DB_CACHE_SIZE, or DB_BLOCK_BUFFERS before Oracle9i).

Free buffer waits may also indicate that DBWR is not writing fast enough, or that there is serious disk I/O contention. Consider more frequent checkpoints, more DBWR processes, or more physical disks to spread the I/O load.

4. buffer busy waits

This wait event means a session is waiting for a buffer that is being used in an unshareable way, or that is currently being read into the buffer cache. Buffer busy waits should generally not exceed 1%. Check the buffer wait statistics section of the report (or V$WAITSTAT) to see whether the waits are on segment headers. If so, consider increasing the freelists (for Oracle8i dictionary-managed tablespaces) or the freelist groups. In many cases this adjustment takes effect immediately. Before 8.1.6 the freelists setting could not be changed dynamically; in 8.1.6 and later, changing freelists dynamically requires COMPATIBLE to be set to at least 8.1.6.

If the waits are on undo headers, you can add rollback segments to relieve the contention. If the waits are on undo blocks, you may need to check the application and reduce large consistent reads, reduce the amount of table data subject to consistent read, or increase DB_CACHE_SIZE.

If the waits are on data blocks, consider moving frequently and concurrently accessed tables or data to other blocks, or spreading them more widely (increasing pctfree spreads the data out and reduces contention), so as to avoid "hot" data blocks. You can also consider increasing the table's freelists or using locally managed tablespaces.

If the waits are on index blocks, consider rebuilding the index, partitioning it, or using a reverse key index. To prevent buffer busy waits on data blocks you can also use a smaller block size: each block then holds fewer rows, so it is less "busy". Alternatively, set a larger pctfree to spread the data physically and reduce contention between hot rows.
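The class-by-class advice above can be condensed into a small lookup. This Python sketch is only a summary aid: the keys follow the categories used in this article, and the wait counts and the function name worst_class are hypothetical.

```python
# Remedies summarised from this article, keyed by where the waits occur.
REMEDIES = {
    "segment header": "add freelists or freelist groups",
    "undo header": "add rollback segments",
    "undo block": "reduce large consistent reads or enlarge the buffer cache",
    "data block": "spread rows out (higher pctfree), add freelists, or use LMT",
    "index block": "rebuild, partition, or use a reverse key index",
}

def worst_class(wait_counts):
    """Return (class, remedy) for the block class with the most waits."""
    cls = max(wait_counts, key=wait_counts.get)
    return cls, REMEDIES.get(cls, "not covered in this article")

# Hypothetical wait counts per block class.
counts = {"segment header": 1200, "data block": 300, "undo header": 15}
print(worst_class(counts))
```

Starting with the class that accumulates the most waits keeps the tuning effort focused on the largest source of contention.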

During DML (insert/update/delete), Oracle writes transaction information into the data block. When a table is accessed concurrently by many transactions, contention and waits on ITL slots can occur. To reduce these waits, increase initrans so that multiple ITL slots are available. Oracle9i introduced a new concept, ASSM (Automatic Segment Space Management), with which Oracle uses bitmaps to manage space usage.

The combination of LMT and ASSM fundamentally changed Oracle's storage mechanism. Bitmap freelists reduce buffer busy waits, which were a serious problem in versions before Oracle9i.

Oracle claims that ASSM significantly improves the performance of concurrent DML, because different parts of the same bitmap can be used simultaneously, which removes the serialization involved in searching for free space. According to Oracle's test results, using bitmap freelists eliminates all contention for segment headers and also gives very fast concurrent inserts. From Oracle9i onward, buffer busy waits are no longer common.


5. latch free

A latch is a low-level serialization mechanism that protects shared memory structures in the SGA. It is like a memory lock that is acquired and released very quickly, and it prevents a shared memory structure from being accessed by multiple users at the same time. If a latch is not available, a latch free miss is recorded. There are two types of latch requests:

■ Immediate

■ Willing to wait

If a process requests a latch in immediate mode and the latch is already held by another process, the process does not wait for the latch. It carries on with other work.

Most latch problems are associated with the following:

Poor use of bind variables (library cache latch), redo generation problems (redo allocation latch), buffer cache contention (cache buffers LRU chain), and "hot" blocks in the buffer cache (cache buffers chains).

It is often said that if you want to design a system that fails, ignoring bind variables is all it takes. For highly heterogeneous systems, the consequences of not using bind variables are extremely serious.

Some latch waits are related to bugs, so watch Metalink for bug announcements and patch releases. When the latch miss ratio exceeds 0.5%, the problem should be investigated.
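The 0.5% rule of thumb can be written as a simple check. This Python sketch assumes the miss ratio is computed as misses divided by gets, using counters like the GETS and MISSES columns of V$LATCH; the figures and function names are invented for the example.

```python
def latch_miss_ratio(gets, misses):
    """Misses as a fraction of gets; 0.0 when the latch was never requested."""
    return misses / gets if gets else 0.0

def needs_attention(gets, misses, threshold=0.005):
    """True when the miss ratio exceeds the 0.5% rule of thumb."""
    return latch_miss_ratio(gets, misses) > threshold

# Hypothetical figures for two latches.
print(needs_attention(gets=1_000_000, misses=12_000))  # 1.2% miss ratio
print(needs_attention(gets=1_000_000, misses=800))     # 0.08% miss ratio
```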

Oracle's latch mechanism is based on contention, handled much like CSMA/CD in networking: all user processes compete for the latch. For a willing-to-wait latch, if a process does not get the latch on its first attempt, it spins and tries again. If it still cannot get the latch after _spin_count attempts, the process sleeps for a specified time, then wakes up and repeats the previous steps. In 8i/9i the default is _spin_count = 2000.
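The spin-then-sleep cycle can be modelled with a toy loop. The Python sketch below is a simulation of the idea only, not Oracle's implementation: try_get is a hypothetical callable standing in for one attempt to grab the latch, and the sleep length and retry limit are arbitrary.

```python
import time

def acquire(try_get, spin_count=2000, sleep_seconds=0.01, max_sleeps=10):
    """Spin up to spin_count times, sleep, and repeat; return sleeps used.

    try_get is a hypothetical callable that returns True once the latch
    is obtained. Raises TimeoutError if every round fails.
    """
    for sleeps in range(max_sleeps + 1):
        for _ in range(spin_count):
            if try_get():
                return sleeps
        time.sleep(sleep_seconds)  # give up the CPU before the next round
    raise TimeoutError("latch not obtained")

# Toy latch that becomes free on the 4500th attempt.
attempts = {"n": 0}

def toy_latch():
    attempts["n"] += 1
    return attempts["n"] >= 4500

print(acquire(toy_latch, sleep_seconds=0.001))
```

With a spin count of 2000, the toy latch is obtained in the third round of spinning, after two sleeps. The trade-off is that spinning burns CPU while sleeping adds latency.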

If the SQL statements cannot be changed, Oracle provides a new initialization parameter from version 8.1.6 onward: CURSOR_SHARING. Setting CURSOR_SHARING = FORCE forces bind variables to be used on the server side. Setting this parameter may have side effects. For Java programs there are related bugs, so check the Metalink bug bulletins for your specific application.
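To see why replacing literals helps, the Python sketch below rewrites literals to a placeholder and counts how many distinct statement texts remain. It is a conceptual illustration only: the regular expression and the ":b" placeholder are assumptions made for the example and do not reproduce Oracle's actual rewriting.

```python
import re

# Matches single-quoted strings or standalone numbers.
LITERAL = re.compile(r"'[^']*'|\b\d+(?:\.\d+)?\b")

def normalize(sql):
    """Replace each literal with a placeholder (illustration only)."""
    return LITERAL.sub(":b", sql)

statements = [
    "select name from emp where id = 101",
    "select name from emp where id = 102",
    "select name from emp where dept = 'SALES'",
]

shared = {normalize(s) for s in statements}
print(len(statements), "statements ->", len(shared), "shareable forms")
```

The first two statements differ only in a literal, so after normalization they become the same text and could share one parsed cursor instead of each needing its own.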

6. log buffer space

This wait occurs when redo is generated in the log buffer faster than LGWR can write it out, or when a log switch is too slow. It usually means the redo log buffer is too small. To solve it, consider increasing the size of the log files or of the log buffer.

Another possible cause is a disk I/O bottleneck, so consider disks with faster write speeds. Where conditions permit, consider storing the log files on raw devices to improve write efficiency. As a minimum standard on any system, do not store log files together with data files: log files are normally only written, not read, so separating them helps performance.

7. log file switch

When this wait appears, all commit requests must wait for the log file switch to complete.

Log file switch mainly comprises two sub-events:

log file switch (archiving needed)

log file switch (checkpoint incomplete)

log file switch (archiving needed)

This wait event usually occurs because the log groups have been written full circle and the first log file has not yet been archived. It may indicate an I/O problem. Solutions:

Consider enlarging the log files and adding log groups

Move the archive files to a fast disk

Adjust log_archive_max_processes

log file switch (checkpoint incomplete)

When the log groups have been written full circle and LGWR tries to write to the first log file again, this wait event appears if the database has not finished writing out the dirty blocks recorded in that first log file (that is, its checkpoint has not completed).

This wait usually means that DBWR is writing too slowly or that there is an I/O problem.

To address it, consider adding DBWR processes, adding log groups, or increasing the log file size.

8. log file sync

When a user commits or rolls back, LGWR writes the session's redo from the log buffer to the redo log files, and the session must wait for that write to complete successfully. To reduce this wait, try committing more records at a time (frequent commits add overhead), put the redo logs on faster disks, or alternate redo logs across different physical disks to reduce the impact of archiving on LGWR.

For software RAID, RAID 5 is generally not recommended: on write-intensive systems it causes a large performance loss. Consider using file system direct I/O or raw devices to improve write performance.

9. log file single write

This event relates only to writing the header block of a log file. It usually occurs when a new group member is added or the sequence number is advanced.

The header block is written singly because part of its content is the file number, which is different for each file. The log file header is updated in the background, so waits on this operation are rare and need little attention.

10. log file parallel write

This event covers writing redo records from the log buffer to the redo log files, mainly the regular write operations (as opposed to log file sync). If a log group has multiple members, the writes are issued in parallel when the log buffer is flushed, and this wait event may appear.

Although the writes are processed in parallel, the write is not complete until all I/O operations have finished (if your disks support asynchronous I/O or you use I/O slaves, the wait may occur even with only one redo log file member).

Comparing the time for this event with the log file sync time can be used to measure the cost of log file writes. This is often called the synchronization cost ratio.
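One possible reading of this comparison is a simple ratio of the two wait times. The Python sketch below assumes that interpretation, which the article does not spell out; the totals and the function name sync_cost_ratio are invented.

```python
def sync_cost_ratio(parallel_write_time, log_file_sync_time):
    """Ratio of log file parallel write time to log file sync time.

    Assumed interpretation: both totals are in the same time unit.
    """
    if log_file_sync_time <= 0:
        raise ValueError("log file sync time must be positive")
    return parallel_write_time / log_file_sync_time

# Hypothetical totals, in the same time unit.
ratio = sync_cost_ratio(parallel_write_time=600, log_file_sync_time=1000)
print(f"write time is {ratio:.0%} of log file sync time")
```

Under this reading, a ratio close to 100% suggests that commit waits are dominated by redo write I/O, while a low ratio points to time lost elsewhere in the commit path.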

11. control file parallel write

This event can occur when a server process updates all the control files. If the waits are short, they can be ignored. If they are long, check whether the physical disks holding the control files have an I/O bottleneck.

Multiple control files are identical copies, mirrored to improve safety. On production systems they should be stored on different disks. Three is generally enough; if there are only two physical disks, two control files are acceptable. Storing multiple control files on the same disk has no practical value. To reduce this wait, consider the following:

Reduce the number of control files (provided safety is maintained)

Use asynchronous I/O if the system supports it

Move the control files to physical disks with a lighter I/O load

12. control file sequential read / control file single write

These two events occur when there are I/O problems on a single control file. If the waits are significant, check the location of that control file for an I/O bottleneck.

13. direct path write

This wait occurs while the system waits for confirmation that all outstanding asynchronous I/O has been written to disk. For this wait, find the data files with the most frequent I/O (with many sort operations these are likely to be temp files), spread the load, and speed up the writes.

If there is excessive disk sorting, the temporary tablespace will be very busy. In that case consider using a locally managed tablespace split into several smaller files written to different disks or raw devices.

14. Idle events

Finally, let us look at idle wait events. In general an idle wait means the system has nothing to do, or is waiting for a user request or response, and we can usually ignore these events. The idle events can be listed by querying stats$idle_event.

You should have a general impression of the main idle wait events: if your Top 5 wait events consist mainly of them, your system is probably fairly idle.


From the ITPUB blog, link: http://blog.itpub.net/26651/viewspace-1040977/. If you repost this article, please credit the source; otherwise legal liability may be pursued.
