Using V$BACKUP_ASYNC_IO / V$BACKUP_SYNC_IO to Monitor RMAN Performance

zhouwf0726發表於2019-04-16
Using V$BACKUP_ASYNC_IO / V$BACKUP_SYNC_IO to Monitor RMAN Performance [ID 237083.1]

  修改時間 20-OCT-2010     型別 TROUBLESHOOTING     狀態 PUBLISHED  
PURPOSE
-------

Introduce customers to usage of V$BACKUP_ASYNC_IO for monitoring RMAN performance.
 
SCOPE & APPLICATION
-------------------

For all that use RMAN to provide a better understanding of V$BACKUP_ASYNC_IO usage.

Using V$BACKUP_ASYNC_IO / V$BACKUP_SYNC_IO to Monitor RMAN Performance
----------------------------------------------------------------------
 
These views display performance information about ongoing and recently completed RMAN backups and restores.
 For each backup, they contains one row for each input datafile, one row for the aggregate total performance of all datafiles, 
and one row for the output backup piece. 
This data is not stored persistently, and is not preserved when the instance is re-started. 

The biggest benefit  from these views is the EFFECTIVE_BYTES_PER_SECOND column in rows where TYPE column is set to AGGREGATE.
This column represents the rate at which the objects are being backed up or recovered, in bytes per second. This number should be close 
to the listed read/write rate of your backup hardware, then you should be looking for some problem with your backup process.
The problem could be caused by a overburdened CPU to a saturated network, or perhaps a configuration issue with the MML interface
to your vendor's backup solution.

Use V$BACKUP_SYNC_IO and V$BACKUP_ASYNC_IO to determine the source
of backup or restore bottlenecks and to determine the progress of backup jobs. 
 V$BACKUP_SYNC_IO contains rows when the I/O is synchronous to the process (or thread, on some platforms) performing the backup.
 V$BACKUP_ASYNC_IO contains rows when the I/O is asynchronous. Asynchronous I/O is obtained either with I/O processes or because it is
 supported by the underlying operating system.

Here is an example of a query against V$BACKUP_ASYNC_IO and its results after a database backup has been completed:

select device_type "Device", type, filename, to_char(open_time, 'mm/dd/yyyy hh24:mi:ss') open,
to_char(close_time,'mm/dd/yyyy hh24:mi:ss') close,elapsed_time ET, effective_bytes_per_second EPS
from v$backup_async_io
where close_time > sysdate - 30
order by close_time desc;

Device  TYPE        FILENAME                               OPEN                  CLOSE                 ET      EPS

DISK    OUTPUT      d:\backup\recover\5cdvfjvh_1_1         08/06/2002 20:24:43   08/06/2002 20:28:10   20700   566435
DISK    AGGREGATE                                          08/06/2002 20:24:58   08/06/2002 20:28:09   19100   933288
DISK    INPUT       d:\oracale\oradata\recover\system.dbf  08/06/2002 20:24:58   08/06/2002 20:28:09   19100   933288
DISK    OUTPUT      d:\backup\recover\5bdvfjup_1_1         08/06/2002 20:24:18   08/06/2002 20:24:35    1700    24576
DISK    AGGREGATE                                          08/06/2002 20:24:19   08/06/2002 20:24:34    1500  1118481 
DISK    INPUT       d:\oracale\oradata\recover\undo.dbf    08/06/2002 20:24:19   08/06/2002 20:24:34    1500   699051
DISK    INPUT       d:\oracale\oradata\recover\tools.dbf   08/06/2002 20:24:27   08/06/2002 20:24:29     200   524288
DISK    OUTPUT      d:\backup\recover\5advfjug_1_1         08/06/2002 20:23:55   08/06/2002 20:24:13    1800    18204
DISK    AGGREGATE                                          08/06/2002 20:23:58   08/06/2002 20:24:11    1300   967916
DISK    INPUT       d:\oracale\oradata\recover\users.dbf   08/06/2002 20:24:05   08/06/2002 20:24:11     600   873813
DISK    INPUT       d:\oracale\oradata\recover\indx0.dbf   08/06/2002 20:24:08   08/06/2002 20:24:09     100  2097152

From this output we can see the effective transfer rate from the database to the backup set by RMAN, also the name of the datafile 
that was backed up and the actual start and stop time of the backup itself.

To determine whether there is an I/O problem, we can look at the ratio of  I/Os to long  waits (LONG_WAITS/IO_COUNTS)

select io_count, ready, short_waits, long_waits, long_waits/io_count, filename
from v$backup_async_io ;

IO_COUNT    READY   SHORT_WAITS    LONG_WAITS      LONG_WAITS/IO_COUNT         FILENAME

      2      1              0        1                     0.5              d:\oracle\admin\recover\arch\arc00052.001
      2      1              0        1                     0.5              d:\oracle\admin\recover\arch\arc00046.001
      2      1              0        1                     0.5              d:\oracle\admin\recover\arch\arc00051.001
      2      1              0        1                     0.5              d:\oracle\admin\recover\arch\arc00050.001
    171    107             12       52                     0.304093567      d:\oracle\oradata\recover\system.dbf
     11      8              2        1                     0.090909091      d:\oracle\oradata\recover\undo.dbf
      6      4              0        2                     0.333333333      d:\oracle\oradata\recover\tools.dbf
      6      3              0        3                     0.5              d:\oracle\oradata\recover\users.dbf
      3      1              0        2                     0.666666667      d:\oracle\oradata\recover\indx0.dbf

The numbers returned by this indicate some sort of I/O bottleneck is causing grief (in this case is a single CPU machine).
      
Explanation :
LONG_WAITS : The number of times that a buffer was not immediately available, 
             and only became available after a blocking wait was issued.

In the above many files had to wait 30%-66% of the amount of IO's for a LONG-WAIT.
This indicate a OS-bufercache / configuration / disk problem.

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/756652/viewspace-694471/,如需轉載,請註明出處,否則將追究法律責任。

相關文章