ASM 11g New Features - How ASM Disk Resync Works

Posted by paulyibinyi on 2011-05-11

Scope and Application

ASM Fast Disk Resync Overview

1) A disk is taken offline when it is corrupted or when the database cannot read from or write to it. In Oracle Database 10g, ASM used to rebalance the contents of the offline disk across the remaining disks. This was a relatively costly operation that could take hours to complete, even if the disk failure was only transient.

2) Oracle Database 11g introduces the ASM Fast Mirror Resync feature, which significantly reduces the time required to resynchronize a disk after a transient failure. When a disk goes offline, ASM does not rebalance the other disks; instead, it tracks the allocation units that are modified during the outage. Any modification to content that belongs to the failed disk is applied to the copies held on the other available disks. Once the disk is repaired and brought back online, only the data that belongs to this disk and was modified during the outage is resynchronized. This avoids the heavy rebalancing activity.

3) In other words, when a disk goes offline following a transient failure, ASM tracks the extents that are modified during the outage. When the transient failure is repaired, ASM can quickly resynchronize only the ASM disk extents that were affected during the outage.

4) This feature assumes that the content of the affected ASM disks has not been damaged or modified.

5) When an ASM disk path fails, the ASM disk is taken offline but not dropped if you have set the DISK_REPAIR_TIME attribute for the corresponding disk group. The setting for this attribute determines the duration of a disk outage that ASM tolerates while still being able to resynchronize after you complete the repair.

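Note: The tracking mechanism uses one bit for each modified allocation unit, which makes it very efficient. For example, a 1 TB disk with 1 MB allocation units has about one million allocation units, so tracking all of them takes roughly 128 KB.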
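While a disk is offline, the time left before ASM drops it can be checked from the ASM instance. The query below is a minimal sketch, assuming the 11g V$ASM_DISK view with its MODE_STATUS and REPAIR_TIMER columns; the disk group number 1 is a placeholder and not a value taken from this note.

```sql
-- Sketch: list offline disks and the time left before ASM drops them.
-- Run on the ASM instance. group_number = 1 is a hypothetical placeholder.
SELECT name,
       failgroup,
       mode_status,     -- ONLINE or OFFLINE
       repair_timer     -- seconds remaining before the disk is dropped
FROM   v$asm_disk
WHERE  group_number = 1
AND    mode_status  = 'OFFLINE';
```

If the disk is brought back online before REPAIR_TIMER runs out, only the tracked changes should need to be resynchronized.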


 Requirements:

1) This feature requires that the redundancy level of the disk group is NORMAL or HIGH.

2) The disk group attributes compatible.asm and compatible.rdbms must be set to 11.1.0.0.0 or higher.

3) You need to set the DISK_REPAIR_TIME disk group attribute, which specifies how long ASM waits for the disk to be repaired before dropping it. The default is 3.6 hours.

Examples:

SQL> ALTER DISKGROUP dgroupA SET ATTRIBUTE 'DISK_REPAIR_TIME'='3H';
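Before relying on fast resync, it may help to confirm that the first three requirements are met. The queries below are a sketch, assuming the 11g views V$ASM_DISKGROUP and V$ASM_ATTRIBUTE and assuming the attribute is listed under the lowercase name disk_repair_time; DGROUPA is the disk group from the example above, written in uppercase as ASM stores it.

```sql
-- Redundancy type (NORMAL or HIGH is required) and compatibility settings.
SELECT name, type, compatibility, database_compatibility
FROM   v$asm_diskgroup
WHERE  name = 'DGROUPA';

-- Current repair time configured for the same disk group.
SELECT g.name AS diskgroup, a.name AS attribute, a.value
FROM   v$asm_attribute a
JOIN   v$asm_diskgroup g ON g.group_number = a.group_number
WHERE  g.name = 'DGROUPA'
AND    a.name = 'disk_repair_time';
```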



4) The disk has to be offline (automatically due to the hardware failure or manually for maintenance operations) and should not be dropped.


To take a disk offline, use the ALTER DISKGROUP … OFFLINE DISKS command.



Example:

ALTER DISKGROUP dgroupA OFFLINE DISKS IN FAILGROUP controller2 DROP AFTER 5H;


The repair time is associated with the disk group. The DROP AFTER clause shown above overrides it for the disks being taken offline; to change the repair time of the disk group itself, use the following command:

SQL> ALTER DISKGROUP dgroupA SET ATTRIBUTE 'DISK_REPAIR_TIME'='3H';




Additional Manual Offline Disk Operations Examples:

SQL> ALTER DISKGROUP DG1 OFFLINE DISK DG1_0003;
SQL> ALTER DISKGROUP DG1 OFFLINE DISK DG1_0003 DROP AFTER 1H;
SQL> ALTER DISKGROUP DG1 OFFLINE DISKS IN FAILGROUP FG1;
SQL> ALTER DISKGROUP dgroupA OFFLINE DISKS IN FAILGROUP controller2 DROP AFTER 5H;

5) After the transient failure has been corrected on the affected disks, you must explicitly bring the disks back online.

Examples:

SQL> ALTER DISKGROUP DG1 ONLINE DISK DG1_0003;

SQL> ALTER DISKGROUP DG1 ONLINE DISKS IN FAILGROUP FG1 POWER 8 WAIT;
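Progress of the resync can be followed from a second session on the ASM instance. This is a sketch only, assuming the 11g views V$ASM_OPERATION and V$ASM_DISK; DG1_0003 is the disk name from the example above, and the exact operation code reported for an online operation is not stated in this note.

```sql
-- Long-running ASM operations currently in progress (for example a disk online).
SELECT group_number, operation, state, power, sofar, est_work, est_minutes
FROM   v$asm_operation;

-- Confirm that the disk is back online once the operation has finished.
SELECT name, mode_status, state, repair_timer
FROM   v$asm_disk
WHERE  name = 'DG1_0003';
```

When no rows are returned by the first query and MODE_STATUS shows ONLINE in the second, the resync should be complete.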

 

6) If you cannot repair a failure group that is in the offline state, you can use the ALTER DISKGROUP DROP DISKS IN FAILGROUP command with the FORCE option. This ensures that data originally stored on these disks is reconstructed from redundant copies of the data and stored on other disks in the same diskgroup.

Example:

 

SQL> ALTER DISKGROUP dgroupA DROP DISKS IN FAILGROUP controller2 FORCE;

Source: ITPUB Blog, link: http://blog.itpub.net/7199859/viewspace-695025/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
