ASM Fast Mirror Resync

Posted by jichengjie on 2018-03-21

ASM Fast Mirror Resync - Example To Simulate Transient Disk Failure And Restore Disk [ID 443835.1]

In this Document
  Purpose
  Scope and Application
  ASM Fast Mirror Resync - Example To Simulate Transient Disk Failure And Restore Disk

Applies to:

Oracle Server - Enterprise Edition - Version: 11.1.0.6 to 11.2.0.1.0 - Release: 11.1 to 11.2
Information in this document applies to any platform.

Purpose:

This note describes ASM Fast Mirror Resync, a feature introduced in Oracle 11g, and walks through an example: we simulate a transient disk failure and recover the disk before the disk repair time expires.

Scope and Application:

All DBAs and users concerned with database and ASM administration activities.

ASM Fast Mirror Resync:

ASM fast resync keeps track of pending changes to extents on an OFFLINE disk during an outage. The extents are resynced when the disk is brought back online or replaced.

By default, ASM drops a disk shortly after it is taken offline. You can set the DISK_REPAIR_TIME attribute to prevent this by specifying a time interval in which to repair the disk and bring it back online. The default DISK_REPAIR_TIME value of 3.6h should be adequate for most environments. The elapsed time (since the disk was set to OFFLINE mode) is incremented only while the disk group containing the offline disks is mounted. The REPAIR_TIMER column of V$ASM_DISK shows the amount of time left (in seconds) before an offline disk is dropped. After the specified time has elapsed, ASM drops the disk.
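
To make the timer concrete, the query below is a minimal sketch of how the remaining repair time could be watched from the ASM instance. It uses only V$ASM_DISK; the HOURS_LEFT alias is an illustrative addition, not something from the original note.

```sql
-- Sketch: list offline disks and the time left before ASM drops them.
-- REPAIR_TIMER is reported in seconds; the default 3.6h equals 12960 seconds.
SELECT group_number,
       name,
       mode_status,
       repair_timer,
       ROUND(repair_timer / 3600, 2) AS hours_left  -- illustrative alias
FROM   v$asm_disk
WHERE  mode_status = 'OFFLINE';
```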

You can override this attribute for specific disks with an ALTER DISKGROUP ... OFFLINE DISK statement and its DROP AFTER clause.

If an ALTER DISKGROUP SET ATTRIBUTE DISK_REPAIR_TIME is issued on a disk group that has disks that are currently offline, the new attribute value applies only to those disks that are not currently in OFFLINE mode.

A disk that is in OFFLINE mode cannot be dropped with an ALTER DISKGROUP DROP DISK statement; an error is returned if attempted. If for some reason the disk needs to be dropped (such as the disk cannot be repaired) before the repair time has expired, a disk can be dropped immediately by issuing a second OFFLINE statement with a DROP AFTER clause specifying 0h or 0m.
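
The two uses of DROP AFTER described above might look like the following sketch. The disk group name DG and the disk name DG_0000 are hypothetical placeholders, and the 30-minute value is only an example.

```sql
-- Hypothetical names: disk group DG, disk DG_0000.
-- Take the disk offline with its own 30-minute timer,
-- overriding the disk group's DISK_REPAIR_TIME for this disk.
ALTER DISKGROUP dg OFFLINE DISK dg_0000 DROP AFTER 30m;

-- Later, if the disk turns out to be unrepairable:
-- a second OFFLINE statement with a zero timer drops it immediately.
ALTER DISKGROUP dg OFFLINE DISK dg_0000 DROP AFTER 0m;
```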

You can use ALTER DISKGROUP to set the DISK_REPAIR_TIME attribute to a specified hour or minute value, such as 4.5 hours or 270 minutes. For example:

alter diskgroup dg set attribute 'disk_repair_time' = '4.5h';
alter diskgroup dg set attribute 'disk_repair_time' = '270m';
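
After changing the attribute, it can be worth confirming the value that is actually in effect. The following is a sketch that joins V$ASM_ATTRIBUTE to V$ASM_DISKGROUP so the disk group is shown by name; the column aliases are illustrative.

```sql
-- Sketch: show the DISK_REPAIR_TIME currently set for each mounted disk group.
SELECT g.name  AS diskgroup,
       a.name  AS attribute,
       a.value AS value
FROM   v$asm_attribute a
       JOIN v$asm_diskgroup g
         ON g.group_number = a.group_number
WHERE  a.name = 'disk_repair_time';
```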

After you repair the disk, run the SQL statement ALTER DISKGROUP ... ONLINE DISK. This statement brings the repaired disk back online and enables writes to it so that no new writes are missed. It also starts a procedure to copy all of the extents that are marked as stale from their redundant copies.
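
As a sketch of what the statement could look like in practice (DG and DG_0000 are hypothetical names, not taken from the example later in this note):

```sql
-- Hypothetical names: disk group DG, disk DG_0000.
-- Bring one repaired disk back online.
ALTER DISKGROUP dg ONLINE DISK dg_0000;

-- Or bring every offline disk in the disk group back online.
ALTER DISKGROUP dg ONLINE ALL;
```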

If a disk goes offline while the ASM instance is in rolling upgrade mode, the disk remains offline until the rolling upgrade has ended, and the timer for dropping the disk is stopped until the ASM cluster is out of rolling upgrade mode. See "ASM Rolling Upgrade".

Note: To use this feature, the disk group compatibility attributes must be set to 11.1 or higher.
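
The compatibility settings can be checked, and raised if needed, along the lines of the sketch below. The disk group name DG is a hypothetical placeholder. Treat the ALTER statements with care, since compatibility attributes can be advanced but not lowered again.

```sql
-- Show the current compatibility settings of every disk group.
SELECT name, compatibility, database_compatibility
FROM   v$asm_diskgroup;

-- Raise them for a hypothetical disk group DG if they are below 11.1.
-- compatible.asm is advanced first, then compatible.rdbms.
ALTER DISKGROUP dg SET ATTRIBUTE 'compatible.asm'   = '11.1';
ALTER DISKGROUP dg SET ATTRIBUTE 'compatible.rdbms' = '11.1';
```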

The example below simulates a transient disk failure and recovers the disk before the disk repair time expires.

SQL> create diskgroup dgnm11gasm disk '/dev/raw/raw1','/dev/raw/raw2' attribute 'compatible.rdbms'='11.1','compatible.asm'='11.1';
Diskgroup created.

SQL> select group_number,name from v$asm_diskgroup where group_number=1;

GROUP_NUMBER NAME
------------ --------------------
           1 DGNM11GASM

SQL> select name,value from v$asm_attribute where group_number=1;

NAME                 VALUE
-------------------- --------------------
disk_repair_time     3.6h
au_size              1048576
compatible.asm       11.1.0.0.0
compatible.rdbms     11.1.0.0.0

The default disk repair time is 3.6 hours.

Connect to DB Instance

SQL> create tablespace test datafile '+DGNM11GASM' size 20m;
Tablespace created.

Shutdown the DB Instance
Dismount the ASM Diskgroup

SQL> alter diskgroup DGNM11GASM dismount;
Diskgroup altered.

Change the ownership of /dev/raw/raw1 to root so that the oracle user can no longer access it, simulating the loss of the disk

[root@11g ~]# chown root.root /dev/raw/raw1
[root@11g ~]# ls -ltr /dev/raw/raw1
crw-rw---- 1 root root 162, 1 Jul 8 01:47 /dev/raw/raw1

SQL> alter diskgroup dgnm11gasm mount;
alter diskgroup dgnm11gasm mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "0" is missing

With Oracle Database 11g, ASM fails to mount a disk group if any disks or failure groups are missing at mount time. You need to mount the disk group with the FORCE option.

Disk groups mounted with the FORCE option will have one or more disks offline if they were not available at the time of the mount.

SQL> alter diskgroup dgnm11gasm mount force;
Diskgroup altered.

SQL> select path,name,repair_timer from v$asm_disk where group_number=1;

PATH            NAME                 REPAIR_TIMER
--------------- -------------------- ------------
                DGNM11GASM_0000             12960
/dev/raw/raw2   DGNM11GASM_0001                 0

You must take corrective action to restore the missing devices before DISK_REPAIR_TIME expires. Here the offline disk DGNM11GASM_0000 has 12960 seconds (3.6 hours) left.

Connect to DB Instance and add new datafile to the tablespace.

SQL> alter tablespace test add datafile '+DGNM11GASM' size 20m;
Tablespace altered.

Because only one disk is available in this normal-redundancy disk group, no mirror copy is written until the lost disk is made accessible to the oracle user again and brought online with ALTER DISKGROUP ... ONLINE DISK, or until a new disk is added to the disk group.
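
A quick way to confirm this degraded state from the ASM instance is sketched below. It relies only on V$ASM_DISKGROUP columns and is an illustrative check, not a step from the original note.

```sql
-- TYPE shows the redundancy level (EXTERN, NORMAL or HIGH);
-- OFFLINE_DISKS counts the disks currently offline in the group.
SELECT name, type, total_mb, free_mb, offline_disks
FROM   v$asm_diskgroup
WHERE  name = 'DGNM11GASM';
```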

[root@11g ~]# chown oracle.dba /dev/raw/raw1

SQL> alter diskgroup dgnm11gasm online disk DGNM11GASM_0000;
Diskgroup altered.

SQL> select group_number,operation,state,power from v$asm_operation;

GROUP_NUMBER OPERA STAT      POWER
------------ ----- ---- ----------
           1 ONLIN RUN           1
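
To follow the resync until it finishes, the same view can be polled with its progress columns, as in the sketch below. Whether the estimate columns are populated for an ONLINE operation is not shown in the original note, so treat those values as indicative only.

```sql
-- Sketch: poll the progress of the running ASM operation.
-- When the query returns no rows, no operation is running any more.
SELECT group_number, operation, state, power,
       sofar, est_work, est_minutes
FROM   v$asm_operation;
```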

ASM fast resync tracked the pending changes to extents on the OFFLINE disk during the outage, so the ONLINE operation above only has to resync those extents. Once it completes, both disks are members of the disk group again:

SQL> select path,header_status,mount_status from v$asm_disk where group_number=1;

PATH            HEADER_STATU MOUNT_S
--------------- ------------ -------
/dev/raw/raw2   MEMBER       CACHED
/dev/raw/raw1   MEMBER       CACHED

Source: "ITPUB Blog", link: http://blog.itpub.net/26870952/viewspace-2152089/. If you repost this article, please credit the source; otherwise legal action may be taken.
