How Assigning a PVID to a Disk on AIX Corrupts ASM Disks

Posted by luckyfriends on 2015-02-05

    This post uses two MOS notes to discuss how assigning a PVID to a disk on AIX can damage an ASM disk.

Note 1:
   This note explains that assigning a PVID to an existing ASM disk destroys the ASM disk header, so the ASM diskgroup can no longer be mounted.

Assigning a Physical Volume ID (PVID) To An Existing ASM Disk Corrupts the ASM Disk Header (Doc ID 353761.1)
Last modified: 2013-4-19  Type: ALERT

In this Document

Description
Occurrence
Symptoms
Workaround
History
References


APPLIES TO:

Oracle Database - Enterprise Edition - Version 10.1.0.2 to 11.2.0.3 [Release 10.1 to 11.2]
IBM AIX on POWER Systems (64-bit)
***Checked for relevance on 30-Apr-2010***

AIX5L Based Systems (64-bit)


DESCRIPTION

Assigning a Physical Volume ID (PVID) to an existing ASM disk will destroy the ASM disk header, rendering the
ASM disk unusable.

Various documents, including the 10gR1 and 10gR2 installation instructions for AIX platforms, suggest assigning
a PVID to disks to be used for ASM with the following command:
  

  # /usr/sbin/chdev -l hdiskn -a pv=yes


These documents furthermore suggest that this command is to be run on ALL nodes of a RAC cluster. This
does not present a problem as long as the disks have not yet been used by ASM. If, however, the disks are
already in use and the above command is issued against an ASM disk, the disk header will be destroyed.
This is likely to happen when a new node is added to an existing RAC cluster, as the documentation seems
to imply that the command has to be run on all nodes.

To check whether a device has an associated PVID, use lspv:
EXAMPLE:


# lspv

hdisk0 0003286f04bc73ee rootvg active
hdisk1 0003286f867d77e1 rootvg active
hdisk2 0003286fb3470dae vg01 active
hdisk3 0003286fb3474190 vg01 active
hdisk4 0003286fb34747d1 vg01 active
hdisk5 0003286fb3474dff vg01 active
hdisk6 0003286fb3475428 vg01 active
hdisk7 0003286fb347607d vg01 active
hdisk8 0003286fb34766f3 vg01 active
hdisk9 0003286fb3476d70 vg01 active
hdisk10 0003286fb34773d5 vg01 active
hdisk11 0003286fb34780b8 vg01 active
hdisk12 0003286fb347872f vg01 active
hdisk13 0003286fb347940c vg01 active
hdisk14 0003286fb3479a7b vg01 active

The second column is the PVID.
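
As a minimal sketch of how this check could be scripted, the function below reads lspv-style output on standard input and prints the disks that carry a PVID. It assumes that a disk without a PVID shows "none" in the second column; the function name and the sample rows are hypothetical and only illustrate the idea.

```shell
#!/bin/sh
# List the disks that currently carry a PVID.
# Input (stdin): lspv-style rows: <disk> <pvid> <volume group> [state]
# Assumption: a disk without a PVID shows "none" in the second column.
disks_with_pvid() {
    awk '$1 ~ /^hdisk/ && $2 != "none" { print $1 }'
}

# Hypothetical sample rows, for illustration only.
sample='hdisk0 0003286f04bc73ee rootvg active
hdisk5 none None
hdisk6 0003286fb3475428 None'

printf '%s\n' "$sample" | disks_with_pvid
```

With the sample rows the function prints hdisk0 and hdisk6. On an AIX host the same function could be fed directly, for example `lspv | disks_with_pvid`, and the result compared with the list of disks handed to ASM.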

OCCURRENCE

This is more likely to happen in a RAC environment, specifically if a new node is added to an existing
cluster.

SYMPTOMS

If the 'chdev' command is run while ASM instances have the disk mounted, nothing will be noticed immediately,
as the disk header is only read when the disk is mounted. If, however, the diskgroup is unmounted and re-mounted
(e.g. ASM instance restart), the disk is no longer recognized as an ASM disk and the diskgroup mount will fail
with
ORA-15063 "diskgroup \"%s\" lacks quorum of %s PST disks; %s found"
or
ORA-15063: ASM discovered an insufficient number of disks for diskgroup %s
ORA-15040: diskgroup is incomplete
ORA-15042: ASM disk "%s" is missing
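
One rough way to see whether a disk still looks like an ASM disk is to check for the "ORCLDISK" tag near the start of the disk. The sketch below is not taken from the note: it assumes the 8-byte tag sits at byte offset 32 of the disk, and the function name and device name are hypothetical. A missing tag on a disk that ASM was using suggests the header has been overwritten; a present tag does not prove that the rest of the header is intact.

```shell
#!/bin/sh
# Report whether a device (or file) carries the "ORCLDISK" tag.
# Assumption (not from the note): the 8-byte tag sits at byte offset 32.
has_asm_tag() {
    # Read one 4 KB block, cut out bytes 32..39, and drop NUL bytes.
    tag=$(dd if="$1" bs=4096 count=1 2>/dev/null |
          dd bs=1 skip=32 count=8 2>/dev/null |
          tr -d '\000')
    [ "$tag" = "ORCLDISK" ]
}

# Example call on AIX (hypothetical device name):
#   has_asm_tag /dev/hdisk5 && echo "ASM tag present" || echo "ASM tag missing"
```

The function only reads from the device and never writes to it, so it can be run while deciding whether to open an SR.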

WORKAROUND

Do not assign a PVID to ASM disks. Contrary to the documentation, PVIDs are not required for ASM disks,
as ASM uses the ASM disk header to discover its disks.
This has been addressed in (Documentation) Bug 3636335, which states:

       "This is a doc. bug and we are going to clearly document not to put PVIDs on
        disks given to ASM. The idea here is that ASM is the one which manages the
        disk and not any OS / vendor volume managers etc., PVIDs are needed for volume
        groups to work. For ASM to work, PVIDs are not needed. ASM has its own headers
        to identify the disk which is what is getting written here. "

As long as there is still an ASM instance which has the disk(group) mounted, the
contents can still be backed up via RMAN; do this as soon as possible.

The action plan from Document 750016.1 can also be applied. It is also recommended to raise an SR with Oracle Support.

HISTORY

 Checked for relevance on 18-APR-2013

REFERENCES

 - PVID IN DISK HEADER IS OVERWRITTEN AFTER ADDING A NEW DISK TO ASM DISKGROUP
NOTE:750016.1 - Corrective Action for ASM Diskgroup with Disks Having PVIDs on AIX


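Note 2:
    This note covers two issues. First, if the disks already had PVIDs before the ASM diskgroup was created, creating the diskgroup succeeds and overwrites the PVID in the disk header. But because the PVID is stored both in the disk header and in the ODM, AIX may try, once the server reboots, to write the PVID from the ODM back onto the disk header, which destroys the ASM disk header. Second, if this situation has occurred, the note shows how to clear the disks' PVIDs before the operating system is rebooted.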

Corrective Action for ASM Diskgroup with Disks Having PVIDs on AIX (Doc ID 750016.1)
Last modified: 2013-4-7  Type: HOWTO

In this Document

Goal
Solution
References


APPLIES TO:

Oracle Database - Enterprise Edition - Version 10.1.0.2 to 11.2.0.2 [Release 10.1 to 11.2]
IBM AIX on POWER Systems (64-bit)
IBM AIX Based Systems (64-bit)


GOAL

You have created a diskgroup from disks that have PVIDs, and the diskgroup is in use. No diskgroup metadata corruption has been reported yet. You now know that ASM disks should not have a PVID, as alerted in MetaLink Note 353761.1.

This note will give the steps to clear the PVID of these ASM Disks.

SOLUTION

When a PVID is set on a disk in a volume group, the PVID is stored in two locations: in the physical disk header (within the first 4K) and in AIX's system object database, the ODM (Object Data Manager).

When the diskgroup is created, the PVID information in the disk header is overwritten. However, when the OS is rebooted, AIX might try to restore the PVID from the ODM onto the disk header,
thereby destroying the ASM metadata.

If the ASM disk header metadata has not been overwritten by the PVID from the ODM (that is, before a reboot), you can follow these steps to remove the PVID for the disks from the ODM:

1] Do not reboot any node.

1.1] Drop one disk at a time from the diskgroup.

1.2] Clear the PVID of the dropped disk

# chdev -l hdisk5 -a pv=clear

Run this on ALL the nodes in case of RAC.

1.3] From ALL the nodes, check that the disk no longer has a PVID

# lspv

1.4] Add the disk back to the diskgroup

1.5] Do this for all the disks having PVID in the diskgroup, one by one. Take care that the rebalance is complete from the drop/add disk command before going for the next disk.
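
The per-disk sequence of option 1 can be laid out as a dry run before anything is touched. The sketch below only prints the commands for review and executes nothing. The diskgroup name, the ASM disk name, and the /dev/r<hdisk> device path are hypothetical examples, and querying v$asm_operation is one assumed way to confirm that the rebalance has finished.

```shell
#!/bin/sh
# Print, without executing anything, the option 1 sequence for one disk.
# All three arguments are hypothetical examples:
#   $1 diskgroup name, $2 ASM disk name, $3 AIX hdisk name
plan_option1() {
    dg=$1
    asm_disk=$2
    hdisk=$3
    printf 'SQL> ALTER DISKGROUP %s DROP DISK %s;\n' "$dg" "$asm_disk"
    printf 'SQL> SELECT operation, state FROM v$asm_operation;\n'
    printf '     (repeat until no rows are returned: rebalance finished)\n'
    printf '# chdev -l %s -a pv=clear     (on every RAC node)\n' "$hdisk"
    printf '# lspv | grep -w %s           (on every RAC node)\n' "$hdisk"
    # Assumption: ASM accesses the disk through the raw device /dev/r<hdisk>.
    printf "SQL> ALTER DISKGROUP %s ADD DISK '/dev/r%s';\n" "$dg" "$hdisk"
    printf 'SQL> SELECT operation, state FROM v$asm_operation;\n'
}

plan_option1 DATA DATA_0003 hdisk5
```

Printing the plan rather than running it keeps a person in the loop for each disk, which matches the note's advice to finish one disk, including its rebalance, before starting the next.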

OR

2] This needs downtime:

2.1] Take a 'dd' backup of the disk headers

# dd if=/dev/hdisk5 of=/tmp/d5.txt bs=1024 count=1024

2.2] Shutdown ASM instance ( on ALL the nodes in RAC setup ).

2.3] Clear the PVID

# chdev -l hdisk5 -a pv=clear

Run this on ALL the nodes in case of RAC.

2.4] From ALL the nodes, check that the disk no longer has a PVID

# lspv

2.5] Start the ASM Instance(s) and mount the diskgroup on ALL the nodes


WARNING:
The commands under option 2 overwrite the content of the disk header and so could be destructive if not used correctly. If you have any doubt, raise an SR with Oracle Support before taking any action.
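
Because option 2 depends on the header backup from step 2.1, it may be worth confirming that the backup was really written before going further. The sketch below wraps the same dd invocation (bs=1024 and count=1024, that is 1024 x 1024 = 1048576 bytes) and checks the size of the result. The function name and the output path are hypothetical.

```shell
#!/bin/sh
# Save the first 1 MiB of a disk (bs=1024 x count=1024, as in step 2.1)
# and confirm that the backup file has the expected size.
backup_header() {
    src=$1
    out=$2
    dd if="$src" of="$out" bs=1024 count=1024 2>/dev/null || return 1
    size=$(wc -c < "$out" | tr -d ' ')
    [ "$size" -eq 1048576 ]
}

# Example call (hypothetical output path):
#   backup_header /dev/hdisk5 /tmp/hdisk5_header.bak || echo "backup failed" >&2
```

The function reads from the disk and writes only to the backup file, so it does not change the header itself.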


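Summary:
    Whether a PVID is assigned manually or by AIX automatically, it destroys the ASM disk header and leaves the ASM diskgroup unable to mount. To avoid this, follow these rules:
1) Before creating an ASM diskgroup, clear the PVID of every disk that ASM will use, on every node.
2) Once a diskgroup has been created, never assign a PVID to an ASM disk manually.
3) Once a diskgroup has been created, back up its metadata manually with the md_backup command of the ASMCMD tool.
4) When planning, build each diskgroup from two or more ASM disks.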
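
For rule 3, a small helper can build a date-stamped md_backup command line. This is only a sketch: it assumes the 11.2-style ASMCMD syntax (backup file first, then -G and the diskgroup name), which differs in other releases, and the diskgroup name, backup directory, and file naming are hypothetical. The helper prints the command rather than running it.

```shell
#!/bin/sh
# Build a date-stamped `asmcmd md_backup` command line for one diskgroup.
# Assumption: 11.2-style syntax (backup file first, then -G <diskgroup>).
# The diskgroup name and backup directory used below are hypothetical.
md_backup_cmd() {
    dg=$1
    dir=$2
    stamp=$3
    printf 'asmcmd md_backup %s/%s_md_%s.bak -G %s\n' "$dir" "$dg" "$stamp" "$dg"
}

md_backup_cmd DATA /backup/asm "$(date +%Y%m%d)"
```

Check the syntax against the ASMCMD documentation for the installed release before running the printed command.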

--end--      

From "ITPUB Blog", link: http://blog.itpub.net/14710393/viewspace-1427726/. If you repost this article, please credit the source; otherwise legal action will be pursued.
