How to restore ASM based OCR after complete loss of the CRS diskgroup
Reposted for reference; a real-world case is likely to be more complicated than this.
How to restore ASM based OCR after complete loss of the CRS diskgroup on Linux/Unix systems
Applies to:
Oracle Database - Enterprise Edition - Version 11.2.0.1.0 to 11.2.0.4 [Release 11.2]
Information in this document applies to any platform.
Goal
It is not possible to directly restore a manual or automatic OCR
backup if the OCR is located in an ASM disk group. The reason is that
the command 'ocrconfig -restore' requires ASM to be up and running in
order to restore an OCR backup to an ASM disk group, but for ASM to be
available the CRS stack must have been started successfully. For the
restore to succeed, the OCR also must not be in use (r/w), i.e. no CRS
daemon may be running while the OCR is being restored.
A description of the general procedure to restore the OCR can be found in the documentation.
This document explains how to recover from a complete loss of the ASM
disk group that held the OCR and Voting files in an 11gR2 Grid
environment.
Solution
When using an ASM disk group for CRS there are typically 3 different types of files located in the disk group that potentially need to be restored/recreated:
- the Oracle Cluster Registry file (OCR)
- the Voting file(s)
- the shared SPFILE for the ASM instances
The following example assumes that the OCR was located in a single
disk group used exclusively for CRS. The disk group has just one disk
using external redundancy.
Since the CRS disk group has been lost, the CRS stack will not be available on any node.
The following settings used in the example would need to be replaced according to the actual configuration:
GRID user: oragrid
GRID home: /u01/app/11.2.0/grid ($CRS_HOME)
ASM disk group name for OCR: CRS
ASM/ASMLIB disk name: ASMD40
Linux device name for ASM disk: /dev/sdh1
Cluster name: rac_cluster1
Nodes: racnode1, racnode2
1. Locate the latest automatic OCR backup
When using a non-shared CRS home, automatic OCR backups can be located
on any node of the cluster, so all nodes need to be checked
for the most recent backup:
-rw------- 1 root root 7331840 Mar 10 18:52 week.ocr
-rw------- 1 root root 7651328 Mar 26 01:33 week_.ocr
-rw------- 1 root root 7651328 Mar 29 01:33 day.ocr
-rw------- 1 root root 7651328 Mar 30 01:33 day_.ocr
-rw------- 1 root root 7651328 Mar 30 01:33 backup02.ocr
-rw------- 1 root root 7651328 Mar 30 05:33 backup01.ocr
-rw------- 1 root root 7651328 Mar 30 09:33 backup00.ocr
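The listing command itself is not shown in this copy. In 11.2 the automatic OCR backups are written by default to $CRS_HOME/cdata/<cluster name>, so a listing like the one above would typically come from `ls -lrt` in that directory. The sketch below assumes that default location; `newest_ocr_backup` is a hypothetical helper name, not an Oracle tool.

```shell
# Sketch: print the most recently modified *.ocr file in a backup directory.
# Assumption: backups live in the default location $CRS_HOME/cdata/<cluster name>.
CRS_HOME=${CRS_HOME:-/u01/app/11.2.0/grid}
CLUSTER_NAME=${CLUSTER_NAME:-rac_cluster1}

newest_ocr_backup() {
  dir=${1:-$CRS_HOME/cdata/$CLUSTER_NAME}
  # ls -t sorts by modification time, newest first; head keeps the first entry.
  ls -t "$dir"/*.ocr 2>/dev/null | head -n 1
}

# Usage (as root on every node, then compare the timestamps by hand):
#   ls -l "$(newest_ocr_backup)"
```

Because the helper only looks at one node, the comparison across racnode1 and racnode2 still has to be done manually.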
2. Make sure the Grid Infrastructure is shut down on all nodes
Given that the OCR diskgroup is missing, the GI stack will not be
functional on any node; however, there may still be various daemon
processes running. On each node, shut down the GI stack using the force
(-f) option:
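The command for this step is missing from this copy; a forced stop is normally issued with `crsctl stop crs -f`. The sketch below wraps it and adds a hypothetical filter for spotting leftover daemons in `ps -ef` output. The daemon names in the filter are illustrative, not an exhaustive list.

```shell
# Sketch only: CRSCTL can be overridden, e.g. to rehearse the step without a cluster.
CRS_HOME=${CRS_HOME:-/u01/app/11.2.0/grid}
CRSCTL=${CRSCTL:-$CRS_HOME/bin/crsctl}

# Force-stop the whole GI stack on the local node (run as root).
stop_gi_force() {
  "$CRSCTL" stop crs -f
}

# Reduce `ps -ef` output on stdin to lines that look like GI daemons.
# Assumption: this name list is a sample, not the complete set of daemons.
leftover_gi_daemons() {
  grep -E '(ohasd|crsd|ocssd|evmd)\.bin'
}

# Usage on each node:
#   stop_gi_force
#   ps -ef | leftover_gi_daemons
```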
3. Start the CRS stack in exclusive mode
On the node that has the most recent OCR backup, log on as root and
start CRS in exclusive mode. This mode allows ASM to start and
stay up without the presence of a Voting disk and without the CRS daemon
process (crsd.bin) running.
11.2.0.1:
...
CRS-2672: Attempting to start 'ora.asm' on 'racnode1'
CRS-2676: Start of 'ora.asm' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'racnode1'
CRS-2676: Start of 'ora.crsd' on 'racnode1' succeeded
This document assumes that the CRS diskgroup was completely lost, in which case the CRS daemon (resource ora.crsd) will terminate again because the OCR is inaccessible, even if the above message indicates that the start succeeded.
If this is not the case, i.e. if the CRS diskgroup is still present (but corrupt or incorrect), the CRS daemon needs to be shut down manually using:
otherwise the subsequent OCR restore will fail.
11.2.0.2 and above:
CRS-4123: Oracle High Availability Services has been started.
...
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'auw2k3'
CRS-2672: Attempting to start 'ora.ctssd' on 'racnode1'
CRS-2676: Start of 'ora.drivers.acfs' on 'racnode1' succeeded
CRS-2676: Start of 'ora.ctssd' on 'racnode1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'racnode1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'racnode1'
CRS-2676: Start of 'ora.asm' on 'racnode1' succeeded
A new option '-nocrs' has been introduced with 11.2.0.2, which prevents the start of the ora.crsd resource. It is vital that this option is specified, otherwise the failure to start the ora.crsd resource will tear down ora.cluster_interconnect.haip, which in turn will cause ASM to crash.
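The start commands for both variants are missing from this copy. A minimal sketch of the version-dependent choice follows, assuming a version string of the form 11.2.0.x; the function names are hypothetical. `stop_crsd` covers the 11.2.0.1 case described above, where the diskgroup still exists and ora.crsd has to be stopped by hand.

```shell
# Sketch only: CRSCTL can be overridden for a rehearsal.
CRS_HOME=${CRS_HOME:-/u01/app/11.2.0/grid}
CRSCTL=${CRSCTL:-$CRS_HOME/bin/crsctl}

# Print the exclusive-mode flags for a given 11.2.0.x version.
# Assumption: the fourth version field decides (1 = 11.2.0.1, 2 or higher = -nocrs).
excl_start_flags() {
  patch=$(printf '%s' "$1" | cut -d. -f4)
  if [ "${patch:-0}" -ge 2 ]; then
    echo "-excl -nocrs"
  else
    echo "-excl"
  fi
}

# Start CRS in exclusive mode on the local node (run as root).
start_crs_exclusive() {
  # The flags are left unquoted on purpose so they split into separate arguments.
  "$CRSCTL" start crs $(excl_start_flags "$1")
}

# 11.2.0.1 only, and only if the CRS diskgroup still exists: stop ora.crsd.
stop_crsd() {
  "$CRSCTL" stop res ora.crsd -init
}

# Usage:
#   start_crs_exclusive 11.2.0.3
```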
4. Label the CRS disk for ASMLIB use
If using ASMLIB, the disk to be used for the CRS disk group needs to be stamped first. As user root, do:
Writing disk header: done
Instantiating disk: done
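The stamping command is not shown in this copy. With ASMLIB, a disk is normally labelled with `oracleasm createdisk <label> <device>`. The sketch below assumes the binary is at /usr/sbin/oracleasm, which may differ per installation; `label_asm_disk` is a hypothetical wrapper.

```shell
# Sketch only: ORACLEASM can be overridden if the tool lives elsewhere.
ORACLEASM=${ORACLEASM:-/usr/sbin/oracleasm}

# Stamp a block device with an ASMLIB label (run as root).
# Warning: this writes a new disk header to the device.
label_asm_disk() {
  "$ORACLEASM" createdisk "$1" "$2"
}

# Usage with the example values from this document:
#   label_asm_disk ASMD40 /dev/sdh1
```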
5. Create the CRS diskgroup via sqlplus
The disk group can now be (re-)created via sqlplus from the grid user. The compatible.asm attribute must be set to 11.2 in order for the disk group to be used by CRS:
SQL*Plus: Release 11.2.0.1.0 Production on Tue Mar 30 11:47:24 2010
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create diskgroup CRS external redundancy disk 'ORCL:ASMD40' attribute 'COMPATIBLE.ASM' = '11.2';
Diskgroup created.
SQL> exit
6. Restore the latest OCR backup
Now that the CRS disk group is created and mounted, the OCR can be restored. This must be done as the root user:
# $CRS_HOME/bin/ocrconfig -restore backup00.ocr
7. Start the CRS daemon on the current node (11.2.0.1 only!)
Now that the OCR has been restored, the CRS daemon can be started; this
is needed to recreate the Voting file. Skip this step for 11.2.0.2 and above.
CRS-2672: Attempting to start 'ora.crsd' on 'racnode1'
CRS-2676: Start of 'ora.crsd' on 'racnode1' succeeded
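The command that produced the output above is missing here; ora.crsd is normally started as an init resource with `crsctl start res ora.crsd -init`. A small sketch that applies the "11.2.0.1 only" rule follows, again assuming an 11.2.0.x version string. The function name is hypothetical.

```shell
# Sketch only: CRSCTL can be overridden for a rehearsal.
CRS_HOME=${CRS_HOME:-/u01/app/11.2.0/grid}
CRSCTL=${CRSCTL:-$CRS_HOME/bin/crsctl}

# Start ora.crsd only when the fourth version field is 1 (11.2.0.1).
start_crsd_if_needed() {
  patch=$(printf '%s' "$1" | cut -d. -f4)
  if [ "$patch" = "1" ]; then
    "$CRSCTL" start res ora.crsd -init
  else
    echo "skipping: crsd start is only needed on 11.2.0.1"
  fi
}

# Usage (as root on the node used for the restore):
#   start_crsd_if_needed 11.2.0.1
```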
8. Recreate the Voting file
The Voting file needs to be initialized in the CRS disk group:
Successful addition of voting disk 00caa5b9c0f54f3abf5bd2a2609f09a9.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced
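The output above matches what `crsctl replace votedisk +<diskgroup>` prints, although the command itself is not shown in this copy. The sketch below wraps it together with `crsctl query css votedisk`, which can be used afterwards to list the voting files. The wrapper names are hypothetical.

```shell
# Sketch only: CRSCTL can be overridden for a rehearsal.
CRS_HOME=${CRS_HOME:-/u01/app/11.2.0/grid}
CRSCTL=${CRSCTL:-$CRS_HOME/bin/crsctl}

# Recreate the voting file(s) in the given ASM disk group (name without '+').
recreate_votedisk() {
  "$CRSCTL" replace votedisk "+$1"
}

# List the voting files currently known to CSS.
list_votedisks() {
  "$CRSCTL" query css votedisk
}

# Usage (as root):
#   recreate_votedisk CRS && list_votedisks
```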
9. Recreate the SPFILE for ASM (optional)
This step should probably be skipped in any of the following cases:
- not using an SPFILE for ASM
- not using a shared SPFILE for ASM
- using a shared SPFILE not stored in ASM (e.g. on a cluster file system)
Also take extra care with the asm_diskstring parameter, as it affects the discovery of the voting disks.
Please verify the previous settings using the ASM alert log.
Prepare a pfile (e.g. /tmp/asm_pfile.ora) with the ASM startup
parameters; these may vary from the example below. If in doubt, consult
the ASM alert log, as the ASM instance startup should list all
non-default parameter values. Please note that the last startup of ASM (in
step 3 via the exclusive CRS start) will not have used an SPFILE, so a startup prior
to the loss of the CRS disk group needs to be located.
*.diagnostic_dest='/u01/app/oragrid'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'
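One way to produce that file is a here-document, sketched below with exactly the four example parameters above. The function name is hypothetical, and the values must be replaced with those found in the ASM alert log of the affected system.

```shell
# Sketch: write the example ASM pfile shown above to the given path.
# Assumption: these four values are the example's, not those of a real system.
write_asm_pfile() {
  cat > "$1" <<'EOF'
*.diagnostic_dest='/u01/app/oragrid'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='EXCLUSIVE'
EOF
}

# Usage (as the grid user):
#   write_asm_pfile /tmp/asm_pfile.ora
```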
Now the SPFILE can be created using this PFILE:
SQL*Plus: Release 11.2.0.1.0 Production on Tue Mar 30 11:52:39 2010
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Real Application Clusters and Automatic Storage Management options
SQL> create spfile='+CRS' from pfile='/tmp/asm_pfile.ora';
File created.
SQL> exit
10. Shut down CRS
Since CRS is running in exclusive mode, it needs to be shut down to allow
CRS to run on all nodes again. Use of the force (-f) option may be required:
...
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'auw2k3' has completed
CRS-4133: Oracle High Availability Services has been stopped.
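The stop command is not shown in this copy. A minimal sketch of "stop, and force only if required" follows. It assumes that `crsctl stop crs` returns a non-zero exit status when the stop fails, which should be verified on the target release; the function name is hypothetical.

```shell
# Sketch only: CRSCTL can be overridden for a rehearsal.
CRS_HOME=${CRS_HOME:-/u01/app/11.2.0/grid}
CRSCTL=${CRSCTL:-$CRS_HOME/bin/crsctl}

# Try a normal stop first; fall back to the force option if it fails.
# Assumption: a failed stop is reported through the exit status.
stop_crs_exclusive() {
  "$CRSCTL" stop crs || "$CRSCTL" stop crs -f
}

# Usage (as root on the node running in exclusive mode):
#   stop_crs_exclusive
```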
11. Rescan ASM disks
If using ASMLIB, rescan all ASM disks on each node as the root user:
Reloading disk partitions: done
Cleaning any stale ASM disks...
Scanning system for ASM disks...
Instantiating disk "ASMD40"
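The output above is what `oracleasm scandisks` prints; the command line itself is missing from this copy. The sketch below rescans and then checks that the expected label appears in `oracleasm listdisks`. The path /usr/sbin/oracleasm and the function name are assumptions.

```shell
# Sketch only: ORACLEASM can be overridden if the tool lives elsewhere.
ORACLEASM=${ORACLEASM:-/usr/sbin/oracleasm}

# Rescan ASMLIB disks, then succeed only if the given label is listed.
rescan_and_verify() {
  "$ORACLEASM" scandisks &&
    "$ORACLEASM" listdisks | grep -qx "$1"
}

# Usage (as root on each node):
#   rescan_and_verify ASMD40 || echo "ASMD40 not visible on $(hostname)"
```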
12. Start CRS
As the root user, start CRS on all cluster nodes:
CRS-4123: Oracle High Availability Services has been started.
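The message above is what a regular `crsctl start crs` reports; the command is not shown in this copy. A minimal wrapper is sketched below, with a hypothetical function name, to be run locally on each node.

```shell
# Sketch only: CRSCTL can be overridden for a rehearsal.
CRS_HOME=${CRS_HOME:-/u01/app/11.2.0/grid}
CRSCTL=${CRSCTL:-$CRS_HOME/bin/crsctl}

# Start the full GI stack on the local node (run as root).
start_crs() {
  "$CRSCTL" start crs
}

# Usage, once on racnode1 and once on racnode2:
#   start_crs
```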
13. Verify CRS
To verify that CRS is fully functional again:
**************************************************************
racnode1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
racnode2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
# $CRS_HOME/bin/crsctl status resource -t
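The per-node listing above has the shape of `crsctl check cluster -all` output. As a rough sketch, that output can be checked mechanically by counting the three "online" message codes shown in this document per node header. The function name is hypothetical, and the parsing assumes the exact layout shown above.

```shell
# Sketch: read `crsctl check cluster -all` output on stdin and succeed only
# if every node header is followed by all three "online" codes.
# Assumption: node headers are lines of the form "<hostname>:".
check_cluster_online() {
  awk '
    /^[A-Za-z0-9_.-]+:$/          { nodes++ }   # e.g. "racnode1:"
    /CRS-4537|CRS-4529|CRS-4533/  { online++ }  # CRS, CSS, EVM online
    END { exit !(nodes > 0 && online == 3 * nodes) }
  '
}

# Usage (as root):
#   $CRS_HOME/bin/crsctl check cluster -all | check_cluster_online && echo OK
```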
From the ITPUB blog, link: http://blog.itpub.net/21754115/viewspace-1679628/. Please credit the source when reposting; otherwise legal action may be pursued.