Oracle 10gR2 RAC: recovering the OCR and voting disk without a backup

Posted by sun642514265 on 2013-12-04

This example uses the approach of reinitializing the OCR and the voting disk.
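Before tearing anything down, it helps to record where the OCR currently lives, since the rebuild reuses the same devices. A minimal sketch (on Linux the OCR location is kept in /etc/oracle/ocr.loc, which the cleanup later in this procedure deletes; the raw-device layout is inferred from the root.sh output further down):

[root@rac3 ~]# cat /etc/oracle/ocr.loc
[root@rac3 ~]# ls -l /dev/raw/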

1. Stop the Clusterware stack on all nodes
In this example the srvctl commands are issued from a single node, rac3

[oracle@rac3 rac3]$ srvctl stop database -d ccupdb
[oracle@rac3 rac3]$ srvctl stop asm -n rac3
[oracle@rac3 rac3]$ srvctl stop asm -n rac4
[oracle@rac3 rac3]$ srvctl stop nodeapps -n rac3
[oracle@rac3 rac3]$ srvctl stop nodeapps -n rac4
[oracle@rac3 rac3]$ crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
ora….b1.inst application 0/5 0/0 OFFLINE OFFLINE
ora….b2.inst application 0/5 0/0 OFFLINE OFFLINE
ora.ccupdb.db application 0/0 0/1 OFFLINE OFFLINE
ora….SM1.asm application 0/5 0/0 OFFLINE OFFLINE
ora….C3.lsnr application 0/5 0/0 OFFLINE OFFLINE
ora.rac3.gsd application 0/5 0/0 OFFLINE OFFLINE
ora.rac3.ons application 0/3 0/0 OFFLINE OFFLINE
ora.rac3.vip application 0/0 0/0 OFFLINE OFFLINE
ora….SM2.asm application 0/5 0/0 OFFLINE OFFLINE
ora….C4.lsnr application 0/5 0/0 OFFLINE OFFLINE
ora.rac4.gsd application 0/5 0/0 OFFLINE OFFLINE
ora.rac4.ons application 0/3 0/0 OFFLINE OFFLINE
ora.rac4.vip application 0/0 0/0 OFFLINE OFFLINE

[root@rac3 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.

[root@rac3 bin]# ./crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM

[root@rac4 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.

[root@rac4 bin]# ./crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM

The above shows the normal, orderly sequence for stopping CRS; alternatively, the last command (crsctl stop crs, run as root on each node) stops everything directly.

2. As the root user, run $ORA_CRS_HOME/install/rootdelete.sh on every node
[root@rac3 install]# sh rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down…
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'

[root@rac4 install]# sh rootdelete.sh
Shutting down Oracle Cluster Ready Services (CRS):
Stopping resources.
Error while stopping resources. Possible cause: CRSD is down.
Stopping CSSD.
Unable to communicate with the CSS daemon.
Shutdown has begun. The daemons should exit soon.
Checking to see if Oracle CRS stack is down…
Oracle CRS stack is not running.
Oracle CRS stack is down now.
Removing script for Oracle Cluster Ready services
Updating ocr file for downgrade
Cleaning up SCR settings in '/etc/oracle/scls_scr'

3. Run $ORA_CRS_HOME/install/rootdeinstall.sh on only one of the nodes
[root@rac3 install]# sh rootdeinstall.sh

Removing contents from OCR device
2560+0 records in
2560+0 records out

-- The command above already clears the relevant data; in this example we also clean up manually to be certain.
Edit /etc/inittab and remove the following three lines:
h1:2:respawn:/etc/init.d/init.evmd run >/dev/null 2>&1
h2:2:respawn:/etc/init.d/init.cssd fatal >/dev/null 2>&1
h3:2:respawn:/etc/init.d/init.crsd run >/dev/null 2>&1
rm -rf /etc/oracle/* 
rm -f /etc/init.d/init.cssd
rm -f /etc/init.d/init.crs
rm -f /etc/init.d/init.crsd
rm -f /etc/init.d/init.evmd
rm -f /etc/inittab.crs
cp /etc/inittab.orig /etc/inittab 
rm -rf /var/tmp/.oracle
rm -rf /tmp/.oracle 
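Before re-running root.sh, it is worth confirming that the raw devices are still bound and owned as expected. A sketch only: /dev/raw/raw1 for the OCR and /dev/raw/raw2 for the voting disk are assumptions inferred from the root.sh output below; typically the OCR device is owned by root:oinstall and the voting disk by the oracle user.

[root@rac3 ~]# raw -qa
[root@rac3 ~]# ls -l /dev/raw/raw1 /dev/raw/raw2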

4. On the same node used in step 3, run $ORA_CRS_HOME/root.sh
[root@rac3 crs]# pwd
/oracle/crs
[root@rac3 crs]# sh root.sh
WARNING: directory '/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle' is not owned by root
assigning default hostname rac3 for node 1.
assigning default hostname rac4 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rac3 rac3-priv rac3
node 2: rac4 rac4-priv rac4
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Now formatting voting device: /dev/raw/raw2
Format of 1 voting devices complete.
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
rac3
CSS is inactive on these nodes.
rac4
Local node checking complete.
Run root.sh on remaining nodes to start CRS daemons.

-- Note the last lines of the output: root.sh must also be run on the remaining node.

5. Run $ORA_CRS_HOME/root.sh on the other node
[root@rac4 crs]# pwd
/oracle/crs
[root@rac4 crs]# sh root.sh
WARNING: directory '/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured

Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle' is not owned by root
clscfg: EXISTING configuration version 3 detected.
clscfg: version 3 is 10G Release 2.
assigning default hostname rac3 for node 1.
assigning default hostname rac4 for node 2.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node <nodenumber>: <nodename> <private interconnect name> <hostname>
node 1: rac3 rac3-priv rac3
node 2: rac4 rac4-priv rac4
clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 90 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
CSS is active on these nodes.
rac3
rac4
CSS is active on all nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps

The given interface "eth0" is not public. Public interfaces should be used to configure virtual IPs.
-- Note this last part of the output: vipca needs to be run manually, again as the root user on node rac4 (see the sketch below).
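If vipca in silent mode refuses eth0, a common workaround is to register eth0 as the public interface with oifcfg and then launch vipca manually as root with an X display available. This is only a sketch: the 192.168.1.0 public subnet and the DISPLAY value are assumptions, substitute your own.

[root@rac4 bin]# ./oifcfg setif -global eth0/192.168.1.0:public
[root@rac4 bin]# export DISPLAY=<workstation>:0.0
[root@rac4 bin]# ./vipca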
Once this is done, the CRS resources have been recreated:
[oracle@rac3 install]$ crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
ora.rac3.gsd application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.ons application 0/3 0/0 ONLINE ONLINE rac3
ora.rac3.vip application 0/0 0/0 ONLINE ONLINE rac3
ora.rac4.gsd application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.ons application 0/3 0/0 ONLINE ONLINE rac4

ora.rac4.vip application 0/0 0/0 ONLINE ONLINE rac4

6. Next, reconfigure the RAC listeners with netca (a sketch of launching it follows)
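A minimal sketch of launching netca from the database home as the oracle user; the DISPLAY value is an assumption. In the assistant, you would typically choose "Cluster configuration", select both nodes, and then reconfigure the listener.

[oracle@rac3 ~]$ export DISPLAY=<workstation>:0.0
[oracle@rac3 ~]$ $ORACLE_HOME/bin/netca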

Finally, confirm that the listeners have been registered in CRS:
[oracle@rac3 admin]$ crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
ora….C3.lsnr application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.gsd application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.ons application 0/3 0/0 ONLINE ONLINE rac3
ora.rac3.vip application 0/0 0/0 ONLINE ONLINE rac3
ora….C4.lsnr application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.gsd application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.ons application 0/3 0/0 ONLINE ONLINE rac4
ora.rac4.vip application 0/0 0/0 ONLINE ONLINE rac4

At this point the three core CRS node application resources (ONS, GSD, VIP) and the listener resources are all ONLINE; the ASM and database instance resources still need to be registered in the OCR.
Before adding them, restart the clusterware stack once more:
[oracle@rac3 ~]$ srvctl stop nodeapps -n rac3
[oracle@rac3 ~]$ srvctl stop nodeapps -n rac4
[root@rac3 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.

[root@rac3 bin]# ./crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM

[root@rac4 bin]# ./crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.

[root@rac4 bin]# ./crsctl check crs
Failure 1 contacting CSS daemon
Cannot communicate with CRS
Cannot communicate with EVM

[root@rac3 bin]# ./crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly

[root@rac4 bin]# ./crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly

[oracle@rac3 ~]$ crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
ora….C3.lsnr application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.gsd application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.ons application 0/3 0/0 ONLINE ONLINE rac3
ora.rac3.vip application 0/0 0/0 ONLINE ONLINE rac3
ora….C4.lsnr application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.gsd application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.ons application 0/3 0/0 ONLINE ONLINE rac4
ora.rac4.vip application 0/0 0/0 ONLINE ONLINE rac4

7. Now add the ASM resources
[oracle@rac3 ~]$ srvctl add asm -n rac3 -i +ASM1 -o $ORACLE_HOME
[oracle@rac3 ~]$ srvctl add asm -n rac4 -i +ASM2 -o $ORACLE_HOME
Registration done; note that $ORACLE_HOME here refers to the database home, not $ORA_CRS_HOME (a verification sketch follows).
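To double-check what was registered, srvctl can read back the ASM configuration from the OCR; a sketch, showing the instance name and ORACLE_HOME recorded for each node:

[oracle@rac3 ~]$ srvctl config asm -n rac3
[oracle@rac3 ~]$ srvctl config asm -n rac4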
[oracle@rac3 admin]$ crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
ora….SM1.asm application 0/5 0/0 OFFLINE OFFLINE
ora….C3.lsnr application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.gsd application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.ons application 0/3 0/0 ONLINE ONLINE rac3
ora.rac3.vip application 0/0 0/0 ONLINE ONLINE rac3
ora….SM2.asm application 0/5 0/0 OFFLINE OFFLINE
ora….C4.lsnr application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.gsd application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.ons application 0/3 0/0 ONLINE ONLINE rac4
ora.rac4.vip application 0/0 0/0 ONLINE ONLINE rac4

Restart the clusterware stack again; the steps are the same as above and are not repeated here (a brief recap follows).
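For reference, the restart amounts to running, as root on both nodes, and then waiting for crs_stat -t to show the resources ONLINE again:

[root@rac3 bin]# ./crsctl stop crs
[root@rac3 bin]# ./crsctl start crs
(repeat on rac4)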

Start the ASM instances, monitoring /oracle/db/admin/+ASM/bdump/alert*.log and /oracle/db/admin/ccupdb/bdump/alert_ccupdb1.log as you go (see the sketch below).
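For example, one way to watch the rac3 ASM alert log while starting the instances; the exact file name is an assumption based on the standard alert_<SID>.log naming:

[oracle@rac3 ~]$ tail -f /oracle/db/admin/+ASM/bdump/alert_+ASM1.log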
[oracle@rac3 rac3]$ srvctl start asm -n rac3
The ASM instance on node rac3 starts normally.
Starting the ASM instance on node rac4, however, fails:
[oracle@rac3 rac3]$ srvctl start asm -n rac4
PRKS-1009 : Unable to start ASM instance "+ASM2" on node "rac4", [PRKS-1009 : Unable to start ASM instance "+ASM2" on node "rac4", [CRS-0215: Could not start resource 'ora.rac4.ASM2.asm'.]]
[PRKS-1009 : Unable to start ASM instance "+ASM2" on node "rac4", [CRS-0215: Could not start resource 'ora.rac4.ASM2.asm'.]]

Check the ASM alert logs /oracle/db/admin/+ASM/bdump/alert*.log on nodes rac3 and rac4:
Tue Apr 16 02:04:50 2013
Error: KGXGN polling error (15)
Tue Apr 16 02:04:50 2013
Errors in file /oracle/db/admin/+ASM/bdump/+asm2_lmon_15526.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON: terminating instance due to error 29702
Tue Apr 16 02:04:52 2013
Shutting down instance (abort)
License high water mark = 0
Tue Apr 16 02:04:54 2013
Instance terminated by LMON, pid = 15526
Tue Apr 16 02:04:57 2013
Instance terminated by USER, pid = 20339
-- The reported error is ORA-29702: error occurred in Cluster Group Service operation

-- The most important information, however, is this warning:
WARNING: No cluster interconnect has been specified. Depending on
the communication driver configured Oracle cluster traffic
may be directed to the public interface of this machine.
Oracle recommends that RAC clustered databases be configured
with a private interconnect for enhanced security and
performance.
-- This happens because RAC cannot determine which NIC to use as the private interconnect, so it has to be specified explicitly in the pfile the ASM instances start from (see the aside below).
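As an aside (not the method used in this article), you can check which networks the clusterware itself has registered with oifcfg; after a rebuild like this the list may well be empty, which is consistent with the warning. The eth1 device name and the 10.1.1.0 subnet below are assumptions based on the private IPs used later:

[oracle@rac3 ~]$ $ORA_CRS_HOME/bin/oifcfg getif
[oracle@rac3 ~]$ $ORA_CRS_HOME/bin/oifcfg setif -global eth1/10.1.1.0:cluster_interconnect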

-- Modify the ASM pfile on both nodes (init+ASM1.ora -> /oracle/db/admin/+ASM/pfile/init.ora) and add the following lines:
+ASM1.cluster_interconnects='10.1.1.3'
+ASM2.cluster_interconnects='10.1.1.4'
-- In other words, fill in the private (heartbeat) interconnect IPs, then restart the ASM instances with srvctl (a sketch follows).
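A sketch of the restart with srvctl, assuming +ASM1 on rac3 is still running from the earlier attempt and +ASM2 never came up:

[oracle@rac3 ~]$ srvctl stop asm -n rac3
[oracle@rac3 ~]$ srvctl start asm -n rac3
[oracle@rac3 ~]$ srvctl start asm -n rac4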

[oracle@rac3 admin]$ crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
ora….SM1.asm application 0/5 0/0 ONLINE ONLINE rac3
ora….C3.lsnr application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.gsd application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.ons application 0/3 0/0 ONLINE ONLINE rac3
ora.rac3.vip application 0/0 0/0 ONLINE ONLINE rac3
ora….SM2.asm application 0/5 0/0 ONLINE ONLINE rac4
ora….C4.lsnr application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.gsd application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.ons application 0/3 0/0 ONLINE ONLINE rac4
ora.rac4.vip application 0/0 0/0 ONLINE ONLINE rac4

8. Add the database object
[oracle@rac3 admin]$ srvctl add database -d ccupdb -o $ORACLE_HOME

9. Add the two instance objects
[oracle@rac3 admin]$ srvctl add instance -d ccupdb -i ccupdb1 -n rac3
[oracle@rac3 admin]$ srvctl add instance -d ccupdb -i ccupdb2 -n rac4
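To double-check what has been registered, a sketch: srvctl config database lists the node, instance name and ORACLE_HOME recorded in the OCR for each instance.

[oracle@rac3 admin]$ srvctl config database -d ccupdb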
[oracle@rac3 admin]$ crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
ora….b1.inst application 0/5 0/0 OFFLINE OFFLINE
ora….b2.inst application 0/5 0/0 OFFLINE OFFLINE
ora.ccupdb.db application 0/0 0/1 OFFLINE OFFLINE
ora….SM1.asm application 0/5 0/0 ONLINE ONLINE rac3
ora….C3.lsnr application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.gsd application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.ons application 0/3 0/0 ONLINE ONLINE rac3
ora.rac3.vip application 0/0 0/0 ONLINE ONLINE rac3
ora….SM2.asm application 0/5 0/0 ONLINE ONLINE rac4
ora….C4.lsnr application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.gsd application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.ons application 0/3 0/0 ONLINE ONLINE rac4
ora.rac4.vip application 0/0 0/0 ONLINE ONLINE rac4

10. Start the database
[oracle@rac3 admin]$ srvctl start database -d ccupdb
Watch the database alert logs while it starts; the same interconnect error as above appears, so fix it with the following statements:

SQL> alter system set cluster_interconnects = '10.1.1.3' scope=spfile sid='ccupdb1';

System altered.

SQL> alter system set cluster_interconnects = '10.1.1.4' scope=spfile sid='ccupdb2';

System altered.

-- Note: this changes the spfile; it is advisable to keep the pfile backup in sync as well, adding:
ccupdb1.cluster_interconnects = '10.1.1.3'
ccupdb2.cluster_interconnects = '10.1.1.4'

Then restart instance 2:
[oracle@rac3 admin]$ srvctl start instance -d ccupdb -i ccupdb2
[oracle@rac3 admin]$ crs_stat -t -v
Name Type R/RA F/FT Target State Host
----------------------------------------------------------------------
ora….b1.inst application 1/5 0/0 ONLINE ONLINE rac3
ora….b2.inst application 0/5 0/0 ONLINE ONLINE rac4
ora.ccupdb.db application 0/0 0/1 ONLINE ONLINE rac4
ora….SM1.asm application 0/5 0/0 ONLINE ONLINE rac3
ora….C3.lsnr application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.gsd application 0/5 0/0 ONLINE ONLINE rac3
ora.rac3.ons application 0/3 0/0 ONLINE ONLINE rac3
ora.rac3.vip application 0/0 0/0 ONLINE ONLINE rac3
ora….SM2.asm application 0/5 0/0 ONLINE ONLINE rac4
ora….C4.lsnr application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.gsd application 0/5 0/0 ONLINE ONLINE rac4
ora.rac4.ons application 0/3 0/0 ONLINE ONLINE rac4
ora.rac4.vip application 0/0 0/0 ONLINE ONLINE rac4

At this point the OCR recovery is complete.
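As a final sanity check, a sketch: OCR integrity and the voting disk registration can be verified as root from $ORA_CRS_HOME/bin.

[root@rac3 bin]# ./ocrcheck
[root@rac3 bin]# ./crsctl query css votedisk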

http://www.oraclers.cn

From the ITPUB blog; link: http://blog.itpub.net/28698327/viewspace-1062254/. Please credit the source when reposting.
