oracle 11g rac 共享儲存壞掉後資料庫恢復

selectshen發表於2015-10-23
共享儲存壞掉,存在共享儲存的OCR,votedisk,ASM,database資料全部丟失.本地主機儲存有資料庫的備份.兩個主機的軟體是沒有問題,所以此時只需要重新配置gi,然後恢復資料庫.

以下模擬:
host:ct66rac01,ct66rac02
os:centos 6.6
db:11.2.0.4
GRID_HOME=/u01/app/11.2.0/grid
ORACLE_HOME=/u01/app/oracle/product/11.2.0/db_1

當前資料庫正常狀態:
[grid@ct66rac01 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    ct66rac01
ora....ER.lsnr ora....er.type ONLINE    ONLINE    ct66rac01
ora....N1.lsnr ora....er.type ONLINE    ONLINE    ct66rac01
ora.asm        ora.asm.type   ONLINE    ONLINE    ct66rac01
ora....SM1.asm application    ONLINE    ONLINE    ct66rac01
ora....01.lsnr application    ONLINE    ONLINE    ct66rac01
ora....c01.gsd application    OFFLINE   OFFLINE
ora....c01.ons application    ONLINE    ONLINE    ct66rac01
ora....c01.vip ora....t1.type ONLINE    ONLINE    ct66rac01
ora....SM2.asm application    ONLINE    ONLINE    ct66rac02
ora....02.lsnr application    ONLINE    ONLINE    ct66rac02
ora....c02.gsd application    OFFLINE   OFFLINE
ora....c02.ons application    ONLINE    ONLINE    ct66rac02
ora....c02.vip ora....t1.type ONLINE    ONLINE    ct66rac02
ora.cvu        ora.cvu.type   ONLINE    ONLINE    ct66rac01
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    ct66rac01
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    ct66rac01
ora.ons        ora.ons.type   ONLINE    ONLINE    ct66rac01
ora.rac11g.db  ora....se.type ONLINE    ONLINE    ct66rac01
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    ct66rac01

模擬儲存壞掉:
[root@ct66rac01 bin]#  dd if=/dev/zero bs=1M count=100 of=/dev/sdb1
[root@ct66rac01 bin]#  dd if=/dev/zero bs=1M count=100 of=/dev/sdb2

開始修復:
1.清空cluster的配置資訊
--先停止各節點crs
[root@ct66rac01 bin]# cd /u01/app/11.2.0/grid/bin/
[root@ct66rac01 bin]# ./crsctl stop crs -f
[root@ct66rac02 ~]# cd /u01/app/11.2.0/grid/bin/
[root@ct66rac02 bin]# ./crsctl stop crs -f
--清空各節點cluster配置資訊,注意最後的節點加-lastnode
[root@ct66rac01 bin]# cd /u01/app/11.2.0/grid/crs/install/
[root@ct66rac01 install]# ./rootcrs.pl -deconfig -force
[root@ct66rac02 bin]# cd /u01/app/11.2.0/grid/crs/install/
[root@ct66rac02 install]#  ./rootcrs.pl -deconfig -force -lastnode

2.重新配置gi
/u01/app/11.2.0/grid/crs/install/crsconfig_params這裡面存放了之前gi的配置資訊.
共享儲存重建之後,如果路徑和之前的不一樣,可以修改/u01/app/11.2.0/grid/crs/install/crsconfig_params這個檔案改為當前的配置.
如果配置資訊變化太大,也可以直接透過圖形化執行/u01/app/11.2.0/grid/crs/config/config.sh去重新輸入環境資訊,以完成gi重建.

如果環境沒有變化,只需要執行以下:
--重建cluster配置資訊,
[root@ct66rac01 ~]# cd /u01/app/11.2.0/grid
[root@ct66rac01 grid]# ./root.sh

[root@ct66rac02 ~]# cd /u01/app/11.2.0/grid
[root@ct66rac02 grid]# ./root.sh

--此時完成之後,可以看到,連asm也重建完成
[grid@ct66rac01 ~]$ crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    ct66rac01
ora....N1.lsnr ora....er.type ONLINE    ONLINE    ct66rac01
ora.asm        ora.asm.type   ONLINE    ONLINE    ct66rac01
ora....SM1.asm application    ONLINE    ONLINE    ct66rac01
ora....c01.gsd application    OFFLINE   OFFLINE
ora....c01.ons application    ONLINE    ONLINE    ct66rac01
ora....c01.vip ora....t1.type ONLINE    ONLINE    ct66rac01
ora....SM2.asm application    ONLINE    ONLINE    ct66rac02
ora....c02.gsd application    OFFLINE   OFFLINE
ora....c02.ons application    ONLINE    ONLINE    ct66rac02
ora....c02.vip ora....t1.type ONLINE    ONLINE    ct66rac02
ora.cvu        ora.cvu.type   ONLINE    ONLINE    ct66rac01
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    ct66rac01
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    ct66rac01
ora.ons        ora.ons.type   ONLINE    ONLINE    ct66rac01
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    ct66rac01

--重新註冊本地listenter到ocr
[grid@ct66rac01 ~]$ srvctl add listener -l listener
[grid@ct66rac01 ~]$ srvctl start listener
[grid@ct66rac01 ~]$ srvctl status listener
Listener LISTENER is enabled
Listener LISTENER is running on node(s): ct66rac01,ct66rac02

--新增資料庫的ASM磁碟組
可以透過sqlplus或者asmca,然後通srvctl add diskgroup重新註冊diskgroup到ocr.
這裡模擬環境的資料庫用的和ocr相同的diskgroup,所以不需要新建.

3.gi重新配置完成,開始還原資料庫
如果你是dp備份,那需要先透過DBCA去重建資料庫,再impdp匯入到資料庫.
如果是rman備份,執行以下:

[root@ct66rac01 grid]# su - oracle
--顯示當前的備份在/home/oracle下
[oracle@ct66rac01 ~]$ ll full*
-rw-r----- 1 oracle asmadmin 1134043136 Oct 22 17:59 full_db_RAC11G_3_1_1.dbfbk
-rw-r----- 1 oracle asmadmin   18579456 Oct 22 17:59 full_db_RAC11G_4_1_1.dbfbk

[oracle@ct66rac01 ~]$ ORACLE_SID=rac11g1
[oracle@ct66rac01 ~]$ rman target /
--啟動到nomount,此處不需要引數檔案
RMAN> startup nomount;

startup failed: ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/rac11g/spfilerac11g.ora'
ORA-17503: ksfdopn:2 Failed to open file +DATA/rac11g/spfilerac11g.ora
ORA-15056: additional error message
ORA-17503: ksfdopn:2 Failed to open file +DATA/rac11g/spfilerac11g.ora
ORA-15173: entry 'rac11g' does not exist in directory '/'
ORA-06512: at line 4

starting Oracle instance without parameter file for retrieval of spfile
Oracle instance started

Total System Global Area    1068937216 bytes

Fixed Size                     2260088 bytes
Variable Size                281019272 bytes
Database Buffers             780140544 bytes
Redo Buffers                   5517312 bytes
--還原spfile
--此處報錯是因為目錄不存在,要先在asm的+DATA下建相應目錄
RMAN> restore spfile from '/home/oracle/full_db_RAC11G_4_1_1.dbfbk';

Starting restore at 23-OCT-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=174 device type=DISK

channel ORA_DISK_1: restoring spfile from AUTOBACKUP /home/oracle/full_db_RAC11G_4_1_1.dbfbk
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 10/23/2015 14:36:58
ORA-19870: error while restoring backup piece /home/oracle/full_db_RAC11G_4_1_1.dbfbk
ORA-32015: unable to restore SPFILE
ORA-17502: ksfdcre:4 Failed to create file +DATA/rac11g/spfilerac11g.ora
ORA-15056: additional error message
ORA-17502: ksfdcre:4 Failed to create file +DATA/rac11g/spfilerac11g.ora
ORA-15173: entry 'rac11g' does not exist in directory '/'

RMAN> exit
--在+DATA下建rac11g目錄
[root@ct66rac01 grid]# su - grid
[grid@ct66rac01 ~]$ asmcmd
ASMCMD> ls
DATA/
ASMCMD> cd DATA
ASMCMD> ls
ct66rac-scan/
ASMCMD> mkdir rac11g
ASMCMD> exit

[root@ct66rac01 grid]# su - oracle
[oracle@ct66rac01 ~]$ ORACLE_SID=rac11g1
[oracle@ct66rac01 ~]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Fri Oct 23 14:38:44 2015

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: DUMMY (not mounted)
--重新還原spfile
RMAN>  restore spfile from '/home/oracle/full_db_RAC11G_4_1_1.dbfbk';
--用還原後的spfile開啟到nomount
RMAN> startup force nomount;
--還原控制檔案
RMAN>  restore controlfile from '/home/oracle/full_db_RAC11G_4_1_1.dbfbk';
--啟動到mount
RMAN> alter database mount;
--把/home/oracle下備份檔案資訊記錄到控制檔案
RMAN> catalog start with '/home/oracle/full';

searching for all files that match the pattern /home/oracle/full

List of Files Unknown to the Database
=====================================
File Name: /home/oracle/full_db_RAC11G_4_1_1.dbfbk

Do you really want to catalog the above files (enter YES or NO)? yes
cataloging files...
cataloging done

List of Cataloged Files
=======================
File Name: /home/oracle/full_db_RAC11G_4_1_1.dbfbk

RMAN> crosscheck backup;
--還原資料庫
RMAN> restore database;

RMAN> recover database;
--開啟資料庫
RMAN> alter database open resetlogs;

database opened

RMAN> exit
--新增資料庫資訊到ocr
[oracle@ct66rac01 ~]$ srvctl status database -d rac11g
PRCD-1120 : The resource for database rac11g could not be found.
PRCR-1001 : Resource ora.rac11g.db does not exist
[oracle@ct66rac01 ~]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/db_1
[oracle@ct66rac01 ~]$ srvctl add database -d rac11g -o /u01/app/oracle/product/11.2.0/db_1 -c RAC
[oracle@ct66rac01 ~]$ srvctl add instance -d rac11g -i rac11g1 -n ct66rac01
[oracle@ct66rac01 ~]$ srvctl add instance -d rac11g -i rac11g2 -n ct66rac02
[oracle@ct66rac01 ~]$ srvctl start database -d rac11g

--修復完成,檢視資料庫狀態
[root@ct66rac01 bin]# ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    ct66rac01
ora....ER.lsnr ora....er.type ONLINE    ONLINE    ct66rac01
ora....N1.lsnr ora....er.type ONLINE    ONLINE    ct66rac01
ora.asm        ora.asm.type   ONLINE    ONLINE    ct66rac01
ora....SM1.asm application    ONLINE    ONLINE    ct66rac01
ora....01.lsnr application    ONLINE    ONLINE    ct66rac01
ora....c01.gsd application    OFFLINE   OFFLINE
ora....c01.ons application    ONLINE    ONLINE    ct66rac01
ora....c01.vip ora....t1.type ONLINE    ONLINE    ct66rac01
ora....SM2.asm application    ONLINE    ONLINE    ct66rac02
ora....02.lsnr application    ONLINE    ONLINE    ct66rac02
ora....c02.gsd application    OFFLINE   OFFLINE
ora....c02.ons application    ONLINE    ONLINE    ct66rac02
ora....c02.vip ora....t1.type ONLINE    ONLINE    ct66rac02
ora.cvu        ora.cvu.type   ONLINE    ONLINE    ct66rac01
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    ct66rac01
ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    ct66rac01
ora.ons        ora.ons.type   ONLINE    ONLINE    ct66rac01
ora.rac11g.db  ora....se.type ONLINE    ONLINE    ct66rac01
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    ct66rac01



來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/28539951/viewspace-1815958/,如需轉載,請註明出處,否則將追究法律責任。

相關文章