OracleDG備庫恢復–gap

笱局長發表於2018-12-13
資料庫服務的過程中,經常有客戶因備庫伺服器當機導致備庫無法同步資料的問題,向我們諮詢資料恢復服務,最近遇到有3例。

如果客戶主庫的歸檔日誌沒有刪除,那麼備庫啟動後會自動同步最近的日誌檔案,保持與主庫的同步;很明顯,上面客戶歸檔日誌隨著備份完成而清理,導致備庫無法自動恢復。

下面模擬該場景的故障恢復。
(一)測試環境主備

# 主、備庫版本
SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE    11.2.0.4.0      Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production

#主庫測試環境,產生較多歸檔日誌
sys@ORCL> create table dgr tablespace tbs_baiyang as select * from all_objects;
sys@ORCL> alter system switch logfile;

#備庫同步狀態查詢
SQL> select archived,max(sequence#),applied from v$archived_log group by archived,applied;
ARC MAX(SEQUENCE#) APPLIED
--- -------------- ---------
YES             81 IN-MEMORY
YES             80 YES

SQLL> select count(*) from dgr;

  COUNT(*)
----------
    84431

# 目前主備庫同步正常

(二)模擬故障環境

# 關閉備庫
SQL> alter database recover managed standby database cancel;
SQL> shutdown immediate

# 主庫執行大量更新操作,以產生大量歸檔日誌
sys@ORCL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/app/oracle/oradata/orcl/archivelog
Oldest online log sequence     83
Next log sequence to archive   85
Current log sequence           85

sys@ORCL> insert into dgr select * from dgr;

84431 rows created.

sys@ORCL> insert into dgr select * from dgr;

168862 rows created.

sys@ORCL> insert into dgr select * from dgr;

337724 rows created.

sys@ORCL> update dgr set object_id = 11;

675448 rows updated.

sys@ORCL> commit;

Commit complete.


sys@ORCL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/app/oracle/oradata/orcl/archivelog
Oldest online log sequence     93
Next log sequence to archive   95
Current log sequence           95

# 備份資料庫
RMAN> backup database;

# 刪除新生成歸檔日誌檔案
cd /u01/app/oracle/oradata/orcl/archivelog
$mkdir bak
$mv 1_9* bak/

# 開啟備庫
SQL>  startup mount

SQL> select current_scn,database_role,open_mode from v$database;

CURRENT_SCN DATABASE_ROLE    OPEN_MODE
----------- ---------------- --------------------
    3730534 PHYSICAL STANDBY MOUNTED

# 開啟日誌應用報錯    
SQL> alter database recover managed standby database disconnect from session using current logfile;
alter database recover managed standby database disconnect from session using current logfile
*
ERROR at line 1:
ORA-01153: an incompatible media recovery is active

#檢視下目前的日誌應用狀態,仍有新的歸檔日誌傳輸到備庫,但無法應用
SQL>  select archived,max(sequence#),applied from v$archived_log group by archived,applied;

ARC MAX(SEQUENCE#) APPLIED
--- -------------- ---------
YES             97 NO
YES             89 YES

# 檢視下是否有gap產生,少了5個日誌檔案,正式剛才刪除的
SQL> select * from v$archive_gap;

   THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
         1            90             94
 # 以上就是模擬故障發生的步驟

(三)恢復前的準備
以RMAN增量備份的方式恢復備庫的過程中,需要重建備庫的控制檔案,勢必需要了解當前備庫的資料庫檔案、日誌檔案歸屬;如果主備檔案目錄保持一致,那麼可以跳過這一步

# 日誌目錄
SQL> set linesize 300
SQL> col MEMBER for a60
SQL> select type,member from v$logfile;

TYPE    MEMBER
------- ------------------------------------------------------------
ONLINE  /u01/app/oracle/oradata/standby/redo03.log
ONLINE  /u01/app/oracle/oradata/standby/redo02.log
ONLINE  /u01/app/oracle/oradata/standby/redo01.log
STANDBY /u01/app/oracle/oradata/standby/standbylog/redo11.log
STANDBY /u01/app/oracle/oradata/standby/standbylog/redo12.log
STANDBY /u01/app/oracle/oradata/standby/standbylog/redo13.log
STANDBY /u01/app/oracle/oradata/standby/standbylog/redo14.log

# 資料檔案目錄
SQL> select name from v$datafile;

NAME
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/standby/datafile/system01.dbf
/u01/app/oracle/oradata/standby/datafile/sysaux01.dbf
/u01/app/oracle/oradata/standby/datafile/undotbs01.dbf
/u01/app/oracle/oradata/standby/datafile/users01.dbf
/u01/app/oracle/oradata/standby/datafile/tbs_baiyang.dbf
/u01/app/oracle/oradata/standby/datafile/lxx2.dbf

# 查詢當前備庫的SCN
SQL> select current_scn,database_role,open_mode from v$database;

CURRENT_SCN DATABASE_ROLE    OPEN_MODE
----------- ---------------- --------------------
    3730534 PHYSICAL STANDBY MOUNTED

# 主庫再進行一些更新操作
sys@ORCL> update dgr set object_id = 11;

675448 rows updated.

sys@ORCL> commit;

Commit complete.

(四)增量備份主庫,及備份主庫控制檔案

# 增量備份
RMAN> run{
allocate channel c1 device type disk;
backup incremental from scn 3730534 database format `/oradata/ora_scn_%U.bak`; 
release channel c1;
}

RMAN> BACKUP CURRENT CONTROLFILE FOR STANDBY FORMAT `/oradata/ctl_20181213.bak`;

# 將備份集傳到備庫
scp *.bak 172.16.8.123:/oradata/

(五)恢復備庫

# 首先使用增量備份恢復資料庫
RMAN> catalog start with `/oradata/`;
RMAN> recover database noredo;
……
channel ORA_DISK_1: restored backup piece 1
channel ORA_DISK_1: restore complete, elapsed time: 00:00:07

Finished recover at 13-DEC-18

# 關閉並重新備庫至nomount狀態,此時控制檔案需要恢復(為什麼)
SQL> startup nomount

# 使用主庫控制檔案備份恢復備庫控制檔案
RMAN> RESTORE STANDBY CONTROLFILE FROM `/oradata/ctl_20181213.bak`;
Starting restore at 13-DEC-18
using channel ORA_DISK_1

channel ORA_DISK_1: restoring control file
channel ORA_DISK_1: restore complete, elapsed time: 00:00:01
output file name=/u01/app/oracle/oradata/standby/control01.ctl
output file name=/u01/app/oracle/fast_recovery_area/standby/control02.ctl
Finished restore at 13-DEC-18

#啟動資料庫到mount狀態
SQL> alter database mount;

(六)控制檔案恢復後,資料和日誌檔案路徑要是和備庫現有不同,可以使用線上命令rename資料檔案,重建日誌檔案,主要是standby 日誌

……
 ALTER DATABASE RENAME file `/u01/app/oracle/oradata/orcl/datafile/tbs_baiyang.dbf` to  `/u01/app/oracle/oradata/standby/datafile/tbs_baiyang.dbf`;
……
SQL> alter database add standby logfile group 24 `/u01/app/oracle/oradata/standby/standby24.log` size 50m;
……
SQL> alter database drop standby logfile group 14;
……

(七)這時開啟資料庫、開啟日誌實時應用程式

SQL> alter database open;
SQL> alter database recover managed standby database disconnect from session using current logfile;

SQL>  select archived,max(sequence#),applied from v$archived_log group by archived,applied;

ARC MAX(SEQUENCE#) APPLIED
--- -------------- ---------
YES            102 IN-MEMORY
YES            101 YES


SQL> select count(*) from dgr;

  COUNT(*)
----------
    675448
    
  
SQL> select count(*) from dgr where object_id=11 ;

  COUNT(*)
----------
    675448
    

為避免發生類似情況,(1)歸檔檔案保留策略要儘可能的長,最好不要隨著備份完成而刪除(2)可以配置歸檔的備份策略,備份後可以刪除歸檔,有需要從備份歸檔中還原


相關文章