【恩墨學院】Oracle DG測試failover和後續恢復報告
一、概述
二、驗證過程:
三、結論
一、概述
本文是針對在DG災備環境進行failover操作以及後續恢復的報告。
我這裡的測試環境是:
資料庫版本:Oracle 11.2.0.4
Site A:主庫 db_unique_name=jyzhao
Site B:備庫(實時應用)db_unique_name=mynas
Site C:備庫(延遲1小時應用)db_unique_name=jyzhao_s
以下章節涉及到的簡稱註釋:
A庫 => Site A:主庫
B庫 => Site B:備庫(實時應用)
C庫 => Site C:備庫(延遲1小時應用)
驗證:
當A庫crash後,在B庫進行failover將B切換為新的主庫,確認failover之後,A庫和C庫應該如何處理才可以成為新的備庫繼續使用?是否需要重建?重建的話,是否需要重新備份來恢復,以前的備份是否可以用來建立備庫?
二、驗證過程:
2.1 A庫異常關閉
A庫:
SQL> shutdown abort
2.2 B庫進行failover切換為新主庫
failover 標準步驟如下:
#取消DG應用
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
#重啟下資料庫(建議)
shutdown immediate;
startup
#操作不可逆,確定實際情況需要failover
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE FINISH force;
SELECT OPEN_MODE, DATABASE_ROLE, SWITCHOVER_STATUS, FORCE_LOGGING, DATAGUARD_BROKER, GUARD_STATUS FROM V$DATABASE;
#嘗試常規切換為主庫
ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WITH SESSION SHUTDOWN;
如果這一步的常規切換失敗,提示需要介質恢復,那麼:
1)恢復備庫 recover standby database until cancel;
2)啟用備庫 alter database activate standby database;
#最後重新啟動資料庫
shutdown immediate;
startup
檢視此時B庫的資訊:
SQL> select name, database_role, open_mode from gv$database;
NAME DATABASE_ROLE OPEN_MODE
--------- ---------------- --------------------
JYZHAO PRIMARY READ WRITE
SQL> select * from v$log;
GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC STATUS FIRST_CHANGE# FIRST_TIME NEXT_CHANGE# NEXT_TIME
---------- ---------- ---------- ---------- ---------- ---------- --- ---------------- ------------- ------------ ------------ ------------
1 1 3 52428800 512 2 NO CURRENT 17882720 03-SEP-17 2.8147E+14
2 1 2 52428800 512 2 YES ACTIVE 17882488 03-SEP-17 17882720 03-SEP-17
3 2 0 52428800 512 2 YES UNUSED 0 0
4 2 0 52428800 512 2 YES UNUSED 0 0
可以看到,目前B庫已成為新的主庫,redo日誌的sequence重新開始。
2.3 要求C庫成為新主庫的備庫
現在要求C庫成為新主庫的備庫。是否需要重建C庫呢?答案是不需要。下面具體來看下驗證過程。
C庫的alert日誌:
Sun Sep 03 14:09:58 2017
Archived Log entry 9 added for thread 1 sequence 1227 ID 0x97764e10 dest 1:
ARC2: Archive log thread 1 sequence 1227 available in 60 minute(s)
Sun Sep 03 14:09:58 2017
RFS[3]: Selected log 11 for thread 1 sequence 1228 dbid -1785877518 branch 919999037
Sun Sep 03 14:19:24 2017
RFS[3]: Possible network disconnect with primary database
Sun Sep 03 14:19:24 2017
RFS[2]: Possible network disconnect with primary database
Sun Sep 03 14:19:24 2017
RFS[4]: Assigned to RFS process 10190
RFS[4]: Possible network disconnect with primary database
可以看到,在A庫crash之後,C庫收到網路無法連線到A庫的告警,說明C庫目前沒有新的操作。
接下來想要C庫成為B庫(新主庫)的備庫,就需要嘗試在B庫上配置DG引數,使得B庫的歸檔可以傳輸到C庫。
show parameter log_archive_config
show parameter log_archive_dest_3
alter system set log_archive_config = 'DG_CONFIG=(jyzhao,mynas,jyzhao_s)';
alter system set log_archive_dest_3 = 'SERVICE=jyzhao_s LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) db_unique_name=jyzhao_s';
同時在B庫的tnsnames.ora檔案中增加到C庫的連線:
#Standby Single Instance
JYZHAO_S =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.111)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = jyzhao_s)
)
)
在B庫設定完成後,觀察B庫的告警:
Sun Sep 03 14:37:41 2017
ALTER SYSTEM SET log_archive_dest_3='SERVICE=jyzhao_s LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) db_unique_name=jyzhao_s' SCOPE=BOTH;
ARC3: Standby redo logfile selected for thread 1 sequence 3 for destination LOG_ARCHIVE_DEST_3
Thread 1 cannot allocate new log, sequence 5
Checkpoint not complete
Current log# 2 seq# 4 mem# 0: +DATA/mynas/onlinelog/group_2.278.953484865
Current log# 2 seq# 4 mem# 1: +FRA/mynas/onlinelog/group_2.714.953484869
Thread 1 advanced to log sequence 5 (LGWR switch)
Current log# 1 seq# 5 mem# 0: +DATA/mynas/onlinelog/group_1.277.953484853
Current log# 1 seq# 5 mem# 1: +FRA/mynas/onlinelog/group_1.720.953484859
******************************************************************
LGWR: Setting 'active' archival for destination LOG_ARCHIVE_DEST_3
******************************************************************
LNS: Standby redo logfile selected for thread 1 sequence 5 for destination LOG_ARCHIVE_DEST_3
Archived Log entry 382 added for thread 1 sequence 4 ID 0x978ff56d dest 1:
Sun Sep 03 14:37:55 2017
ARC3: Standby redo logfile selected for thread 1 sequence 4 for destination LOG_ARCHIVE_DEST_3
Sun Sep 03 14:39:49 2017
Thread 1 advanced to log sequence 6 (LGWR switch)
Current log# 2 seq# 6 mem# 0: +DATA/mynas/onlinelog/group_2.278.953484865
Current log# 2 seq# 6 mem# 1: +FRA/mynas/onlinelog/group_2.714.953484869
Sun Sep 03 14:39:49 2017
LNS: Standby redo logfile selected for thread 1 sequence 6 for destination LOG_ARCHIVE_DEST_3
Sun Sep 03 14:39:49 2017
Archived Log entry 388 added for thread 1 sequence 5 ID 0x978ff56d dest 1:
然後返回C庫操作,將C庫開啟實時日誌應用:
SQL> alter database recover managed standby database cancel;
Database altered.
SQL> alter database recover managed standby database using current logfile disconnect from session;
Database altered.
此時再觀察C庫的告警日誌:
Sun Sep 03 14:38:10 2017
Clearing online log 11 of thread 1 sequence number 0
Sun Sep 03 14:38:19 2017
MRP0: Incarnation has changed! Retry recovery...
RFS[7]: Selected log 11 for thread 1 sequence 4 dbid -1785877518 branch 953735009
Errors in file /u01/app/oracle/diag/rdbms/jyzhao_s/jyzhao/trace/jyzhao_pr00_10161.trc:
ORA-19906: recovery target incarnation changed during recovery
Recovery interrupted!
Sun Sep 03 14:38:19 2017
Archived Log entry 11 added for thread 1 sequence 4 ID 0x978ff56d dest 1:
Sun Sep 03 14:38:20 2017
started logmerger process
Sun Sep 03 14:38:20 2017
Managed Standby Recovery not using Real Time Apply
Parallel Media Recovery started with 2 slaves
Media Recovery start incarnation depth : 1, target inc# : 3, irscn : 17882484
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Delayed for 60 minute(s) (thread 1 sequence 1226)
Sun Sep 03 14:38:31 2017
RFS[8]: Assigned to RFS process 10704
RFS[8]: Opened log for thread 1 sequence 1228 dbid -1785877518 branch 919999037
Archived Log entry 12 added for thread 1 sequence 1228 rlc 919999037 ID 0x97764e10 dest 3:
RFS[8]: Opened log for thread 1 sequence 1 dbid -1785877518 branch 953735009
Sun Sep 03 14:38:36 2017
RFS[7]: Opened log for thread 1 sequence 2 dbid -1785877518 branch 953735009
Archived Log entry 13 added for thread 1 sequence 1 rlc 953735009 ID 0x978ff56d dest 3:
Archived Log entry 14 added for thread 1 sequence 2 rlc 953735009 ID 0x978ff56d dest 3:
Sun Sep 03 14:40:13 2017
Archived Log entry 15 added for thread 1 sequence 5 ID 0x978ff56d dest 1:
Sun Sep 03 14:40:13 2017
RFS[6]: Selected log 11 for thread 1 sequence 6 dbid -1785877518 branch 953735009
Sun Sep 03 14:40:37 2017
alter database recover managed standby database cancel
Sun Sep 03 14:40:38 2017
MRP0: Background Media Recovery cancelled with status 16037
Errors in file /u01/app/oracle/diag/rdbms/jyzhao_s/jyzhao/trace/jyzhao_pr00_10688.trc:
ORA-16037: user requested cancel of managed recovery operation
Recovery interrupted!
Sun Sep 03 14:40:38 2017
MRP0: Background Media Recovery process shutdown (jyzhao)
Managed Standby Recovery Canceled (jyzhao)
Completed: alter database recover managed standby database cancel
Sun Sep 03 14:40:53 2017
alter database recover managed standby database using current logfile disconnect from session
Attempt to start background Managed Standby Recovery process (jyzhao)
Sun Sep 03 14:40:53 2017
MRP0 started with pid=20, OS id=10747
MRP0: Background Managed Standby Recovery process started (jyzhao)
started logmerger process
Sun Sep 03 14:40:58 2017
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 2 slaves
Media Recovery start incarnation depth : 1, target inc# : 3, irscn : 17882484
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Managed Standby Recovery started with USING CURRENT LOGFILE
Ignoring previously specified DELAY 60 minutes for thread 1 sequence 1226
Media Recovery Log /u01/oradata/JYZHAO_S/archivelog/2017_09_03/o1_mf_1_1226_dtq74r91_.arc
Managed Standby Recovery started with USING CURRENT LOGFILE
Ignoring previously specified DELAY 60 minutes for thread 1 sequence 1227
Media Recovery Log /u01/oradata/JYZHAO_S/archivelog/2017_09_03/o1_mf_1_1227_dtq75p6j_.arc
Media Recovery Log /u01/oradata/JYZHAO_S/archivelog/2017_09_03/o1_mf_1_1228_dtq8v7c7_.arc
Identified End-Of-Redo (failover) for thread 1 sequence 1228 at SCN 0x0.110dd74
Completed: alter database recover managed standby database using current logfile disconnect from session
Resetting standby activation ID 2541112848 (0x97764e10)
Media Recovery End-Of-Redo indicator encountered
Media Recovery Continuing
Media Recovery Log /u01/oradata/JYZHAO_S/archivelog/2017_09_03/o1_mf_1_1_dtq8vdpr_.arc
Media Recovery Log /u01/oradata/JYZHAO_S/archivelog/2017_09_03/o1_mf_1_2_dtq8vdpy_.arc
Media Recovery Log /u01/oradata/JYZHAO_S/archivelog/2017_09_03/o1_mf_1_3_dtq8tgvl_.arc
Sun Sep 03 14:41:11 2017
Media Recovery Log /u01/oradata/JYZHAO_S/archivelog/2017_09_03/o1_mf_1_4_dtq8tvy5_.arc
Media Recovery Log /u01/oradata/JYZHAO_S/archivelog/2017_09_03/o1_mf_1_5_dtq8yfr5_.arc
Media Recovery Waiting for thread 1 sequence 6 (in transit)
Recovery of Online Redo Log: Thread 1 Group 11 Seq 6 Reading mem 0
Mem# 0: /u01/oradata/standbylog/standby_group_11.log
實際看到,C庫已經可以正常應用日誌。說明C庫不需要重建即可透過簡單配置成為新主庫B庫的新備庫。
2.4 要求A庫成為新主庫的備庫
此時A庫啟動的話,是一個獨立執行的資料庫,如果想將A庫也設定為主庫的話,那麼,透過新主庫的最新備份肯定是可行的,但是如果資料量很大,之前A庫自己本身有歷史的備份,能否不再耗時備份新主庫,直接透過歷史的備份恢復呢?其實這個從上面的C庫不再需要重建直接成為新主庫的備庫,也可以推斷出,是可以的。只需要確認這個備份是在failover之前完成的。下面我們來具體實驗驗證下可行性。
在B庫建立新的備庫控制檔案,並傳輸到A庫相同路徑下:
backup current controlfile for standby format '/tmp/std_control02.ctl';
在A庫啟動到nomount,恢復新的備庫控制檔案
restore standby controlfile from '/tmp/std_control02.ctl';
在A庫檢視資料檔案頭的檢查點,確認是在failover之前:
SYS@jyzhao1 >select name from v$datafile;
NAME
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+DATA1/jyzhao/datafile/system.258.951608183
+DATA1/jyzhao/datafile/sysaux.257.951608183
+DATA1/jyzhao/datafile/undotbs1.259.951608185
+DATA1/jyzhao/datafile/users.265.951608205
+DATA1/jyzhao/datafile/undotbs2.261.951608185
+DATA1/jyzhao/datafile/dbs_d_jingyu.262.951608185
+DATA1/jyzhao/datafile/dbs_i_jingyu.263.951608185
+DATA1/jyzhao/datafile/test.264.951608185
+DATA1/jyzhao/datafile/test2.260.951608185
+DATA1/jyzhao/datafile/dbadata.276.952933931
10 rows selected.
SYS@jyzhao1 >select checkpoint_change# from v$database;
CHECKPOINT_CHANGE#
------------------
17892035
SYS@jyzhao1 >select checkpoint_change# from v$datafile;
CHECKPOINT_CHANGE#
------------------
17892035
17892035
17892035
17892035
17892035
17892035
17892035
17892035
17892035
17892035
10 rows selected.
SYS@jyzhao1 >select checkpoint_change# from v$datafile_header;
CHECKPOINT_CHANGE#
------------------
0
0
0
0
0
0
0
0
0
0
10 rows selected.
上面這個資料檔案頭的檢查點是0,說明資料檔案沒有正確獲取到,實際上是由於OMF的名字有變化,直接將資料檔案路徑catalog到備份集中,再switch即可。
catalog start with '+DATA1/jyzhao/datafile/';
switch database to copy;
再次查詢:
SYS@jyzhao1 >select current_scn||'' from v$database;
CURRENT_SCN||''
----------------------------------------
17892467
SYS@jyzhao1 >select checkpoint_change# from v$database;
CHECKPOINT_CHANGE#
------------------
17892035
SYS@jyzhao1 >select checkpoint_change# from v$datafile;
CHECKPOINT_CHANGE#
------------------
17892035
17892035
17892035
17892035
17892035
17892035
17892035
17892035
17892035
17892035
10 rows selected.
SYS@jyzhao1 >select checkpoint_change# from v$datafile_header;
CHECKPOINT_CHANGE#
------------------
17882484
17882484
17882484
17882484
17882484
17882484
17882484
17882484
17882484
17882484
10 rows selected.
此時在mount狀態下開啟日誌應用:
alter database recover managed standby database disconnect from session;
從告警日誌觀察,確認應用到最新時,取消日誌應用:
alter database recover managed standby database cancel;
開啟資料庫,開啟實時應用:
alter database recover managed standby database USING CURRENT LOGFILE disconnect from session;
最終查詢可以正常實時應用。
三、結論
一般來說,在A庫crash之後,B庫failover成為新的主庫,那麼原來設定為延遲1小時應用的C 庫是可以直接配置成為新主庫的備庫。A庫修復後,也可以透過failover之前的現有備份集來恢復到failover之前的狀態,而不需要在新主庫重新去備份。
恩墨學院隸屬於雲和恩墨(北京)資訊科技有限公司,致力於提供專業高水準的與大資料培訓服務,挖掘培養大資料與資料庫人才。恩墨學院提供包括個人實戰技能培訓、個人認證培訓、企業內訓在內的全方位大資料和資料庫技術培訓。ACE級別超強師資,配備專業實驗室,沉浸式學習與訓練,專業實驗室、配備專業助教指導訓練。能迅速融入專家圈子,業內資源豐富,迅速積累職場人脈。課程包括:班、Oracle 、Oracle OCP考試等。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/28530558/viewspace-2150385/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 【恩墨學院】深入解析:一主多備DG環境,failover的實現過程詳解AI
- Oracle 19C Data Guard基礎運維-07 failover後閃回恢復dg架構Oracle運維AI架構
- Oracle RMAN恢復測試Oracle
- 【恩墨學院】深入剖析 - Oracle SCN機制詳細解讀Oracle
- ORACLE DG從庫 Rman備份恢復Oracle
- Oracle 11g dg broker自動failoverOracleAI
- oracle dg 歸檔日誌恢復情況Oracle
- 【恩墨學院】5 分鐘帶你看懂 DockerDocker
- DG:11.2.0.4 RAC線上duplicate恢復DG
- 【恩墨學院】資料架構:中國電信的Oracle Sharding架構應用案例分析架構Oracle
- 【恩墨學院】深入解讀Oracle 18c對於DBA的影響及應對措施Oracle
- oracle dg報錯Oracle
- SQLSERVER恢復測試SQLServer
- 墨者學院-SQL手工注入漏洞測試(MySQL資料庫)MySql資料庫
- DG同步異常恢復文件
- 【恩墨學院】原來銀行都在用這些資料庫資料庫
- 測試計劃和測試報告測試報告
- DG歸檔日誌缺失恢復
- 直播預告丨先睹為快!Oracle 20c新特性解析 - 2020雲和恩墨大講堂Oracle
- 【恩墨學院】當Java虛擬機器遇上Linux Arena記憶體池Java虛擬機Linux記憶體
- Oracle 備份和恢復介紹Oracle
- 用增量備份來快速恢復dg
- SQL Server Availability Group Failover 測試SQLServerAI
- 【恩墨學院】京東618大促閘道器承載十億呼叫量背後的架構實踐架構
- NBU恢復oracleOracle
- Mysql資料庫delete刪除後資料恢復報告MySql資料庫delete資料恢復
- oracle 11g dg broker 開啟fast-start failover自動故障切換OracleASTAI
- Oracle 面試寶典-DG篇Oracle面試
- RMAN備份與恢復測試
- Oracle資料庫冷備和恢復Oracle資料庫
- 《雲和恩墨技術通訊-2020.03》.pdf
- 【ASK_ORACLE】Oracle表決磁碟丟失後的恢復方法Oracle
- 【恩墨學院】深度學習在美團點評推薦平臺排序中的運用深度學習排序
- 【恩墨學院】從商用到開源:DB2遷移至MySQL的最佳實踐DB2MySql
- 【恩墨學院】一次由查詢轉換引起的效能問題的分析
- 資料庫週刊17│OceanBase上雲;Oracle 的歷史;恩墨學院PG初、中級認證培訓開啟...資料庫Oracle
- Oracle案例12——NBU Oracle恢復Oracle
- 疫情後的新世界:消費恢復趨勢觀察報告
- oracle冷備恢復Oracle