Oracle 19C Data Guard基礎運維-05Failovers (GAP)

chenoracle發表於2020-04-19

Oracle 19C Data Guard 基礎運維 -0 5Failovers (GAP)

原主庫

原備庫

 

Failovers

新主庫

獨立庫

192.168.31.90

192.168.31.100

192.168.31.100

192.168.31.90

cjcdb

chendb

chendb

cjcdb

Failover

https://docs.oracle.com/en/database/oracle/oracle-database/19/sbydb/data-guard-concepts-and-administration.pdf

Figure 9-4 Failover to a Standby Database 

Performing a Failover to a Physical Standby Database  

關於archive gap 的問題?

上一篇部落格《04 Failovers疑問?》寫了關於 archive gap的疑問,在實驗中,我提前將備庫關機,主庫端插入大量資料產生 3個歸檔檔案,並手動將最後 3個歸檔檔案重新命名,目的是不讓備庫獲取到這三個歸檔檔案,在啟動備庫,試圖模擬出備庫 archive gap場景,但是在備庫端 v$archive_gap中顯示空的,備庫沒有檢測出 archive gap的存在嗎?

實際上是本人對archive gap概念存在一些誤解,比如主庫有 1100個歸檔,我認為只要有任何歸檔檔案在備庫端獲取失敗都會出現 archive gap,都會記錄到 v$archive_gap,通過上一篇實驗發現這種理論顯然是不對的,我強制將主庫 98,99,100三個歸檔檔案重新命名,備庫端並沒有出現 archive gap,即在 v$archive_gap中不會有資料。

那麼究竟什麼場景才會出現archive gap?真實的場景是,備庫在接收主庫歸檔檔案時有部分沒有接收成功,但後續的歸檔檔案又接收成功了,比如主庫 1100個歸檔檔案,出於某種原因,備庫沒有接收到 97,98兩個歸檔,但是後面的 99,100歸檔又能正常接收,這時就會產生 archive gap,在 v$archive_gap會查到 97,98歸檔資訊。 (感謝墨天輪平臺“你好我是李白”的答疑解惑 )

實驗過程如下:

場景二:archive gap下的failover

主庫模擬故障,模擬歸檔gap

先停掉備庫: 不接收主庫產生的 redo 或歸檔資料

SQL> shutdown immediate

主庫:生成測試資料,生成redo 和歸檔資料

---session 1

SQL>

declare

begin

  for i in 1 .. 1000 000  loop

    insert into test1 values (i);

    commit;

  end loop;

end;

插入資料期間,生成了3 個歸檔檔案

[oracle@cjcos01 arch]$ pwd

/arch

......

cjcpdb_arch_1_74_1030641846.arc

cjcpdb_arch_1_75_1030641846.arc

cjcpdb_arch_1_76_1030641846.arc

主庫重新命名新產生的前兩個歸檔檔案,模擬歸檔gap

[oracle@cjcos01 arch]$ mv cjcpdb_arch_1_74_1030641846.arc cjcpdb_arch_1_74_1030641846.arc.bak

[oracle@cjcos01 arch]$ mv cjcpdb_arch_1_75_1030641846.arc cjcpdb_arch_1_75_1030641846.arc.bak

再次插入部分資料

SQL>

declare

begin

  for i in 1 .. 1000 0  loop

    insert into test1 values (i);

    commit;

  end loop;

end;

啟動備庫:

SQL> startup

-- 備庫啟動時,檢視對應主庫日誌,提示找不到 74,75 兩個歸檔檔案,無法將 74,75 傳送到備庫端。

2020-04-19T18:37:53.170879+08:00

Errors in file /u01/app/oracle/diag/rdbms/cjcdb/cjcdb/trace/cjcdb_tt00_2349.trc:

ORA-00308: cannot open archived log '/arch/cjcpdb_arch_1_75_1030641846.arc'

ORA-27037: unable to obtain file status

Linux-x86_64 Error: 2: No such file or directory

Additional information: 7

2020-04-19T18:37:53.171203+08:00

Errors in file /u01/app/oracle/diag/rdbms/cjcdb/cjcdb/trace/cjcdb_tt00_2349.trc:

ORA-00308: cannot open archived log '/arch/cjcpdb_arch_1_74_1030641846.arc'

ORA-27037: unable to obtain file status

Linux-x86_64 Error: 2: No such file or directory

Additional information: 7

備庫:檢視archive log ,實際應該是 74 75 ,不清楚為什麼會顯示 73

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;

SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;

   THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#

---------- ------------- --------------

 1       73      75

備庫:沒有接收到74,75 兩個歸檔檔案

主庫重新命名system01.dbf 模擬資料庫故障

[oracle@cjcos01 arch]$ cd /u01/app/oracle/oradata/CJCDB/

[oracle@cjcos01 CJCDB]$ mv system01.dbf system01.dbf.bak

SQL> alter system checkpoint;

SQL> shutdown abort

主庫啟動失敗

SQL> startup

ORACLE instance started.

Total System Global Area 1375728192 bytes

Fixed Size       9134656 bytes

Variable Size    1107296256 bytes

Database Buffers   251658240 bytes

Redo Buffers       7639040 bytes

Database mounted.

ORA-01157: cannot identify/lock data file 1 - see DBWR trace file

ORA-01110: data file 1: '/u01/app/oracle/oradata/CJCDB/system01.dbf'

SQL> select open_mode from v$database;

OPEN_MODE

--------------------

MOUNTED

備庫:

1. 檢查 dg 恢復模式 ( 最大效能模式 )

SQL> select database_role,protection_level,protection_mode from v$database;

DATABASE_ROLE  PROTECTION_LEVEL     PROTECTION_MODE

---------------- -------------------- --------------------

PHYSICAL STANDBY MAXIMUM PERFORMANCE  MAXIMUM PERFORMANCE

2 檢查 archive_gap ,實際應該是74 75 ,不清楚為什麼會顯示 73

SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;

   THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#

---------- ------------- --------------

 1       73      75

主庫:

SQL> select name from v$archived_log where thread#=1 and sequence# between 73 and 75;

NAME

--------------------------------------------------------------------------------

/arch/cjcpdb_arch_1_73_1030641846.arc

/arch/cjcpdb_arch_1_74_1030641846.arc

/arch/cjcpdb_arch_1_75_1030641846.arc

73 歸檔檔案拷貝到備庫端

[oracle@cjcos01 arch]$ scp cjcpdb_arch_1_73_1030641846.arc cjcos02:/arch

主庫在mount 狀態下執行 flush redo 操作

SQL> ALTER SYSTEM FLUSH REDO TO chendb;

ALTER SYSTEM FLUSH REDO TO chendb

*

ERROR at line 1:

ORA-16416: No viable switchover targets available

備庫:手動註冊73 號歸檔,也顯示歸檔已經註冊了

SQL> alter database register logfile '/arch/cjcpdb_arch_1_73_1030641846.arc';

alter database register logfile '/arch/cjcpdb_arch_1_73_1030641846.arc'

*

ERROR at line 1:

ORA-16089: archive log has already been registered

但是archive gap 還是顯示有 73

SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;

   THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#

---------- ------------- --------------

 1       73      75

主庫:將74 號歸檔檔名改回來

[oracle@cjcos01 arch]$ mv cjcpdb_arch_1_74_1030641846.arc.bak cjcpdb_arch_1_74_1030641846.arc

再次執行flush redo

SQL> ALTER SYSTEM FLUSH REDO TO chendb;

ALTER SYSTEM FLUSH REDO TO chendb

*

ERROR at line 1:

ORA-16416: No viable switchover targets available

檢視主庫日誌,主庫已經將74 歸檔發生備庫端了,開始嘗試讀取 75 號歸檔檔案。

如果flush redo 命令沒生效,也可以將歸檔檔案拷到備庫端,手動執行註冊

SQL> ALTER DATABASE REGISTER PHYSICAL LOGFILE 'filespec1';

備庫:只有1 75 號歸檔找不到了

SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;

   THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#

---------- ------------- --------------

 1       75      75

備庫:取消應用程式

SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;

Database altered.

備庫:由於存在archive gap ,是不允許常規的 failover

SQL> ALTER DATABASE FAILOVER TO chendb;

ALTER DATABASE FAILOVER TO chendb

*

ERROR at line 1:

ORA-00283: recovery session canceled due to errors

ORA-16171: RECOVER...FINISH not allowed due to gap for thr 1, seq 75-75

備庫:加force 也不生效

SQL> ALTER DATABASE FAILOVER TO chendb force;

ALTER DATABASE FAILOVER TO chendb force

*

ERROR at line 1:

ORA-00283: recovery session canceled due to errors

ORA-16171: RECOVER...FINISH not allowed due to gap for thr 1, seq 75-75

強制failover: 在存在 archive gap 情況下,強制執行 failover ,會丟失資料,正式環境謹慎使用!!!

Perform a data loss failover.

If an error condition cannot be resolved, a failover can still be performed (with some data loss) by issuing the following SQL statement on the target standby database:

SQL> ALTER DATABASE ACTIVATE PHYSICAL STANDBY DATABASE;

開啟資料庫

SQL>  ALTER DATABASE OPEN;

檢視資料

SQL> select count(*) from test1;

  COUNT(*)

----------

    252780

test1 表丟失了 1000 000+1000- 252780 =748220 條資料。

歡迎關注我的微信公眾號"IT小Chen",共同學習,共同成長!!!

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29785807/viewspace-2687052/,如需轉載,請註明出處,否則將追究法律責任。

相關文章