案例:DG主庫未設定force logging導致備庫壞塊

AlfredZhao發表於2020-06-08

DG搭建時,官方文件手冊有明確提到要設定資料庫為force_logging,防止有nologging操作日誌記錄不全導致備庫應用時出現問題。
雖然是老生常談的安裝規範,但現實中總會遇到不遵守規範的場景,最近就在某客戶現場遇到一則這樣的案例,因為DG主庫設定force_logging晚於DG搭建,導致備庫出現壞塊,使用dbv檢查就表現為DBV-201錯誤。

下面我們來模擬下這個場景,同時演示下具體修復過程:

1.準備實驗環境

主庫確認沒有開啟force logging 模式,如果是,修改為不是,這是模擬故障場景的前提條件:
select force_logging from v$database;
ALTER DATABASE NO FORCE LOGGING;

搭建一套測試DG:主庫修改系列DG配置引數後,建立pfile給備庫修改使用,同時將密碼檔案、tnsnames.ora檔案傳輸到備庫端,啟動例項到nomount狀態:

create pfile='/tmp/pfile_for_standby.txt' from spfile;

scp /tmp/pfile_for_standby.txt ora11204@192.168.1.11:/u01/app/oracle/product/11.2.0/dbhome_1/dbs/
scp $ORACLE_HOME/dbs/orapwcrmdb1 ora11204@192.168.1.11:/u01/app/oracle/product/11.2.0/dbhome_1/dbs/orapwsingle
scp $ORACLE_HOME/network/admin/tnsnames.ora ora11204@192.168.1.11:/u01/app/oracle/product/11.2.0/dbhome_1/network/admin/tnsnames.ora

change pfile depend on standby env;
sqlplus / as sysdba
startup nomount pfile=$ORACLE_HOME/dbs/pfile_for_standby.txt

使用duplicate搭建DG備庫,注意備庫需要靜態監聽:

vi dup_dg.sh

rman target sys/oracle@crmdb1 auxiliary sys/oracle@single <<EOF
duplicate target database for standby from active database dorecover nofilenamecheck;
EOF

nohup sh dup_dg.sh > dup_dg.log &

tail -200f dup_dg.log

注意:目標端所需目錄要提前手工建立,因為duplicate過程發現沒有對應目錄會報錯。

2.構造故障場景

主庫使用者表空間xxx,建立一張表插入資料,nologging建立索引;切換日誌,備庫檢查壞塊情況。 在jingyu使用者下建立測試表,並使用nologging方式建立索引:
jingyu@CRMDB> create table TEST as select * from dba_objects;

Table created.

jingyu@CRMDB> create index idx_test on test(object_id) nologging;

Index created.

jingyu@CRMDB> select USERNAME, DEFAULT_TABLESPACE, TEMPORARY_TABLESPACE from dba_users where username = 'JINGYU';

USERNAME                       DEFAULT_TABLESPACE             TEMPORARY_TABLESPACE
------------------------------ ------------------------------ ------------------------------
JINGYU                         DBS_D_JINGYU                   TEMP

備庫檢視同步狀態OK:

SQL> 
set lines 1000
col value for a20
col name for a30
col unit for a30
col TIME_COMPUTED for a30
col DATUM_TIME for a30
select * from v$dataguard_stats;

NAME                           VALUE                UNIT                           TIME_COMPUTED                  DATUM_TIME
------------------------------ -------------------- ------------------------------ ------------------------------ ------------------------------
transport lag                  +00 00:00:00         day(2) to second(0) interval   06/08/2020 09:14:26            06/08/2020 09:14:26
apply lag                      +00 00:00:00         day(2) to second(0) interval   06/08/2020 09:14:26            06/08/2020 09:14:26
apply finish time                                   day(2) to second(3) interval   06/08/2020 09:14:26
estimated startup time         40                   second                         06/08/2020 09:14:26

但備庫使用dbv檢查資料檔案,發現已經存在壞塊,報錯都是DBV-00201:

[ora11204@OEL-ASM arch]$ dbv file=/u01/oradata/crmdb/datafile/dbs_d_jingyu.365.1041070633

DBVERIFY: Release 11.2.0.4.0 - Production on Mon Jun 8 08:52:12 2020

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/oradata/crmdb/datafile/dbs_d_jingyu.365.1041070633

DBV-00201: Block, DBA 25168260, marked corrupt for invalid redo application

DBV-00201: Block, DBA 25168261, marked corrupt for invalid redo application

...這裡省略大量DBV-00201的輸出...

DBV-00201: Block, DBA 25168422, marked corrupt for invalid redo application

DBV-00201: Block, DBA 25168423, marked corrupt for invalid redo application


DBVERIFY - Verification complete

Total Pages Examined         : 12800
Total Pages Processed (Data) : 1998
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 157
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 354
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 10291
Total Pages Marked Corrupt   : 155
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 2168379 (0.2168379)
[ora11204@OEL-ASM arch]$ 

此時再通過主庫設定force logging挽救,為時過晚,只能對之後的操作起作用,但對已造成的壞塊無法修復:

ALTER DATABASE FORCE LOGGING;

3.解決故障

在主庫確認已設定force logging後,重新搭建DG環境。 當然如果造成壞塊的資料檔案不是很多,相比較全庫而言,直接重新備份受損的資料檔案也許是更效率的方案: 比如我這裡測試環境,就只有1個資料檔案收到了影響,只需要修復它就好:

3.1 確認下主庫的這個檔案是好的(無壞塊):

[oracle@jystdrac1 trace]$ dbv userid=sys/oracle file=+DATA/crmdb/datafile/dbs_d_jingyu.365.1041070633

DBVERIFY: Release 11.2.0.4.0 - Production on Mon Jun 8 09:26:21 2020

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = +DATA/crmdb/datafile/dbs_d_jingyu.365.1041070633


DBVERIFY - Verification complete

Total Pages Examined         : 12800
Total Pages Processed (Data) : 1998
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 312
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 199
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 10291
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 0 (0.0)

3.2 備份這個檔案並傳輸到備庫:
backup as compressed backupset datafile 6 format '/public/rman/primary_datafile_6.bak';

RMAN> backup as compressed backupset datafile 6 format '/public/rman/primary_datafile_6.bak';

Starting backup at 2020-06-08 09:29:15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=71 instance=crmdb1 device type=DISK
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00006 name=+DATA/crmdb/datafile/dbs_d_jingyu.365.1041070633
channel ORA_DISK_1: starting piece 1 at 2020-06-08 09:29:17
channel ORA_DISK_1: finished piece 1 at 2020-06-08 09:29:25
piece handle=/public/rman/primary_datafile_6.bak tag=TAG20200608T092917 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:08
Finished backup at 2020-06-08 09:29:25

3.3 備庫關閉,啟動到mount,restore損壞的資料檔案,然後open開啟應用

RMAN> catalog start with '/public/rman/primary_';
RMAN> list backup of datafile 6;
RMAN> restore datafile 6;

再次使用dbv檢視壞塊情況,已經修復:

[ora11204@OEL-ASM rman]$ dbv file=/u01/oradata/crmdb/datafile/dbs_d_jingyu.365.1041070633

DBVERIFY: Release 11.2.0.4.0 - Production on Mon Jun 8 09:37:28 2020

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /u01/oradata/crmdb/datafile/dbs_d_jingyu.365.1041070633


DBVERIFY - Verification complete

Total Pages Examined         : 12800
Total Pages Processed (Data) : 1998
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 312
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 10311
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 179
Total Pages Marked Corrupt   : 0
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 2168379 (0.2168379)
[ora11204@OEL-ASM rman]$ 

備庫重新開啟日誌應用:

SQL> alter database open;
SQL> recover managed standby database using current logfile disconnect;
SQL> select * from v$dataguard_stats;

NAME                           VALUE                UNIT                           TIME_COMPUTED                  DATUM_TIME
------------------------------ -------------------- ------------------------------ ------------------------------ ------------------------------
transport lag                  +00 00:00:00         day(2) to second(0) interval   06/08/2020 09:40:57            06/08/2020 09:40:56
apply lag                      +00 00:00:00         day(2) to second(0) interval   06/08/2020 09:40:57            06/08/2020 09:40:56
apply finish time                                   day(2) to second(3) interval   06/08/2020 09:40:57
estimated startup time         32                   second                         06/08/2020 09:40:57

壞塊消除後,再確認DG重新同步正常即可。

相關文章