環境: Oracle 19c ADG(主庫:單例項;備庫:RAC)
1.主庫新建測試檔案
主庫在AWR的PDB中做測試,為了不影響其他測試,建立一個新的測試表空間tbs_test及對應資料檔案:SQL> conn awr@awr
Enter password:
Connected.
SQL> create tablespace tbs_test datafile '/flash/oradata/DEMO/awr/tbs_test01.dbf' size 30m;
Tablespace created.
2.主庫建立測試表
主庫在新建表空間上建立測試表awr.test:SQL> create table awr.test tablespace tbs_test as select * from dba_users;
Table created.
SQL> select count(*) from awr.test;
COUNT(*)
----------
37
3.查詢表對應資料檔案資訊
透過dbms_rowid檢視awr.test表對應行資料的檔案號(rel_fno)、塊號(blockno,)和行號(rowno):select rowid,
dbms_rowid.rowid_relative_fno(rowid) rel_fno,
dbms_rowid.rowid_block_number(rowid) blockno,
dbms_rowid.rowid_row_number(rowid) rowno
from awr.test
order by rowid;
ROWID REL_FNO BLOCKNO ROWNO
------------------ ---------- ---------- ----------
AAATR2AAdAAAACDAAA 29 131 0
AAATR2AAdAAAACDAAB 29 131 1
AAATR2AAdAAAACDAAC 29 131 2
AAATR2AAdAAAACDAAD 29 131 3
AAATR2AAdAAAACDAAE 29 131 4
AAATR2AAdAAAACDAAF 29 131 5
AAATR2AAdAAAACDAAG 29 131 6
AAATR2AAdAAAACDAAH 29 131 7
AAATR2AAdAAAACDAAI 29 131 8
AAATR2AAdAAAACDAAJ 29 131 9
AAATR2AAdAAAACDAAK 29 131 10
...
4.模擬資料檔案物理壞塊
使用dd模擬資料檔案的物理壞塊:dd if=/dev/zero of=/flash/oradata/DEMO/awr/tbs_test01.dbf bs=8192 conv=notrunc seek=131 count=1
5.查詢對應測試表
再次查詢被破壞資料檔案上的表awr.test,發現客戶端只是卡頓一下就正常出了結果,並沒有任何顯示的報錯:ALTER SYSTEM Flush buffer_cache;
select count(*) from awr.test;
6.進一步查詢日誌資訊
上面查詢表沒有報錯,但是從主庫的alert日誌中可以看到:2023-04-03T12:08:02.602504+08:00
AWR(6):create tablespace tbs_test datafile '/flash/oradata/DEMO/awr/tbs_test01.dbf' size 30m
AWR(6):Completed: create tablespace tbs_test datafile '/flash/oradata/DEMO/awr/tbs_test01.dbf' size 30m
2023-04-03T12:15:21.021834+08:00
AWR(6):ALTER SYSTEM: Flushing buffer cache inst=0 container=6 global
AWR(6):TABLE AUDSYS.AUD$UNIFIED: ADDED INTERVAL PARTITION SYS_P368 (106) VALUES LESS THAN (TIMESTAMP' 2023-05-01 00:00:00')
2023-04-03T12:15:24.751443+08:00
AWR(6):Hex dump of (file 29, block 131) in trace file /u01/app/oracle/diag/rdbms/demo/demo/trace/demo_ora_11735.trc
AWR(6):
AWR(6):Corrupt block relative dba: 0x07400083 (file 29, block 131)
AWR(6):Completely zero block found during multiblock buffer read
AWR(6):
AWR(6):Reading datafile '/flash/oradata/DEMO/awr/tbs_test01.dbf' for corrupt data at rdba: 0x07400083 (file 29, block 131)
AWR(6):Reread (file 29, block 131) found same corrupt data (no logical check)
AWR(6):Starting background process ABMR
2023-04-03T12:15:24.763408+08:00
Corrupt Block Found
TIME STAMP (GMT) = 04/03/2023 12:15:24
CONT = 6, TSN = 5, TSNAME = TBS_TEST
RFN = 29, BLK = 131, RDBA = 121634947
OBJN = 78966, OBJD = 78966, OBJECT = TEST, SUBOBJECT =
SEGMENT OWNER = AWR, SEGMENT TYPE = Table Segment
2023-04-03T12:15:24.766521+08:00
ABMR started with pid=132, OS id=11983
2023-04-03T12:15:24.767751+08:00
Automatic block media recovery service is active.
2023-04-03T12:15:24.767981+08:00
AWR(6):Automatic block media recovery requested for (file# 29, block# 131)
2023-04-03T12:15:27.096763+08:00
Automatic block media recovery successful for (file# 29, block# 131)
2023-04-03T12:15:27.097189+08:00
AWR(6):Automatic block media recovery successful for (file# 29, block# 131)
日誌中顯示自動啟用了ABMR(Automatic block media recovery)成功修復了物理壞塊。
7.確認當前引數設定
如果查詢 db_block_checking 、 db_lost_write_protect 這些引數,會發現我這裡並沒有去特殊設定:SQL> show parameter db_block
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_block_buffers integer 0
db_block_checking string FALSE
db_block_checksum string TYPICAL
db_block_size integer 8192
SQL> show parameter db_lost
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_lost_write_protect string NONE
SQL>
那麼那些引數的意義呢?其實MOS檔案:
- Best Practices for Corruption Detection, Prevention, and Automatic Repair - in a Data Guard Configuration (Doc ID 1302539.1)
檔案中有說明,物理壞塊預設ADG就能檢測,邏輯壞塊要配合這些引數設定。包括上一步的日誌資訊中,在發現資料損壞時,也標註了(no logical check)
非邏輯檢查的提示。
當然,如果您想要獲得更全面的保護,還是要按檔案說明,額外設定這些引數。