Oracle資料庫塊的物理損壞與邏輯損壞

studywell發表於2017-05-18
       
        筆者認為不能簡單的說物理硬碟的損壞導致塊的物理損壞,Oracle bug導致塊的邏輯損壞。這篇文章我們來詳細討論一下Oracle資料庫的物理損壞以及邏輯損壞的概念。下面是一篇METALINK的文章:

Physical and Logical Block Corruptions. All you wanted to know about it. [ID 840978.1]

 修改時間 21-MAY-2012     型別 BULLETIN     狀態 PUBLISHED 

In this Document

 
 
 
 
 

Applies to:

Oracle Server - Enterprise Edition - Version 9.2.0.1 to 11.2.0.3 [Release 9.2 to 11.2]
Information in this document applies to any platform.
***Checked for relevance on 21-May-2012***


Purpose

Oracle classifies the Data File Block corruptions as Physical and Logical.  This is also referred as intra block corruptions.  This document is intended to provide detailed information and errors example about it.

Scope

This document is intended for Database Administrators.

Details

Physical Block Corruptions

This kind of block corruptions are normally reported by Oracle with error ORA-1578 and the detailed corruption description is printed in the alert log. 

Corruption Examples are:

  • Bad header - the beginning of the block (cache header) is corrupt with invalid values
  • The block is Fractured/Incomplete - header and footer of the block do not match
  • The block checksum is invalid
  • The block is misplaced
  • Zeroed out blocks / ORA-8103

Detailed Corruption Description:

Fractured Block

A Fractured block means that the block is incomplete. Information from the block header does not match the block tail.

Corrupt block relative dba: 0x0380e573 (file 14, block 58739)
Fractured block found during buffer read
Data in bad block -
type: 6 format: 2 rdba: 0x0380e573
last change scn: 0x0288.8e5a2f78 seq: 0x1 flg: 0x04
consistency value in tail: 0x00780601
check value in block header: 0x8739, computed block checksum: 0x2f00
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0380e573 (file 14, block 58739) found same corrupted data


Bad Checksum

Block Checksums are used to identify if the block was changed by something external to Oracle and after the block was last written by Oracle.
Checksum is calculated by DBWR or direct loader before writing the block to disk and stored in the block header. Every time that the block is read and if db_block_checksum is different than false, Oracle calculates a checksum and compares it to the one stored in the block header. Reference 

Example of a corrupt block due to invalid checksum:

Corrupt block relative dba: 0x0380a58f (file 14, block 42383)
Bad check value found during buffer read
Data in bad block -
type: 6 format: 2 rdba: 0x0380a58f
last change scn: 0x0288.7784c5ee seq: 0x1 flg: 0x06
consistency value in tail: 0xc5ee0601
check value in block header: 0x68a7, computed block checksum: 0x2f00
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
Reread of rdba: 0x0380a58f (file 14, block 42383) found same corrupted data


A value different than zero (0x0) in "computed block checksum" means that the checksum
differs and the result of this comparison is printed. 

Block Misplaced

This is when Oracle detected that the content of the block being read belongs to a different block and the checksum is valid:

Corrupt block relative dba: 0x0d805a89 (file 54, block 23177)
Bad header found during buffer read
Data in bad block -
type: 6 format: 2 rdba: 0x0d805b08 ----> Block is different than expected 0x0d805a89
last change scn: 0x0692.86dc08e3 seq: 0x1 flg: 0x04
consistency value in tail: 0x08e30601
check value in block header: 0x2a6e, computed block checksum: 0x0
spare1: 0x0, spare2: 0x0, spare3: 0x0
***

 

Logical Block Corruptions


This is when block contains a valid checksum and the structure below the beginning of the block is corrupt (Block content is corrupt). It may cause different ORA-600 errors. 

The detailed corruption description for Logical Corruptions are not normally printed in the alert.log. DBVerify will report what is logically corrupted in the block.

Corruption Examples are:

  • row locked by non-existent transaction - ORA-600 [4512],etc
  •  the amount of space used is not equal to block size
  • avsp bad
  • etc.

When db_block_checking is enabled, it may produce the internal errors ORA-600 [kddummy_blkchk] or ORA-600 [kdBlkCheckError].


If db_block_checking is enabled and the block is already logically corrupt on disk, the next block update will mark the block as Soft Corrupt and future reads of this block will produce the error ORA-1578. In that case DBVerify reports this corruption with error "DBV-200: Block, dba <rdba>, already marked corrupted".

References

 - Master Note for Handling Oracle Database Corruption Issues
 - OERR: ORA-1578 "ORACLE data block corrupted (file # %s, block # %s)" Master Note
 - Handling Oracle Block Corruptions in Oracle7/8/8i/9i/10g/11g
 - ORA-1578 / ORA-26040 Corrupt blocks by NOLOGGING - Error explanation and solution
 - How to identify the corrupt Object reported by ORA-1578 / RMAN / DBVERIFY
 - Identify the corruption extension using RMAN/DBV/ANALYZE etc


一.物理塊損壞:
        各種各樣的塊損壞通常是透過ORA-1578報告出來的,詳細的資訊被記錄在alert日誌中。物理損壞的例子包括:
>壞頭
>塊破壞/不完整 (Fractured Block)
>塊的checksum不正確 (Bad Checksum)
>塊的位置錯誤       (Block Misplaced)
>歸零的塊/ORA-8103

二.邏輯塊損壞:
        當塊包含一個正確的checksum和結構,但是塊頭一下的部分(塊的內容)被損壞,可能引起不同的ORA-600錯誤。
        邏輯損壞詳細的損壞資訊通常不列印在alert日誌中,DBV將報告邏輯損壞的塊。
邏輯損壞的例子包括:
>行被不存在的事務鎖定-ORA-600[4512]等
>使用的空間大小不等於塊的大小
>avsp bad
>等等
        當啟用db_block_checking,可能會產生內部錯誤ORA-600 [kddummy_blkchk] 或者 ORA-600 [kdBlkCheckError]。
        

三.DB_BLOCK_CHECKSUM和DB_BLOCK_CHECKING
        DB_BLOCK_CHECKSUM用於防止物理IO的損壞,預設值是TYPICAL,在DBWR程式寫入磁碟的時候會記錄資料的CHECKSUM值,將其儲存在塊頭中,下次在讀取的時候會重新計算塊的CHECKSUM值,與塊頭進行比對以判斷該塊是否損壞。如果將其設定為FULL,還會驗證記憶體中的塊的CHECKSUM值,避免記憶體的問題導致塊的損壞。 即使將DB_BLOCK_CHECKSUM值設定為FALSE,對於SYSTEM表空間也會進行相關的驗證。 
        DB_BLOCK_CHECKING用於邏輯控制塊,
如果db_block_checking是啟用,磁碟的塊已經被邏輯損壞,下一次更新塊將以軟損壞標記塊,將來對這個塊的讀將產生ORA-1578錯誤,在這種情況下報告損壞錯誤"DBV-200:Block,dba <rdba>,already marked corrupted"。  即使將DB_BLOCK_CHECKING值設定為FALSE,對於SYSTEM表空間也會進行相關的驗證。 

四.驗證工具介紹
        對於壞塊的驗證主要有兩個工具DBV和RMAN。

1.DBV工具

         DBV是DBVERIFY的縮寫,它是執行物理資料結構完整性檢查的外部命令列工具,它能用在離線或聯機的資料庫,也可以用在備份檔案上。因為DBVERYIFY能對離線資料庫執行,完整性檢查速度大幅度提高。DBV檢查被限制在快取管理的塊(資料塊),DBV只用於資料檔案,不能用於控制檔案或Redo日誌檔案的檢查。
dbv help=y可以看到DBV工具引數詳細的幫助資訊。  


2.RMAN工具
        下面是有關RMAN的VALIDATE命令驗證邏輯壞塊和物理壞塊的內容:
How to check for physical and logical database corruption using "backup validate check logical database" command for database on a non-archivelog mode [ID 466875.1]

 修改時間 30-JUN-2011     型別 FAQ     狀�?/em> PUBLISHED 

In this Document
  
  


Purpose

How can we use this command for a non-archivelog database so we can use this as opposed to the dbv command ?
RMAN> backup validate check logical database;

Questions and Answers

"backup validate check logical database" RMAN command could NOT be used for a non-archivelog database.

RMAN> backup validate check logical database;

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_2 channel at 11/06/2007 14:56:31
ORA-19602: cannot backup or copy active file in NOARCHIVELOG mode

The only way to do that for database on no-archive log mode is to perform. above command while the target database in MOUNT status not OPEN

Additional Information:
------------------------------------
- RMAN restore/backup validate with check logical will ensure that there is no logical and also physical corruption.However , DBV would report only physical corruption.

- If the backup validate discovers corrupt blocks, then RMAN updates the V$DATABASE_BLOCK_CORRUPTION  view with rows describing the corruptions.

- After corrupt block is repaired the row identifies this block is deleted from the view.


         從上面的描述可以看出,DBV只報告物理損壞,validate check logical database可以驗證物理損壞和邏輯損壞,執行VALIDATE命令後透過查詢V$DATABASE_BLOCK_CORRUPTION檢視可以獲得壞塊的詳細資訊,塊被修復後V$DATABASE_BLOCK_CORRUPTION中相應的記錄會被刪除。該命令只能執行在歸檔模式下的資料庫,如果要在非歸檔模式下執行該命令,必須使資料庫在MOUNT模式。
        執行下面的命令能驗證歸檔日誌的邏輯損壞和物理損壞:
RMAN > BACKUP VALIDATE CHECK LOGICAL DATABASE ARCHIVELOG ALL;
        執行下面的命令驗證資料庫物理損壞的塊:
RMAN > BACKUP VALIDATE DATABASE|ARCHIVELOG ALL;
       沒有CHECK LOGICAL關鍵字的BACKUP命令只會驗證物理損壞。 




來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/29519108/viewspace-2139271/,如需轉載,請註明出處,否則將追究法律責任。

相關文章