[20190311]關於oracle物理與邏輯壞塊.txt

lfree發表於2019-03-11

[20190311]關於oracle物理與邏輯壞塊.txt


--//別人問的問題,如何區分物理壞塊與邏輯壞塊,或者講什麼是物理壞塊與邏輯壞塊,區別在那裡?

--//講句真心話,我自己也不是很清晰,要給一個概念下準確的定義實際上很困難.


--//我的理解能夠透過bbed,dump之類的方式讀出塊的結構,基本可以稱為邏輯壞塊,只不過裡面一些數值不對.

--//oracle的資料塊(其他型別的塊類似)一般分為幾層:


--//資料塊的結構,一個0x06的block包含4個layer

---------------------

1- Cache Layer       - kcbh 20byte 該層主要包含block format, type(index, table or cluster and so on)在block中使用kcbh

                       struct來描述

---------------------

2- Transaction Layer - ktbbh 48byte(包含ktbbh,ktbit就是一個ITL的容器,單個ITL結構為ktbbhitl大小為24byte)

--------------------- ---------------------

3- Data Layer  包含  - - Data Header      - kdbh

--------------------- ---------------------

                          - Table Directory  - 包含:offset

                          ---------------------

                          - Row Directory    -

                          ---------------------

                          - Free Space       -

                          ---------------------

                          - Row Data         -

                          ---------------------

---------------------

4- Tailchk          -

---------------------


--//反之:物理壞塊,就是根本讀不出裡面的結構,實際上這樣的透過bbed之類的工具無法修復.當然這樣的理解有點武斷或者不全面.

--//至少是我自己的理解,透過例子說明:


1.環境:

SCOTT@book> @ver1

PORT_STRING                    VERSION        BANNER

------------------------------ -------------- --------------------------------------------------------------------------------

x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production


SCOTT@book> create table empy as select * from emp;

Table created.


SCOTT@book> select rowid ,empy.* from empy where rownum=1;

ROWID                   EMPNO ENAME      JOB              MGR HIREDATE                   SAL       COMM     DEPTNO

------------------ ---------- ---------- --------- ---------- ------------------- ---------- ---------- ----------

AAAWQ3AAEAAAALLAAA       7369 SMITH      CLERK           7902 1980-12-17 00:00:00        800                    20


SCOTT@book> @ rowid AAAWQ3AAEAAAALLAAA

    OBJECT       FILE      BLOCK        ROW ROWID_DBA            DBA                  TEXT

---------- ---------- ---------- ---------- -------------------- -------------------- ----------------------------------------

     91191          4        715          0  0x10002CB           4,715                alter system dump datafile 4 block 715 ;


SCOTT@book> alter system flush buffer_cache;

System altered.


--//使用rman做一個備份:

RMAN> backup as copy datafile 4 format '/u01/backup/%b_20190311';

Starting backup at 2019-03-11 09:24:04

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=67 device type=DISK

allocated channel: ORA_DISK_2

channel ORA_DISK_2: SID=80 device type=DISK

allocated channel: ORA_DISK_3

channel ORA_DISK_3: SID=94 device type=DISK

channel ORA_DISK_1: starting datafile copy

input datafile file number=00004 name=/mnt/ramdisk/book/users01.dbf

output file name=/u01/backup/users01.dbf_20190311 tag=TAG20190311T092405 RECID=22 STAMP=1002619446

channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:01

Finished backup at 2019-03-11 09:24:07

Starting Control File and SPFILE Autobackup at 2019-03-11 09:24:07

piece handle=/u01/app/oracle/fast_recovery_area/BOOK/autobackup/2019_03_11/o1_mf_s_1002619447_g8cg5q3k_.bkp comment=NONE

Finished Control File and SPFILE Autobackup at 2019-03-11 09:24:08


RMAN> validate datafile 4;

Starting validate at 2019-03-11 09:24:33

using channel ORA_DISK_1

using channel ORA_DISK_2

using channel ORA_DISK_3

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

input datafile file number=00004 name=/mnt/ramdisk/book/users01.dbf

channel ORA_DISK_1: validation complete, elapsed time: 00:00:01

List of Datafiles

=================

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

4    OK     0              17           16411           13278827822

  File Name: /mnt/ramdisk/book/users01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              12355

  Index      0              3292

  Other      0              720


Finished validate at 2019-03-11 09:24:34


2.測試:

BBED> set dba 4,715

        DBA             0x010002cb (16777931 4,715)


BBED> p kdbh

struct kdbh, 14 bytes                       @124

   ub1 kdbhflag                             @124      0x00 (NONE)

   sb1 kdbhntab                             @125      1

   sb2 kdbhnrow                             @126      14

   sb2 kdbhfrre                             @128     -1

   sb2 kdbhfsbo                             @130      46

   sb2 kdbhfseo                             @132      7497

   sb2 kdbhavsp                             @134      7451

   sb2 kdbhtosp                             @136      7451


--//修改該塊kdbhavsp的數值:

BBED> assign kdbh.kdbhavsp=7450

Warning: contents of previous BIFILE will be lost. Proceed? (Y/N) y

sb2 kdbhavsp                                @134      7450


BBED> sum apply

Check value for File 4, Block 715:

current = 0x99a0, required = 0x99a0


BBED> verify

DBVERIFY - Verification starting

FILE = /mnt/ramdisk/book/users01.dbf

BLOCK = 715


Block Checking: DBA = 16777931, Block Type = KTB-managed data block

data header at 0x7fe359db627c

kdbchk: the amount of space used is not equal to block size

        used=613 fsc=0 avsp=7450 dtl=8064

Block 715 failed with check code 6110


--//我以前提到過,一些修復這步我是不做的,我個人人為能讀出就ok了.


RMAN> validate datafile 4;

Starting validate at 2019-03-11 09:27:18

using channel ORA_DISK_1

using channel ORA_DISK_2

using channel ORA_DISK_3

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

input datafile file number=00004 name=/mnt/ramdisk/book/users01.dbf

channel ORA_DISK_1: validation complete, elapsed time: 00:00:01

List of Datafiles

=================

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

4    OK     0              17           16411           13278827822

  File Name: /mnt/ramdisk/book/users01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              12355

  Index      0              3292

  Other      0              720

Finished validate at 2019-03-11 09:27:19


--//你可以發現做validate datafile 4;不會報錯,這樣的"壞"塊不會報,可以稱為邏輯壞塊.


RMAN> validate check logical datafile 4;


Starting validate at 2019-03-11 09:28:49

using channel ORA_DISK_1

using channel ORA_DISK_2

using channel ORA_DISK_3

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

input datafile file number=00004 name=/mnt/ramdisk/book/users01.dbf

channel ORA_DISK_1: validation complete, elapsed time: 00:00:01

List of Datafiles

=================

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

4    FAILED 0              17           16411           13278827822

  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  File Name: /mnt/ramdisk/book/users01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       1              12355

  Index      0              3292

  Other      0              720

validate found one or more corrupt blocks

See trace file /u01/app/oracle/diag/rdbms/book/book/trace/book_ora_47908.trc for details

Finished validate at 2019-03-11 09:28:50


--//可以發現1塊有問題.這樣的塊稱為邏輯壞塊.

--//檢視/u01/app/oracle/diag/rdbms/book/book/trace/book_ora_47908.trc可以發現如下資訊.

*** 2019-03-11 09:28:49.667

*** ACTION NAME:(0000087 STARTED19) 2019-03-11 09:28:49.667


Block Checking: DBA = 16777931, Block Type = KTB-managed data block

data header at 0x7f3f9671a07c

kdbchk: the amount of space used is not equal to block size

        used=613 fsc=0 avsp=7450 dtl=8064

Error backing up file 4, block 715: logical corruption


SCOTT@book> set numw 12

SCOTT@book> select * from GV$DATABASE_BLOCK_CORRUPTION;

     INST_ID        FILE#       BLOCK#       BLOCKS CORRUPTION_CHANGE# CORRUPTIO

------------ ------------ ------------ ------------ ------------------ ---------

           1            4          715            1        13278827768 CORRUPT


3.修改檢查和:

--//修改回來.

BBED> assign kdbh.kdbhavsp=7451

sb2 kdbhavsp                                @134      7451


BBED> verify

DBVERIFY - Verification starting

FILE = /mnt/ramdisk/book/users01.dbf

BLOCK = 715


Block 715 is corrupt

Corrupt block relative dba: 0x010002cb (file 0, block 715)

Bad check value found during verification

Data in bad block:

 type: 6 format: 2 rdba: 0x010002cb

 last change scn: 0x0003.177ad4f8 seq: 0x2 flg: 0x04

 spare1: 0x0 spare2: 0x0 spare3: 0x0

 consistency value in tail: 0xd4f80602

 check value in block header: 0x99a0

 computed block checksum: 0x1


--//這樣檢查和不一致.


RMAN> validate datafile 4;

Starting validate at 2019-03-11 09:36:54

using channel ORA_DISK_1

using channel ORA_DISK_2

using channel ORA_DISK_3

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

input datafile file number=00004 name=/mnt/ramdisk/book/users01.dbf

channel ORA_DISK_1: validation complete, elapsed time: 00:00:01

List of Datafiles

=================

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

4    FAILED 0              17           16411           13278829628

  File Name: /mnt/ramdisk/book/users01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       1              12355

  Index      0              3292

  Other      0              720


validate found one or more corrupt blocks

See trace file /u01/app/oracle/diag/rdbms/book/book/trace/book_ora_47908.trc for details

Finished validate at 2019-03-11 09:36:55


--//檢查和不一致,validate datafile 4;一定能發現問題.

RMAN> validate check logical datafile 4;


Starting validate at 2019-03-11 09:38:56

using channel ORA_DISK_1

using channel ORA_DISK_2

using channel ORA_DISK_3

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

input datafile file number=00004 name=/mnt/ramdisk/book/users01.dbf

channel ORA_DISK_1: validation complete, elapsed time: 00:00:01

List of Datafiles

=================

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

4    FAILED 0              17           16411           13278829701

  File Name: /mnt/ramdisk/book/users01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       1              12355

  Index      0              3292

  Other      0              720


validate found one or more corrupt blocks

See trace file /u01/app/oracle/diag/rdbms/book/book/trace/book_ora_47908.trc for details

Finished validate at 2019-03-11 09:38:57


BBED> sum apply

Check value for File 4, Block 715:

current = 0x99a1, required = 0x99a1



RMAN> validate datafile 4;


Starting validate at 2019-03-11 09:39:39

using channel ORA_DISK_1

using channel ORA_DISK_2

using channel ORA_DISK_3

channel ORA_DISK_1: starting validation of datafile

channel ORA_DISK_1: specifying datafile(s) for validation

input datafile file number=00004 name=/mnt/ramdisk/book/users01.dbf

channel ORA_DISK_1: validation complete, elapsed time: 00:00:01

List of Datafiles

=================

File Status Marked Corrupt Empty Blocks Blocks Examined High SCN

---- ------ -------------- ------------ --------------- ----------

4    OK     0              17           16411           13278829701

  File Name: /mnt/ramdisk/book/users01.dbf

  Block Type Blocks Failing Blocks Processed

  ---------- -------------- ----------------

  Data       0              12355

  Index      0              3292

  Other      0              720

Finished validate at 2019-03-11 09:39:40


--//看來我的理解有點片面...^_^.查詢一些資料,我認為作者的理解還是不錯的,轉載如下:

https://blog.pythian.com/what-is-the-difference-between-logical-and-physical-corruption-in-oracle/


Logical (and physical) corruption of data blocks. Block corruptions come in two types:


Physical corruptions (media corrupt blocks) are blocks that have sustained obvious physical damage. When Oracle detects

an inconsistency between the CSN in the block header and the CSN in the block footer, or the expected header and

footer structures are not present or are mangled, then the Oracle session raises an exception upon read of the block

(ORA-01578: ORACLE data block corrupted…). The call to Oracle fails, and the exception is written to the Oracle alert

log and trace files. Physical corruptions are generally the result of infrastructure problems, and can be introduced

in a variety of ways. Some possible sources of physical corruption are storage array cache corruption, array firmware

bugs, filesystem bugs and array controller battery failure combined with power outage. One can imagine at least a dozen

other possible sources of such corruption. Physically corrupt blocks can be repaired using Oracle Recovery Manager's

BLOCKRECOVER command. This operation restores and recovers the block in place in the file without interrupting any other

sessions operating against the database.


Logically corrupt blocks are blocks that have good header and footer CSNs, but that have some other kind of internal

inconsistency. For instance, one of the block header structures, which tracks the number of locks associated with rows

in the block, differs from the actual number of locks present. Another example would be if the header information on

available space differs from the true available space on the block. Upon encountering these types of faults, the calling

session generally will raise ORA-00600 ("internal error") with additional arguments that allow us to diagnose the

specific type of defect, and the call will fail. The exception will be written to the alert log and trace files. Like

physical corruption, there are a wide range of possible ways that the fault could have been introduced, including all

of the ways listed above for physical corruption. However, logically corrupt blocks are much more likely to have been

introduced as a result of a failure in the Oracle software, or as a result of an Oracle bug or cache corruption.


By default, Oracle has features that seek to perform sanity checks on blocks before they are written. However, for

highly risk-averse enterprises, additional checks, including checks for logical inconsistencies and block checksum

verification can be enabled. These features consume additional resources, so should be used judiciously.


--//金山翻譯如下:

物理壞塊(媒體破壞塊)是指遭受明顯物理傷害的塊。當Oracle檢測到塊頭中的csn與塊頁尾中的csn之間不一致時,或者預期的頁首和頁尾

結構不存在或出現故障時,Oracle會話將在讀取該塊時引發異常(ora-01578:oracledata塊損壞的…)。。對Oracle的呼叫失敗,異常將

寫入Oracle警報日誌和跟蹤檔案。物理壞塊通常是基礎設施問題的結果,可以透過多種方式引入。物理損壞的一些可能來源是儲存陣列緩

存損壞、陣列韌體錯誤、檔案系統錯誤和陣列控制器電池故障以及斷電。人們可以想象,至少還有十幾個其他可能的腐敗根源。可以使用

Oracle恢復管理器的BLOCKRECOVER命令修復損壞的塊。此操作恢復並恢復檔案中的塊,而不中斷對資料庫操作的任何其他會話。


邏輯上損壞的塊是具有良好的頁首和頁尾CSNS的塊,但是它們有一些其他型別的內部不一致性。例如,跟蹤與塊中行關聯的鎖數的塊頭結

構之一不同於當前的實際鎖數。另一個例子是,如果可用空間上的頭資訊與塊上的實際可用空間不同的話。當遇到這些型別的故障時,調

用會話通常會使用額外的引數引發Ora-00600("內部錯誤"),從而使我們能夠診斷特定型別的缺陷,並且呼叫將失敗。異常將寫入警報日

志和跟蹤檔案。與實際壞塊一樣,可以採用多種可能的方法來引入過失,包括上文列出的所有肉體腐敗的方式。但是,在邏輯上損壞的

塊更有可能是由於Oracle軟體的失敗或Oracle錯誤或快取損壞而引入的。


預設情況下,Oracle的功能是在編寫塊之前對其執行正常檢查。但是,對於高度規避風險的企業,可以啟用額外的檢查,包括檢查邏輯不

一致和塊校驗和驗證。這些特性消耗了額外的資源,因此應該謹慎使用。


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/267265/viewspace-2638025/,如需轉載,請註明出處,否則將追究法律責任。

相關文章