Oracle block corruption: ORA-01578

Posted by 安佰勝 on 2010-08-18

 


 

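1. Overview

Block corruption occurs frequently in database systems. Without a suitable way of handling it, the affected object often becomes unusable or data is lost. This article starts with how corrupt blocks arise and then focuses on how to handle the different situations once corruption has occurred.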

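2. Causes of block corruption:

Hardware I/O errors

Operating system I/O errors or caching problems

Memory paging problems

Disk repair utilities

Part of a data file being overwritten

Oracle failing when it tries to access an unformatted system block

Partial overflow of a data file

Bugs in Oracle or in the operating system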

3. Detecting block corruption:

3.1. Errors in the alert log

Tue Aug 17 10:48:07 2010

Corrupt Block Found

         TSN = 7, TSNAME = BTEST

         RFN = 6, BLK = 839, rdba = 25166663

         OBJN = 49205, OBJD = 49205, OBJECT = BOBJ, SUBOBJECT =

         Segment Owner = AN, Segment Type = Table Segment

 

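Here RFN is the relative file number (relative_fno): the corrupt block is block 839 of file 6, the segment type is a table segment, and the table affected is bobj.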
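The rdba value in the alert log packs the same file and block numbers into a single integer. The following is a minimal sketch of the decoding, assuming the usual smallfile layout in which the top 10 bits hold the relative file number and the low 22 bits hold the block number; the function names are made up for this illustration.

```python
# Split a relative data block address (rdba) into file and block numbers.
# Assumption: smallfile tablespace layout, 10 bits of relative file number
# followed by 22 bits of block number.
BLOCK_BITS = 22
BLOCK_MASK = (1 << BLOCK_BITS) - 1


def decode_rdba(rdba):
    """Return (relative_file_no, block_no) for a 32-bit rdba."""
    return rdba >> BLOCK_BITS, rdba & BLOCK_MASK


def encode_rdba(relative_file_no, block_no):
    """Inverse of decode_rdba."""
    return (relative_file_no << BLOCK_BITS) | block_no


rfn, blk = decode_rdba(25166663)   # value taken from the alert log above
print(rfn, blk)                    # 6 839
print(hex(encode_rdba(6, 839)))    # 0x1800347
```

The hexadecimal form, 0x01800347, is the "relative dba" that dbv prints for the same block.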

 

3.2. Errors from a query

SQL> select count(*) from bobj;

select count(*) from bobj

                     *

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 6, block # 839)

ORA-01110: data file 6: 'F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\BTEST.DBF'
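When these errors have to be collected from logs or application output, the file and block numbers can be pulled out of the ORA-01578 line with a regular expression. This is only a sketch: the pattern is written against the message text shown above, and the function name is hypothetical.

```python
import re

# Pattern matched against the ORA-01578 text exactly as it appears above.
ORA_01578 = re.compile(r"ORA-01578: .*\(file # (\d+), block # (\d+)\)")


def parse_ora_01578(message):
    """Return (file_no, block_no) from an ORA-01578 line, or None."""
    match = ORA_01578.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = "ORA-01578: ORACLE data block corrupted (file # 6, block # 839)"
print(parse_ora_01578(line))   # (6, 839)
```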

 

3.3. Errors from ANALYZE TABLE

SQL> analyze table bobj validate structure cascade;

analyze table bobj validate structure cascade

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 6, block # 839)

ORA-01110: data file 6: 'F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\BTEST.DBF'

3.4. Errors from an RMAN backup

RMAN> backup tablespace btest;

 

Starting backup at 17-8 -10

using channel ORA_DISK_1

channel ORA_DISK_1: starting full datafile backupset

channel ORA_DISK_1: specifying datafile(s) in backupset

input datafile fno=00006 name=F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\BTEST.DBF

channel ORA_DISK_1: starting piece 1 at 17-8 -10

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03009: failure of backup command on ORA_DISK_1 channel at 08/17/2010 11:03:09

ORA-19566: exceeded limit of 0 corrupt blocks for file F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\BTEST.DBF

 

3.5. Errors from a DBVERIFY (dbv) check

F:\oracle\product\10.1.0\oradata\db10>dbv file=BTEST.DBF blocksize=8192

 

DBVERIFY: Release 10.1.0.2.0 - Production on Tue Aug 17 10:49:02 2010

 

Copyright (c) 1982, 2004, Oracle.  All rights reserved.

 

DBVERIFY - Verification starting : FILE = BTEST.DBF

Page 839 is marked corrupt

Corrupt block relative dba: 0x01800347 (file 6, block 839)

Bad check value found during dbv:

Data in bad block:

 type: 6 format: 2 rdba: 0x01800347

 last change scn: 0x0000.0005246f seq: 0x1 flg: 0x04

 spare1: 0x0 spare2: 0x0 spare3: 0x0

 consistency value in tail: 0x246f0601

 check value in block header: 0x50c2

computed block checksum: 0x2751

 

 

DBVERIFY - Verification complete

 

Total Pages Examined         : 5376

Total Pages Processed (Data) : 5165

Total Pages Failing   (Data) : 0

Total Pages Processed (Index): 0

Total Pages Failing   (Index): 0

Total Pages Processed (Other): 9

Total Pages Processed (Seg)  : 0

Total Pages Failing   (Seg)  : 0

Total Pages Empty            : 201

Total Pages Marked Corrupt   : 1

Total Pages Influx           : 0
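Two pieces of arithmetic help when reading this dbv output. The sketch below assumes that the tail value of a block is built from the low 16 bits of the SCN base, the block type and the sequence number. It also checks that, in this particular run, the per-category page counters add up to the number of pages examined. The function name and dictionary keys are made up for the illustration.

```python
def expected_tail(scn_base, block_type, seq):
    """Tail check value: low 16 bits of SCN base, then type, then seq."""
    return ((scn_base & 0xFFFF) << 16) | (block_type << 8) | seq


# Values from the block dump above: scn 0x0000.0005246f, type 6, seq 0x1.
tail = expected_tail(0x0005246F, 6, 0x1)
print(hex(tail))   # 0x246f0601, the same as "consistency value in tail"

# Page counters copied from the dbv summary above.
pages = {
    "data": 5165,
    "index": 0,
    "other": 9,
    "seg": 0,
    "empty": 201,
    "marked_corrupt": 1,
}
print(sum(pages.values()))   # 5376, equal to "Total Pages Examined"
```

Because the tail value is consistent with the header, the problem dbv reports for this block is the checksum mismatch, which agrees with the CHECKSUM corruption type that v$database_block_corruption shows for block 839.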

 

3.6. Finding the data file and tablespace that contain the corrupt block

SELECT file_name, tablespace_name, file_id "AFN", relative_fno "RFN"
FROM dba_data_files;
SELECT file_name, tablespace_name, file_id, relative_fno "RFN"
FROM dba_temp_files;

  

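3.7. Identifying the object that contains the corrupt block:

-- v_file_id and v_block_id stand for the file number and block number of the corrupt block
SELECT tablespace_name, segment_type, owner,

segment_name, partition_name FROM

dba_extents WHERE file_id = v_file_id AND v_block_id

BETWEEN block_id AND block_id + blocks - 1;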
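The condition in this query is a range test: an extent that starts at block_id and spans blocks blocks covers block numbers block_id through block_id + blocks - 1. The sketch below applies the same test to a small in-memory list. The extent values are hypothetical and are not taken from the database in this article.

```python
from typing import NamedTuple, Optional


class Extent(NamedTuple):
    segment_name: str
    file_id: int
    block_id: int   # first block of the extent
    blocks: int     # number of blocks in the extent


def find_extent(extents, file_id, block_no) -> Optional[Extent]:
    """Return the extent that covers (file_id, block_no), if any."""
    for extent in extents:
        last_block = extent.block_id + extent.blocks - 1
        if extent.file_id == file_id and extent.block_id <= block_no <= last_block:
            return extent
    return None


# Hypothetical extent map, for illustration only.
extents = [
    Extent("BOBJ", 6, 9, 8),
    Extent("BOBJ", 6, 833, 8),    # covers blocks 833..840
    Extent("OTHER", 6, 841, 8),   # covers blocks 841..848
]

hit = find_extent(extents, 6, 839)
print(hit.segment_name, hit.block_id)   # BOBJ 833
```

The "- 1" matters at the boundary: without it, block 841 would match both the second and the third extent.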

 

4. Objects commonly affected by block corruption:

Objects owned by SYS

Rollback segments

Temporary segments

Indexes or partitioned indexes

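5. Handling the problem

5.1. Recovery without a backup

5.1.1. Objects owned by SYS: these must be handled with care.

5.1.2. Rollback segments: handle the corruption in the same way as ORA-600 [4000], that is, drop the rollback segment. This causes transactions to fail and data to be lost.

5.1.3. Indexes: rebuild the index. The table is locked during the rebuild, which affects the application, but no data is lost.

5.1.4. Tables:

5.1.4.1. Event 10231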

SQL> select count(*)from bobj;

 

  COUNT(*)

----------

    376240

 

SQL> analyze table bobj validate structure cascade;

analyze table bobj validate structure cascade

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 6, block # 495)

ORA-01110: data file 6: 'F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\BTEST.DBF'

 

 

SQL> alter session set events ='10231 trace name context forever,level 10';

 

Session altered.

 

SQL> select count(*) from bobj;

 

  COUNT(*)

----------

    376169

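With the event set, the corrupt block is skipped when the table is read, so the row count changes: the rows stored in that block are lost.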

 

5.1.4.2. ROWID range scan

FUNCTION ROWID_CREATE RETURNS ROWID

 Argument Name                  Type                    In/Out Default?

 ------------------------------ ----------------------- ------ --------

 ROWID_TYPE                     NUMBER                  IN

 OBJECT_NUMBER                  NUMBER                  IN

 RELATIVE_FNO                   NUMBER                  IN

 BLOCK_NUMBER                   NUMBER                  IN

 ROW_NUMBER                     NUMBER                  IN

 

SQL> select * from v$database_block_corruption;

 

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO

---------- ---------- ---------- ------------------ ---------

         6        839          1                  0 CHECKSUM

 

SQL> SELECT dbms_rowid.rowid_create(1,49205,6,839,0) from DUAL;

 

DBMS_ROWID.ROWID_C

------------------

AAAMA1AAGAAAANHAAA

 

SQL> SELECT dbms_rowid.rowid_create(1,49205,6,840,0) from DUAL;

 

DBMS_ROWID.ROWID_C

------------------

AAAMA1AAGAAAANIAAA

 

SQL> create table bbobj tablespace btest as select * from bobj where 1=2;

 

Table created.

 

SQL> insert into bbobj select /*+ rowid(a) */ * from bobj a where rowid < 'AAAMA1AAGAAAANHAAA';

 

60993 rows created.

 

SQL> insert into bbobj select /*+ rowid(a) */ * from bobj a where rowid >= 'AAAMA1AAGAAAANIAAA';

 

315108 rows created.

 

SQL> commit;

 

Commit complete.

 

SQL> select count(*) from bbobj;

 

  COUNT(*)

----------

    376101

 

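v$database_block_corruption shows the corrupt blocks currently recorded for the database.

Calling dbms_rowid.rowid_create gives the rowids that bound the corrupt block: the first row of block 839 and the first row of block 840. A new table with the same structure is then created, and the good rows are copied into it using those rowids as conditions so that the corrupt block is skipped. Part of the data is lost, but the effect of the corrupt block is removed.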
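The two rowids returned above can be reproduced by hand, which makes it clearer what rowid_create encodes. The sketch below assumes the extended rowid layout: 6 characters for the data object number, 3 for the relative file number, 6 for the block number and 3 for the row number, each written in base 64 with the alphabet A-Z, a-z, 0-9, +, /. The function names are made up for the illustration; this is not the DBMS_ROWID implementation.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64_field(value, width):
    """Encode value as exactly `width` base-64 digits, most significant first."""
    digits = []
    for _ in range(width):
        value, remainder = divmod(value, 64)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def extended_rowid(object_no, relative_fno, block_no, row_no):
    """Build the 18-character text form of an extended rowid."""
    return (
        b64_field(object_no, 6)
        + b64_field(relative_fno, 3)
        + b64_field(block_no, 6)
        + b64_field(row_no, 3)
    )


# Same arguments as the two rowid_create calls above.
print(extended_rowid(49205, 6, 839, 0))   # AAAMA1AAGAAAANHAAA
print(extended_rowid(49205, 6, 840, 0))   # AAAMA1AAGAAAANIAAA
```

Only the block field differs between the two values (AAAANH for 839, AAAANI for 840), which is why they can serve as the lower and upper bounds around the corrupt block.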

 

5.1.4.3. DBMS_REPAIR

Run as the SYS user:

PROCEDURE SKIP_CORRUPT_BLOCKS

 Argument Name                  Type                    In/Out Default?

 ------------------------------ ----------------------- ------ --------

 SCHEMA_NAME                    VARCHAR2                IN

 OBJECT_NAME                    VARCHAR2                IN

 OBJECT_TYPE                    BINARY_INTEGER          IN     DEFAULT

 FLAGS                          BINARY_INTEGER          IN     DEFAULT

 

SQL> Execute DBMS_REPAIR.SKIP_CORRUPT_BLOCKS('AN','BOBJ')

 

PL/SQL procedure successfully completed.

 

SQL> conn an/an

Connected.

SQL> select count(*) from bobj;

 

  COUNT(*)

----------

376169

 

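Calling skip_corrupt_blocks in dbms_repair as the SYS user makes queries skip the corrupt blocks in the object; the data in those blocks is likewise lost. The package also contains the procedure FIX_CORRUPT_BLOCKS, which marks blocks previously identified by CHECK_OBJECT as software corrupt; it does not restore the data in them.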
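The row counts from the three no-backup methods can be put side by side to see how much each one gave up. The figures below are copied from the session output in this section, and 376240 is taken as the baseline because it is the count shown before any workaround was applied; the variable names are made up for the illustration.

```python
# Row counts copied from the SQL*Plus output in section 5.1.4.
baseline = 376240

remaining = {
    "event 10231": 376169,
    "rowid range copy": 60993 + 315108,   # the two INSERT ... SELECT statements
    "dbms_repair skip": 376169,
}

lost = {method: baseline - rows for method, rows in remaining.items()}
for method, rows in lost.items():
    print(method, rows)
# event 10231 71
# rowid range copy 139
# dbms_repair skip 71
```

In this session the ROWID range copy kept fewer rows than the other two methods; the output above does not show the reason, so it is worth comparing counts like this before dropping the original table.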

 

5.2. Recovery with a backup

5.2.1. BLOCKRECOVER

RMAN> blockrecover datafile 5 block 425;

 

Starting blockrecover at 17-8 -10

using channel ORA_DISK_1

 

channel ORA_DISK_1: restoring block(s)

channel ORA_DISK_1: specifying block(s) to restore from backup set

restoring blocks of datafile 00005

channel ORA_DISK_1: restored block(s) from backup piece 1

piece handle=F:\ORACLE\PRODUCT\10.1.0\FLASH_RECOVERY_AREA\DB10\BACKUPSET\2010_08_17\O1_MF_NNNDF_TAG20100817T162508_66NKV5B8_.BKP tag=TAG20100817T162508

channel ORA_DISK_1: block restore complete

 

starting media recovery

media recovery complete

 

Finished blockrecover at 17-8 -10

 

SQL> select count(*) from an.atest;

 

  COUNT(*)

----------

    376240

 

5.2.2. RECOVER DATAFILE

SQL> select count(*) from an.atest;

select count(*) from an.atest

*

ERROR at line 1:

ORA-01578: ORACLE data block corrupted (file # 5, block # 963)

ORA-01110: data file 5: 'F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\ATEST.DBF'

 

RMAN> sql " alter database datafile 5 offline";

 

sql statement:  alter database datafile 5 offline

 

RMAN> restore datafile 5;

 

Starting restore at 17-8 -10

using channel ORA_DISK_1

 

channel ORA_DISK_1: starting datafile backupset restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

restoring datafile 00005 to F:\ORACLE\PRODUCT\10.1.0\ORADATA\DB10\ATEST.DBF

channel ORA_DISK_1: restored backup piece 1

piece handle=F:\ORACLE\PRODUCT\10.1.0\FLASH_RECOVERY_AREA\DB10\BACKUPSET\2010_08_17\O1_MF_NNNDF_TAG20100817T162508_66NKV5B8_.BKP tag=TAG20100817T162508

channel ORA_DISK_1: restore complete

Finished restore at 17-8 -10

 

RMAN> recover datafile 5;

 

Starting recover at 17-8 -10

using channel ORA_DISK_1

 

starting media recovery

media recovery complete

 

Finished recover at 17-8 -10

 

RMAN> sql " alter database datafile 5 online";

 

sql statement:  alter database datafile 5 online

 

SQL> select count(*) from an.atest;

 

  COUNT(*)

----------

    376240

 

 

5.3. Skipping corrupt blocks in an RMAN backup

SET MAXCORRUPT FOR DATAFILE filename TO n

 

Example:

run {

        allocate channel node1 type disk;

        SET MAXCORRUPT FOR DATAFILE 8 TO 3;

        set limit channel node1 kbytes = 1800000;

        backup as compressed backupset full database format '$BACKUPDIR/full_%d_%T_%s_%p' plus archivelog format '$BACKUPDIR/arch_%d_%T_%s_%p' delete all input;

        backup current controlfile format '$BACKUPDIR/ctl_%d_%T_%s_%p' TAG "control.bak";

        release channel node1;

}

 

Source: "ITPUB Blog", link: http://blog.itpub.net/13177610/viewspace-671253/. If you repost this article, please credit the source; otherwise legal action may be taken.
