oracle 9i 裸裝置資料檔案損壞的恢復過程

paulyibinyi發表於2009-02-23

 環境 oracle 9.2.0.4+AIX 5300-02

1.問題描述
    今天客戶資料庫前臺應用報追加記錄失敗,以前一直都是正常的,我就趕緊檢視資料庫日誌,發現在9:49分-10:02之間有很多以下報錯,
Errors in file /ora/app/oracle/admin/ora9i/bdump/ora9i_ckpt_1011798.trc:
ORA-01171: datafile 76 going offline due to error advancing checkpoint
ORA-01110: data file 76: '/dev/rdata2_4_01_rw'
ORA-01115: IO error reading block from file 76 (block # 1)
ORA-27063: skgfospo: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 16: Device busy
Additional information: -1
Additional information: 8192
Mon Feb 23 10:02:20 2009
Errors in file /ora/app/oracle/admin/ora9i/bdump/ora9i_ckpt_1011798.trc:
ORA-01171: datafile 77 going offline due to error advancing checkpoint
ORA-01110: data file 77: '/dev/rdata2_4_02_rw'
ORA-01115: IO error reading block from file 77 (block # 1)
ORA-27063: skgfospo: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 16: Device busy
透過查詢vg所在的lv裸裝置,有問題的資料檔案全是closed狀態
> lsvg -l ora04vg
ora04vg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
data4_4_01_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_02_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_03_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_04_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_05_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_06_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_07_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_08_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_09_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_10_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_11_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_12_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_13_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_14_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_15_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_16_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_17_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_18_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_19_rw       jfs        128   128   2    closed/syncd  N/A
data4_4_20_rw       jfs        128   128   2    closed/syncd  N/A
data1_4_01_rw       jfs        32    32    2    closed/syncd  N/A
data1_4_02_rw       jfs        32    32    2    closed/syncd  N/A
data1_4_03_rw       jfs        32    32    2    closed/syncd  N/A
data2_4_01_rw       jfs        64    64    2    closed/syncd  N/A
data2_4_02_rw       jfs        64    64    2    closed/syncd  N/A
data2_4_03_rw       jfs        64    64    2    closed/syncd  N/A
data2_4_04_rw       jfs        64    64    2    closed/syncd  N/A
data2_4_05_rw       jfs        64    64    2    closed/syncd  N/A
data2_4_06_rw       jfs        64    64    2    closed/syncd  N/A
data2_4_07_rw       jfs        64    64    2    closed/syncd  N/A
data2_4_08_rw       jfs        64    64    2    closed/syncd  N/A
>


2.問題跟蹤
  我就和客戶到了機房,我自己也就用前臺那個應用程式做了下測試,插入一條記錄提交後,報追加記錄失敗的錯誤,
  但沒有任何ora-相關錯誤程式碼體現出來,用pl/sql developer工具
  連到資料庫做測試
   insert into stck (編號,戶號) values(100000000,'test');
   出錯資訊如下:
   ORA-00376: file 106 cannot be read at this time
   ORA-01110: data file 106: '/dev/rdata4_4_13_rw'
   看到這個錯誤提示後,就聯想到上午看的那些錯誤日誌有關了,重啟資料庫後根本沒解決,腦海中閃出的就是這個
   裸裝置出問題了
   馬上查這個資料檔案的狀態
   select status,name from v$datafile where name='/dev/rdata2_4_01_rw';  
   發現狀態是recover,想到上午很多裸裝置報錯,透過以下sql查詢,很多資料檔案都是recover狀態
   select status,name from v$datafile where status='RECOVER';
        RECOVER /dev/rdata4_4_08_rw
 RECOVER /dev/rdata4_4_09_rw
 RECOVER /dev/rdata4_4_10_rw
 RECOVER /dev/rdata4_4_11_rw
 RECOVER /dev/rdata4_4_12_rw
        ..........
   然後查詢資料檔案所屬表空間都是online狀態,就只能用抱著用介質恢復recover datafile file_name 應用
   歸檔來看看能不能解決。

3.問題解決
  3.1 先恢復今天的歸檔日誌,呼叫備份軟體的引數
      p550b# su - oracle9i
> rman target /

Recovery Manager: Release 9.2.0.4.0 - 64bit Production

Copyright (c) 1995, 2002, Oracle Corporation.  All rights reserved.

connected to target database: ORA9I (DBID=2362852444)
connected to recovery catalog database

RMAN> run{
2> allocate  channel t1 type 'SBT_TAPE'
3> parms 'ENV=(NB_ORA_SERV=backup_server,
4> NB_ORA_CLIENT=p550b,
5> NB_ORA_POLICY=p550b_arch)';
6> restore archivelog from logseq 26861;
7> release channel t1;
8> }

   allocated channel: t1
channel t1: sid=70 devtype=SBT_TAPE
channel t1: VERITAS NetBackup for Oracle - Release 5.0GA (2004111820)

Starting restore at 2009-02-23:16:04:58

archive log thread 1 sequence 26871 is already on disk as file /arch/1_26871.dbf
archive log thread 1 sequence 26872 is already on disk as file /arch/1_26872.dbf
archive log thread 1 sequence 26873 is already on disk as file /arch/1_26873.dbf
archive log thread 1 sequence 26874 is already on disk as file /arch/1_26874.dbf
archive log thread 1 sequence 26875 is already on disk as file /arch/1_26875.dbf
archive log thread 1 sequence 26876 is already on disk as file /arch/1_26876.dbf
channel t1: starting archive log restore to default destination
channel t1: restoring archive log
archive log thread=1 sequence=26861
channel t1: restored backup piece 1
piece handle=oracle_arch_2810_1_679584083 tag=TAG20090223T132122 params=NULL
channel t1: restore complete
channel t1: starting archive log restore to default destination
channel t1: restoring archive log
archive log thread=1 sequence=26862
channel t1: restoring archive log
archive log thread=1 sequence=26863
channel t1: restoring archive log
archive log thread=1 sequence=26864
channel t1: restoring archive log
archive log thread=1 sequence=26865
channel t1: restoring archive log
archive log thread=1 sequence=26866
channel t1: restoring archive log
archive log thread=1 sequence=26867
channel t1: restoring archive log
archive log thread=1 sequence=26868
channel t1: restored backup piece 1
piece handle=oracle_arch_2809_1_679584083 tag=TAG20090223T132122 params=NULL
channel t1: restore complete
channel t1: starting archive log restore to default destination
channel t1: restoring archive log
archive log thread=1 sequence=26869
channel t1: restoring archive log
archive log thread=1 sequence=26870
channel t1: restored backup piece 1
piece handle=oracle_arch_2811_1_679584143 tag=TAG20090223T132122 params=NULL
channel t1: restore complete
Finished restore at 2009-02-23:16:07:21

released channel: t1

RMAN>


 3.2 介質恢復
   p550b# su - oracle9i
> sqlplus "/as sysdba"

SQL*Plus: Release 9.2.0.4.0 - Production on Mon Feb 23 15:54:22 2009

Copyright (c) 1982, 2002, Oracle Corporation.  All rights reserved.


Connected to:
Oracle9i Enterprise Edition Release 9.2.0.4.0 - 64bit Production
With the Partitioning option
JServer Release 9.2.0.4.0 - Production

SQL> recover datafile 106;
ORA-00279: change 9463530758543 generated at 02/23/2009 08:37:55 needed for
thread 1
ORA-00289: suggestion : /arch/1_26862.dbf
ORA-00280: change 9463530758543 for thread 1 is in sequence #26862


Specify log: {=suggested | filename | AUTO | CANCEL}
auto
ORA-00308: cannot open archived log '/arch/1_26862.dbf'
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3


ORA-00308: cannot open archived log '/arch/1_26862.dbf'
ORA-27037: unable to obtain file status
IBM AIX RISC System/6000 Error: 2: No such file or directory
Additional information: 3


SQL> recover datafile 106;
ORA-00279: change 9463530758543 generated at 02/23/2009 08:37:55 needed for
thread 1
ORA-00289: suggestion : /arch/1_26862.dbf
ORA-00280: change 9463530758543 for thread 1 is in sequence #26862


Specify log: {=suggested | filename | AUTO | CANCEL}
auto
ORA-00279: change 9463818933167 generated at 02/23/2009 09:38:47 needed for
thread 1
ORA-00289: suggestion : /arch/1_26863.dbf
ORA-00280: change 9463818933167 for thread 1 is in sequence #26863
ORA-00278: log file '/arch/1_26862.dbf' no longer needed for this recovery


Log applied.
Media recovery complete.
SQL> alter database datafile '/dev/rdata4_4_13_rw' online;

Database altered.

SQL>
其他資料檔案也是用同樣方法,也可以用
recover datafile '/dev/rdata4_4_13_rw'

 3.3問題確認解決
   
    insert into stck (編號,戶號) values(100000000,'test');
    能正常插入
    這時查那那些有問題的lv狀態,屬於open狀態
    ora04vg:
    LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
    data4_4_07_rw       jfs        128   128   2    open/syncd    N/A
    data4_4_08_rw       jfs        128   128   2    open/syncd    N/A
    data4_4_09_rw       jfs        128   128   2    open/syncd    N/A
    data4_4_13_rw       jfs        128   128   2    open/syncd    N/A
    data4_4_20_rw       jfs        128   128   2    open/syncd    N/A
    data1_4_01_rw       jfs        32    32    2    open/syncd    N/A
    data1_4_02_rw       jfs        32    32    2    open/syncd    N/A
    data1_4_03_rw       jfs        32    32    2    open/syncd    N/A
    data2_4_01_rw       jfs        64    64    2    open/syncd    N/A
    data2_4_02_rw       jfs        64    64    2    open/syncd    N/A
    >
    客戶經過測試也已經正常了。

 


  

 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/7199859/viewspace-557505/,如需轉載,請註明出處,否則將追究法律責任。

相關文章