誤刪除資料檔案、控制檔案的非RMAN恢復方法

還不算暈發表於2014-03-07

參考文件:http://www.xifenfei.com/2289.html

http://www.ibm.com/developerworks/aix/library/au-lsof.html Aix上lsof的參考文件

http://www.askmaclean.com/archives/unix-linux%E4%B8%8B%E5%88%A0%E9%99%A4oracle%E6%8E%A7%E5%88%B6%E6%96%87%E4%BB%B6controlfile%E4%B8%BA%E4%BB%80%E4%B9%88%E5%AE%9E%E4%BE%8Binstance%E6%B2%A1%E6%9C%89%E7%AB%8B%E5%8D%B3%E5%A5%94%E6%BA%83.html
在Unix/Linux上刪除所有的Controlfile控制檔案後例項並不會在第一時間crash,相反居然還可以順利完成FULL checkpoint 。
這是為什麼呢?  ---注意這個問題僅適用於Unix/Linux, 在Windows平臺上不允許刪除 已經被其他程式開啟的檔案。
究其根本原因是在Linux/Unix上Read 或 Write一個檔案,程式都會開啟一個與此檔案相關聯的 開啟檔案描述符 Open File Descriptors (a file descriptor (FD) is an abstract indicator for accessing a file. The term is generally used in POSIX operating systems.In POSIX, a file descriptor is an integer, specifically of the C type int.)

而在Linux/Unix上當一個檔案被刪除時,它只是簡單地被”unlink”即刪除一個硬連結hard link; 實際包含檔案資料的inode 在所有與該檔案關聯的 開啟檔案描述符(Open File Descriptors)被關閉之前 都不會被實際刪除,已經獲得 開啟檔案描述符的程式 只要不用close函式關閉掉這些描述符, 都可以繼續正常地對檔案進行讀寫。 當所有該檔案管理的開啟檔案符都被關閉時 檔案才真正意義上被刪除。

注意:--SYSTEM不能線上RECOVER,需要重啟資料庫。
UNDO可以線上RECOVER,不過UNDO 檔案OFFLINE了不能做操作了--不能DML,查詢可以--不涉及回滾段中資料的可以。
其它資料檔案可以線上恢復。
使用的命令主要是: 
ps -ef |grep dbw |grep -v grep
ps -ef |grep ckpt |grep -v grep
ll  /proc/10986/fd
cp /proc/10986/fd/258  /u01/oradata/bys3/system01.dbf
alter database datafile 6 offline;
recover datafile 6;
alter database datafile 6 online;

1.刪除SYSTEM檔案的恢復實驗:

會話1:
SYS@ bys3>select * from v$dbfile;
     FILE# NAME
---------- ----------------------------------------
         1 /u01/oradata/bys3/system01.dbf
         2 /u01/oradata/bys3/sysaux01.dbf
         3 /u01/oradata/bys3/undotbs01.dbf
         4 /u01/oradata/bys3/user01.dbf
         5 /u01/oradata/bys3/test1.dbf
         6 /u01/oradata/bys3/test2.dbf
         7 /u01/oradata/bys3/test22.dbf
         8 /u01/oradata/bys3/test3.dbf
SYS@ bys3>select open_mode from v$database;
OPEN_MODE
--------------------
READ WRITE
#############
新開啟一個會話2:

檢視DBWR的程式號,並刪除/u01/oradata/bys3/system01.dbf
[oracle@bys3 bys3]$ ps -ef |grep dbw |grep -v grep  查出程式PID---10986注意這裡查的是DBWR程式,如是控制檔案,則查CKPT程式。
oracle   10986     1  0 Mar01 ?        00:04:10 ora_dbw0_bys3
[oracle@bys3 bys3]$ pwd
/u01/oradata/bys3
[oracle@bys3 bys3]$ ls
control01.ctl  redo02.log    system01.dbf  test22.dbf  undotbs01.dbf
control02.ctl  redo03.log    temp01.dbf    test2.dbf   user01.dbf
redo01.log     sysaux01.dbf  test1.dbf     test3.dbf
[oracle@bys3 bys3]$ rm -rf system01.dbf
#########################
回到會話1:
SYS@ bys3>conn bys/bys
ERROR:
ORA-00604: error occurred at recursive SQL level 2
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '/u01/oradata/bys3/system01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
ORA-00604: error occurred at recursive SQL level 2
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '/u01/oradata/bys3/system01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3

Warning: You are no longer connected to ORACLE.

@ >conn / as sysdba        
Connected.
SYS@ bys3>select open_mode from v$database;
OPEN_MODE
--------------------
READ WRITE
##########################################
會話2:
[oracle@bys3 ~]$  ll  /proc/10986/fd
total 0
lr-x------ 1 oracle oinstall 64 Mar  7 11:23 0 -> /dev/null
l-wx------ 1 oracle oinstall 64 Mar  7 11:23 1 -> /dev/null
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 10 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/lkBYS3
lr-x------ 1 oracle oinstall 64 Mar  7 11:23 11 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
l-wx------ 1 oracle oinstall 64 Mar  7 11:23 2 -> /dev/null
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 256 -> /u01/oradata/bys3/control01.ctl
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 257 -> /u01/oradata/bys3/control02.ctl
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 258 -> /u01/oradata/bys3/system01.dbf (deleted)   ---這裡會一直閃
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 259 -> /u01/oradata/bys3/sysaux01.dbf
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 260 -> /u01/oradata/bys3/undotbs01.dbf
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 261 -> /u01/oradata/bys3/user01.dbf
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 262 -> /u01/oradata/bys3/test1.dbf
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 263 -> /u01/oradata/bys3/test2.dbf
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 264 -> /u01/oradata/bys3/test22.dbf
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 265 -> /u01/oradata/bys3/test3.dbf
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 266 -> /u01/oradata/bys3/temp01.dbf
lr-x------ 1 oracle oinstall 64 Mar  7 11:23 3 -> /dev/null
lr-x------ 1 oracle oinstall 64 Mar  7 11:23 4 -> /dev/null
lr-x------ 1 oracle oinstall 64 Mar  7 11:23 5 -> /dev/null
lr-x------ 1 oracle oinstall 64 Mar  7 11:23 6 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
lr-x------ 1 oracle oinstall 64 Mar  7 11:23 7 -> /proc/10986/fd
lr-x------ 1 oracle oinstall 64 Mar  7 11:23 8 -> /dev/zero
lrwx------ 1 oracle oinstall 64 Mar  7 11:23 9 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_bys3.dat
[oracle@bys3 ~]$ cp /proc/10986/fd/258  /u01/oradata/bys3/system01.dbf
[oracle@bys3 ~]$ ll /u01/oradata/bys3/system01.dbf
-rw-r----- 1 oracle oinstall 524296192 Mar  7 11:25 /u01/oradata/bys3/system01.dbf
此時,ll /proc/10986/fd還是會顯示檔案狀態是:(deleted)如果是非SYSTEM檔案,線上RECOVER後這裡會變正常。
####################
SYS@ bys3>conn bys/bys
Connected.
BYS@ bys3>select TABLESPACE_NAME,STATUS,ONLINE_STATUS from dba_data_files;

TABLESPACE_NAME                STATUS    ONLINE_
------------------------------ --------- -------
SYSTEM                         AVAILABLE SYSTEM
SYSAUX                         AVAILABLE ONLINE
UNDOTBS1                       AVAILABLE ONLINE
USERS                          AVAILABLE ONLINE
TEST1                          AVAILABLE ONLINE
TEST1                          AVAILABLE ONLINE
TEST2                          AVAILABLE ONLINE
TEST3                          AVAILABLE ONLINE
ALERT日誌中的警告資訊:
Fri Mar 07 11:17:39 2014
Errors in file /u01/diag/rdbms/bys3/bys3/trace/bys3_m000_14498.trc:
ORA-00604: error occurred at recursive SQL level 2
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '/u01/oradata/bys3/system01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
Checker run found 1 new persistent data failures
Fri Mar 07 11:18:41 2014
Errors in file /u01/diag/rdbms/bys3/bys3/trace/bys3_m000_14516.trc:
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '/u01/oradata/bys3/system01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
Fri Mar 07 11:18:42 2014
Errors in file /u01/diag/rdbms/bys3/bys3/trace/bys3_m001_14520.trc:
ORA-00604: error occurred at recursive SQL level 2
ORA-01116: error in opening database file 1
ORA-01110: data file 1: '/u01/oradata/bys3/system01.dbf'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3
############
此時,應該關閉資料庫,開啟到MOUNT,恢復SYSTEM檔案,再開啟資料庫,就可以正常使用了。
SYS@ bys3>shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SYS@ bys3>startup
ORACLE instance started.

Total System Global Area  225996800 bytes
Fixed Size                  1363692 bytes
Variable Size             150995220 bytes
Database Buffers           67108864 bytes
Redo Buffers                6529024 bytes
Database mounted.
ORA-01113: file 1 needs media recovery
ORA-01110: data file 1: '/u01/oradata/bys3/system01.dbf'


SYS@ bys3>recover datafile 1;
Media recovery complete.
SYS@ bys3>alter database open;
Database altered.
#############

2.如果丟失的是非SYSTEM/UNDO,則可以線上恢復:方法如下:

BYS@ bys3>alter database datafile 6 offline;
Database altered.
BYS@ bys3>recover datafile 6;
Media recovery complete.
BYS@ bys3>alter database datafile 6 online;
Database altered.
################

3.UNDO的OFFLINE及恢復和上面一樣:

SYS@ bys3>alter database  datafile 3 offline;
alter database  datafile 3 offline
*
ERROR at line 1:
ORA-00603: ORACLE server session terminated by fatal error
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/oradata/bys3/undotbs01.dbf'
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/oradata/bys3/undotbs01.dbf'
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/oradata/bys3/undotbs01.dbf'
Process ID: 15106
Session ID: 1 Serial number: 5
BYS@ bys3>update dept set deptno=22;
update dept set deptno=22
       *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00376: file 3 cannot be read at this time
ORA-01110: data file 3: '/u01/oradata/bys3/undotbs01.dbf'
#########

4.控制檔案誤刪除的恢復的上面資料檔案的恢復方法基本一樣:

[oracle@bys3 ~]$ rm -rf /u01/oradata/bys3/control*
[oracle@bys3 ~]$ ps -ef|grep ckpt|grep -v grep   ----注意這裡查的是CKPT進程。
oracle   15076     1  0 11:53 ?        00:00:01 ora_ckpt_bys3
[oracle@bys3 ~]$ ll /proc/15076/fd
total 0
lr-x------ 1 oracle oinstall 64 Mar  7 12:12 0 -> /dev/null
l-wx------ 1 oracle oinstall 64 Mar  7 12:12 1 -> /dev/null
lrwx------ 1 oracle oinstall 64 Mar  7 12:12 10 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/lkBYS3
lr-x------ 1 oracle oinstall 64 Mar  7 12:12 11 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
l-wx------ 1 oracle oinstall 64 Mar  7 12:12 2 -> /dev/null
lrwx------ 1 oracle oinstall 64 Mar  7 12:12 256 -> /u01/oradata/bys3/control01.ctl (deleted)
lrwx------ 1 oracle oinstall 64 Mar  7 12:12 257 -> /u01/oradata/bys3/control02.ctl (deleted)

lr-x------ 1 oracle oinstall 64 Mar  7 12:12 3 -> /dev/null
lr-x------ 1 oracle oinstall 64 Mar  7 12:12 4 -> /dev/null
lr-x------ 1 oracle oinstall 64 Mar  7 12:12 5 -> /dev/null
lr-x------ 1 oracle oinstall 64 Mar  7 12:12 6 -> /u01/app/oracle/product/11.2.0/dbhome_1/rdbms/mesg/oraus.msb
lr-x------ 1 oracle oinstall 64 Mar  7 12:12 7 -> /proc/15076/fd
lr-x------ 1 oracle oinstall 64 Mar  7 12:12 8 -> /dev/zero
lrwx------ 1 oracle oinstall 64 Mar  7 12:12 9 -> /u01/app/oracle/product/11.2.0/dbhome_1/dbs/hc_bys3.dat
[oracle@bys3 ~]$ cp /proc/15076/fd/256 /u01/oradata/bys3/control01.ctl
[oracle@bys3 ~]$ cp /proc/15076/fd/257 /u01/oradata/bys3/control02.ctl
[oracle@bys3 ~]$ ls /u01/oradata/bys3/control*
/u01/oradata/bys3/control01.ctl  /u01/oradata/bys3/control02.ctl
日誌:
Fri Mar 07 12:12:04 2014
Errors in file /u01/diag/rdbms/bys3/bys3/trace/bys3_m000_15540.trc:
ORA-00210: cannot open the specified control file
ORA-00202: control file: '/u01/oradata/bys3/control01.ctl'
ORA-27041: unable to open file
Linux Error: 2: No such file or directory
Additional information: 3

關閉資料庫:

BYS@ bys3>conn / as sysdba
Connected.
SYS@ bys3>shutdown immediate;
Database closed.
ORA-03113: end-of-file on communication channel
Process ID: 15566

Session ID: 51 Serial number: 223

關庫時日誌:

ALTER DATABASE DISMOUNT
Shutting down archive processes
Archiving is disabled
********************* ATTENTION: ********************
 The controlfile header block returned by the OS
 has a sequence number that is too old.
 The controlfile might be corrupted.
 PLEASE DO NOT ATTEMPT TO START UP THE INSTANCE
 without following the steps below.
 RE-STARTING THE INSTANCE CAN CAUSE SERIOUS DAMAGE
 TO THE DATABASE, if the controlfile is truly corrupted.
 In order to re-start the instance safely,
 please do the following:
 (1) Save all copies of the controlfile for later
     analysis and contact your OS vendor and Oracle support.
 (2) Mount the instance and issue:
     ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
 (3) Unmount the instance.
 (4) Use the script in the trace file to
     RE-CREATE THE CONTROLFILE and open the database.

開啟資料庫:

[oracle@bys3 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri Mar 7 12:16:10 2014
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
Connected to an idle instance.
SYS@ bys3>startup
ORACLE instance started.
Total System Global Area  225996800 bytes
Fixed Size                  1363692 bytes
Variable Size             150995220 bytes
Database Buffers           67108864 bytes
Redo Buffers                6529024 bytes
Database mounted.
Database opened.
開啟庫時的日誌
ALTER DATABASE   MOUNT
Fri Mar 07 12:16:21 2014
Sweep [inc][246535]: completed
……………………………………
Sweep [inc2][246423]: completed
Sweep [inc2][246415]: completed
Sweep [inc2][246407]: completed
Successful mount of redo thread 1, with mount id 3368434514
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: ALTER DATABASE   MOUNT
     

相關文章