【備份恢復】不使用rman工具就能恢復被rm刪除的資料檔案案例

leonarding發表於2012-07-27

引題:朋友一時興起使用了rm**,刪除了oracle資料檔案後找我幫忙,我在幫朋友恢復資料庫時,遇到了當recover時,報錯不能找到28739號歸檔日誌,這樣我就不能同步scn,更不能開啟資料庫了。這是歸檔日誌不連續的典型案例,我最後告訴他要做好心理準備。事情還沒有完,這個真實案例引發了我的思考,如果當時在朋友沒有做rman拯救措施的情況下,可不可能不使用rman即可恢復資料檔案呢!最後我找到了答案:)

案例
1.
系統 solaris SunOS TJLT-YDWG6 5.9 Generic_122300-25 sun4u sparc SUNW,Sun-Fire-V890
DB  Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
2.
案情描述
現場工程師使用了rm -rf *.dbf 命令把所有的資料檔案全部刪除了
現在有54日的備份
restore database until time "to_date('2012-05-04 12:00:00','yyyy-mm-dd hh24:mi:ss')";
  進行恢復顯示finish restore complete 沒有問題已經把檔案 複製回來了
進行同步
RMAN> recover database until time "to_date('2012-05-04 11:00:00','yyyy-mm-dd hh24:mi:ss')";
                           
Starting recover at 2012-07-26 14:02:42
using channel ORA_DISK_1
starting media recovery
unable to find archive log
archive log thread=1 sequence=28739
   缺少28739號的歸檔日誌,導致undotbs01.dbf檔案不一致
Oracle Error:
ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below
ORA-01194: file 2 needs more recovery to be consistent
ORA-01110: data file 2: '/opt/oradata/kpidb/undotbs01.dbf'
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of recover command at 07/26/2012 14:02:51
RMAN-06054: media recovery requesting unknown log: thread 1 seq 28739 lowscn 1513525474
Leonarding
2012.7.26

在我們工作中可能會經常發生這樣類似的突發狀況,在遇到此情況下首先要做的就是冷靜,上面發生的問題到了我這裡之後,我就發現資料庫已經變成了mount狀態,在使用檔案控制程式碼方式恢復資料檔案已經為時已晚,所以我採用了常規的恢復方式,沒想到啊沒想到,歸檔日誌還不全,立馬我整個人都“斯巴達”了,最後告訴朋友做DBA是需要勇氣的。

下面我用自己的測試庫演示一下作業系統rm級別的刪除資料檔案後,資料庫仍然處於open狀態的時候使用檔案控制程式碼來恢復被rm刪除的資料檔案,並最終順利開啟資料庫的實驗!我是一個比較嚴謹的人,所以一上來我先做個了“壓縮全庫備份”做到有備無患

描述一下場景:

作業系統:Enterprise Linux Enterprise Linux AS release 4 (October Update 8)

資料庫版本:Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod

一、備份重於一切

因此在演示之前我們先做一下備份

RMAN> show all;

RMAN配置引數區,如下都是預設值

RMAN configuration parameters are:

冗餘配置保留政策:冗餘數是1

CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default

開啟增量備份:關

CONFIGURE BACKUP OPTIMIZATION OFF; # default

預設備份裝置是磁碟

CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default

控制檔案自動備份:關

CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default

控制檔案自動備份目錄和格式:%F  【備份裝置:Disk】

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default

備份的並行度:1,備份型別為備份集

CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default

資料檔案採用複製方式備份

CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

歸檔日誌採用複製方式備份

CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

最大值:無限制

CONFIGURE MAXSETSIZE TO UNLIMITED; # default

加密資料庫:關

CONFIGURE ENCRYPTION FOR DATABASE OFF; # default

加密演算法採用AES128

CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default

歸檔日誌刪除策略:空

CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default

快照控制檔名

CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/10.2.0/db_1/dbs/snapcf_LEO.f'; # default

配置RMAN預設備份介質存放目錄到/home/oracle/backup

RMAN> configure channel device type disk format '/home/oracle/backup/DB_LEO_%U';

new RMAN configuration parameters:

CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT   '/home/oracle/backup/DB_LEO_%U';

new RMAN configuration parameters are successfully stored

配置控制檔案自動備份並儲存到/home/oracle/backup/control目錄

RMAN> configure controlfile autobackup on;              啟動控制檔案自動備份

new RMAN configuration parameters:

CONFIGURE CONTROLFILE AUTOBACKUP ON;

new RMAN configuration parameters are successfully stored

RMAN> configure controlfile autobackup format for device type disk to  設定控制檔案自動備份目錄和格式

'/home/oracle/backup/control/cf_LEO_%F';

new RMAN configuration parameters:

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO

'/home/oracle/backup/control/cf_LEO_%F';

new RMAN configuration parameters are successfully stored

顯示配置後RMAN環境變數

RMAN configuration parameters are:

CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default

CONFIGURE BACKUP OPTIMIZATION OFF; # default

CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default

CONFIGURE CONTROLFILE AUTOBACKUP ON;

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/home/oracle/backup/control/cf_LEO_%F';

CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default

CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT   '/home/oracle/backup/DB_LEO_%U';

CONFIGURE MAXSETSIZE TO UNLIMITED; # default

CONFIGURE ENCRYPTION FOR DATABASE OFF; # default

CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default

CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default

CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/10.2.0/db_1/dbs/snapcf_LEO.f'; # default

使用rman命令壓縮備份資料庫

RMAN> backup as compressed backupset full database format

2> '/home/oracle/backup/full_bk1_%u%p%s.rmn' include current controlfile

3> plus

4> archivelog format '/home/oracle/backup/arch_bk1_%u%p%s.rmn'  delete all input; (刪除備份過的舊歸檔日誌)

Starting backup at 26-JUL-12

current log archived

allocated channel: ORA_DISK_1

channel ORA_DISK_1: sid=318 devtype=DISK

channel ORA_DISK_1: starting compressed archive log backupset

channel ORA_DISK_1: specifying archive log(s) in backup set

input archive log thread=1 sequence=5 recid=1 stamp=784059130

input archive log thread=1 sequence=6 recid=6 stamp=788723549

input archive log thread=1 sequence=7 recid=4 stamp=788723547

input archive log thread=1 sequence=8 recid=5 stamp=788723548

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/arch_bk1_01nh3iuq11.rmn tag=TAG20120726T235048 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:16

channel ORA_DISK_1: deleting archive log(s)

archive log filename=/home/oracle/arch/LEO/1_5_784052660.dbf recid=1 stamp=784059130

archive log filename=/home/oracle/arch/LEO/1_6_784052660.dbf recid=6 stamp=788723549

archive log filename=/home/oracle/arch/LEO/1_6_784052660.dbf recid=2 stamp=784059169

archive log filename=/home/oracle/arch/LEO/1_7_784052660.dbf recid=4 stamp=788723547

archive log filename=/home/oracle/arch/LEO/1_7_784052660.dbf recid=3 stamp=784059221

archive log filename=/home/oracle/arch/LEO/1_8_784052660.dbf recid=5 stamp=788723548

channel ORA_DISK_1: starting compressed archive log backupset

channel ORA_DISK_1: specifying archive log(s) in backup set

input archive log thread=1 sequence=1 recid=8 stamp=789695445

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/arch_bk1_02nh3ivb12.rmn tag=TAG20120726T235048 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:09

channel ORA_DISK_1: deleting archive log(s)

archive log filename=/home/oracle/arch/LEO/1_1_788725257.dbf recid=8 stamp=789695445

channel ORA_DISK_1: starting compressed archive log backupset

channel ORA_DISK_1: specifying archive log(s) in backup set

input archive log thread=1 sequence=1 recid=7 stamp=788725258

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/arch_bk1_03nh3ivk13.rmn tag=TAG20120726T235048 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02

channel ORA_DISK_1: deleting archive log(s)

archive log filename=/home/oracle/arch/LEO/1_1_788723547.dbf recid=7 stamp=788725258

Finished backup at 26-JUL-12

 

Starting backup at 26-JUL-12

using channel ORA_DISK_1

channel ORA_DISK_1: starting compressed full datafile backupset

channel ORA_DISK_1: specifying datafile(s) in backupset

input datafile fno=00001 name=/u01/app/oracle/oradata/LEO/file1/system01.dbf

input datafile fno=00003 name=/u01/app/oracle/oradata/LEO/file1/sysaux01.dbf

input datafile fno=00002 name=/u01/app/oracle/oradata/LEO/file1/undotbs01.dbf

input datafile fno=00005 name=/u01/app/oracle/oradata/LEO/file1/example01.dbf

input datafile fno=00004 name=/u01/app/oracle/oradata/LEO/file1/users01.dbf

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/full_bk1_04nh3ivo14.rmn tag=TAG20120726T235119 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:04:56

channel ORA_DISK_1: starting compressed full datafile backupset

channel ORA_DISK_1: specifying datafile(s) in backupset

including current control file in backupset

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/full_bk1_05nh3j9015.rmn tag=TAG20120726T235119 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02

Finished backup at 26-JUL-12

 

Starting backup at 26-JUL-12

current log archived

using channel ORA_DISK_1

channel ORA_DISK_1: starting compressed archive log backupset

channel ORA_DISK_1: specifying archive log(s) in backup set

input archive log thread=1 sequence=2 recid=9 stamp=789695779

channel ORA_DISK_1: starting piece 1 at 26-JUL-12

channel ORA_DISK_1: finished piece 1 at 26-JUL-12

piece handle=/home/oracle/backup/arch_bk1_06nh3j9316.rmn tag=TAG20120726T235619 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02

channel ORA_DISK_1: deleting archive log(s)

archive log filename=/home/oracle/arch/LEO/1_2_788725257.dbf recid=9 stamp=789695779

Finished backup at 26-JUL-12

 

Starting Control File and SPFILE Autobackup at 26-JUL-12

piece handle=/home/oracle/backup/control/cf_LEO_c-1558319476-20120726-00 comment=NONE

Finished Control File and SPFILE Autobackup at 26-JUL-12

ll        看一下我們的備份檔案已經生成,下面可以安心做實驗啦

-rw-r-----  1 oracle oinstall 11180032 Jul 26 23:51 arch_bk1_01nh3iuq11.rmn

-rw-r-----  1 oracle oinstall  7031808 Jul 26 23:51 arch_bk1_02nh3ivb12.rmn

-rw-r-----  1 oracle oinstall  1030144 Jul 26 23:51 arch_bk1_03nh3ivk13.rmn

-rw-r-----  1 oracle oinstall     8704 Jul 26 23:56 arch_bk1_06nh3j9316.rmn

drwxr-xr-x  2 oracle oinstall     4096 Jul 26 23:56 control

-rw-r-----  1 oracle oinstall 57221120 Jul 26 23:56 full_bk1_04nh3ivo14.rmn

-rw-r-----  1 oracle oinstall  1097728 Jul 26 23:56 full_bk1_05nh3j9015.rmn

二、模擬資料檔案刪除

SYS@LEO> select status from v$instance;   檢視資料庫是否是open狀態,是open沒有問題

STATUS

------------

OPEN

SYS@LEO> select name from v$datafile;    檢視存在的資料檔案

NAME

--------------------------------------------------------------------------------

/u01/app/oracle/oradata/LEO/file1/system01.dbf

/u01/app/oracle/oradata/LEO/file1/undotbs01.dbf

/u01/app/oracle/oradata/LEO/file1/sysaux01.dbf

/u01/app/oracle/oradata/LEO/file1/users01.dbf

/u01/app/oracle/oradata/LEO/file1/example01.dbf

SYS@LEO> host rm /u01/app/oracle/oradata/LEO/file1/example01.dbf  刪除example01.dbf資料檔案

我們檢查一下刪除example01.dbf檔案後資料庫是不是還處在open狀態

SYS@LEO> select status from v$instance;    呵呵  就像我們在開始說的一下,open木有問題

STATUS

------------

OPEN

我們檢查一下刪除example01.dbf檔案後表空間處於什麼狀態呢?

SYS@LEO> select TABLESPACE_NAME,status from dba_tablespaces;

TABLESPACE_NAME                STATUS

------------------------------ ---------

SYSTEM                         ONLINE

UNDOTBS                        ONLINE

SYSAUX                          ONLINE

TEMPTS1                        ONLINE

USERS                           ONLINE

TEMP1                          ONLINE

TEMP2                          ONLINE

EXAMPLE                        ONLINE      我們看到EXAMPLE還是聯機狀態,為什麼是這樣呢,我們不是已經把檔案刪除了嘛!對嘍 可能有人已經想到了,我們是從作業系統層面上直接刪除的(沒有透過資料庫刪除),所以資料庫此時還認為這個檔案沒有變化,資料字典中記錄的還是完好的狀態

conn ls/ls                                          切換使用者

資料檔案被刪除,所以建立表失敗(當我們往資料檔案上寫資料時,資料庫才會檢查檔案狀態)

LS@LEO> create table t1 tablespace example as select * from liusheng;

create table t1 tablespace example as select * from liusheng  

ERROR at line 1:

ORA-01116: error in opening database file 5

ORA-01110: data file 5: '/u01/app/oracle/oradata/LEO/file1/example01.dbf'

ORA-27041: unable to open file

Linux Error: 2: No such file or directory

Additional information: 3

注:如果我們在其他資料檔案上建立表時,是可以正常建立的,因為其他資料檔案沒有被破壞

LS@LEO> create table t2 tablespace users as select * from liusheng;

Table created.

三、透過檔案控制程式碼恢復資料檔案

查詢dbwr程式pid

[oracle@secdb1 bdump]$ ps -ef|grep dbw|grep -v grep

oracle   26745     1  0 Jul26 ?        00:00:01 ora_dbw0_LEO

注:1.可能有的朋友對  grep –v grep 這句不是很明白,-v 選項反向過濾oracle    2405   421  0 01:03 pts/9    00:00:00 grep dbw 這行記錄的意思,只保留 ora_dbw0_LEO 這行記錄,方便觀察

2. dbwr程式會開啟所有訪問資料檔案的控制程式碼。在proc目錄中可以查到,目錄名是程式PIDfdfile discription)表示檔案描述符。

cd  /proc/26745/fd

ls -l

lr-x------  1 oracle oinstall 64 Jul 27 01:09 0 -> /dev/null

lr-x------  1 oracle oinstall 64 Jul 27 01:09 1 -> /dev/null

lrwx------  1 oracle oinstall 64 Jul 27 01:09 10 -> /u01/app/oracle/admin/LEO/adump/ora_26737.aud

lr-x------  1 oracle oinstall 64 Jul 27 01:09 11 -> /dev/zero

lr-x------  1 oracle oinstall 64 Jul 27 01:09 12 -> /dev/zero

lr-x------  1 oracle oinstall 64 Jul 27 01:09 13 -> /u01/app/oracle/product/10.2.0/db_1/rdbms/mesg/oraus.msb

lrwx------  1 oracle oinstall 64 Jul 27 01:09 14 -> /u01/app/oracle/product/10.2.0/db_1/dbs/hc_LEO.dat

lrwx------  1 oracle oinstall 64 Jul 27 01:09 15 -> /u01/app/oracle/product/10.2.0/db_1/dbs/lkLEO

lrwx------  1 oracle oinstall 64 Jul 27 01:09 16 -> /u01/app/oracle/oradata/LEO/file1/control01.ctl

lrwx------  1 oracle oinstall 64 Jul 27 01:09 17 -> /u01/app/oracle/oradata/LEO/file2/control02.ctl

lrwx------  1 oracle oinstall 64 Jul 27 01:09 18 -> /u01/app/oracle/oradata/LEO/file3/control03.ctl

lrwx------  1 oracle oinstall 64 Jul 27 01:09 19 -> /u01/app/oracle/oradata/LEO/file1/system01.dbf

lr-x------  1 oracle oinstall 64 Jul 27 01:09 2 -> /dev/null

lrwx------  1 oracle oinstall 64 Jul 27 01:09 20 -> /u01/app/oracle/oradata/LEO/file1/undotbs01.dbf

lrwx------  1 oracle oinstall 64 Jul 27 01:09 21 -> /u01/app/oracle/oradata/LEO/file1/sysaux01.dbf

lrwx------  1 oracle oinstall 64 Jul 27 01:09 22 -> /u01/app/oracle/oradata/LEO/file1/users01.dbf

lrwx------  1 oracle oinstall 64 Jul 27 01:09 23 -> /u01/app/oracle/oradata/LEO/file1/example01.dbf (deleted)

lrwx------  1 oracle oinstall 64 Jul 27 01:09 24 -> /u01/app/oracle/oradata/LEO/file1/temp01.dbf

lrwx------  1 oracle oinstall 64 Jul 27 01:09 25 -> /u01/app/oracle/oradata/LEO/file1/temp1_01.dbf

lrwx------  1 oracle oinstall 64 Jul 27 01:09 26 -> /u01/app/oracle/oradata/LEO/file1/temp1_02.dbf

lr-x------  1 oracle oinstall 64 Jul 27 01:09 3 -> /dev/null

lr-x------  1 oracle oinstall 64 Jul 27 01:09 4 -> /dev/null

l-wx------  1 oracle oinstall 64 Jul 27 01:09 5 -> /u01/app/oracle/admin/LEO/udump/leo_ora_26737.trc

l-wx------  1 oracle oinstall 64 Jul 27 01:09 6 -> /u01/app/oracle/admin/LEO/bdump/alert_LEO.log

lrwx------  1 oracle oinstall 64 Jul 27 01:09 7 -> /u01/app/oracle/product/10.2.0/db_1/dbs/lkinstLEO (deleted)

l-wx------  1 oracle oinstall 64 Jul 27 01:09 8 -> /u01/app/oracle/admin/LEO/bdump/alert_LEO.log

lrwx------  1 oracle oinstall 64 Jul 27 01:09 9 -> /u01/app/oracle/product/10.2.0/db_1/dbs/hc_LEO.dat

注:“23 -> /u01/app/oracle/oradata/LEO/file1/example01.dbf (deleted) 被刪除資料檔案會被標示(deleted),我們使用的是redhat linux,如果在solaris系統中使用lsof命令檢視控制程式碼和檔案對應關係,我們透過copy檔案控制程式碼方式恢復資料檔案到原位置

cp /proc/26745/fd/23 /u01/app/oracle/oradata/LEO/file1/example01.dbf

確認example01.dbf 資料檔案已經恢復成功

ll /u01/app/oracle/oradata/LEO/file1/example01.dbf

-rw-r-----  1 oracle oinstall 209723392 Jul 27 01:22 /u01/app/oracle/oradata/LEO/file1/example01.dbf

四、資料檔案recover

LS@LEO> alter database datafile 5 offline;  先把example01.dbf檔案離線

Database altered.

LS@LEO> recover datafile 5;                  完成介質恢復,實質同步控制檔案、redo日誌scn

Media recovery complete.

LS@LEO> alter database datafile 5 online;  再把example01.dbf檔案聯機

Database altered.

五、測試是否可以正常建立表t1

LS@LEO> create table t1 tablespace example as select * from liusheng;

Table created.                                                                                              成功建立

LS@LEO> select * from t1;

   ORDERID NAME       LS_DATE

---------- ---------- -------------------

         1 ls1        1981-01-02 00:00:00

         1 ls2        1998-01-03 00:00:00

         1 ls3        1999-01-04 00:00:00

         1 ls4        2000-01-05 00:00:00

         1 ls5        2000-01-06 00:00:00

         1 ls6        2001-01-07 00:00:00

         1 ls7        2001-01-08 00:00:00

         1 ls8        2002-01-09 00:00:00

         1 ls9        2002-01-10 00:00:00

         1 ls10       2011-01-11 00:00:00

10 rows selected.

小結:當我們在Linux系統中不小心rm了資料檔案時,一定要冷靜,不要做關閉資料庫、重啟作業系統等危險操作,因為在不瞭解資料庫執行狀態的前提下做這些往往是徒勞的,還可能造成無法挽回的後果。此時我們不妨在資料庫open狀態下使用檔案控制程式碼方式來恢復被我們rm掉的資料檔案。因為只要資料檔案被某個程式使用著,那麼這個程式就會一直持有這個資料檔案的控制程式碼,那麼所指向的資料檔案依然可以讀寫,我們可以從proc->pid->fd目錄中找到被刪除的資料檔案控制程式碼(deletedcopy到原位置來恢復,這也是在恢復視窗時間很短的情況下快速恢復的好方法。

 

Leonarding

2012.7.26

天津&summer

分享技術~收穫快樂

Bloghttp://space.itpub.net/26686207

 
 
 
 

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/26686207/viewspace-737123/,如需轉載,請註明出處,否則將追究法律責任。

相關文章