Recovery when every file except the parameter file is lost and only an RMAN backup of the datafiles exists (2)

Posted by jst143 on 2011-03-25

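A customer reported a database failure: a newly hired system administrator had deleted files by mistake. Specifically, all the important datafiles and all the control files were gone. The database had been running in archivelog mode and was backed up with RMAN (Recovery Manager), which keeps its backup records in the control file. Fortunately, the most recent RMAN full backup included the control file. Control file autobackup had not been configured. As things stood, the database would not start.
Needless to say, the customer's backup scheme was inadequate, but pointing that out now would only be hindsight; as the saying goes, "the customer is king, do not offend him." Besides, the customer did have a full backup. The control file in it was not an autobackup, so the usual recovery procedure cannot be used, but having the backup at all was unquestionably good news for us.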

Below we walk through a simulation to demonstrate how this problem can be solved.

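Background
Since Oracle 8.1.6, Oracle has shipped a package named DBMS_BACKUP_RESTORE. The package is created by two scripts, dbmsbkrs.sql and prvtbkrs.plb, which catproc.sql calls when it runs, so every database has it. The package is the interface between the Oracle server and the operating system for backup and restore I/O, and Recovery Manager calls it directly. Its functionality is built into Oracle's own library files.

It follows that we can call this package while the instance is in nomount state, which is exactly what our recovery needs. The scripts carry detailed documentation in their comments (in practice in dbmsbkrs.sql, since prvtbkrs.plb is shipped wrapped). For reasons of space it is not all reproduced here, but some of the original text is quoted below.

The key parts are: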

FUNCTION  deviceAllocate(
       type IN varchar2 default NULL
      ,name IN varchar2 default NULL
      ,ident IN varchar2 default NULL
      ,noio IN boolean default FALSE
      ,params IN varchar2 default NULL )
RETURN varchar2;
-- Describe the device to be used for sequential I/O. For device types where
-- only one process at a time can use a device, this call allocates a device
-- for exclusive use by this session. The device remains allocated until
-- deviceDeallocate is called or session termination. The device can be used
-- both for creating and restoring backups.
--
-- Specifying a device allocates a context that exists until the session
-- terminates or deviceDeallocate is called. Only one device can be specified
-- at a time for a particular session. Thus deviceDeallocate must be called
-- before a different device can be specified. This is not a limitation since
-- a session can only read or write one backup at a time.
--
-- The other major effect of allocating a device is to specify the name space
-- for the backup handles (file names). The handle for a sequential file does
-- not necessarily define the type of device used to write the file. Thus it
-- is necessary to specify the device type in order to interpret the file
-- handle. The NULL device type is defined for all systems. It is the file
-- system supplied by the operating system. The sequential file handles are
-- thus normal file names.
--
-- A device can be specified either by name or by type.
-- If the type is specified but not the name, the system picks an
-- available device of that type.
-- If the name is specified but not the type, the type is determined
-- from the device.
-- If neither the type or the name is given, the backups are files in
-- the operating system file system.
-- Note that some types of devices, optical disks for example, can be shared
-- by many processes, and thus do not really require allocation of the device
-- itself. However we do need to allocate the context for accessing the
-- device, and we do need to know the device type for proper interpretation
-- of the file handle. Thus it is always necessary to make the device
-- allocation call before making most other calls in this package.
--
-- Input parameters:
-- type
-- If specified, this gives the type of device to use for sequential
-- I/O. The allowed types are port specific. For example a port may
-- support the type "TAPE" which is implemented via the Oracle tape
-- API. If no type is specified, it may be implied by specifying a
-- particular device name to allocate. The type should be allowed to
-- default to NULL if operating system files are to be used.
--
-- name
-- If specified, this names a particular piece of hardware to use for
-- accessing sequential files. If not specified, any available
-- device of the correct type will be allocated. If the device cannot
-- be shared, it is allocated to this session for exclusive use.
-- The name should be allowed to default to NULL if operating system
-- files are to be used.
--
-- ident
-- This is the users identifier that he uses to name this device. It
-- is only used to report the status of this session via
-- dbms_application_info. This value will be placed in the CLIENT_INFO
-- column of the V$SESSION table, in the row corresponding to the
-- session in which the device was allocated. This value can also
-- be queried with the dbms_application_info.read_client_info procedure.
--
-- noio
-- If TRUE, the device will not be used for doing any I/O. This allows
-- the specification of a device type for deleting sequential files
-- without actually allocating a piece of hardware. An allocation for
-- noio can also be used for issuing device commands. Note that some
-- commands may actually require a physical device and thus will get
-- an error if the allocate was done with noio set to TRUE.
--
-- params
-- This string is simply passed to the device allocate OSD. It is
-- completely port and device specific.
--
-- Returns:
-- It returns a valid device type. This is the type that should be
-- allocated to access the same sequential files at a later date. Note
-- that this might not be exactly the same value as the input string.
-- The allocate OSD may do some translation of the type passed in. The
-- return value is NULL when using operating system files.
PROCEDURE restoreControlfileTo(cfname IN varchar2);
-- This copies the controlfile from the backup set to an operating system
-- file. If the database is mounted, the name must NOT match any of the
-- current controlfiles.
-- In other words, when the database is mounted the restore target must be a
-- new file name, not one of the control files currently in use.
-- Input parameters:
-- cfname
-- Name of file to create or overwrite with the controlfile from the
-- backup set.
-- That is, cfname is the output file to create or overwrite, not a name
-- stored in the backup set.
PROCEDURE restoreDataFileTo( dfnumber IN binary_integer
,toname IN varchar2 default NULL);
--
-- restoreDataFileTo creates the output file from a complete backup in the
-- backup set.
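-- That is, the datafile is written out from the complete backup held in the
-- backup set.
If you are interested, the comments in these files are worth reading.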

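Solution
First, take a full backup of the database that includes the current control file, with RMAN running against the control file rather than a recovery catalog: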

C:\WUTemp>rman target /
Recovery Manager: Release 9.2.0.1.0 - Production.
Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.
connected to target database: DEMO (DBID=3272375326)
RMAN> run {
2> allocate channel C1 type disk;
3> backup full tag 'FullBackup' format 'd:\KDE\%d_%u_%s_%p.dbf' database include current controlfile;
4> sql ' alter system archive log current';
5> release channel C1;
6> }
using target database controlfile instead of recovery catalog
allocated channel: C1
channel C1: sid=15 devtype=DISK
Starting backup at 18-JUL-04
channel C1: starting full datafile backupset
channel C1: specifying datafile(s) in backupset
including current SPFILE in backupset
including current controlfile in backupset
input datafile fno=00001 name=D:\ORACLE\ORADATA\DEMO\SYSTEM01.DBF
input datafile fno=00002 name=D:\ORACLE\ORADATA\DEMO\UNDOTBS01.DBF
input datafile fno=00004 name=D:\ORACLE\ORADATA\DEMO\EXAMPLE01.DBF
input datafile fno=00009 name=D:\ORACLE\ORADATA\DEMO\XDB01.DBF
input datafile fno=00005 name=D:\ORACLE\ORADATA\DEMO\INDX01.DBF
input datafile fno=00008 name=D:\ORACLE\ORADATA\DEMO\USERS01.DBF
input datafile fno=00003 name=D:\ORACLE\ORADATA\DEMO\DRSYS01.DBF
input datafile fno=00006 name=D:\ORACLE\ORADATA\DEMO\ODM01.DBF
input datafile fno=00007 name=D:\ORACLE\ORADATA\DEMO\TOOLS01.DBF
channel C1: starting piece 1 at 18-JUL-04
channel C1: finished piece 1 at 18-JUL-04
piece handle=D:\KDE\DEMO_01FR79OT_1_1.DBF comment=NONE
channel C1: backup set complete, elapsed time: 00:01:17
Finished backup at 18-JUL-04
sql statement: alter system archive log current
released channel: C1

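As shown above, we took a full backup of the database, and the backup piece includes the control file. Note the "input datafile fno=..." lines and the "piece handle=..." line in the output (set in bold in the original post); we will need them during the restore below.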
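
Because the restore later depends on knowing which file number belongs to which datafile, it can help to record that mapping explicitly at backup time instead of relying on the screen output alone. A minimal SQL*Plus sketch, run while the database is still open; the spool path d:\KDE\datafile_map.txt is a made-up example and not part of the original setup:

```sql
-- Hypothetical output file; keep it wherever the backup logs are kept.
spool d:\KDE\datafile_map.txt

-- file# is the number later passed as dfnumber to restoreDatafileTo.
SELECT file#, name FROM v$datafile ORDER BY file#;

-- The control file locations are worth recording as well.
SELECT name FROM v$controlfile;

spool off
```

With such a file next to the backup piece, the file numbers are available even if the RMAN log of the backup run was not saved.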

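To simulate the failure, shut down the instance and delete all the control files and all the .DBF files. A startup then fails with the following error: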

SQL> startup
ORACLE instance started.
Total System Global Area 152115804 bytes
Fixed Size 453212 bytes
Variable Size 100663296 bytes
Database Buffers 50331648 bytes
Redo Buffers 667648 bytes
ORA-00205: error in identifying controlfile, check alert log for more info

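The alert log should show that the system cannot find the control files. The situation now matches the customer's problem, and the DBMS_BACKUP_RESTORE calls described in the background section are what we rely on from here on.

We first try to restore the control file: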

SQL>startup force nomount;
SQL> DECLARE
2 devtype varchar2(256);
3 done boolean;
4 BEGIN
5 devtype:=sys.dbms_backup_restore.deviceAllocate(type=>'',ident=>'T1');
6 sys.dbms_backup_restore.restoreSetDatafile;
7 sys.dbms_backup_restore.restoreControlfileTo (cfname=>'d:\oracle\Control01.ctl');
8 sys.dbms_backup_restore.restoreBackupPiece(done=>done,handle=>'D:\KDE\DEMO_01FR79OT_1_1.DBF', params=>null);
9 sys.dbms_backup_restore.deviceDeallocate;
10 END;
11 /
PL/SQL procedure successfully completed.

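OK, the control file has been restored. An explanation of the block above:

Line 5 allocates a device channel. Because we are reading operating system files the type is empty; to restore from tape it would be "sbt_tape".
Line 6 starts the restore conversation.
Line 7 gives the destination for the file being restored.
Line 8 names the backup piece to restore from.
Line 9 releases the device channel.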
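
The block above never looks at the done flag that restoreBackupPiece returns. As far as the package comments describe it, done is TRUE once every requested file has been restored and FALSE when further backup pieces are still needed. The following variation is only a sketch under that assumption: it reuses the calls and paths from the session above and adds a check, and the error number -20001 and message text are arbitrary choices. It also assumes raise_application_error is callable at nomount, which has not been verified here.

```sql
DECLARE
  devtype VARCHAR2(256);
  done    BOOLEAN;
BEGIN
  -- Empty type: the backup piece is an ordinary operating system file.
  devtype := sys.dbms_backup_restore.deviceAllocate(type => '', ident => 'T1');
  sys.dbms_backup_restore.restoreSetDatafile;
  sys.dbms_backup_restore.restoreControlfileTo(cfname => 'd:\oracle\Control01.ctl');
  sys.dbms_backup_restore.restoreBackupPiece(
    done   => done,
    handle => 'D:\KDE\DEMO_01FR79OT_1_1.DBF',
    params => NULL);
  sys.dbms_backup_restore.deviceDeallocate;

  -- Assumption: done = FALSE (or NULL) means the restore is incomplete.
  IF done IS NULL OR NOT done THEN
    raise_application_error(-20001, 'restore incomplete: another backup piece is needed');
  END IF;
END;
/
```

With a backup set that has only one piece, as here, the check should pass silently; it earns its keep when a backup set spans several pieces.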
Let's verify the result of the restore:

SQL> host dir d:\oracle
Volume in drive D is DATA
Volume Serial Number is DC79-57F8
Directory of d:\oracle
07/18/2004 09:08 PM <DIR> .
07/18/2004 09:08 PM <DIR> ..
06/08/2004 03:21 PM <DIR> admin
07/18/2004 09:08 PM 1,871,872 CONTROL01.CTL
07/16/2004 11:27 AM <DIR> ORA92
07/18/2004 09:02 PM <DIR> oradata

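So the control file was restored successfully. Had the control file been backed up separately after the full backup, we could now shut down the instance, copy the control file to its proper locations, and simply run restore database; in RMAN.

Our situation, however, is a little different. The control file we restored was written into the same backup set as the datafiles, so it carries no record of that backup set for RMAN to restore from.

Depending on which files were lost, continue the restore as follows: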

SQL> DECLARE
2 devtype varchar2(256);
3 done boolean;
4 BEGIN
5 devtype:=sys.dbms_backup_restore.deviceAllocate (type=>'',ident=>'t1');
6 sys.dbms_backup_restore.restoreSetDatafile;
7 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>01,toname=>'d:\oracle\oradata\demo\SYSTEM01.DBF');
8 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>02,toname=>'d:\oracle\oradata\demo\UNDOTBS01.DBF');
9 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>03,toname=>'d:\oracle\oradata\demo\DRSYS01.DBF');
10 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>04,toname=>'d:\oracle\oradata\demo\EXAMPLE01.DBF');
11 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>05,toname=>'d:\oracle\oradata\demo\INDX01.DBF');
12 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>06,toname=>'d:\oracle\oradata\demo\ODM01.DBF');
13 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>07,toname=>'d:\oracle\oradata\demo\TOOLS01.DBF');
14 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>08,toname=>'d:\oracle\oradata\demo\USERS01.DBF');
15 sys.dbms_backup_restore.restoreDatafileTo(dfnumber=>09,toname=>'d:\oracle\oradata\demo\XDB01.DBF');
16 sys.dbms_backup_restore.restoreBackupPiece(done=>done,handle=>'D:\KDE\DEMO_01FR79OT_1_1.DBF',
params=>null);
17 sys.dbms_backup_restore.deviceDeallocate;
18 END;
19 /
PL/SQL procedure successfully completed.
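-- In our case every datafile was lost, so we simply restore them all in the same way.
-- The file numbers come from the screen output of the full backup earlier, which is why keeping the log of each backup run is a good habit.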
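
Before the instance can be mounted, the restored control file has to be present at every location named by the control_files initialization parameter; the file restored earlier sits in d:\oracle, which may not be one of them. A minimal sketch of checking and copying from SQL*Plus follows. The target path in the copy command is a hypothetical example, since the original post does not state where the control files lived.

```sql
-- Where does the instance expect its control files? (v$parameter is readable at NOMOUNT.)
SELECT value FROM v$parameter WHERE name = 'control_files';

-- Copy the restored file to every path the query returns.
-- The target below is a hypothetical example; substitute the real paths.
host copy d:\oracle\Control01.ctl D:\ORACLE\ORADATA\DEMO\CONTROL01.CTL
```

If control_files lists several paths, repeat the copy for each one so that all copies are identical when the database mounts.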

SQL> startup force mount;
ORACLE instance started.
Total System Global Area 152115804 bytes
Fixed Size 453212 bytes
Variable Size 100663296 bytes
Database Buffers 50331648 bytes
Redo Buffers 667648 bytes
Database mounted.
SQL> Recover database using backup controlfile until cancel ;
ORA-00279: change 243854 generated at 07/18/2004 20:57:03 needed for thread 1
ORA-00289: suggestion : D:\KDE\ARC00002.001
ORA-00280: change 243854 for thread 1 is in sequence #2
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:\KDE\ARC00002.001
ORA-00279: change 244089 generated at 07/18/2004 20:58:18 needed for thread 1
ORA-00289: suggestion : D:\KDE\ARC00003.001
ORA-00280: change 244089 for thread 1 is in sequence #3
ORA-00278: log file 'D:\KDE\ARC00002.001' no longer needed for this recovery
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
CANCEL
Media recovery cancelled.
SQL> alter database open resetlogs;
Database altered.

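In the end we have no choice but to open with resetlogs, because the recovery was done with a backup control file.

Then clean up and immediately take a full backup of the database. If you are the DBA, go on to draw up and refine a backup plan. It is not too late to mend the fold after the sheep are lost.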

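Summary
The control file matters a great deal in backups. Back it up separately every time and, if the database version allows, turn on control file autobackup. Also raise the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter as far as is practical so that backup records are retained longer.
Draw up a reasonably complete backup plan; a single gap in it can bring disaster to the system. Remember: "anything that can go wrong will go wrong."
Knowing how RMAN backs up internally, and having some command of DBMS_BACKUP_RESTORE, helps a great deal at critical moments.
Backup scripts should redirect their log output and keep it, so that useful information can be found when something goes wrong.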
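
The control file recommendations above translate into a few statements. This is only a sketch: the retention value of 30 days and the backup path are illustrative assumptions, not values from the original system. Control file autobackup itself is switched on from the RMAN prompt with CONFIGURE CONTROLFILE AUTOBACKUP ON; (available from Oracle 9i) rather than from SQL*Plus.

```sql
-- Keep RMAN backup records in the control file for 30 days (the default is 7).
-- The value 30 is illustrative only.
ALTER SYSTEM SET control_file_record_keep_time = 30;

-- Take a separate binary copy of the control file after each full backup.
-- The target path is a hypothetical example.
ALTER DATABASE BACKUP CONTROLFILE TO 'd:\KDE\control_after_full.bkp' REUSE;

-- Also write a CREATE CONTROLFILE script to a trace file as a last resort.
ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
```

A control file copy taken after the full backup contains the record of that backup, which is what allows the ordinary RMAN restore path to work.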

From the "ITPUB blog", link: http://blog.itpub.net/23577591/viewspace-690514/. If you reproduce this article, please credit the source; otherwise legal liability will be pursued.
