Oracle: recovering when all redo log groups are lost

Posted by 賀子_DBA時代 on 2019-04-20

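Suppose every redo log group has been lost. There are four cases, each handled differently:

1. Oracle in NOARCHIVELOG mode, database shut down consistently

2. Oracle in NOARCHIVELOG mode, database shut down inconsistently

3. Oracle in ARCHIVELOG mode, database shut down consistently

4. Oracle in ARCHIVELOG mode, database shut down inconsistently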

Case 1: Oracle in NOARCHIVELOG mode, database shut down consistently

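During this experiment I ran into something odd. I first deleted all the redo files at the operating-system level, yet the database went on creating tables and inserting rows normally. My understanding was that a commit triggers the LGWR process to flush redo from the redo log buffer to the redo files, so with the files deleted the commit should fail. It did not: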

[root@testdb59 /data/u01/app/oracle/oradata/stdb59]# ll

total 13697796

-rw-r----- 1 oracle oinstall 144916480 Apr 5 22:30 control01.ctl

-rw-r----- 1 oracle oinstall 2147491840 Apr 5 22:26 liuwenhe.dbf

-rw-r----- 1 oracle oinstall 52429312 Apr 5 22:26 redo01.log

-rw-r----- 1 oracle oinstall 52429312 Apr 5 22:29 redo03.log

-rw-r----- 1 oracle oinstall 4938801152 Apr 5 22:26 soe3.dbf

-rw-r----- 1 oracle oinstall 2469404672 Apr 5 22:26 soe.dbf

-rw-r----- 1 oracle oinstall 2705334272 Apr 5 22:26 sysaux01.dbf

-rw-r----- 1 oracle oinstall 786440192 Apr 5 22:26 system01.dbf

-rw-r----- 1 oracle oinstall 30416896 Oct 16 12:37 temp01.dbf

-rw-r----- 1 oracle oinstall 1073750016 Apr 5 22:26 temp.dbf

-rw-r----- 1 oracle oinstall 309338112 Apr 5 22:26 undotbs01.dbf

-rw-r----- 1 oracle oinstall 166469632 Apr 5 22:26 users01.dbf

Delete the redo files:

[root@testdb59 /data/u01/app/oracle/oradata/stdb59]# rm *.log

Listing the directory again confirms that the redo files are gone:

[root@testdb59 /data/u01/app/oracle/oradata/stdb59]# ll

total 13595388

-rw-r----- 1 oracle oinstall 144916480 Apr 5 22:50 control01.ctl

-rw-r----- 1 oracle oinstall 2147491840 Apr 5 22:50 liuwenhe.dbf

-rw-r----- 1 oracle oinstall 4938801152 Apr 5 22:50 soe3.dbf

-rw-r----- 1 oracle oinstall 2469404672 Apr 5 22:50 soe.dbf

-rw-r----- 1 oracle oinstall 2705334272 Apr 5 22:50 sysaux01.dbf

-rw-r----- 1 oracle oinstall 786440192 Apr 5 22:50 system01.dbf

-rw-r----- 1 oracle oinstall 30416896 Oct 16 12:37 temp01.dbf

-rw-r----- 1 oracle oinstall 1073750016 Apr 5 22:41 temp.dbf

-rw-r----- 1 oracle oinstall 309338112 Apr 5 22:50 undotbs01.dbf

-rw-r----- 1 oracle oinstall 166469632 Apr 5 22:50 users01.dbf

SQL> create table t(int int);

Table created.

SQL> insert into t values (100);

1 row created.

SQL> commit;

SQL>alter system switch logfile;

System altered.

SQL> alter system checkpoint;

System altered.

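This puzzled me at first. After asking my instructor I learned that the instance still holds open file handles on the deleted redo files. Those handles go away when the instance restarts, and only then do the errors appear.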

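(Aside: the file has been removed with rm, yet it still exists on disk. Let me explain how this works before sharing the rest of the experiment. The principle is not difficult. Tools that recover deleted files this way need an ext3 or ext4 filesystem. ext3 is a journaling filesystem, and it stores a file as an inode plus a set of data blocks.

Not sure what an inode or a block is? Think of a partition as a book: the blocks are the contents of each page, and the inodes are the table of contents. To find a file, the system first looks up the inode and then uses it to locate the blocks on disk.

Now for deletion. When a file is deleted, its data is not actually wiped from the disk as you might imagine. Only the link between the inode and the blocks is cut, and the data itself stays on disk, which is why deleting a file is so fast. The old data is overwritten only when new files are later written to the same location. So a deletion is not fatal in itself, but you must not write any new files to that location afterwards.)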
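The open-handle behaviour described above can be reproduced outside Oracle. The following is a minimal sketch, assuming a Linux/POSIX system; the temporary file is only a stand-in for a redo log, and `write_after_unlink` is a name made up for this illustration.

```python
import os
import tempfile


def write_after_unlink(payload: bytes) -> bytes:
    """Delete a file while it is still open, then keep using the descriptor."""
    # Hypothetical stand-in for redo01.log; mkstemp opens it read/write.
    fd, path = tempfile.mkstemp(suffix=".log")
    try:
        os.unlink(path)                    # same effect as `rm`: the name is gone
        assert not os.path.exists(path)    # a directory listing no longer shows it
        os.write(fd, payload)              # the write through the open handle succeeds
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, len(payload))   # and the data can be read back
    finally:
        os.close(fd)                       # only now does the kernel free the inode


result = write_after_unlink(b"redo record")
print(result)
```

This mirrors what LGWR sees: its descriptors stay valid until the instance is shut down, so the missing files are only noticed at the next startup.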

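Because the database was shut down consistently, no instance recovery is needed, and so the lost redo is not needed either. The log groups can simply be cleared and re-created, or recover database can be used to bring the redo logs back. There are therefore two recovery methods for this case: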

Method 1: clear the affected redo log groups directly, which drops and re-creates them.

SQL> shutdown immediate  # consistent shutdown

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup mount

ORACLE instance started.

Total System Global Area 1603411968 bytes

Fixed Size 2253664 bytes

Variable Size 1275071648 bytes

Database Buffers 318767104 bytes

Redo Buffers 7319552 bytes

Database mounted.

SQL> archive log list;

Database log mode No Archive Mode

Automatic archival Disabled

Archive destination USE_DB_RECOVERY_FILE_DEST

Oldest online log sequence 30641

Current log sequence 30642

Clear (or drop and re-create) every redo log group, including the groups in CURRENT and ACTIVE status:

SQL> alter database clear logfile group 1;

Database altered.

SQL> alter database clear logfile group 3;

Database altered.

SQL> alter database open ;

Database altered.

Method 2: restore the redo logs with recover. In my tests this method sometimes raised an error; if it does, fall back to Method 1.

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup mount

ORACLE instance started.

Total System Global Area 830930944 bytes

Fixed Size 2257800 bytes

Variable Size 536874104 bytes

Database Buffers 289406976 bytes

Redo Buffers 2392064 bytes

Database mounted.

SQL>

### Recover past the lost redo files; the files themselves are only re-created automatically by open resetlogs.

SQL> recover database until cancel;

Media recovery complete.

SQL> alter database open resetlogs;

Database altered.

Case 2: Oracle in NOARCHIVELOG mode, database shut down inconsistently

[root@testdb59 /data/u01/app/oracle/oradata/stdb59]# rm -f *.log

SQL> shu abort  ### inconsistent shutdown

ORACLE instance shut down.

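At this point both of the earlier approaches, clear and recover database, fail and cannot recover the database, because instance recovery is now required. For how to tell when instance recovery is needed, see my other article, "Oracle原理-----關於oracle例項恢復的前滾和回滾的理解" (on roll-forward and rollback in Oracle instance recovery). The errors look like this:

First try re-creating the logs. Clearing the current log group fails with a message that the log is still needed, because after an inconsistent shutdown the lost ACTIVE and CURRENT redo really is required for instance recovery.

Start the database in mount state first: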

SQL> alter database clear logfile group 3;

alter database clear logfile group 3

*

ERROR at line 1:

ORA-01624: log 3 needed for crash recovery of instance stdb59 (thread 1)

ORA-00312: online log 3 thread 1:

'/data/u01/app/oracle/oradata/stdb59/redo03.log'
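The rule behind ORA-01624 can be written down as a small check. This is only a sketch of the reasoning in this article, not of Oracle's internal logic: it assumes you have the (GROUP#, STATUS) pairs from v$log, and the function name is made up for illustration.

```python
def groups_needed_for_crash_recovery(log_rows, clean_shutdown):
    """Return the redo groups that cannot simply be cleared.

    log_rows: iterable of (group_number, status) pairs as shown by v$log.
    clean_shutdown: True after shutdown immediate, False after shutdown abort.
    """
    if clean_shutdown:
        # No instance recovery is pending, so every group may be cleared.
        return []
    # After an abort, CURRENT and ACTIVE groups hold redo needed for recovery.
    return sorted(g for g, status in log_rows if status in ("CURRENT", "ACTIVE"))


# Statuses taken from the v$log listing in Case 4 of this article.
rows = [(1, "INACTIVE"), (3, "CURRENT")]
print(groups_needed_for_crash_recovery(rows, clean_shutdown=False))
print(groups_needed_for_crash_recovery(rows, clean_shutdown=True))
```

With those statuses, group 3 is the one reported after an abort, which matches the group named in the ORA-01624 message.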

Next, try recover database. This cannot succeed either, because the redo needed for instance recovery has been lost:

SQL> recover database until cancel;

ORA-00279: change 21959466 generated at 04/06/2019 21:15:45 needed for thread 1

ORA-00289: suggestion :

/data/u01/app/oracle/fast_recovery_area/STDB59/archivelog/2019_04_06/o1_mf_1_2_%

u_.arc

ORA-00280: change 21959466 for thread 1 is in sequence #2

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

CANCEL

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/data/u01/app/oracle/oradata/stdb59/system01.dbf'

ORA-01112: media recovery not started

SQL> alter database open RESETLOGS;

alter database open RESETLOGS

ERROR at line 1:

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/data/u01/app/oracle/oradata/stdb59/system01.dbf'

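For this case, recovery proceeds as follows:

Use the hidden parameter _allow_resetlogs_corruption to force the database open. With this parameter set, Oracle skips certain consistency checks during open, so the database may be able to get past its inconsistent state and open.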

SQL> create pfile='/home/oracle/pfile.ora' from spfile;

File created.

Then add the following line to /home/oracle/pfile.ora:

*._allow_resetlogs_corruption=true
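Editing the pfile by hand is easy to get wrong under pressure, and the parameter has to be taken out again once the database is open. The sketch below adds and removes the line idempotently. The helper names and the sample pfile contents are assumptions made for this illustration; only the parameter line itself comes from the article.

```python
import tempfile
from pathlib import Path

HIDDEN_PARAM = "*._allow_resetlogs_corruption=true"


def add_hidden_param(pfile: Path) -> None:
    """Append the hidden parameter unless it is already present."""
    lines = pfile.read_text().splitlines()
    if HIDDEN_PARAM not in (line.strip() for line in lines):
        lines.append(HIDDEN_PARAM)
    pfile.write_text("\n".join(lines) + "\n")


def remove_hidden_param(pfile: Path) -> None:
    """Drop every line that sets the hidden parameter."""
    prefix = "*._allow_resetlogs_corruption"
    lines = [line for line in pfile.read_text().splitlines()
             if not line.strip().lower().startswith(prefix)]
    pfile.write_text("\n".join(lines) + "\n")


# Demo on a throwaway file; the db_name line is placeholder content.
with tempfile.TemporaryDirectory() as tmp:
    pfile = Path(tmp) / "pfile.ora"
    pfile.write_text("*.db_name='stdb59'\n")
    add_hidden_param(pfile)
    add_hidden_param(pfile)          # second call changes nothing
    after_add = pfile.read_text().splitlines()
    remove_hidden_param(pfile)
    after_remove = pfile.read_text().splitlines()
```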

SQL> startup mount pfile='/home/oracle/pfile.ora';

SQL> recover database until cancel;  # recover past the lost redo files

ORA-00279: change 21959471 generated at 04/06/2019 22:34:01 needed for thread 1

ORA-00289: suggestion :

/data/u01/app/oracle/fast_recovery_area/STDB59/archivelog/2019_04_06/o1_mf_1_2_%

u_.arc

ORA-00280: change 21959471 for thread 1 is in sequence #2

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

CANCEL

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/data/u01/app/oracle/oradata/stdb59/system01.dbf'

ORA-01112: media recovery not started

With luck, the database can now be opened directly with resetlogs:

SQL> alter database open RESETLOGS;

Database altered.

If you hit the error below instead, the control file has to be rebuilt:

SQL> alter database open RESETLOGS;

alter database open RESETLOGS

*

ERROR at line 1:

ORA-01092: ORACLE instance terminated. Disconnection forced

ORA-00704: bootstrap process failure

ORA-00704: bootstrap process failure

ORA-00600: internal error code, arguments: [2662], [0], [21959484], [0],

[21959877], [4194545], [], [], [], [], [], []

Process ID: 13177

Session ID: 63 Serial number: 5

Rebuilding the database control file

1) Running alter database backup controlfile directly, as below, fails:

SQL> alter database backup controlfile to trace as '/data/u01/control_rebuild.trc';

alter database backup controlfile to trace as '/data/u01/control_rebuild.trc'

*

ERROR at line 1:

ORA-16433: The database must be opened in read/write mode.

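2) Alternatively, the control file can be re-created with a hand-written statement in the following form.

Query the redo log information: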

SQL> select GROUP#,MEMBER from v$logfile;

GROUP# MEMBER

3 /data/u01/app/oracle/oradata/stdb59/redo03.log

1 /data/u01/app/oracle/oradata/stdb59/redo01.log

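Query the datafile information (this should be select name from v$datafile; the transcript below repeats the v$logfile query instead):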

SQL> select MEMBER from v$logfile;

MEMBER

--------------------------------------------------------------------------------

/data/u01/app/oracle/oradata/stdb59/redo03.log

/data/u01/app/oracle/oradata/stdb59/redo01.log

/data/u01/app/oracle/oradata/stdb59/redo04.log

/data/u01/app/oracle/oradata/stdb59/redo05.log

/data/u01/app/oracle/oradata/stdb59/redo06.log

/data/u01/app/oracle/oradata/stdb59/redo07.log

Find the database character set:

SQL> select userenv('language') nls_lang from dual;

NLS_LANG

----------------------------------------------------

AMERICAN_AMERICA.AL32UTF8

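Then write the script that creates the control file. Note that testdb57 here is the database name (db_name). If this database is a primary that was converted from an ADG standby, do not use db_unique_name.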

CREATE CONTROLFILE REUSE DATABASE 'testdb57' NORESETLOGS ARCHIVELOG

MAXLOGFILES 50

MAXLOGMEMBERS 5

MAXDATAFILES 100

MAXINSTANCES 8

MAXLOGHISTORY 226

LOGFILE

GROUP 3 '/data/u01/app/oracle/oradata/stdb59/redo03.log' SIZE 50M,

GROUP 1 '/data/u01/app/oracle/oradata/stdb59/redo01.log' SIZE 50M

DATAFILE

'/data/u01/app/oracle/oradata/stdb59/system01.dbf',

'/data/u01/app/oracle/oradata/stdb59/sysaux01.dbf',

'/data/u01/app/oracle/oradata/stdb59/undotbs01.dbf',

'/data/u01/app/oracle/oradata/stdb59/users01.dbf',

'/data/u01/app/oracle/oradata/stdb59/liuwenhe.dbf',

'/data/u01/app/oracle/oradata/stdb59/soe.dbf',

'/data/u01/app/oracle/oradata/stdb59/soe3.dbf'

CHARACTER SET AL32UTF8;
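When there are many datafiles, typing this statement by hand invites mistakes. The sketch below assembles the same statement from the lists gathered by the preceding queries. It is only an illustration: the function name is made up, and the MAX* limits are copied from the script above rather than derived from the database.

```python
def build_create_controlfile(db_name, logfiles, datafiles, charset,
                             log_size="50M", archivelog=True):
    """Assemble a CREATE CONTROLFILE statement shaped like the one above.

    logfiles: list of (group_number, path); datafiles: list of paths.
    """
    mode = "ARCHIVELOG" if archivelog else "NOARCHIVELOG"
    log_lines = ",\n".join(
        f"GROUP {group} '{path}' SIZE {log_size}" for group, path in logfiles)
    data_lines = ",\n".join(f"'{path}'" for path in datafiles)
    return "\n".join([
        f"CREATE CONTROLFILE REUSE DATABASE '{db_name}' NORESETLOGS {mode}",
        "MAXLOGFILES 50",
        "MAXLOGMEMBERS 5",
        "MAXDATAFILES 100",
        "MAXINSTANCES 8",
        "MAXLOGHISTORY 226",
        "LOGFILE",
        log_lines,
        "DATAFILE",
        data_lines,
        f"CHARACTER SET {charset};",
    ])


base = "/data/u01/app/oracle/oradata/stdb59"
sql = build_create_controlfile(
    "testdb57",
    [(3, f"{base}/redo03.log"), (1, f"{base}/redo01.log")],
    [f"{base}/{name}" for name in (
        "system01.dbf", "sysaux01.dbf", "undotbs01.dbf", "users01.dbf",
        "liuwenhe.dbf", "soe.dbf", "soe3.dbf")],
    "AL32UTF8",
)
print(sql)
```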

Then start the instance in nomount state and run the creation script:

SQL> startup nomount pfile='/home/oracle/pfile.ora';

ORACLE instance started.

Total System Global Area 1603411968 bytes

Fixed Size 2253664 bytes

Variable Size 1275071648 bytes

Database Buffers 318767104 bytes

Redo Buffers 7319552 bytes

CREATE CONTROLFILE REUSE DATABASE 'testdb57' NORESETLOGS ARCHIVELOG

MAXLOGFILES 50

MAXLOGMEMBERS 5

MAXDATAFILES 100

MAXINSTANCES 8

MAXLOGHISTORY 226

LOGFILE

GROUP 3 '/data/u01/app/oracle/oradata/stdb59/redo03.log' SIZE 50M,

GROUP 1 '/data/u01/app/oracle/oradata/stdb59/redo01.log' SIZE 50M

DATAFILE

'/data/u01/app/oracle/oradata/stdb59/system01.dbf',

'/data/u01/app/oracle/oradata/stdb59/sysaux01.dbf',

'/data/u01/app/oracle/oradata/stdb59/undotbs01.dbf',

'/data/u01/app/oracle/oradata/stdb59/users01.dbf',

'/data/u01/app/oracle/oradata/stdb59/liuwenhe.dbf',

'/data/u01/app/oracle/oradata/stdb59/soe.dbf',

'/data/u01/app/oracle/oradata/stdb59/soe3.dbf'

CHARACTER SET AL32UTF8;

Control file created.

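Next, use oradebug to advance the SCN held in memory so that the subsequent recover can get past the lost redo files, because the recover process reads the in-memory SCN. Note that alter session set events '10015 trace name adjust_scn level 10'; no longer works as of 11.2.0.4.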

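(Aside: a few words about the Oracle SCN. Inside the database the SCN is a monotonically increasing number that is recorded in the control files, datafiles, online redo logs, archived logs and backup sets. In these files the SCN is stored in two parts, Base and Wrap. Base is the low-order part and is stored in 32 bits. Once the value exceeds 32 bits, the overflow is carried into Wrap. In other words, Wrap counts how many times the Base has rolled over past 4G.)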
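The Base/Wrap split is plain integer arithmetic. Here is a minimal sketch with helper names chosen for this illustration; the sample values are SCNs that appear in this article's transcripts.

```python
BASE_BITS = 32                      # Base holds the low-order 32 bits
BASE_MASK = (1 << BASE_BITS) - 1


def split_scn(scn: int):
    """Return (wrap, base) for a full SCN value."""
    return scn >> BASE_BITS, scn & BASE_MASK


def join_scn(wrap: int, base: int) -> int:
    """Rebuild the full SCN from its two stored parts."""
    return (wrap << BASE_BITS) | base


for scn in (21959486, 4318209770):
    wrap, base = split_scn(scn)
    print(scn, "-> wrap", wrap, "base 0x%08X" % base)
```

The first value fits in Base alone, so its Wrap is 0. The second is larger than 2^32 and so carries a Wrap of 1.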

SQL> oradebug poke 0x06001AE70 4 0x001B7740

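This is how oradebug poke advances the SCN. The first argument is the memory address to write to, the second is the number of bytes to write, and the third is the value to write. The value is read as decimal by default; here it is given in hexadecimal (prefixed with 0x). Each value segment is 8 hexadecimal digits, which corresponds to a length of 4 bytes.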
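To avoid conversion slips, the poke line can be generated rather than typed. The sketch below only builds the command text and does not talk to Oracle. The function name is made up, and the address is the one shown in this article's DUMPvar output, which will differ on other instances.

```python
def poke_command(address: str, scn_base: int, length: int = 4) -> str:
    """Format an oradebug poke line for a value of `length` bytes."""
    if not 0 <= scn_base < 2 ** (8 * length):
        raise ValueError("value does not fit in the requested length")
    width = 2 * length                      # two hex digits per byte
    return f"oradebug poke {address} {length} 0x{scn_base:0{width}X}"


cmd = poke_command("0x06001AE70", 21959486)
print(cmd)
```

The value is zero-padded to 8 hex digits so that it lines up with the 4-byte segments that DUMPvar prints.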

First, find the SCN recorded in the control file:

SQL> select file#, checkpoint_change# from v$datafile;

FILE# CHECKPOINT_CHANGE#

---------- ------------------

1 21959486

2 21959486

3 21959486

4 21959486

5 21959486

6 21959486

7 21959486

7 rows selected.

SQL> oradebug setmypid

Statement processed.

SQL> oradebug DUMPvar SGA kcsgscn_

kcslf kcsgscn_ [06001AE70, 06001AEA0) = 014F14A2 00000001 00000000 00000000 000000EB 00000000 00000000 00000000 00000000 00000000 6001AB50 00000000

SQL> oradebug poke 0x06001AE70 4 21959486

BEFORE: [06001AE70, 06001AE74) = 00000000

AFTER: [06001AE70, 06001AE74) = 014F133E

(Alternatively, convert 21959486 to hexadecimal first and then poke that value:

SQL> select to_char(21959486, 'XXXXXXXXXXX') from dual;

TO_CHAR(2195

------------

14F133E

SQL> oradebug poke 0x06001AE70 4 0x14F133E

BEFORE: [06001AE70, 06001AE74) = 00000000

AFTER: [06001AE70, 06001AE74) = 014F133E)

Checking again, the value has indeed become 014F133E (21959486 in decimal):

SQL> oradebug DUMPvar SGA kcsgscn_

kcslf kcsgscn_ [06001AE70, 06001AEA0) = 014F133E 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 6001AB50 00000000

Then run recover to perform an incomplete recovery:

SQL> recover database until cancel;

ORA-00279: change 21959486 generated at 04/06/2019 23:52:28 needed for thread 1

ORA-00289: suggestion :

/data/u01/app/oracle/fast_recovery_area/STDB59/archivelog/2019_04_07/o1_mf_1_2_%

u_.arc

ORA-00280: change 21959486 for thread 1 is in sequence #2

Specify log: {<RET>=suggested | filename | AUTO | CANCEL}

CANCEL

ORA-01547: warning: RECOVER succeeded but OPEN RESETLOGS would get error below

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/data/u01/app/oracle/oradata/stdb59/system01.dbf'

ORA-01112: media recovery not started

SQL> alter database open resetlogs;

Database altered.

Recovery is now complete.

Case 3: Oracle in ARCHIVELOG mode, database shut down consistently

This case is the same as Case 1: no instance recovery is needed, so you can either clear (drop and re-create) all the redo groups or use recover.

Method 1: clear the affected redo log groups directly, which drops and re-creates them.

SQL> shutdown immediate  # consistent shutdown

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup mount

ORACLE instance started.

Total System Global Area 1603411968 bytes

Fixed Size 2253664 bytes

Variable Size 1275071648 bytes

Database Buffers 318767104 bytes

Redo Buffers 7319552 bytes

Database mounted.

Clear (or drop and re-create) every redo log group, including the groups in CURRENT and ACTIVE status:

SQL> alter database clear logfile group 1;

Database altered.

SQL> alter database clear logfile group 3;

Database altered.

SQL> alter database open ;

Database altered.

Method 2: restore the redo logs with recover. In my tests this method sometimes raised an error; if it does, fall back to Method 1.

SQL> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

SQL> startup mount

ORACLE instance started.

Total System Global Area 830930944 bytes

Fixed Size 2257800 bytes

Variable Size 536874104 bytes

Database Buffers 289406976 bytes

Redo Buffers 2392064 bytes

Database mounted.

SQL>

### Recover past the lost redo files; the files themselves are only re-created automatically by open resetlogs.

SQL> recover database until cancel;

Media recovery complete.

SQL> alter database open resetlogs;

Database altered.

Case 4: Oracle in ARCHIVELOG mode, database shut down inconsistently

In this case the only option is an incomplete recovery using the archived logs.

SQL> select * from v$log;

GROUP# THREAD# SEQUENCE# BYTES BLOCKSIZE MEMBERS ARC

---------- ---------- ---------- ---------- ---------- ---------- ---

STATUS FIRST_CHANGE# FIRST_TIM NEXT_CHANGE# NEXT_TIME

---------------- ------------- --------- ------------ ---------

1 1 39 52428800 512 1 YES

INACTIVE 4318162327 20-APR-19 4318209770 20-APR-19

3 1 40 52428800 512 1 NO

CURRENT 4318209770 20-APR-19 2.8147E+14

SQL> archive log list;

Database log mode Archive Mode

Automatic archival Enabled

Archive destination USE_DB_RECOVERY_FILE_DEST

Oldest online log sequence 39

Next log sequence to archive 40

Current log sequence 40

Delete the redo log files:

[oracle@testdb59 stdb59]$ rm -f *.log

Then shut down inconsistently:

SQL> shu abort

ORACLE instance shut down.

Resolution:

SQL> startup mount

ORACLE instance started.

Total System Global Area 1603411968 bytes

Fixed Size 2253664 bytes

Variable Size 1275071648 bytes

Database Buffers 318767104 bytes

Redo Buffers 7319552 bytes

Database mounted.

### Recover past the lost redo files; the files themselves are only re-created automatically by open resetlogs.

SQL> recover database until cancel;

Media recovery complete.

Try opening with resetlogs. If it fails as shown below, the hidden parameter _allow_resetlogs_corruption is needed again:

SQL> alter database open RESETLOGS;

alter database open RESETLOGS

*

ERROR at line 1:

ORA-01194: file 1 needs more recovery to be consistent

ORA-01110: data file 1: '/data/u01/app/oracle/oradata/stdb59/system01.dbf'

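Use the hidden parameter _allow_resetlogs_corruption to force the database open. With this parameter set, Oracle skips certain consistency checks during open, so the database may be able to get past its inconsistent state and open.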

SQL> create pfile='/home/oracle/pfile.ora' from spfile;

File created.

Then add the following line to /home/oracle/pfile.ora:

*._allow_resetlogs_corruption=true

SQL> startup mount pfile='/home/oracle/pfile.ora';

SQL> alter database open RESETLOGS;

Database altered.

Then shut the database down consistently, remove the hidden parameter _allow_resetlogs_corruption, and restart the database.

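Summary: whether or not archiving is enabled, an inconsistent shutdown requires the hidden parameter _allow_resetlogs_corruption. After a consistent shutdown recovery is much simpler: start in mount state and re-create the lost redo files.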
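The four cases can be condensed into a small lookup. This sketch encodes the command sequences used in this article, not an official Oracle procedure; the function name and the exact wording of the step strings are illustrative.

```python
def recovery_steps(archivelog: bool, clean_shutdown: bool):
    """Return the sequence of steps this article follows for each case."""
    if clean_shutdown:
        # Cases 1 and 3: no instance recovery is pending.
        return [
            "startup mount",
            "alter database clear logfile group <n>;  (repeat for every group)",
            "alter database open;",
        ]
    steps = ["startup mount", "recover database until cancel;"]
    if archivelog:
        # Case 4: try a plain resetlogs open before using the hidden parameter.
        steps.append("alter database open resetlogs;")
        steps.append("on ORA-01194: restart with _allow_resetlogs_corruption=true "
                     "in a pfile and open resetlogs again")
    else:
        # Case 2: the hidden parameter is set before attempting the open.
        steps.insert(0, "add _allow_resetlogs_corruption=true to a pfile")
        steps.append("alter database open resetlogs;")
        steps.append("on ORA-00600 [2662]: re-create the control file, "
                     "advance the SCN with oradebug poke, then recover and open resetlogs")
    steps.append("shut down cleanly, remove the hidden parameter, restart")
    return steps


for arch in (False, True):
    for clean in (True, False):
        print(arch, clean, recovery_steps(arch, clean))
```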

Source: "ITPUB部落格" (ITPUB Blog), link: http://blog.itpub.net/29654823/viewspace-2642066/. If you repost this article, please credit the source; otherwise legal action will be pursued.
