Handling ORA-00600 after a system crash

Posted by zhang41082 on 2019-06-14

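The server room was reorganized over the National Day holiday and a lot of machines were moved around. My guess is that someone pulled the power without shutting Oracle down first. Today a developer reported that one of the test databases could not be reached, so the troubleshooting began.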


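The database on this machine starts automatically at boot, so I logged in and checked the listener status first, which was normal, then connected to the database to check its state: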
SQL> select open_mode from v$database;

OPEN_MODE
----------
MOUNTED

Odd. So I shut the database down and started it again, which failed with the following error:
SQL> startup
ORACLE instance started.

Total System Global Area 1375731712 bytes
Fixed Size 1260780 bytes
Variable Size 603980564 bytes
Database Buffers 754974720 bytes
Redo Buffers 15515648 bytes
Database mounted.
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [], [], [], []

Checking the alert log showed:
Errors in file /opt/oracle/admin/billdb/udump/billdb_ora_7186.trc:
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [], [], [], []
ORA-600 signalled during: ALTER DATABASE OPEN...

Opening the trace file mentioned above:
*** SERVICE NAME:() 2007-10-11 10:08:49.493
*** SESSION ID:(989.3) 2007-10-11 10:08:49.493
Successfully allocated 2 recovery slaves
Using 543 overflow buffers per recovery slave
Thread 1 checkpoint: logseq 67179, block 2, scn 2132065607
cache-low rba: logseq 67179, block 5453
on-disk rba: logseq 67179, block 5828, scn 2132066974
Starting CRASH recovery for thread 1 sequence 67180 block 1
Thread 1 current log 5
Scanning log 5 thread 1 sequence 67179
Scanning log 6 thread 1 sequence 67178
Cannot find online redo log for thread 1 sequence 67180
start recovery at logseq 67179, block 5453, scn 0
----- Redo read statistics for thread 1 -----
Read rate (ASYNC): 252Kb in 0.03s => 8.22 Mb/sec
Total physical reads: 252Kb
Longest record: 9Kb, moves: 0/341 (0%)
Change moves: 1/24 (4%), moved: 0Mb
Longest LWN: 35Kb, moves: 0/110 (0%), moved: 0Mb
Last redo scn: 0x0000.7f14c2af (2132066991)
----------------------------------------------
******** WRITE VERIFICATION FAILED ********
File 15 Block 89 (rdba 0x3c00059)
BWR version: 0x0000.7f14c25e.01 flg: 0x04
Disk version: 0x0000.7f14c12d.01 flag: 0x04
*** 2007-10-11 10:08:49.541
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kcratr1_lastbwr], [], [], [], [], [], [], []
Current SQL statement for this session:
ALTER DATABASE OPEN
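
As a side check, the WRITE VERIFICATION FAILED section of the trace above can be decoded by hand. The following is a minimal Python sketch and was not part of the original troubleshooting. It assumes the hexadecimal version strings use the usual wrap.base(.sequence) SCN layout, and that the rdba packs a 10-bit relative file number above a 22-bit block number. The function names parse_scn and decode_rdba are made up for illustration.

```python
def parse_scn(version):
    """Convert a trace-style 'wrap.base[.seq]' hex string to a decimal SCN.

    Assumption: the first field is the SCN wrap, the second the SCN base,
    and any third field (a sequence byte) is ignored.
    """
    text = version.lower()
    if text.startswith("0x"):
        text = text[2:]
    parts = text.split(".")
    wrap = int(parts[0], 16)
    base = int(parts[1], 16)
    return (wrap << 32) | base


def decode_rdba(rdba):
    """Split an rdba into (relative file number, block number).

    Assumption: upper 10 bits are the file number, lower 22 bits the block.
    """
    return rdba >> 22, rdba & 0x3FFFFF


# Values copied from the trace above.
bwr_scn = parse_scn("0x0000.7f14c25e.01")   # version recorded in the redo (BWR)
disk_scn = parse_scn("0x0000.7f14c12d.01")  # version actually found on disk
file_no, block_no = decode_rdba(0x3c00059)

print("file", file_no, "block", block_no)
print("BWR SCN ", bwr_scn)
print("disk SCN", disk_scn)
print("disk block is older than BWR:", disk_scn < bwr_scn)
```

Under those assumptions the rdba decodes to file 15, block 89, which matches the trace line, and the block on disk (SCN 2132066605) is older than the version the redo says had already been written (SCN 2132066910). That would be consistent with a write lost when the power was cut, though the trace alone does not prove it.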

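The trace seems to say that the redo log for sequence 67180 cannot be found, yet the log files are clearly there in the log directory. So I went to Metalink and found the following: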
Changes
There was a disk problem that caused the database to crash.
Cause
Oracle is unable to perform instance recover but it works when is invoked manually.
Solution
Mount the database and issue a recover statement

SQL> startup mount;

SQL> recover database;

SQL> alter database open;

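So I followed the steps above exactly, and after the manual recovery the problem was resolved.
What I still don't understand is why Oracle could not recover on its own when a manual recovery works. The next step is to check for hardware errors.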

Source: "ITPUB Blog", link: http://blog.itpub.net/25016/viewspace-975790/. If you repost this, please credit the source; otherwise legal action will be pursued.
