Resolving ORA-00600 [kcrfr_update_nab_2]
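Environment: database on testdb1, AIX, Oracle 10.2.0.1, ASM. A session hung while dropping a partition, for example:
ALTER TABLE DXUSER.HISTWEBCDMA1X DROP PARTITION P20130728;
The session was waiting on the "DFS lock handle" event. This is a wait on a CI (cross-instance) call, which is managed by the DLM. Because the database is single-instance, the only other instance that can be involved is the ASM instance.
The ASM alert log referenced a trace file, +asm_ora_762040.trc, which contained: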
*** 2014-03-03 12:15:18.846
*** SERVICE NAME:() 2014-03-03 12:15:18.825
*** SESSION ID:(36.7347) 2014-03-03 12:15:18.825
Waited for detached process: RBAL for 300 seconds:
At the same time, errpt reported errors:
testdb1#errpt |tail
825849BF 0303104614 T H fcs0 ADAPTER ERROR
C62E1EB7 0303104614 P H hdisk63 DISK OPERATION ERROR
C62E1EB7 0303104614 P H hdisk12 DISK OPERATION ERROR
C62E1EB7 0303104614 P H hdisk124 DISK OPERATION ERROR
C62E1EB7 0303104614 P H hdisk74 DISK OPERATION ERROR
B8FBD189 0303104614 T S fscsi0 SOFTWARE PROGRAM ERROR
B8FBD189 0303104614 T S fscsi0 SOFTWARE PROGRAM ERROR
825849BF 0303104614 T H fcs0 ADAPTER ERROR
825849BF 0303104614 T H fcs0 ADAPTER ERROR
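When errpt lists many entries, counting them per resource makes it easier to see whether the disk errors sit behind a single adapter. The following is a minimal sketch and was not part of the original troubleshooting: summarize_errpt is a hypothetical helper name, and it assumes the usual errpt summary layout (identifier, timestamp, type, class, resource name, description) arriving on standard input.

```shell
# Count errpt summary lines per "resource + description".
# Assumption: input columns are
#   IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION...
# summarize_errpt is an illustrative name, not an AIX command.
summarize_errpt() {
  awk 'NF >= 6 && $1 != "IDENTIFIER" {
         # Rebuild the multi-word description from field 6 onwards.
         desc = $6
         for (i = 7; i <= NF; i++) desc = desc " " $i
         count[$5 " " desc]++
       }
       END { for (k in count) print count[k], k }' |
    sort -k1,1nr -k2   # most frequent first, then by resource name
}

# Usage on the AIX host:
#   errpt | summarize_errpt
```

Fed the ten lines shown above, the first line of output is "3 fcs0 ADAPTER ERROR", with each hdisk appearing once, which is consistent with a fault on the fcs0 path rather than four independent disk failures.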
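The system errors point to a fault in a disk or the storage controller, so I reported it. After the fault was confirmed, the storage component was replaced and the database server was hard-rebooted. When I then checked the server, the database would not open: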
SQL> startup open
ORACLE instance started.
Total System Global Area 1.6744E+10 bytes
Fixed Size 2050200 bytes
Variable Size 1694500712 bytes
Database Buffers 1.5032E+10 bytes
Redo Buffers 14725120 bytes
Database mounted.
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2],[0x7000003EF9D93F0], [2], [], [], [], [], []
The alert log shows:
Beginning crash recovery of 1 threads
parallel recovery started with 15 processes
Tue Mar 4 07:47:39 2014
Started redo scan
Tue Mar 4 07:47:40 2014
Errors in file /u01/app/oracle/admin/testdb/udump/testdb_ora_135988.trc:
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2], [0x7000003EF9D993F0], [2], [], [], [], [], []
Tue Mar 4 07:47:42 2014
Aborting crash recovery due to error 600
Next, the trace file it refers to:
testdb1$more /u01/app/oracle/admin/testdb/udump/testdb_ora_135988.trc
/u01/app/oracle/admin/testdb/udump/testdb_ora_135988.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP and Data Mining Scoring Engine options
ORACLE_HOME = /u01/app/oracle/product/10.2.0/db_1
System name: AIX
Node name: testdb1
Release: 3
Version: 5
Machine: 00C051B24C00
Instance name: testdb
Redo thread mounted by this instance: 1
Oracle process number: 16
Unix process pid: 135988, image: oracle@testdb1 (TNS V1-V3)
*** 2014-03-04 07:47:34.099
*** SERVICE NAME:() 2014-03-04 07:47:34.088
*** SESSION ID:(1643.3) 2014-03-04 07:47:34.088
Successfully allocated 15 recovery slaves
Using 20 overflow buffers per recovery slave
Thread 1 checkpoint: logseq 21269, block 2, scn 109974607248
cache-low rba: logseq 21269, block 569541
on-disk rba: logseq 21269, block 584155, scn 109974738191
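The log sequence that recovery was reading can be pulled out of the trace mechanically instead of by eye. This is a small sketch under the assumption that the trace contains an "on-disk rba: logseq N, ..." line in the form shown above; on_disk_logseq is a hypothetical helper name.

```shell
# Print the log sequence number found on the "on-disk rba" line of a
# crash-recovery trace read from standard input.
# Assumption: the line looks like
#   on-disk rba: logseq 21269, block 584155, scn 109974738191
on_disk_logseq() {
  sed -n 's/^[[:space:]]*on-disk rba: logseq \([0-9][0-9]*\),.*/\1/p'
}

# Usage:
#   on_disk_logseq < /u01/app/oracle/admin/testdb/udump/testdb_ora_135988.trc
```

For the trace excerpt above this prints 21269, the sequence number to look up in v$log.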
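According to the logs above, the error is raised right after "Started redo scan", and the log sequence involved is 21269. Now check which log group holds logseq 21269: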
SQL> select * from v$log;
    GROUP#    THREAD#  SEQUENCE#      BYTES    MEMBERS ARC STATUS
---------- ---------- ---------- ---------- ---------- --- ----------------
         1          1      21268   52428800          2 NO  INACTIVE
         2          1      21266   52428800          2 NO  INACTIVE
         6          1      21265  524288000          2 NO  INACTIVE
         4          1      21269  524288000          2 NO  CURRENT
         5          1      21264  524288000          2 NO  INACTIVE
         3          1      21267   52428800          2 NO  INACTIVE
It is log group 4, the CURRENT group. Its members are:
SQL> select member from v$logfile where group# = 4;
+SYSDG/testdb/onlinelog/group_4.267.676633559
+DATADG1/testdb/onlinelog/group_4.363.676633561
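The two lookups (v$log for the group, then v$logfile for its members) can be combined into one query keyed on the sequence number. The sketch below only builds the SQL text; log_members_sql is a hypothetical helper name, and the commented usage line assumes sqlplus is on the PATH and that OS authentication as sysdba works on the host.

```shell
# Emit a query that lists the group, status and member files for a
# given redo log sequence number. Pure text generation: nothing here
# connects to the database.
# log_members_sql is an illustrative name, not an Oracle tool.
log_members_sql() {
  seq=$1
  # Accept digits only, so arbitrary text is never spliced into the SQL.
  case $seq in
    ''|*[!0-9]*) echo "sequence must be numeric" >&2; return 1 ;;
  esac
  cat <<EOF
select l.group#, l.status, f.member
  from v\$log l join v\$logfile f on f.group# = l.group#
 where l.sequence# = $seq
 order by f.member;
EOF
}

# Usage (assumption: sqlplus available, "/ as sysdba" login permitted):
#   log_members_sql 21269 | sqlplus -s "/ as sysdba"
```

Both views are readable while the database is only mounted, so the query can be run in the state the database was left in after the failed startup.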
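Searching online showed that ORA-00600 [kcrfr_update_nab_2] is a rare error, with little information on MOS or elsewhere. MOS mostly attributes it to a bug and offers no workaround or fix, so I turned to Google and found an article titled "kcrfr_update_nab_2" in which the author records how he resolved it (kcrfr_update_nab_2/). In outline: delete the second member file of the failing log group, run recover database, open the database, and then rebuild the failing log group.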
The actual steps:
SQL> startup open
ORACLE instance started.
Total System Global Area 1.6744E+10 bytes
Fixed Size 2050200 bytes
Variable Size 1694500712 bytes
Database Buffers 1.5032E+10 bytes
Redo Buffers 14725120 bytes
Database mounted.
ORA-00600: internal error code, arguments: [kcrfr_update_nab_2],
[0x7000003EF9D93F0], [2], [], [], [], [], []
Locate the redo files of the failing log group and delete its second member, here the copy in +DATADG1:
$asmcmd
ASMCMD> cd +datadg1/testdb/ONLINELOG/
ASMCMD> ls
group_1.360.676633379
group_2.361.676633469
group_3.362.676633477
group_4.363.676633561
group_5.364.676633571
group_6.365.676633579
ASMCMD> rm group_4.363.676633561
SQL> recover database;
Media recovery complete.
SQL> shutdown immediate
SQL> startup open;
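Once the database is open, rebuild the failing redo group, group 4. Oracle refuses to drop a group that is CURRENT or ACTIVE, so if v$log still shows group 4 in either state, switch logs and checkpoint first (alter system switch logfile; alter system checkpoint;) until it is INACTIVE: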
SQL> alter database drop logfile group 4;
SQL> alter database add logfile thread 1 group 4 ('+SYSDG','+DATADG1') size 512M;
Source: ITPUB blog, http://blog.itpub.net/16976507/viewspace-1266952/. Please credit the source when reposting; unauthorized reposts may be subject to legal action.