【故障診斷】cr塊slot notfound解決過程
在一次嚴重故障中控制檔案重建,強行拉起庫,前面的問題解決了(HA引發的血案,涉及他人誤操作,具體略)
資料庫和應用起來後卻遇到大量cr塊slot notfound的問題,時間正是業務高峰的白天,一陣頭皮發麻,具體報錯內容:
alert:
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
相關trace
tpioadb1$ls -trl *.trc
-rw-r----- 1 oracle oinstall 181401269 Oct 14 12:27 oa_ora_17825870.trc
-rw-r----- 1 oracle oinstall 181462560 Oct 14 12:28 oa_ora_49872920.trc
-rw-r----- 1 oracle oinstall 181776046 Oct 14 12:29 oa_ora_52166770.trc
-rw-r----- 1 oracle oinstall 182010875 Oct 14 12:29 oa_ora_47317200.trc
-rw-r----- 1 oracle oinstall 182080391 Oct 14 12:30 oa_ora_198714.trc
-rw-r----- 1 oracle oinstall 182205907 Oct 14 12:30 oa_ora_47448310.trc
-rw-r----- 1 oracle oinstall 182159439 Oct 14 12:31 oa_ora_48234660.trc
-rw-r----- 1 oracle oinstall 182186772 Oct 14 12:31 oa_ora_41617502.trc
-rw-r----- 1 oracle oinstall 182186742 Oct 14 12:31 oa_ora_48824328.trc
-rw-r----- 1 oracle oinstall 182186771 Oct 14 12:31 oa_ora_52166772.trc
-rw-r----- 1 oracle oinstall 182258339 Oct 14 12:32 oa_ora_39913614.trc
-rw-r----- 1 oracle oinstall 182258369 Oct 14 12:32 oa_ora_30148840.trc
-rw-r----- 1 oracle oinstall 182375997 Oct 14 12:32 oa_ora_17825876.trc
-rw-r----- 1 oracle oinstall 182408523 Oct 14 12:32 oa_ora_66256924.trc
-rw-r----- 1 oracle oinstall 182408580 Oct 14 12:32 oa_ora_20121758.trc
-rw-r----- 1 oracle oinstall 182490385 Oct 14 12:32 oa_ora_6424806.trc
-rw-r----- 1 oracle oinstall 182490387 Oct 14 12:33 oa_ora_15272010.trc
-rw-r----- 1 oracle oinstall 182490356 Oct 14 12:33 oa_ora_4851758.trc
-rw-r----- 1 oracle oinstall 182516313 Oct 14 12:33 oa_ora_16713804.trc
-rw-r----- 1 oracle oinstall 182594432 Oct 14 12:33 oa_ora_28510210.trc
-rw-r----- 1 oracle oinstall 182616536 Oct 14 12:33 oa_ora_30015720.trc
-rw-r----- 1 oracle oinstall 182700158 Oct 14 12:33 oa_ora_25102374.trc
-rw-r----- 1 oracle oinstall 182734746 Oct 14 12:34 oa_ora_28641506.trc
-rw-r----- 1 oracle oinstall 182734715 Oct 14 12:34 oa_ora_3016860.trc
-rw-r----- 1 oracle oinstall 182841440 Oct 14 12:34 oa_ora_52429006.trc
-rw-r----- 1 oracle oinstall 182865907 Oct 14 12:34 oa_ora_47185978.trc
-rw-r----- 1 oracle oinstall 182865909 Oct 14 12:34 oa_ora_57212934.trc
-rw-r----- 1 oracle oinstall 182865907 Oct 14 12:34 oa_ora_23267338.trc
-rw-r----- 1 oracle oinstall 182865938 Oct 14 12:34 oa_ora_14354628.trc
-rw-r----- 1 oracle oinstall 70326197 Oct 14 12:34 oa_ora_7276572.trc
-rw-r----- 1 oracle oinstall 152186229 Oct 14 12:34 oa_ora_52953126.trc
-rw-r----- 1 oracle oinstall 96766948 Oct 14 12:34 oa_ora_47448312.trc
-rw-r----- 1 oracle oinstall 25950203 Oct 14 12:34 oa_ora_44040304.trc
-rw-r----- 1 oracle oinstall 124525 Oct 14 12:34 oa_ora_30015724.trc
-rw-r----- 1 oracle oinstall 11891981 Oct 14 12:34 oa_ora_23332896.trc
-rw-r----- 1 oracle oinstall 67996842 Oct 14 12:34 oa_ora_20580500.trc
-rw-r----- 1 oracle oinstall 50813235 Oct 14 12:34 oa_ora_17301742.trc
-rw-r----- 1 oracle oinstall 171017368 Oct 14 12:34 oa_ora_14811386.trc
tpioadb1$more oa_ora_52429006.trc
Trace file /home/oracle/app/diag/rdbms/oa/oa/trace/oa_ora_52429006.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /home/oracle/app/product/11.2.0/db_1
System name: AIX
Node name: tpioadb1
Release: 1
Version: 6
Machine: 00CB77B54C00
Instance name: oa
Redo thread mounted by this instance: 1
Oracle process number: 169
Unix process pid: 52429006, image: oracle@tpioadb1
*** 2014-10-14 12:33:21.941
*** SESSION ID:(1289.36435) 2014-10-14 12:33:21.941
*** CLIENT ID:() 2014-10-14 12:33:21.941
*** SERVICE NAME:(SYS$USERS) 2014-10-14 12:33:21.941
*** MODULE NAME:(JDBC Thin Client) 2014-10-14 12:33:21.941
*** ACTION NAME:() 2014-10-14 12:33:21.941
* kdsgrp1-1: *************************************************
row 0x024e73b1.97 continuation at
0x024e73b1.97 file# 9 block# 947121 slot 151 not found
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 151 ..... nrows: 216
kdsgrp - dump CR block dba=0x024e73b1
Block header dump: 0x024e73b1
Object id on Block? Y
seg/obj: 0x1ce1e csc: 0x638.28d23ba0 itc: 2 flg: E typ: 1 - DATA
brn: 0 bdba: 0x24e7281 ver: 0x01 opc: 0
inc: 0 exflg: 0
Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x003b.012.0000246a 0x00c048a5.011d.32 --U- 1 fsc 0x0016.28d23ba1
0x02 0x0002.006.00021b71 0x00c03c7f.1478.18 C--- 0 scn 0x0638.28d21c9c
bdba: 0x024e73b1
data_block_dump,data header at 0x70001028890e064
===============
通過語句找到該物件和塊:
Select owner, segment_name, segment_type, partition_name,tablespace_name
From dba_extents
Where relative_fno = 9
And 947121 between block_id and (block_id+blocks-1);
dump該塊,發現該塊裡面只有1個table的216行,由於系統已經帶病執行,且無可用備份,無停機時間,意味著最多可能丟失這個表的216行資料。
dump內容如下
tpioadb1$more oa_ora_38142130.trc
Trace file /home/oracle/app/diag/rdbms/oa/oa/trace/oa_ora_38142130.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /home/oracle/app/product/11.2.0/db_1
System name: AIX
Node name: tpioadb1
Release: 1
Version: 6
Machine: 00CB77B54C00
Instance name: oa
Redo thread mounted by this instance: 1
Oracle process number: 177
Unix process pid: 38142130, image: oracle@tpioadb1
*** 2014-10-14 14:39:49.105
*** SESSION ID:(167.64629) 2014-10-14 14:39:49.105
*** CLIENT ID:() 2014-10-14 14:39:49.105
*** SERVICE NAME:(SYS$USERS) 2014-10-14 14:39:49.105
*** MODULE NAME:(JDBC Thin Client) 2014-10-14 14:39:49.105
*** ACTION NAME:() 2014-10-14 14:39:49.105
* kdsgrp1-1: *************************************************
row 0x024e73b1.97 continuation at
0x024e73b1.97 file# 9 block# 947121 slot 151 not found
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 151 ..... nrows: 216
kdsgrp - dump CR block dba=0x024e73b1
Block header dump: 0x024e73b1
Object id on Block? Y
seg/obj: 0x1ce1e csc: 0x638.29bb4112 itc: 2 flg: E typ: 1 - DATA
brn: 0 bdba: 0x24e7281 ver: 0x01 opc: 0
inc: 0 exflg: 0
Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x002d.01a.0000a6d7 0x04c08774.04a9.29 --U- 1 fsc 0x0017.29bb4115
0x02 0x002f.007.0000945e 0x00c02f56.0412.0d C--- 0 scn 0x0638.29bb08d6
bdba: 0x024e73b1
data_block_dump,data header at 0x7000105876a6064
===============
tsiz: 0x1f98
hsiz: 0x1c2
pbl: 0x7000105876a6064
76543210
flag=--------
ntab=1
nrow=216
frre=29
fsbo=0x1c2
fseo=0x7d4
avsp=0xbe9
tosp=0xc02
0xe:pti[0] nrow=216 offs=0
後通過業務分析,找到白天有dml的行,通過create as select查詢者部分行,發現會話hang,報錯依舊,採用逐行insert的方式,最終定位出有問題的只有一行,通過dump的行資料,將關鍵欄位轉化出來給業務確認,確認為該行為無效行,剔除該行後,安排在下班後維護視窗內重建該表,問題解決,業務未受到影響。
資料庫和應用起來後卻遇到大量cr塊slot notfound的問題,時間正是業務高峰的白天,一陣頭皮發麻,具體報錯內容:
alert:
ksedmp: internal or fatal error
ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [], []
相關trace
tpioadb1$ls -trl *.trc
-rw-r----- 1 oracle oinstall 181401269 Oct 14 12:27 oa_ora_17825870.trc
-rw-r----- 1 oracle oinstall 181462560 Oct 14 12:28 oa_ora_49872920.trc
-rw-r----- 1 oracle oinstall 181776046 Oct 14 12:29 oa_ora_52166770.trc
-rw-r----- 1 oracle oinstall 182010875 Oct 14 12:29 oa_ora_47317200.trc
-rw-r----- 1 oracle oinstall 182080391 Oct 14 12:30 oa_ora_198714.trc
-rw-r----- 1 oracle oinstall 182205907 Oct 14 12:30 oa_ora_47448310.trc
-rw-r----- 1 oracle oinstall 182159439 Oct 14 12:31 oa_ora_48234660.trc
-rw-r----- 1 oracle oinstall 182186772 Oct 14 12:31 oa_ora_41617502.trc
-rw-r----- 1 oracle oinstall 182186742 Oct 14 12:31 oa_ora_48824328.trc
-rw-r----- 1 oracle oinstall 182186771 Oct 14 12:31 oa_ora_52166772.trc
-rw-r----- 1 oracle oinstall 182258339 Oct 14 12:32 oa_ora_39913614.trc
-rw-r----- 1 oracle oinstall 182258369 Oct 14 12:32 oa_ora_30148840.trc
-rw-r----- 1 oracle oinstall 182375997 Oct 14 12:32 oa_ora_17825876.trc
-rw-r----- 1 oracle oinstall 182408523 Oct 14 12:32 oa_ora_66256924.trc
-rw-r----- 1 oracle oinstall 182408580 Oct 14 12:32 oa_ora_20121758.trc
-rw-r----- 1 oracle oinstall 182490385 Oct 14 12:32 oa_ora_6424806.trc
-rw-r----- 1 oracle oinstall 182490387 Oct 14 12:33 oa_ora_15272010.trc
-rw-r----- 1 oracle oinstall 182490356 Oct 14 12:33 oa_ora_4851758.trc
-rw-r----- 1 oracle oinstall 182516313 Oct 14 12:33 oa_ora_16713804.trc
-rw-r----- 1 oracle oinstall 182594432 Oct 14 12:33 oa_ora_28510210.trc
-rw-r----- 1 oracle oinstall 182616536 Oct 14 12:33 oa_ora_30015720.trc
-rw-r----- 1 oracle oinstall 182700158 Oct 14 12:33 oa_ora_25102374.trc
-rw-r----- 1 oracle oinstall 182734746 Oct 14 12:34 oa_ora_28641506.trc
-rw-r----- 1 oracle oinstall 182734715 Oct 14 12:34 oa_ora_3016860.trc
-rw-r----- 1 oracle oinstall 182841440 Oct 14 12:34 oa_ora_52429006.trc
-rw-r----- 1 oracle oinstall 182865907 Oct 14 12:34 oa_ora_47185978.trc
-rw-r----- 1 oracle oinstall 182865909 Oct 14 12:34 oa_ora_57212934.trc
-rw-r----- 1 oracle oinstall 182865907 Oct 14 12:34 oa_ora_23267338.trc
-rw-r----- 1 oracle oinstall 182865938 Oct 14 12:34 oa_ora_14354628.trc
-rw-r----- 1 oracle oinstall 70326197 Oct 14 12:34 oa_ora_7276572.trc
-rw-r----- 1 oracle oinstall 152186229 Oct 14 12:34 oa_ora_52953126.trc
-rw-r----- 1 oracle oinstall 96766948 Oct 14 12:34 oa_ora_47448312.trc
-rw-r----- 1 oracle oinstall 25950203 Oct 14 12:34 oa_ora_44040304.trc
-rw-r----- 1 oracle oinstall 124525 Oct 14 12:34 oa_ora_30015724.trc
-rw-r----- 1 oracle oinstall 11891981 Oct 14 12:34 oa_ora_23332896.trc
-rw-r----- 1 oracle oinstall 67996842 Oct 14 12:34 oa_ora_20580500.trc
-rw-r----- 1 oracle oinstall 50813235 Oct 14 12:34 oa_ora_17301742.trc
-rw-r----- 1 oracle oinstall 171017368 Oct 14 12:34 oa_ora_14811386.trc
tpioadb1$more oa_ora_52429006.trc
Trace file /home/oracle/app/diag/rdbms/oa/oa/trace/oa_ora_52429006.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /home/oracle/app/product/11.2.0/db_1
System name: AIX
Node name: tpioadb1
Release: 1
Version: 6
Machine: 00CB77B54C00
Instance name: oa
Redo thread mounted by this instance: 1
Oracle process number: 169
Unix process pid: 52429006, image: oracle@tpioadb1
*** 2014-10-14 12:33:21.941
*** SESSION ID:(1289.36435) 2014-10-14 12:33:21.941
*** CLIENT ID:() 2014-10-14 12:33:21.941
*** SERVICE NAME:(SYS$USERS) 2014-10-14 12:33:21.941
*** MODULE NAME:(JDBC Thin Client) 2014-10-14 12:33:21.941
*** ACTION NAME:() 2014-10-14 12:33:21.941
* kdsgrp1-1: *************************************************
row 0x024e73b1.97 continuation at
0x024e73b1.97 file# 9 block# 947121 slot 151 not found
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 151 ..... nrows: 216
kdsgrp - dump CR block dba=0x024e73b1
Block header dump: 0x024e73b1
Object id on Block? Y
seg/obj: 0x1ce1e csc: 0x638.28d23ba0 itc: 2 flg: E typ: 1 - DATA
brn: 0 bdba: 0x24e7281 ver: 0x01 opc: 0
inc: 0 exflg: 0
Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x003b.012.0000246a 0x00c048a5.011d.32 --U- 1 fsc 0x0016.28d23ba1
0x02 0x0002.006.00021b71 0x00c03c7f.1478.18 C--- 0 scn 0x0638.28d21c9c
bdba: 0x024e73b1
data_block_dump,data header at 0x70001028890e064
===============
Select owner, segment_name, segment_type, partition_name,tablespace_name
From dba_extents
Where relative_fno = 9
And 947121 between block_id and (block_id+blocks-1);
dump該塊,發現該塊裡面只有1個table的216行,由於系統已經帶病執行,且無可用備份,無停機時間,意味著最多可能丟失這個表的216行資料。
dump內容如下
tpioadb1$more oa_ora_38142130.trc
Trace file /home/oracle/app/diag/rdbms/oa/oa/trace/oa_ora_38142130.trc
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining,
Oracle Database Vault and Real Application Testing options
ORACLE_HOME = /home/oracle/app/product/11.2.0/db_1
System name: AIX
Node name: tpioadb1
Release: 1
Version: 6
Machine: 00CB77B54C00
Instance name: oa
Redo thread mounted by this instance: 1
Oracle process number: 177
Unix process pid: 38142130, image: oracle@tpioadb1
*** 2014-10-14 14:39:49.105
*** SESSION ID:(167.64629) 2014-10-14 14:39:49.105
*** CLIENT ID:() 2014-10-14 14:39:49.105
*** SERVICE NAME:(SYS$USERS) 2014-10-14 14:39:49.105
*** MODULE NAME:(JDBC Thin Client) 2014-10-14 14:39:49.105
*** ACTION NAME:() 2014-10-14 14:39:49.105
* kdsgrp1-1: *************************************************
row 0x024e73b1.97 continuation at
0x024e73b1.97 file# 9 block# 947121 slot 151 not found
KDSTABN_GET: 0 ..... ntab: 1
curSlot: 151 ..... nrows: 216
kdsgrp - dump CR block dba=0x024e73b1
Block header dump: 0x024e73b1
Object id on Block? Y
seg/obj: 0x1ce1e csc: 0x638.29bb4112 itc: 2 flg: E typ: 1 - DATA
brn: 0 bdba: 0x24e7281 ver: 0x01 opc: 0
inc: 0 exflg: 0
Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x002d.01a.0000a6d7 0x04c08774.04a9.29 --U- 1 fsc 0x0017.29bb4115
0x02 0x002f.007.0000945e 0x00c02f56.0412.0d C--- 0 scn 0x0638.29bb08d6
bdba: 0x024e73b1
data_block_dump,data header at 0x7000105876a6064
===============
tsiz: 0x1f98
hsiz: 0x1c2
pbl: 0x7000105876a6064
76543210
flag=--------
ntab=1
nrow=216
frre=29
fsbo=0x1c2
fseo=0x7d4
avsp=0xbe9
tosp=0xc02
0xe:pti[0] nrow=216 offs=0
後通過業務分析,找到白天有dml的行,通過create as select查詢者部分行,發現會話hang,報錯依舊,採用逐行insert的方式,最終定位出有問題的只有一行,通過dump的行資料,將關鍵欄位轉化出來給業務確認,確認為該行為無效行,剔除該行後,安排在下班後維護視窗內重建該表,問題解決,業務未受到影響。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/16179598/viewspace-1329351/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- 一次DG故障診斷過程分析
- enq: HW - contention診斷及解決過程ENQ
- Oracle故障診斷Oracle
- 光纖故障診斷和故障排查
- ASM磁碟故障診斷(二)ASM
- ASM磁碟故障診斷(一)ASM
- 故障診斷學習工具
- RAC故障診斷指令碼指令碼
- 故障分析 | Kubernetes 故障診斷流程
- 9 Oracle Data Guard 故障診斷Oracle
- DB2故障診斷工具DB2
- mysql複製故障診斷與排除MySql
- 部落格連結—Oracle故障診斷Oracle
- 一次library cache pin故障的解決過程
- 如何診斷RAC系統中的'gc cr multi block request'?GCBloC
- websphere中介軟體故障診斷troubleshootingWeb
- 利用 Java dump 進行 JVM 故障診斷JavaJVM
- 伺服器硬碟掉了兩塊的解決過程伺服器硬碟
- 故障診斷為什麼要用深度學習?深度學習
- Cisco路由器故障診斷技術(轉)路由器
- sql導致資料庫整體效能下降的診斷和解決的全過程SQL資料庫
- 記一次使用gdb診斷gc問題全過程GC
- .記一次使用gdb診斷gc問題全過程GC
- 遭遇cursor:pin x等待事件定位阻塞會話診斷過程事件會話
- 軟體專案過程診斷與改進建議案例
- Solaris Linux SSH緩慢診斷與解決Linux
- 診斷並解決ORA-04031 錯誤
- 【故障處理】ORA-30012的解決過程
- 儲存過程遇到gc cr request等待儲存過程GC
- oracle之 redo過高診斷Oracle
- 大語言模型與資料庫故障診斷模型資料庫
- oracle 10046事件故障診斷一例Oracle事件
- 【記錄】Linux 系統故障診斷與排除Linux
- Oracle___診斷案例__資料庫的exp故障Oracle資料庫
- IBM Lotus Sametime Servers 故障診斷與排除IBMServer
- 風機故障診斷學習資源(更新中)
- ORA-8103 "object no longer exists" 診斷與解決Object
- 經緯恆潤遠端診斷車雲解決方案——下一代診斷技術