10G BUG: ORA-00600: internal error code, arguments: [kmgs_pre_process_request_6]

Posted by tolywang on 2007-07-24

The alert log contained the following errors:

Mon Jul 16 17:59:46 2007

Errors in file /adosprod/dump/bdump/adosprod_mman_381670.trc:

ORA-00600: internal error code, arguments: [kmgs_pre_process_request_6], [6], [453], [64], [3], [0x700000208C55290], [], []

Mon Jul 16 17:59:47 2007


Errors in file /adosprod/dump/bdump/adosprod_mman_381670.trc:

ORA-00600: internal error code, arguments: [kmgs_pre_process_request_6], [6], [453], [64], [3], [0x700000208C55290], [], []

Mon Jul 16 17:59:47 2007

MMAN: terminating instance due to error 822

Mon Jul 16 17:59:47 2007

Errors in file /adosprod/dump/bdump/adosprod_lns1_746478.trc:

ORA-00822: MMAN process terminated with error

Instance terminated by MMAN, pid = 381670

Clearly, MMAN terminated the instance after the database hit the ORA-00600 error. A search on MetaLink showed this to be Bug 4433838:

The error occurs when the parameter SGA_TARGET is set to an exact multiple of 4Gb.

The workaround is:

Ensure the value set for the parameter SGA_TARGET is not an exact multiple of 4Gb.

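The customer's SGA_TARGET was 8 GB, an exact multiple of 4 GB, so this failure was very likely caused by that bug. We advised the customer to change the parameter and to keep monitoring.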
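
The condition behind this workaround is simple arithmetic. As a minimal sketch (the helper names `parse_size` and `is_exact_multiple_of_4g` are made up for illustration, and only the K, M and G suffixes are handled), a candidate SGA_TARGET value could be checked like this:

```python
GIB = 1024 ** 3
UNITS = {"K": 1024, "M": 1024 ** 2, "G": GIB}


def parse_size(text):
    """Convert a size string such as '8G', '8176M' or '8589934592' to bytes."""
    text = text.strip().upper()
    if text and text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def is_exact_multiple_of_4g(sga_target):
    """Return True if the value is a positive exact multiple of 4 GB."""
    size = parse_size(sga_target)
    return size > 0 and size % (4 * GIB) == 0


for value in ("8G", "8176M", "4096M"):
    print(value, is_exact_multiple_of_4g(value))
# 8G True
# 8176M False
# 4096M True
```

The 8176M value is only an example of a setting that avoids the 4 GB boundary; the value that matters is the one the instance actually reports after startup.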

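We have already run into quite a few 10g bugs. At the end of May, a customer in Wenzhou could not mount a disk group in their ASM instances, and MetaLink confirmed that this was also a bug. Here is what happened: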

After upgrading the database (RAC) from 10.2.0.1 to 10.2.0.3, the customer got the same error when trying to start either of the two ASM instances:
SQL> startup
ASM instance started

Total System Global Area 130023424 bytes
Fixed Size 2043664 bytes
Variable Size 102813936 bytes
ASM Cache 25165824 bytes
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup
"ORADATA"


SQL> select * from v$asm_disk;

no rows selected

SQL> select * from v$asm_diskgroup;

no rows selected
1. The instance parameters were as follows:
*.background_dump_dest='/oracle/product/10.2.0/admin/+ASM/bdump'
*.cluster_database=true
*.core_dump_dest='/oracle/product/10.2.0/admin/+ASM/cdump'
*.instance_type='asm'
*.large_pool_size=12M
*.remote_login_passwordfile='SHARED'
*.user_dump_dest='/oracle/product/10.2.0/admin/+ASM/udump'
+ASM1.instance_number=1
+ASM2.instance_number=2
*.asm_diskgroups='ORADATA'
*.asm_diskstring='/dev/vgora/*'
Nothing in the parameters looked wrong.
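
One quick sanity check on these parameters is whether the discovery string covers the device that holds the disk group. The sketch below uses Python's shell-style `fnmatchcase` as an approximation only; how ASM itself expands `asm_diskstring` is platform-specific, and the candidate list is just a handful of device paths that appear in this post.

```python
from fnmatch import fnmatchcase

ASM_DISKSTRING = "/dev/vgora/*"  # value of asm_diskstring from the parameters above

# Illustrative candidates: device paths mentioned elsewhere in this post.
candidates = [
    "/dev/vgora/rlvdata",
    "/dev/vgora/lvdata",
    "/dev/rdsk/c4t0d2",
]


def matches_diskstring(path, pattern=ASM_DISKSTRING):
    """Return True if the path matches the shell-style discovery pattern."""
    return fnmatchcase(path, pattern)


for path in candidates:
    print(path, matches_diskstring(path))
# /dev/vgora/rlvdata True
# /dev/vgora/lvdata True
# /dev/rdsk/c4t0d2 False
```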
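2. Judging from the errors at ASM startup, the cause was clearly that the disks belonging to the disk group could not be found.
3. Checking volume group vgora further showed that, from the operating system's point of view, the status of the volume group and of the PV it contains was normal: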
WZORA2:/#vgdisplay -v vgora
--- Volume groups ---
VG Name /dev/vgora
VG Write Access read/write
VG Status available, shared, server
Max LV 255
Cur LV 4
Open LV 4
Max PV 16
Cur PV 1
Act PV 1
Max PE per PV 65535
VGDA 2
PE Size (Mbytes) 16
Total PE 26242
Alloc PE 26220
Free PE 22
Total PVG 0
Total Spare PVs 0
Total Spare PVs in use 0

--- ---

WZORA2 Server
WZORA1 Client

--- Logical volumes ---
LV Name /dev/vgora/lvocr
LV Status available/syncd
LV Size (Mbytes) 304
Current LE 19
Allocated PE 19
Used PV 1

LV Name /dev/vgora/lvvote
LV Status available/syncd
LV Size (Mbytes) 48
Current LE 3
Allocated PE 3
Used PV 1

LV Name /dev/vgora/lvdata
LV Status available/syncd
LV Size (Mbytes) 419008
Current LE 26188
Allocated PE 26188
Used PV 1

LV Name /dev/vgora/asm
LV Status available/syncd
LV Size (Mbytes) 160
Current LE 10
Allocated PE 10
Used PV 1


--- Physical volumes ---
PV Name /dev/dsk/c4t0d2
PV Name /dev/dsk/c8t0d2 Alternate Link
PV Status available
Total PE 26242
Free PE 22
Autoswitch On
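
The figures in this output are internally consistent, which supports the view that the volume group is healthy. As a rough sketch (the function name `check_volume_group` is made up for illustration, and the numbers are copied by hand from the output above), the arithmetic can be checked like this:

```python
PE_SIZE_MB = 16

# (LV name, allocated PE, reported size in MB) copied from the vgdisplay output.
# Allocated PE equals Current LE for every LV here.
logical_volumes = [
    ("lvocr", 19, 304),
    ("lvvote", 3, 48),
    ("lvdata", 26188, 419008),
    ("asm", 10, 160),
]
TOTAL_PE, ALLOC_PE, FREE_PE = 26242, 26220, 22


def check_volume_group(lvs, pe_size_mb, total_pe, alloc_pe, free_pe):
    """Return a list of inconsistencies; an empty list means the figures agree."""
    problems = []
    for name, pe, size_mb in lvs:
        if pe * pe_size_mb != size_mb:
            problems.append(f"{name}: {pe} PE x {pe_size_mb} MB != {size_mb} MB")
    if sum(pe for _, pe, _ in lvs) != alloc_pe:
        problems.append("sum of LV extents does not match Alloc PE")
    if alloc_pe + free_pe != total_pe:
        problems.append("Alloc PE + Free PE does not match Total PE")
    return problems


print(check_volume_group(logical_volumes, PE_SIZE_MB, TOTAL_PE, ALLOC_PE, FREE_PE))
# []
```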

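4. The question now was why disks that looked healthy at the OS level could not be discovered when the ASM instance started.
5. The following commands were run to rule out a permissions problem.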
$ cd $ORACLE_HOME/bin
$ ls -ltr oracle
-rwsr-s--x 1 oracle oinstall 284370040 May 26 10:25 oracle
$ cd /dev/dsk
$ ls -ltr
total 0
brw-r----- 1 bin sys 31 0x000000 Mar 21 02:55 c0t0d0
brw-r----- 1 bin sys 31 0x021000 Mar 21 02:55 c2t1d0
brw-r----- 1 bin sys 31 0x021002 Mar 21 02:55 c2t1d0s2
brw-r----- 1 bin sys 31 0x021003 Mar 21 02:55 c2t1d0s3
brw-r----- 1 bin sys 31 0x030000 Mar 21 02:55 c3t0d0
brw-r----- 1 bin sys 31 0x021001 Mar 21 03:09 c2t1d0s1
brw-r----- 1 bin sys 31 0x060000 Mar 29 17:06 c6t0d0
brw-r----- 1 bin sys 31 0x040200 Mar 29 17:06 c4t0d2
brw-r----- 1 bin sys 31 0x040100 Mar 29 17:06 c4t0d1
brw-r----- 1 bin sys 31 0x070000 Mar 29 17:08 c7t0d0
brw-r----- 1 bin sys 31 0x080100 Mar 29 17:08 c8t0d1
brw-r----- 1 bin sys 31 0x080200 Mar 29 17:09 c8t0d2
brw-r----- 1 bin sys 31 0x030001 Mar 29 17:18 c3t0d0s1
brw-r----- 1 bin sys 31 0x030002 Mar 29 17:18 c3t0d0s2
brw-r----- 1 bin sys 31 0x030003 Mar 29 17:18 c3t0d0s3
$ cd /dev/rdsk
$ ls -ltr
total 0
crw-r--r-- 1 bin sys 188 0x000000 Mar 21 02:55 c0t0d0
crw-r--r-- 1 bin sys 188 0x021003 Mar 21 02:55 c2t1d0s3
crw-r--r-- 1 bin sys 188 0x060000 Mar 29 17:06 c6t0d0
crw-r--r-- 1 bin sys 188 0x040100 Mar 29 17:06 c4t0d1
crw-r--r-- 1 bin sys 188 0x040200 Mar 29 17:06 c4t0d2
crw-r--r-- 1 bin sys 188 0x070000 Mar 29 17:08 c7t0d0
crw-r--r-- 1 bin sys 188 0x030000 Mar 29 17:17 c3t0d0
crw-r--r-- 1 bin sys 188 0x021001 Mar 29 17:20 c2t1d0s1
crw-r--r-- 1 bin sys 188 0x030001 Mar 29 17:20 c3t0d0s1
crw-r--r-- 1 bin sys 188 0x030003 Mar 29 17:21 c3t0d0s3
crw-r--r-- 1 bin sys 188 0x080100 Mar 29 17:41 c8t0d1
crw-r--r-- 1 bin sys 188 0x080200 Mar 29 17:41 c8t0d2
crw-r--r-- 1 bin sys 188 0x021002 Mar 29 17:48 c2t1d0s2
crw-r--r-- 1 bin sys 188 0x030002 Mar 29 17:48 c3t0d0s2
crw-r--r-- 1 bin sys 188 0x021000 Apr 24 10:54 c2t1d0
The two PV paths for volume group vgora are /dev/dsk/c4t0d2 and /dev/dsk/c8t0d2. The following commands change the owner of the PVs to oracle:oinstall:
WZORA2:/# cd /dev/dsk
WZORA2:/dev/dsk#chown oracle:oinstall /dev/dsk/c4t0d2
WZORA2:/dev/dsk#chown oracle:oinstall /dev/dsk/c8t0d2
WZORA2:/dev/dsk#cd /dev/rdsk
WZORA2:/dev/rdsk#chown oracle:oinstall /dev/rdsk/c4t0d2
WZORA2:/dev/rdsk#chown oracle:oinstall /dev/rdsk/c8t0d2
Restarting the ASM instance still failed:
SQL> startup
ASM instance started

Total System Global Area 130023424 bytes
Fixed Size 2043664 bytes
Variable Size 102813936 bytes
ASM Cache 25165824 bytes
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup
"ORADATA"
So permissions were not the cause either.
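
The ownership check in this step can also be expressed as a small script. This is a simplified sketch, not a full permission model: `parse_ls_line` and `can_read_write` are hypothetical helpers, only the user's own name and one group are considered, and the last sample line is a made-up illustration of what a listing would look like after the chown.

```python
def parse_ls_line(line):
    """Pick the mode, owner, group and name out of one `ls -l` line."""
    fields = line.split()
    return {"mode": fields[0], "owner": fields[2],
            "group": fields[3], "name": fields[-1]}


def can_read_write(entry, user, group):
    """Check the rw bits that apply to the given user and group."""
    mode = entry["mode"]  # e.g. 'brw-r-----': type, then owner/group/other triplets
    if entry["owner"] == user:
        bits = mode[1:4]
    elif entry["group"] == group:
        bits = mode[4:7]
    else:
        bits = mode[7:10]
    return bits[0] == "r" and bits[1] == "w"


lines = [
    "brw-r----- 1 bin sys 31 0x040200 Mar 29 17:06 c4t0d2",
    "crw-r--r-- 1 bin sys 188 0x040200 Mar 29 17:06 c4t0d2",
    # Hypothetical listing after chown oracle:oinstall:
    "brw-r----- 1 oracle oinstall 31 0x040200 Mar 29 17:06 c4t0d2",
]
for line in lines:
    entry = parse_ls_line(line)
    print(entry["mode"], entry["owner"], can_read_write(entry, "oracle", "oinstall"))
# brw-r----- bin False
# crw-r--r-- bin False
# brw-r----- oracle True
```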

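6. The kfed tool, which first had to be built, was used to run a read test against the disk backing the disk group; the read was normal.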
$ cd $ORACLE_HOME/rdbms/lib
$ cp ins_rdbms.mk ins_rdbms.mk_prekfed

The existing target in ins_rdbms.mk:
ikfod: $(KFOD)
	-mv -f $(ORACLE_HOME)/bin/kfod $(ORACLE_HOME)/bin/kfod0
	-mv $(ORACLE_HOME)/rdbms/lib/kfod $(ORACLE_HOME)/bin/kfod
	-chmod 751 $(ORACLE_HOME)/bin/kfod
was changed to the following, which adds an ikfed target:
ikfod: $(KFOD)
	-mv -f $(ORACLE_HOME)/bin/kfod $(ORACLE_HOME)/bin/kfod0
	-mv $(ORACLE_HOME)/rdbms/lib/kfod $(ORACLE_HOME)/bin/kfod
	-chmod 751 $(ORACLE_HOME)/bin/kfod

ikfed: $(KFED)
	-mv -f $(ORACLE_HOME)/bin/kfed $(ORACLE_HOME)/bin/kfed0
	-mv $(ORACLE_HOME)/rdbms/lib/kfed $(ORACLE_HOME)/bin/kfed
	-chmod 751 $(ORACLE_HOME)/bin/kfed

$ make -f ins_rdbms.mk ikfed
$ kfed read /dev/vgora/rlvdata
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: T=0 NUMB=0x0
kfbh.block.obj: 2147483648 ; 0x008: TYPE=0x8 NUMB=0x0
kfbh.check: 1585696588 ; 0x00c: 0x5e83cf4c
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
kfbh.spare2: 0 ; 0x01c: 0x00000000
kfdhdb.driver.provstr: ORCLDISK ; 0x000: length=8
kfdhdb.driver.reserved[0]: 0 ; 0x008: 0x00000000
kfdhdb.driver.reserved[1]: 0 ; 0x00c: 0x00000000
kfdhdb.driver.reserved[2]: 0 ; 0x010: 0x00000000
kfdhdb.driver.reserved[3]: 0 ; 0x014: 0x00000000
kfdhdb.driver.reserved[4]: 0 ; 0x018: 0x00000000
kfdhdb.driver.reserved[5]: 0 ; 0x01c: 0x00000000
kfdhdb.compat: 168820736 ; 0x020: 0x0a100000
kfdhdb.dsknum: 0 ; 0x024: 0x0000
kfdhdb.grptyp: 1 ; 0x026: KFDGTP_EXTERNAL
kfdhdb.hdrsts: 3 ; 0x027: KFDHDR_MEMBER
kfdhdb.dskname: ORADATA_0000 ; 0x028: length=12
kfdhdb.grpname: ORADATA ; 0x048: length=7
kfdhdb.fgname: ORADATA_0000 ; 0x068: length=12
kfdhdb.capname: ; 0x088: length=0
kfdhdb.crestmp.hi: 32887249 ; 0x0a8: HOUR=0x11 DAYS=0xe MNTH=0x4 YEAR=0x7d7
kfdhdb.crestmp.lo: 3430825984 ; 0x0ac: USEC=0x0 MSEC=0x390 SECS=0x7 MINS=0x33
kfdhdb.mntstmp.hi: 32888649 ; 0x0b0: HOUR=0x9 DAYS=0x1a MNTH=0x5 YEAR=0x7d7
kfdhdb.mntstmp.lo: 3740819456 ; 0x0b4: USEC=0x0 MSEC=0x218 SECS=0x2f MINS=0x37
kfdhdb.secsize: 1024 ; 0x0b8: 0x0400
kfdhdb.blksize: 4096 ; 0x0ba: 0x1000
kfdhdb.ausize: 1048576 ; 0x0bc: 0x00100000
kfdhdb.mfact: 113792 ; 0x0c0: 0x0001bc80
kfdhdb.dsksize: 419008 ; 0x0c4: 0x000664c0
kfdhdb.pmcnt: 5 ; 0x0c8: 0x00000005
kfdhdb.fstlocn: 1 ; 0x0cc: 0x00000001
kfdhdb.altlocn: 2 ; 0x0d0: 0x00000002
kfdhdb.f1b1locn: 2 ; 0x0d4: 0x00000002
kfdhdb.redomirrors[0]: 0 ; 0x0d8: 0x0000
kfdhdb.redomirrors[1]: 0 ; 0x0da: 0x0000
kfdhdb.redomirrors[2]: 0 ; 0x0dc: 0x0000
kfdhdb.redomirrors[3]: 0 ; 0x0de: 0x0000
kfdhdb.dbcompat: 168820736 ; 0x0e0: 0x0a100000
kfdhdb.grpstmp.hi: 32887249 ; 0x0e4: HOUR=0x11 DAYS=0xe MNTH=0x4 YEAR=0x7d7
kfdhdb.grpstmp.lo: 3430473728 ; 0x0e8: USEC=0x0 MSEC=0x238 SECS=0x7 MINS=0x33
kfdhdb.ub4spare[0]: 0 ; 0x0ec: 0x00000000
kfdhdb.ub4spare[1]: 0 ; 0x0f0: 0x00000000
kfdhdb.ub4spare[2]: 0 ; 0x0f4: 0x00000000
kfdhdb.ub4spare[3]: 0 ; 0x0f8: 0x00000000
kfdhdb.ub4spare[4]: 0 ; 0x0fc: 0x00000000
kfdhdb.ub4spare[5]: 0 ; 0x100: 0x00000000
kfdhdb.ub4spare[6]: 0 ; 0x104: 0x00000000
kfdhdb.ub4spare[7]: 0 ; 0x108: 0x00000000
kfdhdb.ub4spare[8]: 0 ; 0x10c: 0x00000000
kfdhdb.ub4spare[9]: 0 ; 0x110: 0x00000000
kfdhdb.ub4spare[10]: 0 ; 0x114: 0x00000000
kfdhdb.ub4spare[11]: 0 ; 0x118: 0x00000000
kfdhdb.ub4spare[12]: 0 ; 0x11c: 0x00000000
kfdhdb.ub4spare[13]: 0 ; 0x120: 0x00000000
kfdhdb.ub4spare[14]: 0 ; 0x124: 0x00000000
kfdhdb.ub4spare[15]: 0 ; 0x128: 0x00000000
kfdhdb.ub4spare[16]: 0 ; 0x12c: 0x00000000
kfdhdb.ub4spare[17]: 0 ; 0x130: 0x00000000
kfdhdb.ub4spare[18]: 0 ; 0x134: 0x00000000
kfdhdb.ub4spare[19]: 0 ; 0x138: 0x00000000
kfdhdb.ub4spare[20]: 0 ; 0x13c: 0x00000000
kfdhdb.ub4spare[21]: 0 ; 0x140: 0x00000000
kfdhdb.ub4spare[22]: 0 ; 0x144: 0x00000000
kfdhdb.ub4spare[23]: 0 ; 0x148: 0x00000000
kfdhdb.ub4spare[24]: 0 ; 0x14c: 0x00000000
kfdhdb.ub4spare[25]: 0 ; 0x150: 0x00000000
kfdhdb.ub4spare[26]: 0 ; 0x154: 0x00000000
kfdhdb.ub4spare[27]: 0 ; 0x158: 0x00000000
kfdhdb.ub4spare[28]: 0 ; 0x15c: 0x00000000
kfdhdb.ub4spare[29]: 0 ; 0x160: 0x00000000
kfdhdb.ub4spare[30]: 0 ; 0x164: 0x00000000
kfdhdb.ub4spare[31]: 0 ; 0x168: 0x00000000
kfdhdb.ub4spare[32]: 0 ; 0x16c: 0x00000000
kfdhdb.ub4spare[33]: 0 ; 0x170: 0x00000000
kfdhdb.ub4spare[34]: 0 ; 0x174: 0x00000000
kfdhdb.ub4spare[35]: 0 ; 0x178: 0x00000000
kfdhdb.ub4spare[36]: 0 ; 0x17c: 0x00000000
kfdhdb.ub4spare[37]: 0 ; 0x180: 0x00000000
kfdhdb.ub4spare[38]: 0 ; 0x184: 0x00000000
kfdhdb.ub4spare[39]: 0 ; 0x188: 0x00000000
kfdhdb.ub4spare[40]: 0 ; 0x18c: 0x00000000
kfdhdb.ub4spare[41]: 0 ; 0x190: 0x00000000
kfdhdb.ub4spare[42]: 0 ; 0x194: 0x00000000
kfdhdb.ub4spare[43]: 0 ; 0x198: 0x00000000
kfdhdb.ub4spare[44]: 0 ; 0x19c: 0x00000000
kfdhdb.ub4spare[45]: 0 ; 0x1a0: 0x00000000
kfdhdb.ub4spare[46]: 0 ; 0x1a4: 0x00000000
kfdhdb.ub4spare[47]: 0 ; 0x1a8: 0x00000000
kfdhdb.ub4spare[48]: 0 ; 0x1ac: 0x00000000
kfdhdb.ub4spare[49]: 0 ; 0x1b0: 0x00000000
kfdhdb.ub4spare[50]: 0 ; 0x1b4: 0x00000000
kfdhdb.ub4spare[51]: 0 ; 0x1b8: 0x00000000
kfdhdb.ub4spare[52]: 0 ; 0x1bc: 0x00000000
kfdhdb.ub4spare[53]: 0 ; 0x1c0: 0x00000000
kfdhdb.ub4spare[54]: 0 ; 0x1c4: 0x00000000
kfdhdb.ub4spare[55]: 0 ; 0x1c8: 0x00000000
kfdhdb.ub4spare[56]: 0 ; 0x1cc: 0x00000000
kfdhdb.ub4spare[57]: 0 ; 0x1d0: 0x00000000
kfdhdb.acdb.aba.seq: 0 ; 0x1d4: 0x00000000
kfdhdb.acdb.aba.blk: 0 ; 0x1d8: 0x00000000
kfdhdb.acdb.ents: 0 ; 0x1dc: 0x0000
kfdhdb.acdb.ub2spare: 0 ; 0x1de: 0x0000
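
The timestamp fields in this dump (crestmp, mntstmp, grpstmp) are packed into a hi/lo pair of integers, and kfed prints the unpacked fields in hex next to them. The sketch below decodes them into a readable date. The bit layout is an assumption inferred from the values in this dump, not taken from Oracle documentation, and `decode_kfed_timestamp` and `format_timestamp` are names made up for illustration.

```python
def decode_kfed_timestamp(hi, lo):
    """Unpack a kfed hi/lo timestamp pair into calendar fields.

    Assumed layout (inferred from the annotated values in the dump):
      hi: year from bit 14 up, month bits 10-13, day bits 5-9, hour bits 0-4
      lo: minute bits 26-31, second bits 20-25, msec bits 10-19, usec bits 0-9
    """
    return {
        "year": hi >> 14,
        "month": (hi >> 10) & 0xF,
        "day": (hi >> 5) & 0x1F,
        "hour": hi & 0x1F,
        "minute": (lo >> 26) & 0x3F,
        "second": (lo >> 20) & 0x3F,
        "msec": (lo >> 10) & 0x3FF,
        "usec": lo & 0x3FF,
    }


def format_timestamp(hi, lo):
    """Render the decoded fields as YYYY-MM-DD HH:MM:SS.mmm."""
    t = decode_kfed_timestamp(hi, lo)
    return ("{year:04d}-{month:02d}-{day:02d} "
            "{hour:02d}:{minute:02d}:{second:02d}.{msec:03d}").format(**t)


print(format_timestamp(32887249, 3430825984))  # kfdhdb.crestmp
print(format_timestamp(32888649, 3740819456))  # kfdhdb.mntstmp
# 2007-04-14 17:51:07.912
# 2007-05-26 09:55:47.536
```

Under this reading, the disk was created on 14 April 2007 and last mounted on 26 May 2007, which fits the end-of-May timeline of the incident.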
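7. Running /usr/sbin/lvmchk /dev/vgora/rlvdata returned no output at all. After the file /usr/sbin/lvmchk was renamed, the ASM instance could be started again. Based on this, the MetaLink engineer concluded that the cause was bug 6051728 and offered two solutions: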
(i) Remove the patch (which installed the lvmchk) (or)
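(ii) Rename /usr/sbin/lvmchk to some other name
8. We went with the second option, and RAC started without problems.
9. Finally, the remaining database upgrade tasks were completed, the application servers were started, and logging in to the application confirmed that everything was working normally.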

Source: ITPUB blog, link: http://blog.itpub.net/35489/viewspace-84816/. If you repost this article, please credit the source; otherwise legal action may be pursued.
