當心ORACLE 12.2 RAC新特性引入的BUG導致ORA-4031

zhangweizhao發表於2018-06-04

前幾日朋友有套資料庫升級到oracle 12C r2後出現了ORA-4031, 是一套on Linux x86 64-bit 2節點的RAC資料庫,Hard parse並不高,並且沒有使用AMM, ASMM。SGA分配80G, shared pool固定在13Gb.

— nodel 1 db alert log file

2018-05-29T13:44:51.107475+08:00
Thread 1 advanced to log sequence 6680 (LGWR switch)
  Current log# 14 seq# 6680 mem# 0: +SSDDG1/anbob/ONLINELOG/group_14.260.944852559
2018-05-29T13:57:28.332831+08:00
Errors in file /oracle/app/diag/rdbms/anbob/anbob1/trace/anbob1_m001_38524.trc  (incident=1804178):
ORA-04031: unable to allocate 640 bytes of shared memory ("shared pool"," SELECT source,        (case...","kkqctdrvTD: co","ckydef:kkqcscpcky")
2018-05-29T13:57:28.332861+08:00
Errors in file /oracle/app/diag/rdbms/anbob/anbob1/trace/anbob1_ora_38534.trc  (incident=1804506):
ORA-04031: unable to allocate 40 bytes of shared memory ("shared pool","select con#,obj#,rcon#,enabl...","SQLA","tmp")
Incident details in: /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1804506/anbob1_ora_38534_i1804506.trc
Incident details in: /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1804178/anbob1_m001_38524_i1804178.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
Errors in file /oracle/app/diag/rdbms/anbob/anbob1/trace/anbob1_m001_38524.trc  (incident=1804179):
ORA-04031: unable to allocate 4152 bytes of shared memory ("shared pool"," SELECT source,        (case...","9003.kgght","9003.kgght")
ORA-04031: unable to allocate 640 bytes of shared memory ("shared pool"," SELECT source,        (case...","kkqctdrvTD: co","ckydef:kkqcscpcky")
Incident details in: /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1804179/anbob1_m001_38524_i1804179.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2018-05-29T13:57:28.588932+08:00
ORA-04031 heap dump being written to trace file /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1804506/anbob1_ora_38534_i1804506.trc
2018-05-29T13:57:28.971700+08:00
Errors in file /oracle/app/diag/rdbms/anbob/anbob1/trace/anbob1_ora_36389.trc  (incident=1804658):
ORA-04031: unable to allocate 424 bytes of shared memory ("shared pool","EXEC_OPR_POS","KGLS^62f5cb39","KGLS MEM BLOCK")
Incident details in: /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1804658/anbob1_ora_36389_i1804658.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
2018-05-29T13:57:29.086756+08:00
Errors in file /oracle/app/diag/rdbms/anbob/anbob1/trace/anbob1_ora_75557.trc  (incident=1800986):
ORA-04031: unable to allocate 40 bytes of shared memory ("shared pool","select * from (select id,msi...","SQLA","tmp")
Incident details in: /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1800986/anbob1_ora_75557_i1800986.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.

M001程式在申請”shared pool”遇到ora-4031, 當然問題時間段的AWR沒有生成, 當時建議做heapdump如下

SQL> connect / as sysdba
SQL> alter session set events 'immediate trace name heapdump level 536870914';

OR

sqlplus "/ as sysdba"
oradebug setmypid
oradebug unlimit
oradebug dump heapdump 536870914
oradebug tracefile_name
exit

讓現場的收集了部分trace file, 並沒有什麼異常資訊,這時可以看問題時間點前最近的AWR,找”SGA breakdown difference”部份,初步判斷異常變化。



Note:
“gc index split transaction” 記憶體區使用超過了5Gb, 這在以前的版本中是很少見的. 並且繼續確認了幾份AWR,該heap 記憶體區時一直增長的, 如果並且如果手動從X$KSMSS 確認當前的”gc index split transaction”也是在持續增長。

gc index split transaction

gc index split transaction 是用於Oracle 12cr2 引入的新特性”fast index split wait”, 該特性是在KCL layer實現,用於RAC的Global Cache,使用它表示一個程式index splits等待遠端或本地程式, 基本上每個session在連線時都會分配一組buffer. 所以這個現象只存在於RAC環境。並且配置_gc_fast_index_split_wait=0沒有什麼作用。

似乎是因為使用了kclxidinit() used kghalp (permanent memory) 分配的是perm trunk,而不是kghalf (freeable memory)

解決方案

Mos中查詢了一下,比較符合bug 27163928.

重啟可以臨時解決問題,但是會逐漸再次因為fast index split wait增長導致ora-4031.

or

安裝one-off patch 27163928

or

升級到oracle 18.1(if released!)

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/23368118/viewspace-2155593/,如需轉載,請註明出處,否則將追究法律責任。

相關文章