當心ORACLE 12.2 RAC新特性引入的BUG導致ORA-4031
前幾日朋友有套資料庫升級到oracle 12C r2後出現了ORA-4031, 是一套on Linux x86 64-bit 2節點的RAC資料庫,Hard parse並不高,並且沒有使用AMM, ASMM。SGA分配80G, shared pool固定在13Gb.
— nodel 1 db alert log file
2018-05-29T13:44:51.107475+08:00 Thread 1 advanced to log sequence 6680 (LGWR switch) Current log# 14 seq# 6680 mem# 0: +SSDDG1/anbob/ONLINELOG/group_14.260.944852559 2018-05-29T13:57:28.332831+08:00 Errors in file /oracle/app/diag/rdbms/anbob/anbob1/trace/anbob1_m001_38524.trc (incident=1804178): ORA-04031: unable to allocate 640 bytes of shared memory ("shared pool"," SELECT source, (case...","kkqctdrvTD: co","ckydef:kkqcscpcky") 2018-05-29T13:57:28.332861+08:00 Errors in file /oracle/app/diag/rdbms/anbob/anbob1/trace/anbob1_ora_38534.trc (incident=1804506): ORA-04031: unable to allocate 40 bytes of shared memory ("shared pool","select con#,obj#,rcon#,enabl...","SQLA","tmp") Incident details in: /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1804506/anbob1_ora_38534_i1804506.trc Incident details in: /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1804178/anbob1_m001_38524_i1804178.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. Errors in file /oracle/app/diag/rdbms/anbob/anbob1/trace/anbob1_m001_38524.trc (incident=1804179): ORA-04031: unable to allocate 4152 bytes of shared memory ("shared pool"," SELECT source, (case...","9003.kgght","9003.kgght") ORA-04031: unable to allocate 640 bytes of shared memory ("shared pool"," SELECT source, (case...","kkqctdrvTD: co","ckydef:kkqcscpcky") Incident details in: /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1804179/anbob1_m001_38524_i1804179.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. 2018-05-29T13:57:28.588932+08:00 ORA-04031 heap dump being written to trace file /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1804506/anbob1_ora_38534_i1804506.trc 2018-05-29T13:57:28.971700+08:00 Errors in file /oracle/app/diag/rdbms/anbob/anbob1/trace/anbob1_ora_36389.trc (incident=1804658): ORA-04031: unable to allocate 424 bytes of shared memory ("shared pool","EXEC_OPR_POS","KGLS^62f5cb39","KGLS MEM BLOCK") Incident details in: /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1804658/anbob1_ora_36389_i1804658.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details. 2018-05-29T13:57:29.086756+08:00 Errors in file /oracle/app/diag/rdbms/anbob/anbob1/trace/anbob1_ora_75557.trc (incident=1800986): ORA-04031: unable to allocate 40 bytes of shared memory ("shared pool","select * from (select id,msi...","SQLA","tmp") Incident details in: /oracle/app/diag/rdbms/anbob/anbob1/incident/incdir_1800986/anbob1_ora_75557_i1800986.trc Use ADRCI or Support Workbench to package the incident. See Note 411.1 at My Oracle Support for error and packaging details.
M001程式在申請”shared pool”遇到ora-4031, 當然問題時間段的AWR沒有生成, 當時建議做heapdump如下
SQL> connect / as sysdba SQL> alter session set events 'immediate trace name heapdump level 536870914';
OR
sqlplus "/ as sysdba" oradebug setmypid oradebug unlimit oradebug dump heapdump 536870914 oradebug tracefile_name exit
讓現場的收集了部分trace file, 並沒有什麼異常資訊,這時可以看問題時間點前最近的AWR,找”SGA breakdown difference”部份,初步判斷異常變化。
Note:
“gc index split transaction” 記憶體區使用超過了5Gb, 這在以前的版本中是很少見的. 並且繼續確認了幾份AWR,該heap 記憶體區時一直增長的, 如果並且如果手動從X$KSMSS 確認當前的”gc index split transaction”也是在持續增長。
gc index split transaction
gc index split transaction 是用於Oracle 12cr2 引入的新特性”fast index split wait”, 該特性是在KCL layer實現,用於RAC的Global Cache,使用它表示一個程式index splits等待遠端或本地程式, 基本上每個session在連線時都會分配一組buffer. 所以這個現象只存在於RAC環境。並且配置_gc_fast_index_split_wait=0沒有什麼作用。
似乎是因為使用了kclxidinit() used kghalp (permanent memory) 分配的是perm trunk,而不是kghalf (freeable memory)
解決方案
Mos中查詢了一下,比較符合bug 27163928.
重啟可以臨時解決問題,但是會逐漸再次因為fast index split wait增長導致ora-4031.
or
安裝one-off patch 27163928
or
升級到oracle 18.1(if released!)
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/23368118/viewspace-2155593/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- Oracle 12.2 BUG :分割槽維護導致的 ORA-600 KKPOFPCD3Oracle
- 【ASK_ORACLE】Oracle 12.2 Bug導致網路卡出現故障後RAC庫未向TCP註冊本地VIP監聽OracleTCP
- Alter system suspend/resume 導致的bug和特性
- 由drop datafile導致的oracle bugOracle
- MySQL Bug導致異常當機的分析流程MySql
- Oracle 12.2的新特性之PDB級別閃回Oracle
- Oracle Database 12.2新特性詳解 --該國強OracleDatabase
- Oracle11g新特性導致空表不能匯出Oracle
- oracle bug 6825287導致DX鎖等待Oracle
- oracle rac歸檔使用nfs 導致oracle hungOracleNFS
- Oracle 12.2 新特性: Online PDB relocate (PDB hot move)Oracle
- ORACLE 11.2.0.4 rac for linux 鏈路宕導致的單節點異常當機OracleLinux
- Oracle RAC啟動因CTSS導致的異常Oracle
- Oracle 12.2 新特性 | PDB不同字符集變更深入解析Oracle
- Oracle 12.2新特性: PDB級閃回資料庫(Flashback PDB)Oracle資料庫
- C++各版本引入的新特性C++
- Oracle 12.2 新特性:線上PDB資料庫克隆(Pluggable Hot Clone)Oracle資料庫
- Oracle 12.2 RAC on Linux Best Practice DocumentationOracleLinux
- Firebug 2.0新特性
- goldengate 12c 12.2 新特性(updated)Go
- DRM特性引起的RAC節點當機
- RAC節點hang住, oracle bug導致了cpu過高,無法啟動叢集隔離Oracle
- asm例項自動dismount導致rac一個節點當機ASM
- Oracle 11.2.0.3RAC新特性-遷移spfile[Oracle基礎]Oracle
- Oracle RAC日常運維-NetworkManager導致叢集故障Oracle運維
- Goldengate 12.2新特性-自描述的佇列檔案Go佇列
- INTERVAL分割槽插入大量資料導致ORA-4031錯誤
- 分而治之:Oracle 18c及12.2分割槽新特性的N種優化實踐Oracle優化
- Oracle 11.2.0.3 Database for AIX bug導致ORA-04030的報錯OracleDatabaseAI
- 版本不當導致的exp出錯
- Go1.18 新特性:引入新的 netip 網路庫Go
- oracle BUG 5890312導致表空間瞬間暴漲Oracle
- ORACLE 10.2.0.4 RAC BUG IN HPOracle
- 開心檔之Java 9 新特性Java
- 【ASK_ORACLE】因process用盡導致的rac重啟的解決方法Oracle
- 又見想當然導致的誤譯
- RAC oracle 許可權更改導致 實力啟動失敗Oracle
- 當機導致slave異常分析