查詢執行hash join出現ora-040300 (QERHJ hash-joi,kllcqas:kllsltba)的診斷案例
某公安系統開發人員在plsql中執行查詢時報ORA-04030: 在嘗試分配 123416 位元組 (QERHJ hash-joi,kllcqas:kllsltba) 時程式記憶體不足。資料庫是aix上的rac 11.2.0.4。當時朋友找到我,我當時給他的建議是直接把報錯的查詢直接在資料庫伺服器上執行看是否報錯,因為在伺服器上用sqlplus來執行是沒有經過監聽程式的,但後面給我的答覆是在伺服器上執行也報錯。所以就遠端連線來處理。
檢視alert.log檔案發現如下錯誤資訊:
Errors in file /u01/app/oracle/diag/rdbms/test/test/incident/incdir_156708/test_ora_15859922_i156708.trc ORA-04030: 在嘗試分配 123416 位元組 (QERHJ hash-joi,kllcqas:kllsltba) 時程式記憶體不足
從alert.log檔案檢視到的非預設值引數資訊如下 :
System parameters with non-default values: processes = 1500 sessions = 2272 _realfree_heap_pagesize_hint= 256K _use_realfree_heap = TRUE spfile = "+DATA_DG/xtrk/spfilextrk.ora" sga_target = 14G control_files = "+DATA_DG/xtrk/controlfile/current.261.851142581" control_files = "+DATA_DG/xtrk/controlfile/current.260.851142581" db_block_size = 8192 _external_scn_rejection_threshold_hours= 1 compatible = "11.2.0.4.0" log_archive_format = "%t_%s_%r.dbf" cluster_database = TRUE db_create_file_dest = "+DATA_DG" db_recovery_file_dest = "+DATA_DG" db_recovery_file_dest_size= 400G thread = 1 undo_tablespace = "UNDOTBS1" instance_number = 1 remote_login_passwordfile= "EXCLUSIVE" db_domain = "" dispatchers = "(PROTOCOL=TCP) (SERVICE=xtrkXDB)" remote_listener = "test-scan:1521" audit_file_dest = "/u01/app/oracle/admin/test/adump" audit_trail = "DB" db_name = "test" open_cursors = 600 pga_aggregate_target = 10G diagnostic_dest = "/u01/app/oracle"
如是檢視錯誤跟蹤檔案:
Dump continued from file: /u01/app/oracle/diag/rdbms/xtrk/xtrk1/trace/xtrk1_ora_15859922.trc ORA-04030: 在嘗試分配 123416 位元組 (QERHJ hash-joi,kllcqas:kllsltba) 時程式記憶體不足 ========= Dump for incident 156708 (ORA 4030) ======== ----- Beginning of Customized Incident Dump(s) ----- ======================================= TOP 10 MEMORY USES FOR THIS PROCESS --------------------------------------- *** 2014-08-27 15:07:09.388 77% 86 MB, 727 chunks: "kllcqas:kllsltba " SQL QERHJ hash-joi ds=11160a378 dsprt=110bcb0f8 7% 8449 KB, 80 chunks: "QERHJ Bit vector " SQL QERHJ hash-joi ds=11160a378 dsprt=110bcb0f8 6% 6771 KB, 4 chunks: "kllcqc:kllcqslt " SQL QERHJ hash-joi ds=11160a378 dsprt=110bcb0f8 5% 5880 KB, 407 chunks: "free memory " SQL QERHJ hash-joi ds=11160a378 dsprt=110bcb0f8 2% 2056 KB, 2 chunks: "kllcqgf:kllsltba " SQL kxs-heap-w ds=110bcb0f8 dsprt=1109f0120 1% 651 KB, 28 chunks: "permanent memory " pga heap ds=110005210 dsprt=0 0% 376 KB, 1 chunk : "kfk_kfkio_freeq " KFK_IO_SUBHEAP ds=1108dae38 dsprt=110005210 0% 336 KB, 1 chunk : "kfkosd_p " KFK_IO_SUBHEAP ds=1108dae38 dsprt=110005210 0% 259 KB, 8 chunks: "free memory " pga heap ds=110005210 dsprt=0 0% 253 KB, 2 chunks: "free memory " top call heap ds=11011f360 dsprt=0
從上面報錯程式記憶體分配資訊來看報錯的chunks: "kllcqas:kllsltba " 大小是86M
======================================= PRIVATE MEMORY SUMMARY FOR THIS PROCESS --------------------------------------- ****************************************************** PRIVATE HEAP SUMMARY DUMP 111 MB total: 110 MB commented, 652 KB permanent 606 KB free (0 KB in empty extents), 109 MB, 1 heap: "session heap " 92 KB free held ------------------------------------------------------ Summary of subheaps at depth 1 110 MB total: 110 MB commented, 96 KB permanent 94 KB free (0 KB in empty extents), 108 MB, 1 heap: "kxs-heap-w " 84 KB free held ------------------------------------------------------ Summary of subheaps at depth 2 109 MB total: 108 MB commented, 45 KB permanent 32 KB free (4 KB in empty extents), 106 MB, 4 heaps: "QERHJ hash-joi " 10 KB free held ------------------------------------------------------ Summary of subheaps at depth 3 106 MB total: 100 MB commented, 41 KB permanent 5880 KB free (0 KB in empty extents), 91 MB, 727 chunks: "kllcqas:kllsltba " 5257 KB free held 8786 KB, 80 chunks: "QERHJ Bit vector " 336 KB free held
從111 MB total可知出錯程式只分配了111M的記憶體
========================================= REAL-FREE ALLOCATOR DUMP FOR THIS PROCESS ----------------------------------------- Dump of Real-Free Memory Allocator Heap [0x110959bf0] mag=0xfefe0001 flg=0x5000003 fds=0x0 blksz=262144 blkdstbl=0x110959c00, iniblk=8192 maxblk=524288 numsegs=15 In-use num=402 siz=116391936, Freeable num=0 siz=0, Free num=0 siz=0 ========================================== INSTANCE-WIDE PRIVATE MEMORY USAGE SUMMARY ------------------------------------------ Dumping Work Area Table (level=1) ===================================== Global SGA Info --------------- global target: 10240 MB auto target: 8997 MB max pga: 2047 MB pga limit: 4095 MB pga limit known: 0 pga limit errors: 0 pga inuse: 275 MB pga alloc: 343 MB pga freeable: 11 MB pga freed: 4150 MB pga to free: 0 broker request: 0 pga auto: 32 MB pga manual: 0 MB pga alloc (max): 3043 MB pga auto (max): 32 MB pga manual (max): 0 MB # workareas : 4 # workareas(max): 5 ================================ PER-PROCESS PRIVATE MEMORY USAGE -------------------------------- Private memory usage per Oracle process
當出現ora-04030時該例項分配的總PGA大小如下
------------------------- Top 10 processes: ------------------------- (percentage is of 420 MB total allocated memory) 27% pid 51: 111 MB used of 112 MB allocated < = CURRENT PROC 5% pid 13: 18 MB used of 22 MB allocated 5% pid 14: 18 MB used of 22 MB allocated 5% pid 36: 18 MB used of 22 MB allocated 5% pid 38: 18 MB used of 22 MB allocated 5% pid 39: 18 MB used of 22 MB allocated 5% pid 40: 18 MB used of 22 MB allocated 4% pid 10: 16 MB used of 17 MB allocated 4% pid 12: 13 MB used of 17 MB allocated 4% pid 20: 12 MB used of 15 MB allocated
從上面的percentage is of 420 MB total allocated memory可知這個例項的PGA只分配了420M。從27% pid 51: 111 MB used of 112 MB allocated < = CURRENT PROC可知當前程式分配了112M,使用了111M。而pid=51,那麼接下來檢視pid=51的詳細會話資訊:
------------------------------------ Begin session detail for pid 51 sid: 430 ser: 5 audsid: 400022 user: 239/TC_RKXT flags: (0x45) USR/- flags_idl: (0x1) BSY/-/-/-/-/- flags2: (0x40009) -/-/INC pid: 51 O/S info: user: grid, term: UNKNOWN, ospid: 15859922 image: oracle@xtrkdb1 client details: O/S info: user: Administrator, term: RKXXWEB, ospid: 4120:2956 machine: RKXX\RKXXWEB program: plsqldev.exe application name: PL/SQL Developer, hash value=1190136663 action name: SQL Window - New, hash value=3399691616 current SQL: SELECT b.master_relation 戶關係, b.hu_id_new 戶號,a.name 姓名,a.pid 身份證號碼,b.when_in_logged 入戶時間, b.in_category 入戶理由, b.dob, a.photo_flag 有無相片, substr(ORGNAME,7,10) 所屬派出所, b.quxcun_id,c.name 居委會名稱 from t_person a, t_huji b, tc_jcyw.t_org o ,tc_jcyw.t_quxcun c where a.person_id = b.person_id and b.org_id = o.sunit_code and b.quxcun_id=c.quxcun_id and b.zxbz = '0' and a.dob < =to_date( '1996.07.01','yyyy.mm.dd' ) and not Exists(select * from t_pid_accept m where m.iscancel is null and m.pid = a.pid ) End session detail for pid 51 ------------------------------------ ================ SWAP INFORMATION ---------------- swap info: free_mem = 22025.57M rsv = 128.00M alloc = 44.48M avail = 32768.00M swap_free = 32723.52M ----- End of Customized Incident Dump(s) ----- *** 2014-08-27 15:07:09.417 dbkedDefDump(): Starting incident default dumps (flags=0x2, level=3, mask=0x0) ----- Current SQL Statement for this session (sql_id=1gbsquwh58w4k) ----- SELECT b.master_relation 戶關係, b.hu_id_new 戶號,a.name 姓名,a.pid 身份證號碼,b.when_in_logged 入戶時間, b.in_category 入戶理由, b.dob, a.photo_flag 有無相片, substr(ORGNAME,7,10) 所屬派出所, b.quxcun_id,c.name 居委會名稱 from t_person a, t_huji b, tc_jcyw.t_org o ,tc_jcyw.t_quxcun c where a.person_id = b.person_id and b.org_id = o.sunit_code and b.quxcun_id=c.quxcun_id and b.zxbz = '0' and a.dob <=to_date( '1996.07.01','yyyy.mm.dd' ) and not Exists(select * from t_pid_accept m where m.iscancel is null and m.pid = a.pid )
從上面的資訊可知swap大小為32768M,SGA為14G,PGA為10G,當這個例項的總PGA為分配了420M,當這個會話程式只分配111M記憶體時不應該會報告ora-04030錯誤.這時我還是親自在伺服器上執行上面的報錯查詢發現不報錯,而開始我要朋友做測試,他是要別人做的,但給我的答案是報錯,看來還是得親自測試。這種情況在MOS有一篇文章(ID 758131.1)就是描述這種情況的並且說在aix下是一個特定的問題。造成這個問題的原因可能是因為作業系統資源限制,說當CRS啟動它的資源時是使用root使用者可能會出現ora-04030錯誤,如果是資源透過sqlplus,lsnrctl來啟動則沒有作業系統資源限制。如果我檢視/etc/security/limits檔案內容如下:
# cat limits * * Sizes are in multiples of 512 byte blocks, CPU time is in seconds * * fsize - soft file size in blocks * core - soft core file size in blocks * cpu - soft per process CPU time limit in seconds * data - soft data segment size in blocks * stack - soft stack segment size in blocks * rss - soft real memory usage in blocks * nofiles - soft file descriptor limit * fsize_hard - hard file size in blocks * core_hard - hard core file size in blocks * cpu_hard - hard per process CPU time limit in seconds * data_hard - hard data segment size in blocks * stack_hard - hard stack segment size in blocks * rss_hard - hard real memory usage in blocks * nofiles_hard - hard file descriptor limit * * The following table contains the default hard values if the * hard values are not explicitly defined: * * Attribute Value * ========== ============ * fsize_hard set to fsize * cpu_hard set to cpu * core_hard -1 * data_hard -1 * stack_hard 8388608 * rss_hard -1 * nofiles_hard -1 * * NOTE: A value of -1 implies "unlimited" * default: fsize = 2097151 core = 2097151 cpu = -1 data = 262144 # rss = 65536 rss = -1 stack = 65536 nofiles = 2000 root: fsize = -1 data = -1 stack = -1 core = -1 rss = -1 nofiles = -1 stack = -1 stack_hard = -1 data_hard = -1 cpu = -1 core_hard = -1 cpu_hard = -1 fsize_hard = -1 oracle: fsize = -1 data = -1 stack = -1 core = -1 rss = -1 nofiles = -1 stack = -1 stack_hard = -1 data_hard = -1 cpu = -1 core_hard = -1 cpu_hard = -1 fsize_hard = -1
發現我朋友已經修改了root使用者和oracle使用者的許可權但為什麼還是報ora-04030錯誤了。原因為在oracle 11g rac安裝不像以前的版本,現在安裝Grid Infrastructure 和Oracle資料庫軟體推薦使用不同的使用者來安裝。透過檢視這裡用grid使用者來安裝Grid Infrastructure ,oracle使用者來安裝的oracle資料庫軟。下面的limits檔案朋友只修改了root,oracle兩個使用者,但並沒有修改grid使用者對作業系統資源的使用限制。而在rac中,監聽程式是叢集軟體來啟動的也就是由grid使用者來管理的。由於修改grid使用者對作業系統資源的使用限制由於在limits檔案中增加以下內容:
grid: fsize = -1 data = -1 stack = -1 core = -1 rss = -1 nofiles = -1 stack = -1 stack_hard = -1 data_hard = -1 cpu = -1 core_hard = -1 cpu_hard = -1 fsize_hard = -1
由於重啟rac,在重啟之後再執行上面報錯語句一切正常了,造成這個問題的根本原因是沒有修改oracle,grid使用者對作業系統資源的使用限制所造成的,而這個問題應該是可以避免的,因為對這種作業系統資源的使用限制其實在安裝rac之前是要做修改的在安裝文件中也是有的,所以安裝也是需要仔細閱讀官方文件的。
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/26015009/viewspace-1261123/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- ORA-04030: 在嘗試分配...位元組 (hash-join subh,kllcqas:kllsltba) 時程式記憶體不足。記憶體
- Oracle診斷案例-Job任務停止執行Oracle
- Oracle診斷案例:Job任務停止執行(轉)Oracle
- Java執行緒診斷Java執行緒
- 【sql調優之執行計劃】hash joinSQL
- ORACLE診斷案例Oracle
- 查詢Tuxedo積壓的Oracle診斷指令碼UXOracle指令碼
- 省去join的查詢
- MYSQL INNODB中hash查詢表的實現MySql
- 案例 - EBS SQL效能診斷SQL
- 20200909]為什麼執行計劃不是出現hash join semi.txt
- 從Hash Join的執行計劃的細節中能看到點啥
- 一次Oracle診斷案例-Spfile案例Oracle
- join 查詢優化優化
- 使用 DBMS_SQLDIAG診斷各種查詢問題SQL
- 常見通用的Join查詢
- ORACLE Hash JoinOracle
- Oracle診斷案例-Sql_traceOracleSQL
- 執行SQL語句查詢時出現一個SQLException異常SQLException
- 檢視和診斷JVM執行緒資訊JVM執行緒
- Mysql join 的本質,Nested Loop Join 和 Hash JoinMySqlOOP
- MySQL 合併查詢join 查詢出的不同列合併到一個表中MySql
- PostgreSQLdblink非同步呼叫實現並行hash分片JOIN-含資料交、並、差提速案例SQL非同步並行
- 連線查詢簡析 join 、 left join 、 right join
- 我眼中的《深入淺出Oracle-入門、進階、診斷案例》Oracle
- oracle update left join查詢Oracle
- HASH JOIN ,MERGE JOIN ,NESTED LOOPOOP
- hash join\nest loop join\sort merge join的實驗OOP
- 常見物件-在大串中查詢小串出現的次數案例物件
- Oracle效能 - 常用查詢診斷及調整指令碼[不斷更新] (final)Oracle指令碼
- HASH join詳解
- TiDB 查詢優化及調優系列(三)慢查詢診斷監控及排查TiDB優化
- SQL中聯表查詢操作(LEFT JOIN, RIGHT JOIN, INNER JOIN)SQL
- 多執行緒查詢執行緒
- sql查詢是如何執行的?SQL
- oracle 正在執行的物件查詢Oracle物件
- HASH JOIN ,MERGE JOIN ,NESTED LOOP的比較OOP
- 資料庫實現原理#4(Hash Join)資料庫