ORA-04030處理一例

531968912發表於2016-10-18

ORA-04030錯誤屬於常見錯誤之一

oerr ora 04030
04030, 00000, "out of process memory when trying to allocate %s bytes (%s,%s)"
// *Cause:  Operating system process private memory has been exhausted
// *Action: 

頭兩天一同事遇上了一個ORA-04030的錯誤,跟我去年遇到的一次現象極為相似,翻出了當時寫的報告,在blog中分享下。

現象是開發人員透過PLSQL Developer工具在建立索引時報ORA-04030錯誤,導致索引建立失敗,但是透過splplus重新執行建立索引語句成功,下面記錄下處理過程:

1. ORA-04030報錯

ORA-04030錯誤引起的原因大概有以下幾種情況:

1) 是否有足夠的可用記憶體?

檢視系統記憶體使用情況

+-topas_nmon--r=Resources--------Host=gisdata2-------Refresh=2 secs---10:03.50------------+
| Memory ---------------------------------------------------------------------------------|
|          Physical  PageSpace |        pages/sec  In     Out | FileSystemCache           |
|% Used       87.7%      2.6%  | to Paging Space   0.5    0.0 | (numperm)  4.6%           |
|% Free       12.3%     97.4%  | to File System    0.0    0.0 | Process   66.1%           |
|MB Used   55450.2MB   523.7MB | Page Scans        0.0        | System    17.0%           |
|MB Free    7781.6MB 19956.3MB | Page Cycles       0.0        | Free      12.3%           |
|Total(MB) 63231.9MB 20480.0MB | Page Steals       0.0        |           ------          |
|                              | Page Faults    2110.3        | Total    100.0%           |
|------------------------------------------------------------ | numclient  4.6%           |
|Min/Maxperm     3065MB(  5%)  9195MB( 15%) <--% of RAM       | maxclient 15.0%           |
|Min/Maxfree    3000   4000       Total Virtual   81.7GB      | User      67.7%           |
|Min/Maxpgahead    2      8    Accessed Virtual   50.8GB 62.2%| Pinned    19.8%           |
|                                                             | lruable pages   15692064.0|
|-----------------------------------------------------------------------------------------|

可用看到剩餘記憶體還有7.7G左右,說明在作業系統層面還有足夠的可用記憶體

 

2) 是否設定了oracle的限制?

檢視oracle中與PGA相關的設定

SQL> show parameter PGA

NAME                                 TYPE                             VALUE
------------------------------------ -------------------------------- ------------------------------
pga_aggregate_target                 big integer                      8400M
SQL> 
SQL> 
SQL> select * from v$pgastat;

NAME                                                                  VALUE UNIT
---------------------------------------------------------------- ---------- ------------
aggregate PGA target parameter                                   8808038400 bytes
aggregate PGA auto target                                        7288095744 bytes
global memory bound                                               880803840 bytes
total PGA inuse                                                   720201728 bytes
total PGA allocated                                               985562112 bytes
maximum PGA allocated                                            2692780032 bytes   ---實際分配最大記憶體  2568.03516MB
total freeable PGA memory                                         163905536 bytes
process count                                                           414
max processes count                                                     529
PGA memory freed back to OS                                      2.4854E+11 bytes
total PGA used for auto workareas                                  10177536 bytes

NAME                                                                  VALUE UNIT
---------------------------------------------------------------- ---------- ------------
maximum PGA used for auto workareas                               642154496 bytes
total PGA used for manual workareas                                       0 bytes
maximum PGA used for manual workareas                                537600 bytes
over allocation count                                                     0
bytes processed                                                  5.7321E+11 bytes
extra bytes read/written                                         2125827072 bytes
cache hit percentage                                                  99.63 percent
recompute count (total)                                               61401

SQL> set linesize 120
SQL> col name for a30
SQL> col value for a20
SQL> col describ for a60
SQL> set pages 3000
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ
  2  FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3  WHERE x.indx = y.indx
  4  AND x.ksppinm LIKE '%_pga_max_size%'
  5  /

NAME                           VALUE                DESCRIB
------------------------------ -------------------- ------------------------------------------------------------
_pga_max_size                  1761607680           Maximum size of the PGA memory for one process
SQL> set linesize 120
SQL> col name for a30
SQL> col value for a20
SQL> col describ for a60
SQL> set pages 3000
SQL> SELECT x.ksppinm NAME, y.ksppstvl VALUE, x.ksppdesc describ
  2  FROM SYS.x$ksppi x, SYS.x$ksppcv y
  3  WHERE x.indx = y.indx
  4  AND x.ksppinm LIKE '%_smm_max_size%'
  5  /

NAME                           VALUE                DESCRIB
------------------------------ -------------------- ------------------------------------------------------------
_smm_max_size                  860160               maximum work area size in auto mode (serial)

SQL>

從上面的內容可以看到PGA設定的大小為8400M,根據單個會話使用PGA的期望尺寸(也可以認為是實際分配的最大尺寸)計算公式是:min(5%*pga_aggregate_target,50%*_pga_max_size,_smm_max_size),可以簡單計算下min(5%*8400M,50%1680M,840M)=420M(其中_pga_max_size的單位為bytes,_smm_max_size的單位為kb),即單個會話能使用PGA的期望尺寸為420M,那報錯的會話是否超過了該限制?

檢視ORA-04030報錯的trace檔案:

TOP 10 MEMORY USES FOR THIS PROCESS
---------------------------------------
98%  109 MB, 1797 chunks: "permanent memory          "  SQL
         sort subheap    ds=1108b67f0  dsprt=11048a508
 1% 1493 KB, 1779 chunks: "free memory               "  
         session heap    ds=11047a818  dsprt=11019eae0
 0%  193 KB,  26 chunks: "permanent memory          "  
         pga heap        ds=110072eb0  dsprt=0
 0%   82 KB,   6 chunks: "frame segment             "  SQL
         kxs-heap-f      ds=11048a8c8  dsprt=11047a818
 0%   45 KB,   4 chunks: "permanent memory          "  
         session heap    ds=11047a818  dsprt=11019eae0
 0%   35 KB,  11 chunks: "kzctxhugi2                "  
         session heap    ds=11047a818  dsprt=11019eae0
 0%   33 KB,   1 chunk : "free memory               "  
         top call heap   ds=11019e8c0  dsprt=0
 0%   24 KB,  11 chunks: "koh-kghu session heap     "  
         session heap    ds=11047a818  dsprt=11019eae0
 0%   22 KB,   1 chunk : "Fixed Uga                 "  
         pga heap        ds=110072eb0  dsprt=0
 0%   20 KB,   5 chunks: "kxsFrame4kPage            "  
         session heap    ds=11047a818  dsprt=11019eae0
 
=======================================
PRIVATE MEMORY SUMMARY FOR THIS PROCESS
---------------------------------------
****************************************************** PRIVATE HEAP SUMMARY DUMP
111 MB total: 111 MB commented, 194 KB permanent
    45 KB free (0 KB in empty extents),
     111 MB,   1 heap:    "session heap   "           
------------------------------------------------------
Summary of subheaps at depth 1
111 MB total:
   109 MB commented, 55 KB permanent
  1513 KB free (1 KB in empty extents),
     2 KB uncommented freeable with mark,      110 MB,   1 heap:    "kxs-heap-w     "            1490 KB free held
------------------------------------------------------
Summary of subheaps at depth 2
109 MB total:
   109 MB commented, 10 KB permanent
    14 KB free (4 KB in empty extents),
     0 KB uncommented freeable with mark,      109 MB,   1 heap:    "sort subheap   "           
------------------------------------------------------
Summary of subheaps at depth 3
109 MB total:
     0 KB commented, 109 MB permanent
     2 KB free (0 KB in empty extents),
 
=========================================
REAL-FREE ALLOCATOR DUMP FOR THIS PROCESS
-----------------------------------------
 
Dump of Real-Free Memory Allocator Heap [0x11043aa30]
mag=0xfefe0001 flg=0x5000003 fds=0x0 blksz=65536
blkdstbl=0x11043aa40, iniblk=4096 maxblk=262144 numsegs=15
In-use num=1774 siz=116260864, Freeable num=0 siz=0, Free num=0 siz=0
 
==========================================
INSTANCE-WIDE PRIVATE MEMORY USAGE SUMMARY
------------------------------------------
 
Dumping Work Area Table (level=1)
=====================================
 
  Global SGA Info
  ---------------
 
    global target:     8400 MB
    auto target:       7170 MB
    max pga:           1680 MB
    pga limit:         3360 MB
    pga limit known:  0
    pga limit errors:     0
 
    pga inuse:          541 MB
    pga alloc:          692 MB
    pga freeable:        93 MB
    pga freed:        28937 MB
    pga to free:          0 
 
    pga auto:           108 MB
    pga manual:           0 MB
 
    pga alloc  (max):  2568 MB
    pga auto   (max):   148 MB
    pga manual (max):     0 MB
 
    # workareas     :     0
    # workareas(max):    10
 
 
 
================================
PER-PROCESS PRIVATE MEMORY USAGE
--------------------------------
 
Private memory usage per Oracle process
 
-------------------------
Top 10 processes:
-------------------------
(percentage is of 693 MB total allocated memory) 16% pid 168: 111 MB used of 111 MB allocated  <= CURRENT PROC 6% pid 3: 43 MB used of 43 MB allocated 
 2% pid 24: 402 KB used of 12 MB allocated (12 MB freeable)
 2% pid 33: 532 KB used of 12 MB allocated (11 MB freeable)
 2% pid 7: 11 MB used of 11 MB allocated 
 2% pid 8: 11 MB used of 11 MB allocated 
 2% pid 9: 11 MB used of 11 MB allocated 
 2% pid 10: 11 MB used of 11 MB allocated 
 2% pid 11: 11 MB used of 11 MB allocated 
 2% pid 12: 11 MB used of 11 MB allocated 

從以上trace檔案中可以看到,報錯的程式實際分配的程式只有111MB,遠遠未達到420M,說明並非是由於ORACLE自身的限制引起的ORA-04030報錯。

 

3) 哪個程式需要的記憶體過多?

上一個是否是由於oracle自身限制引起的解釋中,已經可以從trace檔案中看到,消耗最多記憶體的程式就是報ORA-04030的程式,消耗的記憶體為110M,並未發現其他更消耗記憶體的程式。

 

4) 是否設定了作業系統限制?

檢視作業系統限制,oracle使用者的限制

$ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        4194304
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) unlimited
threads(per process) unlimited
processes(per user)  unlimited
$

root使用者的限制

gisdata2@root[/]ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         131072
stack(kbytes)        32768
memory(kbytes)       unlimited
coredump(blocks)     unlimited
nofiles(descriptors) unlimited
threads(per process) unlimited
processes(per user)  unlimited
gisdata2@root[/]

從上面root和oracle的limit限制來看,root使用者的data(kbytes)的限定值得關注,該屬性的意義是soft data segment size in blocks(程式資料段大小限制)。

 

2. 為何sqlplus會成功,PLSQL Developer卻會失敗

透過PLSQL Developer工具建立索引時報ORA-04030錯誤,但是透過SQLPLUS建立卻能成功,兩者除了使用的工具不同(PLSQL Developer和sqlplus),還有就是連線的方式不同(PLSQL Developer是透過監聽程式建立的程式連線;sqlplus是在資料庫伺服器上直接建立建立的連線,繞過了監聽程式建立的程式)。

從連線工具和方式的不同得到了不一樣的結果,如何來驗證到底是連線工具的問題或則是連線方式的引起的報錯?

由於PLSQL Developer只能透過監聽的連線方式進行連線,但是sqlplus可以透過監聽或則直接連線兩種方式進行,所以先對連線方式進行測試。

透過sqlplus以tnsnames.ora標籤名的方式透過監聽進行連線,並執行建立索引報錯的語句,發現錯誤依然存在,但是如果不透過監聽而直接連線是不會報錯的,說明跟是否透過監聽進行連線有很大的關係

3. 為何會受監聽的影響?

在ORACLE RAC環境中,由於crs的啟停是透過root使用者進行的

gisdata2@root[/]ps -ef|grep init
    root        1        0   0   Aug 31      - 18:24 /etc/init
    root  5046686        1   0   Jul 18      -  0:00 /bin/sh /etc/init.crsd run
    root  5702082 27066830   0   Jul 18      -  0:00 /bin/sh /etc/init.cssd oclsomon
    root  6554098 27066830   0   Jul 18      -  0:00 /bin/sh /etc/init.cssd daemon
    root 24969600        1   0   Jul 18      -  0:00 /bin/sh /etc/init.evmd run
    root 27066830        1   1   Jul 18      -  8:07 /bin/sh /etc/init.cssd fatal
    root 40108312 18940178   0 10:57:45  pts/1  0:00 grep init
gisdata2@root[/]

所以在crs會繼承root使用者的limit屬性,當透過crs或則srvctl命令啟動監聽時,也會繼承root使用者相應的limit屬性,即data(kbytes)為131072。如何驗證該推斷?

1) 透過監聽的形式進行連線

$ sqlplus sys/g_iS74Ldh@gissc2 as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jul 20 17:20:02 2015

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> select * from v$mystat where rownum<2;

       SID STATISTIC#      VALUE
---------- ---------- ----------
      1427          0          1

SQL> select spid from v$process where addr =(select paddr from v$session where sid=1427);

SPID
------------
10158446                 -----該連線的作業系統程式號

SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

透過dbx工具檢視該程式的limit資訊

$ dbx -a 10158446
Waiting to attach to process 10158446 ...
Successfully attached to oracle.
warning: Directory containing oracle could not be determined.
Apply 'use' command to initialize source path.

Type 'help' for help.
reading symbolic information ...
stopped in read at 0x90000000002e294 ($t1)
0x90000000002e294 (read+0x274) e8410028          ld   r2,0x28(r1)
(dbx) proc ulimit
proc ulimit
     ^ syntax error
(dbx) proc rlimit
rlimit name:          rlimit_cur               rlimit_max       (units)
 RLIMIT_CPU:         (unlimited)             (unlimited)        sec
 RLIMIT_FSIZE:       (unlimited)             (unlimited)        bytes
 RLIMIT_DATA:          134217728             (unlimited)        bytes
 RLIMIT_STACK:          33554432              4294967296        bytes
 RLIMIT_CORE:        (unlimited)             (unlimited)        bytes
 RLIMIT_RSS:         (unlimited)             (unlimited)        bytes
 RLIMIT_AS:          (unlimited)             (unlimited)        bytes
 RLIMIT_NOFILE:            65534                   65536        descriptors
 RLIMIT_THREADS:     (unlimited)             (unlimited)        per process
 RLIMIT_NPROC:       (unlimited)             (unlimited)        per user
(dbx) quit
$

從上面的內容可以看到data屬性的limit值為134217728bytes即131072kbytes與root的data(kbytes)  131072值完全吻合(stack的33554432bytes即32768也與root的stack(kbytes) 32768一致),說明是透過監聽建立連線程式的limit繼承於root使用者。

 

2) 不同監聽進行連線

$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Mon Jul 20 17:16:09 2015

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options

SQL> select * from v$mystat where rownum<2;

       SID STATISTIC#      VALUE
---------- ---------- ----------
      1375          0          1

SQL> select spid from v$process where addr =(select paddr from v$session where sid=1375);

SPID
------------
47710432

SQL> 

$ 
$ dbx -a 47710432
Waiting to attach to process 47710432 ...
Successfully attached to oracle.
warning: Directory containing oracle could not be determined.
Apply 'use' command to initialize source path.

Type 'help' for help.
reading symbolic information ...
stopped in read at 0x90000000002e294 ($t1)
0x90000000002e294 (read+0x274) e8410028          ld   r2,0x28(r1)
(dbx) proc rlimit
rlimit name:          rlimit_cur               rlimit_max       (units)
 RLIMIT_CPU:         (unlimited)             (unlimited)        sec
 RLIMIT_FSIZE:       (unlimited)             (unlimited)        bytes
 RLIMIT_DATA:        (unlimited)             (unlimited)        bytes
 RLIMIT_STACK:          33554432              4294967296        bytes
 RLIMIT_CORE:        (unlimited)             (unlimited)        bytes
 RLIMIT_RSS:         (unlimited)             (unlimited)        bytes
 RLIMIT_AS:          (unlimited)             (unlimited)        bytes
 RLIMIT_NOFILE:            65534             (unlimited)        descriptors
 RLIMIT_THREADS:     (unlimited)             (unlimited)        per process
 RLIMIT_NPROC:       (unlimited)             (unlimited)        per user
(dbx) quit
$

從上面內容可以看到,如果不透過監聽連線資料庫建立的程式,它的data限制為unlimited的即無限制。

最後檢視crs中監聽的啟動日誌(/u01/oracle/product/10.2.0/db_1/log/gisdata2/racg中的日誌檔案ora.gisdata2.LISTENER_GISDATA2.lsnr.log):

2015-07-18 14:16:54.676: [    RACG][1] [28508196][1][ora.gisdata2.LISTENER_GISDATA2.lsnr]: The command completed successfully

2015-07-18 14:47:09.078: [    RACG][1] [29950132][1][ora.gisdata2.LISTENER_GISDATA2.lsnr]:
LSNRCTL for IBM/AIX RISC System/6000: Version 10.2.0.5.0 - Production on 18-JUL-2015 14:47:08

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Starting /u01/oracle/product/10.2.0/db_1/bin/tnslsnr: please wait...


2015-07-18 14:47:09.079: [    RACG][1] [29950132][1][ora.gisdata2.LISTENER_GISDATA2.lsnr]: TNSLSNR for IBM/AIX RISC System/6000: Version 10.2.0.5.0 - Production
System parameter file is /u01/oracle/product/10.2.0/db_1/network/admin/listener.ora
Log messages written to /u01/oracle/product/10.2.0/db_1/network/log/listener_gisdata2.log

2015-07-18 14:47:09.079: [    RACG][1] [29950132][1][ora.gisdata2.LISTENER_GISDATA2.lsnr]: Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.176.30.81)(PORT=1521)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.176.30.79)(PORT=1521)))
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))

從以上內容可以看到監聽是由crs(或則是srvctl命令)啟動的

以及監聽的執行時間:

$ lsnrctl status

LSNRCTL for IBM/AIX RISC System/6000: Version 10.2.0.5.0 - Production on 20-JUL-2015 16:31:03

Copyright (c) 1991, 2010, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_GISDATA2
Version                   TNSLSNR for IBM/AIX RISC System/6000: Version 10.2.0.5.0 - Production
Start Date                18-JUL-2015 14:56:39
Uptime                    2 days 1 hr. 34 min. 24 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      ON
Listener Parameter File   /u01/oracle/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File         /u01/oracle/product/10.2.0/db_1/network/log/listener_gisdata2.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.176.30.81)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.176.30.79)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC)))
Services Summary...
Service "SYS$SYS.KUPC$C_1_20150720090431.GISSC" has 1 instance(s).
  Instance "gissc1", status READY, has 1 handler(s) for this service...
Service "SYS$SYS.KUPC$S_1_20150720090431.GISSC" has 1 instance(s).
  Instance "gissc1", status READY, has 1 handler(s) for this service...
Service "gissc" has 2 instance(s).
  Instance "gissc1", status READY, has 1 handler(s) for this service...
  Instance "gissc2", status READY, has 2 handler(s) for this service...
Service "gisscXDB" has 2 instance(s).
  Instance "gissc1", status READY, has 1 handler(s) for this service...
  Instance "gissc2", status READY, has 1 handler(s) for this service...
Service "gissc_XPT" has 2 instance(s).
  Instance "gissc1", status READY, has 1 handler(s) for this service...
  Instance "gissc2", status READY, has 2 handler(s) for this service...
Service "mygissc" has 2 instance(s).
  Instance "gissc1", status READY, has 1 handler(s) for this service...
  Instance "gissc2", status READY, has 2 handler(s) for this service...
The command completed successfully
$

監聽啟動的時間也與日誌中的時間對應。

 

由此可以得到結論,由於監聽是透過crs進行的啟動,繼承了root使用者的limit限制,每個會話所能持有的記憶體大小最大不能超過128M,當透過監聽進行資料庫連線時,由監聽建立的使用者程式也將繼承該limit限制,所以當透過PLSQL Developer連線資料庫(包括sqlplus等工具需要透過監聽建立使用者程式的情況),在建立索引過程中,當所請求的記憶體達到或非常接近該限制時,就會由於無法進一步申請更多的記憶體資源,丟擲ORA-04030錯誤。

 

提示:

如果是在Linux系統中,可以透過cat /proc/PID/limits的方法進行檢視單個程式的limit屬性值,其中PID為要檢視程式的程式號


摘自:%e5%a4%84%e7%90%86%e4%b8%80%e4%be%8b/

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/25462274/viewspace-2126654/,如需轉載,請註明出處,否則將追究法律責任。

相關文章