nproc/nfile之於oracle的意義

myownstars發表於2012-04-09
近來常常收到開發抱怨的郵件,即透過JDBC經常連線不上若干測試資料庫,對應的錯誤程式碼如下
Caused by: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:204)
at org.springframework.orm.ibatis.SqlMapClientTemplate.queryForObject(SqlMapClientTemplate.java:271)
at com.yihaodian.central.util.SqlMapClientTemplateMonitor.queryForObject(SqlMapClientTemplateMonitor.java:387)
at com.yihaodian.central.dao.impl.ProvinceDAOImpl.getProvinceByID(ProvinceDAOImpl.java:130)
at sun.reflect.GeneratedMethodAccessor690.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.aop.framework.adapter.AfterReturningAdviceInterceptor.invoke(AfterReturningAdviceInterceptor.java:50)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.adapter.MethodBeforeAdviceInterceptor.invoke(MethodBeforeAdviceInterceptor.java:50)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:89)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy89.getProvinceByID(Unknown Source)
at com.yihaodian.central.service.impl.ProvinceServiceImpl.getProvinceByID(ProvinceServiceImpl.java:14)
at sun.reflect.GeneratedMethodAccessor689.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:307)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:182)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:149)
at org.springframework.aop.framework.adapter.AfterReturningAdviceInterceptor.invoke(AfterReturningAdviceInterceptor.java:50)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.adapter.MethodBeforeAdviceInterceptor.invoke(MethodBeforeAdviceInterceptor.java:50)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:89)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:171)
at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:204)
at $Proxy125.getProvinceByID(Unknown Source)
at com.yihaodian.central.action.home.DetailAction.execute(DetailAction.java:566)
at sun.reflect.GeneratedMethodAccessor723.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.opensymphony.xwork2.DefaultActionInvocation.invokeAction(DefaultActionInvocation.java:404)
at com.opensymphony.xwork2.DefaultActionInvocation.invokeActionOnly(DefaultActionInvocation.java:267)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:229)
at com.yihaodian.central.interceptor.ContinuePurchaseInterceptor.intercept(ContinuePurchaseInterceptor.java:58)
at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
at com.yihaodian.central.interceptor.CalInterceptor.intercept(CalInterceptor.java:74)
... 73 more
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
at com.yihaodian.front.kernel.spring.YihaodianSplitDbDataSource.getConnection(YihaodianSplitDbDataSource.java:16)
at org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource.getConnection(AbstractRoutingDataSource.java:133)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:113)
at org.springframework.orm.ibatis.SqlMapClientTemplate.execute(SqlMapClientTemplate.java:190)
... 118 more
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134)
at org.apache.commons.dbcp.AbandonedObjectPool.borrowObject(AbandonedObjectPool.java:79)
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
... 123 more

檢查process數量,並沒有超過最大連線值;
對應的資料庫壓力也不大,沒有嚴重的等待事件,且所在server可以ping通;
最後才想起檢查作業系統配置
[oracle@test ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 268288
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 268288
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
檢視/etc/security/limits.conf檔案,裡面沒有關於oracle的配置資訊,於是手工新增如下資訊
oracle soft nofile 2047
oracle hard nofile 65536
oracle soft nproc 1024
oracle hard nproc 65536
退出儲存
並不確定這種調整是否有效,但是連續好幾次都神奇的解決了此類問題,有可能是巧合。
關於如下兩個引數
- nofile - max number of open files
- nproc - max number of processes
進行一個實驗
 
 
先在oracle使用者下建立一個crontab job
[oracle@testdb ~]$ crontab -l
* */1 * * * sh /home/oracle/tools/test.sh >> /home/oracle/test1.log
內容如下
[oracle@testdb tools]$ more test.sh
#Enviroment define.
source /home/oracle/tools/SET_JOB_ENV.SH
#following is excution part
sqlplus / as sysdba@justin<exec test_shell_proc();
exec test_shell_proc_param($test_id);
exit
eof


Nproc引數測試
1、
一開始把nproc設定成10
oracle soft nofile 131072
oracle hard nofile 131072
oracle soft nproc 10
oracle hard nproc 10

透過secureCRT以oracle使用者登入,一直hang在那裡;而root使用者則可以順利登入,進入到/var/log目錄
[root@testdb log]# tail -n 200 -f secure
Nov 8 16:49:40 testdb sshd[16536]: Accepted password for oracle from 192.168.16.173 port 1719 ssh2
Nov 8 16:49:40 testdb sshd[16536]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Nov 8 16:49:40 testdb sshd[16538]: fatal: setresuid 500: Resource temporarily unavailable
Nov 8 16:49:40 testdb sshd[16536]: pam_unix(sshd:session): session closed for user oracle
Nov 8 16:49:45 testdb sshd[16539]: Accepted password for oracle from 192.168.16.173 port 1720 ssh2
Nov 8 16:49:45 testdb sshd[16539]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Nov 8 16:49:45 testdb sshd[16541]: fatal: setresuid 500: Resource temporarily unavailable
Nov 8 16:49:45 testdb sshd[16539]: pam_unix(sshd:session): session closed for user oracle


[root@testdb log]# tail -n 200 -f cron
Nov 8 16:49:01 testdb crond[16535]: (CRON) setuid failed: (Resource temporarily unavailable)
Nov 8 16:49:01 testdb crond[16535]: CRON (oracle) ERROR: failed to open PAM security session: Resource temporarily unavailable
Nov 8 16:49:01 testdb crond[16535]: CRON (oracle) ERROR: cannot set security context
Nov 8 16:50:01 testdb crond[16548]: (root) CMD (/usr/lib64/sa/sa1 1 1)
Nov 8 16:50:01 testdb crond[16547]: (CRON) setuid failed: (Resource temporarily unavailable)
Nov 8 16:50:01 testdb crond[16547]: CRON (oracle) ERROR: failed to open PAM security session: Resource temporarily unavailable
Nov 8 16:50:01 testdb crond[16547]: CRON (oracle) ERROR: cannot set security context


2、
恢復hard nproc值,soft nproc依舊設定為10
oracle soft nproc 10
oracle hard nproc 131072
依舊無法透過oracle使用者登入,錯誤日誌提示也一樣。

3、
將soft nproc設定成131072
可以透過oracle使用者順利登入,crontab job可以順利執行
[root@testdb log]# tail -n 200 -f secure
Nov 8 16:53:33 testdb sshd[16581]: Accepted password for oracle from 192.168.16.173 port 1753 ssh2
Nov 8 16:53:33 testdb sshd[16581]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
[root@testdb log]# tail -n 200 -f cron
Nov 8 16:52:01 testdb crond[16568]: CRON (oracle) ERROR: failed to open PAM security session: Resource temporarily unavailable
Nov 8 16:52:01 testdb crond[16568]: CRON (oracle) ERROR: cannot set security context
Nov 8 16:53:01 testdb crond[16577]: (oracle) CMD (sh /home/oracle/tools/test.sh >> /home/oracle/test1.log)
Nov 8 16:54:01 testdb crond[16625]: (oracle) CMD (sh /home/oracle/tools/test.sh >> /home/oracle/test1.log)

小結: nproc設定過低,會導致無法以oracle使用者登入OS,且在/var/log/secure裡報告Resource temporarily unavailable錯誤;且oracle所屬的crontab job也無法執行

Nofile引數測試
1、
將其設定為2
oracle soft nofile 2
oracle hard nofile 2
oracle soft nproc 131072
oracle hard nproc 131072

此時透過secureCRT連線資料庫,彈出錯誤視窗,提示“socket: Too many open files”
[root@testdb log]# tail -n 10 -f secure
Nov 8 16:41:59 testdb sshd[16392]: Accepted password for oracle from 192.168.16.173 port 1651 ssh2
Nov 8 16:41:59 testdb sshd[16392]: pam_unix(sshd:session): session opened for user oracle by (uid=0)
Nov 8 16:41:59 testdb sshd[16394]: Disconnecting: socket: Too many open files
Nov 8 16:41:59 testdb sshd[16392]: pam_unix(sshd:session): session closed for user oracle

[root@testdb log]# tail -n 200 -f cron
Nov 8 16:44:01 testdb crond[16424]: System error
Nov 8 16:44:01 testdb crond[16424]: CRON (oracle) ERROR: failed to open PAM security session: Too many open files
Nov 8 16:44:01 testdb crond[16424]: CRON (oracle) ERROR: cannot set security context

2、
改為20
則可以登入
oracle soft nofile 20
oracle hard nofile 20

[root@testdb log]# tail -n 10 -f secure
Nov 8 16:44:45 testdb sshd[16426]: Accepted password for oracle from 192.168.16.173 port 1671 ssh2
Nov 8 16:44:45 testdb sshd[16426]: pam_unix(sshd:session): session opened for user oracle by (uid=0)

[root@testdb log]# tail -n 200 -f cron
Nov 8 16:44:01 testdb crond[16424]: System error
Nov 8 16:44:01 testdb crond[16424]: CRON (oracle) ERROR: failed to open PAM security session: Too many open files
Nov 8 16:44:01 testdb crond[16424]: CRON (oracle) ERROR: cannot set security context
Nov 8 16:45:01 testdb crond[16468]: (oracle) CMD (sh /home/oracle/tools/test.sh >> /home/oracle/test1.log)
可以執行crontab job了

3、
將soft設定為10
oracle soft nofile 10
oracle hard nofile 20
依舊無法登入
[root@testdb log]# tail -n 10 -f secure
Nov 8 16:46:47 testdb sshd[16504]: Accepted password for oracle from 192.168.16.173 port 1694 ssh2
Nov 8 16:47:01 testdb crond[16512]: pam_succeed_if(crond:session): error retrieving information about user 0
Nov 8 16:47:01 testdb crond[16512]: pam_unix(crond:session): session opened for user oracle by (uid=0)
Nov 8 16:47:01 testdb crond[16512]: PAM audit_open() failed: Too many open files
Nov 8 16:47:01 testdb crond[16512]: pam_succeed_if(crond:session): error retrieving information about user 0
Nov 8 16:47:01 testdb crond[16512]: pam_unix(crond:session): session closed for user oracle
Nov 8 16:47:01 testdb crond[16512]: PAM audit_open() failed: Too many open files
[root@testdb log]# tail -n 200 -f cron
Nov 8 16:47:01 testdb crond[16512]: System error
Nov 8 16:47:01 testdb crond[16512]: CRON (oracle) ERROR: failed to open PAM security session: Too many open files
Nov 8 16:47:01 testdb crond[16512]: CRON (oracle) ERROR: cannot set security context
Nov 8 16:48:01 testdb crond[16521]: System error
Nov 8 16:48:01 testdb crond[16521]: CRON (oracle) ERROR: failed to open PAM security session: Too many open files
Nov 8 16:48:01 testdb crond[16521]: CRON (oracle) ERROR: cannot set security context

但是可以透過pl/sql developer遠端連線資料庫。

小結:nofile設定過低,同樣也會導致無法以oracle使用者登入OS,且報告錯誤 Too many open files,同時crontab job也無法正常執行;soft值設定過低同樣會導致相應程式無法正常執行


來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/15480802/viewspace-720722/,如需轉載,請註明出處,否則將追究法律責任。

相關文章