[Troubleshooting] Oracle process running out of OS kernel I/O resources

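Posted by secooler on 2010-11-14
Today, on one node of a RAC database, the oracle user could no longer log in over ssh. Even logging in as root and switching to oracle with su did not work.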
[root@secodb2 ~]# su - oracle
su: cannot set user id: Resource temporarily unavailable
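
The message above is what su prints when the setuid call fails with EAGAIN, which on Linux commonly happens when the target user has reached its per-user process limit (nproc). The post does not say which limit applied on this node, so the following is only a sketch of how one might look it up, assuming the limits are set in the usual /etc/security/limits.conf format. The name nproc_limits_for is made up for this example.

```shell
# Print the nproc entries for one user from limits.conf-style text on stdin.
# Output: one "<type> <value>" pair per line (type is soft or hard).
# nproc_limits_for is a hypothetical helper name, not a system tool.
nproc_limits_for() {
    awk -v u="$1" '$1 == u && $3 == "nproc" { print $2, $4 }'
}

# Possible use on a live node (run as root, since su to the user fails):
#   nproc_limits_for oracle < /etc/security/limits.conf
#   ps -L -u oracle --no-headers | wc -l    # threads count toward nproc too
```

The helper matches only entries that name the user explicitly; wildcard (*) and group (@group) entries would need to be checked separately.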

1. Symptoms
1) Number of oracle processes on the problem node
[root@secodb2 bdump]# ps -ef |grep oracle | wc -l
2089

2) The vast majority of them are processes like the following
oracle     888     1  0 Oct30 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle     895     1  0 Oct23 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle     924     1  0 Nov10 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle     952     1  0 Oct26 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle     961     1  0 Nov09 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle     971     1  0 Oct26 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle     991     1  0 Oct25 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle    1031     1  0 Nov11 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle    1046     1  0 Nov06 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle    1060     1  0 Oct20 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle    1064     1  0 Oct28 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle    1074     1  0 Oct24 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle    1078     1  0 Oct20 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle    1148     1  0 Nov07 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check
oracle    1159     1  0 Nov04 ?        00:00:00 /oracle/crs/oracle/product/10.2.0/crs/bin/racgmain check

3) Number of oracle processes on the healthy node
secodb1@secodb1 /home/oracle$ ps -ef | grep oracle | wc -l
150
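
Note that `ps -ef | grep oracle` also counts the grep process itself and any line that merely contains the string "oracle". To see which command accounts for the bulk of the processes, the listing can be grouped by command line. A minimal sketch, assuming a procps-style ps that accepts `-eo user=,args=`; count_by_command is a name invented for this example.

```shell
# Count processes owned by one user, grouped by full command line,
# most frequent first. Reads "user args..." lines on stdin so it can
# also be run against saved ps output.
# count_by_command is a hypothetical helper name.
count_by_command() {
    awk -v u="$1" '$1 == u { $1 = ""; sub(/^ +/, ""); print }' |
        sort | uniq -c | sort -rn
}

# Possible use on a live node:
#   ps -eo user=,args= | count_by_command oracle | head
```

On a node in the state shown above, the first line of the output would be expected to be the racgmain check command with a count in the thousands.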

2. Analysis
The trace file related to this problem is shown below.
[root@secodb2 bdump]# vi secodb2_dbw0_8053.trc
/oracle/app/oracle/admin/secodb/bdump/secodb2_dbw0_8053.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.3.0 - 64bit Production
With the Partitioning, Real Application Clusters and Data Mining options
ORACLE_HOME = /oracle/app/oracle/product/10.2.0/db_1
System name:    Linux
Node name:      secodb2
Release:        2.6.18-53.el5xen
Version:        #1 SMP Wed Oct 10 16:48:44 EDT 2007
Machine:        x86_64
Instance name: secodb2
Redo thread mounted by this instance: 2
Oracle process number: 10
Unix process pid: 8053, image: oracle@secodb2 (DBW0)

*** 2010-08-31 22:06:10.227
*** SERVICE NAME:(SYS$BACKGROUND) 2010-08-31 22:06:10.221
*** SESSION ID:(877.1) 2010-08-31 22:06:10.221
WARNING:Oracle process running out of OS kernel I/O resources
*** 2010-09-01 12:33:41.918
WARNING:Oracle process running out of OS kernel I/O resources
*** 2010-09-01 17:02:49.041
WARNING:Oracle process running out of OS kernel I/O resources
*** 2010-10-09 06:02:09.697
WARNING:Oracle process running out of OS kernel I/O resources
WARNING:Oracle process running out of OS kernel I/O resources
WARNING:Oracle process running out of OS kernel I/O resources
*** 2010-10-11 17:41:09.396
WARNING:Oracle process running out of OS kernel I/O resources
WARNING:Oracle process running out of OS kernel I/O resources
WARNING:Oracle process running out of OS kernel I/O resources
*** 2010-10-15 17:41:18.121
WARNING:Oracle process running out of OS kernel I/O resources
...... (many more repeated lines omitted) ......
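
To get a sense of how often the warning fires, the trace can be summarized per day. A small sketch, assuming the trace keeps the layout shown above, where a "*** YYYY-MM-DD hh:mm:ss" line dates the warnings that follow it; warnings_per_day is a hypothetical name.

```shell
# Count "running out of OS kernel I/O resources" warnings per day.
# Reads trace text on stdin. A line of the form "*** YYYY-MM-DD ..."
# sets the date that is applied to the warnings after it.
# warnings_per_day is a hypothetical helper name.
warnings_per_day() {
    awk '
        /^\*\*\* [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / { day = $2; next }
        /running out of OS kernel I\/O resources/ { count[day]++ }
        END { for (d in count) print d, count[d] }
    ' | sort
}

# Possible use:
#   warnings_per_day < secodb2_dbw0_8053.trc
```

The "*** SERVICE NAME" and "*** SESSION ID" lines do not match the date pattern, so they do not change the current day.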

3. Cause
This issue is documented on MOS in "Bug 6087207 - False WARNING in alert log indicating lack of OS KERNEL I/O RESOURCES [ID 6087207.8]". The relevant content is excerpted below for reference.

Bug 6087207  False WARNING in alert log indicating lack of OS KERNEL I/O RESOURCES

 This note gives a brief overview of bug 6087207.
 The content was last updated on: 02-APR-2008

Affects:

Product (Component) Oracle Server (Rdbms)
Range of versions believed to be affected Versions < 11
Versions confirmed as being affected
Platforms affected
  • Linux 32bit
  • Linux Itanium
  • Linux X86-64bit

Fixed:

This issue is fixed in

Symptoms:

Related To:

  • (None Specified)

Description

Note:  This fix can cause a crash in DBW and 
has been superceeded by the fix for .

The Alert log can contain messages of the form.:
WARNING:ORACLE PROCESS RUNNING OUT OF OS KERNEL I/O RESOURCES
when there is no indication of a resource issue in the OS.

Note:
This problem only affects platforms that preallocate resources
to be used for asynchronous IO (eg: Linux).
Please note: The above is a summary description only. Actual symptoms can vary. Matching to any symptoms here does not confirm that you are encountering this problem. Always consult with Oracle Support for advice.
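
The note describes the warning as false when the OS shows no sign of a resource problem. On Linux, one way to look for such a sign is to compare the number of asynchronous I/O events currently reserved system-wide (/proc/sys/fs/aio-nr) with the allowed maximum (/proc/sys/fs/aio-max-nr). This check is not part of the MOS note; it is only a sketch, and aio_usage is a made-up helper name.

```shell
# Report async I/O usage as "<used> <max> <percent>".
# $1 = current aio-nr value, $2 = aio-max-nr value.
# aio_usage is a hypothetical helper name.
aio_usage() {
    used="$1"
    max="$2"
    if [ "$max" -le 0 ]; then
        echo "invalid aio-max-nr: $max" >&2
        return 1
    fi
    # Integer percentage, rounded down.
    echo "$used $max $(( used * 100 / max ))"
}

# Possible use on a live Linux node:
#   aio_usage "$(cat /proc/sys/fs/aio-nr)" "$(cat /proc/sys/fs/aio-max-nr)"
```

A percentage well below 100 while the warning keeps appearing would be consistent with the false-warning behavior the note describes, though it does not by itself confirm the bug.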

4. Summary
Early Oracle 10g releases have quite a few RAC-related bugs. For RAC environments, upgrading to the latest release is recommended.

Good luck.

secooler
10.11.14

-- The End --

From the ITPUB blog. Link: http://blog.itpub.net/519536/viewspace-678137/. If you repost this article, please credit the source; otherwise legal action may be taken.
