資料庫監聽不定期出現異常故障處理
作業系統為:SunOS 5.10
資料庫版本:ORACLE RAC 11.2.0.3.0
該主機上有2個庫,一套RAC的節點1在上面,還有另外一個庫,之所以使用的是DB下的監聽而沒用GRID下的監聽,
是為了避免停掉CRS時影響另外一個庫的使用。
故障現象:
資料庫監聽不定期出現異常 ,從應用tnsping資料庫,時間花費很長甚至連不上
故障分析處理過程:
從應用主機tnsping如下:
racdb1_scenemon$tnsping racdb_new
TNS Ping Utility for Solaris: Version
11.2.0.3.0 - Production on 19-2月
-2014 15:46:23
Copyright (c) 1997, 2011, Oracle. All rights
reserved.
已使用的引數檔案:
已使用 TNSNAMES 介面卡來解析別名
嘗試連線 (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = 11.111.11.1)(PORT =
1521)) (ADDRESS = (PROTOCOL = TCP)(HOST = 11.111.11.2)(PORT = 1521))
(LOAD_BALANCE = yes) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = racdb)
(FAILOVER_MODE = (TYPE = SELECT) (METHOD = BASIC) (RETRIES = 180) (DELAY =
5))))
OK (60000 毫秒)
從資料庫伺服器上檢視監聽狀態:
$ lsnrctl status
LSNRCTL for Solaris: Version 11.2.0.3.0 -
Production on 19-FEB-2014 15:45:58
Copyright (c) 1991, 2011, Oracle. All rights
reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
TNS-12535: TNS:operation timed out
TNS-12560: TNS:protocol adapter error
TNS-00505: Operation timed out
將監聽程式予以kill並重啟監聽:
$ ps -ef |grep tns
oracle 18504 18468 0 15:45:15 pts/10 0:00 grep
tns
oracle 24037 1 0 Nov 28 ? 6:08
/oracle/app/asm/11.2.0/grid/bin/tnslsnr LISTENER -inherit
oracle 9488 1 3 Jan 24 ? 4447:11
/oracle/app/db/product/11.2.0/db/bin/tnslsnr LISTENER -inherit
$
$ kill -9 9488
$ lsnrctl start
LSNRCTL for Solaris: Version 11.2.0.3.0 -
Production on 19-FEB-2014 15:48:22
Copyright (c) 1991, 2011, Oracle. All rights
reserved.
Starting
/oracle/app/db/product/11.2.0/db/bin/tnslsnr: please wait...
TNSLSNR for Solaris: Version 11.2.0.3.0 -
Production
Log messages written to
/oracle/app/db/product/11.2.0/db/log/diag/tnslsnr/racdb/listener/alert/log.xml
Listening on:
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=racdb)(PORT=1521)))
Connecting to
(ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Solaris: Version
11.2.0.3.0 - Production
Start Date 19-FEB-2014 15:48:22
Uptime 0 days 0 hr. 0 min. 0 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Log File
/oracle/app/db/product/11.2.0/db/log/diag/tnslsnr/racdb/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=racdb)(PORT=1521)))
The listener supports no services
The command completed successfully
檢查listener.log如下:
17-FEB-2014 16:02:26 *
(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=racdb)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))(CID=(PROGRAM=sqlplus@racdb1)(HOST=racdb1)(USER=tmn)))
* (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.89)(PORT=41603)) * establish * racdb *
0
17-FEB-2014 16:02:26 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=racdb)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))(CID=(PROGRAM=sqlplus@racdb1)(HOST=racdb1)(USER=tmn)))
* (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.89)(PORT=41659)) * establish * racdb *
12518
TNS-12518: TNS:listener could not hand off
client connection
TNS-12560: TNS:protocol adapter error
TNS-00530: Protocol adapter error
Solaris Error: 24: Too many open files
<<<<<<<<<
17-FEB-2014 16:02:26 *
(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=racdb)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))(CID=(PROGRAM=sqlplus@racdb1)(HOST=racdb1)(USER=tmn)))
* (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.89)(PORT=41652)) * establish * racdb *
0
17-FEB-2014 16:02:26 *
(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=racdb)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))(CID=(PROGRAM=sqlplus@racdb1)(HOST=racdb1)(USER=tmn)))
* (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.89)(PORT=41669)) * establish * racdb *
0
17-FEB-2014 16:02:26 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=racdb)(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=180)(DELAY=5))(CID=(PROGRAM=sqlplus@racdb1)(HOST=racdb1)(USER=tmn)))
* (ADDRESS=(PROTOCOL=tcp)(HOST=11.111.11.89)(PORT=41682)) * establish * racdb *
12518
TNS-12518: TNS:listener could not hand off
client connection
TNS-12560: TNS:protocol adapter error
TNS-00530: Protocol adapter error
Solaris Error: 24: Too many open files
<<<<<<<<<<
Too many open files意味著Maximum Number Of Open Files Per Process 達到了上限。因此listener hang住的原因可能是該limit設定過小,進一步檢視硬體限制如下:
$ ulimit -Ha
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) unlimited
coredump(blocks) unlimited
nofiles(descriptors) 65536
memory(kbytes) unlimited
檢視軟體限制如下:
$ ulimit -Sa
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 256
memory(kbytes) unlimited
檢視/etc/system如下:
*ident "@(#)system 1.18 97/06/27 SMI" /* SVR4 1.5 */
*
* SYSTEM SPECIFICATION FILE
*
* moddir:
*
* Set the search path for modules. This has a format similar to the
* csh path variable. If the module isn't found in the first directory
* it tries the second and so on. The default is /kernel /usr/kernel
*
* Example:
* moddir: /kernel /usr/kernel /other/modules
* root device and root filesystem configuration:
*
* The following may be used to override the defaults provided by
* the boot program:
*
* rootfs: Set the filesystem type of the root.
*
* rootdev: Set the root device. This should be a fully
* expanded physical pathname. The default is the
* physical pathname of the device where the boot
* program resides. The physical pathname is
* highly platform and configuration dependent.
*
* Example:
* rootfs:ufs
* rootdev:/sbus@1,f8000000/esp@0,800000/sd@3,0:a
*
* (Swap device configuration should be specified in /etc/vfstab.)
* exclude:
*
* Modules appearing in the moddir path which are NOT to be loaded,
* even if referenced. Note that `exclude' accepts either a module name,
* or a filename which includes the directory.
*
* Examples:
* exclude: win
* exclude: sys/shmsys
* forceload:
*
* Cause these modules to be loaded at boot time, (just before mounting
* the root filesystem) rather than at first reference. Note that
* forceload expects a filename which includes the directory. Also
* note that loading a module does not necessarily imply that it will
* be installed.
*
* Example:
* forceload: drv/foo
* set:
*
* Set an integer variable in the kernel or a module to a new value.
* This facility should be used with caution. See system(4).
*
* Examples:
*
* To set variables in 'unix':
*
* set nautopush=32
* set maxusers=40
*
* To set a variable named 'debug' in the module named 'test_module'
*
* set test_module:debug = 0x13
* Begin FJSVssf (do not edit)
set ftrace_atboot = 1
set kmem_flags = 0x100
set kmem_lite_maxalign = 8192
* End FJSVssf (do not edit)
forceload: drv/fjpfca
* Begin MDD root info (do not edit)
rootdev:/pseudo/md@0:0,0,blk
* End MDD root info (do not edit)
*** Begin EMCpower added lines *** DO NOT EDIT BELOW THIS LINE ***
forceload: drv/emcpsf
forceload: drv/sd
forceload: drv/ssd
forceload: drv/emcp
forceload: misc/emcpgpx
forceload: misc/emcpmpx
forceload: misc/emcpvlumd
forceload: misc/emcpxcrypt
forceload: misc/emcpdm
forceload: misc/emcpioc
set emcp:bPxEnableInit=1
*** End EMCpower added lines *** DO NOT EDIT ABOVE THIS LINE ***
exec_user_stack = 1
set noexec_user_stack_log = 1
檢視/etc/project如下:
system:0::::
user.root:1::::
noproject:2::::
default:3::::
group.staff:10::::
user.oracle:100::oracle::
綜上所述,從listener.log的輸出看,nofiles的soft limit過小,只有256,這會導致Solaris Error: 24: Too many open files 錯誤。從/etc/system和/etc/project,我們沒有看到設定了soft limit。
故障處理小結及後續建議
將oracle使用者的soft limit提升為至少1024,然後重新oracle使用者登入,檢驗ulimit合格後,重新啟動資料庫和監聽。
具體解決辦法如下:
1、在/etc/system增加以下行
set rlim_fd_max=65536
set rlim_fd_cur=4096
2、重新登入ORACLE並檢驗oracle使用者的限制
su – oracle
ulimit -Ha
ulimit –Sa
3、重新啟動資料庫和監聽
來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/30633755/viewspace-2127667/,如需轉載,請註明出處,否則將追究法律責任。
相關文章
- Oracle 監聽異常處理Oracle
- 資料庫監聽夯故障分析資料庫
- 資料庫連線異常處理思路資料庫
- spring boot 實現監聽器、過濾器、全域性異常處理Spring Boot過濾器
- [20210722]資料庫異常關閉的處理.txt資料庫
- 資料庫連線異常故障報告資料庫
- Sqoop匯入資料異常處理OOP
- 異常篇——異常處理
- springboot統一異常處理及返回資料的處理Spring Boot
- 處理動態分割槽時出現的異常
- 異常處理
- MapReduce之----往hbase資料庫寫入資料時, 出現資料異常資料庫
- JSP 異常處理如何處理?JS
- 異常錯誤資訊處理
- 異常-throws的方式處理異常
- SpringBoot實現統一異常處理Spring Boot
- React 異常處理React
- JS異常處理JS
- oracle異常處理Oracle
- Python——異常處理Python
- Python異常處理Python
- ThinkPHP 異常處理PHP
- JavaScript 異常處理JavaScript
- JAVA 異常處理Java
- 異常的處理
- golang - 異常處理Golang
- 異常處理2
- 異常處理1
- Java 異常處理Java
- Abp 異常處理
- JAVA異常處理Java
- 08、異常處理
- SpringMVC異常處理SpringMVC
- springBoot資料校驗與統一異常處理Spring Boot
- 啟動資料庫監聽資料庫
- Mysql資料庫監聽binlogMySql資料庫
- 異常處理機制(二)之異常處理與捕獲
- 鴻蒙輕核心M核的故障管家:Fault異常處理鴻蒙
- shell埠監聽異常郵箱告警