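Case study: CRS-5818, gpnpd and mdnsd processes fail to start
After starting the cluster with crsctl start crs, the resources simply would not come up, and rebooting the machine made no difference. The logs showed the following: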
2016-05-12 20:22:18.341:
[gpnpd(4653074)]CRS-2329:GPNPD on node pos5gpp1 shutdown.
2016-05-12 20:24:15.080:
[/grid/app/11.2/bin/oraagent.bin(5832974)]CRS-5818:Aborted command 'start' for resource 'ora.gpnpd'. Details at (:CRSAGF00113:) {0:0:2} in /grid/app/11.2/log/pos5gpp1/agent/ohasd/oraagent_grid/oraagent_grid.log.
2016-05-12 20:24:19.084:
[ohasd(2359742)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.gpnpd'. Details at (:CRSPE00111:) {0:0:2} in /grid/app/11.2/log/pos5gpp1/ohasd/ohasd.log.
2016-05-12 20:24:20.560:
[mdnsd(6619232)]CRS-5602:mDNS service stopping by request.
2016-05-12 20:26:20.471:
[/grid/app/11.2/bin/oraagent.bin(5832976)]CRS-5818:Aborted command 'start' for resource 'ora.mdnsd'. Details at (:CRSAGF00113:) {0:0:2} in /grid/app/11.2/log/pos5gpp1/agent/ohasd/oraagent_grid/oraagent_grid.log.
2016-05-12 20:26:24.476:
[ohasd(2359742)]CRS-2757:Command 'Start' timed out waiting for response from the resource 'ora.mdnsd'. Details at (:CRSPE00111:) {0:0:2} in /grid/app/11.2/log/pos5gpp1/ohasd/ohasd.log.
2016-05-12 20:26:29.033:
[gpnpd(3539086)]CRS-2329:GPNPD on node pos5gpp1 shutdown

[ clsdmc][1800]Fail to connect (ADDRESS=(PROTOCOL=ipc)(KEY=pos5gpp1DBG_MDNSD)) with status 9
2016-05-12 20:46:05.720: [ora.mdnsd][1800]{0:0:2} [start] Error = error 9 encountered when connecting to MDNSD
2016-05-12 20:46:06.720: [ora.mdnsd][1800]{0:0:2} [start] without returnbuf
2016-05-12 20:46:06.884: [ COMMCRS][1034]clsc_connect: (111ed9190) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=pos5gpp1DBG_MDNSD))
================================================================================
2016-05-12 20:47:19.870: [ default][1]gpnpd START pid=6357104 Oracle Grid Plug-and-Play Daemon
2016-05-12 20:47:19.870: [ GPNP][1]clsgpnp_Init: [at clsgpnp0.c:586 clsgpnp_Init] '/grid/app/11.2' in effect as GPnP home base.
2016-05-12 20:47:19.870: [ GPNP][1]clsgpnp_Init: [at clsgpnp0.c:632 clsgpnp_Init] GPnP pid=6357104, GPNP comp tracelevel=1, depcomp tracelevel=0, tlsrc:ORA_DAEMON_LOGGING_LEVELS, apitl:0, complog:1, tstenv:0, devenv:0, envopt:0, flags=3
2016-05-12 20:47:22.896: [ GPNP][1]clsgpnpkwf_initwfloc: [at clsgpnpkwf.c:399 clsgpnpkwf_initwfloc] Using FS Wallet Location : /grid/app/11.2/gpnp/pos5gpp1/wallets/peer/
[ CLWAL][1]clsw_Initialize: OLR initlevel [70000]
2016-05-12 20:47:22.988: [ COMMCRS][1029]clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=pos5gpp1DBG_GPNPD))
2016-05-12 20:47:22.988: [ clsdmt][772]Fail to listen to (ADDRESS=(PROTOCOL=ipc)(KEY=pos5gpp1DBG_GPNPD))
2016-05-12 20:47:22.988: [ clsdmt][772]Terminating process
2016-05-12 20:47:22.988: [ GPNP][772]CLSDM requested exit
2016-05-12 20:47:22.988: [ default][772]GPNPD on node pos5gpp1 shutdown.
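The decisive lines in the gpnpd trace are the clsclisten and clsdmt ones: the daemon is refused permission to listen on its IPC endpoint and then exits. A minimal sketch for pulling the refused endpoint keys out of a log, assuming the log text is piped in on stdin. The function name `denied_ipc_keys` is hypothetical, and the log path in the usage comment is an assumption based on the log directories named in the messages above.

```shell
# Print each IPC KEY for which clsclisten reported "Permission denied".
# Reads log text on stdin; matches the message format seen in this trace.
denied_ipc_keys() {
    sed -n 's/.*clsclisten: Permission denied for (ADDRESS=(PROTOCOL=ipc)(KEY=\([^)]*\))).*/\1/p' | sort -u
}

# Usage (path is an assumption, adjust to your Grid home and host name):
#   denied_ipc_keys < /grid/app/11.2/log/pos5gpp1/gpnpd/gpnpd.log
```

In this case the key is pos5gpp1DBG_GPNPD, and the socket directory listed further down contains a file with the same name prefixed by `s`, which is the file worth checking first.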
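A search on MetaLink turned up the solution in "Diagnosing Grid Infrastructure Startup Issues (Doc ID 1623340.1)".
When the permissions or ownership of the network socket files are set incorrectly, errors like those above are the usual result.
The network socket files may be located in /tmp/.oracle, /var/tmp/.oracle or /usr/tmp/.oracle.
Checking the ownership of our socket files (almost all of them are owned by oracle, but the CRS files should be owned by grid) <-- this is where the problem lies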
grid@aaggpp1:/grid/app/11.2/bin/>ls -l /tmp/.oracle
total 8
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 mdnsd
-rw-r--r-- 1 oracle oinstall 8 Sep 1 2015 mdnsd.pid
prw-r--r-- 1 oracle oinstall 0 Jun 10 2015 npohasd
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 ora_gipc_GPNPD_pos5gpp1
-rw-r--r-- 1 oracle oinstall 0 Jun 10 2015 ora_gipc_GPNPD_pos5gpp1_lock
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 ora_gipc_gipcd_pos5gpp1
-rw-r--r-- 1 oracle oinstall 0 Sep 1 2015 ora_gipc_gipcd_pos5gpp1_lock
srwxrwxrwx 1 root system 0 Jan 4 02:17 ora_gipc_spos5gpp1gridpos5gpp-clusterCRFM_CLIIPC
-rw-r--r-- 1 oracle oinstall 0 Jun 10 2015 ora_gipc_spos5gpp1gridpos5gpp-clusterCRFM_CLIIPC_lock
srwxrwxrwx 1 root system 0 Jan 4 02:17 ora_gipc_spos5gpp1gridpos5gpp-clusterCRFM_SIPC
-rw-r--r-- 1 oracle oinstall 0 Jun 10 2015 ora_gipc_spos5gpp1gridpos5gpp-clusterCRFM_SIPC_lock
srwxrwxrwx 1 oracle oinstall 0 Sep 21 2015 s#48758948.1
srwxrwxrwx 1 oracle oinstall 0 Jun 8 2015 s#66978252.1
srwxrwxrwx 1 oracle oinstall 0 Sep 23 2015 s#8455124.1
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 sAevm
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 sCRSD_IPC_SOCKET_11
-rw-r--r-- 1 oracle oinstall 0 Jun 10 2015 sCRSD_IPC_SOCKET_11_lock
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 sCRSD_UI_SOCKET
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 sCevm
srwxrwxrwx 1 oracle oinstall 0 Sep 23 2015 sEXTPROC1521
srwxrwxrwx 1 oracle oinstall 0 Sep 21 2015 sLISTENER
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 sOCSSD_LL_pos5gpp1_
-rw-r--r-- 1 oracle oinstall 0 Jun 10 2015 sOCSSD_LL_pos5gpp1__lock
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 sOCSSD_LL_pos5gpp1_pos5gpp-cluster
-rw-r--r-- 1 oracle oinstall 0 Jun 10 2015 sOCSSD_LL_pos5gpp1_pos5gpp-cluster_lock
srwxrwxrwx 1 root system 0 May 12 20:13 sOHASD_IPC_SOCKET_11
-rw-r--r-- 1 oracle oinstall 0 Jun 10 2015 sOHASD_IPC_SOCKET_11_lock
srwxrwxrwx 1 root system 0 May 12 20:13 sOHASD_UI_SOCKET
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 sOracle_CSS_LclLstnr_pos5gpp-cluster_1
-rw-r--r-- 1 oracle oinstall 0 Jun 10 2015 sOracle_CSS_LclLstnr_pos5gpp-cluster_1_lock
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 sSYSTEM.evm.acceptor.auth
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 sora_crsqs
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 spos5gpp1DBG_CRSD
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 spos5gpp1DBG_CSSD
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 spos5gpp1DBG_CTSSD
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 spos5gpp1DBG_EVMD
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 spos5gpp1DBG_GIPCD
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 spos5gpp1DBG_GPNPD
srwxrwxrwx 1 root system 0 Dec 17 15:17 spos5gpp1DBG_LOGD
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 spos5gpp1DBG_MDNSD
srwxrwxrwx 1 root system 0 Jan 4 02:17 spos5gpp1DBG_MOND
srwxrwxrwx 1 root system 0 May 12 20:13 spos5gpp1DBG_OHASD
srwxrwxrwx 1 oracle oinstall 0 Sep 1 2015 sprocr_local_conn_0_PROC
-rw-r--r-- 1 oracle oinstall 0 Jun 10 2015 sprocr_local_conn_0_PROC_lock
srwxrwxrwx 1 root system 0 May 12 20:13 sprocr_local_conn_0_PROL
-rw-r--r-- 1 oracle oinstall 0 Jun 10 2015 sprocr_local_conn_0_PROL_lock
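With a listing this long, mismatched owners are easier to spot when filtered. A rough sketch that reads `ls -l` output on stdin and prints every entry whose owner is not in a list you pass in. The function name `unexpected_owners` is hypothetical, and which owners count as expected is an assumption you have to supply for your own installation (here grid and root); some entries may legitimately belong to another user.

```shell
# Print "owner name" for each ls -l entry whose owner is not in the allowed list.
# Reads an `ls -l` listing on stdin; allowed owners are given as arguments.
unexpected_owners() {
    awk -v allowed="$*" '
        BEGIN { n = split(allowed, a, " "); for (i = 1; i <= n; i++) ok[a[i]] = 1 }
        # Skip the "total" line; field 3 is the owner, the last field is the name.
        NF >= 9 && !($3 in ok) { print $3, $NF }
    '
}

# Usage:
#   ls -l /tmp/.oracle | unexpected_owners grid root
```

Parsing `ls -l` is fragile for names containing spaces, which is acceptable here because the socket file names do not contain any.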
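With the cause identified, there are two ways to fix it.
First, correct the file ownership in bulk.
Second, delete the socket files; CRS recreates them when it restarts.
Method one: bulk ownership change (the commands use relative file names, so they are run from inside the socket directory)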
chown grid:oinstall mdnsd
chown grid:oinstall mdnsd.pid
chown root:system npohasd
chown grid:oinstall ora_gipc_GPNPD_pos5gpp1
chown grid:oinstall ora_gipc_GPNPD_pos5gpp1_lock
chown grid:oinstall ora_gipc_gipcd_pos5gpp1
chown grid:oinstall ora_gipc_gipcd_pos5gpp1_lock
chown root:system ora_gipc_spos5gpp1gridpos5gpp-clusterCRFM_CLIIPC
chown root:system ora_gipc_spos5gpp1gridpos5gpp-clusterCRFM_CLIIPC_lock
chown root:system ora_gipc_spos5gpp1gridpos5gpp-clusterCRFM_SIPC
chown root:system ora_gipc_spos5gpp1gridpos5gpp-clusterCRFM_SIPC_lock
chown oracle:oinstall s#14287242.1
chown grid:oinstall s#48955460.1
chown grid:oinstall s#5046366.1
chown grid:oinstall sAevm
chown root:system sCRSD_IPC_SOCKET_11
chown root:system sCRSD_IPC_SOCKET_11_lock
chown root:system sCRSD_UI_SOCKET
chown grid:oinstall sCevm
chown grid:oinstall sLISTENER
chown grid:oinstall sLISTENER_SCAN1
chown grid:oinstall sOCSSD_LL_pos5gpp1_
chown grid:oinstall sOCSSD_LL_pos5gpp1__lock
chown grid:oinstall sOCSSD_LL_pos5gpp1_pos5gpp-cluster
chown grid:oinstall sOCSSD_LL_pos5gpp1_pos5gpp-cluster_lock
chown root:system sOHASD_IPC_SOCKET_11
chown root:system sOHASD_IPC_SOCKET_11_lock
chown root:system sOHASD_UI_SOCKET
chown grid:oinstall sOracle_CSS_LclLstnr_pos5gpp-cluster_2
chown grid:oinstall sOracle_CSS_LclLstnr_pos5gpp-cluster_2_lock
chown grid:oinstall sSYSTEM.evm.acceptor.auth
chown root:system sora_crsqs
chown root:system spos5gpp1DBG_CRSD
chown grid:oinstall spos5gpp1DBG_CSSD
chown root:system spos5gpp1DBG_CTSSD
chown grid:oinstall spos5gpp1DBG_EVMD
chown grid:oinstall spos5gpp1DBG_GIPCD
chown grid:oinstall spos5gpp1DBG_GPNPD
chown root:system spos5gpp1DBG_LOGD
chown grid:oinstall spos5gpp1DBG_MDNSD
chown root:system spos5gpp1DBG_MOND
chown root:system spos5gpp1DBG_OHASD
chown root:system sprocr_local_conn_0_PROC
chown root:system sprocr_local_conn_0_PROC_lock
chown root:system sprocr_local_conn_0_PROL
chown root:system sprocr_local_conn_0_PROL_lock
Method two: as the root user, stop GI, delete these socket files, and start GI again.
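Method two can be sketched as the three steps below. This is only an outline under stated assumptions: the helper names `run_step` and `reset_gi_sockets` are hypothetical, the socket directory is assumed to be /tmp/.oracle as in the listing above, and `crsctl stop crs` is assumed to be the way GI is stopped on your system. By default the helper only prints the commands; it executes them only when RUN=1 is set, and then it must be run as root.

```shell
# Hypothetical helper: echo a command unless RUN=1 is set, in which case execute it.
run_step() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# Stop GI, clear the socket directory, start GI again.
# The directory defaults to /tmp/.oracle; pass another one if your platform differs.
reset_gi_sockets() {
    sock_dir=${1:-/tmp/.oracle}
    # Never delete socket files while the stack is up: abort if the stop fails.
    run_step crsctl stop crs || return 1
    # Remove the files only; the directory itself must stay in place.
    run_step rm -f "$sock_dir"/*
    run_step crsctl start crs
}
```

Calling `reset_gi_sockets /tmp/.oracle` without RUN set is a safe way to review exactly which files would be removed before doing it for real.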
After applying the fix above, CRS returned to normal.
grid@aadgpp1:/home/grid/>crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
grid@aadgpp1:/home/grid/>ps -ef|grep d.bin
grid 4063236 1 0 21:37:23 - 0:00 /grid/app/11.2/bin/mdnsd.bin
root 5308482 1 0 21:38:05 - 0:01 /grid/app/11.2/bin/crsd.bin reboot
root 6029390 1 0 21:37:26 - 0:01 /grid/app/11.2/bin/osysmond.bin
root 6291706 1 0 21:37:07 - 0:01 /grid/app/11.2/bin/ohasd.bin reboot
grid 6422532 6815890 0 21:37:28 - 0:01 /grid/app/11.2/bin/ocssd.bin
grid 3342780 1 1 21:38:05 - 0:00 /grid/app/11.2/bin/evmd.bin
grid 5046692 1 0 21:37:24 - 0:00 /grid/app/11.2/bin/gpnpd.bin
grid 6095300 1 0 21:37:26 - 0:00 /grid/app/11.2/bin/gipcd.bin
root 6488494 1 0 21:37:56 - 0:00 /grid/app/11.2/bin/octssd.bin reboot
Note:
Caution:
After installation is complete, do not remove manually or run cron jobs that remove /tmp/.oracle or /var/tmp/.oracle directories or their files while Oracle software is running on the server. If you remove these files, then the Oracle software can encounter intermittent hangs. Oracle Clusterware installations can fail with the error:
CRS-0184: Cannot communicate with the CRS daemon.
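Remember:
Never delete these files while RAC is running!
If the files have been deleted, restarting CRS recreates them automatically.
If the directory itself has been deleted, it has to be rebuilt with the commands below.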
Create the /var/tmp and /var/tmp/.oracle directory:
/bin/mkdir -p /var/tmp/.oracle
/bin/chmod 01777 /var/tmp/
/bin/chown root /var/tmp/
/bin/chmod 01777 /var/tmp/.oracle
/bin/chown root /var/tmp/.oracle
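After rebuilding the directory it is worth confirming that the mode really took effect. A small sketch, assuming the intended result is the world-writable sticky mode 01777 set by the commands above; the function name `has_sticky_1777` is hypothetical.

```shell
# Succeed if the argument is a directory whose mode is rwxrwxrwt (01777).
has_sticky_1777() {
    # Keep only the first 10 characters so a trailing ACL marker is ignored.
    perms=$(ls -ld "$1" 2>/dev/null | awk '{ print substr($1, 1, 10) }')
    [ "$perms" = "drwxrwxrwt" ]
}

# Usage:
#   has_sticky_1777 /var/tmp/.oracle && echo "mode OK" || echo "mode needs fixing"
```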
Additional note (the hidden socket files):
The hidden directory '/var/tmp/.oracle' (or /tmp/.oracle on some platforms) or its content was removed while instances & the CRS stack were up and running. Typically this directory contains a number of "special" socket files that are used by local clients to connect via the IPC protocol (sqlnet) to various Oracle processes including the TNS listener, the CSS, CRS & EVM daemons or even database or ASM instances. These files are created when the "listening" process starts.
Source: ITPUB blog, link: http://blog.itpub.net/17086096/viewspace-2098909/. If you repost this article, please credit the source; otherwise legal liability will be pursued.