root.sh fails with "Failure at final check of Oracle CRS stack 10"

Posted by paulyibinyi on 2010-04-16

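Today at the office a colleague and I were testing an installation of Oracle 10g RAC on AIX 6.1 with ASM. After installing the CRS clusterware, running root.sh on the first node ended with the following error: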

failed at Failure at final check of Oracle CRS stack

 10

The logs under $ORA_CRS_HOME/log/p520/client showed the following errors:

[root@p520:/crs/app/oracle/product/crs_1/log/p520/client]#more clsc.log
Oracle Database 10g CRS Release 10.2.0.1.0 Production Copyright 1996, 2005 Oracle.  All rights reserved.
2010-04-16 16:55:05.812: [ COMMCRS][1]clsc_connect: (11068c330) no listener at (ADDRESS=(PROTOCOL=IPC)(KEY=CRSD_UI_SOCKET))

2010-04-16 16:55:05.813: [ COMMCRS][1]clsc_connect: (11068b6b0) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=SYSTEM.evm.acceptor.auth))

2010-04-16 16:55:05.813: [ default][1]Terminating clsd session
2010-04-16 18:41:53.148: [ COMMCRS][1]clsc_connect: (1101bf070) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_p520_crs))

2010-04-16 18:41:53.149: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9

2010-04-16 18:41:53.149: [ default][1]Terminating clsd session
2010-04-16 17:42:17.881: [ COMMCRS][1]clsc_connect: (1101bf070) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_p520_crs))

2010-04-16 17:42:17.882: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9

2010-04-16 17:42:17.882: [ default][1]Terminating clsd session
2010-04-16 17:45:21.416: [ COMMCRS][1]clsc_connect: (1101bf070) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_p520_crs))

2010-04-16 17:45:21.519: [ CSSCLNT][1]clsssInitNative: connect failed, rc 9
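
The repeated "no listener" lines are easier to read when grouped by IPC key. The following is a minimal sketch, not part of the original troubleshooting: the function name count_no_listener_keys is hypothetical, and it assumes the log format shown above, where each failing endpoint appears as (KEY=...).

```shell
# Hypothetical helper: count "no listener" errors per IPC key in a clsc.log-style file.
# Reads the log on stdin and prints "<count> <key>" lines, most frequent first.
count_no_listener_keys() {
    grep 'no listener at' \
        | sed -n 's/.*(KEY=\([^)]*\)).*/\1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{ print $1, $2 }'
}
```

It could be used as `count_no_listener_keys < $ORA_CRS_HOME/log/p520/client/clsc.log`. In the excerpt above, the key that repeats is OCSSD_LL_p520_crs, which suggests that the CSS daemon was not listening.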

crsctl check boot reported no errors.

/etc/init.cssd reported no errors.

The CRS processes would not start.

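With no other option, I searched Metalink and found that the cause might be that the AIX GUI tool had never been configured. The suggestion was to run the install_assist command in a graphical session, change the time settings there, and reboot the operating system once the configuration was done. After the reboot I started CRS and found that the evmd process had not come up.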

[oracle@p520:/oracle/app/oracle]$ps -ef|grep crs
    root 229546      1   0 17:45:37      -  0:01 /crs/app/oracle/product/crs_1/bin/crsd.bin reboot
  oracle 327874 569510   0 19:00:57  pts/1  0:00 grep crs
  oracle 389122 360648   0 17:45:41      -  0:00 /bin/sh -c ulimit -c unlimited; cd /crs/app/oracle/product/crs_1/log/p520/cssd;  /crs/app/oracle/product/crs_1/bin/ocssd  || exit $?
    root 409806      1   0 17:46:06      -  0:00 /bin/sh /etc/init.crsd run
  oracle 454904 389122   0 17:45:41      -  0:00 /crs/app/oracle/product/crs_1/bin/ocssd.bin
    root 524338 418004   0 17:45:40      -  0:00 /crs/app/oracle/product/crs_1/bin/oprocd run -t 1000 -m 500

After running init.crs start, the evmd process started normally:

[root@p520:/etc]#init.crs start

[root@p520:/etc]#init.crs start
Startup will be queued to init within 30 seconds.
[root@p520:/etc]#ps -ef|grep crs
    root 229546      1   0 17:45:37      -  0:01 /crs/app/oracle/product/crs_1/bin/crsd.bin reboot
  oracle 389122 360648   0 17:45:41      -  0:00 /bin/sh -c ulimit -c unlimited; cd /crs/app/oracle/product/crs_1/log/p520/cssd;  /crs/app/oracle/product/crs_1/bin/ocssd  || exit $?
    root 409806      1   0 17:46:06      -  0:00 /bin/sh /etc/init.crsd run
  oracle 454904 389122   0 17:45:41      -  0:00 /crs/app/oracle/product/crs_1/bin/ocssd.bin
    root 524338 418004   0 17:45:40      -  0:00 /crs/app/oracle/product/crs_1/bin/oprocd run -t 1000 -m 500
    root 569568 421918   0 19:02:19  pts/1  0:00 grep crs
[root@p520:/etc]#ps -ef|grep crs
  oracle 208974 323776   0 19:02:22      -  0:00 /crs/app/oracle/product/crs_1/bin/evmlogger.bin -o /crs/app/oracle/product/crs_1/evm/log/evmlogger.info -l /crs/app/oracle/product/crs_1/evm/log/evmlogger.log
    root 229546      1   1 17:45:37      -  0:01 /crs/app/oracle/product/crs_1/bin/crsd.bin reboot
  oracle 323776      1   0 17:45:53      -  0:00 /crs/app/oracle/product/crs_1/bin/evmd.bin
  oracle 389122 360648   0 17:45:41      -  0:00 /bin/sh -c ulimit -c unlimited; cd /crs/app/oracle/product/crs_1/log/p520/cssd;  /crs/app/oracle/product/crs_1/bin/ocssd  || exit $?
    root 409806      1   0 17:46:06      -  0:00 /bin/sh /etc/init.crsd run
  oracle 454904 389122   0 17:45:41      -  0:00 /crs/app/oracle/product/crs_1/bin/ocssd.bin
    root 524338 418004   0 17:45:40      -  0:00 /crs/app/oracle/product/crs_1/bin/oprocd run -t 1000 -m 500
    root 540718 421918   0 19:02:27  pts/1  0:00 grep crs

Checking the CRS status shows every resource ONLINE, as expected:

[oracle@p520:/oracle/app/oracle]$crs_stat -t
Name           Type           Target    State     Host       
------------------------------------------------------------
ora.p520.gsd   application    ONLINE    ONLINE    p520       
ora.p520.ons   application    ONLINE    ONLINE    p520       
ora.p520.vip   application    ONLINE    ONLINE    p520       
ora.p650.gsd   application    ONLINE    ONLINE    p650       
ora.p650.ons   application    ONLINE    ONLINE    p650       
ora.p650.vip   application    ONLINE    ONLINE    p650  
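
On a larger cluster the same check can be done without reading the table by eye. This is only a sketch: the function name list_not_online is hypothetical, and it assumes the `crs_stat -t` column layout shown above (Name, Type, Target, State, Host) and that resource names start with "ora.".

```shell
# Hypothetical helper: list resources whose Target or State is not ONLINE.
# Reads `crs_stat -t` style output on stdin; prints "<name> <target> <state>".
# Prints nothing when every resource is ONLINE/ONLINE.
list_not_online() {
    awk '$1 ~ /^ora\./ && ($3 != "ONLINE" || $4 != "ONLINE") { print $1, $3, $4 }'
}
```

Usage would be `crs_stat -t | list_not_online`; empty output corresponds to the healthy state shown above.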

The detailed explanation from Metalink follows:

Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.1
This problem can occur on any platform.

Symptoms

In a 2-node RAC environment, the CRS 10.2.0.1 installation fails at root.sh and the CRS stack does not start:

WARNING: directory '/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured
Setting the permissions on OCR backup directory
Setting up NS directories
Oracle Cluster Registry configuration upgraded successfully
WARNING: directory '/oracle' is not owned by root
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 1: ndb1 ndb1-rac ndb1
node 2: ndb2 ndb2-rac ndb2
Creating OCR keys for user 'root', privgrp 'system'..
Operation successful.
Now formatting voting device: /dev/rhdisk3
Format of 1 voting devices complete.
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
Failure at final check of Oracle CRS stack.
10

Run "ps -ef | grep init" during this 600-second period. If no CRS-related init processes (e.g. init.crsd, init.evmd, init.cssd) are running, this note applies.
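
The check described above can be scripted so that it is easy to repeat while root.sh is waiting. This is a rough sketch rather than anything from the note itself: the function name check_crs_init_procs is hypothetical, and it assumes the wrapper scripts show up in `ps -ef` with a path ending in init.cssd, init.crsd, or init.evmd followed by an argument, as in the "/bin/sh /etc/init.crsd run" line earlier in this post.

```shell
# Hypothetical helper: report which CRS init wrapper scripts appear in `ps -ef` output.
# Reads ps output on stdin and prints one status line per wrapper.
check_crs_init_procs() {
    ps_out=$(cat)
    for name in init.cssd init.crsd init.evmd; do
        # Ignore the grep process itself, then look for the wrapper script path.
        if printf '%s\n' "$ps_out" | grep -v grep | grep -q "/$name "; then
            echo "$name: running"
        else
            echo "$name: NOT running"
        fi
    done
}
```

Usage would be `ps -ef | check_crs_init_procs`. If all three lines say "NOT running" during the 600-second wait, the symptom matches this note.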

Changes

New installation

Cause

This particular case is caused by the OS init system not working.

" Failure at final check of Oracle CRS stack.
10"
means the CRS daemons did not start within the 600-second period.


The root.sh script adds the CRS-related entries to /etc/inittab, runs "init q", and expects the three CRS daemon processes to start. When the init system is not working, none of these daemon processes is spawned, so the CRS stack fails to start, because it relies on init to launch the daemons first.

This can be verified by adding a simple entry in /etc/inittab:
test:2:once:/usr/bin/echo "HELLO TEST" > /tmp/test.log
Then run "init q" as the root user. If init is working, the file /tmp/test.log should be created.
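
The verification above could be wrapped in a few small functions. This is a sketch under stated assumptions: the three function names are hypothetical, the pattern used to find CRS entries assumes they reference init.cssd, init.crsd, or init.evmd, and the probe line is the one quoted in the note. Defining the functions changes nothing on the system; only calling add_init_probe on the real /etc/inittab does.

```shell
# Hypothetical helper: list the CRS-related lines in an inittab-format file
# (default: /etc/inittab).
list_crs_inittab_entries() {
    inittab_file=${1:-/etc/inittab}
    grep -E 'init\.(cssd|crsd|evmd)' "$inittab_file"
}

# Hypothetical helper: append the probe entry from the note to an inittab-format file.
add_init_probe() {
    inittab_file=$1
    printf '%s\n' 'test:2:once:/usr/bin/echo "HELLO TEST" > /tmp/test.log' >> "$inittab_file"
}

# Hypothetical helper: report whether the probe output file exists
# (default: /tmp/test.log).
init_probe_result() {
    probe_log=${1:-/tmp/test.log}
    if [ -f "$probe_log" ]; then
        echo "init processed the entry"
    else
        echo "init did NOT process the entry"
    fi
}
```

On a test system, as root and after backing up /etc/inittab, the sequence would be roughly: `add_init_probe /etc/inittab`, then `init q`, wait a few seconds, then `init_probe_result`. Remove the test line from /etc/inittab afterwards.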

Solution

Please consult your system administrator about the init issue.

For example, the following solution applies to the AIX platform only:
1. Start install_assist (the AIX GUI Installation Assistant utility).
2. Update a setting, for example the date, then exit install_assist properly.
3. Reboot the system.
After that, the daemon processes listed in /etc/inittab started and the CRS installation completed.

For other platforms, please consult your system admin or vendor for its solution.

Source: ITPUB blog, link: http://blog.itpub.net/7199859/viewspace-659985/. If you repost this article, please credit the source; otherwise legal action may be pursued.
