Oracle RAC root.sh fails with "Timed out waiting for the CRS stack to start": how to fix it

Posted by 小亮520cl on 2015-07-13

1. Problem description

While installing 11.2.0.1 RAC on Oracle Linux 6.1, root.sh timed out when run on the second node, as shown below:

[root@rac2 ~]# /u01/app/11.2.0/grid/root.sh
Running Oracle 11g root.sh script...

The following environment variables are set as:
   ORACLE_OWNER= oracle
   ORACLE_HOME=  /u01/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
  Copying dbhome to /usr/local/bin ...
  Copying oraenv to /usr/local/bin ...
  Copying coraenv to /usr/local/bin ...

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2012-06-27 14:46:35: Parsing the host name
2012-06-27 14:46:35: Checking for super user privileges
2012-06-27 14:46:35: User has super user privileges
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
ADVM/ACFS is not supported on oraclelinux-release-6Server-1.0.2.x86_64

CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node rac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac2'
CRS-2676: Start of 'ora.mdnsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rac2'
CRS-2676: Start of 'ora.gipcd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac2'
CRS-2676: Start of 'ora.gpnpd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac2'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac2'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac2'
CRS-2676: Start of 'ora.diskmon' on 'rac2' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac2'
CRS-2676: Start of 'ora.ctssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac2'
CRS-2676: Start of 'ora.asm' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac2'
CRS-2676: Start of 'ora.crsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'rac2'
CRS-2676: Start of 'ora.evmd' on 'rac2' succeeded
Timed out waiting for the CRS stack to start.

 

 

Check the relevant status on both nodes:

[oracle@rac1 bin]$ ./crsctl check cluster -all
**************************************************************
rac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[oracle@rac2 bin]$ ./crsctl check cluster -all
**************************************************************
rac2:
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online

[oracle@rac1 bin]$ ./crs_stat -t -v
Name           Type           R/RA   F/FT   Target    State     Host
----------------------------------------------------------------------
ora.DATA.dg    ora....up.type 0/5    0/     ONLINE    ONLINE    rac1
ora....N1.lsnr ora....er.type 0/5    0/0    ONLINE    ONLINE    rac1
ora.asm        ora.asm.type   0/5    0/     ONLINE    ONLINE    rac1
ora.eons       ora.eons.type  0/3    0/     ONLINE    ONLINE    rac1
ora.gsd        ora.gsd.type   0/5    0/     OFFLINE   OFFLINE
ora....network ora....rk.type 0/5    0/     ONLINE    ONLINE    rac1
ora.oc4j       ora.oc4j.type  0/5    0/0    OFFLINE   OFFLINE
ora.ons        ora.ons.type   0/3    0/     ONLINE    ONLINE    rac1
ora....SM1.asm application    0/5    0/0    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    0/5    0/0    OFFLINE   OFFLINE
ora.rac1.ons   application    0/3    0/0    ONLINE    ONLINE    rac1
ora.rac1.vip   ora....t1.type 0/0    0/0    ONLINE    ONLINE    rac1
ora.scan1.vip  ora....ip.type 0/0    0/0    ONLINE    ONLINE    rac1

[oracle@rac2 bin]$ ./crs_stat -t -v
CRS-0184: Cannot communicate with the CRS daemon.

[oracle@rac2 bin]$

 

The commands on node 2 did not complete successfully.
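The node whose CRS daemon is unreachable can also be picked out of this output mechanically. The following is a minimal sketch, assuming the `crsctl check cluster -all` output format shown above (a `<node>:` line followed by CRS-nnnn status lines). The function name `crs_offline_nodes` is hypothetical and not part of any Oracle tool.

```shell
#!/bin/sh
# Hypothetical helper: read `crsctl check cluster -all` output on stdin and
# print every node whose block reports CRS-4535 (CRS not reachable).
crs_offline_nodes() {
  awk '
    # A line such as "rac2:" opens the status block for that node.
    /^[A-Za-z0-9_.-]+:$/ { node = substr($0, 1, length($0) - 1); next }
    # CRS-4535 means crsctl could not reach Cluster Ready Services.
    /^CRS-4535/ && node != "" { print node }
  '
}

# Usage (from the Grid home bin directory, as in the session above):
#   ./crsctl check cluster -all | crs_offline_nodes
```

Fed the rac2 output above, the function would print `rac2`; for a healthy node it prints nothing.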

 

 

2. Explanation on MOS

Root.Sh Failing with 'Prom_rpc: Clsc Send Failure..Ret Code 6' [ID 745215.1]

 

2.1 Symptoms

During CRS install, root.sh fails on the last node with the following message:

Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Waiting for the Oracle CRSD and EVMD to start
Timed out waiting for the CRS stack to start.


The crsd.log on the first node shows:

2008-10-21 21:04:55.087: [ OCRMSG][1325496672]prom_rpc: CLSC send failure..ret code 6
2008-10-21 21:04:55.087: [ OCRMSG][1325496672]prom_rpc: possible OCR retry scenario
2008-10-21 21:04:55.087: [ OCRSRV][1325496672]proas_forward_request: PROM_TIME_OUT or Master Fail
2008-10-21 21:04:55.296: [ COMMCRS][2540957248]clscsendx: (0xc43bd0) Connection not active

2008-10-21 21:04:55.296: [ OCRMSG][2540957248]prom_rpc: CLSC send failure..ret code 6
2008-10-21 21:04:55.296: [ OCRMSG][2540957248]prom_rpc: possible OCR retry scenario
2008-10-21 21:04:55.296: [ OCRCLI][2540957248]proac_open_key: [SYSTEM.crs.debug.ist4-db1-3-sfm.COMMNS]: Writer failed. Retval [203]


The ocssd.log on the first node shows:

2008-10-21 20:33:01.626: [ OCRAPI][2540958848]procr_open: Node Failure. Attempting retry #0
2008-10-21 20:33:02.628: [ OCRAPI][2540958848]procr_open: Node Failure. Attempting retry #1
2008-10-21 20:33:03.631: [ OCRAPI][2540958848]procr_open: Node Failure. Attempting retry #2


The ocssd.log on the last node shows:

[ CSSD]2008-10-22 04:46:42.457 [1241577824] >TRACE: clssnmRcfgMgrThread: lastleader(1) unique(1224650474)
[ CSSD]2008-10-22 04:46:43.219 [1168148832] >TRACE: clssnmSendVoteInfo: node(1) syncSeqNo(4)
[ CSSD]2008-10-22 04:46:56.338 [2537955008] >ERROR: clssgmStartNMMon: timed out waiting on nested NM reconfig. Self-sacrificing to kick others awake.
[ CSSD]2008-10-22 04:46:56.338 [2537955008] >ERROR: StartCMMon(): clssnmNMDetach failed - 2
[ CSSD]2008-10-22 04:46:56.338 [2537955008] >TRACE: clssscctx: dump of 0x0x5d2360, len 3792


2.2 Cause


This is due to a failure of communication between crsd.bin on nodes.

 

2.3 Solution


Check the network for the following items:

- Check to see that there is no firewall between the nodes.
- Make sure that the MTU size is the same on all nodes.
- If the MTU is larger than 1500, the switch must be able to support the larger MTU size.
- Make sure that you have disabled SELINUX.
- Make sure that the NICs are using full duplex and not auto-negotiate.
- Misconfiguration on the switches can also cause this issue.
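Several of these items can be inspected from the shell on each node. This is a rough sketch, assuming a Linux node with the `ip`, `ethtool`, `getenforce`, and `iptables` utilities installed (as is typical on Oracle Linux 6). The interface name passed in (for example `eth1`) and the helper names `extract_mtu` and `check_node_network` are placeholders, not anything taken from the MOS note.

```shell
#!/bin/sh
# Hypothetical helper: pull the MTU value out of `ip -o link show` output
# read from stdin (the field that follows the word "mtu").
extract_mtu() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "mtu") { print $(i + 1); exit } }'
}

# Hypothetical helper: print the settings the checklist asks about for one
# interface. Run it as root on every node and compare the results by eye.
check_node_network() {
  iface=$1
  echo "MTU:     $(ip -o link show dev "$iface" | extract_mtu)"
  echo "SELinux: $(getenforce)"
  # ethtool reports the negotiated speed, duplex and auto-negotiation state.
  ethtool "$iface" | grep -E 'Speed|Duplex|Auto-negotiation'
  # List the iptables rules currently active on this node.
  iptables -L -n
}

# Usage: check_node_network eth1
```

Switch-side settings (MTU support, port configuration) cannot be seen from the node and still have to be checked on the switch itself.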

 

 

 

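3. Solution

The MOS note ties this failure to the firewall, MTU, SELINUX, and NIC settings, so the problem can basically be pinned on the network configuration. In my case the NIC names on the two nodes did not match, so I renamed them to match and then tried running root.sh again.

 

Before the change, rac1 used eth0 and eth1, while node 2 used eth5 and eth6. How to rename a NIC is easy to find with a web search, so it is not covered here.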
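Oracle Clusterware expects the public and private interfaces to carry the same names on every node, so a mismatch like this one can be caught before root.sh is run. Here is a minimal sketch, assuming the interface names are collected with `ls /sys/class/net` on each node; the helper name `missing_ifaces` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each interface name from the first list that is
# absent from the second list. Both arguments are space-separated names.
missing_ifaces() {
  for name in $1; do
    case " $2 " in
      *" $name "*) ;;                 # present on both nodes
      *) printf '%s\n' "$name" ;;     # present on the first node only
    esac
  done
}

# Usage, run on rac2 with ssh access to rac1 (an assumption):
#   missing_ifaces "$(ssh rac1 ls /sys/class/net | tr '\n' ' ')" \
#                  "$(ls /sys/class/net | tr '\n' ' ')"
```

With the names from this case, `missing_ifaces "eth0 eth1" "eth5 eth6"` lists both eth0 and eth1 as missing on node 2.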

 

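Undo the earlier, failed configuration on the node with the following command:

/u01/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -verbose -force

 

Note that this command differs between Oracle 11g and 10g.

 

-- Deconfigure: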

[root@rac2 ~]# /u01/app/11.2.0/grid/crs/install/rootcrs.pl -deconfig -verbose -force
2012-06-27 15:12:30: Parsing the host name
2012-06-27 15:12:30: Checking for super user privileges
2012-06-27 15:12:30: User has super user privileges
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
PRCR-1035 : Failed to look up CRS resource ora.cluster_vip.type for 1
PRCR-1068 : Failed to query resources
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.gsd is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.ons is registered
Cannot communicate with crsd
PRCR-1070 : Failed to check if resource ora.eons is registered
Cannot communicate with crsd

ADVM/ACFS is not supported on oraclelinux-release-6Server-1.0.2.x86_64

ACFS-9201: Not Supported
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac2'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac2'
CRS-2673: Attempting to stop 'ora.cssdmonitor' on 'rac2'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac2'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac2'
CRS-2673: Attempting to stop 'ora.asm' on 'rac2'
CRS-2677: Stop of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac2'
CRS-2677: Stop of 'ora.cssd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.diskmon' on 'rac2'
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac2'
CRS-2677: Stop of 'ora.gipcd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.diskmon' on 'rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
error: package cvuqdisk is not installed
Successfully deconfigured Oracle clusterware stack on this node

 

-- Rerun root.sh; this time it succeeds.

 

[root@rac2 ~]# /u01/app/11.2.0/grid/root.sh
Running Oracle 11g root.sh script...

The following environment variables are set as:
   ORACLE_OWNER= oracle
   ORACLE_HOME=  /u01/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]:
The file "dbhome" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file "oraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:
The file "coraenv" already exists in /usr/local/bin.  Overwrite it? (y/n)
[n]:

Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root.sh script.
Now product-specific root actions will be performed.
2012-06-27 16:21:25: Parsing the host name
2012-06-27 16:21:25: Checking for super user privileges
2012-06-27 16:21:25: User has super user privileges
Using configuration parameter file: /u01/app/11.2.0/grid/crs/install/crsconfig_params
LOCAL ADD MODE
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Adding daemon to inittab
CRS-4123: Oracle High Availability Services has been started.
ohasd is starting
ADVM/ACFS is not supported on oraclelinux-release-6Server-1.0.2.x86_64

CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node rac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac2'
CRS-2676: Start of 'ora.mdnsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gipcd' on 'rac2'
CRS-2676: Start of 'ora.gipcd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac2'
CRS-2676: Start of 'ora.gpnpd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac2'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac2'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac2'
CRS-2676: Start of 'ora.diskmon' on 'rac2' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'rac2'
CRS-2676: Start of 'ora.ctssd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac2'
CRS-2676: Start of 'ora.asm' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'rac2'
CRS-2676: Start of 'ora.crsd' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'rac2'
CRS-2676: Start of 'ora.evmd' on 'rac2' succeeded

rac2    2012/06/27 16:25:16    /u01/app/11.2.0/grid/cdata/rac2/backup_20120627_162516.olr
Preparing packages for installation...
cvuqdisk-1.0.7-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded
Updating inventory properties for clusterware
Starting Oracle Universal Installer...

Checking swap space: must be greater than 500 MB.   Actual 999 MB    Passed
The inventory pointer is located at /etc/oraInst.loc
The inventory is located at /u01/app/oraInventory
[root@rac2 ~]#

 

 

Verify:

 

[oracle@rac2 bin]$ ./crsctl check cluster -all
**************************************************************
rac1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
rac2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
[oracle@rac2 bin]$ ./crs_stat -t -v
Name           Type           R/RA   F/FT   Target    State     Host
----------------------------------------------------------------------
ora.DATA.dg    ora....up.type 0/5    0/     ONLINE    ONLINE    rac1
ora....N1.lsnr ora....er.type 0/5    0/0    ONLINE    ONLINE    rac1
ora.asm        ora.asm.type   0/5    0/     ONLINE    ONLINE    rac1
ora.eons       ora.eons.type  0/3    0/     ONLINE    ONLINE    rac1
ora.gsd        ora.gsd.type   0/5    0/     OFFLINE   OFFLINE
ora....network ora....rk.type 0/5    0/     ONLINE    ONLINE    rac1
ora.oc4j       ora.oc4j.type  0/5    0/0    OFFLINE   OFFLINE
ora.ons        ora.ons.type   0/3    0/     ONLINE    ONLINE    rac1
ora....SM1.asm application    0/5    0/0    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    0/5    0/0    OFFLINE   OFFLINE
ora.rac1.ons   application    0/3    0/0    ONLINE    ONLINE    rac1
ora.rac1.vip   ora....t1.type 0/0    0/0    ONLINE    ONLINE    rac1
ora....SM2.asm application    0/5    0/0    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    0/5    0/0    OFFLINE   OFFLINE
ora.rac2.ons   application    0/3    0/0    ONLINE    ONLINE    rac2
ora.rac2.vip   ora....t1.type 0/0    0/0    ONLINE    ONLINE    rac2
ora.scan1.vip  ora....ip.type 0/0    0/0    ONLINE    ONLINE    rac1
[oracle@rac2 bin]$

 

 

root.sh completed successfully.

Source: ITPUB blog, link: http://blog.itpub.net/29096438/viewspace-1731703/. If you repost this article, please credit the source; otherwise legal action may be taken.
