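Problems Installing 11g RAC on HP-UX (Part 4)

Posted by anycall2010 on 2009-06-21

Installing CRS went fairly smoothly, but when I ran the script "/home/oracle/crs/root.sh", CRS would never start. The symptoms were as follows: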

Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
Failure at final check of Oracle CRS stack.
10

Checking the related log showed this error:

# cat css382.log
Oracle Database 11g CRS Release 11.1.0.6.0 - Production Copyright 1996, 2007 Oracle. All rights reserved.
2009-06-19 16:21:26.807: [ CSSCLNT][1]clsssInitNative: failed to connect to (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_bl870_1_)), rc 9

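I kept suspecting a problem with a network card or some communication path, so I kept doubting whether Oracle could be installed on a blade server at all. The network card layout on these blade servers is also fairly complex.

I asked Metalink for help, and Metalink gave me an approach: trace the execution to see what Oracle is actually waiting for, or where things go wrong.

Entering debug mode:

1. Back up the files to be debugged: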

cp $ORA_CRS_HOME/install/rootinstall $ORA_CRS_HOME/install/rootinstall.bak
cp $ORA_CRS_HOME/install/rootconfig $ORA_CRS_HOME/install/rootconfig.bak

2. Modify the two scripts:

In the rootinstall and rootconfig scripts, add -x to the first line:

For example:
#!/bin/sh -x
#
# rootinstall.sbs for CRS installs

3. Run the script:

script /tmp/rootsh.log
./root.sh

4. Watch the system logs:

tail -f /var/adm/syslog/syslog.log
# cd /home/oracle/oraInventory/logs
# tail -f installActions2009-06-20_03-23-03PM.log
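
Steps 1 and 2 can also be done in one go. The following is only a minimal sketch: enable_trace is a hypothetical helper name of my own, not part of the Oracle installer, and it assumes the first line of each script is exactly "#!/bin/sh".

```shell
# Sketch: back up a script, then add -x to its "#!/bin/sh" line.
# enable_trace is a hypothetical helper, not part of the Oracle installer.
enable_trace() {
    file=$1
    # Keep a pristine copy, as in step 1 (-p preserves mode and timestamps).
    cp -p "$file" "$file.bak" || return 1
    # Rewrite only line 1, and only if it is exactly "#!/bin/sh".
    # Redirecting into the existing file (rather than using mv or sed -i)
    # keeps its owner and permissions and works with non-GNU sed.
    sed '1s|^#!/bin/sh$|#!/bin/sh -x|' "$file.bak" > "$file"
}

# Example, using the paths from the steps above:
# enable_trace "$ORA_CRS_HOME/install/rootinstall"
# enable_trace "$ORA_CRS_HOME/install/rootconfig"
```

When debugging is finished, copying the .bak files back restores the original scripts.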

The debugging session went as follows:

# tail -f syslog.log
Jun 20 15:54:03 bl870_1  above message repeats 47 times
Jun 20 15:54:05 bl870_1 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.
Jun 20 15:54:51 bl870_1 vmunix: Dead gateway detection can't ping the last remaining default gateway at 0x23015afe .See ndd -h ip_ire_gw_probe for more info
Jun 20 15:56:25 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 15:57:03 bl870_1 SQLAnywhere(veritas_dbms3_bl870_1): Starting checkpoint of "vxdbms" (vxdbms.db) at Sat Jun 20 2009 15:57
Jun 20 15:57:04 bl870_1 SQLAnywhere(veritas_dbms3_bl870_1): Finished checkpoint of "vxdbms" (vxdbms.db) at Sat Jun 20 2009 15:57
Jun 20 15:56:25 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 15:57:51 bl870_1 vmunix: Dead gateway detection can't ping the last remaining default gateway at 0x23015afe .See ndd -h ip_ire_gw_probe for more info
Jun 20 16:01:25 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 16:07:34 bl870_1 syslog: libtt[23096]: ttdt_Xt_input_handler(): tttk_message_receive(): TT_ERR_NOMP      No ttsession process is running, probably because tt_open() has not been called yet. If this code is returned from tt_open() it means ttsession could not be started, which generally means ToolTalk is not installed on this system.
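
The Oracle messages are easy to miss among the Veritas and kernel output. A small filter can pull them out; this is only a sketch, and both the helper name crs_syslog_lines and the two search strings are my own choice based on the log above, not an official list.

```shell
# Sketch: show only the syslog lines written by Oracle Clusterware.
# crs_syslog_lines is a hypothetical helper; the two patterns are
# assumptions derived from the messages seen in this log.
crs_syslog_lines() {
    grep -e 'Oracle' -e 'Cluster Ready Services' "$1"
}

# Example:
# crs_syslog_lines /var/adm/syslog/syslog.log
```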

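From the system log, two things looked suspicious, so I analyzed them:

1. Jun 20 15:54:05 bl870_1 syslog: Oracle Cluster Ready Services waiting for HP-UX Service Guard to start.

What exactly was Oracle waiting for? HP's Serviceguard (SG) cluster software? My environment uses Veritas for the shared (concurrent) volume groups and does not use HP SG at all. After asking an experienced engineer, I learned that when Veritas provides the clustering, a patch has to be applied: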

cp /opt/VRTSvcs/rac/patch/init.cssd-11gR1.patch /home/oracle/crs/css/admin/
cd /home/oracle/crs/css/admin/
patch init.cssd < init.cssd-11gR1.patch

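That solved the problem.

2. Jun 20 15:56:25 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network

I had long been suspicious of this message and kept thinking the HP blade was at fault, but it turns out this message is normal. Oh well!

Record of the successful installation:

Output on node 1: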

# tail -f /var/adm/syslog/syslog.log
Jun 20 17:13:41 bl870_1 vmunix: GAB INFO V-15-1-20036 Port f gen   64ba0d membership 01
Jun 20 17:13:41 bl870_1 vmunix: GLM recovery : gen 64ba0d mbr 3 0 0 0
Jun 20 17:13:41 bl870_1 vxfsckd: vxfs: vxfsckd started
Jun 20 17:14:45 bl870_1 vxfsckd: /dev/vx/rdsk/oradg/oravol2: log replay in progress
Jun 20 17:14:45 bl870_1 vxfsckd: /dev/vx/rdsk/oradg/oravol2: replay complete - marking super-block as CLEAN
Jun 20 17:14:45 bl870_1 vxfsckd: /dev/vx/rdsk/oradg/oravol3: log replay in progress
Jun 20 17:14:45 bl870_1 vxfsckd: /dev/vx/rdsk/oradg/oravol3: replay complete - marking super-block as CLEAN
Jun 20 17:14:45 bl870_1 vxfsckd: /dev/vx/rdsk/oradg/oravol1: log replay in progress
Jun 20 17:14:45 bl870_1 vxfsckd: /dev/vx/rdsk/oradg/oravol1: replay complete - marking super-block as CLEAN
Jun 20 17:17:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 17:29:17 bl870_1 sshd[8478]: SSH: Server;Ltype: Version;Remote: 35.1.90.21-50545;Protocol: 2.0;Client: OpenSSH_5.1p1+sftpfilecontrol-v1.2-hpn13v5
Jun 20 17:27:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 17:30:11 bl870_1  above message repeats 5 times
Jun 20 17:32:06 bl870_1 sshd[9015]: SSH: Server;Ltype: Version;Remote: 35.1.90.88-1383;Protocol: 2.0;Client: SecureCRT_5.1.2 (build 274) SecureCRT
Jun 20 17:32:08 bl870_1 sshd[9015]: Accepted password for root from 35.1.90.88 port 1383 ssh2
Jun 20 17:32:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 17:32:47 bl870_1 SQLAnywhere(veritas_dbms3_bl870_1): Starting checkpoint of "vxdbms" (vxdbms.db) at Sat Jun 20 2009 17:32
Jun 20 17:32:47 bl870_1 SQLAnywhere(veritas_dbms3_bl870_1): Finished checkpoint of "vxdbms" (vxdbms.db) at Sat Jun 20 2009 17:32
Jun 20 17:45:29 bl870_1 su: + 2 root-oracle
Jun 20 17:45:29 bl870_1 root: Oracle Cluster Ready Services starting by user request.
Jun 20 17:45:40 bl870_1 su: + tty?? root-oracle
Jun 20 17:42:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 17:45:41 bl870_1  above message repeats 5 times
Jun 20 17:45:41 bl870_1 syslog: Cluster Ready Services completed waiting on dependencies.
Jun 20 17:45:41 bl870_1 syslog: Running CRSD with TZ =
Jun 20 17:45:41 bl870_1 syslog: Oracle CSS Family monitor starting.
Jun 20 17:45:41 bl870_1 syslog: Cluster Ready Services completed waiting on dependencies.
Jun 20 17:45:42 bl870_1  above message repeats 2 times
Jun 20 17:45:42 bl870_1 syslog: Oracle CSS restart. 0, 2
Jun 20 17:47:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 17:45:42 bl870_1 su: + tty?? root-oracle
Jun 20 17:50:11 bl870_1  above message repeats 5 times
Jun 20 17:47:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 17:52:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 17:52:49 bl870_1 SQLAnywhere(veritas_dbms3_bl870_1): Starting checkpoint of "vxdbms" (vxdbms.db) at Sat Jun 20 2009 17:52
Jun 20 17:52:49 bl870_1 SQLAnywhere(veritas_dbms3_bl870_1): Finished checkpoint of "vxdbms" (vxdbms.db) at Sat Jun 20 2009 17:52
Jun 20 17:57:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 18:01:09 bl870_1  above message repeats 3 times
Jun 20 18:02:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 18:03:15 bl870_1 sshd[22125]: SSH: Server;Ltype: Version;Remote: 35.1.90.21-52661;Protocol: 2.0;Client: OpenSSH_5.1p1+sftpfilecontrol-v1.2-hpn13v5
Jun 20 18:04:09 bl870_1 su: + 2 root-oracle
Jun 20 18:07:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 18:10:12 bl870_1  above message repeats 3 times
Jun 20 18:12:33 bl870_1 vmunix: LLT WARNING V-14-1-10497 crossed links? link 0 and link 1 of node 1 on the same network
Jun 20 18:12:51 bl870_1 SQLAnywhere(veritas_dbms3_bl870_1): Starting checkpoint of "vxdbms" (vxdbms.db) at Sat Jun 20 2009 18:12
Jun 20 18:12:51 bl870_1 SQLAnywhere(veritas_dbms3_bl870_1): Finished checkpoint of "vxdbms" (vxdbms.db) at Sat Jun 20 2009 18:1

Output on node 2:

# /home/oracle/crs/root.sh
WARNING: directory '/home/oracle' is not owned by root
Checking to see if Oracle CRS stack is already configured
Checking to see if any 9i GSD is up

Setting the permissions on OCR backup directory
Setting up Network socket directories
Oracle Cluster Registry configuration upgraded successfully
The directory '/home/oracle' is not owned by root. Changing owner to root
clscfg: EXISTING configuration version 4 detected.
clscfg: version 4 is 11 Release 1.
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node :
node 0: bl870_1 dbp-priv bl870_1
node 1: bl870_2 dbs-priv bl870_2
clscfg: Arguments check out successfully.

NO KEYS WERE WRITTEN. Supply -force parameter to override.
-force is destructive and will destroy any previous cluster
configuration.
Oracle Cluster Registry for cluster has already been initialized
Startup will be queued to init within 30 seconds.
Adding daemons to inittab
Expecting the CRS daemons to be up within 600 seconds.
Cluster Synchronization Services is active on these nodes.
        bl870_1
        bl870_2
Cluster Synchronization Services is active on all the nodes.
Waiting for the Oracle CRSD and EVMD to start
Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps

Creating VIP application resource on (2) nodes...
Creating GSD application resource on (2) nodes...
Creating ONS application resource on (2) nodes...
Starting VIP application resource on (2) nodes...
Starting GSD application resource on (2) nodes1:CRS-0215: Could not start resource 'ora.bl870_1.gsd'.
Check the log file "/home/oracle/crs/log/bl870_1/racg/ora.bl870_1.gsd.log" for more details
.1:CRS-0215: Could not start resource 'ora.bl870_2.gsd'.
Check the log file "/home/oracle/crs/log/bl870_2/racg/ora.bl870_2.gsd.log" for more details
..
Starting ONS application resource on (2) nodes...


Done.
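The GSD resource seems to fail to start on practically every 11g installation, so it can simply be ignored. With that, the CRS installation is complete.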
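
To confirm that GSD really is the only resource that failed, the CRS-0215 lines can be extracted from a saved copy of the root.sh output. This is a rough sketch under my own assumptions: failed_resources is a hypothetical helper, and it expects the output to be in a file such as the /tmp/rootsh.log typescript from step 3.

```shell
# Sketch: list the resources that root.sh reported as failed (CRS-0215).
# failed_resources is a hypothetical helper; it reads a saved copy of the
# root.sh output, for example the typescript in /tmp/rootsh.log.
failed_resources() {
    # Print only the quoted resource name from each CRS-0215 line,
    # then sort and drop duplicates.
    sed -n "s/.*CRS-0215: Could not start resource '\([^']*\)'.*/\1/p" "$1" | sort -u
}

# Example:
# failed_resources /tmp/rootsh.log
```

If anything other than the two GSD resources shows up, the log file named in the matching "Check the log file" line is the place to look next.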

Source: ITPUB blog, link: http://blog.itpub.net/8334342/viewspace-607062/. If you repost this, please credit the source; otherwise legal liability will be pursued.
