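Managing the ONS Service in Oracle 10gR2 RAC Clusterware
This article walks through managing the ONS service using a real case.
In a 10gR2 RAC environment, the voting disk data was lost and no backup existed, so the plan was to wipe the Clusterware configuration and rerun the root.sh script to bring Clusterware back. Following the article at http://space.itpub.net/23135684/viewspace-721081, the command /u01/crs/bin/racgons add_config rhel:6251 rhel2:6251 was run successfully, and the vipca script was then run to create the nodeapps for both nodes. (Note: vipca creates the ONS service automatically, so creating it beforehand with racgons was unnecessary.) During creation and startup, however, the ONS service on the second node failed to start, so the ONS log on the second node was examined.
The ONS log is located at: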
/u01/app/oracle/crs/log/rhel2/racg/ora.rhel2.ons.log
The general format is: $ORA_CRS_HOME/log/<nodename>/racg/ora.<nodename>.ons.log
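The path pattern can be sketched as a small helper. This is only an illustration: the function name `ons_log_path` is hypothetical, and the Clusterware home and node name are passed in by the caller rather than discovered automatically.

```shell
# Build the expected ONS log path for a node.
# $1 = Clusterware home (the value of ORA_CRS_HOME), $2 = node name.
# The function only formats the path; it does not check that the file exists.
ons_log_path() {
    crs_home=$1
    node=$2
    printf '%s/log/%s/racg/ora.%s.ons.log\n' "$crs_home" "$node" "$node"
}

# Example using the home and node name from this case.
ons_log_path /u01/app/oracle/crs rhel2
```

With the values from this case, the helper reproduces the sample path shown above.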
The log contains the following:
2012-06-12 17:21:05.030: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: GETHOSTBYNAME(rhel): 2
GETHOSTBYNAME(rhel): 2
Remote port for local node in local config does not match that from OCR.
Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
Adding remote host rhel:6251
1: {node = rhel2, port = 6251}
2012-06-12 17:21:05.032: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: GETHOSTBYNAME(rhel): 2
GETHOSTBYNAME(rhel): 2
Remote port for local node in local config does not match that from OCR.
GETHOSTBYNAME(rhel): 2
GETHOSTBYNAME(rhel): 2
Remote port for local node in local config does not match that from OCR.
2012-06-12 17:21:05.032: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
Adding remote host rhel:6251
1: {node = rhel2, port = 6251}
onsctl: ons failed to start
2012-06-12 17:21:05.133: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/app/oracle/crs
2012-06-12 17:21:05.133: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: cmd = /u01/app/oracle/crs/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/app/oracle/crs/bin/onsctl start
2012-06-12 17:21:05.133: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: rc = 1, time = 1.650s
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: GETHOSTBYNAME(rhel): 2
GETHOSTBYNAME(rhel): 2
Remote port for local node in local config does not match that from OCR.
Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
Adding remote host rhel:6251
1: {node = rhel2, port = 6251}
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: ons is not running ...
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/app/oracle/crs
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: cmd = /u01/app/oracle/crs/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/app/oracle/crs/bin/onsctl ping
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: clsrcexecut: rc = 1, time = 0.310s
2012-06-12 17:21:05.448: [ RACG][3999896128] [13230][3999896128][ora.rhel2.ons]: end for resource = ora.rhel2.ons, action = start, status = 1, time = 2.060s
2012-06-12 17:21:07.228: [ RACG][740729408] [13260][740729408][ora.rhel2.ons]: onsctl: shutting down ons daemon ...
GETHOSTBYNAME(rhel): 2
GETHOSTBYNAME(rhel): 2
Remote port for local node in local config does not match that from OCR.
Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
2012-06-12 17:21:07.228: [ RACG][740729408] [13260][740729408][ora.rhel2.ons]: Adding remote host rhel:6251
1: {node = rhel2, port = 6251}
onsctl: shutdown of ons failed!
2012-06-12 17:21:07.228: [ RACG][740729408] [13260][740729408][ora.rhel2.ons]: clsrcexecut: env ORACLE_CONFIG_HOME=/u01/app/oracle/crs
2012-06-12 17:21:07.228: [ RACG][740729408] [13260][740729408][ora.rhel2.ons]: clsrcexecut: cmd = /u01/app/oracle/crs/bin/racgeut -e _USR_ORA_DEBUG=0 540 /u01/app/oracle/crs/bin/onsctl stop
2012-06-12 17:21:07.228: [ RACG][740729408] [13260][740729408][ora.rhel2.ons]: clsrcexecut: rc = 3, time = 0.470s
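The log points to a port mismatch between the two nodes: the ONS entries created manually use port 6251, whereas the ones created by vipca probably use a different port, so the ports on the two sides do not match.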
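When the log is long, the relevant lines can be pulled out with standard text tools. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the patterns are taken only from the message formats visible in the log above, which may differ in other versions.

```shell
# Count the lines on stdin that report the local-config/OCR port mismatch.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
count_port_mismatch() {
    grep -c 'does not match that from OCR' || true
}

# List the distinct node:port pairs found in "{node = ..., port = ...}" lines.
list_ons_nodes() {
    sed -n 's/.*{node = \(.*\), port = \([0-9]*\)}.*/\1:\2/p' | sort -u
}

# Example with three lines copied from the log above.
sample='Remote port for local node in local config does not match that from OCR.
0: {node = rhel, port = 6251}
1: {node = rhel2, port = 6251}'

printf '%s\n' "$sample" | count_port_mismatch
printf '%s\n' "$sample" | list_ons_nodes
```

On a real system the same functions could read the log file directly, for example `count_port_mismatch < /u01/app/oracle/crs/log/rhel2/racg/ora.rhel2.ons.log`.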
1. The onsctl tool
The help output of onsctl is shown below:
[root@rhel1 bin]# ./onsctl
usage: ./onsctl start|stop|ping|reconfig|debug
start - Start opmn only.
stop - Stop ons daemon
ping - Test to see if ons daemon is running
debug - Display debug information for the ons daemon
reconfig - Reload the ons configuration
help - Print a short syntax description (this).
detailed - Print a verbose syntax description.
[root@rhel1 bin]# ./onsctl detailed
usage: ./onsctl start|stop|ping|reconfig|debug
start
Start ons daemon
stop
Shutdown ons daemon
reconfig
Trigger ons to re-read it's configuration files.
ping
Test to see if ons daemon is alive
debug
Display debug information about the ons daemon
help
Print a short syntax description.
detailed
Print a verbose syntax description (this message).
Run onsctl ping on the first node:
[root@rhel1 bin]# ./onsctl ping
Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
GETHOSTBYNAME(rhel): 2
Adding remote host rhel:6251
GETHOSTBYNAME(rhel): 2
1: {node = rhel2, port = 6251}
Adding remote host rhel2:6251
ons is running ...
ONS is already running on the first node.
Run onsctl ping on the second node:
[root@rhel2 bin]# ./onsctl ping
Number of configuration nodes retrieved: 2
0: {node = rhel, port = 6251}
GETHOSTBYNAME(rhel): 2
Adding remote host rhel:6251
GETHOSTBYNAME(rhel): 2
1: {node = rhel2, port = 6251}
Remote port for local node in local config does not match that from OCR.
ons is not running ...
ONS on the second node did not start because its port does not match that of the first node.
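The two ping results differ only in their last line, so a script can classify them by looking for that text. The following is a minimal sketch; `ons_ping_status` is a hypothetical helper, and it relies only on the "ons is running" / "ons is not running" wording seen in the output above.

```shell
# Classify `onsctl ping` output read from stdin.
# Prints "running", "not running", or "unknown" if neither message is present.
ons_ping_status() {
    output=$(cat)
    case $output in
        *"ons is not running"*) echo "not running" ;;
        *"ons is running"*)     echo "running" ;;
        *)                      echo "unknown" ;;
    esac
}

# Example with the last lines of the two outputs shown above.
printf 'Adding remote host rhel2:6251\nons is running ...\n' | ons_ping_status
printf 'Remote port for local node in local config does not match that from OCR.\nons is not running ...\n' | ons_ping_status
```

On a cluster node it could be used as `./onsctl ping 2>&1 | ons_ping_status`.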
2. Checking the node processes
ONS processes on the first node:
[root@rhel1 bin]# ps -ef | grep ons
root 2412 1 0 16:47 ? 00:00:00 sendmail: accepting connections
oracle 13513 1 0 17:17 ? 00:00:00 /u01/app/oracle/crs/opmn/bin/ons -d
oracle 13515 13513 0 17:17 ? 00:00:00 /u01/app/oracle/crs/opmn/bin/ons -d
root 15646 3340 0 17:22 pts/0 00:00:00 grep ons
ONS processes on the second node:
[root@rhel2 bin]# ps -ef | grep ons
root 2400 1 0 16:45 ? 00:00:00 sendmail: accepting connections
root 13847 3546 0 17:22 pts/0 00:00:00 grep ons
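Note that the plain `grep ons` pattern also matches the sendmail line (through the word "connections") and the grep command itself. A narrower filter is sketched below as an illustration; `count_ons_daemons` is a hypothetical name, and the pattern assumes the daemon command line looks like the `.../opmn/bin/ons -d` entries shown above.

```shell
# Count ps lines on stdin whose command is the ONS daemon binary.
# The [o] bracket keeps the grep process itself from matching when the
# function is fed by `ps -ef`; "|| true" hides grep's non-zero status on 0 matches.
count_ons_daemons() {
    grep -c '/opmn/bin/[o]ns -d' || true
}

# Example with lines copied from the first node's ps output.
printf '%s\n' \
    'root 2412 1 0 16:47 ? 00:00:00 sendmail: accepting connections' \
    'oracle 13513 1 0 17:17 ? 00:00:00 /u01/app/oracle/crs/opmn/bin/ons -d' \
    'oracle 13515 13513 0 17:17 ? 00:00:00 /u01/app/oracle/crs/opmn/bin/ons -d' \
    'root 15646 3340 0 17:22 pts/0 00:00:00 grep ons' | count_ons_daemons
```

On a node, `ps -ef | count_ons_daemons` would then report only the daemon processes.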
3. The ONS configuration file
A find command located the ONS configuration files:
./opmn/conf/ons.config.tmp
./opmn/conf/ons.config
./opmn/conf/ons.config.backup.10205
[root@rhel1 crs]# cat ./opmn/conf/ons.config
localport=6113
remoteport=6200
loglevel=3
useocr=on
Clearly, the ports in the configuration file do not match port 6251 that was configured with racgons.
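This comparison can be scripted. The sketch below is illustrative only: both function names are hypothetical, the OCR port is supplied by hand as an argument, and the parsing assumes the simple key=value layout of the ons.config shown above.

```shell
# Print the value of a key=value setting from ons.config-style text on stdin.
ons_config_value() {
    key=$1
    sed -n "s/^${key}=//p"
}

# Compare the remoteport in ons.config text (stdin) with the port given as $1,
# which stands for the port registered in OCR.
check_remote_port() {
    ocr_port=$1
    cfg_port=$(ons_config_value remoteport)
    if [ "$cfg_port" = "$ocr_port" ]; then
        echo "match: $cfg_port"
    else
        echo "mismatch: ons.config=$cfg_port OCR=$ocr_port"
    fi
}

# Example using the file contents and the racgons port from this case.
printf 'localport=6113\nremoteport=6200\nloglevel=3\nuseocr=on\n' | check_remote_port 6251
```

Against the real file this would be `check_remote_port 6251 < ./opmn/conf/ons.config`, run from the Clusterware home.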
4. The racgons tool
The help output of racgons is as follows:
[root@rhel1 bin]# ./racgons
To add ONS daemons configuration:
./racgons.bin add_config hostname:port [hostname:port] ...
To remove ONS daemons configuration:
./racgons.bin remove_config hostname[:port] [hostname:port] ...
OCR may hold two sets of ONS entries. Run the following command to remove the original port 6251 configuration:
[root@rhel1 bin]# ./racgons remove_config rhel:6251 rhel2:6251
racgons: Existing key value on rhel = 6251.
racgons: rhel:6251 removed from OCR.
racgons: Existing key value on rhel2 = 6251.
racgons: rhel2:6251 removed from OCR.
Restart nodeapps:
[root@rhel1 bin]# ./srvctl start nodeapps -n rhel2
[root@rhel1 bin]# ./srvctl start nodeapps -n rhel1
Check the status of both nodes:
[root@rhel1 bin]# ./crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.rhel1.gsd application ONLINE ONLINE rhel1
ora.rhel1.ons application ONLINE ONLINE rhel1
ora.rhel1.vip application ONLINE ONLINE rhel1
ora.rhel2.gsd application ONLINE ONLINE rhel2
ora.rhel2.ons application ONLINE ONLINE rhel2
ora.rhel2.vip application ONLINE ONLINE rhel2
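Reading the table by eye works for six resources; for larger clusters the check can be automated. The following is a rough sketch under the assumption that the output keeps the five-column layout shown above; `list_not_online` is a hypothetical helper name.

```shell
# Print the name of every resource whose Target or State column is not ONLINE.
# Expects `crs_stat -t` output on stdin. Header and separator lines are skipped
# because only names starting with "ora." are considered.
list_not_online() {
    awk '$1 ~ /^ora\./ && ($3 != "ONLINE" || $4 != "ONLINE") { print $1 }'
}

# Example: a made-up table in which one resource is OFFLINE.
printf '%s\n' \
    'Name Type Target State Host' \
    '------------------------------------------------------------' \
    'ora.rhel1.ons application ONLINE ONLINE rhel1' \
    'ora.rhel2.ons application ONLINE OFFLINE' | list_not_online
```

An empty result from `./crs_stat -t | list_not_online` would mean every listed resource is ONLINE.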
Everything is back to normal.
--end--
Source: "ITPUB Blog", link: http://blog.itpub.net/23135684/viewspace-732562/. If you reproduce this article, please credit the source; otherwise legal liability will be pursued.