Multicast problem prevents nodes from joining the cluster
1. CRS failure description
Only one node could start CRS; the other three nodes failed to start. cssd.log showed the following errors:
2011-11-17 21:14:15.799: [ CSSD][2833]clssnmvDHBValidateNCopy: node 1, raca1, has a disk HB, but no network HB, DHB has rcfg 212474945, wrtcnt, 4392389, LATS 56951623, lastSeqNo 4392388, uniqueness 1321584882, timestamp 1321586055/1228391238
2011-11-17 21:14:15.799: [ CSSD][2833]clssnmvDHBValidateNCopy: node 2, raca2, has a disk HB, but no network HB, DHB has rcfg 212474945, wrtcnt, 4351379, LATS 56951623, lastSeqNo 4351378, uniqueness 1321585233, timestamp 1321586055/1228149573
2011-11-17 21:14:15.799: [ CSSD][2833]clssnmvDHBValidateNCopy: node 4, racb2, has a disk HB, but no network HB, DHB has rcfg 212474928, wrtcnt, 4325534, LATS 56951623, lastSeqNo 4325533, uniqueness 1321585745, timestamp 1321586055/3135065466
2011-11-17 21:14:16.322: [ CSSD][3604]clssgmWaitOnEventValue: after CmInfo State val 3, eval 1 waited 0
2011-11-17 21:14:16.811: [ CSSD][2833]clssnmvDHBValidateNCopy: node 1, raca1, has a disk HB, but no network HB, DHB has rcfg 212474945, wrtcnt, 4392390, LATS 56952634, lastSeqNo 4392389, uniqueness 1321584882, timestamp 1321586056/1228392250
2011-11-17 21:14:16.811: [ CSSD][2833]clssnmvDHBValidateNCopy: node 2, raca2, has a disk HB, but no network HB, DHB has rcfg 212474945, wrtcnt, 4351380, LATS 56952634, lastSeqNo 4351379, uniqueness 1321585233, timestamp 1321586056/1228150582
2011-11-17 21:14:16.811: [ CSSD][2833]clssnmvDHBValidateNCopy: node 4, racb2, has a disk HB, but no network HB, DHB has rcfg 212474928, wrtcnt, 4325535, LATS 56952634, lastSeqNo 4325534, uniqueness 1321585745, timestamp 1321586056/3135066472
"has a disk HB, but no network HB" -- the nodes have a disk heartbeat, but no network heartbeat.
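A missing network heartbeat points at the private interconnect, so before looking at multicast it is worth ruling out a plain connectivity problem. A minimal sketch; the private host names below are hypothetical, so substitute your cluster's actual private interconnect names or IPs:

```shell
# check_priv_net: ping each private-interconnect host once.
# Host names are hypothetical examples -- use your own private names/IPs.
check_priv_net() {
  for host in "$@"; do
    if ping -c 1 "$host" >/dev/null 2>&1; then
      echo "$host: private network reachable"
    else
      echo "$host: unreachable -- network heartbeat cannot work"
    fi
  done
}

check_priv_net raca1-priv raca2-priv racb1-priv racb2-priv
```

If every node is reachable here, the problem is above basic IP connectivity, which is where the multicast requirement below comes in.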
2. Per MOS Note 1212703.1, Grid Infrastructure supports a redundant interconnect starting with 11.2. To support the redundant interconnect, multicasting must be enabled; otherwise nodes cannot join the cluster!
Oracle Grid Infrastructure 11.2.0.2 introduces a new feature called "Redundant Interconnect Usage", which provides an Oracle internal mechanism to make use of physically redundant network interfaces for the Oracle (private) interconnect.
As part of this new feature, multicast based communication on the private interconnect is utilized to establish communication with peers in the cluster on each startup of the stack on a node.
Multicasting on either of these IPs and the respective port must, however, be enabled and functioning across the network and on each node meant to be part of the cluster.
If multicasting is not enabled as required, nodes will fail to join the cluster with the symptoms discussed.
Multicast test:
perl mcasttest.pl -n raca1,raca2,racb1,racb2 -i en18
########### Setup for node raca1 ##########
Checking node access 'raca1'
Checking node login 'raca1'
Checking/Creating Directory /tmp/mcasttest for binary on node 'raca1'
Distributing mcast2 binary to node 'raca1'
########### Setup for node raca2 ##########
Checking node access 'raca2'
Checking node login 'raca2'
Checking/Creating Directory /tmp/mcasttest for binary on node 'raca2'
Distributing mcast2 binary to node 'raca2'
########### Setup for node racb1 ##########
Checking node access 'racb1'
Checking node login 'racb1'
Checking/Creating Directory /tmp/mcasttest for binary on node 'racb1'
Distributing mcast2 binary to node 'racb1'
########### Setup for node racb2 ##########
Checking node access 'racb2'
Checking node login 'racb2'
Checking/Creating Directory /tmp/mcasttest for binary on node 'racb2'
Distributing mcast2 binary to node 'racb2'
########### testing Multicast on all nodes ##########
Test for Multicast address 230.0.1.0
Nov 18 15:48:23 | Multicast Failed for en18 using address 230.0.1.0:42000
Test for Multicast address 224.0.0.251
Nov 18 15:48:25 | Multicast Succeeded for en18 using address 224.0.0.251:42001
Test result:
Multicast is not enabled for the 230.0.1.0 address, so the nodes cannot join the cluster.
The test failed for the 230.0.1.0 address but succeeded for the 224.0.0.251 multicast address. In this case, Patch: 9974223 must be applied to enable Oracle Grid Infrastructure to use the 224.0.0.251 multicast address. Had the mcasttest.pl test tool failed for both addresses, multicast communication itself would first have to be enabled on the network, as required above.
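Independently of the Oracle test tool, you can inspect which multicast groups the OS has actually joined on each interface. A minimal sketch using Linux-style commands (`ip maddr` or the older `netstat -gn`); the equivalent output on AIX, as in this case, differs:

```shell
# show_mcast_groups: list multicast group memberships per interface.
# Linux syntax shown; the AIX equivalent differs.
show_mcast_groups() {
  ip maddr show 2>/dev/null \
    || netstat -gn 2>/dev/null \
    || echo "no multicast listing tool found"
}

show_mcast_groups
```

While CSSD is starting, the private interface should show membership in the Oracle multicast group; if it never appears, the switch or interface configuration is blocking multicast joins.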
Solution:
Patch 9974223 must be applied. Starting with the 11.2.0.2 Grid Infrastructure PSU 1, patch 9974223 is already included, so applying the PSU is sufficient.
Apply Patch: 9974223 or any subsequent GI Bundle (or PSU) Patch including Patch: 9974223
Patch: 9974223 is included in Oracle Grid Infrastructure Bundle Patch 1 and later and it is recommended to apply Patch: 9974223 via a Bundle Patch rather than applying it individually
After applying PSU 3 (patch 12419353), the nodes could join the cluster normally.
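After patching, it is worth confirming both the patch inventory and the cluster state. A sketch with an illustrative Grid home path (substitute your own); `opatch lsinventory` and `crsctl check cluster` are the standard tools:

```shell
# Illustrative Grid Infrastructure home -- substitute your actual path.
GRID_HOME=${GRID_HOME:-/u01/app/11.2.0/grid}

# Only run the checks when the binaries exist (keeps the sketch safe to copy).
if [ -x "$GRID_HOME/OPatch/opatch" ]; then
  # Confirm patch 9974223 (or a PSU that includes it) is in the inventory.
  "$GRID_HOME/OPatch/opatch" lsinventory | grep -i 9974223
fi

if [ -x "$GRID_HOME/bin/crsctl" ]; then
  # Verify CRS is online on every node of the cluster.
  "$GRID_HOME/bin/crsctl" check cluster -all
fi
```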
Source: ITPUB blog, http://blog.itpub.net/90901/viewspace-1056602/