Linux: NetworkManager Causes Abnormal Node Reboots
The reboots appear to be caused by NetworkManager; the root cause is still to be investigated.
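Today I installed a test RAC on a VMware virtual machine and ran into yet another baffling problem.
After CRS was installed, the node rebooted as soon as I logged in to the graphical desktop, and it did so every time.
Without a graphical login, I watched the node for an hour and saw no problems; once I logged in, it rebooted immediately.
In the OS log, the entries written after a graphical login and just before the reboot were: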
Sep 5 19:29:18 dm01db01 nm-system-settings: Loaded plugin ifcfg-rh: (c) 2007 - 2008 Red Hat, Inc. To report bugs please use the NetworkManager mailing list.
Sep 5 19:29:18 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-lo ...
Sep 5 19:29:18 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth1 ...
Sep 5 19:29:18 dm01db01 nm-system-settings: ifcfg-rh: read connection 'System eth1'
Sep 5 19:29:18 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth0 ...
Sep 5 19:29:18 dm01db01 nm-system-settings: ifcfg-rh: read connection 'System eth0'
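The clusterware wrote no logs at all. It was as if someone had reset the machine directly, and I could find no cause.
Pinging the heartbeat (interconnect) address occasionally showed latencies above 200 ms, but right before the reboots the ping times were all within a few ms.
Monitoring with vmstat showed nothing wrong with CPU utilization either.
I tried the following adjustments: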
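The latency check described above can be made less manual with a small filter. This is only a sketch: `slow_pings` is a hypothetical helper name, the 100 ms threshold is arbitrary, and it assumes the usual Linux `ping` output format with a `time=<n> ms` field.

```shell
#!/bin/sh
# Print only the ping replies whose round-trip time exceeds a threshold.
# slow_pings is a hypothetical helper; it reads ping output on stdin.
slow_pings() {
    threshold_ms="$1"
    awk -v limit="$threshold_ms" '
        # Look for the "time=<number>" field that ping prints per reply.
        match($0, /time=[0-9.]+/) {
            # Skip the 5 characters of "time=" and force a numeric value.
            rtt = substr($0, RSTART + 5, RLENGTH - 5) + 0
            if (rtt > limit) print
        }'
}

# Example (the address is a placeholder for the private interconnect IP):
#   ping -c 600 <private-ip> | slow_pings 100
```

Running this in a text console while reproducing the problem gives a record of any latency spikes leading up to a reboot.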
1. Increased misscount: no effect
2. Adjusted diagwait: still no logs
3. Disabled unneeded services: no effect
4. Moved to a different network segment: no effect
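I kept suspecting the network, so I searched for the keyword ifcfg-rh and found the note OEL: Error: Missing Or Invalid IP4 Prefix '0' On Linux Server (Doc ID 1522095.1).
The symptom described there has nothing to do with my problem, but as a last resort I followed the note and disabled NetworkManager: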
1. Add NM_CONTROLLED="no" to /etc/sysconfig/network-scripts/ifcfg-eth*
2. chkconfig NetworkManager off
3. reboot
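The three steps above could be scripted roughly as follows. This is a sketch, not the procedure from the Oracle note: `set_nm_uncontrolled` is a hypothetical helper name, and it assumes GNU `sed` (for `-i`), which is what Oracle Linux ships.

```shell
#!/bin/sh
# Mark every ifcfg-eth* file in a directory as not managed by NetworkManager.
# set_nm_uncontrolled is a hypothetical helper, not part of any package.
set_nm_uncontrolled() {
    dir="$1"
    for f in "$dir"/ifcfg-eth*; do
        [ -f "$f" ] || continue            # the glob matched nothing
        if grep -q '^NM_CONTROLLED=' "$f"; then
            # An existing setting is rewritten in place.
            sed -i 's/^NM_CONTROLLED=.*/NM_CONTROLLED="no"/' "$f"
        else
            # Make sure the file ends with a newline before appending.
            [ -n "$(tail -c1 "$f")" ] && echo >> "$f"
            printf 'NM_CONTROLLED="no"\n' >> "$f"
        fi
    done
}

# On the affected node, as root (steps 1 to 3 from the list above):
#   set_nm_uncontrolled /etc/sysconfig/network-scripts
#   chkconfig NetworkManager off
#   reboot
```

The helper is idempotent, so it can be rerun safely; it is worth copying the ifcfg files somewhere outside network-scripts first so the change can be reverted.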
After the reboot the host ran normally. The OS log now showed:
Sep 5 19:41:06 dm01db01 nm-system-settings: Loaded plugin ifcfg-rh: (c) 2007 - 2008 Red Hat, Inc. To report bugs please use the NetworkManager mailing list.
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-lo ...
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth1 ...
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: read connection 'System eth1'
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: Ignoring connection 'System eth1' and its device because NM_CONTROLLED was false.
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: parsing /etc/sysconfig/network-scripts/ifcfg-eth0 ...
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: read connection 'System eth0'
Sep 5 19:41:06 dm01db01 nm-system-settings: ifcfg-rh: Ignoring connection 'System eth0' and its device because NM_CONTROLLED was false.
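As the log shows, the eth0 and eth1 connections are now ignored by NetworkManager.
I am recording this for now and will investigate further later.
Version information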
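To confirm the same thing without reading the log by eye, the "Ignoring connection" lines can be extracted with a short filter. This is a sketch: `nm_ignored_connections` is a hypothetical helper name, and /var/log/messages as the log location is an assumption (the usual syslog default on this release).

```shell
#!/bin/sh
# List the connections that the ifcfg-rh plugin reports as ignored
# because NM_CONTROLLED was false. Hypothetical helper name.
nm_ignored_connections() {
    logfile="$1"
    # Capture the quoted connection name from each matching log line,
    # then de-duplicate across repeated boots.
    sed -n "s/.*ifcfg-rh: Ignoring connection '\(.*\)' and its device because NM_CONTROLLED was false.*/\1/p" "$logfile" | sort -u
}

# Example (log path is an assumption):
#   nm_ignored_connections /var/log/messages
```

If an interface that should be unmanaged is missing from the output, its ifcfg file most likely still lacks the NM_CONTROLLED="no" line.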
[root@dm01db01 network-scripts]# cat /etc/issue
Oracle Linux Server release 5.9
Kernel \r on an \m
[root@dm01db01 network-scripts]# uname -a
Linux dm01db01 2.6.39-300.26.1.el5uek #1 SMP Thu Jan 3 18:31:38 PST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@dm01db01 network-scripts]# /u01/app/oracle/product/crs/bin/crsctl query crs activeversion
CRS active version on the cluster is [10.2.0.5.0]
From the ITPUB blog, link: http://blog.itpub.net/8242091/viewspace-772247/. If you repost this article, please credit the source; otherwise legal action may be taken.