Oracle 11g RAC: a firewall between the two nodes allows only one node to run
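#Background:
Once the distributed firewall inside the Sangfor cloud platform is enabled, a firewall sits between every pair of hosts. At first we did not know that the distributed firewall had been turned on.
#Symptoms:
Starting the cluster stack on the faulty node hits an error, and the cluster cannot finish starting. The alert log: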
2021-10-28 07:30:56.944:
[ohasd(2313)]CRS-2112:The OLR service started on node cmsorcl2.
2021-10-28 07:30:56.955:
[ohasd(2313)]CRS-1301:Oracle High Availability Service started on node cmsorcl2.
2021-10-28 07:30:56.958:
[ohasd(2313)]CRS-8017:location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2021-10-28 07:30:57.283:
[/u01/app/11.2.0/grid/bin/oraagent.bin(7198)]CRS-5011:Check of resource "+ASM" failed: details at "(:CLSN00006:)" in "/u01/app/11.2.0/grid/log/cmsorcl2/agent/ohasd/oraagent_grid/oraagent_grid.log"
2021-10-28 07:31:00.326:
[/u01/app/11.2.0/grid/bin/orarootagent.bin(7202)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2021-10-28 07:31:02.503:
[ohasd(2313)]CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon is not running).
2021-10-28 07:31:02.504:
[gpnpd(8934)]CRS-2328:GPNPD started on node cmsorcl2.
2021-10-28 07:31:04.840:
[cssd(10891)]CRS-1713:CSSD daemon is started in clustered mode
2021-10-28 07:31:06.672:
[ohasd(2313)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2021-10-28 07:31:06.673:
[ohasd(2313)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2021-10-28 07:31:30.624:
[cssd(10891)]CRS-1707:Lease acquisition for node cmsorcl2 number 2 completed
2021-10-28 07:31:32.008:
[cssd(10891)]CRS-1605:CSSD voting file is online: /dev/asm-diskc; details in /u01/app/11.2.0/grid/log/cmsorcl2/cssd/ocssd.log.
2021-10-28 07:31:32.048:
[cssd(10891)]CRS-1605:CSSD voting file is online: /dev/asm-diskb; details in /u01/app/11.2.0/grid/log/cmsorcl2/cssd/ocssd.log.
2021-10-28 07:31:32.089:
[cssd(10891)]CRS-1605:CSSD voting file is online: /dev/asm-diska; details in /u01/app/11.2.0/grid/log/cmsorcl2/cssd/ocssd.log.
2021-10-28 15:38:34.926:
[/u01/app/11.2.0/grid/bin/cssdagent(10858)]CRS-5818:Aborted command 'start' for resource 'ora.cssd'. Details at (:CRSAGF00113:) {0:0:2} in /u01/app/11.2.0/grid/log/cmsorcl2/agent/ohasd/oracssdagent_root/oracssdagent_root.log.
2021-10-28 15:38:34.926:
[cssd(10891)]CRS-1656:The CSS daemon is terminating due to a fatal error; Details at (:CSSSC00012:) in /u01/app/11.2.0/grid/log/cmsorcl2/cssd/ocssd.log
2021-10-28 15:38:34.926:
[cssd(10891)]CRS-1603:CSSD on node cmsorcl2 shutdown by user.
2021-10-28 15:38:40.254:
[ohasd(2313)]CRS-2765:Resource 'ora.cssdmonitor' has failed on server 'cmsorcl2'.
2021-10-28 15:38:41.668:
[cssd(6922)]CRS-1713:CSSD daemon is started in clustered mode
2021-10-28 15:38:43.435:
[ohasd(2313)]CRS-2767:Resource state recovery not attempted for 'ora.diskmon' as its target state is OFFLINE
2021-10-28 15:38:43.435:
[ohasd(2313)]CRS-2769:Unable to failover resource 'ora.diskmon'.
2021-10-28 15:38:57.400:
[cssd(6922)]CRS-1707:Lease acquisition for node cmsorcl2 number 2 completed
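#Solution:
A CSS failure is usually caused by a communication problem between the nodes. The details are in ocssd.log, which shows messages about failed access to ports on the private network.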
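Before opening ocssd.log, it can help to locate the CSSD failure in the clusterware alert log. The following is a minimal sketch, assuming the log layout shown above (a timestamp line followed by a `[process(pid)]CRS-nnnn:message` line). The function name `find_cssd_failures` and the choice of codes to flag are assumptions made for illustration, not an official list.

```python
import re

# Codes taken from the alert log above: the aborted start of ora.cssd and the
# CSS daemon terminating. Which codes to flag is an illustrative assumption.
CSSD_FAILURE_CODES = {"CRS-5818", "CRS-1656"}

# A timestamp sits on its own line and ends with a colon.
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+):$")
# An event line looks like "[cssd(10891)]CRS-1656:The CSS daemon is ...".
EVENT_RE = re.compile(r"^\[([^\]]+)\](CRS-\d+):(.*)$")


def find_cssd_failures(log_text):
    """Return (timestamp, process, code, message) tuples for flagged events."""
    failures = []
    timestamp = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        ts_match = TIMESTAMP_RE.match(line)
        if ts_match:
            # Remember the timestamp so it can be attached to the next event.
            timestamp = ts_match.group(1)
            continue
        event = EVENT_RE.match(line)
        if event and event.group(2) in CSSD_FAILURE_CODES:
            failures.append(
                (timestamp, event.group(1), event.group(2), event.group(3))
            )
    return failures


if __name__ == "__main__":
    sample = (
        "2021-10-28 15:38:34.926:\n"
        "[cssd(10891)]CRS-1603:CSSD on node cmsorcl2 shutdown by user.\n"
        "2021-10-28 15:38:34.926:\n"
        "[cssd(10891)]CRS-1656:The CSS daemon is terminating due to a fatal "
        "error; Details at (:CSSSC00012:) in "
        "/u01/app/11.2.0/grid/log/cmsorcl2/cssd/ocssd.log\n"
    )
    for timestamp, process, code, message in find_cssd_failures(sample):
        print(timestamp, process, code, message)
```

The flagged messages name the log file to read next, here the ocssd.log under the node's cssd directory.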
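In the firewall, remove every restriction among the database's public, private, VIP, and SCAN IP addresses.
If the database does not use HAIP, it should be back to normal at this point. If HAIP is in use (it is enabled by default and uses addresses in the 169.254.x.x range), the restrictions between the HAIP addresses must be removed as well. Otherwise the ASM instance cannot start. The symptom looks like this: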
Wed Nov 17 22:05:45 2021
MMON started with pid=20, OS id=6979
Wed Nov 17 22:05:45 2021
MMNL started with pid=21, OS id=6981
lmon registered with NM - instance number 2 (internal mem no 1)
Wed Nov 17 22:07:45 2021
PMON (ospid: 6911): terminating the instance due to error 481 <----- the symptom; see http://blog.itpub.net/29615408/viewspace-1384760/
Wed Nov 17 22:07:45 2021
ORA-1092 : opitsk aborting process
Wed Nov 17 22:07:45 2021
System state dump requested by (instance=2, osid=6911 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /u01/app/grid/diag/asm/+asm/+ASM2/trace/+ASM2_diag_6949_20211117220745.trc
Dumping diagnostic data in directory=[cdmp_20211117220745], requested by (instance=2, osid=6911 (PMON)), summary=[abnormal instance termination].
Instance terminated by PMON, pid = 6911
Wed Nov 17 22:08:06 2021
NOTE: No asm libraries found in the system
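When a firewall sits between the HAIP addresses of the two nodes, the symptoms do not point clearly at the cause. The ASM alert log is shown above. Manually starting ASM to nomount fails with ORA-03113: end-of-file on communication channel, and neither the cluster alert log nor the trc files explicitly name HAIP as the problem. The message PMON (ospid: 6911): terminating the instance due to error 481 can have the following causes: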
Case1: link local IP (169.254.x.x) is being used by other adapter/network
Case2: firewall exists between nodes on private network (iptables etc)
Case3: HAIP is up on some nodes but not on all
Case4: HAIP is up on all nodes but some do not have route info
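Because the logs do not name HAIP directly, one option is to scan the ASM alert log for the termination message and print the four cases as a checklist. This is a rough sketch under that assumption. The names `find_terminations` and `checklist_for` are hypothetical, and the pattern is derived only from the log excerpt above.

```python
import re

# Pattern for the termination message shown in the ASM alert log above.
TERMINATION_RE = re.compile(
    r"^(\w+) \(ospid: (\d+)\): terminating the instance due to error (\d+)"
)

# The four cases listed above, kept as a checklist for error 481.
ERROR_481_CHECKLIST = [
    "link local IP (169.254.x.x) is being used by other adapter/network",
    "firewall exists between nodes on private network (iptables etc)",
    "HAIP is up on some nodes but not on all",
    "HAIP is up on all nodes but some do not have route info",
]


def find_terminations(alert_text):
    """Return (process, ospid, error_number) for each termination line."""
    results = []
    for line in alert_text.splitlines():
        match = TERMINATION_RE.match(line.strip())
        if match:
            results.append(
                (match.group(1), int(match.group(2)), int(match.group(3)))
            )
    return results


def checklist_for(alert_text):
    """Return the error 481 checklist if the log shows that termination."""
    if any(error == 481 for _, _, error in find_terminations(alert_text)):
        return list(ERROR_481_CHECKLIST)
    return []


if __name__ == "__main__":
    sample = "PMON (ospid: 6911): terminating the instance due to error 481\n"
    for number, case in enumerate(checklist_for(sample), start=1):
        print(f"Case{number}: {case}")
```

In the incident described here, the second case applied: the firewall sat between the nodes on the private network.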
After the firewall policies between the HAIP addresses were finally disabled, the database recovered!
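To work out which addresses such a policy has to cover, the HAIP addresses can be told apart from the rest because they fall in the link-local range 169.254.0.0/16. The sketch below assumes the interface addresses of each node have already been collected by some other means. The node names and addresses are hypothetical placeholders, and the function names are invented for illustration.

```python
import ipaddress

# HAIP addresses fall in the IPv4 link-local range.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def haip_candidates(addresses):
    """Return the addresses that fall inside 169.254.0.0/16."""
    return [a for a in addresses if ipaddress.ip_address(a) in LINK_LOCAL]


def nodes_missing_haip(addresses_by_node):
    """Return the nodes that have no link-local address at all (Case3)."""
    return sorted(
        node
        for node, addresses in addresses_by_node.items()
        if not haip_candidates(addresses)
    )


if __name__ == "__main__":
    # Hypothetical data: replace with the addresses actually seen on each node.
    addresses_by_node = {
        "nodeA": ["192.0.2.11", "198.51.100.11", "169.254.10.20"],
        "nodeB": ["192.0.2.12", "198.51.100.12"],
    }
    for node, addresses in sorted(addresses_by_node.items()):
        print(node, "HAIP candidates:", haip_candidates(addresses))
    print("Nodes without HAIP:", nodes_missing_haip(addresses_by_node))
```

A link-local address on a node only makes it a candidate. Whether it really belongs to HAIP, and whether another adapter is also using the range (Case1), still has to be confirmed on the node itself.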
From the ITPUB blog, link: http://blog.itpub.net/69900971/viewspace-2842920/. If you reproduce this article, please credit the source; otherwise legal liability will be pursued.