[RAC] How to change the private IP

Posted by 楊奇龍 on 2011-11-30
Versions:
Clusterware :11.2.0.2
database    :11.2.0.1
# Before the change
10.250.7.115          rac1-priv
10.250.7.119          rac2-priv
# After the change
10.10.10.101          rac1-priv
10.10.10.102          rac2-priv
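In Grid Infrastructure 11.2, the CRS service depends on the private network interface configuration stored in the OCR. The OCR entries below are the ones related to the private IP; they show that crsd depends on the private interconnect IP.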
[SYSTEM.crs.e2eport.rac1]
ORATEXT  : (ADDRESS=(PROTOCOL=tcp)(HOST=10.250.7.115)(PORT=11756))
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION : PROCR_NONE, OTHER_PERMISSION : PROCR_NONE, USER_NAME : grid, GROUP_NAME : oinstall}
[SYSTEM.crs.e2eport.rac2]
ORATEXT  : (ADDRESS=(PROTOCOL=tcp)(HOST=10.250.7.119)(PORT=27917))
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION : PROCR_NONE, OTHER_PERMISSION : PROCR_NONE, USER_NAME : grid, GROUP_NAME : oinstall}
[SYSTEM.CRSD.RESOURCES.ora!net1!network.CONFIG]
ORATEXT  : ACL=owner:root:rwx,pgrp:root:r-x,other::r--,group:oinstall:r-x,user:grid:r-x~ACTION_FAILURE_TEMPLATE=~ACTION_SCRIPT=~AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%~ALIAS_NAME=~AUTO_START=restore~BASE_TYPE=ora.local_resource.type~CHECK_INTERVAL=1~DEFAULT_TEMPLATE=~DEGREE=1~DESCRIPTION=Oracle Network resource~ENABLED=1~LOAD=1~LOGGING_LEVEL=1~NAME=ora.net1.network~NLS_LANG=~NOT_RESTARTING_TEMPLATE=~OFFLINE_CHECK_INTERVAL=60~PROFILE_CHANGE_TEMPLATE=~RESTART_ATTEMPTS=5~SCRIPT_TIMEOUT=60~START_DEPENDENCIES=~START_TIMEOUT=0~STATE_CHANGE_TEMPLATE=~STOP_DEPENDENCIES=~STOP_TIMEOUT=0~TYPE=ora.network.type~TYPE_ACL=owner:root:rwx,pgrp:root:r-x,other::r--~TYPE_VERSION=1.1~UPTIME_THRESHOLD=1d~USR_ORA_AUTO=static~USR_ORA_ENV=~USR_ORA_IF=eth0~USR_ORA_NETMASK=255.255.255.0~USR_ORA_SUBNET=10.250.7.0~VERSION=11.2.0.1.0~
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION : PROCR_NONE, OTHER_PERMISSION : PROCR_NONE, USER_NAME : root, GROUP_NAME : root}
[SYSTEM.CRSD.RESOURCES.ora!net1!network.INTERNAL]
ORATEXT  : CREATION_SEED=27~DEGREE_ID=0~DEGREE_ID@SERVERNAME(rac1)=1~DEGREE_ID@SERVERNAME(rac2)=1~ID=ora.net1.network~ID@SERVERNAME(rac1)=ora.net1.network rac1 1~ID@SERVERNAME(rac2)=ora.net1.network rac2 1~LAST_SERVER=~LAST_SERVER@SERVERNAME(rac1)=rac1~LAST_SERVER@SERVERNAME(rac2)=rac2~TARGET=8~TARGET@SERVERNAME(rac1)=7~TARGET@SERVERNAME(rac2)=7~
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION : PROCR_NONE, OTHER_PERMISSION : PROCR_NONE, USER_NAME : root, GROUP_NAME : root}

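When oifcfg is used to configure the network, the entries above are updated accordingly. If a network registered as cluster_interconnect is unreachable or misconfigured, CRSD cannot run properly and any change to the OCR fails.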
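To see at a glance which interconnect address the OCR records for each node, the e2eport entries can be reduced to key/host pairs. The following is a minimal sketch that assumes the dump text has the same "[KEY]" / "ORATEXT : (ADDRESS=...)" layout as the entries above; the function name e2eport_hosts is made up for this illustration.

```shell
#!/bin/sh
# Print "key host" for every SYSTEM.crs.e2eport entry found in OCR dump
# text read from stdin. Assumes the layout shown in the entries above.
e2eport_hosts() {
    awk '
        /^\[SYSTEM\.crs\.e2eport\./ {
            # Remember the key name without the surrounding brackets.
            key = substr($0, 2, length($0) - 2)
            next
        }
        key != "" && /^ORATEXT/ {
            # Extract the value of the HOST=... field, if present.
            if (match($0, /HOST=[^)]*/))
                print key, substr($0, RSTART + 5, RLENGTH - 5)
            key = ""
        }
    '
}
```

Fed with dump text such as the output of `ocrdump -stdout -keyname SYSTEM.crs.e2eport` (run as a privileged user; treat the exact invocation as an assumption for your version), it prints one line per node.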
If the private IP is misconfigured, the crsd log shows:
2011-11-28 20:53:59.658: [  OCRAPI][1497940384]clsu_get_private_ip_addresses: no ip addresses found.
[  OCRAPI][1497940384]a_init_clsss: failed to call clsu_get_private_ip_addr (7)
2011-11-28 20:53:59.867: [  OCRAPI][1497940384]a_init:13!: Clusterware init unsuccessful : [44]
2011-11-28 20:53:59.867: [  CRSOCR][1497940384] OCR context init failure.  Error: PROC-44: Error in network address and interface operations Network address and interface operations error [7]
2011-11-28 20:53:59.867: [  CRSOCR][1497940384][PANIC] OCR Context is NULL(File: caaocr.cpp, line: 145)
2011-11-28 20:53:59.867: [    CRSD][1497940384][PANIC] CRSD Exiting. OCR Failed
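For this reason the private IP must be changed carefully, and the order of the steps matters.
1. Make sure CRS is running normally on every node in the cluster.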
grid@rac1:/home/grid>olsnodes -s
rac1    Active
rac2    Active
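The check in step 1 can be scripted so that the procedure stops early when a node is not up. This is only a sketch: it reads text in the `olsnodes -s` format shown above from stdin, and the function name all_nodes_active is invented for the example.

```shell
#!/bin/sh
# Succeed (exit status 0) only if every node listed on stdin, in the
# "name  state" format printed by "olsnodes -s", is in state Active.
all_nodes_active() {
    awk '
        NF >= 2 { seen++; if ($2 != "Active") bad++ }
        END     { exit (seen > 0 && bad == 0) ? 0 : 1 }
    '
}
```

Typical use would be `olsnodes -s | all_nodes_active || echo "some node is not Active"`. Empty input is treated as a failure on purpose, so a missing or failing olsnodes does not pass silently.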
2. As the grid user, find the private network that is to be changed.
grid@rac1:/home/grid>oifcfg getif
eth0  10.250.7.0  global  public
eth1  10.250.7.0  global  cluster_interconnect <==== private network
Run a command of the following form to change the subnet registered for eth1:
$ oifcfg setif -global <interface>/<subnet>:cluster_interconnect
3. Change the private subnet to 10.10.10.0.
grid@rac1:/home/grid>oifcfg setif -global  eth1/10.10.10.0:cluster_interconnect
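Note that oifcfg setif takes the subnet (10.10.10.0), not a host address such as 10.10.10.101. As a small worked example, the subnet is the bitwise AND of the address and the netmask, octet by octet. The helper name subnet_of is hypothetical, and the 255.255.255.0 mask is the one that appears in this article's ifconfig output.

```shell
#!/bin/sh
# Print the network address for a dotted-quad IPv4 address and netmask.
# Usage: subnet_of 10.10.10.101 255.255.255.0
subnet_of() {
    ip=$1
    mask=$2
    old_ifs=$IFS
    IFS=.
    # Split both values on dots: $1..$4 are the address octets,
    # $5..$8 are the netmask octets.
    set -- $ip $mask
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}
```

For the new address in this article, `subnet_of 10.10.10.101 255.255.255.0` prints 10.10.10.0, which is the value passed to setif above.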
4. Verify the change (at this point both the old and the new subnet are listed):
grid@rac1:/home/grid>oifcfg getif
eth0  10.250.7.0  global  public
eth1  10.250.7.0  global  cluster_interconnect
eth1  10.10.10.0  global  cluster_interconnect
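The verification can also be done by a script instead of by eye. The sketch below reads text in the `oifcfg getif` format shown above from stdin; has_interconnect is an illustrative name, not an Oracle tool.

```shell
#!/bin/sh
# Succeed if stdin (in "oifcfg getif" format) lists the given interface
# and subnet with the role cluster_interconnect.
# Usage: oifcfg getif | has_interconnect eth1 10.10.10.0
has_interconnect() {
    awk -v ifname="$1" -v subnet="$2" '
        $1 == ifname && $2 == subnet && $NF == "cluster_interconnect" { found = 1 }
        END { exit found ? 0 : 1 }
    '
}
```

The same function can be reused after the old entry has been deleted, this time expecting it to fail for eth1 and 10.250.7.0.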
5. As root, stop CRS on all nodes and disable it. If the database is running, shut it down first.
oracle@rac1:/home/oracle>srvctl stop database -d rac  -o immediate
oracle@rac1:/home/oracle>
[root@rac2 peer]# /opt/11202/11.2.0/grid/bin/crsctl stop crs    
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac2'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac2'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'rac2'
CRS-2673: Attempting to stop 'ora.cvu' on 'rac2'
...(omitted)...
CRS-2673: Attempting to stop 'ora.asm' on 'rac2'
CRS-2677: Stop of 'ora.asm' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rac2'
CRS-2677: Stop of 'ora.ons' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rac2'
CRS-2677: Stop of 'ora.net1.network' on 'rac2' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rac2' has completed
CRS-2677: Stop of 'ora.crsd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rac2'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac2'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac2'
CRS-2673: Attempting to stop 'ora.asm' on 'rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac2'
...(omitted)...
CRS-2677: Stop of 'ora.gpnpd' on 'rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac2 ~]# /opt/11202/11.2.0/grid/bin/crsctl disable crs     
CRS-4621: Oracle High Availability Services autostart is disabled.
6. On every node in the cluster, make the corresponding private IP change at the OS level, including /etc/sysconfig/network-scripts/ifcfg-eth1 and /etc/hosts, then restart the network service.
[root@rac2 ~]# service network restart
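Shutting down interface eth0:                              [  OK  ]
Shutting down interface eth1:                              [  OK  ]
Shutting down loopback interface:                          [  OK  ]
Bringing up loopback interface:                            [  OK  ]
Bringing up interface eth0:                                [  OK  ]
Bringing up interface eth1:                                [  OK  ]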
[root@rac1 ~]# ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:50:56:8F:25:0A  
          inet addr:10.250.7.225  Bcast:10.250.7.255  Mask:255.255.255.0
  ....(omitted)....

eth1      Link encap:Ethernet  HWaddr 00:50:56:8F:6F:49  
          inet addr:10.10.10.101  Bcast:10.10.10.255  Mask:255.255.255.0
[root@rac1 ~]# ping 10.10.10.101
PING 10.10.10.101 (10.10.10.101) 56(84) bytes of data.
64 bytes from 10.10.10.101: icmp_seq=1 ttl=64 time=0.037 ms
64 bytes from 10.10.10.101: icmp_seq=2 ttl=64 time=0.020 ms
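The /etc/hosts part of step 6 has to be repeated on every node, so a small helper can reduce typing mistakes. This is a sketch under stated assumptions: set_host_ip is a made-up name, it only handles lines of the simple form "address hostname" (any aliases on a matching line are dropped), and it is best tried on a copy of the file first.

```shell
#!/bin/sh
# Rewrite the address of one host name in a hosts-format file, in place.
# Only non-comment lines whose second field equals HOSTNAME are changed.
# Usage: set_host_ip FILE HOSTNAME NEW_IP
set_host_ip() {
    tmp=$(mktemp) || return 1
    awk -v host="$2" -v ip="$3" '
        $1 !~ /^#/ && $2 == host { print ip, host; next }
        { print }
    ' "$1" > "$tmp" && cat "$tmp" > "$1"   # cat keeps the original file's owner and mode
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

For the addresses in this article the calls would be `set_host_ip /etc/hosts rac1-priv 10.10.10.101` and `set_host_ip /etc/hosts rac2-priv 10.10.10.102`, run as root on each node.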
7. Enable CRS and restart it on the cluster nodes.
[root@rac1 ~]# /opt/11202/11.2.0/grid/bin/crsctl enable crs    
CRS-4622: Oracle High Availability Services autostart is enabled.
[root@rac1 ~]# /opt/11202/11.2.0/grid/bin/crsctl start crs         
CRS-4640: Oracle High Availability Services is already active
8. Because setif added the new private network (eth1  10.10.10.0  global  cluster_interconnect), the old one (eth1  10.250.7.0  global  cluster_interconnect) must now be removed:
grid@rac1:/home/grid>oifcfg delif -global  eth1/10.250.7.0:cluster_interconnect   
grid@rac1:/home/grid>oifcfg getif 
eth0  10.250.7.0  global  public
eth1  10.10.10.0  global  cluster_interconnect
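Notes:
#1. This step is not required when a second interface is being added.
#2. If the new interface is added without removing the old one, the old interface remains usable when CRS restarts. After the deletion, CRS must be restarted to make sure the old interface is no longer used.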

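Things to watch out for:
1. If the IP address is changed only at the OS level (for example in /etc/hosts and ifcfg-ethN) and the corresponding value is not updated with oifcfg, CRSD will fail to start the next time CRS is restarted.
crsd.log then contains entries like the following: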
2010-01-30 09:22:47.234: [ default][2926461424] CRS Daemon Starting
..
2010-01-30 09:22:47.273: [ GPnP][2926461424]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=7153, tl=3, f=0
2010-01-30 09:22:47.282: [ OCRAPI][2926461424]clsu_get_private_ip_addresses: no ip addresses found.
2010-01-30 09:22:47.282: [GIPCXCPT][2926461424] gipcShutdownF: skipping shutdown, count 2, from [ clsinet.c : 1732], ret gipcretSuccess (0)
2010-01-30 09:22:47.283: [GIPCXCPT][2926461424] gipcShutdownF: skipping shutdown, count 1, from [ clsgpnp0.c : 1021], ret gipcretSuccess (0)
[ OCRAPI][2926461424]a_init_clsss: failed to call clsu_get_private_ip_addr (7)
2010-01-30 09:22:47.285: [ OCRAPI][2926461424]a_init:13!: Clusterware init unsuccessful : [44]
2010-01-30 09:22:47.285: [ CRSOCR][2926461424] OCR context init failure. Error: PROC-44: Error in network address and interface operations Network address and interface operations error [7]
2010-01-30 09:22:47.285: [ CRSD][2926461424][PANIC] CRSD exiting: Could not init OCR, code: 44
2010-01-30 09:22:47.285: [ CRSD][2926461424] Done.
The entries from my own crsd.log:
2011-11-23 14:18:33.025: [ CRSMAIN][3747852704] Initializing OCR
2011-11-23 14:18:33.027: [ CRSMAIN][1076652352] Policy Engine is not initialized yet!
2011-11-23 14:18:33.079: [  OCRAPI][3747852704]clsu_get_private_ip_addresses: no ip addresses found.
2011-11-23 14:18:33.289: [  OCRAPI][3747852704]a_init:13!: Clusterware init unsuccessful : [44]
2011-11-23 14:18:33.289: [  CRSOCR][3747852704] OCR context init failure.  Error: PROC-44: Error in network address and interface operations Network address and interface operations error [7]
2011-11-23 14:18:33.289: [  CRSOCR][3747852704][PANIC] OCR Context is NULL(File: caaocr.cpp, line: 145)
2011-11-23 14:18:33.289: [    CRSD][3747852704][PANIC] CRSD Exiting. OCR Failed
These errors indicate a mismatch between the OS network settings (as reported by oifcfg iflist) and the settings in the gpnp profile.
Solution: restore the OS-level IP to its previous setting, start CRS, and then change the private IP following the correct procedure.
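When CRSD will not start, it helps to confirm quickly that the log really shows this failure and not some other one. A minimal sketch, assuming the two message fragments seen in both log excerpts above are stable across versions; private_ip_failure is an illustrative name.

```shell
#!/bin/sh
# Succeed if crsd.log text on stdin shows the pattern described above:
# "no ip addresses found" followed later by a PROC-44 OCR init failure.
private_ip_failure() {
    awk '
        /clsu_get_private_ip_addresses: no ip addresses found/ { noip = 1 }
        noip && /PROC-44/ { hit = 1 }
        END { exit hit ? 0 : 1 }
    '
}
```

Usage would look like `private_ip_failure < crsd.log && echo "private IP mismatch suspected"`, with the path to crsd.log depending on the installation.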

2. If any node in the cluster is down, oifcfg returns an error:
$ oifcfg setif -global bond0/192.168.0.0:cluster_interconnect
PRIF-26: Error in update the profiles in the cluster
Solution: start CRS on the nodes where it is not running, and make sure CRS is running normally on all nodes.

3. If a user other than the Grid Infrastructure owner runs oifcfg, the same error is returned:
$ oifcfg setif -global bond0/192.168.0.0:cluster_interconnect
PRIF-26: Error in update the profiles in the cluster
Solution: log in as the grid user and run oifcfg from that account.

4. Starting with 11.2.0.2, an attempt to delete the last private interface (cluster_interconnect) without first adding a new one fails with the following error:
PRIF-31: Failed to delete the specified network interface because it is the last private interface
Solution: add the new private interface first, then delete the old one.

5. If CRS is down on the node, the following error is reported:
$ oifcfg getif
PRIF-10: failed to initialize the cluster registry
Solution: start CRS on that node.
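The PRIF errors above and their remedies can be kept next to the change script as a small lookup. The sketch below only restates the remedies listed in this section; prif_hint is a hypothetical helper name.

```shell
#!/bin/sh
# Print the remedy from the list above for a PRIF error message.
# Usage: prif_hint "PRIF-31: Failed to delete the specified network interface ..."
prif_hint() {
    case "$1" in
        *PRIF-26*) echo "start CRS on every node and run oifcfg as the Grid Infrastructure owner" ;;
        *PRIF-31*) echo "add the new private interface before deleting the old one" ;;
        *PRIF-10*) echo "start CRS on this node" ;;
        *)         echo "no hint recorded for this message" ;;
    esac
}
```

PRIF-26 maps to both causes described above (a node that is down, or the wrong OS user) because the message alone does not distinguish them.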

Source: ITPUB blog, link: http://blog.itpub.net/22664653/viewspace-712441/. If you repost this article, please credit the source; otherwise legal action may be taken.
