記:僅配置單vip連線串,當vip對應的節點down機情況下程式無法連線上正常節點的故障

denglt發表於2012-10-10
問題摘要
  昨天,生產庫RAC的二號節點倒了,但是部分程式無法連線上正常的一號節點,發現程式的連線串僅配置了二號節點的vip.
    在測試環境模擬錯誤發生的情況,並找到在二號節點短時間內無法正常啟動和程式無法修改的情況下的臨時解決方法.
具體實驗情況如下:
一.測試環境如下   
RAC 環境:
172.16.89.229   node1
172.16.90.132   node2
172.16.90.167   node1-vip
172.16.90.168   node2-vip
192.168.126.101 node1-priv
192.168.126.102 node2-priv
節點一listener.ora檔案:
LISTENER_NODE1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521)(IP = FIRST))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521)(IP = FIRST))
    )
  )
 
客戶端tnsnames配置如下:
NODE2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = 172.16.90.168)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = racdb)   
    )
  )
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Prod
PL/SQL Release 10.2.0.4.0 - Production
CORE    10.2.0.4.0      Production
TNS for Linux: Version 10.2.0.4.0 - Production
NLSRTL Version 10.2.0.4.0 - Production

二.模擬錯誤
把節點二機器down機,節點二的vip轉移到節點一,如下
[oracle@node1 ~]$ crs_stat -t
Name           Type           Target    State     Host       
------------------------------------------------------------
ora....B1.inst application    ONLINE    ONLINE    node1      
ora....B2.inst application    OFFLINE   OFFLINE              
ora.RACDB.db   application    ONLINE    ONLINE    node1      
ora....SM1.asm application    ONLINE    ONLINE    node1      
ora....E1.lsnr application    ONLINE    ONLINE    node1      
ora.node1.gsd  application    ONLINE    ONLINE    node1      
ora.node1.ons  application    ONLINE    ONLINE    node1      
ora.node1.vip  application    ONLINE    ONLINE    node1      
ora....SM2.asm application    ONLINE    OFFLINE              
ora....E2.lsnr application    OFFLINE   OFFLINE              
ora.node2.gsd  application    ONLINE    OFFLINE              
ora.node2.ons  application    ONLINE    OFFLINE              
ora.node2.vip  application    ONLINE    ONLINE    node1   --轉到節點二了.
 
SQL> CONN dlt/dlt@NODE2
ERROR:
ORA-12541: TNS:no listener
Warning: You are no longer connected to ORACLE.
 
檢視:節點一的監聽狀態
[oracle@node1 admin]$ lsnrctl status
LSNRCTL for Linux: Version 10.2.0.4.0 - Production on 09-OCT-2012 17:44:37
Copyright (c) 1991, 2007, Oracle.  All rights reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_NODE1
Version                   TNSLSNR for Linux: Version 10.2.0.4.0 - Production
Start Date                09-OCT-2012 17:39:28
Uptime                    0 days 0 hr. 5 min. 8 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /opt/ora10g/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File         /opt/ora10g/product/10.2.0/db_1/network/log/listener_node1.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.90.167)(PORT=1521)))  --注意:這兒的host直接是ip
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.89.229)(PORT=1521)))
Services Summary...
...
The command completed successfully
 
並沒有監聽node2-priv,所以無法通過node2-priv連線到節點一的例項.
 
三.問題可以通過下面的臨時方法解決

方法一:通過lsnrctl命令重啟LISTENER_NODE1(一定不能用crs_stop ,crs_start ):
  [oracle@node1 admin]$ lsnrctl stop
    [oracle@node1 admin]$ lsnrctl start
  [oracle@node1 admin]$ lsnrctl status
  
  LSNRCTL for Linux: Version 10.2.0.4.0 - Production on 09-OCT-2012 17:51:40
  
  Copyright (c) 1991, 2007, Oracle.  All rights reserved.
  
  Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
  STATUS of the LISTENER
  ------------------------
 ...
  Listening Endpoints Summary...
    (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=node1)(PORT=1521)))  --注意:這兒的host為node1
  Services Summary...
  ....
  The command completed successfully   
  測試連線:
SQL> CONN dlt/dlt@node2;
Connected.
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
RACDB1
SQL>
成功連線到節點一的例項
 
方法二:修改節點一listener.ora檔案,增加對node2-priv的監聽.
LISTENER_NODE1 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521)(IP = FIRST))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1)(PORT = 1521)(IP = FIRST))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521)(IP = FIRST))  注:新增加內容
    )
  )
 重啟監聽:
 [oracle@node1 admin]$ crs_stop ora.node1.LISTENER_NODE1.lsnr
 [oracle@node1 admin]$ crs_start ora.node1.LISTENER_NODE1.lsnr 
 [oracle@node1 admin]$ lsnrctl status
LSNRCTL for Linux: Version 10.2.0.4.0 - Production on 09-OCT-2012 17:44:37
Copyright (c) 1991, 2007, Oracle.  All rights reserved.
Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_NODE1
Version                   TNSLSNR for Linux: Version 10.2.0.4.0 - Production
Start Date                09-OCT-2012 17:39:28
Uptime                    0 days 0 hr. 5 min. 8 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /opt/ora10g/product/10.2.0/db_1/network/admin/listener.ora
Listener Log File         /opt/ora10g/product/10.2.0/db_1/network/log/listener_node1.log
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.90.167)(PORT=1521))) 
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.89.229)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.16.90.168)(PORT=1521)))  --注意:這兒出現我們需要的ip了
Services Summary...
...
The command completed successfully
測試連線:
SQL> CONN dlt/dlt@node2;
Connected.
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
RACDB1
SQL>
成功連線到節點一的例項
 
最後:上面的僅是臨時的解決方法,最好的方法還是把所有的vip都配置在連線串中.

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/195110/viewspace-746049/,如需轉載,請註明出處,否則將追究法律責任。

相關文章