RAC的兩大應用特性FAILOVER和LOAD_BALANCE總結

warehouse發表於2012-03-26

failover容易理解,案例也容易構造

load_balance不太容易弄,還好都清楚了。

[@more@]

FAILOVER特性介紹
1.connect-time failover
8i提出來這個特性,為後來事先TAF打下了基礎。
connect-time failover只是在連線的時侯選擇能連上的例項進行連線
,只是在連線時做判斷。042有個題目考得就是這個知識點。
connect-time failover的實現就是在tns配置檔案裡寫多個地址列表就可以,
本身oracle預設FAILOVER=ON
Your database is not configured for session failover. Your tnsnames.ora file contains the following
details:
test.us.oracle.com=
(DESCRIPTION=
(ADDRESS_LIST=
(LOAD_BALANCE=off)
(FAILOVER=ON)
(ADDRESS=(PROTOCOL=tcp)(HOST=test1server)(PORT=1521))
(ADDRESS=(PROTOCOL=tcp)(HOST=test2server)(PORT=1521)))
(CONNECT_DATA=(SERVICE_NAME=test.us.oracle.com))
)
Which feature is enabled in this case?
Testinside
| English | Chinese(Traditional) | Chinese(Simplified) | 10
TestInside
Information Co., Ltd. All rights reserved.
A. Load balancing
B. Instance failover
C. Database failover
D. Connecttime
failover
E. Transparent Application Failover (TAF)
Answer: D
--=========================
2.CLIENT TAF(Transparent Application Failover)
9i透過在client段的tns裡配置FAILOVER=ON以及相關的屬性來實現
連線例項出現問題之後應用可以透明的切換到其他例項上,具體配置如下:
ASMDB_TAF =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = nodel-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = ASMDB)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = BASIC)
(RETRIES = 180)
(DELAY = 5)
)
)
)
--========================
測試如下:
C:>sqlplus as sysdba

SQL*Plus: Release 10.2.0.1.0 - Production on Mon Mar 26 11:19:01 2012

Copyright (c) 1982, 2005, Oracle. All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

SQL> select sid,failover_type,failover_method,failed_over,
2 instance_name from v$session,v$instance where sid=(select distinct sid from v$mystat);

SID FAILOVER_TYPE FAILOVER_M FAI INSTANCE_NAME
---------- ------------- ---------- --- ----------------
131 SELECT BASIC NO asmdb1

SQL>
--=======================
[oracle@node1 admin]$ srvctl stop instance -d asmdb -i asmdb1 -o abort
[oracle@node1 admin]$ srvctl status database -d asmdb
Instance asmdb1 is not running on node node1
Instance asmdb2 is running on node node2
[oracle@node1 admin]$
--================================
SQL> select sid,failover_type,failover_method,failed_over,
2 instance_name from v$session,v$instance where sid=(select distinct sid from v$mystat);

SID FAILOVER_TYPE FAILOVER_M FAI INSTANCE_NAME
---------- ------------- ---------- --- ----------------
131 SELECT BASIC YES asmdb2

SQL>
很顯然shutdown例項1之後,自動連線到了例項2上,同時在tns中的
asmdb_taf的配置資訊如FAILOVER_TYPE,failover_method等特性都體現了出來,
另外failed_over也變成了YES,這就是透過client端tns中配置實現的TAF。
3.SERVER TAF(Transparent Application Failover)
client端TAF需要在client端的TAF中配置,如果client端很多那麼配置起來
比較麻煩,另外如果一旦server端有啥變化需要修改,那麼修改很多client
端的tns也比較麻煩,從10g開始支援在server端設定services的屬性來
實現TAF。測試過程如下:
--==================================
先啟動剛才停掉的例項1
[oracle@node1 admin]$ srvctl start instance -d asmdb -i asmdb1
[oracle@node1 admin]$
修改services的taf屬性:
SQL> begin
2 dbms_service.modify_service(
3 service_name=>'asmdb',
4 failover_method=>dbms_service.failover_method_basic,
5 failover_type=>dbms_service.failover_type_session,
6 failover_retries=>10,
7 failover_delay=>5
8 );
9 end;
10 /

PL/SQL procedure successfully completed.

SQL>
--===========================
tns中asmdb的配置資訊如下,沒有設定任何taf屬性:
ASMDB =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
(LOAD_BALANCE = yes)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = asmdb)
)
)
--==================================
SQL> select sid,failover_type,failover_method,failed_over,
2 instance_name from v$session,v$instance where sid=(select distinct sid from v$mystat);

SID FAILOVER_TYPE FAILOVER_M FAI INSTANCE_NAME
---------- ------------- ---------- --- ----------------
142 SESSION BASIC NO asmdb1

SQL>
--停止例項1
[oracle@node1 admin]$ srvctl stop instance -d asmdb -i asmdb1 -o abort
[oracle@node1 admin]$
SQL> select sid,failover_type,failover_method,failed_over,
2 instance_name from v$session,v$instance where sid=(select distinct sid from v$mystat);
select sid,failover_type,failover_method,failed_over,
*
ERROR at line 1:
ORA-25408: can not safely replay call


SQL> select sid,failover_type,failover_method,failed_over,
2 instance_name from v$session,v$instance where sid=(select distinct sid from v$mystat);

SID FAILOVER_TYPE FAILOVER_M FAI INSTANCE_NAME
---------- ------------- ---------- --- ----------------
131 SESSION BASIC YES asmdb2

SQL>
很顯然也切換到例項2,不過中間出現了一個錯誤提示ORA-25408
--==================================
LOAD_BALANCE特性介紹
1.CLIENT LOAD_BALANCE
CLIENT LOAD_BALANCE是在client端的tns中透過增加
(LOAD_BALANCE = yes)來實現的,設定CLIENT LOAD_BALANCE
之後listener在路由client端連線請求時會隨機選擇幾個例項。
下面是測試過程,先遮蔽之前測試啟用的load balance advisor功能。
--=============================
client端tns中asm的配置資訊如下:
ASMDB =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
(LOAD_BALANCE = yes)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = asmdb)
)
)
--=============================
SQL> select name,goal,clb_goal from dba_services where name='asmdb';

NAME GOAL CLB_G
-------------------- ------------ -----
asmdb SERVICE_TIME SHORT

SQL> exec dbms_service.modify_service(service_name=>'asmdb',goal=>dbms_service.goal_none,clb_goal=>dbms_service.clb_goal_long);

PL/SQL procedure successfully completed.

SQL> select name,goal,clb_goal from dba_services where name='asmdb';

NAME GOAL CLB_G
-------------------- ------------ -----
asmdb NONE LONG

SQL>
--=============================
為了不影響測試效果同時遮蔽server端load balance:
SQL> show parameter remote_listener

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
remote_listener string LISTENERS_ASMDB

SQL> alter system set remote_listener='' sid='*';

System altered.

SQL>
--======================
為了保證測試效果,重啟db和listener,正常重啟service和listener就可以了
,但是由於我使用的service是db default的service不能透過srvctl重啟,所以
需要重啟db。
--======================
重啟使用的3個指令碼如下:
[oracle@node1 sess_lb_script]$ more test.sql
select instance_name from v$instance;

exit

[oracle@node1 sess_lb_script]$ more test.sh
#!/bin/sh
#Usage:test rawdb 4000
count=0
while [ $count -lt $2 ]
do
count=`expr $count + 1`
sqlplus -s @/home/oracle/sess_lb_script/test.sql
#sleep 1
done
[oracle@node1 sess_lb_script]$ more sess_count.sh
grep asmdb1 test.log|wc -l

grep asmdb2 test.log|wc -l
[oracle@node1 sess_lb_script]$
--==============================
測試過程如下:
首先清空日誌檔案test.log
[oracle@node1 sess_lb_script]$ cat /dev/null > test.log
[oracle@node1 sess_lb_script]$ ls -l test.log
-rwxrwxrwx 1 oracle oinstall 0 Mar 26 09:51 test.log
[oracle@node1 sess_lb_script]$
同時做400個連線:
[oracle@node1 sess_lb_script]$ ./test.sh asmdb 400 > test.log
[oracle@node1 sess_lb_script]$ ./sess_count.sh
189
211
[oracle@node1 sess_lb_script]$
這裡我們看到連線到asmdb1和asmdb2上的session分部是189和211,
這就是一個隨機的選擇,沒有考慮任何其他負載等因素。
2.SERVER LOAD_BALANCE
SERVER端LOAD_BALANCE如果不啟用10gR2提供的load balance advisor
特性,其實就是一個session balance。SERVER LOAD_BALANCE是透過
設定引數remote_listener來實現的,在測試client load balance時
把預設啟用的SERVER LOAD_BALANCE透過修改引數remote_listener為null值
遮蔽了,下面啟用SERVER LOAD_BALANCE:
SQL> alter system set remote_listener='LISTENERS_ASMDB' sid='*';

System altered.

SQL>
為了保證效果可以考慮重啟listener,透過lsnrctl services
觀查發現不重啟也可以,SERVER LOAD_BALANCE的原理是當listener
理由client連線請求時會判斷到底哪個例項上的session數量多,listener
會把client的連線請求路由到session數少的一個例項上從而儘可能的
平衡session數量,其實是一種假象load balance,因為session數畢竟
不能真實的反應負載,這種測試需要藉助listener log,因為client端
連線db時在本地例項的listener.log裡會記錄下面資訊:
--========================
26-MAR-2012 01:22:43 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=asmdb)(CID=())) * (ADDRESS=(PROTOCOL=tcp)(HOST=143.168.0.14)(PORT=35805)) * establish * asmdb * 0
--========================
注意這裡的establish關鍵字。
如果是其他例項的listener路由過來的連線,那麼會在本地例項的listener.log中
存在如下資訊:
--=================================
26-MAR-2012 01:22:46 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=asmdb)(CID=()) * (ADDRESS=(PROTOCOL=tcp)(HOST=143.168.0.14)(PORT=35837)) * establish * asmdb * 0
--=================================
區別在於多了(INSTANCE_NAME=asmdb1),但是在listener.log中並沒有體現出來
是由哪個例項的listener路由過來的,所以這種辦法僅適用於2個節點的rac測試
SERVER LOAD_BALANCE
--==========================
下面是測試需要的指令碼:
[oracle@node1 sess_lb_script]$ more test.sql
select instance_name from v$instance;

exit

[oracle@node1 sess_lb_script]$ more test.sh
#!/bin/sh
#Usage:test rawdb 4000
count=0
while [ $count -lt $2 ]
do
count=`expr $count + 1`
sqlplus -s @/home/oracle/sess_lb_script/test.sql
#sleep 1
done
[oracle@node1 sess_lb_script]$ more sess_count.sh
grep asmdb1 test.log|wc -l

grep asmdb2 test.log|wc -l
[oracle@node1 sess_lb_script]$ more log_count.sh
grep establish /app/oracle/product/10.2.0/db_1/network/log/listener_node1.log|wc -l

grep INSTANCE_NAME=asmdb1 /app/oracle/product/10.2.0/db_1/network/log/listener_node1.log|wc -l

grep INSTANCE_NAME=asmdb2 /app/oracle/product/10.2.0/db_1/network/log/listener_node1.log|wc -l
[oracle@node1 sess_lb_script]$
--=============================
下面是測試過程:
首先清空2個節點上的listener.log
[root@node1 log]# cat /dev/null > listener_node1.log
[root@node1 log]# ls -l
total 12
-rw-r--r-- 1 oracle oinstall 0 Mar 26 10:19 listener_node1.log
[root@node2 log]# cat /dev/null > listener_node2.log
[root@node2 log]# ls -l
total 0
-rw-r--r-- 1 oracle oinstall 0 Mar 26 10:20 listener_node2.log
[root@node2 log]#
[oracle@node1 sess_lb_script]$ cat /dev/null > test.log
[oracle@node1 sess_lb_script]$ ./test.sh asmdb 400 > test.log
[oracle@node1 sess_lb_script]$ ./sess_count.sh
70
330
--======================
看好了,這是node1上的listener.log
[oracle@node1 sess_lb_script]$ ./log_count.sh
264
64
0
--===================================
這是node2上的listener.log
[oracle@node2 sess_lb_script]$ ./log_count.sh
394
194
0
[oracle@node2 sess_lb_script]$
[oracle@node1 sess_lb_script]$
這裡效果不加,400個請求,asmdb1上連線了70個,asmdb2上連線了330個;
很顯然asmdb1一共接收了264個連線請求,但是asmdb2的listener路由過來64個
連線請求。asmdb2一共連線了394個連線請求,asmdb1的listener路由過來
194個連線請求。這樣:
264 - 194 =70
394 - 64 = 330
沒有得到預想的session balance。
--==========================
重啟listener同時遮蔽client端load balance的影響,再次測試看看效果:
註釋tns中asmdb中的(LOAD_BALANCE = yes)
[oracle@node2 sess_lb_script]$ srvctl stop listener -n node1
[oracle@node2 sess_lb_script]$ srvctl stop listener -n node2
[oracle@node2 sess_lb_script]$ srvctl start listener -n node1
[oracle@node2 sess_lb_script]$ srvctl start listener -n node2
[oracle@node2 sess_lb_script]$
--==========================
清掉listener.ora以及test.log
[oracle@node1 sess_lb_script]$ ./test.sh asmdb 400 > test.log
[oracle@node1 sess_lb_script]$ ./sess_count.sh
62
338
[oracle@node1 sess_lb_script]$ ./log_count.sh
400
0
0
[oracle@node1 sess_lb_script]$
[oracle@node2 sess_lb_script]$ ./log_count.sh
338
338
0
[oracle@node2 sess_lb_script]$
這個更有意思了,node1接收到了400個連線請求路由給node2 338個
自己保留了62個。
3.啟用load balance advisor來觀查一下測試效果
待續

來自 “ ITPUB部落格 ” ,連結:http://blog.itpub.net/19602/viewspace-1057716/,如需轉載,請註明出處,否則將追究法律責任。

相關文章