HDS multipath software (HDLM) failover and failback test

Posted by yangzhangyue on 2013-08-07

Terminal 1

[16:26:24 root@localhost modprobe.d]# dd if=/dev/zero of=/dev/sddlmaa1

233408833+0 records in

233408833+0 records out

119505322496 bytes (120 GB) copied, 701.094 s, 170 MB/s
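As a quick sanity check on the dd summary above, the figures can be recomputed by hand. The record count times dd's default 512-byte block size gives the byte total, and bytes divided by elapsed seconds gives the rate in decimal megabytes. The helper name `dd_rate_mb` is only illustrative.

```shell
# Recompute dd's throughput from its summary line.
# dd_rate_mb is an illustrative helper name, not part of dd.
# Usage: dd_rate_mb BYTES SECONDS  -> prints MB/s (1 MB = 10^6 bytes)
dd_rate_mb() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

# 233408833 records of 512 bytes each (dd's default block size)
echo "$((233408833 * 512)) bytes"

# Byte count and elapsed time taken from the dd output above
echo "$(dd_rate_mb 119505322496 701.094) MB/s"
```

The result agrees with the 170 MB/s that dd printed, so the later iostat figures can be compared against that baseline.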

 

 

Terminal 2

[16:31:49 root@localhost bin]# ./dlnkmgr view -path

Paths:000002 OnlinePaths:000002

PathStatus   IO-Count    IO-Errors

Online       235218946   0        

 

PathID PathName                        DskName                                    iLU              ChaPort Status     Type IO-Count   IO-Errors  DNum HDevName

000000 0007.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own    75625355          0    0 sddlmaa

000001 0008.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   159593591          0    0 sddlmaa

KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2013/08/05 16:32:32
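The path table is wide and hard to scan by eye when it is captured repeatedly. A minimal sketch for boiling each row down to path ID, device, status and counters is shown below. It assumes the column layout seen in the output above, and `summarize_paths` is an illustrative name, not an HDLM command.

```shell
# Summarize per-path rows from `dlnkmgr view -path` output read on stdin.
# Fields are counted from the right because DskName contains spaces.
# summarize_paths is an illustrative name, not an HDLM command.
summarize_paths() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 8 {
        # From the right: HDevName, DNum, IO-Errors, IO-Count, Type, Status
        printf "%s %s %s io=%d err=%d\n", $1, $NF, $(NF-5), $(NF-3), $(NF-2)
    }'
}

# Example using the two path rows captured above
summarize_paths <<'EOF'
000000 0007.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own    75625355          0    0 sddlmaa
000001 0008.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   159593591          0    0 sddlmaa
EOF
```

On a live system the same function could be fed with `./dlnkmgr view -path | summarize_paths`.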

 

Terminal 3

Linux 2.6.32-220.el6.x86_64 (localhost.localdomain)     08/05/2013      _x86_64_        (8 CPU)

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.09    0.01    2.26    0.42    0.00   97.22

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               1.55        67.96         8.32     476638      58346

sdb             194.79         1.24     10897.81       8728   76431871

sdc             233.01         1.06     22918.89       7416  160741878

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.87    0.00   22.01    4.24    0.00   72.88

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.20         0.00         2.40          0         24

sdb            1020.40         0.00    206470.20          0    2064702

sdc             966.10         0.00    138000.00          0    1380000

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.90    0.00   21.15    4.78    0.00   73.17

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb             978.30         0.00    149055.20          0    1490552

sdc            1011.50         0.00    189052.40          0    1890524

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.90    0.00   21.14    4.08    0.00   73.88

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               2.20       156.00         1.60       1560         16

sdb            1033.80         0.00    152442.40          0    1524424

sdc            1070.40         0.00    190097.60          0    1900976

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.84    0.00   22.66    3.86    0.00   72.65

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               1.30        17.58         4.00        176         40

sdb             233.97         0.00     46331.27          0     463776

sdc             680.22         0.00    296334.77          0    2966311

 

Terminal 2

Take one Fibre Channel HBA offline

[16:32:32 root@localhost bin]# ./dlnkmgr offline -hba 0007.0000

KAPL01055-I All the paths which pass the specified HBA will be changed to the Offline(C) status. Is this OK? [y/n]:y

KAPL01056-I If you are sure that there would be no problem when all the paths which pass the specified HBA are placed in the Offline(C) status, enter y. Otherwise, enter n. [y/n]:y

KAPL01061-I 1 path(s) were successfully placed Offline(C); 0 path(s) were not. Operation name = offline

[16:33:10 root@localhost bin]# ./dlnkmgr view -path

Paths:000002 OnlinePaths:000001

PathStatus   IO-Count    IO-Errors

Reduced      252808586   0        

 

PathID PathName                        DskName                                    iLU              ChaPort Status     Type IO-Count   IO-Errors  DNum HDevName

000000 0007.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Offline(C) Own    81976455          0    0 sddlmaa

000001 0008.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   170832131          0    0 sddlmaa

KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2013/08/05 16:33:24

 

Terminal 3:

Checking iostat shows that traffic on sdb has dropped to 0, while Blk_wrtn on sdc has risen to 3421184, nearly double its previous value

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.81    0.00   23.19    2.30    0.00   73.70

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.20         0.00         2.40          0         24

sdb               0.00         0.00         0.00          0          0

sdc             334.00         0.00    342118.40          0    3421184

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.84    0.00   23.48    2.26    0.00   73.42

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.20         0.00         2.40          0         24

sdb               0.00         0.00         0.00          0          0

sdc             335.60         0.00    343552.00          0    3435520

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.83    0.00   23.18    2.39    0.00   73.60

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.60         0.00         4.80          0         48

sdb               0.00         0.00         0.00          0          0

sdc             335.70         0.00    343859.20          0    3438592

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.84    0.00   23.33    2.47    0.00   73.36

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb               0.00         0.00         0.00          0          0

sdc             334.60         0.00    342630.40          0    3426304

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.86    0.00   23.03    2.32    0.00   73.80

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb               0.00         0.00         0.00          0          0

sdc             334.80         0.00    342835.20          0    3428352

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.80    0.00   22.68    3.51    0.00   73.01

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb               0.00         0.00         0.00          0          0

sdc             335.10         0.00    343040.00          0    3430400

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.87    0.00   22.52    3.20    0.00   73.41
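To check that the surviving path really absorbed the whole load, the per-device write rates before and after the manual offline can be compared. The sketch below assumes iostat's block unit is 512 bytes, as sysstat documents for 2.4 and later kernels. The function names are illustrative, and the "before" figures are taken from one of the earlier samples with both paths online.

```shell
# Convert an iostat Blk_wrtn/s value to MB/s, assuming 512-byte blocks.
# blk_to_mb and sum_rates are illustrative helper names.
blk_to_mb() {
    awk -v blk="$1" 'BEGIN { printf "%.1f\n", blk * 512 / 1000000 }'
}

# Add up any number of per-device rates.
sum_rates() {
    awk 'BEGIN { t = 0; for (i = 1; i < ARGC; i++) t += ARGV[i]; printf "%.2f\n", t }' "$@"
}

# Both paths online: sdb 152442.40 + sdc 190097.60 blocks/s
before=$(sum_rates 152442.40 190097.60)
# One path offline: sdc alone at 342118.40 blocks/s
after=342118.40

echo "before: $before blk/s = $(blk_to_mb "$before") MB/s"
echo "after:  $after blk/s = $(blk_to_mb "$after") MB/s"
```

Both totals come out at roughly 175 MB/s, which is consistent with the rate dd reported and suggests the manual offline cost no aggregate throughput in this test.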

          

Terminal 2

Bring the HBA that was taken offline back online

[16:33:24 root@localhost bin]# ./dlnkmgr online -hba 0007.0000

KAPL01057-I All the paths which pass the specified HBA will be changed to the Online status. Is this OK? [y/n]:y

KAPL01061-I 1 path(s) were successfully placed Online; 0 path(s) were not. Operation name = online

[16:34:20 root@localhost bin]# ./dlnkmgr view -path

Paths:000002 OnlinePaths:000002

PathStatus   IO-Count    IO-Errors

Online       272845735   0        

 

PathID PathName                        DskName                                    iLU              ChaPort Status     Type IO-Count   IO-Errors  DNum HDevName

000000 0007.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own    82274955          0    0 sddlmaa

000001 0008.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   190570780          0    0 sddlmaa

KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2013/08/05 16:34:22

 

Terminal 3

Checking the I/O again shows the load spread across sdb and sdc once more

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               3.40       239.20         7.20       2392         72

sdb             801.90         0.00    118380.00          0    1183800

sdc             922.40         0.00    224718.50          0    2247185

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.81    0.00   21.67    4.00    0.00   73.52

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb            1127.90         0.00    170607.40          0    1706074

sdc            1105.10         0.00    147145.60          0    1471456

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.86    0.00   22.77    2.70    0.00   73.68

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb            1952.10         0.00    176125.40          0    1761254

sdc            1992.30         0.00    184086.40          0    1840864

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.73    0.00   23.05    3.05    0.00   73.17

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.60         0.00         4.80          0         48

sdb            2100.40         0.00    174668.40          0    1746684

sdc            2152.80         0.00    176666.80          0    1766668

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.88    0.00   22.60    3.07    0.00   73.45

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               3.40        48.00        11.20        480        112

sdb            1108.10         0.00    155167.60          0    1551676

sdc            1196.50         0.00    188496.00          0    1884960

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.84    0.00   23.66    2.62    0.00   72.88

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.00         0.00         0.00          0          0

sdb            1174.20         0.00    185929.40          0    1859294

sdc            1074.30         0.00    155898.00          0    1558980

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.88    0.00   23.17    2.48    0.00   73.47

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.20         0.00         2.40          0         24

sdb            1189.70         0.00    185251.80          0    1852518

sdc            1100.80         0.00    157490.00          0    1574900

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.88    0.00   23.83    2.45    0.00   72.84

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               1.00         0.00       205.60          0       2056

sdb            1249.40         0.00    187183.10          0    1871831

sdc            1113.00         0.00    155370.00          0    1553700

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.78    0.00   22.82    3.03    0.00   73.38

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               1.10         0.00        10.40          0        104

sdb            1541.30         0.00    176036.70          0    1760367

sdc            1441.80         0.00    151576.90          0    1515769

 

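A manual switchover does not affect I/O.

However, if the fibre is physically unplugged from an HBA, reads and writes are still affected until the path check completes.

Terminal 1

[16:39:45 root@localhost modprobe.d]# dd if=/dev/zero of=/dev/sddlmaa1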

 

Terminal 2

[16:48:05 root@localhost ~]# iostat 10 50 >iostat.log

 

Unplug one fibre cable

Terminal 3

[16:49:38 root@localhost bin]# ./dlnkmgr view -path

Paths:000002 OnlinePaths:000002

PathStatus   IO-Count    IO-Errors

Online       387300351   0        

 

PathID PathName                        DskName                                    iLU              ChaPort Status     Type IO-Count   IO-Errors  DNum HDevName

000000 0007.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   137667240          0    0 sddlmaa

000001 0008.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   249633111          0    0 sddlmaa

KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2013/08/05 16:49:38

[16:49:38 root@localhost bin]# ./dlnkmgr view -path

Paths:000002 OnlinePaths:000001

PathStatus   IO-Count    IO-Errors

Reduced      387337873   22185    

 

PathID PathName                        DskName                                    iLU              ChaPort Status     Type IO-Count   IO-Errors  DNum HDevName

000000 0007.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   137704762          0    0 sddlmaa

000001 0008.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Offline(E) Own   249633111      22185    0 sddlmaa

KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2013/08/05 16:49:39

[16:49:39 root@localhost bin]# ./dlnkmgr view -path

Paths:000002 OnlinePaths:000001

PathStatus   IO-Count    IO-Errors

Reduced      387450196   24029    

 

PathID PathName                        DskName                                    iLU              ChaPort Status     Type IO-Count   IO-Errors  DNum HDevName

000000 0007.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   137817085          0    0 sddlmaa

000001 0008.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Offline(E) Own   249633111      24029    0 sddlmaa

KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2013/08/05 16:49:40
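Re-running the view command by hand makes it easy to miss the moment a path changes state. A rough polling loop that timestamps each change in the OnlinePaths count could look like the sketch below. `VIEW_CMD` is a hypothetical override so the loop can be tried with a stub on a machine without HDLM, and the function names are illustrative.

```shell
# Poll a path-listing command and print a timestamp whenever the
# OnlinePaths count changes.
# VIEW_CMD is a hypothetical override; by default it runs the same
# command used in the transcript above.
VIEW_CMD=${VIEW_CMD:-"./dlnkmgr view -path"}

# Extract N from the summary line "Paths:000002 OnlinePaths:000001".
online_paths() {
    $VIEW_CMD | awk -F'OnlinePaths:' '/OnlinePaths:/ { print $2 + 0; exit }'
}

# Usage: watch_paths MAX_POLLS INTERVAL_SECONDS
watch_paths() {
    max_polls=$1
    interval=$2
    prev=$(online_paths)
    i=0
    while [ "$i" -lt "$max_polls" ]; do
        cur=$(online_paths)
        if [ "$cur" != "$prev" ]; then
            echo "$(date '+%H:%M:%S') OnlinePaths: $prev -> $cur"
            prev=$cur
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example (not run here): poll once a second for two minutes
# watch_paths 120 1
```

Started just before the cable is pulled, this would give a timestamp for the Offline(E) transition that can be lined up against the iostat log.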

 

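After a while, the multipath software detects that one path has changed to Offline(E).

The iostat output shows I/O throughput dropping to 0 for roughly 40-50 s; I/O resumes only after the software has confirmed that the other path is healthy.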

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               1.00         0.00         8.80          0         88

sdb             836.10         0.00    114120.00          0    1141200

sdc             850.80         0.00    160197.20          0    1601972

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.00    0.00    0.07   25.38    0.00   74.55

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.50         0.80         6.40          8         64

sdb               0.00         0.00         0.00          0          0

sdc               0.00         0.00         0.00          0          0

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.00    0.00    0.09   24.59    0.00   75.32

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.30         0.00         3.20          0         32

sdb               0.00         0.00         0.00          0          0

sdc               0.00         0.00         0.00          0          0

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.01    0.00    0.11   25.46    0.00   74.41

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda              10.90       525.60         4.00       5256         40

sdb               0.00         0.00         0.00          0          0

sdc               0.00         0.00         0.00          0          0

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.16    0.00    0.25   24.48    0.00   75.11

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               1.30        51.20        11.20        512        112

sdb               0.00         0.00         0.00          0          0

sdc               0.00         0.00         0.00          0          0

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.93    0.00   16.15   11.85    0.00   71.07

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               5.90       163.20       210.40       1632       2104

sdb           19102.90         0.00    125551.50          0    1255515
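The length of the stall can be estimated from the log rather than by eye. The sketch below counts the longest run of consecutive samples in which a device wrote nothing and multiplies it by the sampling interval (10 s for the `iostat 10 50` run above). `stall_seconds` is an illustrative name, and the result is only as fine-grained as the interval.

```shell
# Estimate the longest write stall for one device from iostat text on stdin.
# stall_seconds is an illustrative helper name.
# Usage: stall_seconds DEVICE INTERVAL_SECONDS
stall_seconds() {
    awk -v dev="$1" -v step="$2" '
        $1 == dev {
            # Column 4 is Blk_wrtn/s in the iostat layout shown above
            if ($4 + 0 == 0) { run++; if (run > longest) longest = run }
            else run = 0
        }
        END { print longest * step }
    '
}

# The sdb rows from the samples above, in order
stall_seconds sdb 10 <<'EOF'
sdb             836.10         0.00    114120.00          0    1141200
sdb               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0
sdb               0.00         0.00         0.00          0          0
sdb           19102.90         0.00    125551.50          0    1255515
EOF
```

Four empty 10-second samples give 40 s, the lower end of the 40-50 s observed. Against the full log the call would be `stall_seconds sdb 10 < iostat.log`.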

 

Plug the fibre cable back in

[16:50:51 root@localhost bin]# ./dlnkmgr view -path

Paths:000002 OnlinePaths:000001

PathStatus   IO-Count    IO-Errors

Reduced      410617324   24029    

 

PathID PathName                        DskName                                    iLU              ChaPort Status     Type IO-Count   IO-Errors  DNum HDevName

000000 0007.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   160984213          0    0 sddlmaa

000001 0008.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Offline(E) Own   249633111      24029    0 sddlmaa

KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2013/08/05 16:50:52

[16:50:52 root@localhost bin]# ./dlnkmgr view -path

Paths:000002 OnlinePaths:000002

PathStatus   IO-Count    IO-Errors

Online       415619590   24029    

 

PathID PathName                        DskName                                    iLU              ChaPort Status     Type IO-Count   IO-Errors  DNum HDevName

000000 0007.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   164203113          0    0 sddlmaa

000001 0008.0000.0000000000000000.0000 HITACHI .DF600F          .85017915         0217             0A      Online     Own   251416477      24029    0 sddlmaa

KAPL01001-I The HDLM command completed normally. Operation name = view, completion time = 2013/08/05 16:51:07

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               1.10         0.00        56.80          0        568

sdb            3234.50         0.00    381491.50          0    3814915

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.87    0.00   21.18   10.45    0.00   67.50

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               7.60       426.40        16.00       4264        160

sdb             335.70         0.00    343756.80          0    3437568

sdd               1.10         8.80         0.00         88          0

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.83    0.00   22.01   14.26    0.00   62.90

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               6.30       249.60       208.80       2496       2088

sdb             335.10         0.00    343142.40          0    3431424

sdd              11.70        95.70         0.00        957          0

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.80    0.00   22.89    7.45    0.00   68.87

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.60        10.40         5.60        104         56

sdb             336.00         0.00    344064.00          0    3440640

sdd              12.20        99.70         0.00        997          0

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.87    0.00   23.14    2.71    0.00   73.29

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               2.90       219.20         8.80       2192         88

sdb             335.20         0.00    343347.20          0    3433472

sdd               0.00         0.00         0.00          0          0

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.84    0.00   21.66    4.12    0.00   73.38

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.70         0.00         9.60          0         96

sdb             976.40         0.00    188648.40          0    1886484

sdd             993.40         0.00    153716.60          0    1537166

 

avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.78    0.00   22.12    3.73    0.00   73.37

 

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn

sda               0.30         0.00         4.00          0         40

sdb            1269.00         0.00    163058.60          0    1630586

sdd            1428.80         0.00    180634.50          0    1806345

 

From the output above, I/O was not affected by the failback, and it returned to a load-balanced state.

 

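As the test above shows, failover takes time. For demanding applications, such as a heavily loaded database, this is fairly risky. It also contradicts the intuitive assumption that with two redundant HBAs, if one path fails, the healthy path simply keeps working without interruption.

Explanation from HDS: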


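HDLM's default load-balancing mode is round-robin (RR). For example, suppose the host issues I/Os A, B, C, D, E, F, G, H, ... in order. Spread across two paths, path 1 carries A, C, E, G, ... and path 2 carries B, D, F, H, .... After receiving the data from both paths, the storage controller reassembles it into ABCDEFGH and writes it to disk in order. Each HBA port has its own I/O queue with a tunable queue depth, so the host's I/Os are assigned in advance to the queues on the two HBA ports. If path 1 suddenly fails, the host holds all I/O, merges the A, C, E, G waiting on path 1 with the B, D, F, H on path 2 back into the sequence ABCDEFGH, re-queues them on path 2, and then sends them to the storage over path 2.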

 

So the period with no I/O is the time the host spends re-sequencing the pending I/Os that were queued on the HBAs.
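The explanation above can be pictured with a toy model. This is only an illustration of the described behaviour, not HDLM's actual implementation: letters stand in for I/O sequence numbers, and `rr_split` and `requeue` are made-up names.

```shell
# Toy model of round-robin dispatch and re-queueing after a path failure.
# Letters stand in for I/O sequence numbers; function names are illustrative.

# rr_split N IO... : print the I/Os round-robin would queue on path N (1 or 2)
rr_split() {
    want=$1
    shift
    n=0
    out=""
    for io in "$@"; do
        if [ $((n % 2 + 1)) -eq "$want" ]; then out="$out$io"; fi
        n=$((n + 1))
    done
    echo "$out"
}

# requeue Q1 Q2 : merge two queues back into the original issue order
requeue() {
    printf '%s%s\n' "$1" "$2" | fold -w1 | sort | tr -d '\n'
    echo
}

p1=$(rr_split 1 A B C D E F G H)
p2=$(rr_split 2 A B C D E F G H)
echo "path1 queue: $p1"
echo "path2 queue: $p2"

# Path 1 fails: everything is held, re-sequenced and sent down path 2
echo "path2 after failover: $(requeue "$p1" "$p2")"
```

In the model the merge is instantaneous. On the real host, by the explanation above, this hold-and-resequence step is where the observed gap in I/O comes from.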


Source: ITPUB blog, link: http://blog.itpub.net/29033984/viewspace-767948/. If you repost this article, please credit the source; otherwise legal action may be taken.
