TCP segmentation and checksum offload

Posted by myownstars on 2015-01-27

Why offload?

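TCP segmentation and checksum calculation were originally done by the CPU. Newer network adapters implement both in hardware, which relieves the host CPU and so improves the host's network packet throughput.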

 

How it works

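When the kernel TCP/IP stack hands a packet to the driver, the stack has not spent CPU time computing the TCP checksum, so at that moment the checksum field still holds dirty data rather than a valid checksum. Because the NIC driver is running in TCP checksum offload mode, once the packet reaches the driver the chip's checksum engine computes the TCP checksum at high speed, and only then is the packet put on the wire.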

TCP segmentation offload works on the same principle.
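To make the checksum step concrete, here is a minimal Python sketch of the standard Internet checksum (RFC 1071) applied to a TCP segment with its IPv4 pseudo-header. It only illustrates the arithmetic that the hardware takes over; it is not driver or NIC code. The helper names (`ones_complement_sum`, `fill_checksum`, `is_valid`) and the sample header values are assumptions made for this example.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum of data (RFC 1071), carries folded back in."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header that is covered by the TCP checksum."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, socket.IPPROTO_TCP, tcp_len))


def fill_checksum(src_ip: str, dst_ip: str, segment: bytes) -> bytes:
    """Return the segment with a correct checksum in bytes 16..17."""
    zeroed = segment[:16] + b"\x00\x00" + segment[18:]
    total = ones_complement_sum(pseudo_header(src_ip, dst_ip, len(zeroed)) + zeroed)
    return zeroed[:16] + struct.pack("!H", ~total & 0xFFFF) + zeroed[18:]


def is_valid(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """A segment with a correct checksum sums to 0xFFFF."""
    data = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ones_complement_sum(data) == 0xFFFF


# Hypothetical 20-byte TCP header (ports 5002 -> 5001, PSH|ACK) plus a payload.
# The checksum field is left at 0, standing in for the not-yet-valid value
# that a local capture sees when Tx checksum offload is on.
header = struct.pack("!HHIIBBHHH", 5002, 5001, 1, 1, 5 << 4, 0x18, 92, 0, 0)
unfinished = header + b"hello"

print(is_valid("192.168.1.2", "10.10.1.22", unfinished))
on_the_wire = fill_checksum("192.168.1.2", "10.10.1.22", unfinished)
print(is_valid("192.168.1.2", "10.10.1.22", on_the_wire))
```

The first check corresponds to what a capture tool on the sending host sees, the second to what the peer receives after the NIC has filled in the field.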

 

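snoop, the native packet capture tool that ships with Solaris, sits between the kernel and the network driver. It captures packets before the NIC chip has recomputed the TCP checksum, so this dirty data makes Wireshark report a "TCP checksum incorrect" warning.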

http://wdqfirst.blog.163.com/blog/static/1133474112011510101844959/

Checksum Offload has four settings: disabled, enabled for Rx only, enabled for Tx only, or enabled for both.

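For Tx, once Checksum Offload is enabled, the transport layer leaves an arbitrary value in the TCP checksum field, so packets captured on the local machine show a bad checksum. The NIC then computes the correct checksum automatically before sending, so the peer still receives a valid TCP packet.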

For Rx, once Checksum Offload is enabled, the NIC fills in an NDIS_TCP_IP_CHECKSUM_PACKET_INFO structure and sets its flag bits when it receives data.

http://blog.csdn.net/wangqi0079/article/details/9064557 

 

 

 

TCP segmentation offload

10.10.1.22 receives the data with netcat:

sgordon@basil$ nc -l 5001

192.168.1.2 sends 10000 bytes of data:

sgordon@ginger$ nc -p 5002 10.10.1.22 5001 < 10000bytes.txt

 

With GSO enabled

sgordon@ginger$ sudo ethtool -k eth0

Offload parameters for eth0:

Cannot get device flags: Operation not supported

rx-checksumming: on

tx-checksumming: on

scatter-gather: on

tcp segmentation offload: off

udp fragmentation offload: off

generic segmentation offload: on

large receive offload: off

 

sgordon@ginger$ sudo tcpdump -i eth0 -n 'not port 22'

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes

18:30:24.899687 IP 192.168.1.2.5002 > 10.10.1.22.5001: S 679249855:679249855(0) win 5840

18:30:24.900583 IP 10.10.1.22.5001 > 192.168.1.2.5002: S 1420594303:1420594303(0) ack 679249856 win 5792

18:30:24.900612 IP 192.168.1.2.5002 > 10.10.1.22.5001: . ack 1 win 92

18:30:24.900713 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 1:2897(2896) ack 1 win 92

18:30:24.900735 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 2897:4345(1448) ack 1 win 92

18:30:24.902575 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 1449 win 68

18:30:24.902591 IP 192.168.1.2.5002 > 10.10.1.22.5001: P 4345:7241(2896) ack 1 win 92

18:30:24.903597 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 2897 win 91

18:30:24.903607 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 7241:8689(1448) ack 1 win 92

18:30:24.903613 IP 192.168.1.2.5002 > 10.10.1.22.5001: P 8689:10001(1312) ack 1 win 92

18:30:24.903617 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 4345 win 114

18:30:24.905573 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 5793 win 136

18:30:24.905587 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 7241 win 159

18:30:24.906628 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 8689 win 181

18:30:24.906637 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 10001 win 204

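The 10000 bytes were split into 5 segments, the largest carrying 2896 bytes, even though the Ethernet MSS is 1460 (1448 bytes of payload per packet once TCP options such as timestamps are enabled). This is Generic Segmentation Offload at work: tcpdump sees the oversized segments before they are cut down to fit the MTU.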
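The numbers in this capture can be checked with a little arithmetic. The sketch below assumes a 1500-byte Ethernet MTU, 20-byte IPv4 and TCP headers without other options, and a 12-byte timestamp option (10 bytes padded to 12). The function names are made up for this example.

```python
import math

MTU = 1500             # assumed Ethernet MTU
IP_HEADER = 20         # IPv4 header without options
TCP_HEADER = 20        # TCP header without options
TIMESTAMP_OPTION = 12  # 10-byte timestamp option padded to 12 bytes


def payload_per_packet(mtu: int = MTU, tcp_options: int = 0) -> int:
    """Largest TCP payload that fits in one packet on the wire."""
    return mtu - IP_HEADER - TCP_HEADER - tcp_options


def wire_packets(captured_len: int, per_packet: int) -> int:
    """How many wire packets one captured (possibly oversized) segment becomes."""
    return math.ceil(captured_len / per_packet)


# Payload lengths of the data segments seen by tcpdump with GSO enabled.
captured = [2896, 1448, 2896, 1448, 1312]

per_packet = payload_per_packet(tcp_options=TIMESTAMP_OPTION)
counts = [wire_packets(n, per_packet) for n in captured]

print(payload_per_packet())  # 1460
print(per_packet)            # 1448
print(counts, sum(counts))   # [2, 1, 2, 1, 1] 7
```

Each 2896-byte segment is exactly two 1448-byte payloads, so the 5 captured segments correspond to 7 packets on the wire.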

 

 

With GSO disabled

sgordon@ginger$ sudo ethtool -K eth0 gso off

 

sgordon@ginger$ sudo tcpdump -i eth0 -n 'not port 22'

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes

18:33:02.644356 IP 192.168.1.2.5002 > 10.10.1.22.5001: S 3144010294:3144010294(0) win 5840

18:33:02.645427 IP 10.10.1.22.5001 > 192.168.1.2.5002: S 3901655238:3901655238(0) ack 3144010295 win 5792

18:33:02.645471 IP 192.168.1.2.5002 > 10.10.1.22.5001: . ack 1 win 92

18:33:02.645542 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 1:1449(1448) ack 1 win 92

18:33:02.645558 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 1449:2897(1448) ack 1 win 92

18:33:02.645567 IP 192.168.1.2.5002 > 10.10.1.22.5001: P 2897:4345(1448) ack 1 win 92

18:33:02.647415 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 1449 win 68

18:33:02.647433 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 4345:5793(1448) ack 1 win 92

18:33:02.647439 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 5793:7241(1448) ack 1 win 92

18:33:02.648437 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 2897 win 91

18:33:02.648446 IP 192.168.1.2.5002 > 10.10.1.22.5001: . 7241:8689(1448) ack 1 win 92

18:33:02.648451 IP 192.168.1.2.5002 > 10.10.1.22.5001: P 8689:10001(1312) ack 1 win 92

18:33:02.648460 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 4345 win 114

18:33:02.650414 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 5793 win 136

18:33:02.650428 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 7241 win 159

18:33:02.651469 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 8689 win 181

18:33:02.651476 IP 10.10.1.22.5001 > 192.168.1.2.5002: . ack 10001 win 204

This time the kernel stack itself splits the data into TCP segments according to the MSS.
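As a rough illustration of that split (not the kernel's actual code), the following Python sketch cuts a byte stream into segments of at most 1448 bytes and prints them in the same `start:end(length)` notation that tcpdump uses above. The function name `segment` and its parameters are assumptions made for this example.

```python
def segment(total_bytes: int, mss: int, first_seq: int = 1):
    """Split total_bytes into (start_seq, end_seq, length) tuples of at most mss bytes."""
    segments = []
    seq = first_seq
    end = first_seq + total_bytes
    while seq < end:
        length = min(mss, end - seq)  # the last segment may be shorter
        segments.append((seq, seq + length, length))
        seq += length
    return segments


for start, stop, length in segment(10000, 1448):
    print(f"{start}:{stop}({length})")
```

With 10000 bytes and 1448 bytes per segment this yields seven segments, from `1:1449(1448)` to `8689:10001(1312)`, matching the data segments in the capture taken with GSO disabled.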

 

 

 

 

Source: ITPUB blog, link: http://blog.itpub.net/15480802/viewspace-1416446/. If you repost this article, please credit the source; otherwise legal liability will be pursued.
