From:http://machael.blog.51cto.com/829462/211989

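On an idle TCP connection no data flows at all, which surprises many newcomers to TCP/IP. That is, if neither process at the two ends of a TCP connection is sending data to the other, nothing is exchanged between the two TCP modules. You may find polling in other network protocols, but TCP has none. The implication is that we can start a client program, establish a TCP connection to a server, and walk away for hours, days, weeks, or months, and the connection stays up. Intermediate routers can crash and reboot, and phone lines can go down and come back up, but as long as neither host at the ends of the connection reboots, the connection remains established.

This assumes that neither the client nor the server application has an application-level timer that detects inactivity on the connection and causes either application to terminate. Sometimes, however, a server needs to know whether the client's host has crashed and is down, or has crashed and rebooted. Many implementations provide the keepalive timer for this purpose.

The keepalive timer is a controversial feature. Many people feel that this polling of the other end has no place in TCP and should be done by the application, if it is needed at all. In addition, if some intermediate network between the two end systems is temporarily out of service, the keepalive option can terminate a perfectly good connection between the two processes. For example, if a keepalive probe is sent just while an intermediate router is crashing and rebooting, TCP will conclude that the client's host has crashed, which is not the case.

Keepalive is not part of the TCP specification. The Host Requirements RFC gives three reasons not to use it: (1) it can cause a good connection to be dropped during a transient failure, (2) it consumes unnecessary bandwidth, and (3) it costs money on an internet that charges by the packet. Nevertheless, many implementations provide the keepalive timer.

Some server applications hold resources on behalf of a client and need to know whether the client's host has crashed. The keepalive timer provides this detection for them. Many versions of the Telnet and Rlogin servers enable the keepalive option by default.

A common example showing the need for the keepalive timer is a personal computer user who uses TCP/IP to log in to a host with Telnet. If the user simply powers off at the end of the session without logging off, a half-open connection is left behind. In Figure 18.16 we saw how sending data across a half-open connection causes a reset to be returned, but that was from the client end, where the client was sending the data. If the client disappears, leaving the half-open connection on the server's end, and the server is waiting for data from the client, the server will wait forever. The keepalive feature is intended to detect these half-open connections from the server side.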

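2. How does keepalive work? [1]

In this description, we call the end that enables the keepalive option the server and the other end the client. Nothing prevents a client from setting this option, but it is normally set on the server. Both ends can set it when each needs to know whether the other has disappeared (NFS, for example).

If there is no activity on a given connection for two hours, the server sends a probe segment to the client. (We will see what the probe segment looks like in the examples below.) The client host must be in one of four states: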

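1) The client host is still up and running and reachable from the server. The client's TCP responds normally, so the server knows that the other end is still up. The server's TCP resets the keepalive timer for two hours in the future. If application traffic crosses the connection before those two hours expire, the timer is reset again for two hours in the future, following the exchange of data.

2) The client's host has crashed and is either down or in the process of rebooting. In either case, its TCP does not respond. The server receives no response to its probe and times out after 75 seconds. The server sends a total of 10 of these probes, 75 seconds apart, and if it receives no response at all, it considers the client's host down and terminates the connection.

3) The client's host has crashed and rebooted. The server receives a response to its keepalive probe, but the response is a reset, which causes the server to terminate the connection.

4) The client's host is up and running but unreachable from the server. This is the same as state 2, because TCP cannot tell the two apart. All it knows is that no reply was received to its probes.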

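The server does not have to worry about the client's host being shut down and then rebooted (meaning an orderly shutdown by an operator, not a crash). When an operator shuts the system down, all application processes (that is, the client process) are terminated, and the client's TCP sends a FIN on the connection. On receiving the FIN, the server's TCP reports an end-of-file to the server process, which lets the server detect this condition.

In the first state, the server application has no idea that a keepalive probe took place. Everything is handled at the TCP layer, and the probes are transparent to the application until state 2, 3, or 4 occurs. In those three states, the server's TCP returns an error to the server application. (Normally the server has issued a read from the network and is waiting for data from the client. If the keepalive feature returns an error, it is returned to the server as the return value of the read.) In state 2 the error is something like "connection timed out." In state 3 it is "connection reset by peer." State 4 may look like a connection timeout, or may produce a different error, depending on whether an ICMP error related to the connection is received.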
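As a rough illustration of how a server might interpret those errors, the sketch below maps the errno left by a failed read to the client states listed above. The enum and function names are hypothetical, and the assumption that state 4 with an ICMP error surfaces as EHOSTUNREACH or ENETUNREACH reflects common behavior rather than anything the text guarantees.

```c
#include <errno.h>

/* Hypothetical labels for the outcomes described in the text. */
enum client_state {
    CLIENT_NO_REPLY,     /* state 2, or state 4 without an ICMP error */
    CLIENT_REBOOTED,     /* state 3: the probe was answered with a reset */
    CLIENT_UNREACHABLE,  /* state 4 with an ICMP error (assumed mapping) */
    CLIENT_OTHER         /* some error unrelated to keepalive */
};

/* Classifies the errno value reported by a failed read()/recv() on a
 * socket that has keepalive enabled. */
enum client_state keepalive_state_from_errno(int err)
{
    switch (err) {
    case ETIMEDOUT:
        return CLIENT_NO_REPLY;      /* "connection timed out" */
    case ECONNRESET:
        return CLIENT_REBOOTED;      /* "connection reset by peer" */
    case EHOSTUNREACH:
    case ENETUNREACH:
        return CLIENT_UNREACHABLE;
    default:
        return CLIENT_OTHER;
    }
}
```

Because TCP cannot distinguish state 2 from state 4 when no ICMP error arrives, both collapse into CLIENT_NO_REPLY here.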

3. How is keepalive used on Linux? [2]

Linux has built-in support for keepalive. You need to enable TCP/IP networking in order to use it. You also need procfs support and sysctl support to be able to configure the kernel parameters at runtime.

The procedures involving keepalive use three user-driven variables:

tcp_keepalive_time

the interval between the last data packet sent (simple ACKs are not considered data) and the first keepalive probe; after the connection is marked to need keepalive, this counter is not used any further

tcp_keepalive_intvl

the interval between subsequent keepalive probes, regardless of what the connection has exchanged in the meantime

tcp_keepalive_probes

the number of unacknowledged probes to send before considering the connection dead and notifying the application layer

Remember that keepalive support, even if configured in the kernel, is not the default behavior in Linux. Programs must request keepalive control for their sockets using the setsockopt interface. There are relatively few programs implementing keepalive, but you can easily add keepalive support for most of them following the instructions.

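As the passage above makes clear, the Linux kernel includes keepalive support. It uses three parameters: tcp_keepalive_time (the idle time before keepalive starts), tcp_keepalive_intvl (the interval between keepalive probes), and tcp_keepalive_probes (the number of probes sent when the peer does not answer). How are these three parameters configured?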

There are two ways to configure keepalive parameters inside the kernel via userspace commands:

procfs interface
sysctl interface

We mainly discuss how this is accomplished on the procfs interface because it's the most used, recommended and the easiest to understand. The sysctl interface, particularly regarding the sysctl(2) syscall and not the sysctl(8) tool, is only here for the purpose of background knowledge.

The procfs interface

This interface requires both sysctl and procfs to be built into the kernel, and procfs mounted somewhere in the filesystem (usually on /proc, as in the examples below). You can read the values for the actual parameters by "catting" files in the /proc/sys/net/ipv4/ directory:

 # cat /proc/sys/net/ipv4/tcp_keepalive_time
 7200
 # cat /proc/sys/net/ipv4/tcp_keepalive_intvl
 75
 # cat /proc/sys/net/ipv4/tcp_keepalive_probes
 9

The first two parameters are expressed in seconds, and the last is a plain count. This means that the keepalive routines wait for two hours (7200 secs) before sending the first keepalive probe, and then resend it every 75 seconds. If nine consecutive probes receive no ACK, the connection is marked as broken.
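To make the arithmetic concrete, the sketch below estimates how long a dead peer goes unnoticed: the idle time before the first probe plus one interval per unanswered probe. The function name is illustrative, and the formula is an approximation that ignores scheduling jitter.

```c
/* Approximate number of seconds between the last activity on a connection
 * and the moment it is declared broken.
 * idle   - seconds of inactivity before the first probe (tcp_keepalive_time)
 * intvl  - seconds between probes (tcp_keepalive_intvl)
 * probes - unanswered probes tolerated (tcp_keepalive_probes) */
long keepalive_detection_seconds(long idle, long intvl, long probes)
{
    return idle + intvl * probes;
}
```

With the default values shown above this gives 7200 + 75 * 9 = 7875 seconds, a little over two hours and eleven minutes.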

Modifying these values is straightforward: you write new values into the files. Suppose you decide to configure the host so that keepalive starts after ten minutes of channel inactivity and then sends probes at intervals of one minute. Because of the high instability of your network trunk and the low value of the interval, suppose you also want to increase the number of probes to 20.

Here's how we would change the settings:

 # echo 600 > /proc/sys/net/ipv4/tcp_keepalive_time
 # echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl
 # echo 20 > /proc/sys/net/ipv4/tcp_keepalive_probes

To be sure the changes took effect, reread the files and confirm that the new values appear in place of the old ones.
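The same check can be made from a program. The following is a minimal sketch, assuming procfs is mounted on /proc; the helper name read_kernel_param is hypothetical and simply parses one integer from whatever file it is given.

```c
#include <stdio.h>

/* Reads a single integer from a procfs-style file such as
 * /proc/sys/net/ipv4/tcp_keepalive_time.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int read_kernel_param(const char *path, long *out)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;              /* file missing or not readable */

    int parsed = (fscanf(fp, "%ld", out) == 1);
    fclose(fp);
    return parsed ? 0 : -1;     /* -1 if the file held no integer */
}
```

Calling read_kernel_param("/proc/sys/net/ipv4/tcp_keepalive_time", &value) after the echo commands would be expected to store 600 in value.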

The three parameters above are now configured. To make the settings persist across reboots, see reference [2].

4. How is keepalive used in a program? [2]-[4]

All you need to enable keepalive for a specific socket is to set the specific socket option on the socket itself. The prototype of the function is as follows:

int setsockopt(int s, int level, int optname,
                 const void *optval, socklen_t optlen)

The first parameter is the socket, previously created with socket(2); the second one must be SOL_SOCKET, and the third must be SO_KEEPALIVE. The fourth parameter must be a boolean integer value, indicating that we want to enable the option, while the last is the size of the value passed before.

According to the manpage, 0 is returned upon success, and -1 is returned on error (and errno is properly set).

There are also three other socket options you can set for keepalive when you write your application. They all use the SOL_TCP level instead of SOL_SOCKET, and they override system-wide variables only for the current socket. If you read without writing first, the current system-wide parameters will be returned.

TCP_KEEPCNT: overrides tcp_keepalive_probes

TCP_KEEPIDLE: overrides tcp_keepalive_time

TCP_KEEPINTVL: overrides tcp_keepalive_intvl

As we can see, keepalive is an on/off option that is enabled through a function call. Specifically, the following code can be used:

int keepAlive = 1; // enable the keepalive option
setsockopt(rs, SOL_SOCKET, SO_KEEPALIVE, (void *)&keepAlive, sizeof(keepAlive));
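The call above discards the return value of setsockopt. A slightly fuller sketch might check the result and read the option back with getsockopt; the two function names below are illustrative and do not come from the referenced material.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Enables SO_KEEPALIVE on fd.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_keepalive(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* Returns 1 if keepalive is enabled on fd, 0 if it is disabled,
 * and -1 if the option could not be read. */
int keepalive_enabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) < 0)
        return -1;
    return val != 0;
}
```

A server would typically call enable_keepalive on each socket returned by accept and log or handle a -1 result.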

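The second parameter mentioned in the English material above can instead be SOL_TCP, which allows the three keepalive parameters to be set (see reference [3] for the code); doing this in a program requires the header file "netinet/tcp.h". Of course, in practice the keepalive parameters can also be configured through system calls.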
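The per-socket overrides might look like the sketch below. It is not the code from reference [3]; the function names are hypothetical, and it passes IPPROTO_TCP as the level, which has the same value as SOL_TCP on Linux.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_KEEPIDLE, TCP_KEEPINTVL, TCP_KEEPCNT */

/* Enables keepalive on fd and overrides the three system-wide parameters
 * for this socket only. Returns 0 on success, -1 on the first failure. */
int set_keepalive_params(int fd, int idle, int intvl, int cnt)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0)
        return -1;   /* overrides tcp_keepalive_time */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) < 0)
        return -1;   /* overrides tcp_keepalive_intvl */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
        return -1;   /* overrides tcp_keepalive_probes */
    return 0;
}

/* Reads one integer TCP-level option back; returns -1 on error. */
int get_tcp_option(int fd, int optname)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_TCP, optname, &val, &len) < 0)
        return -1;
    return val;
}
```

For example, set_keepalive_params(fd, 600, 60, 20) applies the values from the procfs example to a single socket without touching the system-wide settings.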

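For the other setsockopt parameters, see reference [4].

5. How can you tell whether a TCP connection has been broken? [3]

When TCP detects that the peer socket is no longer usable (a probe cannot be sent, or a probe receives no ACK in response), select reports the socket as readable, recv returns -1, and errno is set to ETIMEDOUT.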
