MQTT: A Brief Look at the keepAlive Source Code in paho.mqtt.golang

Posted by Andre930 on 2020-12-13

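# After reading this article you will understand:
# 1. The MQTT Keep Alive mechanism
# 2. How the MQTT Keep Alive mechanism is implemented in Go
# 3. Suggestions for choosing a Keep Alive duration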

Preface

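While load-testing MQTT at work recently, I found that a small number of MQTT devices had their connection actively dropped at the protocol layer while publishing. The only error the client caught was var ErrNotConnected = errors.New("Not Connected"), yet the network was known to be fine and the developers said the MQTT broker was nowhere near full load. So I started reading the relevant source code of the paho.mqtt.golang package to look for useful clues, which led to this article.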

1. The MQTT Keep Alive mechanism

MQTT Keep Alive

MQTT includes a keep alive function that provides a workaround for the issue of half-open connections (or at least makes it possible to assess if the connection is still open).

**Keep alive ensures that the connection between the broker and client is still open and that the broker and the client are aware of being connected.** When the client establishes a connection to the broker, the client communicates a time interval in seconds to the broker. This interval defines the maximum length of time that the broker and client may not communicate with each other.
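To make "communicates a time interval in seconds" concrete: in MQTT 3.1.1 the Keep Alive value travels in the CONNECT packet as a 16-bit integer, most significant byte first, so it can be at most 65535 seconds, and a value of 0 turns the mechanism off. The sketch below only illustrates that encoding. `encodeKeepAlive` is a hypothetical helper, not a function from paho.mqtt.golang.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeKeepAlive returns the two bytes that carry the Keep Alive value
// (in seconds) inside a CONNECT packet, most significant byte first.
// Hypothetical helper for illustration only.
func encodeKeepAlive(seconds uint16) [2]byte {
	var b [2]byte
	binary.BigEndian.PutUint16(b[:], seconds)
	return b
}

func main() {
	for _, s := range []uint16{0, 60, 300, 65535} {
		b := encodeKeepAlive(s)
		fmt.Printf("keepAlive=%5ds -> % x\n", s, b[:])
	}
}
```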

The MQTT specification says the following:

“The Keep Alive … is the maximum time interval that is permitted to elapse between the point at which the Client finishes transmitting one Control Packet and the point it starts sending the next. It is the responsibility of the Client to ensure that the interval between Control Packets being sent does not exceed the Keep Alive value. In the absence of sending any other Control Packets, the Client MUST send a PINGREQ Packet.”

As long as messages are exchanged frequently and the keep-alive interval is not exceeded, there is no need to send an extra message to establish whether the connection is still open.

If the client does not send a message during the keep-alive period, it must send a PINGREQ packet to the broker to confirm that it is available and to make sure that the broker is also still available.
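For reference, PINGREQ and PINGRESP are the smallest MQTT packets: each is a two-byte fixed header with no payload. The following sketch shows their MQTT 3.1.1 byte values. The variable names and the `isPingResp` helper are made up for illustration and are not part of any library.

```go
package main

import (
	"bytes"
	"fmt"
)

// Fixed two-byte encodings from MQTT 3.1.1: the packet type sits in the
// high nibble of the first byte, the second byte is the remaining length (0).
var (
	pingReq  = []byte{0xC0, 0x00} // PINGREQ, client -> broker
	pingResp = []byte{0xD0, 0x00} // PINGRESP, broker -> client
)

// isPingResp reports whether buf holds exactly one PINGRESP packet.
// Hypothetical helper for illustration only.
func isPingResp(buf []byte) bool {
	return bytes.Equal(buf, pingResp)
}

func main() {
	fmt.Printf("PINGREQ  = % x\n", pingReq)
	fmt.Printf("PINGRESP = % x\n", pingResp)
	fmt.Println("is PINGRESP:", isPingResp([]byte{0xD0, 0x00}))
}
```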

**The broker must disconnect a client that does not send a message or a PINGREQ packet in one and a half times the keep alive interval.** Likewise, the client is expected to close the connection if it does not receive a response from the broker in a reasonable amount of time.
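The "one and a half times" rule is simple arithmetic, sketched below under the assumption that the broker applies exactly the 1.5x factor quoted above. `brokerDeadline` is a hypothetical name. Real brokers may add their own tolerance.

```go
package main

import (
	"fmt"
	"time"
)

// brokerDeadline returns how long a broker waits without receiving any
// packet from the client before it drops the connection: 1.5 x keep alive.
// A keep alive of 0 disables the mechanism, reported here as ok == false.
// Hypothetical helper for illustration only.
func brokerDeadline(keepAliveSeconds uint16) (d time.Duration, ok bool) {
	if keepAliveSeconds == 0 {
		return 0, false
	}
	return time.Duration(keepAliveSeconds) * time.Second * 3 / 2, true
}

func main() {
	for _, k := range []uint16{0, 10, 30, 60} {
		if d, ok := brokerDeadline(k); ok {
			fmt.Printf("keepAlive=%ds -> broker disconnects after %v of silence\n", k, d)
		} else {
			fmt.Printf("keepAlive=%ds -> keep alive disabled\n", k)
		}
	}
}
```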

The above is quoted from: https://www.hivemq.com/blog/mqtt-essentials-part-10-alive-client-take-over/

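From the above we know that MQTT supports keepAlive at the protocol level so that the client and broker can maintain a long-lived connection. To keep such a connection alive, the two sides must exchange traffic at intervals, either ordinary packets that carry a payload or the dedicated PINGREQ/PINGRESP packets. Since MQTT runs on top of TCP, where a dead peer can go unnoticed (the half-open connection mentioned above), the need for this keepAlive exchange between client and broker is easy to understand.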

2. A brief analysis of the MQTT keepAlive source code in Go

Without further ado, here is the source:

// This is the ping source code from the [paho.mqtt.golang](https://github.com/eclipse/paho.mqtt.golang) package
func keepalive(c *client) {
	defer c.workers.Done()
	DEBUG.Println(PNG, "keepalive starting")
	var checkInterval int64
	var pingSent time.Time
	// Read the configured keepAlive value from the client object, i.e. the one set via opts.SetKeepAlive(time.Duration(ktime) * time.Second)
	if c.options.KeepAlive > 10 {
		checkInterval = 5
	} else {
		checkInterval = c.options.KeepAlive / 2
	}
	// Create a ticker that periodically checks the client's keepAlive state.
	// Note: the ticker interval is at most 5s: 5s when KeepAlive > 10, otherwise KeepAlive/2 (integer division).
	intervalTicker := time.NewTicker(time.Duration(checkInterval * int64(time.Second)))
	defer intervalTicker.Stop()
	for {
		select {
		case <-c.stop:
			DEBUG.Println(PNG, "keepalive stopped")
			return
		case <-intervalTicker.C:
			lastSent := c.lastSent.Load().(time.Time)         // time the client last sent a packet (any packet, not only PINGREQ)
			lastReceived := c.lastReceived.Load().(time.Time) // time the client last received a packet (any packet, not only PINGRESP)

			DEBUG.Println(PNG, "ping check", time.Since(lastSent).Seconds())
			// If the time since the last send >= keepAlive, or the time since the last receive >= keepAlive, enter the PINGREQ-sending logic.
			if time.Since(lastSent) >= time.Duration(c.options.KeepAlive*int64(time.Second)) || time.Since(lastReceived) >= time.Duration(c.options.KeepAlive*int64(time.Second)) {
				if atomic.LoadInt32(&c.pingOutstanding) == 0 {
					DEBUG.Println(PNG, "keepalive sending ping")
					ping := packets.NewControlPacket(packets.Pingreq).(*packets.PingreqPacket)
					// We don't want to wait behind large messages being sent, the Write call
					// will block until it is able to send the packet.
					atomic.StoreInt32(&c.pingOutstanding, 1) // mark a ping as outstanding
					ping.Write(c.conn)                       // send the PINGREQ packet to the broker
					c.lastSent.Store(time.Now())             // update the time of the client's last send
					pingSent = time.Now()                    // record when this PINGREQ was sent
				}
			}
      
			// Check whether the client has received the PINGRESP packet.
			// Note: the default PingTimeout is 10 * time.Second.
			// If c.pingOutstanding > 0 and the time since pingSent >= PingTimeout, the client has not received the broker's PINGRESP.
			if atomic.LoadInt32(&c.pingOutstanding) > 0 && time.Now().Sub(pingSent) >= c.options.PingTimeout {
				CRITICAL.Println(PNG, "pingresp not received, disconnecting")
				c.errors <- errors.New("pingresp not received, disconnecting")
				return
			}
		}
	}
}
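The two conditions inside the loop can be pulled out into pure functions, which makes the decision logic easier to read and to test. This is only a sketch of the excerpt's logic as I read it. `needPing` and `pingTimedOut` are hypothetical names, and the real client works on atomics and timestamps instead of plain arguments.

```go
package main

import (
	"fmt"
	"time"
)

// needPing mirrors the first condition in the excerpt: a PINGREQ is due when
// nothing has been sent, or nothing has been received, for keepAlive or
// longer, and no earlier ping is still waiting for its PINGRESP.
func needPing(sinceSent, sinceReceived, keepAlive time.Duration, pingOutstanding bool) bool {
	idle := sinceSent >= keepAlive || sinceReceived >= keepAlive
	return idle && !pingOutstanding
}

// pingTimedOut mirrors the second condition: a ping is outstanding and the
// PINGRESP has not arrived within pingTimeout, so the client disconnects.
func pingTimedOut(pingOutstanding bool, sincePing, pingTimeout time.Duration) bool {
	return pingOutstanding && sincePing >= pingTimeout
}

func main() {
	keepAlive := 30 * time.Second
	pingTimeout := 10 * time.Second

	// Busy connection: traffic 2s ago in both directions, no ping needed.
	fmt.Println(needPing(2*time.Second, 2*time.Second, keepAlive, false))
	// Nothing received for 31s: a ping is due.
	fmt.Println(needPing(2*time.Second, 31*time.Second, keepAlive, false))
	// Ping sent 12s ago and still unanswered: the client gives up.
	fmt.Println(pingTimedOut(true, 12*time.Second, pingTimeout))
}
```

Under these assumptions, a client whose broker goes silent notices it after roughly keepAlive plus PingTimeout, plus up to a couple of check intervals, because both conditions are only evaluated when the ticker fires.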




3. Summary

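From the ping source code we can see that the keepalive implementation is fairly simple: it is a polling loop driven by a time.Ticker that periodically checks whether the client needs to send a PINGREQ packet and whether the PINGRESP packet has been received.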

  • The neat handling of the ticker's polling period

    if c.options.KeepAlive > 10 {
    	checkInterval = 5
    } else {
    	checkInterval = c.options.KeepAlive / 2
    }
    intervalTicker := time.NewTicker(time.Duration(checkInterval * int64(time.Second)))
    

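    I think that too short a ticker period makes the CPU switch context frequently. With the logic above, the ticker fires every 5s once keepAlive exceeds 10s, and more often (every keepAlive/2 seconds) at 10s or below. So my personal suggestion is to set the keepAlive interval between 30 and 60s, and I would not recommend setting it below 10s.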
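To see what the interval selection above yields for different settings, here is a small sketch that reproduces just that branch. `checkInterval` as a standalone function is my own extraction, not a function in the library. One thing it makes visible is that a keepAlive of 1 second gives an interval of 0 through integer division, and `time.NewTicker` panics on a non-positive duration, so in the excerpt as quoted such a small value looks unsafe.

```go
package main

import "fmt"

// checkInterval reproduces the interval selection from the excerpt.
// keepAlive and the result are both in whole seconds.
func checkInterval(keepAlive int64) int64 {
	if keepAlive > 10 {
		return 5
	}
	return keepAlive / 2 // integer division: 1 -> 0, 3 -> 1, 10 -> 5
}

func main() {
	for _, k := range []int64{1, 2, 5, 10, 11, 30, 60} {
		fmt.Printf("keepAlive=%2ds -> check every %ds\n", k, checkInterval(k))
	}
}
```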
