Pulsar Client Consumption Model Demystified: Implementing ZeroQueueConsumer in Go

Posted by crossoverJie on 2024-07-29

Original URL: https://www.cnblogs.com/crossoverJie/p/18329716

A while ago I came across this issue in the pulsar-client-go community:

import "github.com/apache/pulsar-client-go/pulsar"

client, err := pulsar.NewClient(pulsar.ClientOptions{
    URL: "pulsar://localhost:6650",
})
if err != nil {
    log.Fatal(err)
}
consumer, err := client.Subscribe(pulsar.ConsumerOptions{
    Topic:             "persistent://public/default/mq-topic-1",
    SubscriptionName:  "sub-1",
    Type:              pulsar.Shared,
    ReceiverQueueSize: 0,
})
if err != nil {
    log.Fatal(err)
}


// Values less than or equal to 0 are reset to 1000
const (  
    defaultReceiverQueueSize = 1000  
)
if options.ReceiverQueueSize <= 0 {  
    options.ReceiverQueueSize = defaultReceiverQueueSize  
}

The reporter found that when ReceiverQueueSize is manually set to 0 in the pulsar-client-go client, the client resets it to 1000 during initialization.

if options.ReceiverQueueSize < 0 {  
    options.ReceiverQueueSize = defaultReceiverQueueSize  
}

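And if the source is patched so that 0 is accepted, as in the check above, consumption stops working: the consumer stays in a waiting state and never receives any data.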

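After investigating, I found that the Pulsar Go client lacked an implementation equivalent to ZeroQueueConsumerImpl, a class whose main purpose is to give fine-grained control over consumption.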

If you'd like to have tight control over message dispatching across consumers, set the consumers' receiver queue size very low (potentially even to 0 if necessary). Each consumer has a receiver queue that determines how many messages the consumer attempts to fetch at a time. For example, a receiver queue of 1000 (the default) means that the consumer attempts to process 1000 messages from the topic's backlog upon connection. Setting the receiver queue to 0 essentially means ensuring that each consumer is only doing one thing at a time.

https://pulsar.apache.org/docs/next/cookbooks-message-queue/#client-configuration-changes

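As the official documentation says, ReceiverQueueSize can be set to 0; the consumer then consumes messages one at a time instead of accumulating them in the client-side queue.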

Client consumption logic

This is a good opportunity to review how the Pulsar client consumes messages, which is needed to understand what ReceiverQueueSize does and how ZeroQueueConsumerImpl can be implemented in pulsar-client-go.

The Pulsar client's consumption model combines push and pull:

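As the flow in the diagram describes, when the consumer starts it proactively sends a Flow command to the broker, telling it how many messages to deliver to the client.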

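At the same time, the ReceiverQueueSize parameter is used as the size of an internal queue, and the messages delivered by the broker are stored in that queue.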

Calling the receive function then simply takes data from this queue.

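After each successful consume, an internal AvailablePermit counter is incremented by 1. Once it reaches MaxReceiveQueueSize / 2, the client sends another flow command to the broker, asking it to deliver more messages.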

So the key event here is sending the flow command to the broker; only then are new messages delivered to the client.
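The permit accounting described above can be sketched as a small single-goroutine model. This is only an illustration, not the client's code: permitTracker and its fields are hypothetical names, and the sketch assumes the initial flow command asks for exactly the queue size.

```go
package main

import "fmt"

// permitTracker is a hypothetical, single-goroutine model of the client-side
// permit accounting. It is not the pulsar-client-go implementation.
type permitTracker struct {
	queueSize int   // stands in for ReceiverQueueSize
	available int   // stands in for AvailablePermit
	flows     []int // permits carried by each simulated flow command
}

// start models the flow command sent when the consumer starts.
// Assumption: the initial request equals the queue size.
func (p *permitTracker) start() {
	if p.queueSize > 0 {
		p.flows = append(p.flows, p.queueSize)
	}
}

// consumed records one processed message and issues a flow command once the
// accumulated permits reach half of the queue size.
func (p *permitTracker) consumed() {
	p.available++
	if p.available >= p.queueSize/2 {
		p.flows = append(p.flows, p.available)
		p.available = 0
	}
}

func main() {
	p := &permitTracker{queueSize: 1000}
	p.start()
	for i := 0; i < 1200; i++ {
		p.consumed()
	}
	fmt.Println(p.flows, p.available) // prints: [1000 500 500] 200
}
```

In this model a consumer that stops processing never reaches the threshold again, so no further flow command is recorded.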

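Developers have often asked me to investigate consumers that could not consume, and the root cause almost always turned out to be slow processing: AvailablePermit did not grow, so the broker was never asked to push new messages to the client.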

The visible symptom is that consumption is very slow.

How ZeroQueueConsumerImpl works

Let's look at how ZeroQueueConsumerImpl manages to consume with a queue size of 0.

When the consumer is built, the queue size determines whether a regular consumer or a ZeroQueueConsumerImpl is created.

@Override  
protected CompletableFuture<Message<T>> internalReceiveAsync() {  
    CompletableFuture<Message<T>> future = super.internalReceiveAsync();  
    if (!future.isDone()) {  
        // We expect the message to be not in the queue yet  
        increaseAvailablePermits(cnx());  
    }  
    return future;  
}

This is a consume method overridden by ZeroQueueConsumerImpl; the key part is the call increaseAvailablePermits(cnx()).

    void increaseAvailablePermits(ClientCnx currentCnx) {
        increaseAvailablePermits(currentCnx, 1);
    }

    protected void increaseAvailablePermits(ClientCnx currentCnx, int delta) {
        int available = AVAILABLE_PERMITS_UPDATER.addAndGet(this, delta);
        while (available >= getCurrentReceiverQueueSize() / 2 && !paused) {
            if (AVAILABLE_PERMITS_UPDATER.compareAndSet(this, available, 0)) {
                sendFlowPermitsToBroker(currentCnx, available);
                break;
            } else {
                available = AVAILABLE_PERMITS_UPDATER.get(this);
            }
        }
    }

The source shows that the logic is to increment AvailablePermit and, once the threshold is reached, ask the broker to deliver messages.

Because the queue size in ZeroQueueConsumerImpl is 0, the condition available >= getCurrentReceiverQueueSize() / 2 is always true.

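In other words, for every message consumed the client asks the broker to deliver one more message, which gives precise control over each individual message.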
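To make the threshold arithmetic concrete, here is a rough Go rendering of the Java loop quoted above. It is a sketch under assumptions: flowController and its fields are hypothetical names, the paused check is omitted, and the sent slice stands in for the real flow command, so the example is only meant to be driven from a single goroutine.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// flowController is a hypothetical Go rendering of the Java permit loop.
// It is not part of pulsar-client-go.
type flowController struct {
	queueSize int32
	available int32   // accessed only through sync/atomic
	sent      []int32 // permits carried by each simulated flow command
}

// increaseAvailablePermits adds delta permits and, once the total reaches
// queueSize/2, resets the counter with a compare-and-swap and records a
// simulated flow command carrying the accumulated permits.
func (f *flowController) increaseAvailablePermits(delta int32) {
	available := atomic.AddInt32(&f.available, delta)
	for available >= f.queueSize/2 {
		if atomic.CompareAndSwapInt32(&f.available, available, 0) {
			f.sent = append(f.sent, available)
			return
		}
		available = atomic.LoadInt32(&f.available)
	}
}

func main() {
	// Queue size 0: the threshold is 0, so every call triggers a flow of 1.
	zero := &flowController{queueSize: 0}
	for i := 0; i < 3; i++ {
		zero.increaseAvailablePermits(1)
	}
	fmt.Println(zero.sent) // prints: [1 1 1]

	// Queue size 4: the threshold is 2, so permits are batched in pairs.
	regular := &flowController{queueSize: 4}
	for i := 0; i < 4; i++ {
		regular.increaseAvailablePermits(1)
	}
	fmt.Println(regular.sent) // prints: [2 2]
}
```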

Implementation in pulsar-client-go

To bring this capability to pulsar-client-go, I submitted a PR to solve the problem.

The analysis above already explains why manually setting ReceiverQueueSize to 0 makes it impossible to consume messages.

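The root cause is that, because the queue size is 0 at initialization, no flow command is sent to the broker, so no messages are pushed to the client and nothing can be consumed.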

So we still have to follow Java's ZeroQueueConsumerImpl and manually increase availablePermits on every consume.

To that end I added a new consumer, zeroQueueConsumer.

// EnableZeroQueueConsumer, if enabled, the ReceiverQueueSize will be 0.  
// Notice: only non-partitioned topic is supported.  
// Default is false.  
EnableZeroQueueConsumer bool

consumer, err := client.Subscribe(ConsumerOptions{  
    Topic:                   topicName,  
    SubscriptionName:        "sub-1",  
    Type:                    Shared,  
    NackRedeliveryDelay:     1 * time.Second,  
    EnableZeroQueueConsumer: true,  
})

if options.EnableZeroQueueConsumer {  
    options.ReceiverQueueSize = 0  
}

When creating a consumer you specify whether to enable ZeroQueueConsumer; when it is enabled, ReceiverQueueSize is set to 0.

// A default value can be set on the field.
private int receiverQueueSize = 1000;

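Unlike Java, Go cannot assign default values to fields when a struct is initialized, and Go's int type has a zero value (0). As a result, there is no way to tell whether ReceiverQueueSize=0 was set deliberately by the user or is just the zero value of an omitted parameter.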

That is why an extra option is needed to explicitly opt in to ZeroQueueConsumer.
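The zero-value ambiguity can be shown with a trimmed-down example. The options struct and effectiveQueueSize function below are hypothetical stand-ins for illustration, not the library's types; they only mirror the two fields and the defaulting checks quoted in this post.

```go
package main

import "fmt"

const defaultReceiverQueueSize = 1000

// options is a hypothetical, trimmed-down stand-in for the consumer options.
type options struct {
	ReceiverQueueSize       int
	EnableZeroQueueConsumer bool
}

// effectiveQueueSize resolves the queue size: the explicit flag wins,
// otherwise any value <= 0 falls back to the default.
func effectiveQueueSize(o options) int {
	if o.EnableZeroQueueConsumer {
		return 0
	}
	if o.ReceiverQueueSize <= 0 {
		return defaultReceiverQueueSize
	}
	return o.ReceiverQueueSize
}

func main() {
	omitted := options{}
	explicitZero := options{ReceiverQueueSize: 0}

	// The two values are identical, so the intent cannot be recovered.
	fmt.Println(omitted == explicitZero) // prints: true

	fmt.Println(effectiveQueueSize(explicitZero))                           // prints: 1000
	fmt.Println(effectiveQueueSize(options{EnableZeroQueueConsumer: true})) // prints: 0
}
```

A pointer field (*int) would be another way to distinguish "unset" from 0, but a separate boolean keeps the existing field's type unchanged.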

Later, when the consumer is created, a check is made: a zeroQueueConsumer is created only if the topic is non-partitioned and EnableZeroQueueConsumer is enabled.

The PARTITIONED_METADATA command can be used to have the broker return the number of partitions.

func (z *zeroQueueConsumer) Receive(ctx context.Context) (Message, error) {
	if state := z.pc.getConsumerState(); state == consumerClosed || state == consumerClosing {
		z.log.WithField("state", state).Error("Failed to ack by closing or closed consumer")
		return nil, errors.New("consumer state is closed")
	}
	z.Lock()
	defer z.Unlock()
	z.pc.availablePermits.inc()
	for {
		select {
		case <-z.closeCh:
			return nil, newError(ConsumerClosed, "consumer closed")
		case cm, ok := <-z.messageCh:
			if !ok {
				return nil, newError(ConsumerClosed, "consumer closed")
			}
			return cm.Message, nil
		case <-ctx.Done():
			return nil, ctx.Err()
		}
	}

}

The key line is: z.pc.availablePermits.inc()

The consume logic matches Java's ZeroQueueConsumerImpl: availablePermits is incremented once before each message is consumed.

pulsar-client-go works much like the Java client: messages are also stored in an internal queue, so consuming a message only requires reading from that queue, messageCh.

Note that the pulsar-client-go zeroQueueConsumer does not support reading the internal queue directly.

func (z *zeroQueueConsumer) Chan() <-chan ConsumerMessage {  
    panic("zeroQueueConsumer cannot support Chan method")  
}

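It panics outright: if the user consumed the channel directly, the client would have no way to send flow commands on the user's behalf, so this feature has to be disabled and messages can only be obtained by actively calling Receive.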

A long time ago I also drew a flow chart of the pulsar client's consumption process; I am considering a follow-up article that analyzes how the pulsar client works.

References:

https://github.com/apache/pulsar-client-go/issues/1223
https://cloud.tencent.com/developer/article/2307608
https://pulsar.apache.org/docs/next/cookbooks-message-queue/#client-configuration-changes
https://github.com/apache/pulsar-client-go/pull/1225

