How KafkaConsumer Handles Transactional Messages

Posted by devos on 2018-08-30

Original URL: https://www.cnblogs.com/devos/p/9562993.html

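Once Kafka added its transaction mechanism, the consumer had a new problem to solve: how to filter aborted messages out of the messages it receives. Through cooperation between the broker and the consumer, together with a series of optimizations, Kafka greatly reduces the cost of this work.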

The Problem

First, let's look at where the difficulty lies.

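A consumer whose isolation.level is read_committed only wants committed messages. In the broker's storage, however, committed messages, aborted messages, and messages of transactions still in progress sit right next to each other in the log, and they may come from different producerIds. So if the broker handled a FetchRequest the same way it did before transactions were introduced, the consumer would have to do a lot of cleanup work and would need to buffer messages until the control marker arrives. That would waste a lot of network traffic for nothing, and memory management on the consumer side would become a real problem.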
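For reference, a consumer opts into this mode through its configuration. The sketch below only builds the Properties object, so it runs without a Kafka dependency; the class name, the server address, and the group id are placeholders chosen for illustration, while the configuration keys and the read_committed value are standard consumer settings.

```java
import java.util.Properties;

// Illustrative helper: builds consumer settings for a read_committed consumer.
class ReadCommittedConfigSketch {
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // The default is read_uncommitted, which also returns aborted messages.
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address and group id.
        Properties props = consumerProps("localhost:9092", "demo-group");
        System.out.println(props.getProperty("isolation.level"));
    }
}
```

These properties would then be passed to the KafkaConsumer constructor in an application that has the Kafka client library on its classpath.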

The Solution

Broadly speaking, Kafka combines three measures to solve this problem.

LSO

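Kafka introduced an important concept called the LSO, the last stable offset. For a given TopicPartition, every transactional message whose offset is below the LSO has a decided state: it is either committed or aborted. For a read_committed consumer, the broker only serves messages whose offsets are below the LSO. This spares the consumer from receiving messages in an undecided state and having to buffer them.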
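As a rough illustration of the LSO rule, the sketch below takes the LSO to be the first offset of the earliest still-open transaction, or the high watermark when no transaction is open, and caps a read_committed fetch at it. The class and method names are hypothetical and the model is deliberately simplified; it is not the broker's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the LSO rule; not Kafka's broker implementation.
class LsoSketch {
    // LSO = first offset of the earliest open transaction, capped by the high watermark.
    static long lastStableOffset(long highWatermark, List<Long> openTxnFirstOffsets) {
        long lso = highWatermark;
        for (long first : openTxnFirstOffsets) {
            lso = Math.min(lso, first);
        }
        return lso;
    }

    // Offsets a consumer may see: read_committed stops at the LSO,
    // read_uncommitted stops at the high watermark.
    static List<Long> visibleOffsets(long fetchOffset, long highWatermark,
                                     long lso, boolean readCommitted) {
        long bound = readCommitted ? lso : highWatermark;
        List<Long> out = new ArrayList<>();
        for (long o = fetchOffset; o < bound; o++) {
            out.add(o);
        }
        return out;
    }

    public static void main(String[] args) {
        long hw = 10;
        // Two open transactions whose first messages are at offsets 6 and 8.
        long lso = lastStableOffset(hw, Arrays.asList(6L, 8L));
        System.out.println("LSO = " + lso);                      // LSO = 6
        System.out.println(visibleOffsets(3, hw, lso, true));    // [3, 4, 5]
        System.out.println(visibleOffsets(3, hw, lso, false));   // [3, 4, 5, 6, 7, 8, 9]
    }
}
```

Offsets 6 through 9 stay hidden from the read_committed consumer until the open transactions are decided, so it never has to buffer them.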

Aborted Transaction Index

For each LogSegment (which corresponds to one log file), the broker maintains an aborted transaction index. This is an append-only file: whenever a transaction is aborted, an entry is appended to it. The entry format is:

TransactionEntry =>
    Version => int16
    PID => int64
    FirstOffset => int64
    LastOffset => int64
    LastStableOffset => int64
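To make the layout concrete, here is a minimal sketch that packs and unpacks one entry with the fields in the order listed above, which comes to 2 + 4 × 8 = 34 bytes. The class name, the big-endian byte order (ByteBuffer's default), and the sample values are assumptions for illustration; the actual on-disk encoding used by the broker is not shown in this article.

```java
import java.nio.ByteBuffer;

// Hypothetical encoder for the entry layout listed above; not Kafka's index code.
class AbortedTxnEntrySketch {
    // int16 version + four int64 fields.
    static final int ENTRY_SIZE = 2 + 8 + 8 + 8 + 8;

    static ByteBuffer encode(short version, long pid, long firstOffset,
                             long lastOffset, long lastStableOffset) {
        ByteBuffer buf = ByteBuffer.allocate(ENTRY_SIZE);
        buf.putShort(version)
           .putLong(pid)
           .putLong(firstOffset)
           .putLong(lastOffset)
           .putLong(lastStableOffset);
        buf.flip(); // switch from writing to reading
        return buf;
    }

    // Returns {version, pid, firstOffset, lastOffset, lastStableOffset}.
    static long[] decode(ByteBuffer entry) {
        ByteBuffer b = entry.duplicate(); // leave the caller's position untouched
        short version = b.getShort();
        return new long[] { version, b.getLong(), b.getLong(), b.getLong(), b.getLong() };
    }

    public static void main(String[] args) {
        // Sample values are made up.
        ByteBuffer entry = encode((short) 0, 42L, 100L, 120L, 90L);
        System.out.println("entry size = " + entry.remaining());
        System.out.println(java.util.Arrays.toString(decode(entry)));
    }
}
```

Because every entry has the same fixed size, an append-only file of such entries can be scanned or indexed by simple arithmetic on the entry number.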

Why is this index needed?

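It has to do with a change in the FetchResponse message format. For each TopicPartition it contains, the FetchResponse now carries information about the aborted transactions among that partition's records, and the index is what lets the broker look this information up when it builds the response. The consumer uses this information to filter aborted messages out of the messages in the FetchResponse more efficiently.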

// FetchResponse v4
FetchResponse => ThrottleTime [TopicName [Partition ErrorCode HighwaterMarkOffset LastStableOffset AbortedTransactions MessageSetSize MessageSet]]
  ThrottleTime => int32
  TopicName => string
  Partition => int32
  ErrorCode => int16
  HighwaterMarkOffset => int64
  LastStableOffset => int64
  AbortedTransactions => [PID FirstOffset]
    PID => int64
    FirstOffset => int64
  MessageSetSize => int32
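One plausible way for a broker to fill the AbortedTransactions field is to scan the index for aborted transactions that overlap the range of offsets being returned. The sketch below assumes exactly that; the overlap condition, the IndexEntry type, and the method name are illustrative assumptions, not code quoted from Kafka.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical lookup over aborted transaction index entries; not Kafka's broker code.
class AbortedTxnLookupSketch {
    static final class IndexEntry {
        final long pid;
        final long firstOffset;
        final long lastOffset;
        IndexEntry(long pid, long firstOffset, long lastOffset) {
            this.pid = pid;
            this.firstOffset = firstOffset;
            this.lastOffset = lastOffset;
        }
    }

    // Returns {pid, firstOffset} pairs for aborted transactions that overlap
    // the fetched range [fetchOffset, upperBound).
    static List<long[]> abortedInRange(List<IndexEntry> index, long fetchOffset, long upperBound) {
        List<long[]> result = new ArrayList<>();
        for (IndexEntry e : index) {
            if (e.lastOffset >= fetchOffset && e.firstOffset < upperBound) {
                result.add(new long[] { e.pid, e.firstOffset });
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<IndexEntry> index = Arrays.asList(
                new IndexEntry(1, 0, 4),     // ended before the fetch range
                new IndexEntry(2, 10, 15),   // started before the range, ends inside it
                new IndexEntry(3, 30, 40));  // starts after the range
        for (long[] t : abortedInRange(index, 12, 25)) {
            System.out.println("pid=" + t[0] + " firstOffset=" + t[1]);
        }
    }
}
```

Note that the one matching transaction starts at offset 10, before the fetch offset 12. The consumer therefore has to cope with aborted transactions that began before the first record it received.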

Consumer-side filtering based on the aborted transactions information

(The following applies only to read_committed consumers.)

The consumer uses the aborted transactions supplied in the fetch response to filter out aborted messages, and returns only committed messages to the user.

The core logic is as follows:

First, because the broker only returns messages before the LSO to the consumer, every message the consumer fetches is in one of two possible states: committed or aborted.

The set of PIDs of active aborted transactions

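Next, for each TopicPartition in the fetched data, the consumer maintains a set of producerIds: the PIDs used by the currently active aborted transactions. An aborted transaction is "active" when, during filtering, the offset of the message currently being processed lies between that transaction's initial offset and its last offset. With this set of PIDs of active aborted transactions (the "pid set" below), filtering a message only requires checking whether the message's PID is in the set. If it is, the message is definitely aborted; if not, it is committed.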

The pid set changes continuously during filtering. To maintain it, the consumer also keeps, for each TopicPartition in the fetched data, a min-heap of aborted transactions ordered by each transaction's initial offset.

public static final class AbortedTransaction {
    public final long producerId;
    public final long firstOffset;

    ...
}


private class PartitionRecords {
    private final TopicPartition partition;
    private final CompletedFetch completedFetch;
    private final Iterator<? extends RecordBatch> batches;
    private final Set<Long> abortedProducerIds;
    private final PriorityQueue<FetchResponse.AbortedTransaction> abortedTransactions;

    ...

}


// The initialization of this heap shows that it is ordered by offset

private PriorityQueue<FetchResponse.AbortedTransaction> abortedTransactions(FetchResponse.PartitionData partition) {
    if (partition.abortedTransactions == null || partition.abortedTransactions.isEmpty())
        return null;

    PriorityQueue<FetchResponse.AbortedTransaction> abortedTransactions = new PriorityQueue<>(
            partition.abortedTransactions.size(),
            new Comparator<FetchResponse.AbortedTransaction>() {
                @Override
                public int compare(FetchResponse.AbortedTransaction o1, FetchResponse.AbortedTransaction o2) {
                    return Long.compare(o1.firstOffset, o2.firstOffset);
                }
            }
    );
    abortedTransactions.addAll(partition.abortedTransactions);
    return abortedTransactions;
}

In the words of the Kafka documentation:

If the message is a transaction control message, and the status is ABORT, then remove the corresponding PID from the set of PIDs with active aborted transactions. If the status is COMMIT, ignore the message.

If the message is a normal message, compare the offset and PID with the head of the aborted transaction minheap. If the PID matches and the offset is greater than or equal to the corresponding initial offset from the aborted transaction entry, remove the head from the minheap and insert the PID into the set of PIDs with aborted transactions.

Check whether the PID is contained in the aborted transaction set. If so, discard the record set; otherwise, add it to the records to be returned to the user.

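If an abort marker is received (it is itself a message, and it sits alone in its own batch), remove its pid from the pid set, because the aborted transaction for that pid is no longer "active".
If it is a normal message, update the pid set based on the message and the heap of aborted transactions:

If the message's pid equals the pid at the top of the heap, and the message's offset >= the offset in the AbortedTransaction at the top of the heap (the initial offset of the aborted transaction for that pid), then the transaction for this pid can be judged to be an active aborted transaction. Remove that AbortedTransaction from the top of the heap and put its pid into the pid set.
Otherwise, leave the pid set unchanged.
Then check whether the message's pid is in the pid set. If it is, do not put the message into the set of messages returned to the user.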
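The per-message steps above can be sketched as a small self-contained simulation. The Msg and AbortedTxn types and the committedOffsets method are hypothetical stand-ins invented for this example; only the min-heap and pid-set logic follows the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical per-message filter; Msg and AbortedTxn are illustrative, not Kafka classes.
class PerMessageFilterSketch {
    enum Kind { DATA, COMMIT_MARKER, ABORT_MARKER }

    static final class Msg {
        final long offset;
        final long pid;
        final Kind kind;
        Msg(long offset, long pid, Kind kind) {
            this.offset = offset;
            this.pid = pid;
            this.kind = kind;
        }
    }

    static final class AbortedTxn {
        final long pid;
        final long firstOffset;
        AbortedTxn(long pid, long firstOffset) {
            this.pid = pid;
            this.firstOffset = firstOffset;
        }
    }

    // Returns the offsets of the data messages that would be handed to the user.
    static List<Long> committedOffsets(List<Msg> fetched, List<AbortedTxn> aborted) {
        PriorityQueue<AbortedTxn> heap =
                new PriorityQueue<>(Comparator.comparingLong((AbortedTxn t) -> t.firstOffset));
        heap.addAll(aborted);
        Set<Long> activeAbortedPids = new HashSet<>();
        List<Long> out = new ArrayList<>();
        for (Msg m : fetched) {
            if (m.kind == Kind.ABORT_MARKER) {
                activeAbortedPids.remove(m.pid); // the aborted transaction is over
                continue;
            }
            if (m.kind == Kind.COMMIT_MARKER) {
                continue; // commit markers are ignored
            }
            AbortedTxn head = heap.peek();
            if (head != null && head.pid == m.pid && m.offset >= head.firstOffset) {
                heap.poll();
                activeAbortedPids.add(m.pid); // this pid now has an active aborted transaction
            }
            if (!activeAbortedPids.contains(m.pid)) {
                out.add(m.offset);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Msg> fetched = Arrays.asList(
                new Msg(0, 1, Kind.DATA),           // pid 1, aborted transaction
                new Msg(1, 2, Kind.DATA),           // pid 2, committed transaction
                new Msg(2, 1, Kind.DATA),           // pid 1, still aborted
                new Msg(3, 1, Kind.ABORT_MARKER),
                new Msg(4, 2, Kind.COMMIT_MARKER),
                new Msg(5, 1, Kind.DATA),           // pid 1, new committed transaction
                new Msg(6, 1, Kind.COMMIT_MARKER));
        System.out.println(committedOffsets(fetched, Arrays.asList(new AbortedTxn(1, 0)))); // [1, 5]
    }
}
```

The message at offset 5 is kept even though it shares a pid with the aborted messages, because the abort marker at offset 3 removed pid 1 from the set before it was processed.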

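In practice, however, batching makes things somewhat simpler. When a producer sends, messages from different transactions on the same TopicPartition can never share a message batch, and committed and aborted messages can never be in the same batch either. A batch holds messages from a single producer, the messages of that producer's different transactions are always separated by a transaction marker, and a transaction marker is a batch of its own. As a result, a batch is either entirely aborted or entirely committed, so aborted transactions can be filtered one batch at a time rather than one message at a time.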

The relevant code is in PartitionRecords#nextFetchedRecord():

if (isolationLevel == IsolationLevel.READ_COMMITTED && currentBatch.hasProducerId()) {
    // remove from the aborted transaction queue all aborted transactions which have begun
    // before the current batch's last offset and add the associated producerIds to the
    // aborted producer set
    // That is: take out of the aborted transactions those whose initial offset is before the end of the current batch.
    // Such a transaction began before the end of the current batch and had not finished before this batch was processed,
    // so either it is an active aborted transaction or the current batch is a control batch.
    // Keep in mind that an aborted transaction may have begun before all of the records in this fetch.
    consumeAbortedTransactionsUpTo(currentBatch.lastOffset());

    long producerId = currentBatch.producerId();
    if (containsAbortMarker(currentBatch)) {
        // The current batch is an abort marker, so its transaction has ended: remove its pid from the pid set.
        abortedProducerIds.remove(producerId);
    } else if (isBatchAborted(currentBatch)) {
        // The current batch was aborted, so skip it.
        log.debug("Skipping aborted record batch from partition {} with producerId {} and " +
                      "offsets {} to {}",
                  partition, producerId, currentBatch.baseOffset(), currentBatch.lastOffset());
        nextFetchOffset = currentBatch.nextOffset();
        continue;
    }
}
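The excerpt calls consumeAbortedTransactionsUpTo without showing its body. The simulation below is a sketch of how the batch-level loop could work end to end; the Batch and AbortedTxn types, the keptBaseOffsets method, and the body of the helper are assumptions written for this example, not code copied from the Kafka client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical batch-level filter; Batch and AbortedTxn are illustrative, not Kafka classes.
class BatchFilterSketch {
    enum Type { DATA, COMMIT_MARKER, ABORT_MARKER }

    static final class Batch {
        final long baseOffset;
        final long lastOffset;
        final long producerId;
        final Type type;
        Batch(long baseOffset, long lastOffset, long producerId, Type type) {
            this.baseOffset = baseOffset;
            this.lastOffset = lastOffset;
            this.producerId = producerId;
            this.type = type;
        }
    }

    static final class AbortedTxn {
        final long producerId;
        final long firstOffset;
        AbortedTxn(long producerId, long firstOffset) {
            this.producerId = producerId;
            this.firstOffset = firstOffset;
        }
    }

    // Assumed helper body: move every aborted transaction that starts at or
    // before `offset` from the heap into the active pid set.
    static void consumeAbortedTransactionsUpTo(long offset,
                                               PriorityQueue<AbortedTxn> abortedTransactions,
                                               Set<Long> abortedProducerIds) {
        while (!abortedTransactions.isEmpty() && abortedTransactions.peek().firstOffset <= offset) {
            abortedProducerIds.add(abortedTransactions.poll().producerId);
        }
    }

    // Returns the base offsets of the data batches that survive filtering.
    static List<Long> keptBaseOffsets(List<Batch> batches, List<AbortedTxn> aborted) {
        PriorityQueue<AbortedTxn> abortedTransactions =
                new PriorityQueue<>(Comparator.comparingLong((AbortedTxn t) -> t.firstOffset));
        abortedTransactions.addAll(aborted);
        Set<Long> abortedProducerIds = new HashSet<>();
        List<Long> kept = new ArrayList<>();
        for (Batch b : batches) {
            consumeAbortedTransactionsUpTo(b.lastOffset, abortedTransactions, abortedProducerIds);
            if (b.type == Type.ABORT_MARKER) {
                abortedProducerIds.remove(b.producerId); // transaction ended
            } else if (b.type == Type.DATA && !abortedProducerIds.contains(b.producerId)) {
                kept.add(b.baseOffset); // whole batch is committed
            }
            // Commit markers and aborted data batches are skipped.
        }
        return kept;
    }

    public static void main(String[] args) {
        // The fetch starts at offset 5; pid 7's aborted transaction began earlier, at offset 2.
        List<Batch> batches = Arrays.asList(
                new Batch(5, 6, 7, Type.DATA),
                new Batch(7, 8, 9, Type.DATA),
                new Batch(9, 9, 7, Type.ABORT_MARKER),
                new Batch(10, 11, 7, Type.DATA),
                new Batch(12, 12, 7, Type.COMMIT_MARKER),
                new Batch(13, 13, 9, Type.COMMIT_MARKER));
        System.out.println(keptBaseOffsets(batches, Arrays.asList(new AbortedTxn(7, 2)))); // [7, 10]
    }
}
```

Draining the heap by offset, rather than waiting for a matching pid at the top, is what lets this version handle an aborted transaction that began before the first fetched batch.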

Conclusion

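By using the aborted transaction index and the LSO, Kafka lets the consumer efficiently filter out the messages of aborted transactions, which reduces the performance overhead of the transaction mechanism.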
