Kafka Producer Source Code

Posted by weixin_33782386 on 2018-08-14

Original URL: https://blog.csdn.net/weixin_33782386/article/details/87615492


Introduction

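Kafka is a distributed streaming platform that can:

Publish and subscribe to streams of records, similar to a message queue or an enterprise messaging system
Store streams of records in a fault-tolerant way
Process streams of records as they arrive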

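Kafka is mainly used for two classes of applications:

Building real-time streaming data pipelines that reliably move data between systems or applications
Building real-time streaming applications that transform streams of data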

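Kafka has four core APIs:

The Producer API publishes a stream of records to one or more Kafka topics.
The Consumer API subscribes to one or more topics and processes the records in them.
The Streams API acts as a stream processor: it consumes input streams from one or more topics and produces output streams to one or more topics.
The Connector API builds and runs reusable producers or consumers that connect Kafka topics to existing applications or data systems.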

[Figure: kafka_Intro.png]

This article looks at the Producer from the source-code point of view.

Producer

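A Producer publishes data to the topics of its choice. The Producer is responsible for deciding which partition of the topic each record goes to. The simplest form of load balancing is round-robin partitioning; alternatively, a partition function can be used, for example one based on the record's key.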
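To make the two partitioning strategies concrete, here is a minimal sketch in plain Java. It is a simplified model, not Kafka's partitioner: the class name PartitionSketch and its method are hypothetical, and Arrays.hashCode stands in for the murmur2 hash that Kafka's default partitioner applies to the serialized key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified model of partition selection; not Kafka's actual partitioner. */
final class PartitionSketch {
    // Counter used to rotate through partitions for records without a key.
    private final AtomicInteger counter = new AtomicInteger(0);

    /**
     * Picks a partition for a record.
     *
     * @param explicitPartition partition requested by the caller, or null
     * @param keyBytes          serialized key, or null when the record has no key
     * @param numPartitions     number of partitions of the topic (must be > 0)
     */
    int choosePartition(Integer explicitPartition, byte[] keyBytes, int numPartitions) {
        if (explicitPartition != null) {
            return explicitPartition; // the caller pinned the partition
        }
        if (keyBytes == null) {
            // No key: round-robin. The mask keeps the value non-negative after overflow.
            return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
        }
        // Keyed record: equal keys always map to the same partition.
        // Assumption: Arrays.hashCode is only a stand-in for Kafka's murmur2 hash.
        return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        PartitionSketch sketch = new PartitionSketch();
        byte[] key = "user-1".getBytes(StandardCharsets.UTF_8);
        System.out.println("explicit:    " + sketch.choosePartition(2, key, 3));
        System.out.println("round-robin: " + sketch.choosePartition(null, null, 3));
        System.out.println("round-robin: " + sketch.choosePartition(null, null, 3));
        System.out.println("keyed:       " + sketch.choosePartition(null, key, 3));
    }
}
```

Keyed partitioning keeps all records with the same key in one partition, which preserves their relative order; round-robin only spreads load.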

The code below injects a KafkaTemplate and calls its send method to publish a message to the topic "abc123".

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;
    public void send() {
        String message = "2018-08-07 08:21:47578|1|18701046390|001003|0|2|NULL|2018-08-07 08:21:47:544|2018-08-07 08:21:47:578|0|10.200.1.85|10.200.1.147:7022|";
        kafkaTemplate.send("abc123", message);
    }

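Internally, KafkaTemplate's send relies on its doSend method. doSend first checks whether the template is transactional and, if so, asserts that a transaction is in progress. It then obtains a producer instance and calls the producer's send method. When the send callback completes, it calls closeProducer to close the producer, unless the template is transactional.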

    protected ListenableFuture<SendResult<K, V>> doSend(final ProducerRecord<K, V> producerRecord) {
        if (this.transactional) {
            Assert.state(inTransaction(),
                    "No transaction is in process; "
                        + "possible solutions: run the template operation within the scope of a "
                        + "template.executeInTransaction() operation, start a transaction with @Transactional "
                        + "before invoking the template method, "
                        + "run in a transaction started by a listener container when consuming a record");
        }
        final Producer<K, V> producer = getTheProducer();
        if (this.logger.isTraceEnabled()) {
            this.logger.trace("Sending: " + producerRecord);
        }
        final SettableListenableFuture<SendResult<K, V>> future = new SettableListenableFuture<>();
        producer.send(producerRecord, new Callback() {

            @Override
            public void onCompletion(RecordMetadata metadata, Exception exception) {
                try {
                    if (exception == null) {
                        future.set(new SendResult<>(producerRecord, metadata));
                        if (KafkaTemplate.this.producerListener != null) {
                            KafkaTemplate.this.producerListener.onSuccess(producerRecord, metadata);
                        }
                        if (KafkaTemplate.this.logger.isTraceEnabled()) {
                            KafkaTemplate.this.logger.trace("Sent ok: " + producerRecord + ", metadata: " + metadata);
                        }
                    }
                    else {
                        future.setException(new KafkaProducerException(producerRecord, "Failed to send", exception));
                        if (KafkaTemplate.this.producerListener != null) {
                            KafkaTemplate.this.producerListener.onError(producerRecord, exception);
                        }
                        if (KafkaTemplate.this.logger.isDebugEnabled()) {
                            KafkaTemplate.this.logger.debug("Failed to send: " + producerRecord, exception);
                        }
                    }
                }
                finally {
                    if (!KafkaTemplate.this.transactional) {
                        closeProducer(producer, false);
                    }
                }
            }

        });
        if (this.autoFlush) {
            flush();
        }
        if (this.logger.isTraceEnabled()) {
            this.logger.trace("Sent: " + producerRecord);
        }
        return future;
    }

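The doSend method of KafkaProducer sends a record to a topic asynchronously. It performs these steps:

Make sure the metadata for the topic is available, waiting at most the maximum block time (maxBlockTimeMs).
Serialize the record's key; the topic and headers are passed to the serializer.
Serialize the record's value in the same way.
Determine the record's partition. If a partition was set on the record when it was created, that partition is used; otherwise the partitioner computes one (round-robin in the simplest case).
Check that the serialized record does not exceed the size limit.
Append the partition, the serialized key and value, the timestamp, the headers, the callback, and the remaining wait time to the accumulator.
If the result shows that the batch is full or that a new batch was created, wake up the sender so that it sends data.
Return the result's future to the caller.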

    private Future<RecordMetadata> doSend(ProducerRecord<K, V> record, Callback callback) {
        TopicPartition tp = null;
        try {
            // first make sure the metadata for the topic is available
            ClusterAndWaitTime clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
            long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
            Cluster cluster = clusterAndWaitTime.cluster;
            byte[] serializedKey;
            try {
                serializedKey = keySerializer.serialize(record.topic(), record.headers(), record.key());
            } catch (ClassCastException cce) {
                throw new SerializationException("Can't convert key of class " + record.key().getClass().getName() +
                        " to class " + producerConfig.getClass(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG).getName() +
                        " specified in key.serializer", cce);
            }
            byte[] serializedValue;
            try {
                serializedValue = valueSerializer.serialize(record.topic(), record.headers(), record.value());
            } catch (ClassCastException cce) {
                throw new SerializationException("Can't convert value of class " + record.value().getClass().getName() +
                        " to class " + producerConfig.getClass(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG).getName() +
                        " specified in value.serializer", cce);
            }
            int partition = partition(record, serializedKey, serializedValue, cluster);
            tp = new TopicPartition(record.topic(), partition);

            setReadOnly(record.headers());
            Header[] headers = record.headers().toArray();

            int serializedSize = AbstractRecords.estimateSizeInBytesUpperBound(apiVersions.maxUsableProduceMagic(),
                    compressionType, serializedKey, serializedValue, headers);
            ensureValidRecordSize(serializedSize);
            long timestamp = record.timestamp() == null ? time.milliseconds() : record.timestamp();
            log.trace("Sending record {} with callback {} to topic {} partition {}", record, callback, record.topic(), partition);
            // producer callback will make sure to call both 'callback' and interceptor callback
            Callback interceptCallback = this.interceptors == null ? callback : new InterceptorCallback<>(callback, this.interceptors, tp);

            if (transactionManager != null && transactionManager.isTransactional())
                transactionManager.maybeAddPartitionToTransaction(tp);

            RecordAccumulator.RecordAppendResult result = accumulator.append(tp, timestamp, serializedKey,
                    serializedValue, headers, interceptCallback, remainingWaitMs);
            if (result.batchIsFull || result.newBatchCreated) {
                log.trace("Waking up the sender since topic {} partition {} is either full or getting a new batch", record.topic(), partition);
                this.sender.wakeup();
            }
            return result.future;
            // handling exceptions and record the errors;
            // for API exceptions return them in the future,
            // for other exceptions throw directly
        } catch (ApiException e) {
            log.debug("Exception occurred during message send:", e);
            if (callback != null)
                callback.onCompletion(null, e);
            this.errors.record();
            if (this.interceptors != null)
                this.interceptors.onSendError(record, tp, e);
            return new FutureFailure(e);
        } catch (InterruptedException e) {
            this.errors.record();
            if (this.interceptors != null)
                this.interceptors.onSendError(record, tp, e);
            throw new InterruptException(e);
        } catch (BufferExhaustedException e) {
            this.errors.record();
            this.metrics.sensor("buffer-exhausted-records").record();
            if (this.interceptors != null)
                this.interceptors.onSendError(record, tp, e);
            throw e;
        } catch (KafkaException e) {
            this.errors.record();
            if (this.interceptors != null)
                this.interceptors.onSendError(record, tp, e);
            throw e;
        } catch (Exception e) {
            // we notify interceptor about all exceptions, since onSend is called before anything else in this method
            if (this.interceptors != null)
                this.interceptors.onSendError(record, tp, e);
            throw e;
        }
    }
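Two small pieces of the method above are easy to overlook: the time budget passed on to the accumulator, and the size check before the append. The sketch below models them in plain Java. The class SendBudgetSketch is hypothetical, and the two limits in ensureValidRecordSize are an assumption (a maximum request size and a total buffer size); the excerpt above does not show that method's body.

```java
/** Simplified model of two checks made during a send; names and limits are illustrative. */
final class SendBudgetSketch {

    /** Time still available for the accumulator after the metadata wait; never negative. */
    static long remainingWaitMs(long maxBlockTimeMs, long waitedOnMetadataMs) {
        return Math.max(0, maxBlockTimeMs - waitedOnMetadataMs);
    }

    /**
     * Rejects a record whose estimated size exceeds either limit.
     * Assumption: the limits stand for a maximum request size and the total buffer memory.
     */
    static void ensureValidRecordSize(int size, int maxRequestSize, long totalMemorySize) {
        if (size > maxRequestSize) {
            throw new IllegalArgumentException(
                    "Record of " + size + " bytes exceeds the request limit of " + maxRequestSize);
        }
        if (size > totalMemorySize) {
            throw new IllegalArgumentException(
                    "Record of " + size + " bytes exceeds the buffer size of " + totalMemorySize);
        }
    }

    /** The sender is woken when the append filled a batch or started a new one. */
    static boolean shouldWakeSender(boolean batchIsFull, boolean newBatchCreated) {
        return batchIsFull || newBatchCreated;
    }

    public static void main(String[] args) {
        // 1.5 s of a 60 s budget was spent waiting for metadata.
        System.out.println(remainingWaitMs(60_000, 1_500));
        ensureValidRecordSize(512, 1_048_576, 33_554_432L); // passes silently
        System.out.println(shouldWakeSender(false, true));
    }
}
```

Because the metadata wait and the buffer allocation share one budget, a slow metadata fetch leaves less time for the accumulator to find free memory.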

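The append method of RecordAccumulator adds a record to the accumulator and returns the append result.

It first checks whether the partition already has an in-progress batch. If it does, it tries to append the serialized record to that batch. If there is no such batch, or the batch has no room left, it allocates a buffer, creates a new batch backed by that buffer, appends the record to the new batch, and adds the batch to the partition's deque. Finally it returns the append result.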

    public RecordAppendResult append(TopicPartition tp,
                                     long timestamp,
                                     byte[] key,
                                     byte[] value,
                                     Header[] headers,
                                     Callback callback,
                                     long maxTimeToBlock) throws InterruptedException {
        // We keep track of the number of appending thread to make sure we do not miss batches in
        // abortIncompleteBatches().
        appendsInProgress.incrementAndGet();
        ByteBuffer buffer = null;
        if (headers == null) headers = Record.EMPTY_HEADERS;
        try {
            // check if we have an in-progress batch
            Deque<ProducerBatch> dq = getOrCreateDeque(tp);
            synchronized (dq) {
                if (closed)
                    throw new IllegalStateException("Cannot send after the producer is closed.");
                RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq);
                if (appendResult != null)
                    return appendResult;
            }

            // we don't have an in-progress record batch try to allocate a new batch
            byte maxUsableMagic = apiVersions.maxUsableProduceMagic();
            int size = Math.max(this.batchSize, AbstractRecords.estimateSizeInBytesUpperBound(maxUsableMagic, compression, key, value, headers));
            log.trace("Allocating a new {} byte message buffer for topic {} partition {}", size, tp.topic(), tp.partition());
            buffer = free.allocate(size, maxTimeToBlock);
            synchronized (dq) {
                // Need to check if producer is closed again after grabbing the dequeue lock.
                if (closed)
                    throw new IllegalStateException("Cannot send after the producer is closed.");

                RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq);
                if (appendResult != null) {
                    // Somebody else found us a batch, return the one we waited for! Hopefully this doesn't happen often...
                    return appendResult;
                }

                MemoryRecordsBuilder recordsBuilder = recordsBuilder(buffer, maxUsableMagic);
                ProducerBatch batch = new ProducerBatch(tp, recordsBuilder, time.milliseconds());
                FutureRecordMetadata future = Utils.notNull(batch.tryAppend(timestamp, key, value, headers, callback, time.milliseconds()));

                dq.addLast(batch);
                incomplete.add(batch);

                // Don't deallocate this buffer in the finally block as it's being used in the record batch
                buffer = null;

                return new RecordAppendResult(future, dq.size() > 1 || batch.isFull(), true);
            }
        } finally {
            if (buffer != null)
                free.deallocate(buffer);
            appendsInProgress.decrementAndGet();
        }
    }
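The batching logic above can be reduced to a small model. The following sketch keeps one deque of batches per partition and reports the same two flags that the caller uses to decide whether to wake the sender. It is only an approximation under stated assumptions: AccumulatorSketch and its nested types are hypothetical, records are represented by their size alone, and buffer pooling, timestamps, and callbacks are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Simplified model of per-partition batching; not the real RecordAccumulator. */
final class AccumulatorSketch {

    /** A batch with a fixed byte capacity. */
    static final class Batch {
        final int capacity;
        int used = 0;

        Batch(int capacity) { this.capacity = capacity; }

        /** Appends only if the record still fits. */
        boolean tryAppend(int recordSize) {
            if (used + recordSize > capacity) return false;
            used += recordSize;
            return true;
        }

        boolean isFull() { return used >= capacity; }
    }

    /** Mirrors the two flags checked by the caller after an append. */
    static final class AppendResult {
        final boolean batchIsFull;
        final boolean newBatchCreated;

        AppendResult(boolean batchIsFull, boolean newBatchCreated) {
            this.batchIsFull = batchIsFull;
            this.newBatchCreated = newBatchCreated;
        }
    }

    private final int batchSize;
    private final Map<String, Deque<Batch>> batches = new ConcurrentHashMap<>();

    AccumulatorSketch(int batchSize) { this.batchSize = batchSize; }

    AppendResult append(String topicPartition, int recordSize) {
        Deque<Batch> dq = batches.computeIfAbsent(topicPartition, k -> new ArrayDeque<>());
        synchronized (dq) {
            // First try the batch that is currently being filled.
            Batch last = dq.peekLast();
            if (last != null && last.tryAppend(recordSize)) {
                return new AppendResult(dq.size() > 1 || last.isFull(), false);
            }
            // No batch with room: create one that is at least as large as the record.
            Batch batch = new Batch(Math.max(batchSize, recordSize));
            batch.tryAppend(recordSize);
            dq.addLast(batch);
            return new AppendResult(dq.size() > 1 || batch.isFull(), true);
        }
    }

    public static void main(String[] args) {
        AccumulatorSketch acc = new AccumulatorSketch(10);
        for (int i = 0; i < 3; i++) {
            AppendResult r = acc.append("abc123-0", 4);
            System.out.println("full=" + r.batchIsFull + " new=" + r.newBatchCreated);
        }
    }
}
```

As in the original method, a deque holding more than one batch counts as "full", because the older batch can no longer accept records and is ready to be sent.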

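The class that actually sends records to the cluster is Sender, which runs as a background thread. Its run method calls sendProducerData.

sendProducerData does the following:

Get from the accumulator the partitions whose data is ready to send
If the leader of any partition is not yet known, force a metadata update
Remove any nodes that are not ready to be sent to
Drain the batches to send from the accumulator, grouped by node; if message order must be guaranteed, mute the drained partitions
Fail expired batches; if a transaction manager is present and the expired batch was in retry, mark the sequence of its partition as unresolved
Send the produce requests for the batches and return the poll timeout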

    private long sendProducerData(long now) {
        Cluster cluster = metadata.fetch();

        // get the list of partitions with data ready to send
        RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);

        // if there are any partitions whose leaders are not known yet, force metadata update
        if (!result.unknownLeaderTopics.isEmpty()) {
            // The set of topics with unknown leader contains topics with leader election pending as well as
            // topics which may have expired. Add the topic again to metadata to ensure it is included
            // and request metadata update, since there are messages to send to the topic.
            for (String topic : result.unknownLeaderTopics)
                this.metadata.add(topic);
            this.metadata.requestUpdate();
        }

        // remove any nodes we aren't ready to send to
        Iterator<Node> iter = result.readyNodes.iterator();
        long notReadyTimeout = Long.MAX_VALUE;
        while (iter.hasNext()) {
            Node node = iter.next();
            if (!this.client.ready(node, now)) {
                iter.remove();
                notReadyTimeout = Math.min(notReadyTimeout, this.client.connectionDelay(node, now));
            }
        }

        // create produce requests
        Map<Integer, List<ProducerBatch>> batches = this.accumulator.drain(cluster, result.readyNodes,
                this.maxRequestSize, now);
        if (guaranteeMessageOrder) {
            // Mute all the partitions drained
            for (List<ProducerBatch> batchList : batches.values()) {
                for (ProducerBatch batch : batchList)
                    this.accumulator.mutePartition(batch.topicPartition);
            }
        }

        List<ProducerBatch> expiredBatches = this.accumulator.expiredBatches(this.requestTimeout, now);
        // Reset the producer id if an expired batch has previously been sent to the broker. Also update the metrics
        // for expired batches. see the documentation of @TransactionState.resetProducerId to understand why
        // we need to reset the producer id here.
        if (!expiredBatches.isEmpty())
            log.trace("Expired {} batches in accumulator", expiredBatches.size());
        for (ProducerBatch expiredBatch : expiredBatches) {
            failBatch(expiredBatch, -1, NO_TIMESTAMP, expiredBatch.timeoutException(), false);
            if (transactionManager != null && expiredBatch.inRetry()) {
                // This ensures that no new batches are drained until the current in flight batches are fully resolved.
                transactionManager.markSequenceUnresolved(expiredBatch.topicPartition);
            }
        }

        sensors.updateProduceRequestMetrics(batches);

        // If we have any nodes that are ready to send + have sendable data, poll with 0 timeout so this can immediately
        // loop and try sending more data. Otherwise, the timeout is determined by nodes that have partitions with data
        // that isn't yet sendable (e.g. lingering, backing off). Note that this specifically does not include nodes
        // with sendable data that aren't ready to send since they would cause busy looping.
        long pollTimeout = Math.min(result.nextReadyCheckDelayMs, notReadyTimeout);
        if (!result.readyNodes.isEmpty()) {
            log.trace("Nodes with data ready to send: {}", result.readyNodes);
            // if some partitions are already ready to be sent, the select time would be 0;
            // otherwise if some partition already has some data accumulated but not ready yet,
            // the select time will be the time difference between now and its linger expiry time;
            // otherwise the select time will be the time difference between now and the metadata expiry time;
            pollTimeout = 0;
        }
        sendProduceRequests(batches, now);

        return pollTimeout;
    }
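The node filtering and the poll-timeout rule in the method above can be isolated into a few lines. The sketch below is a simplified model under stated assumptions: PollTimeoutSketch and its Node type are hypothetical stand-ins, and the connection state is passed in as plain fields instead of being queried from a network client.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Simplified model of how the sender picks its poll timeout; not the real Sender. */
final class PollTimeoutSketch {

    /** Hypothetical stand-in for a broker node and its connection state. */
    static final class Node {
        final int id;
        final boolean connectionReady;
        final long connectionDelayMs;

        Node(int id, boolean connectionReady, long connectionDelayMs) {
            this.id = id;
            this.connectionReady = connectionReady;
            this.connectionDelayMs = connectionDelayMs;
        }
    }

    /**
     * Removes nodes whose connection is not ready from the list (in place).
     * Returns the smallest connection delay among the removed nodes,
     * or Long.MAX_VALUE if nothing was removed.
     */
    static long removeNotReady(List<Node> readyNodes) {
        long notReadyTimeout = Long.MAX_VALUE;
        Iterator<Node> iter = readyNodes.iterator();
        while (iter.hasNext()) {
            Node node = iter.next();
            if (!node.connectionReady) {
                iter.remove();
                notReadyTimeout = Math.min(notReadyTimeout, node.connectionDelayMs);
            }
        }
        return notReadyTimeout;
    }

    /** Zero when some node can be sent to right now, otherwise the earliest wake-up time. */
    static long pollTimeout(List<Node> readyNodes, long nextReadyCheckDelayMs, long notReadyTimeout) {
        if (!readyNodes.isEmpty()) {
            return 0; // loop again immediately to send more data
        }
        return Math.min(nextReadyCheckDelayMs, notReadyTimeout);
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node(1, true, 0));
        nodes.add(new Node(2, false, 50));
        long notReady = removeNotReady(nodes);
        System.out.println("nodes left: " + nodes.size());
        System.out.println("poll timeout: " + pollTimeout(nodes, 100, notReady));
    }
}
```

Returning zero while any node is sendable lets the sender loop again without sleeping; nodes that have data but no ready connection are excluded from that check so they do not cause busy looping.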

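Summary

This article briefly introduced Kafka and its four APIs: Producer, Consumer, Streams, and Connector. It then walked through the source code to show how a Producer sends data to the cluster: records are first placed into a buffer (the accumulator), and a background sender thread then sends the data from that buffer.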
