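Kafka Producer Source Code
Introduction
Kafka is a distributed stream-processing platform that can:
- Publish and subscribe to streams of records, much like a message queue or an enterprise messaging system
- Store streams of records in a fault-tolerant way
- Process streams of records as they arrive
Kafka is mainly used for two kinds of applications:
- Building real-time data pipelines that reliably move data between systems or applications
- Building real-time streaming applications that transform streams of data
Kafka has four core APIs:
- The Producer API publishes a stream of records to one or more Kafka topics.
- The Consumer API subscribes to one or more topics and processes the records in them.
- The Streams API acts as a stream processor: it consumes input streams from one or more topics and produces output streams to one or more topics.
- The Connector API builds and runs reusable producers or consumers that connect Kafka topics to existing applications or data systems.
This article walks through the Producer from the source-code perspective.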
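Producer
A Producer publishes data to the topics of its choice and is responsible for deciding which partition of the topic each record goes to. The simplest form of load balancing is round-robin across the partitions; other partitioning functions are possible too, for example one based on the record's key.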
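To make the partition choice concrete, here is a minimal sketch of the three cases: an explicit partition, a key-based choice, and round-robin. PartitionChooser and its choose method are hypothetical names, not Kafka classes, and Arrays.hashCode only stands in for whatever hash function a real partitioner uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; this is not a Kafka class.
final class PartitionChooser {
    private final AtomicInteger counter = new AtomicInteger(0);

    /**
     * @param explicitPartition partition requested by the caller, or null
     * @param keyBytes          serialized record key, or null
     * @param numPartitions     number of partitions in the topic, must be positive
     */
    int choose(Integer explicitPartition, byte[] keyBytes, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        if (explicitPartition != null) {
            if (explicitPartition < 0 || explicitPartition >= numPartitions) {
                throw new IllegalArgumentException("partition out of range: " + explicitPartition);
            }
            return explicitPartition;
        }
        if (keyBytes != null) {
            // Assumption: Arrays.hashCode is a stand-in hash; masking keeps the result non-negative.
            return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
        }
        // No partition and no key: spread records across the partitions in turn.
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        PartitionChooser chooser = new PartitionChooser();
        byte[] key = "user-1".getBytes(StandardCharsets.UTF_8);
        System.out.println("explicit:    " + chooser.choose(2, null, 4));
        System.out.println("by key:      " + chooser.choose(null, key, 4));
        System.out.println("round-robin: " + chooser.choose(null, null, 4));
    }
}
```

Records with the same key always land on the same partition in this sketch, which is what makes per-key ordering possible; records without a key are simply spread out.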
The code below uses an injected Spring Kafka KafkaTemplate and calls its send method to publish a message to the topic "abc123".
@Autowired
private KafkaTemplate<String, String> kafkaTemplate;

public void send() {
    String message = "2018-08-07 08:21:47578|1|18701046390|001003|0|2|NULL|2018-08-07 08:21:47:544|2018-08-07 08:21:47:578|0|10.200.1.85|10.200.1.147:7022|";
    kafkaTemplate.send("abc123", message);
}
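Internally, send relies on the doSend method. It first checks whether the template is transactional and, if so, asserts that a transaction is in progress. It then obtains a producer instance and calls that producer's send method. When the send callback finishes, a non-transactional template calls closeProducer on the producer it obtained.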
protected ListenableFuture<SendResult<K, V>> doSend(final ProducerRecord<K, V> producerRecord) {
    if (this.transactional) {
        Assert.state(inTransaction(),
                "No transaction is in process; "
                    + "possible solutions: run the template operation within the scope of a "
                    + "template.executeInTransaction() operation, start a transaction with @Transactional "
                    + "before invoking the template method, "
                    + "run in a transaction started by a listener container when consuming a record");
    }
    final Producer<K, V> producer = getTheProducer();
    if (this.logger.isTraceEnabled()) {
        this.logger.trace("Sending: " + producerRecord);
    }
    final SettableListenableFuture<SendResult<K, V>> future = new SettableListenableFuture<>();
    producer.send(producerRecord, new Callback() {
        @Override
        public void onCompletion(RecordMetadata metadata, Exception exception) {
            try {
                if (exception == null) {
                    future.set(new SendResult<>(producerRecord, metadata));
                    if (KafkaTemplate.this.producerListener != null) {
                        KafkaTemplate.this.producerListener.onSuccess(producerRecord, metadata);
                    }
                    if (KafkaTemplate.this.logger.isTraceEnabled()) {
                        KafkaTemplate.this.logger.trace("Sent ok: " + producerRecord + ", metadata: " + metadata);
                    }
                }
                else {
                    future.setException(new KafkaProducerException(producerRecord, "Failed to send", exception));
                    if (KafkaTemplate.this.producerListener != null) {
                        KafkaTemplate.this.producerListener.onError(producerRecord, exception);
                    }
                    if (KafkaTemplate.this.logger.isDebugEnabled()) {
                        KafkaTemplate.this.logger.debug("Failed to send: " + producerRecord, exception);
                    }
                }
            }
            finally {
                if (!KafkaTemplate.this.transactional) {
                    closeProducer(producer, false);
                }
            }
        }
    });
    if (this.autoFlush) {
        flush();
    }
    if (this.logger.isTraceEnabled()) {
        this.logger.trace("Sent: " + producerRecord);
    }
    return future;
}
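The core idea in the template's doSend is to turn a callback-style send into a future that the caller can hold on to. Here is a minimal sketch of that bridge using only the JDK. CallbackToFuture and AsyncSender are hypothetical names, and CompletableFuture stands in for Spring's SettableListenableFuture; none of this is Spring or Kafka code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.BiConsumer;

// Hypothetical illustration of bridging a completion callback to a future.
final class CallbackToFuture {

    // Stand-in for an asynchronous client that reports the outcome through a callback.
    interface AsyncSender {
        void send(String record, BiConsumer<String, Exception> callback);
    }

    static CompletableFuture<String> send(AsyncSender sender, String record) {
        CompletableFuture<String> future = new CompletableFuture<>();
        sender.send(record, (metadata, exception) -> {
            if (exception == null) {
                // Success: hand the metadata to whoever holds the future.
                future.complete(metadata);
            } else {
                // Failure: wrap the cause, similar in spirit to the "Failed to send" wrapper above.
                future.completeExceptionally(new RuntimeException("Failed to send", exception));
            }
        });
        return future;
    }

    public static void main(String[] args) {
        AsyncSender ok = (record, callback) -> callback.accept(record + " stored", null);
        AsyncSender broken = (record, callback) ->
                callback.accept(null, new IllegalStateException("broker unavailable"));

        System.out.println(send(ok, "hello").join());
        System.out.println(send(broken, "hello").isCompletedExceptionally());
    }
}
```

The caller gets the future back immediately and decides for itself whether to block on it, chain further work, or ignore it.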
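KafkaProducer's doSend method sends a record to the topic asynchronously:
- Make sure the metadata for the topic is available, waiting at most maxBlockTimeMs; whatever is left of that budget becomes remainingWaitMs.
- Serialize the record's key; the topic and headers are passed to the key serializer.
- Serialize the record's value in the same way.
- Determine the record's partition. If the record was created with an explicit partition, that partition is used; otherwise the configured partitioner computes one (in the Kafka versions this code comes from, the default partitioner hashes the key when there is one and round-robins otherwise).
- Check that the serialized record does not exceed the size limits.
- Append the partition, timestamp, serialized key and value, headers, callback, and remaining wait time to the accumulator.
- If the result reports that the batch is full or that a new batch was created, wake up the sender so it can send the data.
- Return the result's future to the caller.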
private Future<RecordMetadata> doSend(ProducerRecord<K, V> record, Callback callback) {
    TopicPartition tp = null;
    try {
        // first make sure the metadata for the topic is available
        ClusterAndWaitTime clusterAndWaitTime = waitOnMetadata(record.topic(), record.partition(), maxBlockTimeMs);
        long remainingWaitMs = Math.max(0, maxBlockTimeMs - clusterAndWaitTime.waitedOnMetadataMs);
        Cluster cluster = clusterAndWaitTime.cluster;
        byte[] serializedKey;
        try {
            serializedKey = keySerializer.serialize(record.topic(), record.headers(), record.key());
        } catch (ClassCastException cce) {
            throw new SerializationException("Can't convert key of class " + record.key().getClass().getName() +
                    " to class " + producerConfig.getClass(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG).getName() +
                    " specified in key.serializer", cce);
        }
        byte[] serializedValue;
        try {
            serializedValue = valueSerializer.serialize(record.topic(), record.headers(), record.value());
        } catch (ClassCastException cce) {
            throw new SerializationException("Can't convert value of class " + record.value().getClass().getName() +
                    " to class " + producerConfig.getClass(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG).getName() +
                    " specified in value.serializer", cce);
        }
        int partition = partition(record, serializedKey, serializedValue, cluster);
        tp = new TopicPartition(record.topic(), partition);
        setReadOnly(record.headers());
        Header[] headers = record.headers().toArray();
        int serializedSize = AbstractRecords.estimateSizeInBytesUpperBound(apiVersions.maxUsableProduceMagic(),
                compressionType, serializedKey, serializedValue, headers);
        ensureValidRecordSize(serializedSize);
        long timestamp = record.timestamp() == null ? time.milliseconds() : record.timestamp();
        log.trace("Sending record {} with callback {} to topic {} partition {}", record, callback, record.topic(), partition);
        // producer callback will make sure to call both 'callback' and interceptor callback
        Callback interceptCallback = this.interceptors == null ? callback : new InterceptorCallback<>(callback, this.interceptors, tp);
        if (transactionManager != null && transactionManager.isTransactional())
            transactionManager.maybeAddPartitionToTransaction(tp);
        RecordAccumulator.RecordAppendResult result = accumulator.append(tp, timestamp, serializedKey,
                serializedValue, headers, interceptCallback, remainingWaitMs);
        if (result.batchIsFull || result.newBatchCreated) {
            log.trace("Waking up the sender since topic {} partition {} is either full or getting a new batch", record.topic(), partition);
            this.sender.wakeup();
        }
        return result.future;
        // handling exceptions and record the errors;
        // for API exceptions return them in the future,
        // for other exceptions throw directly
    } catch (ApiException e) {
        log.debug("Exception occurred during message send:", e);
        if (callback != null)
            callback.onCompletion(null, e);
        this.errors.record();
        if (this.interceptors != null)
            this.interceptors.onSendError(record, tp, e);
        return new FutureFailure(e);
    } catch (InterruptedException e) {
        this.errors.record();
        if (this.interceptors != null)
            this.interceptors.onSendError(record, tp, e);
        throw new InterruptException(e);
    } catch (BufferExhaustedException e) {
        this.errors.record();
        this.metrics.sensor("buffer-exhausted-records").record();
        if (this.interceptors != null)
            this.interceptors.onSendError(record, tp, e);
        throw e;
    } catch (KafkaException e) {
        this.errors.record();
        if (this.interceptors != null)
            this.interceptors.onSendError(record, tp, e);
        throw e;
    } catch (Exception e) {
        // we notify interceptor about all exceptions, since onSend is called before anything else in this method
        if (this.interceptors != null)
            this.interceptors.onSendError(record, tp, e);
        throw e;
    }
}
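Two small calculations in the doSend method above are easy to miss: the blocking budget is shared between the metadata wait and the later buffer allocation, and the record size is checked before anything is appended. The sketch below isolates both. SendChecks is a hypothetical class, and the two limits are assumptions loosely modeled on the producer's max.request.size and buffer.memory settings; the real method throws Kafka's own exception types rather than IllegalArgumentException.

```java
// Hypothetical illustration of the time-budget and size checks; not Kafka code.
final class SendChecks {

    /** Time still available for blocking after the metadata wait; never negative. */
    static long remainingWaitMs(long maxBlockMs, long waitedOnMetadataMs) {
        return Math.max(0, maxBlockMs - waitedOnMetadataMs);
    }

    /**
     * Rejects a record whose estimated serialized size exceeds either limit.
     * Both limits are assumed parameters for this sketch.
     */
    static void ensureValidRecordSize(int size, int maxRequestSize, long totalMemorySize) {
        if (size > maxRequestSize) {
            throw new IllegalArgumentException(
                    "record of " + size + " bytes exceeds the request limit of " + maxRequestSize);
        }
        if (size > totalMemorySize) {
            throw new IllegalArgumentException(
                    "record of " + size + " bytes exceeds the total buffer of " + totalMemorySize);
        }
    }

    public static void main(String[] args) {
        // The metadata wait used 1.5 s of a 60 s budget; the rest is left for the append.
        System.out.println(remainingWaitMs(60_000, 1_500));
        // The metadata wait used up the whole budget; the append may not block at all.
        System.out.println(remainingWaitMs(100, 250));
        ensureValidRecordSize(512, 1_048_576, 33_554_432L);
        System.out.println("size ok");
    }
}
```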
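RecordAccumulator's append method adds a record to the accumulator and returns the append result.
It first checks whether the partition already has an in-progress batch. If it does, append tries to add the serialized record to that batch directly. If that does not succeed, it allocates a buffer, creates a new batch backed by that buffer, appends the serialized record to the new batch, and adds the batch to the partition's deque. Because the buffer is allocated outside the deque lock, append retries tryAppend once it holds the lock again, in case another thread created a batch in the meantime. Either way, it returns the append result.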
public RecordAppendResult append(TopicPartition tp,
                                 long timestamp,
                                 byte[] key,
                                 byte[] value,
                                 Header[] headers,
                                 Callback callback,
                                 long maxTimeToBlock) throws InterruptedException {
    // We keep track of the number of appending thread to make sure we do not miss batches in
    // abortIncompleteBatches().
    appendsInProgress.incrementAndGet();
    ByteBuffer buffer = null;
    if (headers == null) headers = Record.EMPTY_HEADERS;
    try {
        // check if we have an in-progress batch
        Deque<ProducerBatch> dq = getOrCreateDeque(tp);
        synchronized (dq) {
            if (closed)
                throw new IllegalStateException("Cannot send after the producer is closed.");
            RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq);
            if (appendResult != null)
                return appendResult;
        }
        // we don't have an in-progress record batch try to allocate a new batch
        byte maxUsableMagic = apiVersions.maxUsableProduceMagic();
        int size = Math.max(this.batchSize, AbstractRecords.estimateSizeInBytesUpperBound(maxUsableMagic, compression, key, value, headers));
        log.trace("Allocating a new {} byte message buffer for topic {} partition {}", size, tp.topic(), tp.partition());
        buffer = free.allocate(size, maxTimeToBlock);
        synchronized (dq) {
            // Need to check if producer is closed again after grabbing the dequeue lock.
            if (closed)
                throw new IllegalStateException("Cannot send after the producer is closed.");
            RecordAppendResult appendResult = tryAppend(timestamp, key, value, headers, callback, dq);
            if (appendResult != null) {
                // Somebody else found us a batch, return the one we waited for! Hopefully this doesn't happen often...
                return appendResult;
            }
            MemoryRecordsBuilder recordsBuilder = recordsBuilder(buffer, maxUsableMagic);
            ProducerBatch batch = new ProducerBatch(tp, recordsBuilder, time.milliseconds());
            FutureRecordMetadata future = Utils.notNull(batch.tryAppend(timestamp, key, value, headers, callback, time.milliseconds()));
            dq.addLast(batch);
            incomplete.add(batch);
            // Don't deallocate this buffer in the finally block as it's being used in the record batch
            buffer = null;
            return new RecordAppendResult(future, dq.size() > 1 || batch.isFull(), true);
        }
    } finally {
        if (buffer != null)
            free.deallocate(buffer);
        appendsInProgress.decrementAndGet();
    }
}
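The batching logic is easier to see once the buffer pool, futures, and locking details are stripped away. The following is a much simplified sketch under stated assumptions: MiniAccumulator, Batch, and AppendResult are hypothetical names, records are plain byte arrays, and one synchronized method replaces the per-deque locking and the second tryAppend of the real code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of per-partition batching; not Kafka's RecordAccumulator.
final class MiniAccumulator {

    static final class Batch {
        final int capacity;
        int used;
        final List<byte[]> records = new ArrayList<>();

        Batch(int capacity) { this.capacity = capacity; }

        // Returns false when the record does not fit into the remaining space.
        boolean tryAppend(byte[] value) {
            if (used + value.length > capacity) return false;
            records.add(value);
            used += value.length;
            return true;
        }

        boolean isFull() { return used >= capacity; }
    }

    static final class AppendResult {
        final boolean batchIsFull;
        final boolean newBatchCreated;

        AppendResult(boolean batchIsFull, boolean newBatchCreated) {
            this.batchIsFull = batchIsFull;
            this.newBatchCreated = newBatchCreated;
        }
    }

    private final Map<String, Deque<Batch>> batches = new HashMap<>();
    private final int batchSize;

    MiniAccumulator(int batchSize) { this.batchSize = batchSize; }

    synchronized AppendResult append(String topicPartition, byte[] value) {
        Deque<Batch> dq = batches.computeIfAbsent(topicPartition, k -> new ArrayDeque<>());
        // First try the in-progress batch, if the partition has one.
        Batch last = dq.peekLast();
        if (last != null && last.tryAppend(value)) {
            return new AppendResult(dq.size() > 1 || last.isFull(), false);
        }
        // Otherwise create a new batch that is at least large enough for this record.
        Batch batch = new Batch(Math.max(batchSize, value.length));
        batch.tryAppend(value);
        dq.addLast(batch);
        return new AppendResult(dq.size() > 1 || batch.isFull(), true);
    }

    public static void main(String[] args) {
        MiniAccumulator accumulator = new MiniAccumulator(10);
        for (int i = 0; i < 3; i++) {
            AppendResult r = accumulator.append("abc123-0", new byte[4]);
            System.out.println("full=" + r.batchIsFull + " new=" + r.newBatchCreated);
        }
    }
}
```

As in the real method, a deque holding more than one batch counts as "full": the older batch will not receive further records, so the sender can be woken up to send it.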
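The class that actually sends records to the cluster is Sender, which runs as a background thread. Its run method calls sendProducerData.
sendProducerData does the following:
- Ask the accumulator which partitions have data that is ready to send
- Force a metadata update if the leader of any of those partitions is still unknown
- Remove the nodes that are not yet ready to be sent to
- Drain the batches for the remaining nodes, muting the drained partitions when message order must be guaranteed
- Fail the expired batches; when a transaction manager is present and an expired batch was in retry, mark that partition's sequence as unresolved
- Send the produce requests for the drained batches and return the poll timeout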
private long sendProducerData(long now) {
    Cluster cluster = metadata.fetch();
    // get the list of partitions with data ready to send
    RecordAccumulator.ReadyCheckResult result = this.accumulator.ready(cluster, now);
    // if there are any partitions whose leaders are not known yet, force metadata update
    if (!result.unknownLeaderTopics.isEmpty()) {
        // The set of topics with unknown leader contains topics with leader election pending as well as
        // topics which may have expired. Add the topic again to metadata to ensure it is included
        // and request metadata update, since there are messages to send to the topic.
        for (String topic : result.unknownLeaderTopics)
            this.metadata.add(topic);
        this.metadata.requestUpdate();
    }
    // remove any nodes we aren't ready to send to
    Iterator<Node> iter = result.readyNodes.iterator();
    long notReadyTimeout = Long.MAX_VALUE;
    while (iter.hasNext()) {
        Node node = iter.next();
        if (!this.client.ready(node, now)) {
            iter.remove();
            notReadyTimeout = Math.min(notReadyTimeout, this.client.connectionDelay(node, now));
        }
    }
    // create produce requests
    Map<Integer, List<ProducerBatch>> batches = this.accumulator.drain(cluster, result.readyNodes,
            this.maxRequestSize, now);
    if (guaranteeMessageOrder) {
        // Mute all the partitions drained
        for (List<ProducerBatch> batchList : batches.values()) {
            for (ProducerBatch batch : batchList)
                this.accumulator.mutePartition(batch.topicPartition);
        }
    }
    List<ProducerBatch> expiredBatches = this.accumulator.expiredBatches(this.requestTimeout, now);
    // Reset the producer id if an expired batch has previously been sent to the broker. Also update the metrics
    // for expired batches. see the documentation of @TransactionState.resetProducerId to understand why
    // we need to reset the producer id here.
    if (!expiredBatches.isEmpty())
        log.trace("Expired {} batches in accumulator", expiredBatches.size());
    for (ProducerBatch expiredBatch : expiredBatches) {
        failBatch(expiredBatch, -1, NO_TIMESTAMP, expiredBatch.timeoutException(), false);
        if (transactionManager != null && expiredBatch.inRetry()) {
            // This ensures that no new batches are drained until the current in flight batches are fully resolved.
            transactionManager.markSequenceUnresolved(expiredBatch.topicPartition);
        }
    }
    sensors.updateProduceRequestMetrics(batches);
    // If we have any nodes that are ready to send + have sendable data, poll with 0 timeout so this can immediately
    // loop and try sending more data. Otherwise, the timeout is determined by nodes that have partitions with data
    // that isn't yet sendable (e.g. lingering, backing off). Note that this specifically does not include nodes
    // with sendable data that aren't ready to send since they would cause busy looping.
    long pollTimeout = Math.min(result.nextReadyCheckDelayMs, notReadyTimeout);
    if (!result.readyNodes.isEmpty()) {
        log.trace("Nodes with data ready to send: {}", result.readyNodes);
        // if some partitions are already ready to be sent, the select time would be 0;
        // otherwise if some partition already has some data accumulated but not ready yet,
        // the select time will be the time difference between now and its linger expiry time;
        // otherwise the select time will be the time difference between now and the metadata expiry time;
        pollTimeout = 0;
    }
    sendProduceRequests(batches, now);
    return pollTimeout;
}
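The poll timeout that sendProducerData returns decides how long the sender sleeps before its next pass. Here is a minimal sketch of just that calculation, with the network client replaced by a plain map. PollTimeout and its parameters are hypothetical: a node listed in connectionDelayMs is treated as not ready, with the given delay before it is worth retrying, and every other node is treated as connected.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of the poll-timeout calculation; not Kafka's Sender.
final class PollTimeout {

    /**
     * @param readyNodes            nodes with data ready to send; nodes that are not connected are removed
     * @param connectionDelayMs     assumed stand-in for the client: delay per node that is not ready
     * @param nextReadyCheckDelayMs time until data that is still lingering becomes sendable
     */
    static long compute(Set<String> readyNodes, Map<String, Long> connectionDelayMs, long nextReadyCheckDelayMs) {
        long notReadyTimeout = Long.MAX_VALUE;
        Iterator<String> iter = readyNodes.iterator();
        while (iter.hasNext()) {
            String node = iter.next();
            Long delay = connectionDelayMs.get(node);
            if (delay != null) {
                // There is data for this node, but it cannot be sent to yet.
                iter.remove();
                notReadyTimeout = Math.min(notReadyTimeout, delay);
            }
        }
        long pollTimeout = Math.min(nextReadyCheckDelayMs, notReadyTimeout);
        // Any node that can be sent to right now means: do not wait, loop again immediately.
        return readyNodes.isEmpty() ? pollTimeout : 0;
    }

    public static void main(String[] args) {
        Map<String, Long> delays = new HashMap<>();
        delays.put("node-2", 50L);

        Set<String> mixed = new HashSet<>(Arrays.asList("node-1", "node-2"));
        System.out.println(compute(mixed, delays, 100) + " ms, sendable: " + mixed);

        Set<String> onlyNotReady = new HashSet<>(Arrays.asList("node-2"));
        System.out.println(compute(onlyNotReady, delays, 100) + " ms, sendable: " + onlyNotReady);
    }
}
```

Waiting out the connection delay instead of polling with a zero timeout is what keeps the loop from spinning while a node is still connecting.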
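Summary
This article briefly introduced Kafka and its four APIs: Producer, Consumer, Streams, and Connector. It then followed the source code to show how a Producer sends data to the cluster: records are first placed into a buffer (the accumulator), and the background Sender thread then sends them from that buffer.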