Kafka Broker Source Code: Network Layer Design

Posted by nlskyfree on 2020-08-31

Original article: https://www.cnblogs.com/nlskyfree/p/13590598.html

1. Overall Architecture

1.1 Core Logic

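1 Acceptor thread + N Processor threads (num.network.threads) + M request handler threads (num.io.threads)
A multi-threaded, multi-Reactor model: the Acceptor has its own selector, and each Processor has its own selector
Each Processor has a ConcurrentLinkedQueue[SocketChannel]() named newConnections; the Acceptor picks Processors round-robin and puts each new connection into the chosen Processor's queue
Each Processor's selector listens for network read and write events
When a read event fires, the Processor assembles the bytes into a complete Request and puts it into the global ArrayBlockingQueue in RequestChannel, which is shared by all Processors (default capacity 500)
After a request handler finishes Kafka's internal logic, it writes the Response to the LinkedBlockingQueue owned by the Processor thread that received the Request
When a write event fires, the data is written back to the client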
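
The queue topology described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `SketchRequest`, `SketchResponse`, and `RequestChannelSketch` are hypothetical names, not Kafka classes, and the sketch uses the non-blocking `offer`/`poll` so it stays easy to try out, whereas a real broker thread would typically block on a full or empty queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a request: it remembers which Processor read it.
final class SketchRequest {
    final int processorId;
    final String payload;

    SketchRequest(int processorId, String payload) {
        this.processorId = processorId;
        this.payload = payload;
    }
}

// Hypothetical stand-in for a response: it points back at its request.
final class SketchResponse {
    final SketchRequest request;
    final String body;

    SketchResponse(SketchRequest request, String body) {
        this.request = request;
        this.body = body;
    }
}

final class RequestChannelSketch {
    // One bounded request queue shared by every Processor.
    private final BlockingQueue<SketchRequest> requestQueue;
    // One unbounded response queue per Processor.
    private final List<BlockingQueue<SketchResponse>> responseQueues = new ArrayList<>();

    RequestChannelSketch(int numProcessors, int queueSize) {
        this.requestQueue = new ArrayBlockingQueue<>(queueSize);
        for (int i = 0; i < numProcessors; i++) {
            responseQueues.add(new LinkedBlockingQueue<>());
        }
    }

    // Processor side: returns false when the shared queue is full.
    boolean sendRequest(SketchRequest request) {
        return requestQueue.offer(request);
    }

    // Handler side: returns null when no request is waiting.
    SketchRequest receiveRequest() {
        return requestQueue.poll();
    }

    // Handler side: route the response to the Processor that read the request.
    void sendResponse(SketchResponse response) {
        responseQueues.get(response.request.processorId).add(response);
    }

    // Processor side: returns null when this Processor has no pending response.
    SketchResponse dequeueResponse(int processorId) {
        return responseQueues.get(processorId).poll();
    }
}
```

The point of the split is that requests from all connections compete for one bounded queue (which provides back-pressure), while each response must return to the one Processor whose selector owns the connection.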

1.2 Core Classes and Methods

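SocketServer                           // Wrapper for Kafka's network layer
   |-- Acceptor                        // Wrapper for the Acceptor thread
   |-- Processor                       // Wrapper for the Processor thread
Selector                               // Wrapper around the Java NIO Selector; encapsulates the core poll, iteration over selection keys, event registration, etc.
KafkaChannel                           // Wrapper around the Java SocketChannel; encapsulates the actual read/write IO
TransportLayer                         // Hides from KafkaChannel whether the underlying transport is unencrypted plaintext or SSL
RequestChannel                         // Channel layer for talking to the API layer; encapsulates the Request, the Response, and the queues between the two layers
  |-- Request                          // Request passed to the API layer
  |-- Response                         // Response returned by the API layer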

2. Core Flow Analysis

2.1 Startup Flow

// 1. Kafka.scala
def main(args: Array[String]): Unit = {
  val serverProps = getPropsFromArgs(args)
  val kafkaServerStartable = KafkaServerStartable.fromProps(serverProps)
  // Start the server
  kafkaServerStartable.startup()
  // Block the main thread on a CountDownLatch until Kafka shuts down
  kafkaServerStartable.awaitShutdown()
}

// 2. KafkaServerStartable.scala
private val server = new KafkaServer(staticServerConfig, kafkaMetricsReporters = reporters)
def startup() {
  // Start the Kafka server
  server.startup()
}

// 3. KafkaServer.scala
def startup() {
  // Start the socketServer, i.e. the Acceptor thread; Processor startup is deferred until the rest of KafkaServer has started
  socketServer = new SocketServer(config, metrics, time, credentialProvider)
  socketServer.startup(startupProcessors = false)
  // Start the various other components
  ······
  // Start the Processors in socketServer and begin network IO
  socketServer.startProcessors()
}

// 4. SocketServer.scala
def startup(startupProcessors: Boolean = true) {
  this.synchronized {
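    // Create and start the Acceptor; create the Processors
    createAcceptorAndProcessors(config.numNetworkThreads, config.listeners)
    if (startupProcessors) {
      // Start the Processors immediately only if requested; the parameter defaults to true, but KafkaServer passes false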
      startProcessors()
    }
  }
}

private def createAcceptorAndProcessors(processorsPerListener: Int,
                                          endpoints: Seq[EndPoint]): Unit = synchronized {
    val sendBufferSize = config.socketSendBufferBytes
    val recvBufferSize = config.socketReceiveBufferBytes
    val brokerId = config.brokerId
    // Handle each endpoint; usually there is only one
    endpoints.foreach { endpoint =>
      val listenerName = endpoint.listenerName
      val securityProtocol = endpoint.securityProtocol
      // Create the Acceptor
      val acceptor = new Acceptor(endpoint, sendBufferSize, recvBufferSize, brokerId, connectionQuotas)
      // Only create the Processors here; they are not started yet
      addProcessors(acceptor, endpoint, processorsPerListener)
      // Start the Acceptor as a non-daemon thread
      KafkaThread.nonDaemon(s"kafka-socket-acceptor-$listenerName-$securityProtocol-${endpoint.port}", acceptor).start()
      // Block until the thread has started
      acceptor.awaitStartup()
      acceptors.put(endpoint, acceptor)
    }
}

def startProcessors(): Unit = synchronized {
    // Iterate over all Acceptors and start their Processors
    acceptors.values.asScala.foreach { _.startProcessors() }
}

private[network] def startProcessors(): Unit = synchronized {
    // Ensure the Processors are started only once
    if (!processorsStarted.getAndSet(true)) {
      startProcessors(processors)
    }
}

// Start each Processor as a non-daemon thread
private def startProcessors(processors: Seq[Processor]): Unit = synchronized {
    processors.foreach { processor =>
      KafkaThread.nonDaemon(s"kafka-network-thread-$brokerId-${endPoint.listenerName}-${endPoint.securityProtocol}-${processor.id}",
        processor).start()
    }
}

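When KafkaServer starts, it initializes and starts the SocketServer

It creates and runs the Acceptor thread, which takes connections from the accept queue (fully established connections) and hands them to Processors round-robin
After all other components have started, it starts the configured number of Processors, which manage the SocketChannels and perform the actual IO reads and writes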

2.2 Acceptor.run Flow

Only one Acceptor thread is started per endpoint. The core code is in the Acceptor class in SocketServer.scala; the class implements the Runnable interface and runs on a dedicated thread.

def run() {
  // Register the server channel for OP_ACCEPT
  serverChannel.register(nioSelector, SelectionKey.OP_ACCEPT)
  var currentProcessor = 0
  while (isRunning) {
      val ready = nioSelector.select(500)
      if (ready > 0) {
        val keys = nioSelector.selectedKeys()
        val iter = keys.iterator()
        while (iter.hasNext && isRunning) {
            val key = iter.next
            // A handled key must be removed from the selected-key set
            iter.remove()
            // Pick a processor round-robin
            val processor = synchronized {
              currentProcessor = currentProcessor % processors.size
              processors(currentProcessor)
            }
            // Initialize the channel and put it into the chosen processor's newConnections queue
            accept(key, processor)
            // round robin to the next processor thread, mod(numProcessors) will be done later
            currentProcessor = currentProcessor + 1
        }
      }
  }
}

def accept(key: SelectionKey, processor: Processor) {
    val serverSocketChannel = key.channel().asInstanceOf[ServerSocketChannel]
    val socketChannel = serverSocketChannel.accept()
    connectionQuotas.inc(socketChannel.socket().getInetAddress)
    // Initialize the channel
    socketChannel.configureBlocking(false)
    socketChannel.socket().setTcpNoDelay(true)
    socketChannel.socket().setKeepAlive(true)
    if (sendBufferSize != Selectable.USE_DEFAULT_BUFFER_SIZE)
      socketChannel.socket().setSendBufferSize(sendBufferSize)
    // Put the connection into the processor's new-connection queue
    processor.accept(socketChannel)
}

def accept(socketChannel: SocketChannel) {
    // accept puts the new connection into the processor's ConcurrentLinkedQueue
    newConnections.add(socketChannel)
    // Wake up this processor's selector
    wakeup()
}

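The Acceptor's job is simple. In short, it listens for connections and hands new ones to Processors round-robin:

Use a selector (IO multiplexer) to watch for connections in the accept queue
When a connection arrives, pick a processor from the processors array round-robin
Initialize the SocketChannel: enable keepalive, disable Nagle's algorithm, and set the send buffer size
Put the SocketChannel into the chosen processor's new-connection queue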
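
The round-robin hand-off can be isolated from the NIO details. The sketch below is a simplified illustration, not Kafka code: `RoundRobinDispatcher` is a hypothetical class, connections are represented by an arbitrary type `T`, and the selector wakeup that the real `Processor.accept` performs is omitted. It follows the same "reduce the index modulo the processor count, use it, then increment" order as the `run` loop above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical dispatcher: one thread (the acceptor) calls dispatch(),
// while each queue would be drained by its own processor thread.
final class RoundRobinDispatcher<T> {
    private final List<Queue<T>> newConnections = new ArrayList<>();
    private int current = 0;

    RoundRobinDispatcher(int numProcessors) {
        for (int i = 0; i < numProcessors; i++) {
            newConnections.add(new ConcurrentLinkedQueue<>());
        }
    }

    // Hands the connection to the next processor and returns that processor's index.
    int dispatch(T connection) {
        current = current % newConnections.size(); // wrap around lazily, as in run()
        int chosen = current;
        newConnections.get(chosen).add(connection);
        current = current + 1;                     // the next call starts from the neighbor
        return chosen;
    }

    // Number of connections waiting in one processor's queue.
    int pending(int processorIndex) {
        return newConnections.get(processorIndex).size();
    }
}
```

With three processors, four consecutive connections land on processors 0, 1, 2, and 0 again, so connection counts stay balanced as long as connections have similar lifetimes.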

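2.3 Processor.run Flow

The number of Processor threads is set by num.network.threads. Each Processor takes new connections from its own new-connection queue, initializes them, and registers them for IO events. Each Processor has its own selector for IO events: on a read event it assembles the bytes into a request and writes it to the global requestQueue; on a write event it takes a response from its own responseQueue and writes it back to the client.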

override def run() {
  while (isRunning) {
      // setup any new connections that have been queued up
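      // The acceptor thread enqueues the SocketChannel of each new connection; drain the queue here and register each connection with the selector for read events
      configureNewConnections()
      // register any new responses for writing
      // Take responses destined for clients from responseQueue, wrap each as a Send on its channel, and register for write events
      processNewResponses()

      /**
        * 1. For a channel with an OP_READ event, once a full message has arrived it becomes a NetworkReceive added to completedReceives (at most one per channel at a time)
        * 2. For a channel with an OP_WRITE event, the Send staged on the channel is written out; once fully sent it is added to completedSends
        */
      poll()
      // Convert each assembled NetworkReceive into a Request and put it on requestQueue (later read by the IO threads), and mute the channel (deregister OP_READ) so each channel has at most one request in flight
      processCompletedReceives()
      // Unmute the channel (re-register OP_READ): the previous request is done, so the channel can accept the next one
      processCompletedSends()
      // Handle closed connections: maintain bookkeeping collections and update metrics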
  }
}

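The core logic of the Processor's run method is well encapsulated. As run shows, the thread loops over the following 6 steps:

Take new connections from the newConnections queue, initialize each SocketChannel, and register it for OP_READ
Iterate over every RequestChannel.Response in responseQueue, wrap it and attach it to the KafkaChannel as that channel's next Send, then register OP_WRITE on the corresponding SelectionKey
poll runs the core NIO logic: it calls select and iterates over the SelectionKeys that have events
- For a channel with an OP_READ event, once a full message has arrived it becomes a NetworkReceive added to completedReceives (at most one per channel at a time)
- For a channel with an OP_WRITE event, the Send staged on the channel is written out; once fully sent it is added to completedSends
Iterate over completedReceives, wrap each result in a Request, put it into the global requestQueue, and stop listening for OP_READ on the channel; OP_READ is re-registered only after the IO thread has processed the request and the Response has been sent
Iterate over completedSends and re-register OP_READ for each channel with the selector
Iterate over connections that went down for any reason, do the cleanup work, and clear related state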

The source code for each step follows:

2.3.1 configureNewConnections

Handles the connections that the Acceptor has newly handed to this Processor

// SocketServer.scala
private def configureNewConnections() {
    while (!newConnections.isEmpty) {
        val channel = newConnections.poll()
        // Register the new connection for read events; connectionId is a string built from IP and port that uniquely identifies the connection
        selector.register(connectionId(channel.socket), channel)
    }
}

// Selector.java
public void register(String id, SocketChannel socketChannel) throws IOException {
    // Ensure the id is not already registered
    ensureNotRegistered(id);
    // Create the KafkaChannel and attach it to the SelectionKey
    registerChannel(id, socketChannel, SelectionKey.OP_READ);
}

private SelectionKey registerChannel(String id, SocketChannel socketChannel, int interestedOps) throws IOException {
    // Register with the selector
    SelectionKey key = socketChannel.register(nioSelector, interestedOps);
    // Create the KafkaChannel and attach it to the SelectionKey
    KafkaChannel channel = buildAndAttachKafkaChannel(socketChannel, id, key);
    this.channels.put(id, channel);
    return key;
}

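This step mainly performs initialization

Drain the newConnections queue, taking each new connection
Register it with the Selector for read events
Create a KafkaChannel to wrap the SocketChannel
Attach the KafkaChannel to the corresponding SelectionKey

2.3.2 processNewResponses

Handles the Responses of Requests that have finished processing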

// SocketServer.scala
private def processNewResponses() {
    var curr: RequestChannel.Response = null
    // Drain responseQueue and handle every response
    while ({curr = dequeueResponse(); curr != null}) {
      // In theory each channel appears at most once here, because a connection has only one Request in flight at a time
      val channelId = curr.request.context.connectionId
      curr.responseAction match {
      case RequestChannel.NoOpAction =>
        // There is no response to send to the client, we need to read more pipelined requests
        // that are sitting in the server's socket buffer
        updateRequestMetrics(curr)
        trace("Socket server received empty response to send, registering for read: " + curr)
        // NoOpAction means there is no response to send and the request is done; unmute the KafkaChannel so it can accept requests again
        openOrClosingChannel(channelId).foreach(c => selector.unmute(c.id))
      case RequestChannel.SendAction =>
        val responseSend = curr.responseSend.getOrElse(
          throw new IllegalStateException(s"responseSend must be defined for SendAction, response: $curr"))
        // Note that this only sets responseSend as the KafkaChannel's pending Send and registers OP_WRITE on the SelectionKey
        sendResponse(curr, responseSend)
      case RequestChannel.CloseConnectionAction =>
        updateRequestMetrics(curr)
        trace("Closing socket connection actively according to the response code.")
        close(channelId)
      }
    }
}

protected[network] def sendResponse(response: RequestChannel.Response, responseSend: Send) {
    val connectionId = response.request.context.connectionId
    // Invoke send for closingChannel as well so that the send is failed and the channel closed properly and
    // removed from the Selector after discarding any pending staged receives.
    // `openOrClosingChannel` can be None if the selector closed the connection because it was idle for too long
    if (openOrClosingChannel(connectionId).isDefined) {
      selector.send(responseSend)
      inflightResponses += (connectionId -> response)
    }
}
// Selector.java
public void send(Send send) {
    String connectionId = send.destination();
    KafkaChannel channel = openOrClosingChannelOrFail(connectionId);
    // This only sets the channel's send; nothing is actually written yet
    channel.setSend(send);
}

public void setSend(Send send) {
    // Only one send may be pending at a time
    if (this.send != null)
        throw new IllegalStateException("Attempt to begin a send operation with prior send operation still in progress, connection id is " + id);
    // Set the send
    this.send = send;
    // transportLayer abstracts over unencrypted and encrypted communication; start listening for OP_WRITE
    this.transportLayer.addInterestOps(SelectionKey.OP_WRITE);
}

public void addInterestOps(int ops) {
    key.interestOps(key.interestOps() | ops);
}

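The core logic takes pending responses from responseQueue, sets each as the next Send of its KafkaChannel, and registers OP_WRITE

Iterate over responseQueue to get the Responses that have been processed
Check the Response's action: if there is nothing to send (NoOpAction), unmute the channel, re-register OP_READ, and wait for the next Request; otherwise call sendResponse to send the Response
Wrap the Response as a Send and bind it to the KafkaChannel; a channel can have only one pending Send at a time (and only one Request is processed at a time)
Register OP_WRITE; the Send is actually written only when that event fires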
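
Registering and deregistering events comes down to bit operations on the SelectionKey's interest set, as `addInterestOps` above shows. The following sketch works on plain `int` values so that it needs no open channel; `InterestOpsSketch` and its method names are hypothetical helpers, and only the `SelectionKey.OP_READ` and `SelectionKey.OP_WRITE` constants come from the JDK.

```java
import java.nio.channels.SelectionKey;

// Hypothetical helper that mirrors the bit arithmetic done on key.interestOps().
final class InterestOpsSketch {
    // Start listening for the given ops (same expression as addInterestOps above).
    static int add(int current, int ops) {
        return current | ops;
    }

    // Stop listening for the given ops (the inverse operation).
    static int remove(int current, int ops) {
        return current & ~ops;
    }

    // True if the interest set contains the given op.
    static boolean has(int current, int op) {
        return (current & op) != 0;
    }

    // Walks through the interest-set changes for one request/response cycle.
    static int oneCycle() {
        int ops = SelectionKey.OP_READ;           // new connection: read only
        ops = remove(ops, SelectionKey.OP_READ);  // request received: mute
        ops = add(ops, SelectionKey.OP_WRITE);    // response staged: want to write
        ops = remove(ops, SelectionKey.OP_WRITE); // response fully sent
        ops = add(ops, SelectionKey.OP_READ);     // unmute: ready for the next request
        return ops;
    }
}
```

Because each op is a separate bit, adding OP_WRITE never disturbs the read interest and vice versa, which is what lets muting and sending be managed independently.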

2.3.3 poll

The method that actually calls select and handles the IO events that fired

// SocketServer.scala
private def poll() {
    selector.poll(300)
}

// Selector.java
public void poll(long timeout) throws IOException {
    if (timeout < 0)
        throw new IllegalArgumentException("timeout should be >= 0");

    boolean madeReadProgressLastCall = madeReadProgressLastPoll;
    clear();

    boolean dataInBuffers = !keysWithBufferedRead.isEmpty();

    if (hasStagedReceives() || !immediatelyConnectedKeys.isEmpty() || (madeReadProgressLastCall && dataInBuffers))
        timeout = 0;

    if (!memoryPool.isOutOfMemory() && outOfMemory) {
        //we have recovered from memory pressure. unmute any channel not explicitly muted for other reasons
        log.trace("Broker no longer low on memory - unmuting incoming sockets");
        for (KafkaChannel channel : channels.values()) {
            if (channel.isInMutableState() && !explicitlyMutedChannels.contains(channel)) {
                channel.unmute();
            }
        }
        outOfMemory = false;
    }

    /* check ready keys */
    long startSelect = time.nanoseconds();
    int numReadyKeys = select(timeout);
    long endSelect = time.nanoseconds();
    this.sensors.selectTime.record(endSelect - startSelect, time.milliseconds());
    // An IO event fired, an immediate connect occurred, or a channel still has unread buffered data from the previous round
    if (numReadyKeys > 0 || !immediatelyConnectedKeys.isEmpty() || dataInBuffers) {
        Set<SelectionKey> readyKeys = this.nioSelector.selectedKeys();

        // Poll from channels that have buffered data (but nothing more from the underlying socket)
        if (dataInBuffers) {
            keysWithBufferedRead.removeAll(readyKeys); //so no channel gets polled twice
            Set<SelectionKey> toPoll = keysWithBufferedRead;
            keysWithBufferedRead = new HashSet<>(); //poll() calls will repopulate if needed
            pollSelectionKeys(toPoll, false, endSelect);
        }
        // Iterate over the selection keys to handle read/write events: completed reads go into stagedReceives, and each KafkaChannel's pending Send is written out
        // Poll from channels where the underlying socket has more data
        pollSelectionKeys(readyKeys, false, endSelect);
        // Clear all selected keys so that they are included in the ready count for the next select
        readyKeys.clear();

        pollSelectionKeys(immediatelyConnectedKeys, true, endSelect);
        immediatelyConnectedKeys.clear();
    } else {
        madeReadProgressLastPoll = true; //no work is also "progress"
    }

    long endIo = time.nanoseconds();
    this.sensors.ioTime.record(endIo - endSelect, time.milliseconds());
    // Handle idle connections: connections idle longer than the timeout (10 minutes by default) are closed
    // we use the time at the end of select to ensure that we don't close any connections that
    // have just been processed in pollSelectionKeys
    maybeCloseOldestConnection(endSelect);
    // Move one NetworkReceive per channel from stagedReceives into completedReceives
    // Add to completedReceives after closing expired connections to avoid removing
    // channels with completed receives until all staged receives are completed.
    addToCompletedReceives();
}

private int select(long timeoutMs) throws IOException {
    if (timeoutMs < 0L)
        throw new IllegalArgumentException("timeout should be >= 0");

    if (timeoutMs == 0L)
        return this.nioSelector.selectNow();
    else
        return this.nioSelector.select(timeoutMs);
}

void pollSelectionKeys(Set<SelectionKey> selectionKeys,
                           boolean isImmediatelyConnected,
                           long currentTimeNanos) {
    // determineHandlingOrder shuffles the keys when memory is low, to avoid starvation
    for (SelectionKey key : determineHandlingOrder(selectionKeys)) {
        KafkaChannel channel = channel(key);
        long channelStartTimeNanos = recordTimePerConnection ? time.nanoseconds() : 0;
        // Update the channel's idle-expiry time
        if (idleExpiryManager != null)
            idleExpiryManager.update(channel.id(), currentTimeNanos);

        boolean sendFailed = false;
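        // Read from the channel into stagedReceives; if the channel already has a staged receive, a complete Request is pending, so no further read is attempted
        attemptRead(key, channel);
        // Can only be true for SSL connections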
        if (channel.hasBytesBuffered()) {
            keysWithBufferedRead.add(key);
        }
        // Write data to the channel
        /* if channel is ready write to any sockets that have space in their buffer and for which we have data */
        if (channel.ready() && key.isWritable()) {
            Send send = null;
            try {
                // Write out the channel's Send; once it is fully sent, deregister OP_WRITE
                send = channel.write();
            } catch (Exception e) {
                sendFailed = true;
                throw e;
            }
            if (send != null) {
                // Add to the completedSends collection
                this.completedSends.add(send);
            }
        }
    }
}

private void attemptRead(SelectionKey key, KafkaChannel channel) throws IOException {
    //if channel is ready and has bytes to read from socket or buffer, and has no
    //previous receive(s) already staged or otherwise in progress then read from it
    if (channel.ready() && (key.isReadable() || channel.hasBytesBuffered()) && !hasStagedReceive(channel)
        && !explicitlyMutedChannels.contains(channel)) {
        NetworkReceive networkReceive;
        // A non-null return from channel.read means one complete Request has been read
        while ((networkReceive = channel.read()) != null) {
            madeReadProgressLastPoll = true;
            addToStagedReceives(channel, networkReceive);
        }
        // If the channel is muted here, it must be because channel.read() muted it when the memory pool ran out of memory
        if (channel.isMute()) {
            outOfMemory = true; //channel has muted itself due to memory pressure.
        } else {
            madeReadProgressLastPoll = true;
        }
    }
}

// KafkaChannel.java
public NetworkReceive read() throws IOException {
    NetworkReceive result = null;
    if (receive == null) {
        receive = new NetworkReceive(maxReceiveSize, id, memoryPool);
    }
    // Read data from the channel; internally this calls readFromReadableChannel()
    receive(receive);
    // If the read is complete, a full Request has been assembled
    if (receive.complete()) {
        receive.payload().rewind();
        result = receive;
        receive = null;
    } else if (receive.requiredMemoryAmountKnown() && !receive.memoryAllocated() && isInMutableState()) {
        //pool must be out of memory, mute ourselves.
        mute();
    }
    return result;
}

// NetworkReceive.java
// This resembles ZooKeeper's network layer: the first 4 bytes carry the payload size, then a buffer of that size is allocated to read the data
public long readFromReadableChannel(ReadableByteChannel channel) throws IOException {
    int read = 0;
    // size is a 4-byte ByteBuffer; if it is not yet full, the 4-byte header has not fully arrived
    if (size.hasRemaining()) {
        int bytesRead = channel.read(size);
        if (bytesRead < 0)
            throw new EOFException();
        read += bytesRead;
        if (!size.hasRemaining()) {
            size.rewind();
            // The actual Request size
            int receiveSize = size.getInt();
            if (receiveSize < 0)
                throw new InvalidReceiveException("Invalid receive (size = " + receiveSize + ")");
            if (maxSize != UNLIMITED && receiveSize > maxSize)
                throw new InvalidReceiveException("Invalid receive (size = " + receiveSize + " larger than " + maxSize + ")");
            requestedBufferSize = receiveSize; //may be 0 for some payloads (SASL)
            if (receiveSize == 0) {
                buffer = EMPTY_BUFFER;
            }
        }
    }
    // The 4-byte header has been fully read
    if (buffer == null && requestedBufferSize != -1) { //we know the size we want but havent been able to allocate it yet
        // Allocate the buffer; the memory pool caps the network layer's buffer memory and is unlimited by default
        buffer = memoryPool.tryAllocate(requestedBufferSize);
        if (buffer == null)
            log.trace("Broker low on memory - could not allocate buffer of size {} for source {}", requestedBufferSize, source);
    }
    if (buffer != null) {
        // Read the actual payload
        int bytesRead = channel.read(buffer);
        if (bytesRead < 0)
            throw new EOFException();
        read += bytesRead;
    }

    return read;
}

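Calls select and responds to OP_READ and OP_WRITE events, handling IO reads and writes

Call select to get the SelectionKeys with pending IO events
If an IO event fired, an immediate connect occurred, or a channel still has unread buffered data from the previous round, call pollSelectionKeys on the corresponding keys
- Iterate over the SelectionKeys
- On OP_READ, call channel.read until a complete NetworkReceive has been read, and add it to stagedReceives
  - First read the 4-byte size, which is the length of the whole payload
  - Then read size bytes; the resulting NetworkReceive is one complete Request and is added to stagedReceives
- On OP_WRITE, write out the Send currently bound to the channel; once it is fully sent, add it to completedSends and deregister OP_WRITE
Handle long-idle connections: connections idle longer than the timeout (10 minutes by default) are closed
Move the NetworkReceives in stagedReceives to completedReceives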
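
The "4-byte size, then payload" framing can be tried out without sockets. The sketch below is a simplified take on the idea behind `readFromReadableChannel`, under these assumptions: `FrameReader` is a hypothetical class, input arrives as `ByteBuffer` chunks rather than from a channel, and the memory pool and maximum-size check are left out. It shows why a non-blocking reader must keep partial state between calls: a frame may arrive split at any byte boundary.

```java
import java.nio.ByteBuffer;

// Hypothetical incremental reader for frames of the form [4-byte big-endian size][payload].
final class FrameReader {
    private final ByteBuffer size = ByteBuffer.allocate(4);
    private ByteBuffer payload; // null until the size header is complete

    // Consumes bytes from `in`. Returns the payload once a whole frame has
    // arrived, or null if more bytes are needed.
    byte[] feed(ByteBuffer in) {
        // Step 1: fill the 4-byte size header, possibly across several calls.
        while (size.hasRemaining() && in.hasRemaining()) {
            size.put(in.get());
        }
        if (size.hasRemaining()) {
            return null; // header still incomplete
        }
        if (payload == null) {
            size.rewind();
            int receiveSize = size.getInt();
            if (receiveSize < 0) {
                throw new IllegalStateException("Invalid receive (size = " + receiveSize + ")");
            }
            payload = ByteBuffer.allocate(receiveSize);
        }
        // Step 2: fill the payload buffer.
        while (payload.hasRemaining() && in.hasRemaining()) {
            payload.put(in.get());
        }
        if (payload.hasRemaining()) {
            return null; // payload still incomplete
        }
        byte[] result = payload.array();
        // Reset so the next frame starts with a fresh header.
        size.clear();
        payload = null;
        return result;
    }
}
```

Feeding the 9-byte frame for "hello" in three 3-byte chunks returns null twice and the payload on the third call.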

2.3.4 processCompletedReceives

Processes the NetworkReceives in completedReceives, wrapping each in a Request and putting it into the global requestQueue in RequestChannel for the API layer to consume

private def processCompletedReceives() {
    selector.completedReceives.asScala.foreach { receive =>
        // Look up the channel by connectionId
        openOrClosingChannel(receive.source) match {
          case Some(channel) =>
            val header = RequestHeader.parse(receive.payload)
            val context = new RequestContext(header, receive.source, channel.socketAddress,
              channel.principal, listenerName, securityProtocol)
            val req = new RequestChannel.Request(processor = id, context = context,
              startTimeNanos = time.nanoseconds, memoryPool, receive.payload, requestChannel.metrics)
            requestChannel.sendRequest(req)
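            // Deregister OP_READ so that the next request from a connection is processed only after the current one finishes, which preserves per-connection request ordering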
            selector.mute(receive.source)
          case None =>
            // This should never happen since completed receives are processed immediately after `poll()`
            throw new IllegalStateException(s"Channel ${receive.source} removed from selector before processing completed receive")
        }
    }
}
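1. Iterate over the NetworkReceives in completedReceives, extract the data from each payload, wrap it in a RequestChannel.Request, and put it into the global requestQueue in RequestChannel
2. Mute the corresponding KafkaChannel, i.e. deregister OP_READ on its SelectionKey (Section 3 explains why)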

2.3.5 processCompletedSends

Processes Responses that have been fully sent: iterate over completedSends and unmute the corresponding KafkaChannel, i.e. re-register OP_READ on its SelectionKey so it can receive the next Request

private def processCompletedSends() {
    selector.completedSends.asScala.foreach { send =>
        val resp = inflightResponses.remove(send.destination).getOrElse {
          throw new IllegalStateException(s"Send for ${send.destination} completed, but not in `inflightResponses`")
        }
        updateRequestMetrics(resp)
        // The response has been fully sent; unmute the channel and listen for OP_READ again
        selector.unmute(send.destination)
    }
}
// Selector.java
public void unmute(String id) {
    KafkaChannel channel = openOrClosingChannelOrFail(id);
    unmute(channel);
}

private void unmute(KafkaChannel channel) {
    explicitlyMutedChannels.remove(channel);
    channel.unmute();
}

// KafkaChannel.java
void unmute() {
    if (!disconnected)
        transportLayer.addInterestOps(SelectionKey.OP_READ);
    muted = false;
}

2.3.6 processDisconnected

If a connection has been closed, remove it from the inflightResponses collection and decrement its connection-quota count

private def processDisconnected() {
    selector.disconnected.keySet.asScala.foreach { connectionId =>
        val remoteHost = ConnectionId.fromString(connectionId).getOrElse {
          throw new IllegalStateException(s"connectionId has unexpected format: $connectionId")
        }.remoteHost
        inflightResponses.remove(connectionId).foreach(updateRequestMetrics)
        // the channel has been closed by the selector but the quotas still need to be updated
        connectionQuotas.dec(InetAddress.getByName(remoteHost))
    }
}

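3. Other Details

Ordering guarantee for a single connection

Each time a Processor receives a complete Request, it stops listening for OP_READ on that channel in the selector, and resumes only after the Response has been fully sent. This guarantees that, for a single connection's channel, the server processes requests strictly in arrival order.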
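
The mute/unmute cycle can be modeled as a tiny state machine. The sketch below is an illustration only: `MutedConnection` is a hypothetical class, requests are plain strings, and the socket buffer is simulated with a deque. It shows that even when a client pipelines several requests, the server picks up the next one only after the previous response has gone out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of one connection as seen by its Processor.
final class MutedConnection {
    // Requests the client has sent but the server has not read yet.
    private final Deque<String> socketBuffer = new ArrayDeque<>();
    private boolean muted = false;
    // Requests whose responses have been fully sent, in completion order.
    final List<String> handled = new ArrayList<>();

    void clientSends(String request) {
        socketBuffer.addLast(request);
    }

    // One poll round: read at most one request, then mute the connection.
    // Returns null if the connection is muted or has nothing to read.
    String pollOnce() {
        if (muted || socketBuffer.isEmpty()) {
            return null;
        }
        muted = true; // stop reading until the response is sent
        return socketBuffer.pollFirst();
    }

    // Called once the response for `request` has been written out: unmute.
    void responseSent(String request) {
        handled.add(request);
        muted = false;
    }
}
```

While request "a" is being handled, further poll rounds return nothing for this connection even though "b" is already waiting in the buffer.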

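Why is there a transportLayer?

It mainly encapsulates plaintext and SSL communication. For unencrypted plaintext communication the transportLayer is essentially a pass-through, while for SSL it hides the handshake, encryption, and decryption from the Kafka protocol layer.

Why have stagedReceives instead of adding directly to completedReceives?

Mainly because with SSL the exact data length cannot be known in advance (once encrypted, the 4-byte size prefix no longer tells how many bytes to read from the socket). For example, a single OP_READ may yield 2 Requests; both must then be stored in stagedReceives (hence one queue per channel) and processed one at a time to preserve order. See also the discussion in the commits linked below.
Admittedly this part of the design is not great; Kafka later removed stagedReceives, which made the code simpler: https://github.com/apache/kafka/pull/5920/commits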
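
The "one queue per channel, release one receive per channel per poll" behavior can be sketched as follows. This is a simplified illustration, not the Kafka implementation: `StagedReceivesSketch` and its method names are hypothetical, receives are plain strings, and the check that skips muted channels is omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of stagedReceives: one FIFO queue per channel id.
final class StagedReceivesSketch {
    private final Map<String, Deque<String>> staged = new LinkedHashMap<>();

    // Called for each complete receive read from a channel.
    void stage(String channelId, String receive) {
        staged.computeIfAbsent(channelId, k -> new ArrayDeque<>()).addLast(receive);
    }

    // Called at the end of each poll round: release at most one receive per
    // channel, in arrival order, and drop queues that become empty.
    List<String> drainOnePerChannel() {
        List<String> completed = new ArrayList<>();
        Iterator<Map.Entry<String, Deque<String>>> it = staged.entrySet().iterator();
        while (it.hasNext()) {
            Deque<String> queue = it.next().getValue();
            completed.add(queue.pollFirst()); // queues in the map are never empty
            if (queue.isEmpty()) {
                it.remove();
            }
        }
        return completed;
    }
}
```

If channel c1 stages r1 and r2 in a single read while c2 stages r3, the first round releases r1 and r3, and r2 waits for a later round.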

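Why is the requestQueue a single queue? Doesn't that cause lock contention?

Because each Kafka request carries a batch of data, the lock is contended only once per batch, so the amortized locking cost is small. Tencent Cloud once tried replacing this with a lock-free queue, and IO performance did not improve significantly.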
