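Spark Core: RPC Source Code, Part 1
An earlier post described how the Spark Master and Worker start up. In both cases, the first step of the main method is to build an RpcEnv, which is the core of message communication. This post analyzes the RPC layer in detail.
First, look at the similar code that Master and Worker use to build their RpcEnv: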
Master:
    val rpcEnv = RpcEnv.create(SYSTEM_NAME, host, port, conf, securityMgr)
    val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME,
      new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf))
    // The master endpoint ref sends a message to the corresponding [[RpcEndpoint]]; here that RpcEndpoint is the Master
    val portsResponse = masterEndpoint.askWithRetry[BoundPortsResponse](BoundPortsRequest)

Worker:
    val rpcEnv = RpcEnv.create(systemName, host, port, conf, securityMgr)
    val masterAddresses = masterUrls.map(RpcAddress.fromSparkURL(_))
    rpcEnv.setupEndpoint(ENDPOINT_NAME, new Worker(rpcEnv, webUiPort, cores, memory,
      masterAddresses, ENDPOINT_NAME, workDir, conf, securityMgr))
As the code shows, two calls matter most: RpcEnv.create and rpcEnv.setupEndpoint. The sections below look at each in turn.
RpcEnv.create
The flow of RpcEnv.create is roughly as follows:
[Figure: flowchart of RpcEnv.create]
Underneath, it starts a Netty server and opens Netty-side communication (server = transportContext.createServer(host, port, bootstraps)).
rpcEnv.setupEndpoint
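All Spark messages are in fact handled through the RpcEnv, which dispatches each one to the matching endpoint. An RpcEndpointRef is a reference to an RpcEndpoint: to send a message to an RpcEndpoint, you first need to obtain its RpcEndpointRef.
Taking the Master as an example: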
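The relationship between the three roles can be sketched with a toy model. This is not Spark's code: ToyRpc, Env, Endpoint, EndpointRef and RecordingEndpoint are hypothetical names chosen for illustration, and the sketch delivers messages synchronously, which the real RpcEnv does not.

```scala
import scala.collection.mutable

// Toy model only: the names mirror the concepts, not Spark's real classes.
object ToyRpc {
  trait Endpoint {
    def receive(message: Any): Unit
  }

  // A ref carries only the endpoint's name and the env that can route to it.
  final class EndpointRef(val name: String, env: Env) {
    def send(message: Any): Unit = env.dispatch(name, message)
  }

  final class Env {
    private val endpoints = mutable.Map.empty[String, Endpoint]

    // Registers the endpoint under a unique name and hands back its ref.
    def setupEndpoint(name: String, endpoint: Endpoint): EndpointRef = {
      require(!endpoints.contains(name), s"There is already an endpoint called $name")
      endpoints(name) = endpoint
      new EndpointRef(name, this)
    }

    // Every message goes through the env, which looks up the target endpoint.
    def dispatch(name: String, message: Any): Unit =
      endpoints.get(name) match {
        case Some(endpoint) => endpoint.receive(message)
        case None => throw new IllegalArgumentException(s"Unknown endpoint $name")
      }
  }

  // An endpoint that just remembers what it was sent.
  final class RecordingEndpoint extends Endpoint {
    val received = mutable.ArrayBuffer.empty[Any]
    override def receive(message: Any): Unit = {
      received += message
    }
  }
}
```

Callers never hold the endpoint itself. They hold the ref returned by setupEndpoint and call send on it, which is the same shape as masterEndpoint in the Master code.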
val masterEndpoint = rpcEnv.setupEndpoint(ENDPOINT_NAME, new Master(rpcEnv, rpcEnv.address, webUiPort, securityMgr, conf))
new Master(...) is an RpcEndpoint. The call goes to the setupEndpoint method of the NettyRpcEnv class:
dispatcher.registerRpcEndpoint(name, endpoint)
From there it goes to the registerRpcEndpoint method of the Dispatcher class:
    def registerRpcEndpoint(name: String, endpoint: RpcEndpoint): NettyRpcEndpointRef = {
      // The Dispatcher holds the NettyRpcEnv, so the address is available as nettyEnv.address:
      // the address (host and port) on which this NettyRpcEnv was started.
      val addr = RpcEndpointAddress(nettyEnv.address, name)
      // Create the endpoint ref. Here it is the Master's RpcEndpointRef,
      // which is actually a NettyRpcEndpointRef object.
      val endpointRef = new NettyRpcEndpointRef(nettyEnv.conf, addr, nettyEnv)
      synchronized {
        if (stopped) {
          throw new IllegalStateException("RpcEnv has been stopped")
        }
        // Add an EndpointData under this name unless one is already registered.
        if (endpoints.putIfAbsent(name, new EndpointData(name, endpoint, endpointRef)) != null) {
          throw new IllegalArgumentException(s"There is already an RpcEndpoint called $name")
        }
        val data = endpoints.get(name)
        // Record the endpoint-to-ref mapping in endpointRefs.
        endpointRefs.put(data.endpoint, data.ref)
        // Put data on the receivers queue so the thread pool picks it up and processes its messages.
        receivers.offer(data)  // for the OnStart message
      }
      endpointRef
    }
Several fields of the Dispatcher are important:
    // endpoints is a thread-safe ConcurrentMap: the key is the name, the value is the EndpointData
    private val endpoints: ConcurrentMap[String, EndpointData] =
      new ConcurrentHashMap[String, EndpointData]
    // endpointRefs holds the one-to-one mapping from RpcEndpoint to RpcEndpointRef
    private val endpointRefs: ConcurrentMap[RpcEndpoint, RpcEndpointRef] =
      new ConcurrentHashMap[RpcEndpoint, RpcEndpointRef]

    // Track the receivers whose inboxes may contain messages.
    // receivers is a queue; the Dispatcher's thread pool consumes the entries in it
    private val receivers = new LinkedBlockingQueue[EndpointData]
EndpointData consists of a name, an RpcEndpoint and a NettyRpcEndpointRef, and it instantiates an Inbox. When an Inbox is constructed, it adds OnStart to its messages queue as the first message of the inbox. This is why onStart() runs right after the RpcEndpoint's constructor has finished.

    private class EndpointData(
        val name: String,
        val endpoint: RpcEndpoint,
        val ref: NettyRpcEndpointRef) {
      val inbox = new Inbox(ref, endpoint)
    }
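The registration path and the OnStart ordering can be reproduced in a few lines. The following is a simplified sketch under stated assumptions, not the Spark implementation: ToyDispatcher and everything in it are hypothetical names, the endpoint ref is left out, and process() drains the whole inbox in one call.

```scala
import java.util.concurrent.{ConcurrentHashMap, LinkedBlockingQueue}

// Toy model only: simplified stand-ins for Dispatcher, EndpointData and Inbox.
object ToyDispatcher {
  sealed trait InboxMessage
  case object OnStart extends InboxMessage
  final case class OneWay(content: Any) extends InboxMessage

  trait Endpoint {
    def onStart(): Unit
    def receive(content: Any): Unit
  }

  // The inbox enqueues OnStart in its constructor, so OnStart is always handled first.
  final class Inbox(endpoint: Endpoint) {
    private val messages = new java.util.LinkedList[InboxMessage]()
    messages.add(OnStart)

    def post(message: InboxMessage): Unit = synchronized {
      messages.add(message)
    }

    // Drains everything currently queued, in FIFO order.
    def process(): Unit = synchronized {
      var message = messages.poll()
      while (message != null) {
        message match {
          case OnStart => endpoint.onStart()
          case OneWay(content) => endpoint.receive(content)
        }
        message = messages.poll()
      }
    }
  }

  final class EndpointData(val name: String, val endpoint: Endpoint) {
    val inbox = new Inbox(endpoint)
  }

  final class Dispatcher {
    private val endpoints = new ConcurrentHashMap[String, EndpointData]()
    val receivers = new LinkedBlockingQueue[EndpointData]()

    def register(name: String, endpoint: Endpoint): EndpointData = {
      val data = new EndpointData(name, endpoint)
      // putIfAbsent returns the previous value, so non-null means the name is taken.
      if (endpoints.putIfAbsent(name, data) != null) {
        throw new IllegalArgumentException(s"There is already an endpoint called $name")
      }
      receivers.offer(data) // schedules the pending OnStart message
      data
    }
  }
}
```

Registering an endpoint already leaves one entry on receivers, even though nobody has sent anything yet. That entry exists only so that a consumer thread will pick up the inbox and run onStart.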
NettyRpcEnv has two methods for serialization and deserialization, because it has to transmit messages to remote peers:
    private[netty] def serialize(content: Any): ByteBuffer = {
      javaSerializerInstance.serialize(content)
    }

    private[netty] def deserialize[T: ClassTag](client: TransportClient, bytes: ByteBuffer): T = {
      NettyRpcEnv.currentClient.withValue(client) {
        deserialize { () =>
          javaSerializerInstance.deserialize[T](bytes)
        }
      }
    }
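What a Java-serialization round trip through a ByteBuffer looks like can be shown with the JDK alone. This is a minimal sketch, assuming plain java.io object streams. ToySerde is a hypothetical helper, not Spark's JavaSerializerInstance, and it ignores the TransportClient context that the real deserialize sets up.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.nio.ByteBuffer

// Toy helper only: plain JDK object streams wrapped in a ByteBuffer.
object ToySerde {
  def serialize(content: Any): ByteBuffer = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    // writeObject expects a reference type, so primitives are boxed by the cast.
    try out.writeObject(content.asInstanceOf[AnyRef]) finally out.close()
    ByteBuffer.wrap(bytes.toByteArray)
  }

  def deserialize[T](buffer: ByteBuffer): T = {
    // Read from a duplicate so the caller's buffer position is left untouched.
    val array = new Array[Byte](buffer.remaining())
    buffer.duplicate().get(array)
    val in = new ObjectInputStream(new ByteArrayInputStream(array))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

For example, ToySerde.deserialize[String](ToySerde.serialize("ReconnectWorker")) gives back the original string. Any message that crosses the wire this way has to be Serializable, which Scala case classes are by default.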
Worker startup works the same way: setupEndpoint establishes the mapping between the Worker and its NettyRpcEndpointRef.
RPC communication
First, look at two important methods of RpcEndpointRef:
    /**
     * Sends a one-way asynchronous message. Fire-and-forget semantics.
     */
    def send(message: Any): Unit

    /**
     * Send a message to the corresponding [[RpcEndpoint.receiveAndReply]] and return a [[Future]] to
     * receive the reply within the specified timeout.
     *
     * This method only sends the message once and never retries.
     */
    def ask[T: ClassTag](message: Any, timeout: RpcTimeout): Future[T]
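The first is send. The source describes it as a one-way asynchronous message: it is sent and no reply is expected.
ask differs from send in that it needs a reply: it sends a message to the target endpoint, and the endpoint that receives the message processes it and then replies. The target may be local or remote.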
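The difference between the two calls can be sketched with a Promise standing in for the reply channel. This is an illustration only: ToyAsk, CallContext and LocalRef are hypothetical names, delivery is synchronous and in-process, and the timeout parameter of the real ask is left out.

```scala
import scala.concurrent.{Future, Promise}

// Toy model only: send is fire-and-forget, ask completes a Future with the reply.
object ToyAsk {
  // The reply channel handed to the endpoint for one ask call.
  final class CallContext(promise: Promise[Any]) {
    def reply(response: Any): Unit = { promise.trySuccess(response); () }
    def sendFailure(e: Throwable): Unit = { promise.tryFailure(e); () }
  }

  trait Endpoint {
    def receive: PartialFunction[Any, Unit]
    def receiveAndReply(context: CallContext): PartialFunction[Any, Unit]
  }

  final class LocalRef(endpoint: Endpoint) {
    // One-way: the message is handed over and nothing comes back.
    def send(message: Any): Unit =
      endpoint.receive.applyOrElse[Any, Unit](message, (_: Any) => ())

    // Request/response: the endpoint answers through the context.
    def ask(message: Any): Future[Any] = {
      val promise = Promise[Any]()
      val context = new CallContext(promise)
      endpoint.receiveAndReply(context).applyOrElse[Any, Unit](message, (msg: Any) =>
        context.sendFailure(new IllegalArgumentException(s"Unsupported message $msg")))
      promise.future
    }
  }
}
```

In this sketch an unsupported ask fails the Future rather than throwing in the caller, so the sender always learns the outcome through the Future it got back.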
Let us pick a piece of code to analyze, for example this one from Master:
    case Heartbeat(workerId, worker) =>
      idToWorker.get(workerId) match {
        case Some(workerInfo) =>
          workerInfo.lastHeartbeat = System.currentTimeMillis()
        case None =>
          if (workers.map(_.id).contains(workerId)) {
            logWarning(s"Got heartbeat from unregistered worker $workerId." +
              " Asking it to re-register.")
            worker.send(ReconnectWorker(masterUrl))
          } else {
            logWarning(s"Got heartbeat from unregistered worker $workerId." +
              " This worker was never registered, so ignoring the heartbeat.")
          }
      }
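The logic above is as follows. The Worker periodically sends heartbeats to the Master. If the Master finds a workerInfo for the workerId, it updates the last heartbeat time. If it cannot find one, it checks whether the workers set still contains that workerId, and if so it tells the Worker to reconnect. Otherwise the heartbeat is ignored.
In worker.send(ReconnectWorker(masterUrl)), worker is an RpcEndpointRef, in practice a NettyRpcEndpointRef. The call then reaches: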
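The three outcomes of the heartbeat handler can be isolated as a pure decision. The following is a sketch under stated assumptions: ToyHeartbeat, WorkerInfo, Master and the Outcome values are hypothetical stand-ins, and the outcome is returned to the caller instead of sending a ReconnectWorker message or logging.

```scala
import scala.collection.mutable

// Toy model only: the decision logic of the heartbeat handler, without any RPC.
object ToyHeartbeat {
  sealed trait Outcome
  case object Refreshed extends Outcome
  case object AskedToReconnect extends Outcome
  case object Ignored extends Outcome

  final class WorkerInfo(val id: String) {
    var lastHeartbeat: Long = 0L
  }

  final class Master {
    // Assumption of this sketch: workers that can currently be looked up by id.
    val idToWorker = mutable.Map.empty[String, WorkerInfo]
    // Assumption of this sketch: every worker the master still remembers.
    val workers = mutable.Set.empty[WorkerInfo]

    def onHeartbeat(workerId: String, now: Long): Outcome =
      idToWorker.get(workerId) match {
        case Some(info) =>
          info.lastHeartbeat = now
          Refreshed
        // Known to the master but no longer in idToWorker: ask it to re-register.
        case None if workers.exists(_.id == workerId) => AskedToReconnect
        // Never seen before: drop the heartbeat.
        case None => Ignored
      }
  }
}
```

Only the middle case leads to an RPC in the real handler, which is why the walkthrough continues from worker.send.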
    override def send(message: Any): Unit = {
      require(message != null, "Message is null")
      // RequestMessage(senderAddress, receiver, content)
      nettyEnv.send(RequestMessage(nettyEnv.address, this, message))
    }

RequestMessage:

    /**
     * The message that is sent from the sender to the receiver.
     */
    private[netty] case class RequestMessage(
        senderAddress: RpcAddress,
        receiver: NettyRpcEndpointRef,
        content: Any)
And from there:
    private[netty] def send(message: RequestMessage): Unit = {
      val remoteAddr = message.receiver.address
      if (remoteAddr == address) {
        // Message to a local RPC endpoint.
        try {
          dispatcher.postOneWayMessage(message)
        } catch {
          case e: RpcEnvStoppedException => logWarning(e.getMessage)
        }
      } else {
        // Message to a remote RPC endpoint.
        postToOutbox(message.receiver, OneWayOutboxMessage(serialize(message)))
      }
    }
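That is, the address of the message's receiver is compared with the local address. If they are equal, this is a local RPC; otherwise it is a remote RPC. Here the Master node is sending a message to a Worker node, so this is the remote case. The two cases are described separately below.
(1) Remote RPC: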
    private def postToOutbox(receiver: NettyRpcEndpointRef, message: OutboxMessage): Unit = {
      if (receiver.client != null) {
        message.sendWith(receiver.client)
      } else {
        require(receiver.address != null,
          "Cannot send message to client endpoint with no listen address.")
        val targetOutbox = {
          val outbox = outboxes.get(receiver.address)
          if (outbox == null) {
            val newOutbox = new Outbox(this, receiver.address)
            val oldOutbox = outboxes.putIfAbsent(receiver.address, newOutbox)
            if (oldOutbox == null) {
              newOutbox
            } else {
              oldOutbox
            }
          } else {
            outbox
          }
        }
        if (stopped.get) {
          // It's possible that we put `targetOutbox` after stopping. So we need to clean it.
          outboxes.remove(receiver.address)
          targetOutbox.stop()
        } else {
          targetOutbox.send(message)
        }
      }
    }
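The local-or-remote decision and the get-or-create of one outbox per destination can be sketched together. This is a toy model under stated assumptions: ToyOutboxes, Address, Outbox and Env are hypothetical names, messages are plain strings, and an outbox only records what it was given instead of opening a connection.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.mutable

// Toy model only: routing by address plus one lazily created outbox per remote address.
object ToyOutboxes {
  final case class Address(host: String, port: Int)

  // Stand-in for a per-destination queue of outgoing messages.
  final class Outbox(val address: Address) {
    val sent = mutable.ArrayBuffer.empty[String]
    def send(message: String): Unit = synchronized {
      sent += message
    }
  }

  final class Env(val address: Address) {
    // Messages whose receiver address equals our own address.
    val delivered = mutable.ArrayBuffer.empty[String]
    private val outboxes = new ConcurrentHashMap[Address, Outbox]()

    private def outboxFor(target: Address): Outbox = {
      val existing = outboxes.get(target)
      if (existing != null) existing
      else {
        val fresh = new Outbox(target)
        // If another thread registered an outbox first, use theirs and drop ours.
        val raced = outboxes.putIfAbsent(target, fresh)
        if (raced == null) fresh else raced
      }
    }

    def send(target: Address, message: String): Unit = {
      if (target == address) delivered += message
      else outboxFor(target).send(message)
    }

    def outboxCount: Int = outboxes.size()

    def sentTo(target: Address): List[String] =
      Option(outboxes.get(target)).map(_.sent.toList).getOrElse(Nil)
  }
}
```

The get followed by putIfAbsent keeps the common path lock-free while still guaranteeing that two threads sending to the same address end up sharing one outbox.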
A remote RPC send ultimately goes out through TransportClient:
    /**
     * Sends an opaque message to the RpcHandler on the server-side. The callback will be invoked
     * with the server's response or upon any failure.
     *
     * @param message The message to send.
     * @param callback Callback to handle the RPC's reply.
     * @return The RPC's id.
     */
    public long sendRpc(ByteBuffer message, final RpcResponseCallback callback) {
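That is, the data is sent through Netty to the RpcHandler on the remote server side, which handles it.
When the NettyRpcHandler receives the message, it posts it to the Inbox so that the thread pool can consume it: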
    /** Posts a message sent by a remote endpoint. */
    def postRemoteMessage(message: RequestMessage, callback: RpcResponseCallback): Unit = {
      val rpcCallContext =
        new RemoteNettyRpcCallContext(nettyEnv, callback, message.senderAddress)
      val rpcMessage = RpcMessage(message.senderAddress, message.content, rpcCallContext)
      postMessage(message.receiver.name, rpcMessage, (e) => callback.onFailure(e))
    }
Consumption by the thread pool:
    /** Message loop used for dispatching messages. */
    private class MessageLoop extends Runnable {
      override def run(): Unit = {
        NettyRpcEnv.rpcThreadFlag.value = true
        try {
          while (true) {
            try {
              val data = receivers.take()
              if (data == PoisonPill) {
                // Put PoisonPill back so that other MessageLoops can see it.
                receivers.offer(PoisonPill)
                return
              }
              data.inbox.process(Dispatcher.this)
            } catch {
              case NonFatal(e) => logError(e.getMessage, e)
            }
          }
        } catch {
          case ie: InterruptedException => // exit
        }
      }
    }
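The poison-pill shutdown used by the message loop can be tried in isolation. This is a minimal sketch, not Spark's Dispatcher: ToyMessageLoop, Work and runAll are hypothetical names, and each work item is just an integer added to a shared counter.

```scala
import java.util.concurrent.{Executors, LinkedBlockingQueue, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

// Toy model only: several loops share one blocking queue and stop on a poison pill.
object ToyMessageLoop {
  sealed trait Item
  final case class Work(n: Int) extends Item
  case object PoisonPill extends Item

  // Starts `threads` loops, feeds them `work`, and returns the sum they processed.
  def runAll(work: Seq[Int], threads: Int): Int = {
    val queue = new LinkedBlockingQueue[Item]()
    val sum = new AtomicInteger(0)
    val pool = Executors.newFixedThreadPool(threads)

    val loop = new Runnable {
      override def run(): Unit = {
        var running = true
        while (running) {
          queue.take() match {
            case PoisonPill =>
              // Put the pill back so the other loops also see it and exit.
              queue.offer(PoisonPill)
              running = false
            case Work(n) =>
              sum.addAndGet(n)
          }
        }
      }
    }

    (1 to threads).foreach(_ => pool.execute(loop))
    work.foreach(n => queue.offer(Work(n)))
    // The queue is FIFO, so the pill is only taken after all work has been taken.
    queue.offer(PoisonPill)
    pool.shutdown()
    pool.awaitTermination(10, TimeUnit.SECONDS)
    sum.get()
  }
}
```

A single pill is enough to stop every loop because each loop re-offers it before returning, which is the same trick as the receivers.offer(PoisonPill) line in MessageLoop.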
The code that processes the remote message:
    /**
     * Process stored messages.
     */
    def process(dispatcher: Dispatcher): Unit = {
      var message: InboxMessage = null
      inbox.synchronized {
        if (!enableConcurrent && numActiveThreads != 0) {
          return
        }
        message = messages.poll()
        if (message != null) {
          numActiveThreads += 1
        } else {
          return
        }
      }
      while (true) {
        safelyCall(endpoint) {
          message match {
            case RpcMessage(_sender, content, context) =>
              try {
                endpoint.receiveAndReply(context).applyOrElse[Any, Unit](content, { msg =>
                  throw new SparkException(s"Unsupported message $message from ${_sender}")
                })
              } catch {
                case NonFatal(e) =>
                  context.sendFailure(e)
                  // Throw the exception -- this exception will be caught by the safelyCall function.
                  // The endpoint's onError function will be called.
                  throw e
              }
Author: kason_zhang
Source: ITPUB Blog, http://blog.itpub.net/964/viewspace-2818742/. Please credit the source when reposting.