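1. Overview
An earlier post, "Hadoop2 Source Code Analysis: RPC Exploration in Practice", introduced Hadoop's RPC mechanism. This post covers the RPC mechanism of YARN. The outline is:
- Introduction to YARN RPC
- YARN RPC example
- Screenshot preview
Let's get started.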
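2. Introduction to YARN RPC
Hadoop's RPC is built mainly from three classes, RPC, Client, and Server, which provide the external programming interface, the client implementation, and the server implementation respectively, as shown in the figure below:
The figure is a class relationship diagram of Hadoop's RPC. The code examples in "Hadoop2 Source Code Analysis: RPC Exploration in Practice" show how these classes relate to each other, so that is not repeated here. Next, we look at YARN's RPC.
The class YARN exposes is YarnRPC, which is abstract. Reading its source shows that the concrete implementation is selected by the parameter yarn.ipc.rpc.class, whose default value is org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC. Part of the code is shown below: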
- YarnRPC:
    public abstract class YarnRPC {
      // ......
      public static YarnRPC create(Configuration conf) {
        LOG.debug("Creating YarnRPC for " + conf.get(YarnConfiguration.IPC_RPC_IMPL));
        String clazzName = conf.get(YarnConfiguration.IPC_RPC_IMPL);
        if (clazzName == null) {
          clazzName = YarnConfiguration.DEFAULT_IPC_RPC_IMPL;
        }
        try {
          return (YarnRPC) Class.forName(clazzName).newInstance();
        } catch (Exception e) {
          throw new YarnRuntimeException(e);
        }
      }
    }
- The YarnConfiguration class:
    public class YarnConfiguration extends Configuration {
      // Configurations
      public static final String YARN_PREFIX = "yarn.";

      ////////////////////////////////
      // IPC Configs
      ////////////////////////////////
      public static final String IPC_PREFIX = YARN_PREFIX + "ipc.";

      /** RPC class implementation*/
      public static final String IPC_RPC_IMPL = IPC_PREFIX + "rpc.class";
      public static final String DEFAULT_IPC_RPC_IMPL =
          "org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC";
    }
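The "read a class name from configuration, fall back to a default, instantiate it reflectively" pattern used by YarnRPC.create() can be sketched without Hadoop on the classpath. This is only an illustration: DemoRpc, DefaultDemoRpc, CustomDemoRpc, and the key demo.ipc.rpc.class are hypothetical names, and java.util.Properties stands in for Hadoop's Configuration. Because Class.newInstance() has been deprecated since Java 9, the sketch goes through getDeclaredConstructor() instead.

```java
import java.util.Properties;

// Small driver for the sketch; every name in this block is hypothetical.
final class ReflectiveFactoryDemo {
    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(DemoRpc.create(conf).name()); // key unset: default is used

        conf.setProperty(DemoRpc.IMPL_KEY, CustomDemoRpc.class.getName());
        System.out.println(DemoRpc.create(conf).name()); // key set: custom class is used
    }
}

// Stand-in for an abstract RPC entry point whose implementation is configurable.
abstract class DemoRpc {
    static final String IMPL_KEY = "demo.ipc.rpc.class";            // hypothetical key
    static final String DEFAULT_IMPL = DefaultDemoRpc.class.getName();

    abstract String name();

    static DemoRpc create(Properties conf) {
        // Fall back to the default implementation when the key is not set.
        String className = conf.getProperty(IMPL_KEY, DEFAULT_IMPL);
        try {
            return (DemoRpc) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + className, e);
        }
    }
}

class DefaultDemoRpc extends DemoRpc {
    DefaultDemoRpc() {}
    @Override String name() { return "default"; }
}

class CustomDemoRpc extends DemoRpc {
    CustomDemoRpc() {}
    @Override String name() { return "custom"; }
}
```

The design choice this illustrates is that callers depend only on the abstract type, so swapping the implementation is a configuration change rather than a code change.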
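HadoopYarnProtoRPC, in turn, uses RpcFactoryProvider to obtain a client factory (set by the parameter yarn.ipc.client.factory.class, default org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl) and a server factory (set by the parameter yarn.ipc.server.factory.class, default org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl). These factories create the client and server objects from the Protocol Buffers definition of the communication protocol. Part of the code of the related classes is shown below: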
- HadoopYarnProtoRPC
    public class HadoopYarnProtoRPC extends YarnRPC {

      private static final Log LOG = LogFactory.getLog(HadoopYarnProtoRPC.class);

      @Override
      public Object getProxy(Class protocol, InetSocketAddress addr,
          Configuration conf) {
        LOG.debug("Creating a HadoopYarnProtoRpc proxy for protocol " + protocol);
        return RpcFactoryProvider.getClientFactory(conf).getClient(protocol, 1,
            addr, conf);
      }

      @Override
      public void stopProxy(Object proxy, Configuration conf) {
        RpcFactoryProvider.getClientFactory(conf).stopClient(proxy);
      }

      @Override
      public Server getServer(Class protocol, Object instance,
          InetSocketAddress addr, Configuration conf,
          SecretManager<? extends TokenIdentifier> secretManager,
          int numHandlers, String portRangeConfig) {
        LOG.debug("Creating a HadoopYarnProtoRpc server for protocol " + protocol
            + " with " + numHandlers + " handlers");
        return RpcFactoryProvider.getServerFactory(conf).getServer(protocol,
            instance, addr, conf, secretManager, numHandlers, portRangeConfig);
      }
    }
- RpcFactoryProvider:
    public class RpcFactoryProvider {
      // ......
      public static RpcClientFactory getClientFactory(Configuration conf) {
        String clientFactoryClassName = conf.get(
            YarnConfiguration.IPC_CLIENT_FACTORY_CLASS,
            YarnConfiguration.DEFAULT_IPC_CLIENT_FACTORY_CLASS);
        return (RpcClientFactory) getFactoryClassInstance(clientFactoryClassName);
      }
      // ......
    }
    // The two constants used above, as given for YarnConfiguration:
    /** Factory to create client IPC classes.*/
    public static final String IPC_CLIENT_FACTORY_CLASS =
        IPC_PREFIX + "client.factory.class";
    public static final String DEFAULT_IPC_CLIENT_FACTORY_CLASS =
        "org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl";
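YARN does not use Hadoop's own Writable for serialization. It uses Protocol Buffers as the default serialization mechanism instead, which brings the following benefits:
- It inherits the strengths of Protocol Buffers: Protocol Buffers has proven in practice to be efficient, extensible, compact, and usable across languages.
- It supports online upgrade and rollback: Hadoop 2.x added an HA scheme that can switch between active and standby nodes, so a version can be upgraded online without stopping the service of the active NameNode (NNA). Protocol Buffers helps here because messages remain readable when fields are added, so nodes running different versions can keep talking to each other during the upgrade.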
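To make the compactness point concrete, here is a minimal sketch of the base-128 varint encoding that Protocol Buffers uses for integer fields, written in plain Java rather than with the protobuf library. The class name VarintDemo is hypothetical. Small values occupy one byte, compared with the four bytes of a fixed-width int.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper illustrating base-128 varints; not part of Hadoop or protobuf.
final class VarintDemo {

    // Encodes the value as an unsigned 64-bit varint: 7 payload bits per byte,
    // with the high bit set on every byte except the last one.
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    // Decodes a single varint starting at the beginning of the array.
    static long decode(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break; // last byte of this varint
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode(1).length);     // 1
        System.out.println(encode(300).length);   // 2 (bytes 0xAC 0x02)
        System.out.println(decode(encode(300)));  // 300
    }
}
```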
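3. YARN RPC example
The workflow in YARN starts by defining a communication protocol interface. The example used here is ResourceTracker, which contains two methods, as shown below: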
- ResourceTracker:
    public interface ResourceTracker {

      @Idempotent
      public RegisterNodeManagerResponse registerNodeManager(
          RegisterNodeManagerRequest request) throws YarnException, IOException;

      @AtMostOnce
      public NodeHeartbeatResponse nodeHeartbeat(NodeHeartbeatRequest request)
          throws YarnException, IOException;
    }
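The two methods above carry the markers @Idempotent and @AtMostOnce. As a rough illustration of how a marker annotation on a protocol method can drive retry decisions, the sketch below defines its own annotation and reads it by reflection. SafeToRetry, DemoTracker, and RetryPolicyDemo are hypothetical names invented for this example; this is not Hadoop's retry code, only the general technique.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical helper: decides from an annotation whether a call may be retried.
final class RetryPolicyDemo {

    static boolean mayRetry(Class<?> protocol, String methodName) {
        for (Method m : protocol.getMethods()) {
            if (m.getName().equals(methodName)) {
                return m.isAnnotationPresent(SafeToRetry.class);
            }
        }
        throw new IllegalArgumentException("No such method: " + methodName);
    }

    public static void main(String[] args) {
        System.out.println(mayRetry(DemoTracker.class, "register"));  // true
        System.out.println(mayRetry(DemoTracker.class, "heartbeat")); // false
    }
}

// Hypothetical marker; RUNTIME retention is what makes it visible to reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface SafeToRetry {}

// Hypothetical two-method protocol, shaped loosely like the interface above.
interface DemoTracker {
    @SafeToRetry
    String register(String node);

    String heartbeat(String node);
}
```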
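ResourceTracker comes with a Protocol Buffers definition and a Java implementation. The Protocol Buffers files involved are ResourceTracker.proto, yarn_server_common_service_protos.proto, and yarn_server_common_protos.proto, located in the Hadoop source package under hadoop-2.6.0-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto. The contents of the three files are not reproduced here; you can read them in that directory. Note that compiling these files requires a ProtoBuf build environment. Setting it up is simple, and a brief outline follows.
First download the ProtoBuf package, extract it, change into the extracted directory, then build and install. The commands are: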
    ./configure --prefix=/home/work/protobuf/
    make && make install
Finally, the command to compile a .proto file:
    protoc ./ResourceTracker.proto --java_out=./
Next, we check out the Hadoop source into a local project and run and debug the relevant code.
- TestYarnServerApiClasses:
    public class TestYarnServerApiClasses {
      // ......
      // Four of the test methods are listed here

      @Test
      public void testRegisterNodeManagerResponsePBImpl() {
        RegisterNodeManagerResponsePBImpl original =
            new RegisterNodeManagerResponsePBImpl();
        original.setContainerTokenMasterKey(getMasterKey());
        original.setNMTokenMasterKey(getMasterKey());
        original.setNodeAction(NodeAction.NORMAL);
        original.setDiagnosticsMessage("testDiagnosticMessage");

        RegisterNodeManagerResponsePBImpl copy =
            new RegisterNodeManagerResponsePBImpl(original.getProto());
        assertEquals(1, copy.getContainerTokenMasterKey().getKeyId());
        assertEquals(1, copy.getNMTokenMasterKey().getKeyId());
        assertEquals(NodeAction.NORMAL, copy.getNodeAction());
        assertEquals("testDiagnosticMessage", copy.getDiagnosticsMessage());
      }

      @Test
      public void testNodeHeartbeatRequestPBImpl() {
        NodeHeartbeatRequestPBImpl original = new NodeHeartbeatRequestPBImpl();
        original.setLastKnownContainerTokenMasterKey(getMasterKey());
        original.setLastKnownNMTokenMasterKey(getMasterKey());
        original.setNodeStatus(getNodeStatus());

        NodeHeartbeatRequestPBImpl copy =
            new NodeHeartbeatRequestPBImpl(original.getProto());
        assertEquals(1, copy.getLastKnownContainerTokenMasterKey().getKeyId());
        assertEquals(1, copy.getLastKnownNMTokenMasterKey().getKeyId());
        assertEquals("localhost", copy.getNodeStatus().getNodeId().getHost());
      }

      @Test
      public void testNodeHeartbeatResponsePBImpl() {
        NodeHeartbeatResponsePBImpl original = new NodeHeartbeatResponsePBImpl();
        original.setDiagnosticsMessage("testDiagnosticMessage");
        original.setContainerTokenMasterKey(getMasterKey());
        original.setNMTokenMasterKey(getMasterKey());
        original.setNextHeartBeatInterval(1000);
        original.setNodeAction(NodeAction.NORMAL);
        original.setResponseId(100);

        NodeHeartbeatResponsePBImpl copy =
            new NodeHeartbeatResponsePBImpl(original.getProto());
        assertEquals(100, copy.getResponseId());
        assertEquals(NodeAction.NORMAL, copy.getNodeAction());
        assertEquals(1000, copy.getNextHeartBeatInterval());
        assertEquals(1, copy.getContainerTokenMasterKey().getKeyId());
        assertEquals(1, copy.getNMTokenMasterKey().getKeyId());
        assertEquals("testDiagnosticMessage", copy.getDiagnosticsMessage());
      }

      @Test
      public void testRegisterNodeManagerRequestPBImpl() {
        RegisterNodeManagerRequestPBImpl original =
            new RegisterNodeManagerRequestPBImpl();
        original.setHttpPort(8080);
        original.setNodeId(getNodeId());
        Resource resource = recordFactory.newRecordInstance(Resource.class);
        resource.setMemory(10000);
        resource.setVirtualCores(2);
        original.setResource(resource);

        RegisterNodeManagerRequestPBImpl copy =
            new RegisterNodeManagerRequestPBImpl(original.getProto());
        assertEquals(8080, copy.getHttpPort());
        assertEquals(9090, copy.getNodeId().getPort());
        assertEquals(10000, copy.getResource().getMemory());
        assertEquals(2, copy.getResource().getVirtualCores());
      }
    }
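Each test above follows the same round-trip shape: fill in a record, take its serialized form with getProto(), build a copy from that form, and check that the fields survived. The same shape can be sketched with the JDK alone. NodeInfo and its byte layout are hypothetical stand-ins for a PBImpl record, and DataOutputStream stands in for Protocol Buffers, so this shows the testing pattern rather than YARN's serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Driver for the sketch: serialize, rebuild, compare.
final class RoundTripDemo {
    public static void main(String[] args) {
        NodeInfo original = new NodeInfo("localhost", 9090);
        NodeInfo copy = NodeInfo.fromBytes(original.toBytes());
        System.out.println(copy.host + ":" + copy.port); // localhost:9090
    }
}

// Hypothetical record with a hand-written wire format (UTF string, then int).
final class NodeInfo {
    final String host;
    final int port;

    NodeInfo(String host, int port) {
        this.host = host;
        this.port = port;
    }

    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(host);
            out.writeInt(port);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static NodeInfo fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            // Fields are read back in the order they were written.
            String host = in.readUTF();
            int port = in.readInt();
            return new NodeInfo(host, port);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```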
- TestResourceTrackerPBClientImpl:
    public class TestResourceTrackerPBClientImpl {

      private static ResourceTracker client;
      private static Server server;
      private final static org.apache.hadoop.yarn.factories.RecordFactory recordFactory =
          RecordFactoryProvider.getRecordFactory(null);

      @BeforeClass
      public static void start() {
        System.out.println("Start client test");
        InetSocketAddress address = new InetSocketAddress(0);
        Configuration configuration = new Configuration();
        ResourceTracker instance = new ResourceTrackerTestImpl();
        server = RpcServerFactoryPBImpl.get().getServer(ResourceTracker.class,
            instance, address, configuration, null, 1);
        server.start();

        client = (ResourceTracker) RpcClientFactoryPBImpl.get().getClient(
            ResourceTracker.class, 1, NetUtils.getConnectAddress(server),
            configuration);
      }

      @AfterClass
      public static void stop() {
        System.out.println("Stop client");
        if (server != null) {
          server.stop();
        }
      }

      /**
       * Test the method registerNodeManager. Method should return a not null
       * result.
       */
      @Test
      public void testResourceTrackerPBClientImpl() throws Exception {
        RegisterNodeManagerRequest request =
            recordFactory.newRecordInstance(RegisterNodeManagerRequest.class);
        assertNotNull(client.registerNodeManager(request));

        ResourceTrackerTestImpl.exception = true;
        try {
          client.registerNodeManager(request);
          fail("there should be YarnException");
        } catch (YarnException e) {
          assertTrue(e.getMessage().startsWith("testMessage"));
        } finally {
          ResourceTrackerTestImpl.exception = false;
        }
      }

      /**
       * Test the method nodeHeartbeat. Method should return a not null result.
       */
      @Test
      public void testNodeHeartbeat() throws Exception {
        NodeHeartbeatRequest request =
            recordFactory.newRecordInstance(NodeHeartbeatRequest.class);
        assertNotNull(client.nodeHeartbeat(request));

        ResourceTrackerTestImpl.exception = true;
        try {
          client.nodeHeartbeat(request);
          fail("there should be YarnException");
        } catch (YarnException e) {
          assertTrue(e.getMessage().startsWith("testMessage"));
        } finally {
          ResourceTrackerTestImpl.exception = false;
        }
      }

      public static class ResourceTrackerTestImpl implements ResourceTracker {

        public static boolean exception = false;

        public RegisterNodeManagerResponse registerNodeManager(
            RegisterNodeManagerRequest request) throws YarnException, IOException {
          if (exception) {
            throw new YarnException("testMessage");
          }
          return recordFactory.newRecordInstance(RegisterNodeManagerResponse.class);
        }

        public NodeHeartbeatResponse nodeHeartbeat(NodeHeartbeatRequest request)
            throws YarnException, IOException {
          if (exception) {
            throw new YarnException("testMessage");
          }
          return recordFactory.newRecordInstance(NodeHeartbeatResponse.class);
        }
      }
    }
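The test above pairs a server-side instance with a client object that implements the same protocol interface, and uses a static flag to make the server fail on demand. A minimal in-process sketch of that arrangement, using a JDK dynamic proxy in place of a real network hop, might look like the following. Tracker, TrackerStub, and InProcessRpcDemo are hypothetical names, the exception is unchecked to keep the example short, and this is not how YARN wires its client and server together.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

// Hypothetical in-process "RPC": the client is a proxy that forwards to the server object.
final class InProcessRpcDemo {

    @SuppressWarnings("unchecked")
    static <T> T clientFor(Class<T> protocol, T server) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                return method.invoke(server, args);
            } catch (InvocationTargetException e) {
                // Surface the server-side exception itself, not the reflection wrapper.
                throw e.getCause();
            }
        };
        return (T) Proxy.newProxyInstance(
                protocol.getClassLoader(), new Class<?>[] {protocol}, handler);
    }

    public static void main(String[] args) {
        Tracker client = clientFor(Tracker.class, new TrackerStub());
        System.out.println(client.register("node1")); // registered:node1

        TrackerStub.fail = true;
        try {
            client.register("node1");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // testMessage
        } finally {
            TrackerStub.fail = false; // always reset the shared flag
        }
    }
}

// Hypothetical one-method protocol.
interface Tracker {
    String register(String node);
}

// Server-side stub whose failure mode is toggled by a static flag, as in the test above.
final class TrackerStub implements Tracker {
    static volatile boolean fail = false;

    @Override
    public String register(String node) {
        if (fail) {
            throw new IllegalStateException("testMessage");
        }
        return "registered:" + node;
    }
}
```

Resetting the flag in a finally block matters because the flag is static: a test that leaves it set would make every later test fail.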
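4. Screenshot preview
Next, we run the code with JUnit. The screenshots are shown below:
- A debug session on the testRegisterNodeManagerRequestPBImpl() method
- A debug session on the testResourceTrackerPBClientImpl() method
Because the exception flag is set to true here, calling registerNodeManager() throws a YarnException carrying the test message:
    if (exception) {
      throw new YarnException("testMessage");
    }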
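5. Summary
When learning the RPC of Hadoop YARN, it helps to understand Hadoop's RPC mechanism first, which makes YARN's RPC easier to follow. RPC is only one part of YARN, and later posts will cover more of it.
6. Closing remarks
That is all for this post. If you run into problems while studying this material, you can join the group to discuss them or send me an email, and I will do my best to answer. Let's keep learning together!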