Hadoop2 Source Code Analysis - An Introduction to YARN RPC by Example

Published by 哥不是小蘿莉 on 2015-07-21

1. Overview

  In the earlier post "Hadoop2 Source Code Analysis - RPC Exploration in Practice" I introduced Hadoop's RPC mechanism; today I will share YARN's RPC mechanism. Here is today's agenda:

  • Introduction to YARN's RPC
  • A YARN RPC example
  • Screenshot preview

  Let's begin today's content.

2. Introduction to YARN's RPC

  As we know, Hadoop's RPC consists mainly of three classes: RPC, Client, and Server, which respectively provide the external programming interface, the client-side implementation, and the server-side implementation, as shown in the figure below:


[Figure: class relationship diagram of Hadoop's RPC, showing the RPC, Client, and Server classes]

  The figure shows the class relationships of Hadoop's RPC. You can go back to "Hadoop2 Source Code Analysis - RPC Exploration in Practice" and work through its code examples to understand how they relate, so I won't repeat that here. Next, let's look at YARN's RPC.

  The entry point YARN exposes is the YarnRPC class, which is abstract. Reading the YarnRPC source shows that the actual implementation is selected by the parameter yarn.ipc.rpc.class; by default its value is org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC. Part of the code is as follows:

  • YarnRPC:
public abstract class YarnRPC {

  // ......

  public static YarnRPC create(Configuration conf) {
    LOG.debug("Creating YarnRPC for " + 
        conf.get(YarnConfiguration.IPC_RPC_IMPL));
    String clazzName = conf.get(YarnConfiguration.IPC_RPC_IMPL);
    if (clazzName == null) {
      clazzName = YarnConfiguration.DEFAULT_IPC_RPC_IMPL;
    }
    try {
      return (YarnRPC) Class.forName(clazzName).newInstance();
    } catch (Exception e) {
      throw new YarnRuntimeException(e);
    }
  }

}
  • YarnConfiguration:
public class YarnConfiguration extends Configuration {

  //Configurations
  public static final String YARN_PREFIX = "yarn.";

  ////////////////////////////////
  // IPC Configs
  ////////////////////////////////
  public static final String IPC_PREFIX = YARN_PREFIX + "ipc.";
  /** RPC class implementation*/
  public static final String IPC_RPC_IMPL =
    IPC_PREFIX + "rpc.class";
  public static final String DEFAULT_IPC_RPC_IMPL = 
    "org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC";
}
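
The create() method above is a small reflection-based factory: it looks up a class name in the configuration, falls back to a default, and instantiates it via Class.forName(). Here is a minimal, Hadoop-free sketch of the same pattern; the config key and class names are illustrative, not taken from YARN:

```java
import java.util.HashMap;
import java.util.Map;

public class ReflectiveFactory {

  // Stand-in for Configuration.get(key): a plain map of config values.
  static final Map<String, String> CONF = new HashMap<>();

  // Mirrors YarnRPC.create(): read the class name from the config,
  // fall back to a default, then instantiate the class reflectively.
  static Object create(String key, String defaultClazz) {
    String clazzName = CONF.get(key);
    if (clazzName == null) {
      clazzName = defaultClazz;
    }
    try {
      return Class.forName(clazzName).newInstance();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    // With no override configured, the default implementation is used.
    Object o = create("demo.rpc.class", "java.util.ArrayList");
    System.out.println(o.getClass().getName()); // prints java.util.ArrayList
  }
}
```

The benefit of this design, in YARN as in the sketch, is that the RPC implementation can be swapped out by changing a configuration value, with no code changes.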

  HadoopYarnProtoRPC, in turn, uses RPC's RpcFactoryProvider to obtain a client factory (specified by the parameter yarn.ipc.client.factory.class, defaulting to org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl) and a server factory (specified by yarn.ipc.server.factory.class, defaulting to org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl), which generate the client and server objects from the Protocol Buffers definition of the communication protocol. Part of the relevant code is as follows:

  • HadoopYarnProtoRPC
public class HadoopYarnProtoRPC extends YarnRPC {

  private static final Log LOG = LogFactory.getLog(HadoopYarnProtoRPC.class);

  @Override
  public Object getProxy(Class protocol, InetSocketAddress addr,
      Configuration conf) {
    LOG.debug("Creating a HadoopYarnProtoRpc proxy for protocol " + protocol);
    return RpcFactoryProvider.getClientFactory(conf).getClient(protocol, 1,
        addr, conf);
  }

  @Override
  public void stopProxy(Object proxy, Configuration conf) {
    RpcFactoryProvider.getClientFactory(conf).stopClient(proxy);
  }

  @Override
  public Server getServer(Class protocol, Object instance,
      InetSocketAddress addr, Configuration conf,
      SecretManager<? extends TokenIdentifier> secretManager,
      int numHandlers, String portRangeConfig) {
    LOG.debug("Creating a HadoopYarnProtoRpc server for protocol " + protocol + 
        " with " + numHandlers + " handlers");
    
    return RpcFactoryProvider.getServerFactory(conf).getServer(protocol, 
        instance, addr, conf, secretManager, numHandlers, portRangeConfig);

  }

}
  • RpcFactoryProvider

public class RpcFactoryProvider {

  // ......

  public static RpcClientFactory getClientFactory(Configuration conf) {
    String clientFactoryClassName = conf.get(
        YarnConfiguration.IPC_CLIENT_FACTORY_CLASS,
        YarnConfiguration.DEFAULT_IPC_CLIENT_FACTORY_CLASS);
    return (RpcClientFactory) getFactoryClassInstance(clientFactoryClassName);
  }

  //......
  
}
// The corresponding keys, defined in YarnConfiguration:

  /** Factory to create client IPC classes. */
  public static final String IPC_CLIENT_FACTORY_CLASS =
      IPC_PREFIX + "client.factory.class";
  public static final String DEFAULT_IPC_CLIENT_FACTORY_CLASS =
      "org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl";
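
What the client factory ultimately hands back is a proxy object that implements the protocol interface and forwards each call over the wire. As a rough, self-contained analogy, the same pattern can be shown with the JDK's java.lang.reflect.Proxy rather than YARN's actual PB client machinery; the Greeter interface and canned response below are invented for illustration:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ProxyDemo {

  // A toy "protocol" interface, analogous to ResourceTracker.
  public interface Greeter {
    String greet(String name);
  }

  // Mirrors what a client factory does: return a proxy implementing the
  // protocol interface, with every call intercepted by a handler.
  static Greeter getClient() {
    InvocationHandler handler = new InvocationHandler() {
      @Override
      public Object invoke(Object proxy, Method method, Object[] args) {
        // A real RPC proxy would serialize the method name and arguments
        // (e.g. with Protocol Buffers) and send them to the server here.
        return "response to " + method.getName() + "(" + args[0] + ")";
      }
    };
    return (Greeter) Proxy.newProxyInstance(
        Greeter.class.getClassLoader(),
        new Class<?>[] { Greeter.class }, handler);
  }

  public static void main(String[] args) {
    Greeter client = getClient();
    System.out.println(client.greet("yarn")); // prints response to greet(yarn)
  }
}
```

The caller sees only the protocol interface; where the calls actually go is the proxy's business, which is exactly why getProxy() can return a network client behind a plain Java interface.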

  YARN does not use Hadoop's own Writable for serialization; instead it uses Protocol Buffers as the default serialization mechanism, which brings the following main benefits:

  • It inherits the strengths of Protocol Buffers, which has been proven in practice to be efficient, extensible, compact, and cross-language.
  • It supports online upgrade and rollback: the HA scheme added in Hadoop 2.x allows active/standby failover, so a version can be upgraded online without stopping the active NameNode (NNA) service.
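
The compactness mentioned above comes in large part from Protocol Buffers' base-128 varint encoding, in which small integers occupy fewer bytes on the wire. Here is a sketch of that encoding, written from the publicly documented wire-format rules rather than taken from YARN's code:

```java
public class VarintDemo {

  // Encode an unsigned value as a Protocol Buffers base-128 varint:
  // 7 payload bits per byte, with the high bit set on every byte
  // except the last one.
  static byte[] encode(long value) {
    byte[] buf = new byte[10]; // a 64-bit value needs at most 10 bytes
    int i = 0;
    while ((value & ~0x7FL) != 0) {
      buf[i++] = (byte) ((value & 0x7F) | 0x80); // more bytes follow
      value >>>= 7;
    }
    buf[i++] = (byte) value; // final byte, high bit clear
    byte[] out = new byte[i];
    System.arraycopy(buf, 0, out, 0, i);
    return out;
  }

  public static void main(String[] args) {
    // 1 fits in a single byte; 300 needs two bytes (0xAC 0x02).
    System.out.println(encode(1).length);   // 1
    System.out.println(encode(300).length); // 2
  }
}
```

A fixed-width encoding would spend 4 or 8 bytes on every integer field; with varints, common small values such as ports, counts, and enum ordinals shrink to one or two bytes.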

3. A YARN RPC Example

  YARN's workflow starts by defining the communication protocol interface ResourceTracker, which contains two functions, as shown below:

  • ResourceTracker:
public interface ResourceTracker {
  
  @Idempotent
  public RegisterNodeManagerResponse registerNodeManager(
      RegisterNodeManagerRequest request) throws YarnException,
      IOException;

  @AtMostOnce
  public NodeHeartbeatResponse nodeHeartbeat(NodeHeartbeatRequest request)
      throws YarnException, IOException;

}

  ResourceTracker has both a Protocol Buffers definition and a Java implementation. The Protocol Buffers files involved are ResourceTracker.proto, yarn_server_common_service_protos.proto, and yarn_server_common_protos.proto, located in the Hadoop source package under hadoop-2.6.0-src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto. I won't paste the three files here; you can read them in that directory. Note that if you want to compile these files yourself, you need a ProtoBuf build environment. Installation is fairly simple; here is a brief walkthrough.

  First download the ProtoBuf package, extract it, enter the extracted directory, and build and install it. The commands are as follows:

./configure --prefix=/home/work/protobuf/

make && make install

Finally, compile a .proto file with the command:

protoc ./ResourceTracker.proto  --java_out=./

  Next, let's import the Hadoop source into a local project and run and debug the relevant code.

  • TestYarnServerApiClasses:

public class TestYarnServerApiClasses {

  // ......

  // Four of the test methods are listed below.

  @Test
  public void testRegisterNodeManagerResponsePBImpl() {
    RegisterNodeManagerResponsePBImpl original =
        new RegisterNodeManagerResponsePBImpl();
    original.setContainerTokenMasterKey(getMasterKey());
    original.setNMTokenMasterKey(getMasterKey());
    original.setNodeAction(NodeAction.NORMAL);
    original.setDiagnosticsMessage("testDiagnosticMessage");

    RegisterNodeManagerResponsePBImpl copy =
        new RegisterNodeManagerResponsePBImpl(
            original.getProto());
    assertEquals(1, copy.getContainerTokenMasterKey().getKeyId());
    assertEquals(1, copy.getNMTokenMasterKey().getKeyId());
    assertEquals(NodeAction.NORMAL, copy.getNodeAction());
    assertEquals("testDiagnosticMessage", copy.getDiagnosticsMessage());

  }

  @Test
  public void testNodeHeartbeatRequestPBImpl() {
    NodeHeartbeatRequestPBImpl original = new NodeHeartbeatRequestPBImpl();
    original.setLastKnownContainerTokenMasterKey(getMasterKey());
    original.setLastKnownNMTokenMasterKey(getMasterKey());
    original.setNodeStatus(getNodeStatus());
    NodeHeartbeatRequestPBImpl copy = new NodeHeartbeatRequestPBImpl(
        original.getProto());
    assertEquals(1, copy.getLastKnownContainerTokenMasterKey().getKeyId());
    assertEquals(1, copy.getLastKnownNMTokenMasterKey().getKeyId());
    assertEquals("localhost", copy.getNodeStatus().getNodeId().getHost());
  }

  @Test
  public void testNodeHeartbeatResponsePBImpl() {
    NodeHeartbeatResponsePBImpl original = new NodeHeartbeatResponsePBImpl();

    original.setDiagnosticsMessage("testDiagnosticMessage");
    original.setContainerTokenMasterKey(getMasterKey());
    original.setNMTokenMasterKey(getMasterKey());
    original.setNextHeartBeatInterval(1000);
    original.setNodeAction(NodeAction.NORMAL);
    original.setResponseId(100);

    NodeHeartbeatResponsePBImpl copy = new NodeHeartbeatResponsePBImpl(
        original.getProto());
    assertEquals(100, copy.getResponseId());
    assertEquals(NodeAction.NORMAL, copy.getNodeAction());
    assertEquals(1000, copy.getNextHeartBeatInterval());
    assertEquals(1, copy.getContainerTokenMasterKey().getKeyId());
    assertEquals(1, copy.getNMTokenMasterKey().getKeyId());
    assertEquals("testDiagnosticMessage", copy.getDiagnosticsMessage());
  }

  @Test
  public void testRegisterNodeManagerRequestPBImpl() {
    RegisterNodeManagerRequestPBImpl original = new RegisterNodeManagerRequestPBImpl();
    original.setHttpPort(8080);
    original.setNodeId(getNodeId());
    Resource resource = recordFactory.newRecordInstance(Resource.class);
    resource.setMemory(10000);
    resource.setVirtualCores(2);
    original.setResource(resource);
    RegisterNodeManagerRequestPBImpl copy = new RegisterNodeManagerRequestPBImpl(
        original.getProto());

    assertEquals(8080, copy.getHttpPort());
    assertEquals(9090, copy.getNodeId().getPort());
    assertEquals(10000, copy.getResource().getMemory());
    assertEquals(2, copy.getResource().getVirtualCores());

  }

}
  • TestResourceTrackerPBClientImpl:

public class TestResourceTrackerPBClientImpl {

    private static ResourceTracker client;
    private static Server server;
    private final static org.apache.hadoop.yarn.factories.RecordFactory recordFactory = RecordFactoryProvider
            .getRecordFactory(null);

    @BeforeClass
    public static void start() {

        System.out.println("Start client test");

        InetSocketAddress address = new InetSocketAddress(0);
        Configuration configuration = new Configuration();
        ResourceTracker instance = new ResourceTrackerTestImpl();
        server = RpcServerFactoryPBImpl.get().getServer(ResourceTracker.class, instance, address, configuration, null,
                1);
        server.start();

        client = (ResourceTracker) RpcClientFactoryPBImpl.get().getClient(ResourceTracker.class, 1,
                NetUtils.getConnectAddress(server), configuration);

    }

    @AfterClass
    public static void stop() {

        System.out.println("Stop client");

        if (server != null) {
            server.stop();
        }
    }

    /**
     * Test the method registerNodeManager. Method should return a not null
     * result.
     * 
     */
    @Test
    public void testResourceTrackerPBClientImpl() throws Exception {
        RegisterNodeManagerRequest request = recordFactory.newRecordInstance(RegisterNodeManagerRequest.class);
        assertNotNull(client.registerNodeManager(request));

        ResourceTrackerTestImpl.exception = true;
        try {
            client.registerNodeManager(request);
            fail("there should be YarnException");
        } catch (YarnException e) {
            assertTrue(e.getMessage().startsWith("testMessage"));
        } finally {
            ResourceTrackerTestImpl.exception = false;
        }

    }

    /**
     * Test the method nodeHeartbeat. Method should return a not null result.
     * 
     */

    @Test
    public void testNodeHeartbeat() throws Exception {
        NodeHeartbeatRequest request = recordFactory.newRecordInstance(NodeHeartbeatRequest.class);
        assertNotNull(client.nodeHeartbeat(request));

        ResourceTrackerTestImpl.exception = true;
        try {
            client.nodeHeartbeat(request);
            fail("there should be YarnException");
        } catch (YarnException e) {
            assertTrue(e.getMessage().startsWith("testMessage"));
        } finally {
            ResourceTrackerTestImpl.exception = false;
        }

    }

    public static class ResourceTrackerTestImpl implements ResourceTracker {

        public static boolean exception = false;

        public RegisterNodeManagerResponse registerNodeManager(RegisterNodeManagerRequest request)
                throws YarnException, IOException {
            if (exception) {
                throw new YarnException("testMessage");
            }
            return recordFactory.newRecordInstance(RegisterNodeManagerResponse.class);
        }

        public NodeHeartbeatResponse nodeHeartbeat(NodeHeartbeatRequest request) throws YarnException, IOException {
            if (exception) {
                throw new YarnException("testMessage");
            }
            return recordFactory.newRecordInstance(NodeHeartbeatResponse.class);
        }

    }
}

4. Screenshot Preview

  Next, we run the code under JUnit. Screenshot previews are shown below:

  • A DEBUG session of the testRegisterNodeManagerRequestPBImpl() method

  • A DEBUG session of the testResourceTrackerPBClientImpl() method

  Because exception is set to true here, the call to registerNodeManager() prints a test exception message:

if (exception) {
  throw new YarnException("testMessage");
}

5. Summary

  When studying Hadoop YARN's RPC, it helps to first understand Hadoop's own RPC mechanism; with that background, YARN's RPC is much easier to follow, since it builds on that foundation. More YARN content will be shared in future posts.

6. Closing Remarks

  That's all for this post. If you run into problems while studying this material, feel free to join the discussion group or send me an email; I will do my best to answer. Good luck to us all!
