Hadoop 3.2.1 [YARN] Source Code Analysis: A Brief Look at AdminService

Posted by 張伯毅 on 2020-12-11

1. Introduction

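AdminService is an RPC server (port 8033 by default) whose clients are administrators. In YARN, the administrator list is specified by the yarn.admin.acl property (set in yarn-site.xml). Its default value is "*", which means that every user is an administrator. In terms of implementation, it is a service that implements the ResourceManagerAdministrationProtocol protocol.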
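To make the ACL semantics concrete, here is a minimal sketch of how a value such as the one held by yarn.admin.acl can be interpreted. It assumes the usual Hadoop ACL string format: a comma-separated user list, one space, then a comma-separated group list, with "*" matching everyone. The class AdminAclSketch and its methods are hypothetical names used for illustration only; this is not the class Hadoop itself uses.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for illustration; Hadoop's real ACL handling differs in detail.
final class AdminAclSketch {
    private final boolean allAllowed;
    private final Set<String> users = new HashSet<>();
    private final Set<String> groups = new HashSet<>();

    AdminAclSketch(String acl) {
        // "*" means every user is treated as an administrator.
        allAllowed = acl.trim().equals("*");
        if (!allAllowed) {
            // Users come before the first space, groups after it.
            String[] parts = acl.split(" ", 2);
            addAll(users, parts[0]);
            if (parts.length > 1) {
                addAll(groups, parts[1]);
            }
        }
    }

    private static void addAll(Set<String> target, String csv) {
        for (String name : csv.split(",")) {
            if (!name.trim().isEmpty()) {
                target.add(name.trim());
            }
        }
    }

    // True if the user is listed directly or belongs to a listed group.
    boolean isAdmin(String user, Set<String> userGroups) {
        if (allAllowed || users.contains(user)) {
            return true;
        }
        for (String group : userGroups) {
            if (groups.contains(group)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        AdminAclSketch acl = new AdminAclSketch("alice,bob ops");
        System.out.println(acl.isAdmin("alice", Collections.emptySet()));
        System.out.println(acl.isAdmin("carol", Collections.singleton("ops")));
        System.out.println(acl.isAdmin("dave", Collections.singleton("dev")));
    }
}
```

With the default value "*", isAdmin returns true for any user, which is why a production cluster would normally narrow this property.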

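In the ResourceManager, ClientRMService and AdminService handle requests from ordinary users and from administrators, respectively. The two kinds of request reach the ResourceManager over two separate communication channels so that a flood of ordinary user requests cannot block administrator requests and leave them waiting indefinitely.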
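The effect of keeping the two channels apart can be sketched with two independent thread pools. This is only an analogy: the pools stand in for the handler threads of the two RPC servers, and the class name, pool sizes, and request counts are all made up for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical analogy: separate pools model separate RPC channels.
final class SeparateChannelsSketch {

    // Returns true if an admin request completes while every client handler is busy.
    static boolean adminCompletesWhileClientsBusy() {
        ExecutorService clientHandlers = Executors.newFixedThreadPool(2);
        ExecutorService adminHandlers = Executors.newFixedThreadPool(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // Saturate the client pool with requests that block until released.
            for (int i = 0; i < 10; i++) {
                clientHandlers.submit(() -> {
                    release.await();
                    return null;
                });
            }
            // The admin request runs on its own pool, so it is not queued behind them.
            Future<String> admin = adminHandlers.submit(() -> "refreshQueues done");
            return "refreshQueues done".equals(admin.get(5, TimeUnit.SECONDS));
        } catch (Exception e) {
            return false;
        } finally {
            release.countDown();
            clientHandlers.shutdown();
            adminHandlers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(adminCompletesWhileClientsBusy());
    }
}
```

Had both kinds of request shared the two-thread pool, the admin call would have waited behind the ten blocked client requests.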


2. Protocol

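ResourceManagerAdministrationProtocol is the communication protocol between the administrator and the RM. Through this RPC protocol the administrator updates system configuration, such as the node include/exclude lists. It extends the GetUserMappingsProtocol interface.
The interface provides the following methods: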

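| Method name | Description |
| --- | --- |
| refreshQueues | Refresh the queues |
| refreshNodes | Refresh the nodes |
| refreshSuperUserGroupsConfiguration | Refresh the superuser groups configuration |
| refreshUserToGroupsMappings | Refresh the user-to-group mappings |
| refreshAdminAcls | Refresh the admin ACLs |
| refreshServiceAcls | Refresh the service-level authorization ACLs |
| updateNodeResource | Update the RMNode resource information maintained on the RM side |
| refreshNodesResources | Refresh node resource information |
| addToClusterNodeLabels | Add node labels to the cluster |
| removeFromClusterNodeLabels | Remove node labels from the cluster |
| replaceLabelsOnNode | Replace the labels on nodes |
| checkForDecommissioningNodes | Check for nodes that are being decommissioned |
| refreshClusterMaxPriority | Refresh the cluster maximum priority |
| mapAttributesToNodes | Map attributes to nodes |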

3. Methods

3.1. refreshQueues

Refreshes the queues.


  @Private
  public void refreshQueues() throws IOException, YarnException {
    Configuration conf = loadNewConfiguration();
    
    // The ResourceScheduler reloads the configuration
    rm.getRMContext().getScheduler().reinitialize(conf,
        this.rm.getRMContext());
    // refresh the reservation system
    ReservationSystem rSystem = rm.getRMContext().getReservationSystem();
    if (rSystem != null) {
      rSystem.reinitialize(conf, rm.getRMContext());
    }
  }

3.2. refreshNodes

Calls NodesListManager to refresh the node information.


  @Override
  public RefreshNodesResponse refreshNodes(RefreshNodesRequest request)
      throws YarnException, StandbyException {
    final String operation = "refreshNodes";
    final String msg = "refresh nodes.";
    UserGroupInformation user = checkAcls("refreshNodes");

    checkRMStatus(user.getShortUserName(), operation, msg);

    try {
      Configuration conf =
          getConfiguration(new Configuration(false),
              YarnConfiguration.YARN_SITE_CONFIGURATION_FILE);
      switch (request.getDecommissionType()) {
      case NORMAL:
        rm.getRMContext().getNodesListManager().refreshNodes(conf);
        break;
      case GRACEFUL:
        rm.getRMContext().getNodesListManager().refreshNodesGracefully(
            conf, request.getDecommissionTimeout());
        break;
      case FORCEFUL:
        rm.getRMContext().getNodesListManager().refreshNodesForcefully();
        break;
      }
      RMAuditLogger.logSuccess(user.getShortUserName(), operation,
          "AdminService");
      return recordFactory.newRecordInstance(RefreshNodesResponse.class);
    } catch (IOException ioe) {
      throw logAndWrapException(ioe, user.getShortUserName(), operation, msg);
    }
  }
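The core of refreshNodes is a dispatch on the decommission type. The sketch below isolates that dispatch using only the standard library. RecordingNodesManager is a hypothetical stand-in for NodesListManager that just records which method was called, and the timeout is assumed to be a number of seconds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the switch in refreshNodes.
final class RefreshNodesSketch {

    // The three decommission types handled by the switch above.
    enum DecommissionType { NORMAL, GRACEFUL, FORCEFUL }

    // Stand-in for NodesListManager; it only records the calls it receives.
    static final class RecordingNodesManager {
        final List<String> calls = new ArrayList<>();

        void refreshNodes() {
            calls.add("refreshNodes");
        }

        void refreshNodesGracefully(Integer timeoutSeconds) {
            calls.add("refreshNodesGracefully:" + timeoutSeconds);
        }

        void refreshNodesForcefully() {
            calls.add("refreshNodesForcefully");
        }
    }

    // Routes the request to the matching refresh method.
    static void dispatch(DecommissionType type, Integer timeoutSeconds,
                         RecordingNodesManager manager) {
        switch (type) {
            case NORMAL:
                manager.refreshNodes();
                break;
            case GRACEFUL:
                manager.refreshNodesGracefully(timeoutSeconds);
                break;
            case FORCEFUL:
                manager.refreshNodesForcefully();
                break;
            default:
                throw new IllegalArgumentException("Unknown type: " + type);
        }
    }

    public static void main(String[] args) {
        RecordingNodesManager manager = new RecordingNodesManager();
        dispatch(DecommissionType.GRACEFUL, 30, manager);
        System.out.println(manager.calls);
    }
}
```

Only the GRACEFUL branch uses the timeout; the other two ignore it, just as the original switch passes request.getDecommissionTimeout() only to refreshNodesGracefully.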

3.3. addToClusterNodeLabels

Adds label information to the cluster through the NodeLabelManager.


  @Override
  public AddToClusterNodeLabelsResponse addToClusterNodeLabels(AddToClusterNodeLabelsRequest request)
      throws YarnException, IOException {
    final String operation = "addToClusterNodeLabels";
    final String msg = "add labels.";
    UserGroupInformation user = checkAcls(operation);

    checkRMStatus(user.getShortUserName(), operation, msg);

    AddToClusterNodeLabelsResponse response =
        recordFactory.newRecordInstance(AddToClusterNodeLabelsResponse.class);
    try {
      // Add the labels through RMNodeLabelsManager
      rm.getRMContext().getNodeLabelManager()
          .addToCluserNodeLabels(request.getNodeLabels());
      RMAuditLogger.logSuccess(user.getShortUserName(), operation,
          "AdminService");
      return response;
    } catch (IOException ioe) {
      throw logAndWrapException(ioe, user.getShortUserName(), operation, msg);
    }
  }

3.4. removeFromClusterNodeLabels

Removes label information from the cluster through the NodeLabelManager.


  @Override
  public RemoveFromClusterNodeLabelsResponse removeFromClusterNodeLabels(
      RemoveFromClusterNodeLabelsRequest request) throws YarnException, IOException {
    final String operation = "removeFromClusterNodeLabels";
    final String msg = "remove labels.";

    UserGroupInformation user = checkAcls(operation);

    checkRMStatus(user.getShortUserName(), operation, msg);

    RemoveFromClusterNodeLabelsResponse response =
        recordFactory.newRecordInstance(RemoveFromClusterNodeLabelsResponse.class);
    try {
      // Remove the labels
      rm.getRMContext().getNodeLabelManager()
          .removeFromClusterNodeLabels(request.getNodeLabels());
      RMAuditLogger
          .logSuccess(user.getShortUserName(), operation, "AdminService");
      return response;
    } catch (IOException ioe) {
      throw logAndWrapException(ioe, user.getShortUserName(), operation, msg);
    }
  }

3.5. replaceLabelsOnNode

Replaces the labels on nodes through the NodeLabelManager.

      // Replace the labels
      rm.getRMContext().getNodeLabelManager().replaceLabelsOnNode(
          request.getNodeToLabels());
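The three label operations (add, remove, replace) can be pictured as operations on a set of cluster labels plus a node-to-labels map. The in-memory model below is a sketch under that assumption. NodeLabelsSketch is a hypothetical class, and its two policy choices (removing a cluster label also strips it from nodes, and replace rejects labels unknown to the cluster) are assumptions of the sketch, not statements about RMNodeLabelsManager.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of cluster labels and per-node labels.
final class NodeLabelsSketch {
    private final Set<String> clusterLabels = new HashSet<>();
    private final Map<String, Set<String>> nodeToLabels = new HashMap<>();

    // Registers labels at cluster level; nodes are not touched.
    void addToClusterNodeLabels(Set<String> labels) {
        clusterLabels.addAll(labels);
    }

    // Removes labels from the cluster and, in this sketch, from every node.
    void removeFromClusterNodeLabels(Set<String> labels) {
        clusterLabels.removeAll(labels);
        for (Set<String> assigned : nodeToLabels.values()) {
            assigned.removeAll(labels);
        }
    }

    // Overwrites the labels of one node; unknown labels are rejected.
    void replaceLabelsOnNode(String node, Set<String> labels) {
        if (!clusterLabels.containsAll(labels)) {
            throw new IllegalArgumentException("Label not in cluster: " + labels);
        }
        nodeToLabels.put(node, new HashSet<>(labels));
    }

    Set<String> labelsOf(String node) {
        return nodeToLabels.getOrDefault(node, Collections.emptySet());
    }

    public static void main(String[] args) {
        NodeLabelsSketch labels = new NodeLabelsSketch();
        labels.addToClusterNodeLabels(Collections.singleton("gpu"));
        labels.replaceLabelsOnNode("node1", Collections.singleton("gpu"));
        System.out.println(labels.labelsOf("node1"));
    }
}
```

The order matters in this model: a label has to be added to the cluster before replaceLabelsOnNode can assign it to a node.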

3.6. refreshSuperUserGroupsConfiguration

Refreshes the superuser configuration.


  private void refreshSuperUserGroupsConfiguration()
      throws IOException, YarnException {
    // Accept hadoop common configs in core-site.xml as well as RM specific
    // configurations in yarn-site.xml
    Configuration conf =
        getConfiguration(new Configuration(false),
            YarnConfiguration.CORE_SITE_CONFIGURATION_FILE,
            YarnConfiguration.YARN_SITE_CONFIGURATION_FILE);
    // Refresh the configuration from yarn-site.xml and core-site.xml
    RMServerUtils.processRMProxyUsersConf(conf);
    ProxyUsers.refreshSuperUserGroupsConfiguration(conf);
  }

3.7. refreshUserToGroupsMappings

3.8. updateNodeResource

// update resource to RMNode
        this.rm.getRMContext().getDispatcher().getEventHandler()
          .handle(new RMNodeResourceUpdateEvent(nodeId, newResourceOption));
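In the fragment above, the admin call does not change the node itself; it hands an event to the dispatcher and returns. The sketch below illustrates that hand-off with a plain queue. DispatcherSketch and NodeResourceUpdateEvent are hypothetical names, the resource is reduced to a memory size in MB, and the queue is drained explicitly rather than by a background thread so that the example stays deterministic.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of handing a resource update to an event dispatcher.
final class DispatcherSketch {

    // Made-up event type: a node id and its new memory size in MB.
    static final class NodeResourceUpdateEvent {
        final String nodeId;
        final int memoryMb;

        NodeResourceUpdateEvent(String nodeId, int memoryMb) {
            this.nodeId = nodeId;
            this.memoryMb = memoryMb;
        }
    }

    private final Queue<NodeResourceUpdateEvent> pending = new ArrayDeque<>();
    private final Map<String, Integer> nodeMemoryMb = new HashMap<>();

    // Called on the admin path: only enqueues the event.
    void handle(NodeResourceUpdateEvent event) {
        pending.add(event);
    }

    // Called by the dispatcher loop: applies queued events in arrival order.
    int drain() {
        int applied = 0;
        NodeResourceUpdateEvent event;
        while ((event = pending.poll()) != null) {
            nodeMemoryMb.put(event.nodeId, event.memoryMb);
            applied++;
        }
        return applied;
    }

    // Returns the recorded memory for a node, or null if none was applied yet.
    Integer memoryOf(String nodeId) {
        return nodeMemoryMb.get(nodeId);
    }

    public static void main(String[] args) {
        DispatcherSketch dispatcher = new DispatcherSketch();
        dispatcher.handle(new NodeResourceUpdateEvent("node1:8041", 8192));
        System.out.println(dispatcher.memoryOf("node1:8041"));
        dispatcher.drain();
        System.out.println(dispatcher.memoryOf("node1:8041"));
    }
}
```

The point of the design is that the update becomes visible only once the dispatcher has processed the event, so the admin call can return before the node state changes.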

3.9. refreshNodesResources

Uses ResourceTrackerService to update the resource information.

// refresh dynamic resource in ResourceTrackerService
      this.rm.getRMContext().getResourceTrackerService().
          updateDynamicResourceConfiguration(newConf);
