Nacos - How the Server Handles Heartbeat Requests

Posted by 大軍 on 2021-01-05

The server receives heartbeat requests in the InstanceController#beat method.

InstanceController#beat

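This method checks whether the instance already exists. If it does not, the instance is registered first; the heartbeat is then processed.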

public ObjectNode beat(HttpServletRequest request) throws Exception {
        
    ObjectNode result = JacksonUtils.createEmptyJsonNode();
    // Set the heartbeat interval; the client adopts this value for its own heartbeat interval
    result.put(SwitchEntry.CLIENT_BEAT_INTERVAL, switchDomain.getClientBeatInterval());
    String beat = WebUtils.optional(request, "beat", StringUtils.EMPTY);
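    // (omitted: reads namespaceId, serviceName, clusterName, ip and port from the request,
    //  and builds clientBeat from beat when the parameter is present)
    // Look up the Instance by namespaceId, serviceName, clusterName, ip and port
    Instance instance = serviceManager.getInstance(namespaceId, serviceName, clusterName, ip, port);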
    // If there is no such instance, register it
    if (instance == null) {
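        // This is decided by beat: on the first heartbeat, beat carries data, so clientBeat is created.
        // On later heartbeats the instance is normally non-null, so a null instance here
        // means it has probably been removed.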
        if (clientBeat == null) {
            result.put(CommonParams.CODE, NamingResponseCode.RESOURCE_NOT_FOUND);
            return result;
        }
        // (omitted: builds the new Instance from clientBeat)
        // Register the instance
        serviceManager.registerInstance(namespaceId, serviceName, instance);
    }
    // Get the Service from the serviceMap cache
    Service service = serviceManager.getService(namespaceId, serviceName);
    
    if (service == null) {
        throw new NacosException(NacosException.SERVER_ERROR,
                "service not found: " + serviceName + "@" + namespaceId);
    }
    // Not the first heartbeat: assemble clientBeat from the request parameters
    if (clientBeat == null) {
        clientBeat = new RsInfo();
        clientBeat.setIp(ip);
        clientBeat.setPort(port);
        clientBeat.setCluster(clusterName);
    }
    // Process the heartbeat
    service.processClientBeat(clientBeat);
    
    result.put(CommonParams.CODE, NamingResponseCode.OK);
    if (instance.containsMetadata(PreservedMetadataKeys.HEART_BEAT_INTERVAL)) {
        result.put(SwitchEntry.CLIENT_BEAT_INTERVAL, instance.getInstanceHeartBeatInterval());
    }
    result.put(SwitchEntry.LIGHT_BEAT_ENABLED, switchDomain.isLightBeatEnabled());
    return result;
}
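
The decision flow above can be reduced to a small model. The following is only a sketch under simplifying assumptions: `BeatFlowSketch`, its `Code` enum and the `hasBeatPayload` flag are hypothetical names invented for illustration, and an in-memory map stands in for `ServiceManager`. It shows the three outcomes: unknown instance without a payload, unknown instance with a payload, and known instance.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of the branching in beat(); not Nacos code. */
class BeatFlowSketch {

    /** Stand-ins for the response codes returned to the client. */
    enum Code { OK, RESOURCE_NOT_FOUND }

    // Registered instances keyed by "ip:port"; the value is the last heartbeat time.
    private final Map<String, Long> lastBeatByInstance = new HashMap<>();

    /**
     * @param hasBeatPayload true when the request carried the full "beat" parameter
     */
    Code beat(String ip, int port, boolean hasBeatPayload, long nowMillis) {
        String key = ip + ":" + port;
        if (!lastBeatByInstance.containsKey(key)) {
            if (!hasBeatPayload) {
                // Unknown instance and nothing to rebuild it from: the client has to register again.
                return Code.RESOURCE_NOT_FOUND;
            }
            // Unknown instance with a full payload: register it on the fly.
            lastBeatByInstance.put(key, 0L);
        }
        // Known (or just registered) instance: refresh its heartbeat time.
        lastBeatByInstance.put(key, nowMillis);
        return Code.OK;
    }

    /** Returns the last heartbeat time, or null if the instance is unknown. */
    Long lastBeat(String ip, int port) {
        return lastBeatByInstance.get(ip + ":" + port);
    }
}
```

In this model, a heartbeat without a payload only succeeds once the instance has been registered by an earlier full heartbeat.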

ServiceManager#getInstance

Looks up the instance by IP and port.

public Instance getInstance(String namespaceId, String serviceName, String cluster, String ip, int port) {
    // Get the Service from the serviceMap cache
    Service service = getService(namespaceId, serviceName);
    if (service == null) {
        return null;
    }
    
    List<String> clusters = new ArrayList<>();
    clusters.add(cluster);
    // Get the list of Instances that belong to the given clusters
    List<Instance> ips = service.allIPs(clusters);
    if (ips == null || ips.isEmpty()) {
        return null;
    }
    // Find the instance that matches the IP and port
    for (Instance instance : ips) {
        if (instance.getIp().equals(ip) && instance.getPort() == port) {
            return instance;
        }
    }
    
    return null;
}

Service#processClientBeat

Wraps the heartbeat in a Runnable and submits it to the thread pool.

public void processClientBeat(final RsInfo rsInfo) {
    // Create a ClientBeatProcessor; it is a Runnable, so the thread pool will call its run method
    ClientBeatProcessor clientBeatProcessor = new ClientBeatProcessor();
    clientBeatProcessor.setService(this);
    clientBeatProcessor.setRsInfo(rsInfo);
    HealthCheckReactor.scheduleNow(clientBeatProcessor);
}

ClientBeatProcessor#run

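Finds the matching Instance and updates its last heartbeat time. If the instance is not marked and is currently unhealthy, it is set to healthy and the change is broadcast.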

public void run() {
    Service service = this.service;
    if (Loggers.EVT_LOG.isDebugEnabled()) {
        Loggers.EVT_LOG.debug("[CLIENT-BEAT] processing beat: {}", rsInfo.toString());
    }
    
    String ip = rsInfo.getIp();
    String clusterName = rsInfo.getCluster();
    int port = rsInfo.getPort();
    Cluster cluster = service.getClusterMap().get(clusterName);
    // Get all Instances in the cluster
    List<Instance> instances = cluster.allIPs(true);
    
    for (Instance instance : instances) {
        // Find the Instance that matches the IP and port
        if (instance.getIp().equals(ip) && instance.getPort() == port) {
            if (Loggers.EVT_LOG.isDebugEnabled()) {
                Loggers.EVT_LOG.debug("[CLIENT-BEAT] refresh beat: {}", rsInfo.toString());
            }
            // Update the last heartbeat time
            instance.setLastBeat(System.currentTimeMillis());
            // If the instance is not marked and is currently unhealthy, set it to healthy
            if (!instance.isMarked()) {
                if (!instance.isHealthy()) {
                    instance.setHealthy(true);
                    Loggers.EVT_LOG
                            .info("service: {} {POS} {IP-ENABLED} valid: {}:{}@{}, region: {}, msg: client beat ok",
                                    cluster.getService().getName(), ip, port, cluster.getName(),
                                    UtilsAndCommons.LOCALHOST_SITE);
                    // Broadcast the change
                    getPushService().serviceChanged(service);
                }
            }
        }
    }
}
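
The key point of run() is that a heartbeat always refreshes the timestamp, but a push is triggered only when an instance goes from unhealthy to healthy. Here is a minimal sketch of that rule, assuming a stripped-down `Node` type in place of `Instance`; the class and method names are hypothetical, and the return value stands in for the call to `serviceChanged`.

```java
import java.util.List;

/** Hypothetical, simplified model of ClientBeatProcessor#run; not Nacos code. */
class BeatRefreshSketch {

    /** Stand-in for Instance with only the fields the heartbeat touches. */
    static final class Node {
        final String ip;
        final int port;
        boolean marked;
        boolean healthy;
        long lastBeat;

        Node(String ip, int port, boolean healthy) {
            this.ip = ip;
            this.port = port;
            this.healthy = healthy;
        }
    }

    /**
     * Refreshes the heartbeat of every node that matches ip and port.
     *
     * @return true if some node went from unhealthy to healthy, which is the only
     *         case in which a change notification would be broadcast
     */
    static boolean refresh(List<Node> nodes, String ip, int port, long nowMillis) {
        boolean changed = false;
        for (Node node : nodes) {
            if (node.ip.equals(ip) && node.port == port) {
                // The timestamp is always refreshed.
                node.lastBeat = nowMillis;
                // Only an unmarked, unhealthy node is flipped back to healthy.
                if (!node.marked && !node.healthy) {
                    node.healthy = true;
                    changed = true;
                }
            }
        }
        return changed;
    }
}
```

A heartbeat from an instance that is already healthy therefore updates `lastBeat` without causing any push to subscribers.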

PushService#onApplicationEvent

After the broadcast, every listener of ServiceChangeEvent has its onApplicationEvent method called. This method mainly builds the UDP data and sends it.

public void onApplicationEvent(ServiceChangeEvent event) {
    Service service = event.getService();
    String serviceName = service.getName();
    String namespaceId = service.getNamespaceId();

    Future future = GlobalExecutor.scheduleUdpSender(() -> {
        try {
            Loggers.PUSH.info(serviceName + " is changed, add it to push queue.");
            ConcurrentMap<String, PushClient> clients = clientMap
                    .get(UtilsAndCommons.assembleFullServiceName(namespaceId, serviceName));
            if (MapUtils.isEmpty(clients)) {
                return;
            }

            Map<String, Object> cache = new HashMap<>(16);
            long lastRefTime = System.nanoTime();
            // Iterate over the PushClient collection
            for (PushClient client : clients.values()) {
                // Skip and remove clients that have expired
                if (client.zombie()) {
                    Loggers.PUSH.debug("client is zombie: " + client.toString());
                    clients.remove(client.toString());
                    Loggers.PUSH.debug("client is zombie: " + client.toString());
                    continue;
                }

                Receiver.AckEntry ackEntry;
                Loggers.PUSH.debug("push serviceName: {} to client: {}", serviceName, client.toString());
                String key = getPushCacheKey(serviceName, client.getIp(), client.getAgent());
                byte[] compressData = null;
                Map<String, Object> data = null;
                if (switchDomain.getDefaultPushCacheMillis() >= 20000 && cache.containsKey(key)) {
                    org.javatuples.Pair pair = (org.javatuples.Pair) cache.get(key);
                    compressData = (byte[]) (pair.getValue0());
                    data = (Map<String, Object>) pair.getValue1();

                    Loggers.PUSH.debug("[PUSH-CACHE] cache hit: {}:{}", serviceName, client.getAddrStr());
                }
                // Build the UDP data; data larger than 1 KB is compressed, which compressIfNecessary decides
                if (compressData != null) {
                    ackEntry = prepareAckEntry(client, compressData, data, lastRefTime);
                } else {
                    ackEntry = prepareAckEntry(client, prepareHostsData(client), lastRefTime);
                    if (ackEntry != null) {
                        cache.put(key, new org.javatuples.Pair<>(ackEntry.origin.getData(), ackEntry.data));
                    }
                }

                Loggers.PUSH.info("serviceName: {} changed, schedule push for: {}, agent: {}, key: {}",
                        client.getServiceName(), client.getAddrStr(), client.getAgent(),
                        (ackEntry == null ? null : ackEntry.key));
                // Send the UDP data
                udpPush(ackEntry);
            }
        } catch (Exception e) {
            Loggers.PUSH.error("[NACOS-PUSH] failed to push serviceName: {} to client, error: {}", serviceName, e);

        } finally {
            futureMap.remove(UtilsAndCommons.assembleFullServiceName(namespaceId, serviceName));
        }

    }, 1000, TimeUnit.MILLISECONDS);

    futureMap.put(UtilsAndCommons.assembleFullServiceName(namespaceId, serviceName), future);

}
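
The body of compressIfNecessary is not shown in this article. The sketch below illustrates the idea of a size threshold only; the use of GZIP, the class name, the constant name and the `decompress` helper are assumptions made for this example, not a copy of the Nacos implementation. The 1 KB limit is the one mentioned in the comment above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical sketch of threshold-based compression; GZIP is an assumption. */
class PushCompressionSketch {

    // 1 KB threshold, as described in the comment in onApplicationEvent.
    static final int MAX_UNCOMPRESSED_BYTES = 1024;

    /** Returns the input unchanged when it is small, otherwise a GZIP-compressed copy. */
    static byte[] compressIfNecessary(byte[] data) {
        if (data.length <= MAX_UNCOMPRESSED_BYTES) {
            return data;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Helper for the receiving side: restores GZIP-compressed bytes. */
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Keeping small payloads uncompressed avoids paying the GZIP header and CPU cost for data that already fits comfortably in one UDP packet.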

PushService#udpPush

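Sends the UDP data. The number of retries is bounded by MAX_RETRY_TIMES, and a check is scheduled 10 seconds (ACK_TIMEOUT_NANOS) after each send.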

private static Receiver.AckEntry udpPush(Receiver.AckEntry ackEntry) {
    if (ackEntry == null) {
        Loggers.PUSH.error("[NACOS-PUSH] ackEntry is null.");
        return null;
    }
    // If the maximum number of retries is exceeded without success, remove the entry from ackMap and udpSendTimeMap
    if (ackEntry.getRetryTimes() > MAX_RETRY_TIMES) {
        Loggers.PUSH.warn("max re-push times reached, retry times {}, key: {}", ackEntry.retryTimes, ackEntry.key);
        ackMap.remove(ackEntry.key);
        udpSendTimeMap.remove(ackEntry.key);
        failedPush += 1;
        return ackEntry;
    }

    try {
        if (!ackMap.containsKey(ackEntry.key)) {
            totalPush++;
        }
        ackMap.put(ackEntry.key, ackEntry);
        udpSendTimeMap.put(ackEntry.key, System.currentTimeMillis());

        Loggers.PUSH.info("send udp packet: " + ackEntry.key);
        // Send over UDP
        udpSocket.send(ackEntry.origin);

        ackEntry.increaseRetryTime();
        // Schedule a check 10 seconds from now
        GlobalExecutor.scheduleRetransmitter(new Retransmitter(ackEntry),
                TimeUnit.NANOSECONDS.toMillis(ACK_TIMEOUT_NANOS), TimeUnit.MILLISECONDS);

        return ackEntry;
    } catch (Exception e) {
        Loggers.PUSH.error("[NACOS-PUSH] failed to push data: {} to client: {}, error: {}", ackEntry.data,
                ackEntry.origin.getAddress().getHostAddress(), e);
        ackMap.remove(ackEntry.key);
        udpSendTimeMap.remove(ackEntry.key);
        failedPush += 1;

        return null;
    }
}

Retransmitter#run

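Runs 10 seconds after a send and checks whether the packet has been acknowledged. If it has not, the packet is sent again, until MAX_RETRY_TIMES is exceeded.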

public void run() {
    if (ackMap.containsKey(ackEntry.key)) {
        Loggers.PUSH.info("retry to push data, key: " + ackEntry.key);
        udpPush(ackEntry);
    }
}

Receiver#run

When the PushService class is initialized, its static block starts the Receiver thread.

static {
    // (other code omitted)
    Receiver receiver = new Receiver();
    Thread inThread = new Thread(receiver);
    inThread.setDaemon(true);
    inThread.setName("com.alibaba.nacos.naming.push.receiver");
    inThread.start();
    // (other code omitted)
}

The Receiver runs a while (true) loop. Whenever an ACK arrives, it removes the corresponding key from ackMap.

public void run() {
    while (true) {
        // (other code omitted)
        String ackKey = getAckKey(ip, port, ackPacket.lastRefTime);
        AckEntry ackEntry = ackMap.remove(ackKey);
        // (other code omitted)
    }
}

Broadcast Summary

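When broadcasting, the server stores an entry in ackMap; if the send itself fails, the entry is removed again. When a UDP packet is never acknowledged, its entry stays in ackMap. The Retransmitter, scheduled 10 seconds after each send, checks whether the entry is still in ackMap; if it is, the packet is sent again, until the maximum number of retries is reached. A separate thread listens for UDP responses and removes the corresponding entry from ackMap when an ACK arrives. These UDP packets are sent to the clients; the article "Nacos - Creation of HostReactor" describes what the client does on receiving one, namely refreshing its own service information.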
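
The interplay between sending, the timeout check and the ACK receiver can be sketched as follows. This is a deterministic model rather than the real code: `AckRetrySketch` and its method names are hypothetical, a `Consumer<String>` stands in for the UDP socket, and the timeout is simulated by calling `onAckTimeout` directly instead of scheduling it on an executor.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical model of the ackMap / Retransmitter / Receiver interplay; not Nacos code. */
class AckRetrySketch {

    /** Stand-in for Receiver.AckEntry. */
    static final class Entry {
        final String key;
        int retryTimes;

        Entry(String key) {
            this.key = key;
        }
    }

    private final Map<String, Entry> ackMap = new ConcurrentHashMap<>();
    private final int maxRetryTimes;
    private final Consumer<String> sender; // stands in for sending the UDP packet

    AckRetrySketch(int maxRetryTimes, Consumer<String> sender) {
        this.maxRetryTimes = maxRetryTimes;
        this.sender = sender;
    }

    /** Sends (or re-sends) an entry; gives up once the retry limit is exceeded. */
    boolean push(Entry entry) {
        if (entry.retryTimes > maxRetryTimes) {
            ackMap.remove(entry.key);
            return false;
        }
        ackMap.put(entry.key, entry);
        sender.accept(entry.key);
        entry.retryTimes++;
        // The real flow would now schedule onAckTimeout(entry) after the ACK timeout.
        return true;
    }

    /** What the retransmitter does when the timeout fires: re-send only if still unacknowledged. */
    boolean onAckTimeout(Entry entry) {
        return ackMap.containsKey(entry.key) && push(entry);
    }

    /** What the receiver thread does when an ACK arrives. */
    void onAck(String key) {
        ackMap.remove(key);
    }

    boolean isPending(String key) {
        return ackMap.containsKey(key);
    }
}
```

With this structure, the presence of a key in the map is the only shared state: an ACK removes the key, and the timeout check becomes a no-op once the key is gone.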

Heartbeat Summary

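In short, after receiving a heartbeat request, the server updates the instance's last heartbeat time and health status, and broadcasts the change.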
