How Hystrix Implements Its Metrics Windows

Posted by weixin_33763244 on 2018-03-29

1. Introduction

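Hystrix is a circuit-breaker library that lets a call fail fast and fall back to an alternative path. It decides whether to trip the circuit from the failure ratio observed inside a sliding window. There are many ways to implement a sliding window and many ways to keep metric counts; a common one is to maintain the counters with atomic increments and decrements on an AtomicInteger. Those alternatives are not discussed here.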

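Hystrix is built on RxJava, so how does it use RxJava to aggregate metrics and implement a sliding window? This article is not a tutorial on how to use RxJava; it explains the approach Hystrix takes to get this done.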

2. Reporting Metric Data

Start with the main entry point of HystrixCommand execution:

public Observable<R> toObservable() {
    final AbstractCommand<R> _cmd = this;

    final Action0 terminateCommandCleanup = new Action0() {

        @Override
        public void call() {
            if (_cmd.commandState.compareAndSet(CommandState.OBSERVABLE_CHAIN_CREATED, CommandState.TERMINAL)) {
                handleCommandEnd(false); //user code never ran
            } else if (_cmd.commandState.compareAndSet(CommandState.USER_CODE_EXECUTED, CommandState.TERMINAL)) {
                handleCommandEnd(true); //user code did run
            }
        }
    };

    //mark the command as CANCELLED and store the latency (in addition to standard cleanup)
    final Action0 unsubscribeCommandCleanup = new Action0() {
        @Override
        public void call() {
            if (_cmd.commandState.compareAndSet(CommandState.OBSERVABLE_CHAIN_CREATED, CommandState.UNSUBSCRIBED)) {
                // ... irrelevant code omitted ...
                handleCommandEnd(false); //user code never ran
            } else if (_cmd.commandState.compareAndSet(CommandState.USER_CODE_EXECUTED, CommandState.UNSUBSCRIBED)) {
                // ... irrelevant code omitted ...
                handleCommandEnd(true); //user code did run
            }
        }
    };

    // ... irrelevant code omitted ...

    return Observable.defer(new Func0<Observable<R>>() {

    // ... irrelevant code omitted ...

            return afterCache
                    .doOnTerminate(terminateCommandCleanup) 
                    .doOnUnsubscribe(unsubscribeCommandCleanup) 
                    .doOnCompleted(fireOnCompletedHook);
        }
});

The main Observable triggers handleCommandEnd from both doOnTerminate and doOnUnsubscribe. As the name suggests, this method does some housekeeping when the command finishes executing.

private void handleCommandEnd(boolean commandExecutionStarted) {
    // ... irrelevant code omitted ...
    executionResult = executionResult.markUserThreadCompletion((int) userThreadLatency);
    if (executionResultAtTimeOfCancellation == null) {
        metrics.markCommandDone(executionResult, commandKey, threadPoolKey, commandExecutionStarted);
    } else {
        metrics.markCommandDone(executionResultAtTimeOfCancellation, commandKey, threadPoolKey, commandExecutionStarted);
    }
    // ... irrelevant code omitted ...
}
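The two cleanup actions in toObservable() both guard handleCommandEnd with compareAndSet, so whichever of terminate or unsubscribe arrives first wins and the end-of-command work runs only once. The following is a minimal plain-Java sketch of that guard, not Hystrix code: the class CommandEndGuard, the counter endCalls and the helper handleEnd are hypothetical, and the enum lists only the states visible in the excerpt above.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for the commandState guard shown above (not Hystrix code). */
final class CommandEndGuard {
    enum State { OBSERVABLE_CHAIN_CREATED, USER_CODE_EXECUTED, TERMINAL, UNSUBSCRIBED }

    final AtomicReference<State> state = new AtomicReference<>(State.OBSERVABLE_CHAIN_CREATED);
    final AtomicInteger endCalls = new AtomicInteger(); // how many times the end handler ran

    /** Called when user code starts; only succeeds from the initial state. */
    void markUserCodeExecuted() {
        state.compareAndSet(State.OBSERVABLE_CHAIN_CREATED, State.USER_CODE_EXECUTED);
    }

    /** Same shape as terminateCommandCleanup. */
    void onTerminate() { finish(State.TERMINAL); }

    /** Same shape as unsubscribeCommandCleanup. */
    void onUnsubscribe() { finish(State.UNSUBSCRIBED); }

    private void finish(State target) {
        if (state.compareAndSet(State.OBSERVABLE_CHAIN_CREATED, target)) {
            handleEnd(false); // user code never ran
        } else if (state.compareAndSet(State.USER_CODE_EXECUTED, target)) {
            handleEnd(true);  // user code did run
        }
        // Otherwise the state is already TERMINAL or UNSUBSCRIBED: nothing to do.
    }

    private void handleEnd(boolean userCodeRan) {
        endCalls.incrementAndGet(); // stands in for handleCommandEnd(...)
    }
}
```

Calling onTerminate() and then onUnsubscribe() on the same instance leaves endCalls at 1, because the second call finds the state already moved to TERMINAL.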

In handleCommandEnd, note the call to metrics.markCommandDone: it invokes the markCommandDone method of HystrixCommandMetrics and passes in an executionResult. So what on earth is ExecutionResult?
Here is an excerpt of the class:

public class ExecutionResult {
    private final EventCounts eventCounts;
    private final Exception failedExecutionException;
    private final Exception executionException;
    private final long startTimestamp;
    private final int executionLatency; //time spent in run() method
    private final int userThreadLatency; //time elapsed between caller thread submitting request and response being visible to it
    private final boolean executionOccurred;
    private final boolean isExecutedInThread;
    private final HystrixCollapserKey collapserKey;

    private static final HystrixEventType[] ALL_EVENT_TYPES = HystrixEventType.values();
    private static final int NUM_EVENT_TYPES = ALL_EVENT_TYPES.length;
    private static final BitSet EXCEPTION_PRODUCING_EVENTS = new BitSet(NUM_EVENT_TYPES);
    private static final BitSet TERMINAL_EVENTS = new BitSet(NUM_EVENT_TYPES);

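As you have probably guessed, this class records the execution result of the current HystrixCommand. It holds more than the outcome itself: it also carries the various states and any exceptions that occurred. It shows up in each of the Observables described in the article "Hystrix Execution Principles" and follows the HystrixCommand through its whole lifecycle.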

Back to the point above: once the command has finished, the markCommandDone method of HystrixCommandMetrics is called:

void markCommandDone(ExecutionResult executionResult, HystrixCommandKey commandKey, HystrixThreadPoolKey threadPoolKey, boolean executionStarted) {
    HystrixThreadEventStream.getInstance().executionDone(executionResult, commandKey, threadPoolKey);
    if (executionStarted) {
        concurrentExecutionCount.decrementAndGet();
    }
}

This ends up calling HystrixThreadEventStream.executionDone. HystrixThreadEventStream is held in a ThreadLocal, so each instance is bound to the current thread:

//HystrixThreadEventStream.threadLocalStreams
private static final ThreadLocal<HystrixThreadEventStream> threadLocalStreams = new ThreadLocal<HystrixThreadEventStream>() {
    @Override
    protected HystrixThreadEventStream initialValue() {
        return new HystrixThreadEventStream(Thread.currentThread());
    }
};
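The effect of keeping the stream in a ThreadLocal is that every thread lazily gets its own instance and no two threads ever share one. Here is a minimal plain-Java sketch of the same pattern; PerThreadStream and ownerThreadName are hypothetical names used only for illustration.

```java
/** Hypothetical per-thread holder illustrating the ThreadLocal pattern above (not Hystrix code). */
final class PerThreadStream {
    final String ownerThreadName;

    private PerThreadStream(Thread owner) {
        this.ownerThreadName = owner.getName();
    }

    // Each thread's first call to get() runs the supplier on that thread.
    private static final ThreadLocal<PerThreadStream> STREAMS =
            ThreadLocal.withInitial(() -> new PerThreadStream(Thread.currentThread()));

    /** Returns the instance bound to the calling thread, creating it on first use. */
    static PerThreadStream getInstance() {
        return STREAMS.get();
    }
}
```

Repeated calls on one thread return the same object, while a call from another thread returns a different one.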

The code of executionDone is as follows:

public void executionDone(ExecutionResult executionResult, HystrixCommandKey commandKey, HystrixThreadPoolKey threadPoolKey) {
    HystrixCommandCompletion event = HystrixCommandCompletion.from(executionResult, commandKey, threadPoolKey);
    writeOnlyCommandCompletionSubject.onNext(event);
}

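Here a HystrixCommandCompletion is built from executionResult, commandKey and threadPoolKey and then written through writeOnlyCommandCompletionSubject. We will come back to writeOnlyCommandCompletionSubject in a moment. First, what is HystrixCommandCompletion? It wraps an ExecutionResult and a HystrixRequestContext. It is a kind of HystrixEvent that marks the completion of a command, and it carries the request information, execution result, state and other data of the HystrixCommand at that point in time.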

The class diagram (image not included here) shows that HystrixCommandCompletion is only one of several event types; the others are not covered one by one.

When onNext is called on writeOnlyCommandCompletionSubject, the call() method of writeCommandCompletionsToShardedStreams is triggered.

  private static final Action1<HystrixCommandCompletion> writeCommandCompletionsToShardedStreams = new Action1<HystrixCommandCompletion>() {
    @Override
    public void call(HystrixCommandCompletion commandCompletion) {
        HystrixCommandCompletionStream commandStream = HystrixCommandCompletionStream.getInstance(commandCompletion.getCommandKey());
        commandStream.write(commandCompletion);

        if (commandCompletion.isExecutedInThread() || commandCompletion.isResponseThreadPoolRejected()) {
            HystrixThreadPoolCompletionStream threadPoolStream = HystrixThreadPoolCompletionStream.getInstance(commandCompletion.getThreadPoolKey());
            threadPoolStream.write(commandCompletion);
        }
    }
};

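This method writes the HystrixCommandCompletion to a HystrixCommandCompletionStream. If the command executed in a thread pool (the thread isolation strategy) or was rejected by the thread pool, the event is written a second time to a HystrixThreadPoolCompletionStream. The two streams are conceptually similar, so the former is used as the example.
HystrixCommandCompletionStream instances are kept in memory, keyed by commandKey. Calling write really calls onNext on the internal writeOnlySubject field. writeOnlySubject is a Subject (an RxJava type) wrapped in a SerializedSubject so that concurrent writes are serialized. Calling share() on it yields an Observable, readOnlyStream, through which the outside world can read the Subject's data. To sum up, a Subject is both an Observer and an Observable, so it bridges the writing side and the reading side. Its two type parameters describe the input and output data types, and both are HystrixCommandCompletion here.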

HystrixCommandCompletionStream(final HystrixCommandKey commandKey) {
        this.commandKey = commandKey;

        this.writeOnlySubject = new SerializedSubject<HystrixCommandCompletion, HystrixCommandCompletion>(PublishSubject.<HystrixCommandCompletion>create());
        this.readOnlyStream = writeOnlySubject.share();
    }
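The write side and read side described above can be approximated without RxJava. The sketch below is a plain-Java analogue under stated assumptions: CompletionStreamSketch and its String events are hypothetical, synchronized stands in for the serialization that SerializedSubject provides, and, as with a PublishSubject, a subscriber only sees events written after it subscribed.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical plain-Java analogue of a keyed, write-once/read-many event stream. */
final class CompletionStreamSketch {
    // One stream per key, created on first request and kept in memory.
    private static final ConcurrentMap<String, CompletionStreamSketch> STREAMS =
            new ConcurrentHashMap<>();

    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    static CompletionStreamSketch getInstance(String commandKey) {
        return STREAMS.computeIfAbsent(commandKey, key -> new CompletionStreamSketch());
    }

    /** Write side: events from concurrent writers are delivered one at a time. */
    synchronized void write(String event) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }

    /** Read side: the subscriber receives only events written from now on. */
    void observe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }
}
```

A subscriber registered before two writes sees both events; one registered between them sees only the second.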

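Working from the source, we now understand how the HystrixCommandCompletion data stream is written (the other event types follow the same idea and are not explained one by one). How, then, is that data collected?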

3. Collecting Metric Data

Trace back to the initialization of AbstractCommand:

protected AbstractCommand(HystrixCommandGroupKey group, HystrixCommandKey key, HystrixThreadPoolKey threadPoolKey, HystrixCircuitBreaker circuitBreaker, HystrixThreadPool threadPool,
        HystrixCommandProperties.Setter commandPropertiesDefaults, HystrixThreadPoolProperties.Setter threadPoolPropertiesDefaults,
        HystrixCommandMetrics metrics, TryableSemaphore fallbackSemaphore, TryableSemaphore executionSemaphore,
        HystrixPropertiesStrategy propertiesStrategy, HystrixCommandExecutionHook executionHook) {

    // ... code omitted ...
    this.metrics = initMetrics(metrics, this.commandGroup, this.threadPoolKey, this.commandKey, this.properties);
    // ... code omitted ...
}

Initializing the command metrics:

HystrixCommandMetrics(final HystrixCommandKey key, HystrixCommandGroupKey commandGroup, HystrixThreadPoolKey threadPoolKey, HystrixCommandProperties properties, HystrixEventNotifier eventNotifier) {
    super(null);
    this.key = key;
    this.group = commandGroup;
    this.threadPoolKey = threadPoolKey;
    this.properties = properties;

    healthCountsStream = HealthCountsStream.getInstance(key, properties);
    rollingCommandEventCounterStream = RollingCommandEventCounterStream.getInstance(key, properties);
    cumulativeCommandEventCounterStream = CumulativeCommandEventCounterStream.getInstance(key, properties);

    rollingCommandLatencyDistributionStream = RollingCommandLatencyDistributionStream.getInstance(key, properties);
    rollingCommandUserLatencyDistributionStream = RollingCommandUserLatencyDistributionStream.getInstance(key, properties);
    rollingCommandMaxConcurrencyStream = RollingCommandMaxConcurrencyStream.getInstance(key, properties);
}

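There are many XXXStream.getInstance() calls. Each of these streams is a concrete implementation that collects and aggregates metrics for one particular purpose. Their UML class diagram follows.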

Figure: class diagram of a few Hystrix stream classes (not all subclasses are shown)

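BucketedCounterStream implements the basic bucket counter. BucketedCumulativeCounterStream builds on it to implement cumulative counting, and BucketedRollingCounterStream builds on it to implement sliding-window counting. The subclasses of these two are concrete implementations for specific metrics.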

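The rest of this section covers the two kinds, cumulative counting and sliding-window counting, using CumulativeCommandEventCounterStream and HealthCountsStream respectively as detailed examples.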

3.1 BucketedCounterStream: the basic bucket implementation

Figure: data collection diagram

protected BucketedCounterStream(final HystrixEventStream<Event> inputEventStream, final int numBuckets, final int bucketSizeInMs,
                                    final Func2<Bucket, Event, Bucket> appendRawEventToBucket) {
    this.numBuckets = numBuckets;
    this.reduceBucketToSummary = new Func1<Observable<Event>, Observable<Bucket>>() {
        @Override
        public Observable<Bucket> call(Observable<Event> eventBucket) {
            return eventBucket.reduce(getEmptyBucketSummary(), appendRawEventToBucket);
        }
    };

    final List<Bucket> emptyEventCountsToStart = new ArrayList<Bucket>();
    for (int i = 0; i < numBuckets; i++) {
        emptyEventCountsToStart.add(getEmptyBucketSummary());
    }

    this.bucketedStream = Observable.defer(new Func0<Observable<Bucket>>() {
        @Override
        public Observable<Bucket> call() {
            return inputEventStream
                    .observe()
                    .window(bucketSizeInMs, TimeUnit.MILLISECONDS)
                    .flatMap(reduceBucketToSummary)                
                    .startWith(emptyEventCountsToStart);   
        }
    });
}

The parent-class constructor has three main parts:
I. reduceBucketToSummary: how the data in each bucket is aggregated

The implementation of appendRawEventToBucket is chosen by the subclass, but the variants differ only slightly. Digging into the code of HealthCountsStream shows that it uses HystrixCommandMetrics.appendEventToBucket:

public static final Func2<long[], HystrixCommandCompletion, long[]> appendEventToBucket = new Func2<long[], HystrixCommandCompletion, long[]>() {
        @Override
        public long[] call(long[] initialCountArray, HystrixCommandCompletion execution) {
            ExecutionResult.EventCounts eventCounts = execution.getEventCounts();
            for (HystrixEventType eventType: ALL_EVENT_TYPES) {
                switch (eventType) {
                    case EXCEPTION_THROWN: break; //this is just a sum of other anyway - don't do the work here
                    default:
                        initialCountArray[eventType.ordinal()] += eventCounts.getCount(eventType);
                        break;
                }
            }
            return initialCountArray;
        }
    };
}

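This function adds up the counts of the events that arrive within one bucket's duration. initialCountArray can be seen as the running result for the first n events in the bucket, and each array index is the ordinal of the corresponding HystrixEventType enum constant.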

II. emptyEventCountsToStart: the numBuckets empty buckets the stream starts with; call them the "genesis buckets" if you want to sound fancy

III. The window definition: the first argument is the duration of each bucket and the second is the time unit. RxJava's window operator does the grouping for us.

.window(bucketSizeInMs, TimeUnit.MILLISECONDS)
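The three parts above can be imitated in plain Java to make the data flow concrete. This is a sketch under stated assumptions rather than the Hystrix implementation: BucketReduceSketch, its three-value EventType enum and the Event class are hypothetical, and timestamps are millisecond offsets from the start of the stream. Each bucket starts as an all-zero array, events are assigned to a bucket by time, and each event increments the slot at its type's ordinal.

```java
/** Hypothetical plain-Java imitation of time bucketing plus per-bucket reduction. */
final class BucketReduceSketch {
    enum EventType { SUCCESS, FAILURE, TIMEOUT } // illustrative subset only

    static final class Event {
        final long timeMs;      // offset from the start of the stream
        final EventType type;

        Event(long timeMs, EventType type) {
            this.timeMs = timeMs;
            this.type = type;
        }
    }

    /** Returns one long[] per bucket; each array is indexed by EventType.ordinal(). */
    static long[][] bucketize(Event[] events, int bucketSizeInMs, int numBuckets) {
        // Java zero-fills new arrays, so every bucket starts as an empty summary.
        long[][] buckets = new long[numBuckets][EventType.values().length];
        for (Event event : events) {
            if (event.timeMs < 0) {
                continue; // before the stream started
            }
            long index = event.timeMs / bucketSizeInMs;
            if (index < numBuckets) {
                buckets[(int) index][event.type.ordinal()]++;
            }
        }
        return buckets;
    }
}
```

With a 500 ms bucket, events at 10 ms and 20 ms land in bucket 0 and events at 510 ms and 520 ms land in bucket 1; a bucket that receives no events stays all zeros, just as an empty window reduces to the empty bucket summary.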

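How the bucket duration is computed
Where does the duration of each bucket come from? It is derived from configuration. Take HealthCountsStream as an example:
metrics.rollingStats.timeInMilliseconds: length of the sliding window, default 10000 ms
metrics.healthSnapshot.intervalInMilliseconds: interval between health snapshots, default 500 ms; here it is the duration of one bucket

number of buckets in the sliding window = sliding window length / bucket duration

For CumulativeCommandEventCounterStream:
metrics.rollingStats.timeInMilliseconds: length of the sliding window, default 10000 ms
metrics.rollingStats.numBuckets: number of buckets the sliding window is split into, default 10

bucket duration = sliding window length / number of buckets

Streams with other responsibilities use different formulas and configuration items, but they all follow the same pattern, so they are not shown one by one.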
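As a worked example of the two formulas with the default values: a 10000 ms window with 500 ms buckets gives 20 buckets for the health stream, and a 10000 ms window split into 10 buckets gives 1000 ms per bucket for the cumulative counter. The helper below is a hypothetical sketch; the divisibility check is a choice made for this example, not a statement about how Hystrix validates its configuration.

```java
/** Hypothetical helper for the two bucket formulas above. */
final class BucketMathSketch {
    /** number of buckets = sliding window length / bucket duration */
    static int numBuckets(int rollingWindowMs, int bucketSizeMs) {
        requireDivisible(rollingWindowMs, bucketSizeMs);
        return rollingWindowMs / bucketSizeMs;
    }

    /** bucket duration = sliding window length / number of buckets */
    static int bucketSizeMs(int rollingWindowMs, int numBuckets) {
        requireDivisible(rollingWindowMs, numBuckets);
        return rollingWindowMs / numBuckets;
    }

    // Rejects configurations that would leave a partial bucket (this sketch's own rule).
    private static void requireDivisible(int window, int divisor) {
        if (divisor <= 0 || window % divisor != 0) {
            throw new IllegalArgumentException(
                    "window " + window + " must be a positive multiple of " + divisor);
        }
    }
}
```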

inputEventStream
inputEventStream can be thought of as the data stream the window collects from. It is passed in by the subclass; a quick look at two of them:

//HealthCountsStream
private HealthCountsStream(final HystrixCommandKey commandKey, final int numBuckets, final int bucketSizeInMs,
                               Func2<long[], HystrixCommandCompletion, long[]> reduceCommandCompletion) {
    super(HystrixCommandCompletionStream.getInstance(commandKey), numBuckets, bucketSizeInMs, reduceCommandCompletion, healthCheckAccumulator);
}

//RollingThreadPoolEventCounterStream
private RollingThreadPoolEventCounterStream(HystrixThreadPoolKey threadPoolKey, int numCounterBuckets, int counterBucketSizeInMs,
                                                Func2<long[], HystrixCommandCompletion, long[]> reduceCommandCompletion,
                                                Func2<long[], long[], long[]> reduceBucket) {
    super(HystrixThreadPoolCompletionStream.getInstance(threadPoolKey), numCounterBuckets, counterBucketSizeInMs, reduceCommandCompletion, reduceBucket);
}

So inputEventStream is really HystrixCommandCompletionStream, HystrixThreadPoolCompletionStream or one of the others. Take HystrixCommandCompletionStream: it is the stream that section 2, on reporting metric data, writes to, and inputEventStream.observe() returns its readOnlyStream, the read-only Observable view of the Subject. (If this is unclear, go back to the end of section 2.)

3.2 Cumulative counter: CumulativeCommandEventCounterStream

First look at the cumulative counter's parent class, BucketedCumulativeCounterStream:

protected BucketedCumulativeCounterStream(HystrixEventStream<Event> stream, int numBuckets, int bucketSizeInMs,
                                              Func2<Bucket, Event, Bucket> reduceCommandCompletion,
                                              Func2<Output, Bucket, Output> reduceBucket) {
    super(stream, numBuckets, bucketSizeInMs, reduceCommandCompletion);

    this.sourceStream = bucketedStream
            .scan(getEmptyOutputValue(), reduceBucket)
            .skip(numBuckets)
            // ... code omitted ...
            
}

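bucketedStream is the stream of aggregated buckets from section 3.1. scan is applied to it: scan walks the emitted buckets in order and applies the given function at each step, feeding the previous result back in. It takes two arguments: the first is the initial value of the accumulation, and the second is the function run at each step, which receives the previous result and the current bucket. After scan, skip(numBuckets) drops the first numBuckets emissions, which are produced only from the initial value and the empty starter buckets.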
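The scan-then-skip step can be reproduced on a plain list to see which values survive. The sketch below is an illustration, not RxJava: ScanSketch is a hypothetical class, each bucket is reduced to a single long for brevity, and it assumes the seeded form of scan, which emits the seed first and then one running total per input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

/** Hypothetical list-based imitation of scan(seed, fn) followed by skip(n). */
final class ScanSketch {
    /** Emits the seed first, then every running accumulation. */
    static <T> List<T> scan(T seed, List<T> items, BinaryOperator<T> accumulator) {
        List<T> out = new ArrayList<>();
        T running = seed;
        out.add(running);
        for (T item : items) {
            running = accumulator.apply(running, item);
            out.add(running);
        }
        return out;
    }

    /** Running totals over the buckets with the first numBuckets emissions dropped. */
    static List<Long> cumulativeAfterSkip(List<Long> buckets, int numBuckets) {
        List<Long> scanned = scan(0L, buckets, Long::sum);
        int from = Math.min(numBuckets, scanned.size());
        return new ArrayList<>(scanned.subList(from, scanned.size()));
    }
}
```

With numBuckets = 2 and a bucket stream of two empty starter buckets followed by real buckets 3 and 4, scan yields 0, 0, 0, 3, 7 and the skip leaves 0, 3, 7: the total only ever grows, which is what makes the counter cumulative.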

How is the function passed to scan implemented? It is the key to cumulative counting and is supplied by the subclass, which in this section is CumulativeCommandEventCounterStream:

CumulativeCommandEventCounterStream newStream = new CumulativeCommandEventCounterStream(commandKey, numBuckets, bucketSizeInMs,HystrixCommandMetrics.appendEventToBucket, HystrixCommandMetrics.bucketAggregator);

The function passed in is HystrixCommandMetrics.bucketAggregator. Here is its body:

public static final Func2<long[], long[], long[]> bucketAggregator = new Func2<long[], long[], long[]>() {
    @Override
    public long[] call(long[] cumulativeEvents, long[] bucketEventCounts) {
        for (HystrixEventType eventType: ALL_EVENT_TYPES) {
            switch (eventType) {
                case EXCEPTION_THROWN:
                    for (HystrixEventType exceptionEventType: HystrixEventType.EXCEPTION_PRODUCING_EVENT_TYPES) {
                        cumulativeEvents[eventType.ordinal()] += bucketEventCounts[exceptionEventType.ordinal()];
                    }
                    break;
                default:
                    cumulativeEvents[eventType.ordinal()] += bucketEventCounts[eventType.ordinal()];
                    break;
            }
        }
        return cumulativeEvents;
    }
};

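call() takes two arguments: the first is the result accumulated so far and the second is the counts in the current bucket. The body is easy to follow: it adds the current bucket's count for each event type to the running total.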

That is all it takes to implement the cumulative count for a command. The other cumulative counters work the same way.
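One detail of bucketAggregator is worth a concrete example: the EXCEPTION_THROWN slot is not copied from the bucket but derived by summing the exception-producing event types. The sketch below uses a hypothetical four-value enum and a hypothetical choice of which types count as exception-producing; only the shape of the loop follows the excerpt above.

```java
/** Hypothetical reduced version of the bucket aggregator shown above. */
final class BucketAggregatorSketch {
    enum EventType { SUCCESS, FALLBACK_FAILURE, FALLBACK_MISSING, EXCEPTION_THROWN }

    // Assumption for this sketch: these two types are the exception-producing ones.
    private static final EventType[] EXCEPTION_PRODUCING = {
        EventType.FALLBACK_FAILURE, EventType.FALLBACK_MISSING
    };

    /** Adds one bucket's counts into the cumulative array and returns that array. */
    static long[] aggregate(long[] cumulative, long[] bucket) {
        for (EventType type : EventType.values()) {
            if (type == EventType.EXCEPTION_THROWN) {
                // Derived slot: the sum of all exception-producing counts in this bucket.
                for (EventType producer : EXCEPTION_PRODUCING) {
                    cumulative[type.ordinal()] += bucket[producer.ordinal()];
                }
            } else {
                cumulative[type.ordinal()] += bucket[type.ordinal()];
            }
        }
        return cumulative;
    }
}
```

Starting from cumulative counts {5, 1, 0, 1} and adding a bucket {3, 2, 1, 0} gives {8, 3, 1, 4}: the last slot grows by 2 + 1 even though the bucket's own EXCEPTION_THROWN slot is 0.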

3.3 Sliding window: HealthCountsStream

Go straight to the parent class code:

protected BucketedRollingCounterStream(HystrixEventStream<Event> stream, final int numBuckets, int bucketSizeInMs,
                                           final Func2<Bucket, Event, Bucket> appendRawEventToBucket,
                                           final Func2<Output, Bucket, Output> reduceBucket) {
    super(stream, numBuckets, bucketSizeInMs, appendRawEventToBucket);
    Func1<Observable<Bucket>, Observable<Output>> reduceWindowToSummary = new Func1<Observable<Bucket>, Observable<Output>>() {
        @Override
        public Observable<Output> call(Observable<Bucket> window) {
            return window.scan(getEmptyOutputValue(), reduceBucket).skip(numBuckets);
        }
    };
    this.sourceStream = bucketedStream      
            .window(numBuckets, 1)          
            .flatMap(reduceWindowToSummary) 
            // ... code omitted ...
}

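As with the cumulative counter, the parent's bucket stream is processed further, this time with window(). The first argument is the number of buckets per window and the second is how many buckets the window advances each time. numBuckets here is the number of buckets in our sliding window.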
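The combination of window(numBuckets, 1) with scan(...).skip(numBuckets) inside each window can be pictured on an array: every run of numBuckets consecutive buckets yields exactly one summary, and trailing windows with fewer than numBuckets buckets yield nothing, because their scan output is too short to survive the skip. The sketch below is a plain-Java illustration with a hypothetical SlidingWindowSketch class and each bucket reduced to a single long.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical array-based picture of a count-based sliding window that advances by one. */
final class SlidingWindowSketch {
    /** Sums every full window of `size` consecutive buckets, moving one bucket at a time. */
    static List<Long> rollingSums(long[] buckets, int size) {
        List<Long> sums = new ArrayList<>();
        // Only start positions that leave room for a full window are considered.
        for (int start = 0; start + size <= buckets.length; start++) {
            long sum = 0;
            for (int i = start; i < start + size; i++) {
                sum += buckets[i];
            }
            sums.add(sum);
        }
        return sums;
    }
}
```

For buckets 1, 2, 3, 4 and a window of 3, the result is 6 then 9: the oldest bucket leaves the sum as the newest one enters.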

Figure: sliding window

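The first row of the figure can be read as the sliding window before it moves. After the function in flatMap has run, the window advances by one bucket: the bucket 23 5 2 0 is dropped and the newest bucket 45 6 2 0 enters.
How is the data inside each window processed? That is the job of the function in flatMap, reduceWindowToSummary, whose reduceBucket step is supplied by the concrete stream subclass. Let's look at HealthCountsStream: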

private static final Func2<HystrixCommandMetrics.HealthCounts, long[], HystrixCommandMetrics.HealthCounts> healthCheckAccumulator = new Func2<HystrixCommandMetrics.HealthCounts, long[], HystrixCommandMetrics.HealthCounts>() {
    @Override
    public HystrixCommandMetrics.HealthCounts call(HystrixCommandMetrics.HealthCounts healthCounts, long[] bucketEventCounts) {
        return healthCounts.plus(bucketEventCounts);
    }
};

//HystrixCommandMetrics.HealthCounts#plus
public HealthCounts plus(long[] eventTypeCounts) {
    long updatedTotalCount = totalCount;
    long updatedErrorCount = errorCount;

    long successCount = eventTypeCounts[HystrixEventType.SUCCESS.ordinal()];
    long failureCount = eventTypeCounts[HystrixEventType.FAILURE.ordinal()];
    long timeoutCount = eventTypeCounts[HystrixEventType.TIMEOUT.ordinal()];
    long threadPoolRejectedCount = eventTypeCounts[HystrixEventType.THREAD_POOL_REJECTED.ordinal()];
    long semaphoreRejectedCount = eventTypeCounts[HystrixEventType.SEMAPHORE_REJECTED.ordinal()];

    updatedTotalCount += (successCount + failureCount + timeoutCount + threadPoolRejectedCount + semaphoreRejectedCount);
    updatedErrorCount += (failureCount + timeoutCount + threadPoolRejectedCount + semaphoreRejectedCount);
    return new HealthCounts(updatedTotalCount, updatedErrorCount);
}

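The implementation is straightforward: it adds the successes, failures, timeouts, thread-pool rejections and semaphore rejections to the total count, and everything except the successes to the error count.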

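This stream's job is to probe the availability of the service, and it is the data source the Hystrix circuit breaker relies on when deciding whether to trip.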
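To see how such counts turn into a health signal, here is a small immutable sketch modelled on the plus method above. HealthCountsSketch is hypothetical, it takes the five counts as separate parameters instead of an array, and errorPercentage() is a helper added for illustration; how the circuit breaker actually consumes the counts is outside this excerpt.

```java
/** Hypothetical immutable health summary modelled on the plus() method above. */
final class HealthCountsSketch {
    final long totalCount;
    final long errorCount;

    HealthCountsSketch(long totalCount, long errorCount) {
        this.totalCount = totalCount;
        this.errorCount = errorCount;
    }

    /** Returns a new summary with one bucket's counts added; this instance is unchanged. */
    HealthCountsSketch plus(long success, long failure, long timeout,
                            long threadPoolRejected, long semaphoreRejected) {
        long errors = failure + timeout + threadPoolRejected + semaphoreRejected;
        return new HealthCountsSketch(totalCount + success + errors, errorCount + errors);
    }

    /** Illustration only: share of errors as a whole-number percentage, 0 when empty. */
    int errorPercentage() {
        return totalCount == 0 ? 0 : (int) (100 * errorCount / totalCount);
    }
}
```

Adding a bucket with 8 successes, 1 failure and 1 timeout to an empty summary gives 10 total and 2 errors, or 20 percent; adding a further bucket with 5 thread-pool and 5 semaphore rejections raises that to 12 errors out of 20, or 60 percent.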

4. Recap

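Hystrix's sliding-window design may be a little harder to follow than other designs, mainly because most of us do not know RxJava well enough. That is not a real obstacle, though: reading through it patiently a few times is enough.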

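This article went step by step from metric reporting to metric collection to lift the veil on how Hystrix gathers its metrics. To finish, a diagram borrowed from an expert summarizes the content of this article.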


Recommended articles in this series
Common Hystrix Features
Hystrix Execution Principles
How the Hystrix Circuit Breaker Works
How Hystrix Implements Timeouts
