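How do consumers read data?
The previous post covered the producer side; this one covers the consumer side.
As we all know, a consumer simply keeps reading data from a queue and processing it. Unlike BlockingQueue, however, RingBuffer consumers never lock the queue. So how does that work?
In short, a consumer uses CAS to atomically claim a consumable sequence number, then fetches the data at that sequence number and processes it.
Before looking at the code, let's list the questions that come to mind:
1. A tail pointer is needed to track consumption progress.
2. How do we prevent one piece of data from being consumed by multiple consumers?
3. Consumers must not run ahead of the producer. How is that enforced?
4. What should a consumer do when there is no data to process: spin, or suspend and wait for the producer to wake it?
5. If the answer to 4 is to suspend, how is the consumer woken up to finish its thread task when the RingBuffer is shut down?
6. The Disruptor takes a thread factory when it is constructed. How does it use threads? Are multiple tasks scheduled on one thread?
7. When do consumers start?
Good, we have our questions. Now let's look at the code. Below is WorkProcessor, one implementation of EventProcessor.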
public final class WorkProcessor<T>
    implements EventProcessor
{
    private final AtomicBoolean running = new AtomicBoolean(false); // current state of this processor
    private final Sequence sequence = new Sequence(Sequencer.INITIAL_CURSOR_VALUE); // latest sequence already consumed
    private final RingBuffer<T> ringBuffer; // kept so events can be fetched
    private final SequenceBarrier sequenceBarrier; // waits for the highest available sequence; may be shared by several processors
    private final WorkHandler<? super T> workHandler; // the handler that actually processes events
    private final ExceptionHandler<? super T> exceptionHandler;
    private final Sequence workSequence; // shared by several processors; yields the next sequence to process

    private final EventReleaser eventReleaser = new EventReleaser()
    {
        @Override
        public void release()
        {
            sequence.set(Long.MAX_VALUE);
        }
    };

    private final TimeoutHandler timeoutHandler;

    /**
     * Construct a {@link WorkProcessor}.
     *
     * @param ringBuffer       to which events are published.
     * @param sequenceBarrier  on which it is waiting.
     * @param workHandler      is the delegate to which events are dispatched.
     * @param exceptionHandler to be called back when an error occurs
     * @param workSequence     from which to claim the next event to be worked on.  It should always be initialised
     *                         as {@link Sequencer#INITIAL_CURSOR_VALUE}
     */
    public WorkProcessor(
        final RingBuffer<T> ringBuffer,
        final SequenceBarrier sequenceBarrier,
        final WorkHandler<? super T> workHandler,
        final ExceptionHandler<? super T> exceptionHandler,
        final Sequence workSequence)
    {
        this.ringBuffer = ringBuffer;
        this.sequenceBarrier = sequenceBarrier;
        this.workHandler = workHandler;
        this.exceptionHandler = exceptionHandler;
        this.workSequence = workSequence;

        if (this.workHandler instanceof EventReleaseAware)
        {
            ((EventReleaseAware) this.workHandler).setEventReleaser(eventReleaser);
        }

        timeoutHandler = (workHandler instanceof TimeoutHandler) ? (TimeoutHandler) workHandler : null;
    }

    @Override
    public Sequence getSequence()
    {
        return sequence;
    }

    @Override
    public void halt()
    {
        running.set(false);
        // wake a processor thread blocked in the WaitStrategy so that it notices the "stopped" state
        sequenceBarrier.alert();
    }

    /**
     * remove workProcessor dynamic without message lost
     */
    public void haltLater()
    {
        // "later" means the processor exits only at its next check; if it is blocked in the
        // WaitStrategy, the check happens after that call returns
        running.set(false);
    }

    @Override
    public boolean isRunning()
    {
        return running.get();
    }

    /**
     * It is ok to have another thread re-run this method after a halt().
     *
     * @throws IllegalStateException if this processor is already running
     */
    @Override
    public void run()
    {
        // guard against problems caused by calling run() more than once
        if (!running.compareAndSet(false, true))
        {
            throw new IllegalStateException("Thread is already running");
        }
        sequenceBarrier.clearAlert();

        notifyStart();

        boolean processedSequence = true;
        long cachedAvailableSequence = Long.MIN_VALUE;
        long nextSequence = sequence.get();
        T event = null;
        while (true) // loop until halted
        {
            try
            {
                // if previous sequence was processed - fetch the next sequence and set
                // that we have successfully processed the previous sequence
                // typically, this will be true
                // this prevents the sequence getting too far forward if an exception
                // is thrown from the WorkHandler
                if (processedSequence)
                {
                    // if a halt was requested, wake the other processor threads on the same barrier
                    if (!running.get())
                    {
                        sequenceBarrier.alert();      // wake the other threads
                        sequenceBarrier.checkAlert(); // throws AlertException, which ends this thread
                    }
                    processedSequence = false;
                    do
                    {
                        // workSequence may be shared with other processors
                        nextSequence = workSequence.get() + 1L;
                        // sequence is what this processor has finished processing. The producer
                        // reads it as the tail pointer, i.e. it is the gatingSequence.
                        // Claiming a new sequence means the previous one has been processed.
                        sequence.set(nextSequence - 1L);
                    }
                    // workSequence may be contended by several processors, hence the CAS
                    while (!workSequence.compareAndSet(nextSequence - 1L, nextSequence));
                }

                // the claimed sequence does not exceed the producer's cursor, so the data is readable
                if (cachedAvailableSequence >= nextSequence)
                {
                    // fetch the event stored at this sequence
                    event = ringBuffer.get(nextSequence);
                    // hand it to the handler
                    workHandler.onEvent(event);
                    processedSequence = true;
                }
                else
                {
                    // block until the claimed sequence becomes available:
                    // returns nextSequence if that is the highest available sequence,
                    // otherwise returns the highest available sequence beyond it
                    cachedAvailableSequence = sequenceBarrier.waitFor(nextSequence);
                }
            }
            catch (final TimeoutException e)
            {
                notifyTimeout(sequence.get());
            }
            catch (final AlertException ex) // thrown by checkAlert()
            {
                // if the processor has been halted, leave the loop and let the thread task end
                if (!running.get())
                {
                    break;
                }
            }
            catch (final Throwable ex) // any other exception goes to the exception handler
            {
                // handle, mark as processed, unless the exception handler threw an exception
                exceptionHandler.handleEventException(ex, nextSequence, event);
                processedSequence = true;
            }
        }

        notifyShutdown();

        running.set(false);
    }

    private void notifyTimeout(final long availableSequence)
    {
        try
        {
            if (timeoutHandler != null)
            {
                timeoutHandler.onTimeout(availableSequence);
            }
        }
        catch (Throwable e)
        {
            exceptionHandler.handleEventException(e, availableSequence, null);
        }
    }

    private void notifyStart()
    {
        if (workHandler instanceof LifecycleAware)
        {
            try
            {
                ((LifecycleAware) workHandler).onStart();
            }
            catch (final Throwable ex)
            {
                exceptionHandler.handleOnStartException(ex);
            }
        }
    }

    private void notifyShutdown()
    {
        if (workHandler instanceof LifecycleAware)
        {
            try
            {
                ((LifecycleAware) workHandler).onShutdown();
            }
            catch (final Throwable ex)
            {
                exceptionHandler.handleOnShutdownException(ex);
            }
        }
    }
}
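Question 1: a tail pointer is needed to track consumption progress
You may have noticed two Sequences in the code, workSequence and sequence. Why two?
workSequence holds the latest sequence number claimed by a consumer (the data at that sequence has not been processed yet; a consumer has only marked it as its own to consume). sequence, by contrast, points at data that has already been consumed, and it is exactly the gatingSequence from the previous post.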
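To make the role of sequence as a gating sequence concrete, here is a minimal sketch of the check a producer has to make before reusing a slot. It is a simplified model, not Disruptor's own code: the class GatingDemo and its methods are hypothetical names, and plain long values stand in for each consumer's Sequence.

```java
final class GatingDemo {
    /** Smallest value among all consumers' "already consumed" sequences. */
    public static long minimumSequence(long[] consumerSequences) {
        long min = Long.MAX_VALUE;
        for (long s : consumerSequences) {
            min = Math.min(min, s);
        }
        return min;
    }

    /**
     * Can the producer write sequence `next` without overwriting unconsumed data?
     * Writing `next` reuses the slot that held `next - bufferSize`, so that older
     * sequence must already have been consumed by every consumer.
     */
    public static boolean canPublish(long next, int bufferSize, long[] consumerSequences) {
        long wrapPoint = next - bufferSize;
        return wrapPoint <= minimumSequence(consumerSequences);
    }

    public static void main(String[] args) {
        long[] consumed = {3L, 5L}; // hypothetical per-consumer `sequence` values
        System.out.println(canPublish(11L, 8, consumed)); // true: sequence 3 has been consumed
        System.out.println(canPublish(12L, 8, consumed)); // false: sequence 4 is still pending
    }
}
```

This is also why the gate has to follow sequence rather than workSequence: a claimed but unprocessed slot must not be overwritten.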
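Question 2: how do we prevent one piece of data from being consumed by multiple consumers?
The answer is WorkerPool: multiple WorkProcessors share a single workSequence, so they compete for sequence numbers through CAS, and each sequence number can be claimed, and therefore consumed, only once.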
public final class WorkerPool<T>
{
    private final AtomicBoolean started = new AtomicBoolean(false);
    private final Sequence workSequence = new Sequence(Sequencer.INITIAL_CURSOR_VALUE); // starts at -1
    private final RingBuffer<T> ringBuffer; // RingBuffer reference, used to construct processors and fetch data
    // WorkProcessors are created to wrap each of the provided WorkHandlers
    private final WorkProcessor<?>[] workProcessors;

    //...

    public WorkerPool(
        final RingBuffer<T> ringBuffer,
        final SequenceBarrier sequenceBarrier,
        final ExceptionHandler<? super T> exceptionHandler,
        final WorkHandler<? super T>... workHandlers)
    {
        this.ringBuffer = ringBuffer;
        final int numWorkers = workHandlers.length;
        workProcessors = new WorkProcessor[numWorkers];

        // one processor per handler
        for (int i = 0; i < numWorkers; i++)
        {
            workProcessors[i] = new WorkProcessor<>(
                ringBuffer,
                sequenceBarrier, // every processor shares the same sequenceBarrier
                workHandlers[i],
                exceptionHandler,
                workSequence);   // every processor shares the same workSequence
        }
    }

    //...
}

public class Disruptor<T>
{
    private final RingBuffer<T> ringBuffer;
    private final Executor executor;
    private final ConsumerRepository<T> consumerRepository = new ConsumerRepository<>();
    private final AtomicBoolean started = new AtomicBoolean(false);
    private ExceptionHandler<? super T> exceptionHandler = new ExceptionHandlerWrapper<>();

    //...

    public final EventHandlerGroup<T> handleEventsWithWorkerPool(final WorkHandler<T>... workHandlers)
    {
        return createWorkerPool(new Sequence[0], workHandlers);
    }

    EventHandlerGroup<T> createWorkerPool(
        final Sequence[] barrierSequences, final WorkHandler<? super T>[] workHandlers)
    {
        final SequenceBarrier sequenceBarrier = ringBuffer.newBarrier(barrierSequences);
        final WorkerPool<T> workerPool = new WorkerPool<>(ringBuffer, sequenceBarrier, exceptionHandler, workHandlers);

        // store the worker pool in the repository; on start-up it is taken out again and handed to the Executor
        consumerRepository.add(workerPool, sequenceBarrier);

        final Sequence[] workerSequences = workerPool.getWorkerSequences();

        updateGatingSequencesForNextInChain(barrierSequences, workerSequences);

        return new EventHandlerGroup<>(this, consumerRepository, workerSequences);
    }

    //....
}
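The claim loop that the workers run against the shared workSequence can be tried in isolation. The sketch below is an assumption-laden model rather than the library's code: SharedClaimDemo is a hypothetical class, and java.util.concurrent.atomic.AtomicLong stands in for Disruptor's Sequence. It counts how often each sequence number is claimed when several threads compete.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

final class SharedClaimDemo {
    /**
     * Runs `workers` threads that compete for sequences 0..total-1 through one
     * shared counter, and returns how many times each sequence was claimed.
     */
    public static ConcurrentMap<Long, AtomicInteger> run(int workers, long total) {
        AtomicLong workSequence = new AtomicLong(-1L); // shared; starts at -1 like workSequence above
        ConcurrentMap<Long, AtomicInteger> claims = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                while (true) {
                    long next;
                    do {
                        next = workSequence.get() + 1L;
                        if (next >= total) {
                            return; // nothing left to claim
                        }
                        // the CAS succeeds for exactly one thread per value of `next`
                    } while (!workSequence.compareAndSet(next - 1L, next));
                    claims.computeIfAbsent(next, k -> new AtomicInteger()).incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return claims;
    }

    public static void main(String[] args) {
        ConcurrentMap<Long, AtomicInteger> claims = run(4, 10_000L);
        boolean allOnce = claims.values().stream().allMatch(c -> c.get() == 1);
        System.out.println(claims.size() + " sequences claimed, each exactly once: " + allOnce);
    }
}
```

A thread that loses the CAS simply re-reads the counter and tries for the next value, so no lock is needed to keep claims unique.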
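Questions 3 and 4: consumers must not run ahead of the producer, so how is that enforced? And what should a consumer do when there is no data to process: spin, or suspend and wait for the producer to wake it?
The answer is SequenceBarrier. From the WorkProcessor code we can see that a consumer caches the highest available sequence it last obtained, and any sequence within that range can be competed for directly. Only when the claimed sequence exceeds the cached value does the consumer call sequenceBarrier.waitFor(), which makes the WaitStrategy wait.
There are many wait strategies. A common one is BlockingWaitStrategy, which suspends the calling thread. When a producer calls publishEvent, WaitStrategy#signalAllWhenBlocking() is invoked to wake all waiting threads.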
final class ProcessingSequenceBarrier implements SequenceBarrier
{
    private final WaitStrategy waitStrategy;
    private final Sequence dependentSequence;
    private volatile boolean alerted = false;
    private final Sequence cursorSequence;
    private final Sequencer sequencer;

    //...

    public long waitFor(final long sequence)
        throws AlertException, InterruptedException, TimeoutException
    {
        checkAlert();

        long availableSequence = waitStrategy.waitFor(sequence, cursorSequence, dependentSequence, this);

        if (availableSequence < sequence)
        {
            return availableSequence;
        }

        return sequencer.getHighestPublishedSequence(sequence, availableSequence);
    }

    //...
}
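The suspend-and-wake behaviour, including how halt() gets a blocked consumer out through alert(), can be modelled with a lock and a condition. This is only a sketch of the idea, not BlockingWaitStrategy itself: SimpleBlockingBarrier, AlertedException and demo() are hypothetical names, and the barrier here folds the cursor, the wait strategy and the alert flag into one class for brevity.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class SimpleBlockingBarrier {
    /** Stand-in for Disruptor's AlertException (hypothetical, unchecked for brevity). */
    static final class AlertedException extends RuntimeException { }

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private long cursor = -1L;       // highest published sequence, guarded by lock
    private boolean alerted = false; // set on shutdown, guarded by lock

    /** Consumer side: suspend until `sequence` is published, then return the highest published sequence. */
    public long waitFor(long sequence) {
        lock.lock();
        try {
            while (cursor < sequence) {
                if (alerted) {
                    throw new AlertedException(); // shutdown requested while waiting
                }
                changed.awaitUninterruptibly();
            }
            return cursor;
        } finally {
            lock.unlock();
        }
    }

    /** Producer side: advance the cursor and wake every waiting consumer. */
    public void publish(long sequence) {
        lock.lock();
        try {
            cursor = sequence;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Shutdown side: wake waiting consumers so they can observe the alert. */
    public void alert() {
        lock.lock();
        try {
            alerted = true;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    /** One consumer is released by a publish, another by an alert. */
    public static String demo() {
        SimpleBlockingBarrier barrier = new SimpleBlockingBarrier();
        String[] result = new String[2];
        Thread first = new Thread(() -> result[0] = String.valueOf(barrier.waitFor(3L)));
        first.start();
        barrier.publish(5L); // one publish may make several sequences available at once
        join(first);
        Thread second = new Thread(() -> {
            try {
                barrier.waitFor(10L);
                result[1] = "returned";
            } catch (AlertedException e) {
                result[1] = "alerted";
            }
        });
        second.start();
        barrier.alert();
        join(second);
        return result[0] + "," + result[1];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 5,alerted
    }
}
```

The first consumer asked for sequence 3 and got 5 back, which is the value WorkProcessor would keep in cachedAvailableSequence so that sequences 4 and 5 can be claimed without waiting again.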
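Questions 6 and 7: the Disruptor takes a thread factory when it is constructed, so how does it use threads? Are multiple tasks scheduled on one thread? And when do consumers start?
Consumers start together with the Disruptor. On start-up the Disruptor takes each consumer out of the ConsumerRepository and submits it to the Executor for execution.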
public RingBuffer<T> start()
{
    checkOnlyStartedOnce();
    for (final ConsumerInfo consumerInfo : consumerRepository)
    {
        consumerInfo.start(executor);
    }

    return ringBuffer;
}
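Note that newer versions of Disruptor discourage passing in an external Executor. Instead you pass only a ThreadFactory, and an Executor is built internally: BasicExecutor. Its implementation creates a new thread for every submitted task, so the threading model is one thread per consumer.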
public class Disruptor<T>
{
    //...

    public Disruptor(final EventFactory<T> eventFactory, final int ringBufferSize, final ThreadFactory threadFactory)
    {
        this(RingBuffer.createMultiProducer(eventFactory, ringBufferSize), new BasicExecutor(threadFactory));
    }

    //...
}

public class BasicExecutor implements Executor
{
    private final ThreadFactory factory;
    private final Queue<Thread> threads = new ConcurrentLinkedQueue<>();

    public BasicExecutor(ThreadFactory factory)
    {
        this.factory = factory;
    }

    @Override
    public void execute(Runnable command)
    {
        // every submitted task gets a newly created thread of its own
        final Thread thread = factory.newThread(command);
        if (null == thread)
        {
            throw new RuntimeException("Failed to create thread to run: " + command);
        }

        thread.start();

        threads.add(thread);
    }

    //...
}
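The one-thread-per-consumer model is easy to observe with a stripped-down executor of the same shape. The sketch below is an illustration under assumptions, not the library class: ThreadPerTaskDemo and its methods are hypothetical, and short tasks stand in for the long-running run() loops of real processors.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadFactory;

final class ThreadPerTaskDemo {
    /** Same idea as BasicExecutor: every submitted task gets a brand-new thread. */
    public static Executor threadPerTask(ThreadFactory factory) {
        return command -> factory.newThread(command).start();
    }

    /** Submits `consumers` tasks and returns how many distinct threads ran them. */
    public static int distinctThreads(int consumers) {
        Set<Thread> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(consumers);
        Executor executor = threadPerTask(Thread::new); // Thread::new serves as the ThreadFactory
        for (int i = 0; i < consumers; i++) {
            executor.execute(() -> {
                seen.add(Thread.currentThread()); // record which thread ran this task
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctThreads(3)); // 3
    }
}
```

Because a processor's run() loops until it is halted, it never hands its thread back, which is why a fixed-size pool smaller than the number of consumers would leave some consumers never started.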
//TODO: add some diagrams later to make this easier to follow