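A Fun Fact About Java Thread Pools: This Is Not a Thread Pool

Posted by 等你歸去來 on 2021-02-20

  Writing high-performance, highly concurrent applications depends on many things: I/O, algorithms, asynchrony, language features, operating-system features, queues, memory, CPU, distributed design, networking, data structures, and high-performance components.

  But enough rambling.

  Back to the topic: thread pools. If multithreading is one of the sharpest tools for raising a system's concurrency, a thread pool is what makes that tool easier to control. Writing directly against the raw threading primitives takes considerable experience before you can cope with a complex environment. A thread pool spares you most of that: once you know how to use it, you can put multiple threads to work for you safely.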

 

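1. Common thread pool usage examples

  A thread pool may not be simple internally, but using one certainly is.

  There are many thread pool implementations, but this article covers only ThreadPoolExecutor, because it is the most widely used. ScheduledThreadPoolExecutor, the scheduling pool, is a separate story: it serves a different need, so the two are not directly comparable.

  Below are a few levels of usage that show how to get started with a thread pool quickly. (This is only a warm-up.)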

 

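1.1. Beginner-level thread pool

  At the beginner level, all you need is one utility class: Executors. It offers a number of static factory methods; pick one and you have a thread pool. For example:

// Create a thread pool with a fixed number of threads
Executors.newFixedThreadPool(8);
// Create a thread pool that adds threads on demand, with no upper bound
Executors.newCachedThreadPool();
// Create a thread pool for scheduled (delayed or periodic) tasks
Executors.newScheduledThreadPool(2);
// There is also a single-thread variant, which works the same way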

  With a pool created by any of these methods, just call its execute() or submit() method and you are doing multithreaded programming. Simple as that!
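  As a minimal sketch of that workflow (the class name BeginnerPoolDemo and the sum-of-squares workload are made up for illustration), the following submits a few Callable tasks to a fixed pool and collects the results through Future.get():

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BeginnerPoolDemo {

    // Sums the squares of 1..n, computing each square on a pool thread.
    static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int x = i;
                // The lambda returns a value, so it is submitted as a Callable<Integer>.
                futures.add(pool.submit(() -> x * x));
            }
            int sum = 0;
            for (Future<Integer> f : futures) {
                sum += f.get(); // blocks until that task has finished
            }
            return sum;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown(); // let the worker threads exit
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // prints 385
    }
}
```

  Note the shutdown() in the finally block: without it, the pool's non-daemon worker threads keep the JVM alive.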

 

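1.2. Intermediate-level thread pool

  By "intermediate" I simply mean that you no longer rely on the ultra-simple factories above: you already know about ThreadPoolExecutor, whatever led you to it.

// Customize the pool parameters: corePoolSize 4, maximumPoolSize 20, keepAliveTime 20 ms.
// Note: with an unbounded LinkedBlockingQueue the pool never grows past corePoolSize,
// so maximumPoolSize has no effect in this configuration.
ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(4, 20, 20, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());

  I won't go through each parameter here; this is not a beginner's tutorial. The point is that reaching for this class means you are starting to know your way around.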
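  One consequence of the constructor call above is easy to miss, so here is a small sketch that checks it (the class UnboundedQueueDemo and its observe() helper are hypothetical). ThreadPoolExecutor only creates threads beyond corePoolSize when the queue refuses a task, and an unbounded LinkedBlockingQueue never refuses, so the pool should stay at 4 threads however many tasks are pending:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class UnboundedQueueDemo {

    // Submits `tasks` blocking tasks and returns {poolSize, queuedTasks}
    // as observed while all of them are still pending.
    static int[] observe(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 20, 20, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // hold the worker so later tasks pile up
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return new int[] { pool.getPoolSize(), pool.getQueue().size() };
        } finally {
            release.countDown(); // let every task finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] r = observe(50);
        System.out.println("poolSize=" + r[0] + ", queued=" + r[1]); // poolSize=4, queued=46
    }
}
```

  If the intent is to scale up to 20 threads under load, a bounded queue (or a SynchronousQueue) is needed so that the pool has a reason to add threads.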

 

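1.3. Advanced-level thread pool

  At this level there is no single recipe to show.

  But it probably means that you know your pool's workload, you understand the hardware it runs on, you name your pool's threads, you bound the queue size, you think about context switching, thread safety, and lock performance, and you might even build your own pool...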
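  A few of those habits can be sketched in code. The example below is only one possible shape, not a recommended configuration: the helper names named() and newPool(), the pool size, and the queue capacity are all assumptions chosen for illustration. It names the threads, bounds the queue, and picks an explicit rejection policy:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class TunedPoolDemo {

    // Hypothetical helper: names threads "<prefix>-1", "<prefix>-2", ...
    // so they are easy to spot in logs and thread dumps.
    static ThreadFactory named(String prefix) {
        AtomicInteger seq = new AtomicInteger();
        return r -> new Thread(r, prefix + "-" + seq.incrementAndGet());
    }

    // Fixed-size pool with a bounded queue and an explicit rejection policy.
    static ThreadPoolExecutor newPool(String name, int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),    // bounded: memory use stays predictable
                named(name),
                new ThreadPoolExecutor.CallerRunsPolicy()); // when full, the submitter runs the task itself
    }

    // Returns the name of the thread that ran a single submitted task.
    static String firstWorkerName(String poolName) {
        ThreadPoolExecutor pool = newPool(poolName, 2, 10);
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstWorkerName("test-pool")); // prints test-pool-1
    }
}
```

  CallerRunsPolicy acts as simple back-pressure: when the queue is full, the producer slows down because it has to do the work itself.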

 

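2. This is not a thread pool

  We usually think of a thread pool as a place where many tasks run at the same time. Sometimes, though, a thread pool does not behave like one at all; it behaves like a single thread. Here is a simple, concrete usage scenario:

    // Initialize the thread pool
    // (NamedThreadFactory is not a JDK class; it comes from a third-party library or the project itself)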
    private ExecutorService executor
            = new ThreadPoolExecutor(Runtime.getRuntime().availableProcessors(),
                Runtime.getRuntime().availableProcessors(),
                0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(50),
                new NamedThreadFactory("test-pool"),
                new ThreadPoolExecutor.CallerRunsPolicy());
    // Process the tasks with the thread pool
    public Integer doTask(String updateIntervalDesc) throws Exception {
        long startTime = System.currentTimeMillis();
        List<TestDto> testList;
        AtomicInteger affectNum = new AtomicInteger(0);
        int pageSize = 1000;
        AtomicInteger pageNo = new AtomicInteger(1);
        Map<String, Object> condGroupLabel = new HashMap<>();
        log.info("start do sth:{}", updateIntervalDesc);
        List<Future<?>> futureList = new ArrayList<>();
        do {
            PageHelper.startPage(pageNo.getAndIncrement(), pageSize);
            List<TestDto> list
                    = testDao.getLabelListNew(condGroupLabel);
            testList = list;
            // Submit one task to the pool per record
            for (TestDto s : list) {
                Future<?> future = executor.submit(() -> {
                    try {
                        // do sth...
                        affectNum.incrementAndGet();
                    }
                    catch (Throwable e) {
                        log.error("error:{}", pageNo.get(), e);
                    }
                });
                futureList.add(future);
            }
        } while (testList.size() >= pageSize);
        // Wait for all tasks to complete
        int i = 0;
        for (Future<?> future : futureList) {
            future.get();
            log.info("done:+{} ", i++);
        }
        log.info("doTask done:{}, num:{}, cost:{}ms",
                updateIntervalDesc, affectNum.get(), System.currentTimeMillis() - startTime);
        return affectNum.get();
    }

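  The job is straightforward: read a large number of tasks from the database and run them in the thread pool. Since each task involves I/O such as database access, using multiple threads is perfectly reasonable.

  One situation can upset this balance, however: each task finishes very quickly, so quickly that it is already done before the next one has even been submitted. You may then see something odd: only one thread ever executes tasks. That is not what a thread pool is supposed to do; it looks like a single-threaded job.

  Naturally we start to suspect things. Is some thread blocked? Is thread scheduling unfair? Is the queue type wrong? Did we hit a JDK bug? Is the pool failing to use all of its threads? And so on...

  None of these is the cause. The real reason is that, once the core threads have been started, submitting a task merely adds it to the pool's work queue (the ArrayBlockingQueue above), and the workers in the pool take tasks from that queue and run them. When the queue rarely holds more than one task at a time, only some of the workers are ever busy. As for why the same thread keeps running the tasks, that is probably down to how waiting threads get scheduled. More precisely, it depends on the fairness of ArrayBlockingQueue.take() (or the timed poll() when idle workers are allowed to time out), which in turn depends on the ReentrantLock and Condition (await/signal) that the queue is built on, not on Object.wait/notify. That lock is non-fair by default, so a worker that has just finished a task can re-acquire the lock and take the next task ahead of a worker that was signalled but has not yet run.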
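  The "once the core threads have been started" part can be checked directly. The ThreadPoolExecutor documentation states that while fewer than corePoolSize threads are running, a new thread is created for each submitted task even if other workers are idle. The sketch below (CoreThreadStartDemo is a hypothetical name) submits tasks strictly one after another and records the pool size each time:

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class CoreThreadStartDemo {

    // Submits empty tasks one at a time, waiting for each to finish,
    // and records the pool size after every submission.
    static int[] poolSizesAfterEachSubmit(int coreSize, int submissions) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreSize, coreSize, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(50));
        int[] sizes = new int[submissions];
        try {
            for (int i = 0; i < submissions; i++) {
                pool.submit(() -> { }).get(); // the task is done before the next submit
                sizes[i] = pool.getPoolSize();
            }
            return sizes;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints [1, 2, 3, 4, 4, 4]
        System.out.println(Arrays.toString(poolSizesAfterEachSubmit(4, 6)));
    }
}
```

  So the first corePoolSize tasks each get a fresh thread; only the tasks after that go through the queue, and that is where the "one busy worker" effect shows up.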

 

  Here is how a pool worker operates:

    // java.util.concurrent.ThreadPoolExecutor#runWorker
    /**
     * Main worker run loop.  Repeatedly gets tasks from queue and
     * executes them, while coping with a number of issues:
     *
     * 1. We may start out with an initial task, in which case we
     * don't need to get the first one. Otherwise, as long as pool is
     * running, we get tasks from getTask. If it returns null then the
     * worker exits due to changed pool state or configuration
     * parameters.  Other exits result from exception throws in
     * external code, in which case completedAbruptly holds, which
     * usually leads processWorkerExit to replace this thread.
     *
     * 2. Before running any task, the lock is acquired to prevent
     * other pool interrupts while the task is executing, and then we
     * ensure that unless pool is stopping, this thread does not have
     * its interrupt set.
     *
     * 3. Each task run is preceded by a call to beforeExecute, which
     * might throw an exception, in which case we cause thread to die
     * (breaking loop with completedAbruptly true) without processing
     * the task.
     *
     * 4. Assuming beforeExecute completes normally, we run the task,
     * gathering any of its thrown exceptions to send to afterExecute.
     * We separately handle RuntimeException, Error (both of which the
     * specs guarantee that we trap) and arbitrary Throwables.
     * Because we cannot rethrow Throwables within Runnable.run, we
     * wrap them within Errors on the way out (to the thread's
     * UncaughtExceptionHandler).  Any thrown exception also
     * conservatively causes thread to die.
     *
     * 5. After task.run completes, we call afterExecute, which may
     * also throw an exception, which will also cause thread to
     * die. According to JLS Sec 14.20, this exception is the one that
     * will be in effect even if task.run throws.
     *
     * The net effect of the exception mechanics is that afterExecute
     * and the thread's UncaughtExceptionHandler have as accurate
     * information as we can provide about any problems encountered by
     * user code.
     *
     * @param w the worker
     */
    final void runWorker(Worker w) {
        Thread wt = Thread.currentThread();
        Runnable task = w.firstTask;
        w.firstTask = null;
        w.unlock(); // allow interrupts
        boolean completedAbruptly = true;
        try {
            // The worker keeps fetching tasks from the queue and running them.
            // The fetch may block indefinitely or time out, depending on the pool's sizing configuration.
            while (task != null || (task = getTask()) != null) {
                w.lock();
                // If pool is stopping, ensure thread is interrupted;
                // if not, ensure thread is not interrupted.  This
                // requires a recheck in second case to deal with
                // shutdownNow race while clearing interrupt
                if ((runStateAtLeast(ctl.get(), STOP) ||
                     (Thread.interrupted() &&
                      runStateAtLeast(ctl.get(), STOP))) &&
                    !wt.isInterrupted())
                    wt.interrupt();
                try {
                    beforeExecute(wt, task);
                    Throwable thrown = null;
                    try {
                        task.run();
                    } catch (RuntimeException x) {
                        thrown = x; throw x;
                    } catch (Error x) {
                        thrown = x; throw x;
                    } catch (Throwable x) {
                        thrown = x; throw new Error(x);
                    } finally {
                        afterExecute(task, thrown);
                    }
                } finally {
                    task = null;
                    w.completedTasks++;
                    w.unlock();
                }
            }
            completedAbruptly = false;
        } finally {
            processWorkerExit(w, completedAbruptly);
        }
    }
    /**
     * Performs blocking or timed wait for a task, depending on
     * current configuration settings, or returns null if this worker
     * must exit because of any of:
     * 1. There are more than maximumPoolSize workers (due to
     *    a call to setMaximumPoolSize).
     * 2. The pool is stopped.
     * 3. The pool is shutdown and the queue is empty.
     * 4. This worker timed out waiting for a task, and timed-out
     *    workers are subject to termination (that is,
     *    {@code allowCoreThreadTimeOut || workerCount > corePoolSize})
     *    both before and after the timed wait, and if the queue is
     *    non-empty, this worker is not the last thread in the pool.
     *
     * @return task, or null if the worker must exit, in which case
     *         workerCount is decremented
     */
    private Runnable getTask() {
        boolean timedOut = false; // Did the last poll() time out?

        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
                decrementWorkerCount();
                return null;
            }

            int wc = workerCountOf(c);

            // Are workers subject to culling?
            boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

            if ((wc > maximumPoolSize || (timed && timedOut))
                && (wc > 1 || workQueue.isEmpty())) {
                if (compareAndDecrementWorkerCount(c))
                    return null;
                continue;
            }

            try {
                // Either the timed poll() or the blocking take() is called.
                // In a fixed-size pool, the blocking take() is used.
                Runnable r = timed ?
                    workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                    workQueue.take();
                if (r != null)
                    return r;
                timedOut = true;
            } catch (InterruptedException retry) {
                timedOut = false;
            }
        }
    }

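  In short, each pool worker keeps taking tasks from the queue and running them. The pool itself enqueues with offer() rather than put(), but both end up in enqueue() under the same lock. Taking from and adding to the queue are guarded by a single ReentrantLock with two conditions, notEmpty and notFull, not by a pair of read-write locks: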

    // java.util.concurrent.ArrayBlockingQueue#take
    public E take() throws InterruptedException {
        final ReentrantLock lock = this.lock;
        // This lock makes the operation thread-safe
        lock.lockInterruptibly();
        try {
            while (count == 0)
                // Release the lock while waiting; once signalled, the thread must re-acquire the lock before continuing
                notEmpty.await();
            return dequeue();
        } finally {
            lock.unlock();
        }
    }
    // java.util.concurrent.ArrayBlockingQueue#put
    /**
     * Inserts the specified element at the tail of this queue, waiting
     * for space to become available if the queue is full.
     *
     * @throws InterruptedException {@inheritDoc}
     * @throws NullPointerException {@inheritDoc}
     */
    public void put(E e) throws InterruptedException {
        checkNotNull(e);
        final ReentrantLock lock = this.lock;
        lock.lockInterruptibly();
        try {
            while (count == items.length)
                notFull.await();
            enqueue(e);
        } finally {
            lock.unlock();
        }
    }
    /**
     * Inserts element at current put position, advances, and signals.
     * Call only when holding lock.
     */
    private void enqueue(E x) {
        // assert lock.getHoldCount() == 1;
        // assert items[putIndex] == null;
        final Object[] items = this.items;
        items[putIndex] = x;
        if (++putIndex == items.length)
            putIndex = 0;
        count++;
        // Signal one thread that is waiting to take, waking it up
        notEmpty.signal();
    }

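  So which worker gets a task comes down to which one wins the lock. That in turn probably involves how the JVM and the operating system schedule threads. (I am not certain, but it looks that way.) At least from the outside, the observed behavior is that one thread keeps winning all the tasks.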
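  To see how unevenly the tasks are spread on a given machine, one can simply count executions per thread. The sketch below is an experiment, not a proof: the class WorkerDistributionDemo is hypothetical, the tasks are deliberately trivial, and the numbers vary between runs, JDK versions, and hardware, so no expected output is given. The fairQueue flag uses the ArrayBlockingQueue(int capacity, boolean fair) constructor, which makes the internal lock fair:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

class WorkerDistributionDemo {

    // Runs `tasks` trivial tasks and returns how many each thread executed.
    static Map<String, Long> run(int threads, int tasks, boolean fairQueue) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(50, fairQueue),
                Executors.defaultThreadFactory(),
                new ThreadPoolExecutor.CallerRunsPolicy());
        ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                // Each task only records which thread ran it.
                futures.add(pool.submit(() -> counts
                        .computeIfAbsent(Thread.currentThread().getName(), k -> new LongAdder())
                        .increment()));
            }
            for (Future<?> f : futures) {
                f.get();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        Map<String, Long> result = new TreeMap<>();
        counts.forEach((name, n) -> result.put(name, n.sum()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println("non-fair: " + run(4, 100_000, false));
        System.out.println("fair:     " + run(4, 100_000, true));
    }
}
```

  Because of CallerRunsPolicy, the submitting thread may also appear in the map whenever the queue was full. A fair lock usually evens out the distribution, but at the cost of lower throughput, so it is a diagnostic aid rather than a fix.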

 

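3. Back to the thread pool

  A thread pool exists to handle asynchronous work, or to run many unrelated tasks concurrently, and so to take load off the system. When submitting a task costs more than running it, multithreading is pointless; arguably it is a misuse. We should give the pool heavier work, not featherweight tasks. The code above is bound to perform poorly, but a small change can make a real difference.

    // Initialize the thread pool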
    private ExecutorService executor
            = new ThreadPoolExecutor(Runtime.getRuntime().availableProcessors(),
                Runtime.getRuntime().availableProcessors(),
                0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(50),
                new NamedThreadFactory("test-pool"),
                new ThreadPoolExecutor.CallerRunsPolicy());
    // Process the tasks with the thread pool
    public Integer doTask(String updateIntervalDesc) throws Exception {
        long startTime = System.currentTimeMillis();
        List<TestDto> testList;
        AtomicInteger affectNum = new AtomicInteger(0);
        int pageSize = 1000;
        AtomicInteger pageNo = new AtomicInteger(1);
        Map<String, Object> condGroupLabel = new HashMap<>();
        log.info("start do sth:{}", updateIntervalDesc);
        List<Future<?>> futureList = new ArrayList<>();
        do {
            PageHelper.startPage(pageNo.getAndIncrement(), pageSize);
            List<TestDto> list
                    = testDao.getLabelListNew(condGroupLabel);
            testList = list;
            // Submit only one task to the pool per batch (one page of records)
            Future<?> future = executor.submit(() -> {
                for (TestDto s : list) {
                    try {
                        // do sth...
                        affectNum.incrementAndGet();
                    }
                    catch (Throwable e) {
                        log.error("error:{}", pageNo.get(), e);
                    }
                }
            });
            futureList.add(future);
        } while (testList.size() >= pageSize);
        // Wait for all tasks to complete
        int i = 0;
        for (Future<?> future : futureList) {
            future.get();
            log.info("done:+{} ", i++);
        }
        log.info("doTask done:{}, num:{}, cost:{}ms",
                updateIntervalDesc, affectNum.get(), System.currentTimeMillis() - startTime);
        return affectNum.get();
    }

  In other words, make each submitted task heavy enough that the cost of submitting it becomes negligible. Only then does multithreading pay off.
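  The difference between the two submission styles can be tried out in isolation. In the sketch below everything is a stand-in: fakePages() replaces the paged database query, the per-record work is just a counter increment, and the class name BatchSubmitDemo is made up. Timings are printed but will differ from machine to machine, so treat them as a rough indication only:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

class BatchSubmitDemo {

    // Builds `pages` lists of `pageSize` integers (stand-in for paged query results).
    static List<List<Integer>> fakePages(int pages, int pageSize) {
        List<List<Integer>> all = new ArrayList<>();
        for (int p = 0; p < pages; p++) {
            List<Integer> page = new ArrayList<>();
            for (int i = 0; i < pageSize; i++) {
                page.add(p * pageSize + i);
            }
            all.add(page);
        }
        return all;
    }

    // One pool task per record: the submission cost is paid for every record.
    static long perRecord(List<List<Integer>> pages, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong processed = new AtomicLong();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (List<Integer> page : pages) {
                for (Integer item : page) {
                    futures.add(pool.submit(() -> { processed.incrementAndGet(); }));
                }
            }
            for (Future<?> f : futures) {
                f.get();
            }
            return processed.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    // One pool task per page: the submission cost is paid once per batch.
    static long perPage(List<List<Integer>> pages, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong processed = new AtomicLong();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (List<Integer> page : pages) {
                futures.add(pool.submit(() -> {
                    for (Integer item : page) {
                        processed.incrementAndGet();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get();
            }
            return processed.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> pages = fakePages(200, 1000);
        long t0 = System.nanoTime();
        long a = perRecord(pages, 4);
        long t1 = System.nanoTime();
        long b = perPage(pages, 4);
        long t2 = System.nanoTime();
        System.out.println("per record: " + a + " items, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("per page:   " + b + " items, " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

  Both variants process the same number of items; the per-page version simply creates far fewer Future objects and queue operations.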
