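Writing a high-performance, highly concurrent application depends on many things: I/O, algorithms, asynchrony, language features, operating system features, queues, memory, CPU, distributed design, networking, data structures, and high-performance components.
Enough rambling.
Back to the topic: thread pools. If multithreading is one of the sharpest tools for raising a system's concurrency, a thread pool is what makes that tool easier to control. Writing everything with raw threading primitives takes considerable experience before you can handle complex environments. A thread pool spares you that: you only need to know how to use it, and multithreading works for you safely.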
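1. Common thread pool usage examples
The thread pool itself may not be simple, but using it certainly is.
There are many thread pool implementations, but we will only discuss ThreadPoolExecutor, since it is by far the most widely used. The scheduled pool, ScheduledThreadPoolExecutor, is a separate matter: it serves a different need and is not comparable.
Below I go through a few levels of usage to show how to get started with a thread pool quickly. (Just going through the motions, nothing more.)
1.1. Beginner thread pool
At the beginner level, all you need is one utility class: Executors. It offers a number of static factory methods; pick one and you have a thread pool. For example: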
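    // Create a pool with a fixed number of threads
    Executors.newFixedThreadPool(8);
    // Create a pool that creates threads on demand, without an upper bound
    Executors.newCachedThreadPool();
    // Create a scheduled thread pool
    Executors.newScheduledThreadPool(2);
    // There is also a single-thread variant; it works the same way, so I'll skip it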
Once a pool has been created with one of these methods, call its execute() or submit() method and you are doing multithreaded programming. No problem at all!
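To make the difference between execute() and submit() concrete, here is a minimal sketch. The class name BasicPoolDemo and the helper sumOnPool are made up for illustration; only the Executors and ExecutorService calls come from the JDK.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class BasicPoolDemo {

    // Hypothetical helper: runs one Callable on a fixed pool and returns its result.
    static int sumOnPool(int a, int b) {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            Callable<Integer> task = () -> a + b;
            // submit() returns a Future that carries the result (or the exception)
            Future<Integer> future = pool.submit(task);
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown(); // without this, the non-daemon workers keep the JVM alive
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        // execute() is fire-and-forget: no Future, no return value
        pool.execute(() -> System.out.println("ran on " + Thread.currentThread().getName()));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);

        System.out.println("1 + 2 = " + sumOnPool(1, 2));
    }
}
```

In short, use execute() when you do not care about the outcome, and submit() when you need the result or want to wait for completion.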
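1.2. Intermediate thread pool
By intermediate I simply mean no longer relying on the ultra-simple factories above. In other words, you already know about ThreadPoolExecutor, whatever your reason for getting there.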
    // Customize the pool parameters
    ThreadPoolExecutor threadPoolExecutor = new ThreadPoolExecutor(
            4, 20, 20, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
I won't explain the parameters; this isn't a beginner's tutorial. In short, if you are using this, you are starting to know your way around.
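One detail of that particular constructor call is still worth seeing in action. ThreadPoolExecutor only adds threads beyond the core size when the queue rejects a task, and a LinkedBlockingQueue created without a capacity practically never does. The sketch below (class and method names are made up) submits more blocking tasks than the maximum size of 20 and then reads the pool size.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class UnboundedQueueDemo {

    // Hypothetical helper: submits `tasks` blocking tasks, then reports how many threads exist.
    static int observedPoolSize(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 20, 20, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until we have measured
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Tasks beyond the first 4 are sitting in the queue, not on new threads
            return pool.getPoolSize();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("pool size = " + observedPoolSize(100));
    }
}
```

With 100 pending tasks the pool still reports its core size of 4, so the maximum of 20 has no effect in this configuration. A bounded queue is what makes the maximum size matter.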
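1.3. Advanced thread pool
At this level there is no single recipe to show.
But it probably means: you know your pool's usage scenario, you know the hardware you run on, you name your pool's threads, you size your queue, you think about context switching, thread safety, and lock performance, and you might even build your own wheel...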
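As one possible reading of that list, here is a sketch of a pool with named threads, a bounded queue, and an explicit rejection policy. It is an illustration under assumptions, not a recommended configuration: namedFactory is a hypothetical helper written for this example, and the queue capacity of 50 is an arbitrary value.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class TunedPoolDemo {

    // Hypothetical helper: names threads "<prefix>-<n>" so they are easy to spot in thread dumps.
    static ThreadFactory namedFactory(String prefix) {
        AtomicInteger seq = new AtomicInteger(1);
        return runnable -> new Thread(runnable, prefix + "-" + seq.getAndIncrement());
    }

    // Fixed-size pool, bounded queue, and back-pressure on the submitting thread.
    static ThreadPoolExecutor newPool(String prefix, int queueCapacity) {
        int cores = Runtime.getRuntime().availableProcessors();
        return new ThreadPoolExecutor(
                cores, cores, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),      // bounded: memory use stays predictable
                namedFactory(prefix),
                new ThreadPoolExecutor.CallerRunsPolicy());   // full queue: the caller runs the task
    }

    // Returns the name of the worker thread that ran a trivial task.
    static String workerNameOf(String prefix) {
        ThreadPoolExecutor pool = newPool(prefix, 50);
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return pool.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(workerNameOf("test-pool"));
    }
}
```

Whether a core size equal to the CPU count is right depends on the workload; I/O-heavy tasks often justify more threads, which is exactly the kind of judgment this level is about.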
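2. This is not a thread pool
We usually think of a thread pool as a place where many tasks run at the same time. Sometimes, though, a thread pool does not behave like a thread pool at all, but like a single thread. Consider this simple, concrete usage scenario: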
    // Initialize the thread pool
    private ExecutorService executor = new ThreadPoolExecutor(
            Runtime.getRuntime().availableProcessors(),
            Runtime.getRuntime().availableProcessors(),
            0L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(50),
            new NamedThreadFactory("test-pool"),
            new ThreadPoolExecutor.CallerRunsPolicy());

    // Process tasks with the thread pool
    public Integer doTask(String updateIntervalDesc) throws Exception {
        long startTime = System.currentTimeMillis();
        List<TestDto> testList;
        AtomicInteger affectNum = new AtomicInteger(0);
        int pageSize = 1000;
        AtomicInteger pageNo = new AtomicInteger(1);
        Map<String, Object> condGroupLabel = new HashMap<>();
        log.info("start do sth:{}", updateIntervalDesc);
        List<Future<?>> futureList = new ArrayList<>();
        do {
            PageHelper.startPage(pageNo.getAndIncrement(), pageSize);
            List<TestDto> list = testDao.getLabelListNew(condGroupLabel);
            testList = list;
            // Submit one task to the pool per record
            for (TestDto s : list) {
                Future<?> future = executor.submit(() -> {
                    try {
                        // do sth...
                        affectNum.incrementAndGet();
                    } catch (Throwable e) {
                        log.error("error:{}", pageNo.get(), e);
                    }
                });
                futureList.add(future);
            }
        } while (testList.size() >= pageSize);
        // Wait for all tasks to finish
        int i = 0;
        for (Future<?> future : futureList) {
            future.get();
            log.info("done:+{} ", i++);
        }
        log.info("doTask done:{}, num:{}, cost:{}ms",
                updateIntervalDesc, affectNum.get(), System.currentTimeMillis() - startTime);
        return affectNum.get();
    }
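The main logic fetches a large number of tasks from the database and hands them to the thread pool. Since the tasks involve I/O such as database access, processing them with multiple threads is perfectly reasonable.
However, one situation can upset this balance: each task finishes very quickly, so quickly that a task is already done by the time the next one is submitted. You may then see a curious phenomenon: only one thread ever seems to be running tasks. That is not how a thread pool should behave; it looks more like a single-threaded job.
Then we start to suspect things: Is some thread blocked? Is thread scheduling unfair? Was the wrong queue chosen? Did we hit a JDK bug? Is the pool failing to use all its threads? And so on...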
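None of those is the cause, though. The reason is simply that, once the core threads have been started, submitting a task to the pool only adds it to the pool's queue, here the ArrayBlockingQueue shown above, and the pool's workers fetch tasks from that queue and run them. When few tasks are pending at any moment, naturally only some of the workers are busy. As for why it keeps being the same thread, one might blame JVM scheduling. More precisely, it comes down to how fairly the queue hands out tasks. In a fixed-size pool each worker blocks in ArrayBlockingQueue.take(), and who gets the next task is decided by the queue's ReentrantLock and its notEmpty Condition (await/signal) rather than by Object wait/notify. That lock is nonfair by default, so a worker that is already running can take the next task before a parked worker has woken up.
Here is how a pool worker operates: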
    // java.util.concurrent.ThreadPoolExecutor#runWorker
    /**
     * Main worker run loop.  Repeatedly gets tasks from queue and
     * executes them, while coping with a number of issues:
     *
     * 1. We may start out with an initial task, in which case we
     * don't need to get the first one. Otherwise, as long as pool is
     * running, we get tasks from getTask. If it returns null then the
     * worker exits due to changed pool state or configuration
     * parameters.  Other exits result from exception throws in
     * external code, in which case completedAbruptly holds, which
     * usually leads processWorkerExit to replace this thread.
     *
     * 2. Before running any task, the lock is acquired to prevent
     * other pool interrupts while the task is executing, and then we
     * ensure that unless pool is stopping, this thread does not have
     * its interrupt set.
     *
     * 3. Each task run is preceded by a call to beforeExecute, which
     * might throw an exception, in which case we cause thread to die
     * (breaking loop with completedAbruptly true) without processing
     * the task.
     *
     * 4. Assuming beforeExecute completes normally, we run the task,
     * gathering any of its thrown exceptions to send to afterExecute.
     * We separately handle RuntimeException, Error (both of which the
     * specs guarantee that we trap) and arbitrary Throwables.
     * Because we cannot rethrow Throwables within Runnable.run, we
     * wrap them within Errors on the way out (to the thread's
     * UncaughtExceptionHandler).  Any thrown exception also
     * conservatively causes thread to die.
     *
     * 5. After task.run completes, we call afterExecute, which may
     * also throw an exception, which will also cause thread to
     * die.  According to JLS Sec 14.20, this exception is the one that
     * will be in effect even if task.run throws.
     *
     * The net effect of the exception mechanics is that afterExecute
     * and the thread's UncaughtExceptionHandler have as accurate
     * information as we can provide about any problems encountered by
     * user code.
     *
     * @param w the worker
     */
    final void runWorker(Worker w) {
        Thread wt = Thread.currentThread();
        Runnable task = w.firstTask;
        w.firstTask = null;
        w.unlock(); // allow interrupts
        boolean completedAbruptly = true;
        try {
            // The worker keeps fetching tasks from the queue and running them.
            // The fetch may or may not be cut short, depending on the pool's
            // sizing configuration.
            while (task != null || (task = getTask()) != null) {
                w.lock();
                // If pool is stopping, ensure thread is interrupted;
                // if not, ensure thread is not interrupted.  This
                // requires a recheck in second case to deal with
                // shutdownNow race while clearing interrupt
                if ((runStateAtLeast(ctl.get(), STOP) ||
                     (Thread.interrupted() &&
                      runStateAtLeast(ctl.get(), STOP))) &&
                    !wt.isInterrupted())
                    wt.interrupt();
                try {
                    beforeExecute(wt, task);
                    Throwable thrown = null;
                    try {
                        task.run();
                    } catch (RuntimeException x) {
                        thrown = x; throw x;
                    } catch (Error x) {
                        thrown = x; throw x;
                    } catch (Throwable x) {
                        thrown = x; throw new Error(x);
                    } finally {
                        afterExecute(task, thrown);
                    }
                } finally {
                    task = null;
                    w.completedTasks++;
                    w.unlock();
                }
            }
            completedAbruptly = false;
        } finally {
            processWorkerExit(w, completedAbruptly);
        }
    }

    /**
     * Performs blocking or timed wait for a task, depending on
     * current configuration settings, or returns null if this worker
     * must exit because of any of:
     * 1. There are more than maximumPoolSize workers (due to
     *    a call to setMaximumPoolSize).
     * 2. The pool is stopped.
     * 3. The pool is shutdown and the queue is empty.
     * 4. This worker timed out waiting for a task, and timed-out
     *    workers are subject to termination (that is,
     *    {@code allowCoreThreadTimeOut || workerCount > corePoolSize})
     *    both before and after the timed wait, and if the queue is
     *    non-empty, this worker is not the last thread in the pool.
     *
     * @return task, or null if the worker must exit, in which case
     *         workerCount is decremented
     */
    private Runnable getTask() {
        boolean timedOut = false; // Did the last poll() time out?

        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
                decrementWorkerCount();
                return null;
            }

            int wc = workerCountOf(c);

            // Are workers subject to culling?
            boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;

            if ((wc > maximumPoolSize || (timed && timedOut))
                && (wc > 1 || workQueue.isEmpty())) {
                if (compareAndDecrementWorkerCount(c))
                    return null;
                continue;
            }

            try {
                // Either the timed poll() or the blocking take() is called.
                // For a fixed-size pool, the blocking take() is used.
                Runnable r = timed ?
                    workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                    workQueue.take();
                if (r != null)
                    return r;
                timedOut = true;
            } catch (InterruptedException retry) {
                timedOut = false;
            }
        }
    }
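In other words, a pool worker just keeps fetching tasks from the queue and running them. Fetching from the queue is governed by the queue's single ReentrantLock and its two Conditions, notEmpty and notFull. (ThreadPoolExecutor enqueues with offer() rather than put(), but both acquire the same lock and end up in enqueue().)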
    // java.util.concurrent.ArrayBlockingQueue#take
    public E take() throws InterruptedException {
        final ReentrantLock lock = this.lock;
        // Lock here to keep the operation thread-safe
        lock.lockInterruptibly();
        try {
            while (count == 0)
                // await() releases the lock while waiting; on wake-up the
                // thread must reacquire the lock before it continues
                notEmpty.await();
            return dequeue();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Inserts the specified element at the tail of this queue, waiting
     * for space to become available if the queue is full.
     *
     * @throws InterruptedException {@inheritDoc}
     * @throws NullPointerException {@inheritDoc}
     */
    public void put(E e) throws InterruptedException {
        checkNotNull(e);
        final ReentrantLock lock = this.lock;
        lock.lockInterruptibly();
        try {
            while (count == items.length)
                notFull.await();
            enqueue(e);
        } finally {
            lock.unlock();
        }
    }

    /**
     * Inserts element at current put position, advances, and signals.
     * Call only when holding lock.
     */
    private void enqueue(E x) {
        // assert lock.getHoldCount() == 1;
        // assert items[putIndex] == null;
        final Object[] items = this.items;
        items[putIndex] = x;
        if (++putIndex == items.length)
            putIndex = 0;
        count++;
        // Wake up a thread that is waiting to take
        notEmpty.signal();
    }
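So which worker gets a task depends on which one acquires the lock. Since the lock is nonfair, that in turn depends on thread scheduling and timing. (I have not verified the exact mechanism, but that is what it looks like.) At least on the surface, all the tasks keep getting grabbed by the same thread.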
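The effect is easy to observe without a database. The sketch below mirrors the pool settings used above but swaps the real work for a counter increment, then reports how many tasks each thread ran. The class and method names are made up, and the distribution differs from run to run and from machine to machine, so treat it as an experiment rather than proof of the mechanism.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

class WorkerShareDemo {

    // Hypothetical helper: runs `tasks` trivial tasks and counts executions per thread name.
    static Map<String, Long> tasksPerThread(int threads, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(50),
                new ThreadPoolExecutor.CallerRunsPolicy());
        ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(() -> {
                    // The "work" is only bookkeeping, so it is far cheaper than a submit
                    String name = Thread.currentThread().getName();
                    counts.computeIfAbsent(name, k -> new LongAdder()).increment();
                }));
            }
            for (Future<?> f : futures) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        Map<String, Long> result = new TreeMap<>();
        counts.forEach((name, n) -> result.put(name, n.sum()));
        return result;
    }

    public static void main(String[] args) {
        tasksPerThread(4, 100_000)
                .forEach((name, n) -> System.out.println(name + " -> " + n));
    }
}
```

If the submitting thread shows up in the output, that is CallerRunsPolicy at work: whenever the queue is full, the caller runs the task itself.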
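3. Back to the thread pool
The purpose of a thread pool is to handle asynchronous tasks, or to run multiple unrelated tasks concurrently, and so take load off the system. When the cost of submitting a task exceeds the cost of executing it, there is no point in using multiple threads; put differently, it is the wrong usage. A thread pool should be given heavier work, not lightweight tasks. The code above is bound to perform poorly. With a small change, things may look quite different.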
    // Initialize the thread pool
    private ExecutorService executor = new ThreadPoolExecutor(
            Runtime.getRuntime().availableProcessors(),
            Runtime.getRuntime().availableProcessors(),
            0L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(50),
            new NamedThreadFactory("test-pool"),
            new ThreadPoolExecutor.CallerRunsPolicy());

    // Process tasks with the thread pool
    public Integer doTask(String updateIntervalDesc) throws Exception {
        long startTime = System.currentTimeMillis();
        List<TestDto> testList;
        AtomicInteger affectNum = new AtomicInteger(0);
        int pageSize = 1000;
        AtomicInteger pageNo = new AtomicInteger(1);
        Map<String, Object> condGroupLabel = new HashMap<>();
        log.info("start do sth:{}", updateIntervalDesc);
        List<Future<?>> futureList = new ArrayList<>();
        do {
            PageHelper.startPage(pageNo.getAndIncrement(), pageSize);
            List<TestDto> list = testDao.getLabelListNew(condGroupLabel);
            testList = list;
            // Submit only one task to the pool per batch (one page of records)
            Future<?> future = executor.submit(() -> {
                for (TestDto s : list) {
                    try {
                        // do sth...
                        affectNum.incrementAndGet();
                    } catch (Throwable e) {
                        log.error("error:{}", pageNo.get(), e);
                    }
                }
            });
            futureList.add(future);
        } while (testList.size() >= pageSize);
        // Wait for all tasks to finish
        int i = 0;
        for (Future<?> future : futureList) {
            future.get();
            log.info("done:+{} ", i++);
        }
        log.info("doTask done:{}, num:{}, cost:{}ms",
                updateIntervalDesc, affectNum.get(), System.currentTimeMillis() - startTime);
        return affectNum.get();
    }
That is, make each submitted task heavy enough that the cost of submitting it becomes negligible. Only then does multithreading pay off.
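The same idea can be tried without the DAO and paging plumbing. The following sketch splits an in-memory list into chunks and submits one task per chunk; BatchSubmitDemo, processInBatches, the pool size of 4, and the batch size are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class BatchSubmitDemo {

    // Hypothetical helper: returns {items processed, tasks submitted}.
    static int[] processInBatches(List<Integer> items, int batchSize) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger processed = new AtomicInteger();
        List<Future<?>> futures = new ArrayList<>();
        try {
            for (int from = 0; from < items.size(); from += batchSize) {
                int to = Math.min(from + batchSize, items.size());
                // Copy the chunk so each task owns its data
                List<Integer> batch = new ArrayList<>(items.subList(from, to));
                futures.add(pool.submit(() -> {
                    for (Integer item : batch) {
                        // placeholder for the real per-item work
                        processed.incrementAndGet();
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every batch to finish
            }
            return new int[] {processed.get(), futures.size()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            items.add(i);
        }
        int[] result = processInBatches(items, 1000);
        System.out.println("processed=" + result[0] + ", tasks=" + result[1]);
    }
}
```

For 2,500 items and a batch size of 1,000 this submits 3 tasks instead of 2,500, so the queue lock is taken a few times rather than thousands of times. The right batch size depends on how expensive one item is; batches that are too large leave threads idle at the end of the run.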