一個關於wait/notify與鎖關係的探究

等你歸去來發表於2022-03-28

  wait/notify 機制是解決生產者消費者問題的良藥。它的核心邏輯是基於條件變數的鎖機制處理。所以,它們到底是什麼關係?wait()時是否需要持有鎖? notify()是否需要持有鎖?先說答案:都需要持有鎖。

  wait需要持有鎖的原因是,你肯定需要知道在哪個物件上進行等待,如果不持有鎖,將無法做到物件變更時進行實時感知通知的作用,但是為了讓其他物件可以操作該值的變化,它必須要先釋放掉鎖,然後在該節點上進行等待。不持有鎖而進行wait,可能會導致長眠不起。

  notify需要持有鎖的原因是,它要保證執行緒的安全,只有它知道資料變化了,所以它有權力去通知其他執行緒資料變化。而且通知完之後,不能立即釋放鎖,即必須在持有鎖的情況下進行通知,否則notify後續的工作的執行緒安全性將無法保證,儘量它是在lock的範圍內,但卻因為鎖釋放,將導致不可預期的結果。而且在notify的時候,並不能真正地將對應的執行緒喚醒,即不能從作業系統層面喚醒執行緒,因為此時當前通知執行緒持有鎖,而此時如果將其他等待執行緒喚醒,它們將立即參與到鎖的競爭中來,而這時的競爭是一定會失敗的,這可能會導致被喚醒的執行緒立即又進入等待佇列,更糟糕的是它可能再也不會被喚醒 了。所以不能將在持有鎖的時,將對應的執行緒真正喚醒,我們看到的notify只是從語言上下文級別,將它從等待佇列轉移到同步佇列而已,對此作業系統一無所知。

 

1. 實驗驗證

  我們通過一個實驗來看一下,wait/和notify是否會在持有鎖的情況下進行。

    private ReentrantLock mainLock = new ReentrantLock();

    @Test
    public void testWaitNotify() throws InterruptedException {
        Condition c1 = mainLock.newCondition();
        Condition c3 = mainLock.newCondition();

        CountDownLatch t1StartLatch = new CountDownLatch(2);
        Thread t1 = new Thread(() -> {
            mainLock.lock();
            try {
                System.out.println(LocalDateTime.now() + " - t1 start");
                c1.await();
                System.out.println(LocalDateTime.now() + " - t1 c1 await out");
                // 過早通知問題,導致無法測試下一步
//                c3.await();
//                System.out.println(LocalDateTime.now() + " - t1 c2 await out");
                t1StartLatch.await();
                System.out.println(LocalDateTime.now() + " - t1 sleeping");
                SleepUtil.sleepMillis(10_000L);
                c1.signalAll();
                c3.signalAll();
                System.out.println(LocalDateTime.now() + " - t1 notified, sleeping again");
                SleepUtil.sleepMillis(10_000L);
                System.out.println(LocalDateTime.now() + " - t1 out");
            }
            catch (Exception e) {
                System.err.println("t1 exception ");
                e.printStackTrace();
            }
            finally {
                mainLock.unlock();
            }
        }, "t1");
        Thread t2 = new Thread(() -> {
            mainLock.lock();
            try {
                t1StartLatch.countDown();
                System.out.println(LocalDateTime.now() + " - t2 c1 signal");
                c1.signalAll();
                System.out.println(LocalDateTime.now() + " - t2 wait");
                c1.await();
                System.out.println(LocalDateTime.now() + " - t2 out");
            }
            catch (Exception e) {
                System.err.println("t2 exception ");
                e.printStackTrace();
            }
            finally {
                mainLock.unlock();
            }
        }, "t2");
        Thread t3 = new Thread(() -> {
            mainLock.lock();
            try {
                t1StartLatch.countDown();
                System.out.println(LocalDateTime.now() + " - t3 c3 signal");
                c3.signalAll();
                System.out.println(LocalDateTime.now() + " - t3 wait");
                c3.await();
                System.out.println(LocalDateTime.now() + " - t3 out");
            }
            catch (Exception e) {
                System.err.println("t2 exception ");
                e.printStackTrace();
            }
            finally {
                mainLock.unlock();
            }
        }, "t3");
        t1.start();
        t2.start();
        t3.start();
        t1.join();
        System.out.println(LocalDateTime.now() + " - main t1 out");
        t2.join();
        System.out.println(LocalDateTime.now() + " - main t2 out");
        t3.join();
        System.out.println(LocalDateTime.now() + " - main t3 out");
    }

  大概意思是,針對同一個鎖,wait之後,是否可以被其他執行緒進入臨界區?如果wait之前不通知進入,wait之後能進入,說明wait依賴於鎖,而且會釋放當前鎖。notify之後,wait()是否會立即執行,如果必須等到notify的模組完成後,才執行,說明notify是必須要依賴於鎖的。

  結果如下:

2022-03-27T20:09:43.588 - t1 start
2022-03-27T20:09:43.603 - t2 c1 signal
2022-03-27T20:09:43.603 - t2 wait
2022-03-27T20:09:43.603 - t3 c3 signal
2022-03-27T20:09:43.603 - t3 wait
2022-03-27T20:09:43.603 - t1 c1 await out
2022-03-27T20:09:43.603 - t1 sleeping
2022-03-27T20:09:53.605 - t1 notified, sleeping again
2022-03-27T20:10:03.612 - t1 out
2022-03-27T20:10:03.612 - t2 out
2022-03-27T20:10:03.612 - main t1 out
2022-03-27T20:10:03.612 - t3 out
2022-03-27T20:10:03.612 - main t2 out
2022-03-27T20:10:03.612 - main t3 out


2022-03-27T20:11:39.982 - t1 start
2022-03-27T20:11:39.982 - t2 c1 signal
2022-03-27T20:11:39.982 - t2 wait
2022-03-27T20:11:39.982 - t3 c3 signal
2022-03-27T20:11:39.982 - t3 wait
2022-03-27T20:11:39.982 - t1 c1 await out
2022-03-27T20:11:39.982 - t1 sleeping
2022-03-27T20:11:49.989 - t1 notified, sleeping again
2022-03-27T20:11:59.990 - t1 out
2022-03-27T20:11:59.990 - t2 out
2022-03-27T20:11:59.990 - main t1 out
2022-03-27T20:11:59.990 - t3 out
2022-03-27T20:11:59.990 - main t2 out
2022-03-27T20:11:59.990 - main t3 out

  

2. wait/notify 的實現機制

  我們以AQS的實現機制為線索,探索wait/notify機制。它在喚醒操作佇列時,設定狀態為 SIGNAL , 但它實際不執行作業系統喚醒。

        //     java.util.concurrent.locks.AbstractQueuedSynchronizer.ConditionObject#signalAll
        /**
         * Moves all threads from the wait queue for this condition to
         * the wait queue for the owning lock.
         *
         * @throws IllegalMonitorStateException if {@link #isHeldExclusively}
         *         returns {@code false}
         */
        public final void signalAll() {
            if (!isHeldExclusively())
                throw new IllegalMonitorStateException();
            Node first = firstWaiter;
            if (first != null)
                doSignalAll(first);
        }

        // java.util.concurrent.locks.AbstractQueuedSynchronizer.ConditionObject#doSignalAll
        /**
         * Removes and transfers all nodes.
         * @param first (non-null) the first node on condition queue
         */
        private void doSignalAll(Node first) {
            lastWaiter = firstWaiter = null;
            do {
                Node next = first.nextWaiter;
                first.nextWaiter = null;
                transferForSignal(first);
                first = next;
            } while (first != null);
        }
    // java.util.concurrent.locks.AbstractQueuedSynchronizer#transferForSignal
    /**
     * Transfers a node from a condition queue onto sync queue.
     * Returns true if successful.
     * @param node the node
     * @return true if successfully transferred (else the node was
     * cancelled before signal)
     */
    final boolean transferForSignal(Node node) {
        /*
         * If cannot change waitStatus, the node has been cancelled.
         */
        if (!compareAndSetWaitStatus(node, Node.CONDITION, 0))
            return false;

        /*
         * Splice onto queue and try to set waitStatus of predecessor to
         * indicate that thread is (probably) waiting. If cancelled or
         * attempt to set waitStatus fails, wake up to resync (in which
         * case the waitStatus can be transiently and harmlessly wrong).
         */
        Node p = enq(node);
        int ws = p.waitStatus;
        // 不到萬不得已,不會真正喚醒等待中的佇列,從而滿足notify無法將執行緒喚醒的作用,或者說執行緒仍然在作業系統的等待佇列上
        // 它只是將當前執行緒移動到本語文的同步佇列中,以下執行緒下次執行過來時可以通過該限制
        if (ws > 0 || !compareAndSetWaitStatus(p, ws, Node.SIGNAL))
            LockSupport.unpark(node.thread);
        return true;
    }
    
    /**
     * Inserts node into queue, initializing if necessary. See picture above.
     * @param node the node to insert
     * @return node's predecessor
     */
    private Node enq(final Node node) {
        for (;;) {
            Node t = tail;
            if (t == null) { // Must initialize
                if (compareAndSetHead(new Node()))
                    tail = head;
            } else {
                node.prev = t;
                if (compareAndSetTail(t, node)) {
                    t.next = node;
                    return t;
                }
            }
        }
    }

        // java.util.concurrent.locks.AbstractQueuedSynchronizer.ConditionObject#await()
        /**
         * Implements interruptible condition wait.
         * <ol>
         * <li> If current thread is interrupted, throw InterruptedException.
         * <li> Save lock state returned by {@link #getState}.
         * <li> Invoke {@link #release} with saved state as argument,
         *      throwing IllegalMonitorStateException if it fails.
         * <li> Block until signalled or interrupted.
         * <li> Reacquire by invoking specialized version of
         *      {@link #acquire} with saved state as argument.
         * <li> If interrupted while blocked in step 4, throw InterruptedException.
         * </ol>
         */
        public final void await() throws InterruptedException {
            if (Thread.interrupted())
                throw new InterruptedException();
            Node node = addConditionWaiter();
            // 進來等待佇列,先釋放鎖,此時進入執行緒不安全狀態
            int savedState = fullyRelease(node);
            int interruptMode = 0;
            // 此判斷只是本語文級別的等待佇列限制
            // notify 時只能滿足這個條件,而不會將執行緒從作業系統掛起佇列中喚醒,即不會進行 LockSupport.unpark()
            while (!isOnSyncQueue(node)) {
                // 交由作業系統進行執行緒掛起
                LockSupport.park(this);
                if ((interruptMode = checkInterruptWhileWaiting(node)) != 0)
                    break;
            }
            // 重新進行鎖的獲取,嘗試
            if (acquireQueued(node, savedState) && interruptMode != THROW_IE)
                interruptMode = REINTERRUPT;
            if (node.nextWaiter != null) // clean up if cancelled
                unlinkCancelledWaiters();
            if (interruptMode != 0)
                reportInterruptAfterWait(interruptMode);
        }
    // java.util.concurrent.locks.AbstractQueuedSynchronizer#acquireQueued
    /**
     * Acquires in exclusive uninterruptible mode for thread already in
     * queue. Used by condition wait methods as well as acquire.
     *
     * @param node the node
     * @param arg the acquire argument
     * @return {@code true} if interrupted while waiting
     */
    final boolean acquireQueued(final Node node, int arg) {
        boolean failed = true;
        try {
            boolean interrupted = false;
            for (;;) {
                final Node p = node.predecessor();
                // 獲取當鎖,則替換head後返回
                // 而 tryAcquire() 則由各自策略實現
                if (p == head && tryAcquire(arg)) {
                    setHead(node);
                    p.next = null; // help GC
                    failed = false;
                    return interrupted;
                }
                // 如果獲取不到鎖,則重新進入作業系統等待佇列
                if (shouldParkAfterFailedAcquire(p, node) &&
                    parkAndCheckInterrupt())
                    interrupted = true;
            }
        } finally {
            if (failed)
                cancelAcquire(node);
        }
    }

  所以,總結:

1. wait將會釋放持有的鎖;
2. wait將會加入到語言級別的等待佇列,同時也會提交給作業系統的等待佇列,做到真正的執行緒掛起;
3. wait將會在被作業系統喚醒後,重新進行新一輪的鎖獲取嘗試,返回時已攜帶回原有的鎖,從外部看起來就像鎖一直都在一樣;
4. notify不會真正的喚醒等待的執行緒,而只是將各等待執行緒從語言級別的等待佇列移出,到語言級別的同步佇列;
5. notify只有在極端情況下,才會做到執行緒的真正喚醒作用,比如中斷,但這被喚醒的執行緒將無法正常進行業務操作,所以也是安全的;
6. 只有在整體的鎖在進行 unlock() 的時候,才會喚醒執行緒,使其重新參與鎖的競爭;

  

3. lock/unlock 流程

  同樣的AQS的實現為線索,lock/unlock 流程如下:

    // java.util.concurrent.locks.ReentrantLock#lock
    /**
     * Acquires the lock.
     *
     * <p>Acquires the lock if it is not held by another thread and returns
     * immediately, setting the lock hold count to one.
     *
     * <p>If the current thread already holds the lock then the hold
     * count is incremented by one and the method returns immediately.
     *
     * <p>If the lock is held by another thread then the
     * current thread becomes disabled for thread scheduling
     * purposes and lies dormant until the lock has been acquired,
     * at which time the lock hold count is set to one.
     */
    public void lock() {
        sync.lock();
    }
    
        // java.util.concurrent.locks.ReentrantLock.NonfairSync#lock
        /**
         * Performs lock.  Try immediate barge, backing up to normal
         * acquire on failure.
         */
        final void lock() {
            if (compareAndSetState(0, 1))
                setExclusiveOwnerThread(Thread.currentThread());
            else
                acquire(1);
        }
    // java.util.concurrent.locks.AbstractQueuedSynchronizer#acquire
    /**
     * Acquires in exclusive mode, ignoring interrupts.  Implemented
     * by invoking at least once {@link #tryAcquire},
     * returning on success.  Otherwise the thread is queued, possibly
     * repeatedly blocking and unblocking, invoking {@link
     * #tryAcquire} until success.  This method can be used
     * to implement method {@link Lock#lock}.
     *
     * @param arg the acquire argument.  This value is conveyed to
     *        {@link #tryAcquire} but is otherwise uninterpreted and
     *        can represent anything you like.
     */
    public final void acquire(int arg) {
        if (!tryAcquire(arg) &&
            // 同上wait時的鎖爭搶操作
            acquireQueued(addWaiter(Node.EXCLUSIVE), arg))
            selfInterrupt();
    }
    
    // java.util.concurrent.locks.ReentrantLock#unlock
    /**
     * Attempts to release this lock.
     *
     * <p>If the current thread is the holder of this lock then the hold
     * count is decremented.  If the hold count is now zero then the lock
     * is released.  If the current thread is not the holder of this
     * lock then {@link IllegalMonitorStateException} is thrown.
     *
     * @throws IllegalMonitorStateException if the current thread does not
     *         hold this lock
     */
    public void unlock() {
        sync.release(1);
    }
    
    // java.util.concurrent.locks.AbstractQueuedSynchronizer#release
    /**
     * Releases in exclusive mode.  Implemented by unblocking one or
     * more threads if {@link #tryRelease} returns true.
     * This method can be used to implement method {@link Lock#unlock}.
     *
     * @param arg the release argument.  This value is conveyed to
     *        {@link #tryRelease} but is otherwise uninterpreted and
     *        can represent anything you like.
     * @return the value returned from {@link #tryRelease}
     */
    public final boolean release(int arg) {
        if (tryRelease(arg)) {
            Node h = head;
            // 直接喚醒頭節點(真正的喚醒)
            if (h != null && h.waitStatus != 0)
                unparkSuccessor(h);
            return true;
        }
        return false;
    }
    
    // java.util.concurrent.locks.AbstractQueuedSynchronizer#unparkSuccessor
    /**
     * Wakes up node's successor, if one exists.
     *
     * @param node the node
     */
    private void unparkSuccessor(Node node) {
        /*
         * If status is negative (i.e., possibly needing signal) try
         * to clear in anticipation of signalling.  It is OK if this
         * fails or if status is changed by waiting thread.
         */
        int ws = node.waitStatus;
        if (ws < 0)
            compareAndSetWaitStatus(node, ws, 0);

        /*
         * Thread to unpark is held in successor, which is normally
         * just the next node.  But if cancelled or apparently null,
         * traverse backwards from tail to find the actual
         * non-cancelled successor.
         */
        Node s = node.next;
        if (s == null || s.waitStatus > 0) {
            s = null;
            for (Node t = tail; t != null && t != node; t = t.prev)
                if (t.waitStatus <= 0)
                    s = t;
        }
        // 真正喚醒執行緒,只有一個執行緒將被喚醒
        if (s != null)
            LockSupport.unpark(s.thread);
    }

  總結: lock/unlock 是一個真正的上鎖解鎖操作,上鎖時如未成功,則進行park()進行作業系統掛起,解鎖時將頭節點unpark()交由作業系統排程。

 

4. 喚醒多個等待執行緒

  如何喚醒多個等待執行緒?共享鎖有這個需求,其實也是notifyAll 的表面語義所在。

    // java.util.concurrent.locks.AbstractQueuedSynchronizer#releaseShared
    /**
     * Releases in shared mode.  Implemented by unblocking one or more
     * threads if {@link #tryReleaseShared} returns true.
     *
     * @param arg the release argument.  This value is conveyed to
     *        {@link #tryReleaseShared} but is otherwise uninterpreted
     *        and can represent anything you like.
     * @return the value returned from {@link #tryReleaseShared}
     */
    public final boolean releaseShared(int arg) {
        if (tryReleaseShared(arg)) {
            doReleaseShared();
            return true;
        }
        return false;
    }

    // java.util.concurrent.locks.AbstractQueuedSynchronizer#doReleaseShared
    /**
     * Release action for shared mode -- signals successor and ensures
     * propagation. (Note: For exclusive mode, release just amounts
     * to calling unparkSuccessor of head if it needs signal.)
     */
    private void doReleaseShared() {
        /*
         * Ensure that a release propagates, even if there are other
         * in-progress acquires/releases.  This proceeds in the usual
         * way of trying to unparkSuccessor of head if it needs
         * signal. But if it does not, status is set to PROPAGATE to
         * ensure that upon release, propagation continues.
         * Additionally, we must loop in case a new node is added
         * while we are doing this. Also, unlike other uses of
         * unparkSuccessor, we need to know if CAS to reset status
         * fails, if so rechecking.
         */
        for (;;) {
            Node h = head;
            if (h != null && h != tail) {
                int ws = h.waitStatus;
                if (ws == Node.SIGNAL) {
                    if (!compareAndSetWaitStatus(h, Node.SIGNAL, 0))
                        continue;            // loop to recheck cases
                    // 喚醒頭節點
                    unparkSuccessor(h);
                }
                // 因為上一頭節點剛剛被設定為0,說明正在執行中,設定當前head為 PROPAGATE
                else if (ws == 0 &&
                         !compareAndSetWaitStatus(h, 0, Node.PROPAGATE))
                    continue;                // loop on failed CAS
            }
            // 即儘量只設定一個 head 節點即可
            // 除非在這期間發生變更
            if (h == head)                   // loop if head changed
                break;
        }
    }


    // java.util.concurrent.locks.AbstractQueuedSynchronizer#acquireSharedInterruptibly
    /**
     * Acquires in shared mode, aborting if interrupted.  Implemented
     * by first checking interrupt status, then invoking at least once
     * {@link #tryAcquireShared}, returning on success.  Otherwise the
     * thread is queued, possibly repeatedly blocking and unblocking,
     * invoking {@link #tryAcquireShared} until success or the thread
     * is interrupted.
     * @param arg the acquire argument.
     * This value is conveyed to {@link #tryAcquireShared} but is
     * otherwise uninterpreted and can represent anything
     * you like.
     * @throws InterruptedException if the current thread is interrupted
     */
    public final void acquireSharedInterruptibly(int arg)
            throws InterruptedException {
        if (Thread.interrupted())
            throw new InterruptedException();
        if (tryAcquireShared(arg) < 0)
            doAcquireSharedInterruptibly(arg);
    }
    // java.util.concurrent.locks.AbstractQueuedSynchronizer#doAcquireSharedInterruptibly
    /**
     * Acquires in shared interruptible mode.
     * @param arg the acquire argument
     */
    private void doAcquireSharedInterruptibly(int arg)
        throws InterruptedException {
        final Node node = addWaiter(Node.SHARED);
        boolean failed = true;
        try {
            for (;;) {
                final Node p = node.predecessor();
                if (p == head) {
                    int r = tryAcquireShared(arg);
                    if (r >= 0) {
                        // 共享式鎖的傳播性質實現
                        setHeadAndPropagate(node, r);
                        p.next = null; // help GC
                        failed = false;
                        return;
                    }
                }
                if (shouldParkAfterFailedAcquire(p, node) &&
                    parkAndCheckInterrupt())
                    throw new InterruptedException();
            }
        } finally {
            if (failed)
                cancelAcquire(node);
        }
    }
    
    // java.util.concurrent.locks.AbstractQueuedSynchronizer#setHeadAndPropagate
    /**
     * Sets head of queue, and checks if successor may be waiting
     * in shared mode, if so propagating if either propagate > 0 or
     * PROPAGATE status was set.
     *
     * @param node the node
     * @param propagate the return value from a tryAcquireShared
     */
    private void setHeadAndPropagate(Node node, int propagate) {
        Node h = head; // Record old head for check below
        setHead(node);
        /*
         * Try to signal next queued node if:
         *   Propagation was indicated by caller,
         *     or was recorded (as h.waitStatus either before
         *     or after setHead) by a previous operation
         *     (note: this uses sign-check of waitStatus because
         *      PROPAGATE status may transition to SIGNAL.)
         * and
         *   The next node is waiting in shared mode,
         *     or we don't know, because it appears null
         *
         * The conservatism in both of these checks may cause
         * unnecessary wake-ups, but only when there are multiple
         * racing acquires/releases, so most need signals now or soon
         * anyway.
         */
        if (propagate > 0 || h == null || h.waitStatus < 0 ||
            (h = head) == null || h.waitStatus < 0) {
            Node s = node.next;
            // 遞迴進行喚醒下一執行緒節點,從而級聯喚醒
            if (s == null || s.isShared())
                doReleaseShared();
        }
    }

    /**
     * Release action for shared mode -- signals successor and ensures
     * propagation. (Note: For exclusive mode, release just amounts
     * to calling unparkSuccessor of head if it needs signal.)
     */
    private void doReleaseShared() {
        /*
         * Ensure that a release propagates, even if there are other
         * in-progress acquires/releases.  This proceeds in the usual
         * way of trying to unparkSuccessor of head if it needs
         * signal. But if it does not, status is set to PROPAGATE to
         * ensure that upon release, propagation continues.
         * Additionally, we must loop in case a new node is added
         * while we are doing this. Also, unlike other uses of
         * unparkSuccessor, we need to know if CAS to reset status
         * fails, if so rechecking.
         */
        for (;;) {
            Node h = head;
            if (h != null && h != tail) {
                int ws = h.waitStatus;
                if (ws == Node.SIGNAL) {
                    if (!compareAndSetWaitStatus(h, Node.SIGNAL, 0))
                        continue;            // loop to recheck cases
                    unparkSuccessor(h);
                }
                else if (ws == 0 &&
                         !compareAndSetWaitStatus(h, 0, Node.PROPAGATE))
                    continue;                // loop on failed CAS
            }
            if (h == head)                   // loop if head changed
                break;
        }
    }

  總結: 多個執行緒的喚醒,主要是使用了級聯喚醒的機制,在做共享鎖時,根據現有的情況,進行喚醒下一執行緒。而當執行緒排程很快或演算法不確定時,就會給人一種所有執行緒一起被喚醒工作的效果。

相關文章