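Django | Thoughts on Using Signals

Posted by 青陽半雪 on 2023-04-20

Revisiting what I remember about the signals module, and recording some thoughts on how Django signals are used.

This article uses Django 4.2.

1 Annotated source code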

import logging
import threading
import weakref

from django.utils.inspect import func_accepts_kwargs

logger = logging.getLogger("django.dispatch")


def _make_id(target):
    """
    Build an identifier for the object passed in, using the built-in id().
    """

    # An object with a __func__ attribute is a bound method
    if hasattr(target, "__func__"):
        return (id(target.__self__), id(target.__func__))
    return id(target)

# Identifier for None; as a sender key it stands for "any sender"
NONE_ID = _make_id(None)

# A marker for caching
NO_RECEIVERS = object()


class Signal:
    """
    Base class for all signals

    Internal attributes:

        receivers
            { receiverkey (id) : weakref(receiver) }
    """

    def __init__(self, use_caching=False):
        """
        Create a new signal.
        """
        # List of receivers, comparable to a list of subscribers
        self.receivers = []

        # A Signal can be used from several threads at once, for example to
        # connect or disconnect receivers or to send the signal. To keep the
        # receiver list consistent, Django guards the code that reads and
        # modifies it with a threading.Lock.
        self.lock = threading.Lock()

        # Whether to use caching
        self.use_caching = use_caching

        # For convenience we create empty caches even if they are not used.
        # A note about caching: if use_caching is defined, then for each
        # distinct sender we cache the receivers that sender has in
        # 'sender_receivers_cache'. The cache is cleaned when .connect() or
        # .disconnect() is called and populated on send().
        # Cache that maps each sender to its receivers
        self.sender_receivers_cache = weakref.WeakKeyDictionary() if use_caching else {}

        # Flags that self.receivers may contain dead weak references
        self._dead_receivers = False

    def connect(self, receiver, sender=None, weak=True, dispatch_uid=None):
        """
        Connect receiver to sender for signal.

        Registers a receiver with this signal. When the signal is sent, every
        receiver connected for that sender is called.

        Arguments:

            receiver
                A function or an instance method which is to receive signals.
                Receivers must be hashable objects.

                If weak is True, then receiver must be weak referenceable.

                Receivers must be able to accept keyword arguments.

                If a receiver is connected with a dispatch_uid argument, it
                will not be added if another receiver was already connected
                with that dispatch_uid. In other words, the later receiver is
                the one that is ignored.

            sender
                The sender to which the receiver should respond. Must either be
                a Python object, or None to receive events from any sender.

                Within Django itself the sender is usually a class. For example,
                the sender of the request_started signal is
                class 'django.core.handlers.wsgi.WSGIHandler'.

            weak
                Whether to use weak references to the receiver. By default, the
                module will attempt to use weak references to the receiver
                objects. If this parameter is false, then strong references will
                be used.

            dispatch_uid
                A unique identifier for the receiver, for cases where duplicate
                signals may be sent.

                An identifier used to uniquely identify a particular instance of
                a receiver. This will usually be a string, though it may be
                anything hashable.
        """
        from django.conf import settings

        # If DEBUG is on, check that we got a good receiver
        if settings.configured and settings.DEBUG:
            if not callable(receiver):
                raise TypeError("Signal receivers must be callable.")
            # Check for **kwargs
            if not func_accepts_kwargs(receiver):
                raise ValueError(
                    "Signal receivers must accept keyword arguments (**kwargs)."
                )

        # A dispatch_uid, when given, takes priority over the receiver's id.
        # For the same signal and sender it must therefore be unique: if the
        # lookup_key already exists, the receiver is not added to the list below.
        if dispatch_uid:
            lookup_key = (dispatch_uid, _make_id(sender))
        else:
            lookup_key = (_make_id(receiver), _make_id(sender))

        # Weak references are the default, so the signal does not keep its receivers alive.
        if weak:
            ref = weakref.ref
            receiver_object = receiver
            # Check for bound methods
            if hasattr(receiver, "__self__") and hasattr(receiver, "__func__"):
                ref = weakref.WeakMethod
                receiver_object = receiver.__self__
            receiver = ref(receiver)
            weakref.finalize(receiver_object, self._remove_receiver)

        with self.lock:
            # Remove dead receivers
            self._clear_dead_receivers()
            if not any(r_key == lookup_key for r_key, _ in self.receivers):
                # The key is not in the receiver list yet, so add the receiver
                self.receivers.append((lookup_key, receiver))
            # Clear the sender_receivers_cache
            self.sender_receivers_cache.clear()

    def disconnect(self, receiver=None, sender=None, dispatch_uid=None):
        """
        Disconnect receiver from sender for signal.

        If weak references are used, disconnect need not be called. The receiver
        will be removed from dispatch automatically.

        Arguments:

            receiver
                The registered receiver to disconnect. May be none if
                dispatch_uid is specified.

            sender
                The registered sender to disconnect

            dispatch_uid
                the unique identifier of the receiver to disconnect
        """
        # Compute the lookup key
        if dispatch_uid:
            lookup_key = (dispatch_uid, _make_id(sender))
        else:
            lookup_key = (_make_id(receiver), _make_id(sender))

        disconnected = False
        with self.lock:
            self._clear_dead_receivers()
            # Compare lookup keys and delete the matching entry, if there is one
            for index in range(len(self.receivers)):
                (r_key, _) = self.receivers[index]
                if r_key == lookup_key:
                    disconnected = True
                    del self.receivers[index]
                    break
            # Reset the sender_receivers_cache afterwards
            self.sender_receivers_cache.clear()

        # Return a bool that tells whether a receiver was disconnected
        return disconnected

    def has_listeners(self, sender=None):
        """Return whether the given sender has any live receivers."""
        return bool(self._live_receivers(sender))

    def send(self, sender, **named):
        """
        Send signal from sender to all connected receivers.

        If any receiver raises an error, the error propagates back through send,
        terminating the dispatch loop. So it's possible that all receivers
        won't be called if an error is raised.

        Arguments:

            sender
                The sender of the signal. Either a specific object or None.

            named
                Named arguments which will be passed to receivers.

        Return a list of tuple pairs [(receiver, response), ... ].
        """
        if not self.receivers or self.sender_receivers_cache.get(sender) is NO_RECEIVERS:
            return []

        return [(receiver, receiver(signal=self, sender=sender, **named)) for receiver in self._live_receivers(sender)]

    def send_robust(self, sender, **named):
        """
        Send signal from sender to all connected receivers catching errors.

        Arguments:

            sender
                The sender of the signal. Can be any Python object (normally one
                registered with a connect if you actually want something to
                occur).

            named
                Named arguments which will be passed to receivers.

        Return a list of tuple pairs [(receiver, response), ... ].

        If any receiver raises an error (specifically any subclass of
        Exception), return the error instance as the result for that receiver.
        """
        if not self.receivers or self.sender_receivers_cache.get(sender) is NO_RECEIVERS:
            return []

        # Call each receiver with whatever arguments it can accept.
        # Return a list of tuple pairs [(receiver, response), ... ].
        responses = []
        for receiver in self._live_receivers(sender):
            try:
                response = receiver(signal=self, sender=sender, **named)
            except Exception as err:
                logger.error(
                    "Error calling %s in Signal.send_robust() (%s)",
                    receiver.__qualname__,
                    err,
                    exc_info=err,
                )
                responses.append((receiver, err))
            else:
                responses.append((receiver, response))
        return responses

    def _clear_dead_receivers(self):
        """Remove dead receivers."""
        # Note: caller is assumed to hold self.lock.
        if self._dead_receivers:
            self._dead_receivers = False

            # Keep only the live receivers:
            # - a strong reference is always kept
            # - a weak reference that resolves to None is a dead receiver
            self.receivers = [
                r for r in self.receivers if not (isinstance(r[1], weakref.ReferenceType) and r[1]() is None)
            ]

    def _live_receivers(self, sender):
        """
        Return the live receivers connected for the given sender.

        Filter sequence of receivers to get resolved, live receivers.

        This checks for weak references and resolves them, then returning only
        live receivers.
        """
        # Start without a receiver list
        receivers = None

        # Caching is enabled and no dead receivers have been flagged
        if self.use_caching and not self._dead_receivers:
            # Look the receiver list up by sender
            receivers = self.sender_receivers_cache.get(sender)
            # We could end up here with NO_RECEIVERS even if we do check this case in
            # .send() prior to calling _live_receivers() due to concurrent .send() call.
            # The cache records that this sender has no receivers: return at once
            if receivers is NO_RECEIVERS:
                return []

        # Nothing was cached, so build the list
        if receivers is None:
            with self.lock:
                # Remove dead receivers
                self._clear_dead_receivers()
                senderkey = _make_id(sender)
                receivers = []

                # Collect the receivers connected for this sender
                for (receiverkey, r_senderkey), receiver in self.receivers:
                    # A receiver connected with sender=None has the key NONE_ID
                    # and responds to every sender
                    if r_senderkey == NONE_ID or r_senderkey == senderkey:
                        receivers.append(receiver)

                # Store the result if caching is enabled
                if self.use_caching:
                    if not receivers:
                        self.sender_receivers_cache[sender] = NO_RECEIVERS
                    else:
                        # Note, we must cache the weakref versions.
                        self.sender_receivers_cache[sender] = receivers
        non_weak_receivers = []

        # Resolve the weak references to obtain the actual receivers
        for receiver in receivers:
            if isinstance(receiver, weakref.ReferenceType):
                # Dereference the weak reference.
                receiver = receiver()
                if receiver is not None:
                    non_weak_receivers.append(receiver)
            else:
                # A strong reference is added as is
                non_weak_receivers.append(receiver)
        return non_weak_receivers
        return non_weak_receivers

    def _remove_receiver(self, receiver=None):
        """
        Flag this signal as having dead receivers once a weakly referenced
        object no longer exists.

        This marks self.receivers as containing dead weak references. They are
        cleaned up later in connect, disconnect and _live_receivers.
        """

        # Mark that the self.receivers list has dead weakrefs. If so, we will
        # clean those up in connect, disconnect and _live_receivers while
        # holding self.lock. Note that doing the cleanup here isn't a good
        # idea, _remove_receiver() will be called as side effect of garbage
        # collection, and so the call can happen while we are already holding
        # self.lock.
        self._dead_receivers = True


def receiver(signal, **kwargs):
    """
    A decorator for connecting receivers to signals. Internally it simply wraps
    connect(); the decorator form just reads more directly. Used by passing in
    the signal (or list of signals) and keyword arguments to connect::

        @receiver(post_save, sender=MyModel)
        def signal_receiver(sender, **kwargs):
            ...

        @receiver([post_save, post_delete], sender=MyModel)
        def signals_receiver(sender, **kwargs):
            ...
    """

    def _decorator(func):
        if isinstance(signal, (list, tuple)):
            for s in signal:
                s.connect(func, **kwargs)
        else:
            signal.connect(func, **kwargs)
        return func

    return _decorator

2 Function notes

2.1 The _make_id function

def _make_id(target):
    if hasattr(target, "__func__"):
        return (id(target.__self__), id(target.__func__))
    return id(target)

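First, a close look at what it does. The target argument is a receiver, that is, either an ordinary function or a bound method.

  • For an ordinary function, id() gives the identity of target; the result is an integer.
  • For a bound method, the result is a tuple of two elements: the first is the identity of the object the method is bound to (target.__self__), the second is the identity of the underlying function (target.__func__).

Now look at how connect calls _make_id. The relevant fragments are: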

        if dispatch_uid:
            lookup_key = (dispatch_uid, _make_id(sender))
        else:
            lookup_key = (_make_id(receiver), _make_id(sender))

        # ... code omitted ...

        with self.lock:
            # ... code omitted ...
            if not any(r_key == lookup_key for r_key, _ in self.receivers):
                # The key is not in the receiver list yet, so add the receiver
                self.receivers.append((lookup_key, receiver))
            # ... code omitted ...

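lookup_key is clearly a tuple. Since the focus here is the receiver, consider only its first element: depending on the kind of receiver, it is either an integer or a tuple. The following example verifies this.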

from django.core.signals import request_started
from django.dispatch import receiver


class CustomSignal:
    def bound_method(self, signal=None, sender=None, environ=None, **kwargs):
        print("bound method receiver run")
        print(request_started.receivers)


custom_signal = CustomSignal()
request_started.connect(custom_signal.bound_method)


@receiver(request_started)
def common_function(signal=None, sender=None, environ=None, **kwargs):
    print("common method receiver run")

This example connects two receivers to the request_started signal:

  • a bound method: custom_signal.bound_method
  • an ordinary function: common_function

Running it gives:

bound method receiver run
[
    ((4507063040, 4496364336), <weakref at 0x10ca179c0; to 'function' at 0x10ca45300 (reset_queries)>),
    ((4507064640, 4496364336), <weakref at 0x10ca509a0; to 'function' at 0x10ca45940 (close_old_connections)>),
    (((4522035984, 4521976480), 4496364336), <weakref at 0x10d859310; to 'CustomSignal' at 0x10d88cb10>),
    ((4521976640, 4496364336), <weakref at 0x10d888b80; to 'function' at 0x10d87e340 (common_function)>)
]
common method receiver run

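The output shows that:

  • For the bound method, lookup_key is ((4522035984, 4521976480), 4496364336). Its first element is itself a tuple, namely (id(target.__self__), id(target.__func__)).
  • For the ordinary function, lookup_key is (4521976640, 4496364336).

The second element, 4496364336, is the same in every entry because all four receivers were connected without a sender, so it is _make_id(None), that is, NONE_ID.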

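2.2 Why threading.Lock is used

Django's Signal system has to cope with concurrency. In a multi-threaded application, several threads may operate on the same Signal object at once, for example connecting or disconnecting receivers or sending the signal. To keep the Signal object consistent and thread-safe, Django protects the critical sections with a threading.Lock.

threading.Lock is a basic synchronization primitive from the Python standard library that gives threads mutually exclusive access to a shared resource. With the lock, only one thread at a time can modify the receiver list of a Signal, which avoids race conditions and inconsistent state.

The lock is used in the following places:

  • When connecting a receiver, the duplicate check and the append to the receiver list both happen while the lock is held. This prevents an inconsistent list when several threads connect receivers at the same time.
with self.lock:
    self.receivers.append((lookup_key, receiver))
  • When disconnecting a receiver, the lock is likewise held while the receiver is removed from the list.
with self.lock:
    for index in range(len(self.receivers)):
        # ...
  • When sending a signal, send and send_robust do not take the lock themselves. _live_receivers holds it only while it clears dead receivers and collects the receivers for the sender; the receivers are called after the lock has been released.
with self.lock:
    self._clear_dead_receivers()
    for (receiverkey, r_senderkey), receiver in self.receivers:
        # ...

In short, Django uses threading.Lock around every piece of code that reads or modifies the receiver list, which avoids the inconsistencies and race conditions that concurrent modification would cause. This lets the Signal system work reliably in multi-threaded applications.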
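To make the role of the lock in connect concrete, here is a minimal sketch. MiniSignal is a hypothetical, heavily simplified stand-in written for this article, not Django's implementation; it keeps only the lookup key check and the append, and on_event and the "on-event" identifier are likewise made up for the example.

```python
import threading


class MiniSignal:
    """Hypothetical, heavily simplified stand-in for django.dispatch.Signal."""

    def __init__(self):
        self.receivers = []
        self.lock = threading.Lock()

    def connect(self, receiver, dispatch_uid=None):
        lookup_key = dispatch_uid if dispatch_uid else id(receiver)
        # The check and the append must form one atomic step. Without the
        # lock, two threads could both pass the check and both append.
        with self.lock:
            if not any(key == lookup_key for key, _ in self.receivers):
                self.receivers.append((lookup_key, receiver))


def on_event(**kwargs):  # hypothetical receiver
    return "handled"


signal = MiniSignal()

# Eight threads try to connect the same receiver under the same dispatch_uid.
threads = [
    threading.Thread(
        target=signal.connect,
        args=(on_event,),
        kwargs={"dispatch_uid": "on-event"},
    )
    for _ in range(8)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(signal.receivers))  # 1
```

The same sketch also shows the dispatch_uid rule from the connect docstring: only the first connection with a given identifier is stored, and later ones are ignored.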

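2.3 The difference between send and send_robust

The source code makes the difference between the two easy to see.

send and send_robust are both methods of the Signal class and both send the signal to its receivers. They differ in how they treat an exception raised by a receiver:

  • send: calls every receiver connected to the signal. If a receiver raises an exception, send does not handle it. The exception propagates to the caller of send and ends the dispatch loop, so the remaining receivers may never be called.
def send(self, sender, **named):
    # ... some code omitted ...
    responses = []
    for receiver in self._live_receivers(sender):
        response = receiver(signal=self, sender=sender, **named)
        responses.append((receiver, response))
    return responses
  • send_robust: also calls every receiver, but catches any exception (any subclass of Exception) that a receiver raises and stores the exception instance in the response list in place of a return value. In the Django 4.2 source shown in section 1 the error is also logged. The remaining receivers are still called.
def send_robust(self, sender, **named):
    # ... some code omitted ...
    responses = []
    for receiver in self._live_receivers(sender):
        try:
            response = receiver(signal=self, sender=sender, **named)
            responses.append((receiver, response))
        except Exception as err:
            responses.append((receiver, err))
    return responses

In short, the main difference between send and send_robust is how they handle exceptions raised by receivers. send lets the exception propagate and stops dispatching, while send_robust catches it and adds it to the response list for later handling. send_robust is therefore the safer choice when one failing receiver must not prevent the others from running.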
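The behavioral difference can be tried out with a small standalone sketch. The two functions below are simplified stand-ins modeled on the excerpts above (they take a plain list of receivers instead of a Signal); first, broken and last are hypothetical receivers.

```python
def send(receivers, **named):
    # Like Signal.send: the first exception propagates to the caller.
    return [(receiver, receiver(**named)) for receiver in receivers]


def send_robust(receivers, **named):
    # Like Signal.send_robust: exceptions are collected as responses.
    responses = []
    for receiver in receivers:
        try:
            response = receiver(**named)
        except Exception as err:
            responses.append((receiver, err))
        else:
            responses.append((receiver, response))
    return responses


calls = []


def first(**kwargs):
    calls.append("first")
    return 1


def broken(**kwargs):
    calls.append("broken")
    raise ValueError("boom")


def last(**kwargs):
    calls.append("last")
    return 3


receivers = [first, broken, last]

try:
    send(receivers, value=42)
except ValueError:
    pass
calls_after_send = list(calls)  # "last" was never reached

calls.clear()
results = send_robust(receivers, value=42)
calls_after_robust = list(calls)  # all three receivers ran

for receiver, response in results:
    if isinstance(response, Exception):
        print(f"{receiver.__name__} failed: {response!r}")
```

With send_robust the caller is responsible for inspecting the responses, as the final loop does; an exception that nobody looks at is otherwise silently carried in the list.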

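2.4 The _live_receivers helper

_live_receivers is an internal helper that filters the receiver list down to the receivers that are still valid for a given sender. When a signal is sent, Django has to find every live receiver that should respond to it. Receivers are held through weak references by default, so that the signal does not keep them alive; once the object a receiver points to has been destroyed, its weak reference is no longer valid. The live receivers therefore have to be filtered out before the signal is dispatched.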
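The weak reference behavior that _live_receivers relies on can be observed with the standard library alone. This is a minimal sketch under stated assumptions: Handler is a hypothetical receiver class, and the state dictionary merely plays the role of Signal._dead_receivers. It uses weakref.WeakMethod and weakref.finalize in the same way connect does.

```python
import gc
import weakref


class Handler:  # hypothetical receiver class
    def on_event(self, **kwargs):
        return "handled"


# Plays the role of Signal._dead_receivers in this sketch.
state = {"dead_receivers": False}


def mark_dead():
    state["dead_receivers"] = True


handler = Handler()

# Same pattern as connect(): WeakMethod for a bound method, and a finalizer
# on the owning object that only sets a flag.
ref = weakref.WeakMethod(handler.on_event)
weakref.finalize(handler, mark_dead)

# While the handler is alive, the reference resolves to a callable.
alive_result = ref()()

# Drop the only strong reference; the weak reference is now dead.
del handler
gc.collect()

# The filtering step of _live_receivers, reduced to its core.
stored = [ref]
live = [r() for r in stored if r() is not None]

print(alive_result, state["dead_receivers"], live)
```

Note that the finalizer only sets a flag and leaves the actual cleanup for later, which matches the comment in _remove_receiver: it runs as a side effect of garbage collection, possibly while the lock is already held.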
