[Source Code Analysis] Distributed Task Queue Celery: Sending Tasks & AMQP

Published by 羅西的思考 on 2021-04-19

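0x00 Abstract

Celery is a simple, flexible, and reliable distributed system for processing large volumes of messages. It is an asynchronous task queue focused on real-time processing, and it also supports task scheduling.

Previous articles analyzed the Task itself. This article focuses on how the client sends a Task and how Celery's amqp object is used.

As before, a few questions guide the reading:

  • When the client starts, how are the Celery application and the user-defined Task created?
  • What does the Task decorator do?
  • How is the message assembled when a Task is sent?
  • Which medium (module) is used to send the Task? amqp?
  • How is the Task stored in Redis after it has been sent?

Note: while organizing this series I found that one article had never been published, which could disrupt the reading order. It is added here; apologies for the inconvenience.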

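[Source Code Analysis] Message Queue Kombu: mailbox

[Source Code Analysis] Message Queue Kombu: Hub

[Source Code Analysis] Message Queue Kombu: Consumer

[Source Code Analysis] Message Queue Kombu: Producer

[Source Code Analysis] Message Queue Kombu: Startup Process

[Source Code Analysis] Message Queue Kombu: Basic Architecture

[Source Code Analysis] Parallel Distributed Framework Celery: Architecture (1)

[Source Code Analysis] Parallel Distributed Framework Celery: Architecture (2)

[Source Code Analysis] Parallel Distributed Framework Celery: Worker Startup (1)

[Source Code Analysis] Parallel Distributed Framework Celery: Worker Startup (2)

[Source Code Analysis] Distributed Task Queue Celery: Starting the Consumer

[Source Code Analysis] Parallel Distributed Task Queue Celery: What Is a Task

[Learning Design from Source Code] Celery: Sending Tasks & AMQP (this article; it explains sending a Task from the client's perspective)

[Source Code Analysis] Parallel Distributed Task Queue Celery: Dynamic Consumption Flow (the next article; it explains from the server's perspective how a received Task is consumed)

[Source Code Analysis] Parallel Distributed Task Queue Celery: Multi-Process Model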

0x01 Sample Code

We start with the sample code.

1.1 Server

The server-side sample code is below. It uses a decorator to wrap the task to be executed.

from celery import Celery

app = Celery('myTest', broker='redis://localhost:6379')

@app.task
def add(x,y):
    return x+y

if __name__ == '__main__':
    app.worker_main(argv=['worker'])

1.2 Client

The client code is below. It calls the add Task to perform an addition:

from myTest import add
re = add.apply_async((2,17))

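Now for the details. Everything below is the execution sequence on the client side.

0x02 System Startup

First, we look at how the Celery application and the task (instance) come into being on the client.

2.1 Creating the Celery App

The following code runs first and creates the Celery application named myTest.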

app = Celery('myTest', broker='redis://localhost:6379')

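2.2 The task Decorator

Celery uses a decorator to wrap the task to be executed. (Because different languages have similar concepts, this article may use the terms decorator and annotation interchangeably.)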

@app.task
def add(x,y):
    return x+y

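Applying the task decorator amounts to returning the result of the inner function _create_task_cls.

That function returns a proxy (PromiseProxy). When the proxy is first actually used, it runs _task_from_fun.

_task_from_fun adds the task to the app's task registry. In other words, calling _task_from_fun registers the task in the app's task list so that all tasks are shared, and that is how the client knows about this task.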

    def task(self, *args, **opts):
        """Decorator to create a task class out of any callable. """
        if USING_EXECV and opts.get('lazy', True):
            from . import shared_task
            return shared_task(*args, lazy=False, **opts)

        def inner_create_task_cls(shared=True, filter=None, lazy=True, **opts):
            _filt = filter

            def _create_task_cls(fun):
                if shared:
                    def cons(app):
                        return app._task_from_fun(fun, **opts) # _task_from_fun adds the task to the app's task registry so that all tasks are shared
                    cons.__name__ = fun.__name__
                    connect_on_app_finalize(cons)
                if not lazy or self.finalized:
                    ret = self._task_from_fun(fun, **opts)
                else:
                    # return a proxy object that evaluates on first use
                    ret = PromiseProxy(self._task_from_fun, (fun,), opts,
                                       __doc__=fun.__doc__)
                    self._pending.append(ret)
                if _filt:
                    return _filt(ret)
                return ret

            return _create_task_cls

        if len(args) == 1:
            if callable(args[0]):
                return inner_create_task_cls(**opts)(*args) # execution reaches here (bare @app.task)
        return inner_create_task_cls(**opts)
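
To make the lazy path concrete, here is a minimal standard-library sketch of the same idea: the decorator returns a proxy, and the task is only registered when the proxy is first used. `LazyProxy` and `MiniApp` are hypothetical names for illustration and are much simpler than Celery's `PromiseProxy` and `Celery` classes.

```python
class LazyProxy:
    """Defers calling `factory` until the proxy is first used."""

    def __init__(self, factory, *args):
        self._factory = factory
        self._args = args
        self._obj = None
        self.evaluated = False

    def _resolve(self):
        # Build the real object once, then reuse it.
        if not self.evaluated:
            self._obj = self._factory(*self._args)
            self.evaluated = True
        return self._obj

    def __call__(self, *args, **kwargs):
        return self._resolve()(*args, **kwargs)

    def __getattr__(self, name):
        # Only reached for attributes not found on the proxy itself.
        return getattr(self._resolve(), name)


class MiniApp:
    """Hypothetical stand-in for the app: it only keeps a task registry."""

    def __init__(self):
        self.tasks = {}

    def _task_from_fun(self, fun):
        name = f"{fun.__module__}.{fun.__name__}"
        self.tasks.setdefault(name, fun)
        return self.tasks[name]

    def task(self, fun):
        # Lazy path: hand back a proxy instead of registering right away.
        return LazyProxy(self._task_from_fun, fun)


app = MiniApp()


@app.task
def add(x, y):
    return x + y


registered_before = len(app.tasks)  # nothing registered yet
result = add(2, 17)                 # first use triggers registration
registered_after = len(app.tasks)
```

The point of the design is that decorating a function stays cheap at import time; the registration cost is paid only when the task is actually needed.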

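Let's look at this decorator in detail.

2.2.1 Adding the Task

During initialization, adding the task to each app calls app._task_from_fun(fun, **options).

It does the following:

  • checks the various parameters;
  • creates the task dynamically;
  • adds the task to _tasks;
  • calls the task's bind method to attach the relevant attributes to it.

The code is as follows: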

    def _task_from_fun(self, fun, name=None, base=None, bind=False, **options):

        name = name or self.gen_task_name(fun.__name__, fun.__module__)         # use the given name, else derive one from the module and function name
        base = base or self.Task                                                # use the given base class, else the app's Task (default celery.app.task:Task)

        if name not in self._tasks:                                             # only when no task with this name is registered yet
            run = fun if bind else staticmethod(fun)                            # bind=True: use fun as a method (it receives self); otherwise wrap it as a static method
            task = type(fun.__name__, (base,), dict({                           # dynamically create a Task subclass; the trailing () instantiates it
                'app': self,
                'name': name,                                                   # the task name
                'run': run,                                                     # the task's run method
                '_decorated': True,                                             # marks the task as created by the decorator
                '__doc__': fun.__doc__,
                '__module__': fun.__module__,
                '__header__': staticmethod(head_from_fun(fun, bound=bind)),
                '__wrapped__': run}, **options))()
            # for some reason __qualname__ cannot be set in type()
            # so we have to set it here.
            try:
                task.__qualname__ = fun.__qualname__
            except AttributeError:
                pass
            self._tasks[task.name] = task                                       # add the task to _tasks
            task.bind(self)  # connects task to this app                        # bind() attaches the related attributes to the task

            add_autoretry_behaviour(task, **options)
        else:
            task = self._tasks[name]
        return task  
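
The core trick in _task_from_fun is building a class at run time with type() and storing an instance of it in a registry. The following is a minimal sketch of that technique using only the standard library; `BaseTask`, `task_from_fun`, and the plain dict registry are hypothetical simplifications, not Celery's actual classes.

```python
class BaseTask:
    """Hypothetical base class: calling the task delegates to run()."""
    name = None

    def __call__(self, *args, **kwargs):
        return self.run(*args, **kwargs)


def task_from_fun(registry, fun, name=None, bind=False):
    name = name or f"{fun.__module__}.{fun.__name__}"
    if name not in registry:
        # Without bind, wrap in staticmethod so `self` is not passed to fun.
        run = fun if bind else staticmethod(fun)
        task_cls = type(fun.__name__, (BaseTask,), {
            "name": name,
            "run": run,
            "__doc__": fun.__doc__,
            "__module__": fun.__module__,
        })
        registry[name] = task_cls()  # an instance is stored, not the class
    return registry[name]


def add(x, y):
    """Add two numbers."""
    return x + y


def whoami(self):
    return self.name


registry = {}
add_task = task_from_fun(registry, add, name="myTest.add")
bound_task = task_from_fun(registry, whoami, name="myTest.whoami", bind=True)
same_task = task_from_fun(registry, add, name="myTest.add")  # registry hit
```

Registering the same name twice returns the existing instance, which mirrors the `else: task = self._tasks[name]` branch above.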

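2.2.2 Binding

bind attaches the relevant attributes to the task (it is a classmethod, so they are set on the task class). Knowing only the task's name or code is not enough; the task instance must be available at run time.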

@classmethod
def bind(cls, app):
    was_bound, cls.__bound__ = cls.__bound__, True
    cls._app = app                                          # set the class's _app attribute
    conf = app.conf                                         # get the app's configuration
    cls._exec_options = None  # clear option cache

    if cls.typing is None:
        cls.typing = app.strict_typing

    for attr_name, config_name in cls.from_config:          # fill in class-level defaults
        if getattr(cls, attr_name, None) is None:           # the attribute is unset
            setattr(cls, attr_name, conf[config_name])      # use the default from the app configuration

    # decorate with annotations from config.
    if not was_bound:
        cls.annotate()

        from celery.utils.threads import LocalStack
        cls.request_stack = LocalStack()                    # use a thread-local stack to store data

    # PeriodicTask uses this to add itself to the PeriodicTask schedule.
    cls.on_bound(app)

    return app
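
The from_config loop is the part of bind that is easiest to misread: an attribute is filled from the configuration only when the class has not set it. Below is a small sketch of that rule, assuming a plain dict as the configuration; the two (attribute, config key) pairs are an illustrative subset chosen for this example.

```python
class Task:
    # (attribute name, config key) pairs; an illustrative subset only.
    from_config = (
        ("serializer", "task_serializer"),
        ("ignore_result", "task_ignore_result"),
    )
    serializer = None
    ignore_result = None
    __bound__ = False

    @classmethod
    def bind(cls, conf):
        cls.__bound__ = True
        for attr_name, config_name in cls.from_config:
            # Only fill in attributes the class has not set itself.
            if getattr(cls, attr_name, None) is None:
                setattr(cls, attr_name, conf[config_name])
        return cls


class PickleTask(Task):
    serializer = "pickle"  # explicitly set, so bind() leaves it alone


conf = {"task_serializer": "json", "task_ignore_result": False}
PickleTask.bind(conf)
```

Because bind is a classmethod and uses setattr on cls, the defaults land on the concrete task class and do not leak into the shared base class.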

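2.3 Summary

At this point, on the client (caller) side, the Celery application has been started and a task instance has been created, with its attributes bound to it.

0x03 The amqp Class

When the client calls apply_async, app.send_task is called to actually send the task. It uses amqp, so we look at the amqp class first.

3.1 Creation

send_task contains the following code: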

    def send_task(self, ....):
        """Send task by name.
        """
        parent = have_parent = None
        amqp = self.amqp # created here, on first access

Here self is the Celery application itself. Printing it shows what a Celery application looks like.

self = {Celery} <Celery myTest at 0x1eeb5590488>
 AsyncResult = {type} <class 'celery.result.AsyncResult'>
 Beat = {type} <class 'celery.apps.beat.Beat'>
 GroupResult = {type} <class 'celery.result.GroupResult'>
 Pickler = {type} <class 'celery.app.utils.AppPickler'>
 ResultSet = {type} <class 'celery.result.ResultSet'>
 Task = {type} <class 'celery.app.task.Task'>
 WorkController = {type} <class 'celery.worker.worker.WorkController'>
 Worker = {type} <class 'celery.apps.worker.Worker'>
 amqp = {AMQP} <celery.app.amqp.AMQP object at 0x000001EEB5884188>
 amqp_cls = {str} 'celery.app.amqp:AMQP'
 backend = {DisabledBackend} <celery.backends.base.DisabledBackend object at 0x000001EEB584E248>
 clock = {LamportClock} 0
 control = {Control} <celery.app.control.Control object at 0x000001EEB57B37C8>
 events = {Events} <celery.app.events.Events object at 0x000001EEB56C7188>
 loader = {AppLoader} <celery.loaders.app.AppLoader object at 0x000001EEB5705408>
 main = {str} 'myTest'
 pool = {ConnectionPool} <kombu.connection.ConnectionPool object at 0x000001EEB57A9688>
 producer_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x000001EEB6297508>
 registry_cls = {type} <class 'celery.app.registry.TaskRegistry'>
 tasks = {TaskRegistry: 10} {'myTest.add': <@task: myTest.add of myTest at 0x1eeb5590488>, 'celery.accumulate': <@task: celery.accumulate of myTest at 0x1eeb5590488>, 'celery.chord_unlock': <@task: celery.chord_unlock of myTest at 0x1eeb5590488>, 'celery.chunks': <@task: celery.chunks of myTest at 0x1eeb5590488>, 'celery.backend_cleanup': <@task: celery.backend_cleanup of myTest at 0x1eeb5590488>, 'celery.group': <@task: celery.group of myTest at 0x1eeb5590488>, 'celery.map': <@task: celery.map of myTest at 0x1eeb5590488>, 'celery.chain': <@task: celery.chain of myTest at 0x1eeb5590488>, 'celery.starmap': <@task: celery.starmap of myTest at 0x1eeb5590488>, 'celery.chord': <@task: celery.chord of myTest at 0x1eeb5590488>}

The call stack is:

amqp, base.py:1205
__get__, objects.py:43
send_task, base.py:705
apply_async, task.py:565
<module>, myclient.py:4

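Why does a plain assignment create amqp? Because amqp is decorated with cached_property.

A function decorated with cached_property becomes an attribute of the object. The first time the attribute is accessed, the function is called; later accesses read the value directly from the instance dictionary. In the words of its docstring: "Caches the return value of the get method on first call".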

    @cached_property
    def amqp(self):
        """AMQP related functionality: :class:`~@amqp`."""
        return instantiate(self.amqp_cls, app=self)
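
The behavior is easy to observe in isolation. The stack above shows that the cached_property used here comes from Kombu (objects.py); the sketch below uses the standard library's functools.cached_property (Python 3.8+) as a stand-in, on the assumption that the first-access behavior is the same. The `App` class and its `calls` counter are hypothetical.

```python
from functools import cached_property


class App:
    def __init__(self):
        self.calls = 0

    @cached_property
    def amqp(self):
        self.calls += 1
        return object()  # stand-in for the AMQP helper object


app = App()
cached_before = "amqp" in vars(app)  # not yet in the instance dict
first = app.amqp                     # calls the function
second = app.amqp                    # served from the instance dict
cached_after = "amqp" in vars(app)
```

After the first access the value lives in the instance dictionary, so the function body is never run again for that object.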

3.2 Definition

The AMQP class is another layer of wrapping over the AMQP protocol implementation; here it is really a wrapper around Kombu's classes.

class AMQP:
    """App AMQP API: app.amqp."""

    Connection = Connection
    Consumer = Consumer
    Producer = Producer

    #: compat alias to Connection
    BrokerConnection = Connection

    queues_cls = Queues

    #: Cached and prepared routing table.
    _rtable = None

    #: Underlying producer pool instance automatically
    #: set by the :attr:`producer_pool`.
    _producer_pool = None

    # Exchange class/function used when defining automatic queues.
    # For example, you can use ``autoexchange = lambda n: None`` to use the
    # AMQP default exchange: a shortcut to bypass routing
    # and instead send directly to the queue named in the routing key.
    autoexchange = None

Printing it shows what amqp looks like.

amqp = {AMQP}  
 BrokerConnection = {type} <class 'kombu.connection.Connection'>
 Connection = {type} <class 'kombu.connection.Connection'>
 Consumer = {type} <class 'kombu.messaging.Consumer'>
 Producer = {type} <class 'kombu.messaging.Producer'>
 app = {Celery} <Celery myTest at 0x252bd2903c8>
 argsrepr_maxsize = {int} 1024
 autoexchange = {NoneType} None
 default_exchange = {Exchange} Exchange celery(direct)
 default_queue = {Queue} <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>
 kwargsrepr_maxsize = {int} 1024
 producer_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x00000252BDC8F408>
 publisher_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x00000252BDC8F408>
 queues = {Queues: 1} {'celery': <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>}
 queues_cls = {type} <class 'celery.app.amqp.Queues'>
 router = {Router} <celery.app.routes.Router object at 0x00000252BDC6B248>
 routes = {tuple: 0} ()
 task_protocols = {dict: 2} {1: <bound method AMQP.as_task_v1 of <celery.app.amqp.AMQP object at 0x00000252BDC74148>>, 2: <bound method AMQP.as_task_v2 of <celery.app.amqp.AMQP object at 0x00000252BDC74148>>}
 utc = {bool} True
  _event_dispatcher = {EventDispatcher} <celery.events.dispatcher.EventDispatcher object at 0x00000252BE750348>
  _producer_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x00000252BDC8F408>
  _rtable = {tuple: 0} ()

The structure is as follows:

+---------+
| Celery  |    +----------------------------+
|         |    |   celery.app.amqp.AMQP     |
|         |    |                            |
|         |    |                            |
|         |    |          BrokerConnection +----->  kombu.connection.Connection
|         |    |                            |
|   amqp+----->+          Connection       +----->  kombu.connection.Connection
|         |    |                            |
+---------+    |          Consumer         +----->  kombu.messaging.Consumer
               |                            |
               |          Producer         +----->  kombu.messaging.Producer
               |                            |
               |          producer_pool    +----->  kombu.pools.ProducerPool
               |                            |
               |          queues           +----->  celery.app.amqp.Queues
               |                            |
               |          router           +----->  celery.app.routes.Router
               +----------------------------+

0x04 Sending a Task

Next, we look at how the client sends a task.

from myTest import add
re = add.apply_async((2,17))

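In summary, the logic is:

  • Producer initialization sets up what the connection needs: for example, it calls self.connect to connect to the carrier through the configured Transport class and initializes the Channel (self.channel = self.connection);
  • Message is used to wrap the message;
  • Exchange maps the routing_key to a queue;
  • amqp is called to send the message;
  • Channel performs the final publish.

We go through these in detail below.

4.1 apply_async in task

Two points matter here:

  • If task_always_eager is set, a Kombu producer is acquired, the arguments are round-tripped through serialization, and the task is run locally via apply();
  • Otherwise, app.send_task is called, which uses amqp to send the task (this is the path we focus on).

An abridged version of the code: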

    def apply_async(self, args=None, kwargs=None, task_id=None, producer=None,
                    link=None, link_error=None, shadow=None, **options):
        """Apply tasks asynchronously by sending a message.
        """
        
        preopts = self._get_exec_options()
        options = dict(preopts, **options) if options else preopts

        app = self._get_app()
        if app.conf.task_always_eager:
            # acquire a producer
            with app.producer_or_acquire(producer) as eager_producer:      
                serializer = options.get('serializer')
                body = args, kwargs
                content_type, content_encoding, data = serialization.dumps(
                    body, serializer,
                )
                args, kwargs = serialization.loads(
                    data, content_type, content_encoding,
                    accept=[content_type]
                )
            with denied_join_result():
                return self.apply(args, kwargs, task_id=task_id or uuid(),
                                  link=link, link_error=link_error, **options)
        else:
            return app.send_task( # execution reaches here
                self.name, args, kwargs, task_id=task_id, producer=producer,
                link=link, link_error=link_error, result_cls=self.AsyncResult,
                shadow=shadow, task_type=self,
                **options
            )
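
The dumps/loads pair in the eager branch can look pointless, since the task runs in the same process. One plausible reason is that it makes eager execution see the arguments the way a worker would after they have crossed the wire. The sketch below illustrates that effect with the standard library's json module as a stand-in for Kombu's serialization; `roundtrip` is a hypothetical helper.

```python
import json


def roundtrip(args, kwargs):
    """Serialize and immediately deserialize the call arguments."""
    data = json.dumps([args, kwargs])
    new_args, new_kwargs = json.loads(data)
    return new_args, new_kwargs


# A tuple goes in, a list comes out, just as it would on a real worker.
args, kwargs = roundtrip((2, 17), {"scale": 1})

# Arguments that cannot be serialized fail here, before the task runs.
try:
    roundtrip((object(),), {})
    unserializable_ok = True
except TypeError:
    unserializable_ok = False
```

With this round trip, a task that only works with unserializable arguments fails in eager mode as well, instead of passing locally and breaking once a broker is involved.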

The state at this point:

         1  apply_async       +-------------------+
                              |                   |
User  +---------------------> | task: myTest.add  |
                              |                   |
                              +-------------------+

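4.2 send_task

This function builds the task message and calls amqp to send it:

  • get the amqp instance;
  • set the task id, generating one if none was passed in;
  • pick the router, falling back to amqp's router;
  • compute the routing options;
  • create the task message;
  • if a connection was passed in, create a producer from it;
  • send the task message;
  • create the async result instance;
  • return the result.

In detail: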

def send_task(self, name, ...):
    """Send task by name.
    """
    parent = have_parent = None
    amqp = self.amqp                                                    # get the amqp instance
    task_id = task_id or uuid()                                         # set the task id, generating one if none was passed in
    producer = producer or publisher  # XXX compat                      # accept the legacy publisher argument as the producer
    router = router or amqp.router                                      # pick the router, falling back to amqp's router
    options = router.route(
        options, route_name or name, args, kwargs, task_type)           # compute the routing options

    message = amqp.create_task_message( # create the task message
        task_id, name, args, kwargs, countdown, eta, group_id, group_index,
        expires, retries, chord,
        maybe_list(link), maybe_list(link_error),
        reply_to or self.thread_oid, time_limit, soft_time_limit,
        self.conf.task_send_sent_event,
        root_id, parent_id, shadow, chain,
        argsrepr=options.get('argsrepr'),
        kwargsrepr=options.get('kwargsrepr'),
    )

    if connection:
        producer = amqp.Producer(connection)                            # if a connection was passed in, create a producer from it

    with self.producer_or_acquire(producer) as P:
        with P.connection._reraise_as_library_errors():
            self.backend.on_task_call(P, task_id)
            amqp.send_task_message(P, name, message, **options)         # send the task message

    result = (result_cls or self.AsyncResult)(task_id)                  # create the async result instance
    if add_to_parent:
        if not have_parent:
            parent, have_parent = self.current_worker_task, True
        if parent:
            parent.add_trail(result)
    return result                                                       # return the result

The state at this point:

         1  apply_async       +-------------------+
                              |                   |
User  +---------------------> | task: myTest.add  |
                              |                   |
                              +--------+----------+
                                       |
                                       |
                        2 send_task    |
                                       |
                                       v
                                +------+--------+
                                | Celery myTest |
                                |               |
                                +------+--------+
                                       |
                                       |
                  3 send_task_message  |
                                       |
                                       v
                               +-------+---------+
                               |      amqp       |
                               |                 |
                               |                 |
                               +-----------------+

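4.3 Building the Message Content

as_task_v2 builds the actual message content. A message consists of several major parts:

  • headers: task name, task id, expires, and so on;
  • message type and encoding: content-type and content-encoding;
  • properties: delivery parameters used to tell queues apart, such as exchange and routing_key;
  • body: the message payload.

A complete message looks like this (this capture comes from a run that called add with the arguments (2, 8)):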

{
	"body": "W1syLCA4XSwge30sIHsiY2FsbGJhY2tzIjogbnVsbCwgImVycmJhY2tzIjogbnVsbCwgImNoYWluIjogbnVsbCwgImNob3JkIjogbnVsbH1d",
	"content-encoding": "utf-8",
	"content-type": "application/json",
	"headers": {
		"lang": "py",
		"task": "myTest.add",
		"id": "243aac4a-361b-4408-9e0c-856e2655b7b5",
		"shadow": null,
		"eta": null,
		"expires": null,
		"group": null,
		"group_index": null,
		"retries": 0,
		"timelimit": [null, null],
		"root_id": "243aac4a-361b-4408-9e0c-856e2655b7b5",
		"parent_id": null,
		"argsrepr": "(2, 8)",
		"kwargsrepr": "{}",
		"origin": "gen33652@DESKTOP-0GO3RPO"
	},
	"properties": {
		"correlation_id": "243aac4a-361b-4408-9e0c-856e2655b7b5",
		"reply_to": "b34fcf3d-da9a-3717-a76f-44b6a6362da1",
		"delivery_mode": 2,
		"delivery_info": {
			"exchange": "",
			"routing_key": "celery"
		},
		"priority": 0,
		"body_encoding": "base64",
		"delivery_tag": "fa1bc9c8-3709-4c02-9543-8d0fe3cf4e6c"
	}
}
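
The body field is not readable at a glance because, as the properties say, body_encoding is base64. Decoding it with the standard library shows the three-part structure (args, kwargs, embed). The string below is copied from the message above.

```python
import base64
import json

# The "body" value copied from the captured message.
body_b64 = (
    "W1syLCA4XSwge30sIHsiY2FsbGJhY2tzIjogbnVsbCwgImVycmJhY2tzIjogbnVsbCwg"
    "ImNoYWluIjogbnVsbCwgImNob3JkIjogbnVsbH1d"
)

raw = base64.b64decode(body_b64).decode("utf-8")  # content-encoding: utf-8
args, kwargs, embed = json.loads(raw)             # content-type: application/json

print(args, kwargs, embed)
```

The decoded args match the argsrepr header "(2, 8)", and embed carries the callbacks, errbacks, chain, and chord entries.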

The code is below. sent_event is needed later during sending and does not appear in the message content itself:

def as_task_v2(self, task_id, name, args=None, kwargs=None, ......):

    ......
    
    return task_message(
        headers={
            'lang': 'py',
            'task': name,
            'id': task_id,
            'shadow': shadow,
            'eta': eta,
            'expires': expires,
            'group': group_id,
            'group_index': group_index,
            'retries': retries,
            'timelimit': [time_limit, soft_time_limit],
            'root_id': root_id,
            'parent_id': parent_id,
            'argsrepr': argsrepr,
            'kwargsrepr': kwargsrepr,
            'origin': origin or anon_nodename()
        },
        properties={
            'correlation_id': task_id,
            'reply_to': reply_to or '',
        },
        body=(
            args, kwargs, {
                'callbacks': callbacks,
                'errbacks': errbacks,
                'chain': chain,
                'chord': chord,
            },
        ),
        sent_event={
            'uuid': task_id,
            'root_id': root_id,
            'parent_id': parent_id,
            'name': name,
            'args': argsrepr,
            'kwargs': kwargsrepr,
            'retries': retries,
            'eta': eta,
            'expires': expires,
        } if create_sent_event else None,
    )
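
The four keyword arguments passed to task_message suggest a simple record with four fields. The sketch below models that with a namedtuple, which is an assumption made for illustration; `TaskMessage` and `build_task_message` are hypothetical names, and only a few of the header fields are filled in.

```python
from collections import namedtuple
from uuid import uuid4

# Assumed shape: one record with the four parts named in the call above.
TaskMessage = namedtuple(
    "TaskMessage", ("headers", "properties", "body", "sent_event"))


def build_task_message(name, args=(), kwargs=None, task_id=None,
                       create_sent_event=False):
    task_id = task_id or str(uuid4())
    kwargs = kwargs or {}
    argsrepr, kwargsrepr = repr(tuple(args)), repr(kwargs)
    return TaskMessage(
        headers={"lang": "py", "task": name, "id": task_id,
                 "root_id": task_id, "parent_id": None, "retries": 0,
                 "argsrepr": argsrepr, "kwargsrepr": kwargsrepr},
        properties={"correlation_id": task_id, "reply_to": ""},
        body=(args, kwargs, {"callbacks": None, "errbacks": None,
                             "chain": None, "chord": None}),
        # Built only on request; it never becomes part of the wire message.
        sent_event={"uuid": task_id, "name": name, "args": argsrepr,
                    "kwargs": kwargsrepr} if create_sent_event else None,
    )


quiet = build_task_message("myTest.add", (2, 8))
loud = build_task_message("myTest.add", (2, 8), create_sent_event=True)
```

Keeping sent_event next to the message but outside headers, properties, and body is what lets the sender publish a task-sent event later without changing what the worker receives.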

4.4 send_task_message in amqp

amqp.send_task_message(P, name, message, **options) is how amqp sends the task.

This method mainly assembles the parameters of the task to be sent, such as connection, queue, exchange, and routing_key, and then calls the producer's publish to send it.

The basic routine is:

  • get the queue;
  • get the delivery_mode;
  • get the exchange;
  • get the retry policy and so on;
  • call the producer to send the message.

The abridged code:

        def send_task_message(producer, name, message,
                              exchange=None, routing_key=None, queue=None,
                              event_dispatcher=None,
                              retry=None, retry_policy=None,
                              serializer=None, delivery_mode=None,
                              compression=None, declare=None,
                              headers=None, exchange_type=None, **kwargs):
            # get the queue, delivery_mode, exchange, retry policy, and so on (elided)

            if before_receivers:
                send_before_publish(
                    sender=name, body=body,
                    exchange=exchange, routing_key=routing_key,
                    declare=declare, headers=headers2,
                    properties=properties, retry_policy=retry_policy,
                )
            
            ret = producer.publish(
                body,
                exchange=exchange,
                routing_key=routing_key,
                serializer=serializer or default_serializer,
                compression=compression or default_compressor,
                retry=retry, retry_policy=_rp,
                delivery_mode=delivery_mode, declare=declare,
                headers=headers2,
                **properties
            )
            if after_receivers:
                send_after_publish(sender=name, body=body, headers=headers2,
                                   exchange=exchange, routing_key=routing_key)
 
            .....
  
            if sent_event: # sent_event is handled here
                evd = event_dispatcher or default_evd
                exname = exchange
                if isinstance(exname, Exchange):
                    exname = exname.name
                sent_event.update({
                    'queue': qname,
                    'exchange': exname,
                    'routing_key': routing_key,
                })
                evd.publish('task-sent', sent_event,
                            producer, retry=retry, retry_policy=retry_policy)
            return ret
        return send_task_message
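
The elided part that resolves the queue, exchange, and routing_key can be pictured with a simplified sketch. It assumes that, for a direct exchange with no explicit exchange or routing key, the message goes through the anonymous exchange with the queue name as routing key, which is consistent with the `"exchange": ""` and `"routing_key": "celery"` values in the captured message. `Exchange`, `Queue`, and `resolve_route` are hypothetical stand-ins written for this example, not Kombu's or Celery's classes.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    name: str
    type: str = "direct"
    delivery_mode: int = 2


@dataclass
class Queue:
    name: str
    exchange: Exchange
    routing_key: str


def resolve_route(queues, default_queue, queue=None, exchange=None,
                  routing_key=None, delivery_mode=None):
    # Fall back to the default queue; allow queues to be named by string.
    if queue is None:
        queue = default_queue
    elif isinstance(queue, str):
        queue = queues[queue]
    if delivery_mode is None:
        delivery_mode = queue.exchange.delivery_mode
    if (not exchange or not routing_key) and queue.exchange.type == "direct":
        # Assumption: use the anonymous exchange, routing key = queue name.
        exchange, routing_key = "", queue.name
    else:
        exchange = exchange or queue.exchange.name
        routing_key = routing_key or queue.routing_key
    return {"queue": queue.name, "exchange": exchange,
            "routing_key": routing_key, "delivery_mode": delivery_mode}


celery_q = Queue("celery", Exchange("celery"), "celery")
logs_q = Queue("logs", Exchange("logs_ex", type="topic"), "logs.#")
queues = {"celery": celery_q, "logs": logs_q}

default_route = resolve_route(queues, celery_q)
topic_route = resolve_route(queues, celery_q, queue="logs",
                            routing_key="logs.error")
```

With the default queue, the result matches the variables shown in this section: qname is 'celery', the exchange is the empty string, and delivery_mode is 2.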

The call stack at this point:

send_task_message, amqp.py:473
send_task, base.py:749
apply_async, task.py:565
<module>, myclient.py:4

The variables at this point:

qname = {str} 'celery'
queue = {Queue} <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>
 ContentDisallowed = {type} <class 'kombu.exceptions.ContentDisallowed'>
 alias = {NoneType} None
 attrs = {tuple: 18} (('name', None), ('exchange', None), ('routing_key', None), ('queue_arguments', None), ('binding_arguments', None), ('consumer_arguments', None), ('durable', <class 'bool'>), ('exclusive', <class 'bool'>), ('auto_delete', <class 'bool'>), ('no_ack', None), ('alias', None), ('bindings', <class 'list'>), ('no_declare', <class 'bool'>), ('expires', <class 'float'>), ('message_ttl', <class 'float'>), ('max_length', <class 'int'>), ('max_length_bytes', <class 'int'>), ('max_priority', <class 'int'>))
 auto_delete = {bool} False
 binding_arguments = {NoneType} None
 bindings = {set: 0} set()
 can_cache_declaration = {bool} True
 channel = {str} 'Traceback (most recent call last):\n  File "C:\\Program Files\\JetBrains\\PyCharm Community Edition 2020.2.2\\plugins\\python-ce\\helpers\\pydev\\_pydevd_bundle\\pydevd_resolver.py", line 178, in _getPyDictionary\n    attr = getattr(var, n)\n  File "C:\\User
 consumer_arguments = {NoneType} None
 durable = {bool} True
 exchange = {Exchange} Exchange celery(direct)
 exclusive = {bool} False
 expires = {NoneType} None
 is_bound = {bool} False
 max_length = {NoneType} None
 max_length_bytes = {NoneType} None
 max_priority = {NoneType} None
 message_ttl = {NoneType} None
 name = {str} 'celery'
 no_ack = {bool} False
 no_declare = {NoneType} None
 on_declared = {NoneType} None
 queue_arguments = {NoneType} None
 routing_key = {str} 'celery'
  _channel = {NoneType} None
  _is_bound = {bool} False
queues = {Queues: 1} {'celery': <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>}

The logic at this point:

         1  apply_async       +-------------------+
                              |                   |
User  +---------------------> | task: myTest.add  |
                              |                   |
                              +--------+----------+
                                       |
                                       |
                          2 send_task  |
                                       |
                                       v
                                +------+--------+
                                | Celery myTest |
                                |               |
                                +------+--------+
                                       |
                                       |
                  3 send_task_message  |
                                       |
                                       v
                               +-------+---------+
                               |      amqp       |
                               +-------+---------+
                                       |
                                       |
                            4 publish  |
                                       |
                                       v
                                  +----+------+
                                  | producer  |
                                  |           |
                                  +-----------+

4.5 publish in producer

In the producer, the channel is called to send the message:

def _publish(self, body, priority, content_type, content_encoding,
             headers, properties, routing_key, mandatory,
             immediate, exchange, declare):
    channel = self.channel
    message = channel.prepare_message(
        body, priority, content_type,
        content_encoding, headers, properties,
    )
    if declare:
        maybe_declare = self.maybe_declare
        [maybe_declare(entity) for entity in declare]

    # handle autogenerated queue names for reply_to
    reply_to = properties.get('reply_to')
    if isinstance(reply_to, Queue):
        properties['reply_to'] = reply_to.name
    return channel.basic_publish( # send the message
        message,
        exchange=exchange, routing_key=routing_key,
        mandatory=mandatory, immediate=immediate,
    )

The variables are:

body = {str} '[[2, 8], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]'
compression = {NoneType} None
content_encoding = {str} 'utf-8'
content_type = {str} 'application/json'
declare = {list: 1} [<unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>]
delivery_mode = {int} 2
exchange = {str} ''
exchange_name = {str} ''
expiration = {NoneType} None
headers = {dict: 15} {'lang': 'py', 'task': 'myTest.add', 'id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f', 'parent_id': None, 'argsrepr': '(2, 8)', 'kwargsrepr': '{}', 'origin': 'gen11468@DESKTOP-0GO3RPO'}
immediate = {bool} False
mandatory = {bool} False
priority = {int} 0
properties = {dict: 3} {'correlation_id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f', 'reply_to': '2c938063-64b8-35f5-ac9f-a1c0915b6f71', 'delivery_mode': 2}
retry = {bool} True
retry_policy = {dict: 4} {'max_retries': 3, 'interval_start': 0, 'interval_max': 1, 'interval_step': 0.2}
routing_key = {str} 'celery'
self = {Producer} <Producer: <promise: 0x1eeb62c44c8>>
serializer = {str} 'json'

The logic at this point:

         1  apply_async       +-------------------+
                              |                   |
User  +---------------------> | task: myTest.add  |
                              |                   |
                              +--------+----------+
                                       |
                          2 send_task  |
                                       |
                                       v
                                +------+--------+
                                | Celery myTest |
                                |               |
                                +------+--------+
                                       |
                  3 send_task_message  |
                                       |
                                       v
                               +-------+---------+
                               |      amqp       |
                               +-------+---------+
                                       |
                            4 publish  |
                                       |
                                       v
                                  +----+------+
                                  | producer  |
                                  |           |
                                  +----+------+
                                       |
                                       |
                      5 basic_publish  |
                                       v
                                  +----+------+
                                  |  channel  |
                                  |           |
                                  +-----------+

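At this point the task has been sent and waits for a consumer to consume it.

4.6 Redis Content

After sending, the task is stored in a Redis list that serves as the queue. The content in Redis is: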

127.0.0.1:6379> keys *
1) "_kombu.binding.reply.testMailbox.pidbox"
2) "_kombu.binding.testMailbox.pidbox"
3) "celery"
4) "_kombu.binding.celeryev"
5) "_kombu.binding.celery"
6) "_kombu.binding.reply.celery.pidbox"
127.0.0.1:6379> lrange celery 0 -1
1) "{\"body\": \"W1syLCA4XSwge30sIHsiY2FsbGJhY2tzIjogbnVsbCwgImVycmJhY2tzIjogbnVsbCwgImNoYWluIjogbnVsbCwgImNob3JkIjogbnVsbH1d\", \"content-encoding\": \"utf-8\", \"content-type\": \"application/json\", \"headers\": {\"lang\": \"py\", \"task\": \"myTest.add\", \"id\": \"243aac4a-361b-4408-9e0c-856e2655b7b5\", \"shadow\": null, \"eta\": null, \"expires\": null, \"group\": null, \"group_index\": null, \"retries\": 0, \"timelimit\": [null, null], \"root_id\": \"243aac4a-361b-4408-9e0c-856e2655b7b5\", \"parent_id\": null, \"argsrepr\": \"(2, 8)\", \"kwargsrepr\": \"{}\", \"origin\": \"gen33652@DESKTOP-0GO3RPO\"}, \"properties\": {\"correlation_id\": \"243aac4a-361b-4408-9e0c-856e2655b7b5\", \"reply_to\": \"b34fcf3d-da9a-3717-a76f-44b6a6362da1\", \"delivery_mode\": 2, \"delivery_info\": {\"exchange\": \"\", \"routing_key\": \"celery\"}, \"priority\": 0, \"body_encoding\": \"base64\", \"delivery_tag\": \"fa1bc9c8-3709-4c02-9543-8d0fe3cf4e6c\"}}"

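4.6.1 What delivery_tag Is For

The final message contains a delivery_tag field, which deserves a special note.

delivery_tag can be regarded as the unique identifier of the message in Redis; it is in UUID format.

For example: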

"delivery_tag": "fa1bc9c8-3709-4c02-9543-8d0fe3cf4e6c"

Later, QoS uses delivery_tag for operations such as ack and reject:

with self.pipe_or_acquire() as pipe:
    pipe.zadd(self.unacked_index_key, *zadd_args) \
        .hset(self.unacked_key, delivery_tag,
              dumps([message._raw, EX, RK])) \
        .execute()
    super().append(message, delivery_tag)
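
The snippet above records each delivered message under its delivery_tag in two Redis structures: a sorted set used as an index and a hash holding the payload. The bookkeeping idea can be sketched in memory with two dicts. `UnackedStore` is a hypothetical class for illustration; it does not talk to Redis and leaves out everything the real transport does beyond this bookkeeping.

```python
import json
import time


class UnackedStore:
    """In-memory sketch of tracking unacknowledged messages by tag."""

    def __init__(self):
        self.unacked = {}        # delivery_tag -> payload (like the hash)
        self.unacked_index = {}  # delivery_tag -> timestamp (like the sorted set)

    def append(self, message, delivery_tag, exchange, routing_key):
        self.unacked_index[delivery_tag] = time.time()
        self.unacked[delivery_tag] = json.dumps(
            [message, exchange, routing_key])

    def ack(self, delivery_tag):
        # Acknowledged messages are simply forgotten.
        self.unacked_index.pop(delivery_tag, None)
        return self.unacked.pop(delivery_tag, None) is not None

    def restore(self, delivery_tag, queue):
        # Unacknowledged messages can be put back on the queue.
        payload = self.unacked.pop(delivery_tag, None)
        self.unacked_index.pop(delivery_tag, None)
        if payload is not None:
            message, _exchange, _routing_key = json.loads(payload)
            queue.append(message)


store = UnackedStore()
queue = []
store.append({"id": "t1"}, "tag-1", "", "celery")
store.append({"id": "t2"}, "tag-2", "", "celery")
acked = store.ack("tag-1")
acked_again = store.ack("tag-1")  # already gone
store.restore("tag-2", queue)
```

Because every operation is keyed by delivery_tag, the tag has to be unique per delivered message, which is why a UUID is a natural choice.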

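4.6.2 When delivery_tag Is Generated

The question is when delivery_tag is generated during sending.

It turns out to be generated in the Channel's _next_delivery_tag function, which is called just before the message is published, as part of augmenting the message.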

def _next_delivery_tag(self):
    return uuid()
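
The stack below shows _next_delivery_tag being called from _inplace_augment_message. Judging from the extra fields in the captured message (body_encoding, delivery_tag, delivery_info), that step can be sketched roughly as follows. `augment_message` is a hypothetical function that reconstructs the visible effect with the standard library; it is not the Kombu implementation.

```python
import base64
import json
import uuid


def augment_message(message, exchange, routing_key):
    """Encode the body and stamp delivery metadata onto the message."""
    body_bytes = message["body"].encode("utf-8")
    message["body"] = base64.b64encode(body_bytes).decode("ascii")
    props = message["properties"]
    props.update(body_encoding="base64",
                 delivery_tag=str(uuid.uuid4()))  # fresh tag per publish
    props.setdefault("delivery_info", {}).update(
        exchange=exchange, routing_key=routing_key)
    return message


original_body = json.dumps(
    [[2, 8], {}, {"callbacks": None, "errbacks": None,
                  "chain": None, "chord": None}])
msg = {
    "body": original_body,
    "properties": {"correlation_id": "example-id", "delivery_mode": 2},
}
augment_message(msg, "", "celery")
tag = msg["properties"]["delivery_tag"]
```

The tag is created on the publishing side, so it already exists when the message is written to Redis and the consumer only reads it.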

The call stack is:

_next_delivery_tag, base.py:595
_inplace_augment_message, base.py:614
basic_publish, base.py:599
_publish, messaging.py:200
_ensured, connection.py:525
publish, messaging.py:178
send_task_message, amqp.py:532
send_task, base.py:749
apply_async, task.py:565
<module>, myclient.py:4

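This completes the client-side flow of sending a task. If you are interested, see [Source Code Analysis] Parallel Distributed Task Queue Celery: Dynamic Consumption Flow, which explains from the server's perspective how a received Task is consumed.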

