[Source code analysis] Parallel distributed task queue Celery: the dynamic flow of message consumption
0x00 Summary
Celery is a simple, flexible, and reliable distributed system for processing large volumes of messages. It is an asynchronous task queue focused on real-time processing, and it also supports task scheduling.
In the previous articles (linked at the end), we covered how Celery starts up and introduced the Task abstraction. In this article we trace the internal consumption flow inside Celery (including Kombu) once a task message arrives, stopping just before the multiprocessing stage.
The goal is an interim summary: to organize the picture so far and prepare for the next stage, the analysis of multiprocessing.
Because this is a concrete walkthrough of the flow, it involves quite a few stack traces and runtime variable dumps; please bear with them.
0x01 Background
While analyzing the startup of the Celery worker, we saw that Celery ends up running a loop waiting for tasks to consume, and that the callback registered at startup is on_task_received. An abridged call stack looks like this:
on_task_received, consumer.py:542
_receive_callback, messaging.py:620
_callback, base.py:630
_deliver, base.py:980
_brpop_read, redis.py:748
on_readable, redis.py:358
create_loop, hub.py:361
asynloop, loops.py:81
start, consumer.py:592
start, bootsteps.py:116
start, consumer.py:311
start, bootsteps.py:365
start, worker.py:204
worker, worker.py:327
caller, base.py:132
new_func, decorators.py:21
invoke, core.py:610
main, core.py:782
start, base.py:358
worker_main, base.py:374
From this stack we can roughly see the logical flow:
- On the Kombu side:
  - the message loop (hub) picks up the message;
  - the broker abstraction (Transport) and the execution engine (MultiChannelPoller) process and decode it;
  - Kombu then invokes the callback registered by Celery;
- On the Celery side:
  - starting from that callback, the logical entry point, a Strategy is looked up to handle the message according to its type;
  - the work is then delegated to a Worker, which runs the user's task;
  - to improve throughput, the work is executed concurrently, i.e. multiple workers run tasks in parallel;
A stack trace alone does not give the whole picture, so in this article we look at how Celery actually consumes a message.
Concretely, we start from poll: a new task message arrives in Redis, and the file descriptor (FD) behind Celery's BRPOP command receives a poll event.
0x02 Logic in Kombu
We start on the Kombu side.
First, here is the overall logic diagram for the Kombu part, to give an intuitive picture:
+-------------+ +-------------------+ +-------------------------+
| hub | 1 | Transport | 2 |MultiChannelPoller |
| | fileno | | cycle.on_readable(fileno) | |
| cb +--------------> on_readable +-------------------------------------> _fd_to_chan[fileno] |
| | | | | |
| poll | | +-<---------------+ | chan.handlers[type]+---------------+
+-------------+ | _callbacks[queue]| | | | |
| + | | +-------------------------+ |
| | | | |
+-------------------+ | |
| | |
| | +-----------------------+ |
| | | Channel | 3 |
| | | | _brpop_read |
| | | | |
| +----------------+ connection +<------------+
| _deliver(message, queue)| |
| 5 4 | |
| callback(message) | |
+----------------------------------------------> callback(message)+---------------+
+-----------------------+ |
|
+----------------------+ |
| Consumer | |
on_m(message) | | |
+---------------------------+ on_message | <------------+
| | | _receive_callback
kombu | +----------------------+ 6
|
+-----------------------------------------------------------------------------------------------------------------------+
|
Celery |
+---------------------------+
| Consumer | |
| | |
| v |
| on_task_received |
| |
| |
+---------------------------+
In the diagram, the flow is split into the Kombu and Celery domains: the message starts in Kombu and then arrives in Celery.
2.1 Message loop -- hub in kombu
We start with the message loop, the hub.
The relevant code is in kombu/asynchronous/hub.py:
When poll reports an event, the hub invokes the cb registered in readers[fd]. The fd here is the file descriptor of the Redis socket.
An abridged version of the code:
def create_loop(self,
generator=generator, sleep=sleep, min=min, next=next,
Empty=Empty, StopIteration=StopIteration,
KeyError=KeyError, READ=READ, WRITE=WRITE, ERR=ERR):
readers, writers = self.readers, self.writers
poll = self.poller.poll
while 1:
if readers or writers:
to_consolidate = []
try:
events = poll(poll_timeout)
for fd, event in events or ():
if event & READ:
cb, cbargs = readers[fd]
cb(*cbargs)
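The core pattern above — register a callback per file descriptor, then dispatch on READ events — can be reduced to a small self-contained sketch (this is not Kombu's actual API; a socketpair stands in for the Redis socket):

```python
import select
import socket

# Minimal sketch of the hub's readers[fd] -> callback dispatch.
rsock, wsock = socket.socketpair()
received = []
readers = {}  # fd -> (callback, args), playing the role of Hub.readers

def on_readable(fd):
    received.append(rsock.recv(1024))

readers[rsock.fileno()] = (on_readable, (rsock.fileno(),))

poller = select.poll()
poller.register(rsock.fileno(), select.POLLIN)

wsock.send(b"task")                  # simulate the broker pushing a message
for fd, event in poller.poll(1000):  # events = poll(poll_timeout)
    if event & select.POLLIN:
        cb, cbargs = readers[fd]     # the same lookup create_loop performs
        cb(*cbargs)
# received == [b"task"]
```

The real hub loop differs in that it runs forever and also tracks writers and errors, but the fd-to-callback dict is the same idea.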
2.2 Broker abstraction -- Transport in kombu
What is registered in readers[fd] is the on_readable callback of the Transport class, so the code arrives at Transport.
Its job is to delegate the handling to MultiChannelPoller.
The code is in kombu/transport/redis.py; here self is the Transport and cycle is the MultiChannelPoller.
def on_readable(self, fileno):
"""Handle AIO event for one of our file descriptors."""
self.cycle.on_readable(fileno)
The variables at this point:
fileno = {int} 34
self = {Transport} <kombu.transport.redis.Transport object at 0x7fcbaeeb6710>
As shown below, the flow has reached Transport:
+-------------+ +---------------+
| hub | | Transport |
| | fileno | |
| cb +--------------> on_readable |
| | | |
| poll | | |
+-------------+ +---------------+
2.3 Execution engine --- MultiChannelPoller in kombu
The code now reaches MultiChannelPoller. As we saw earlier in this series, MultiChannelPoller ties Channels to poll. Its job here is to hand the polled fd over to the corresponding Channel for further processing.
As the code shows, each fd maps to a Channel: poll only tells Celery that some fd has data; actually reading the message requires further processing.
Because the Celery task queue is implemented with the Redis BRPOP command, the handler retrieved here is _brpop_read, the callback matching BRPOP.
The code is in kombu/transport/redis.py.
def on_readable(self, fileno):
chan, type = self._fd_to_chan[fileno]
if chan.qos.can_consume():
chan.handlers[type]()
The variables at this point show each of the pieces involved:
chan.handlers[type] = {method} <bound method Channel._brpop_read of <kombu.transport.redis.Channel object at 0x7fcbaeeb68d0>>
chan = {Channel} <kombu.transport.redis.Channel object at 0x7fcbaeeb68d0>
fileno = {int} 34
self = {MultiChannelPoller} <kombu.transport.redis.MultiChannelPoller object at 0x7fcbaddfd048>
type = {str} 'BRPOP'
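The poller's job reduces to one dict lookup: each registered fd maps to the channel that owns it plus the Redis command awaiting a reply, and the matching handler is invoked. A toy version (the field names follow the real ones, the rest is illustrative):

```python
class MiniChannel:
    """Toy channel: handlers maps a command type to its read callback."""
    def __init__(self):
        self.handlers = {'BRPOP': lambda: 'one task message'}

class MiniPoller:
    """Toy MultiChannelPoller: route a readable fd to its channel handler."""
    def __init__(self):
        self._fd_to_chan = {}  # fd -> (channel, command type)

    def watch(self, fd, channel, type_):
        self._fd_to_chan[fd] = (channel, type_)

    def on_readable(self, fileno):
        chan, type_ = self._fd_to_chan[fileno]
        return chan.handlers[type_]()  # e.g. Channel._brpop_read

poller = MiniPoller()
poller.watch(34, MiniChannel(), 'BRPOP')
result = poller.on_readable(34)
# result == 'one task message'
```

The real class additionally checks chan.qos.can_consume() before dispatching, so prefetch limits are respected.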
2.4 Decoding the message -- Channel in kombu
The code now reaches Channel, in kombu/transport/redis.py.
The Channel's job here is to read the message via the redis client and decode it, extracting the queue name (the dest variable in the snippet) that identifies which consumer (the one bound to that queue) should handle the message. It then dispatches the message via self.connection._deliver.
The _brpop_read code:
def _brpop_read(self, **options):
try:
try:
dest__item = self.client.parse_response(self.client.connection,
'BRPOP',
**options)
except self.connection_errors:
# if there's a ConnectionError, disconnect so the next
# iteration will reconnect automatically.
self.client.connection.disconnect()
raise
if dest__item:
dest, item = dest__item
dest = bytes_to_str(dest).rsplit(self.sep, 1)[0]
self._queue_cycle.rotate(dest)
self.connection._deliver(loads(bytes_to_str(item)), dest)  # dispatch the message
return True
else:
raise Empty()
finally:
self._in_poll = None
The variables at this point:
dest = {str} 'celery'
dest__item = {tuple: 2}
0 = {bytes: 6} b'celery'
1 = {bytes: 861} b'{"body": "W1syLCAxN10sIHt9LCB7ImNhbGxiYWNrcyI6IG51bGwsICJlcnJiYWNrcyI6IG51bGwsICJjaGFpbiI6IG51bGwsICJjaG9yZCI6IG51bGx9XQ==", "content-encoding": "utf-8", "content-type": "application/json", "headers": {"lang": "py", "task": "myTest.add", "id": "863cf9b2-
item = b'{"body": "W1syLCAxN10sIHt9LCB7ImNhbGxiYWNrcyI6IG51bGwsICJlcnJiYWNrcyI6IG51bGwsICJjaGFpbiI6IG51bGwsICJjaG9yZCI6IG51bGx9XQ==", "content-encoding": "utf-8", "content-type": "application/json", "headers": {"lang": "py", "task": "myTest.add", "id": "863cf9b2-
self = {Channel} <kombu.transport.redis.Channel object at 0x7fcbaeeb68d0>
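The essential transformation _brpop_read performs — split the queue name off the BRPOP reply and deserialize the payload — can be illustrated offline (the separator value below is an assumption standing in for Channel.sep, the suffix used for priority queues):

```python
import json

SEP = '\x06\x16'  # stand-in for Channel.sep, the priority-suffix separator

def parse_brpop_reply(dest__item):
    """Mimic the core of _brpop_read: recover queue name and message dict."""
    dest, item = dest__item
    # Strip any priority suffix: b'celery\x06\x163' -> 'celery'
    dest = dest.decode('utf-8').rsplit(SEP, 1)[0]
    return dest, json.loads(item.decode('utf-8'))

reply = (b'celery\x06\x163',
         b'{"body": "...", "headers": {"task": "myTest.add"}}')
queue, msg = parse_brpop_reply(reply)
# queue == 'celery'; msg['headers']['task'] == 'myTest.add'
```

This is why _deliver receives a plain dict plus a queue name: the Redis-specific framing has already been stripped away at the Channel layer.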
2.5 Starting the callback -- Transport in kombu
The code returns to Transport.
Its job now is to invoke the callback recorded for the queue in self._callbacks, the dict that records the callback registered per queue. The _callback here is <function Channel.basic_consume.<locals>._callback>.
The variables also show the concrete format and content of a task message, e.g. {'exchange': '', 'routing_key': 'celery'}, from which the target queue can be determined.
The code is in kombu/transport/virtual/base.py.
def _deliver(self, message, queue):
try:
callback = self._callbacks[queue]
except KeyError:
logger.warning(W_NO_CONSUMERS, queue)
self._reject_inbound_message(message)
else:
callback(message)
The variables below show that the three callbacks registered by Celery correspond to three different functions:
- celeryev.c755f81c-415e-478f-bb51-def341a96c0c handles Events;
- celery@.celery.pidbox handles control commands;
- celery handles normal message consumption;
self._callbacks = {dict: 3}
'celeryev.c755f81c-415e-478f-bb51-def341a96c0c' = {function} <function Channel.basic_consume.<locals>._callback at 0x7fcbaef23048>
'celery@.celery.pidbox' = {function} <function Channel.basic_consume.<locals>._callback at 0x7fcbaef56488>
'celery' = {function} <function Channel.basic_consume.<locals>._callback at 0x7fcbaef56d08>
message = {dict: 5} {'body': 'W1syLCAxN10sIHt9LCB7ImNhbGxiYWNrcyI6IG51bGwsICJlcnJiYWNrcyI6IG51bGwsICJjaGFpbiI6IG51bGwsICJjaG9yZCI6IG51bGx9XQ==', 'content-encoding': 'utf-8', 'content-type': 'application/json', 'headers': {'lang': 'py', 'task': 'myTest.add', 'id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'parent_id': None, 'argsrepr': '(2, 17)', 'kwargsrepr': '{}', 'origin': 'gen19806@ demini'}, 'properties': {'correlation_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'reply_to': 'ef1b446d-e3a9-3345-b027-b7bd8a93aa93', 'delivery_mode': 2, 'delivery_info': {'exchange': '', 'routing_key': 'celery'}, 'priority': 0, 'body_encoding': 'base64', 'delivery_tag': 'cfa3a261-c9b4-4d7e-819c-37608c0bb0cc'}}
queue = {str} 'celery'
self = {Transport} <kombu.transport.redis.Transport object at 0x7fcbaeeb6710>
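The _callbacks registry is just a queue-name-to-callback dict. A simplified sketch of the register/deliver pattern (not the virtual Transport's full API):

```python
import logging

logger = logging.getLogger(__name__)

class MiniTransport:
    """Toy version of virtual Transport's per-queue callback registry."""
    def __init__(self):
        self._callbacks = {}

    def register(self, queue, callback):
        self._callbacks[queue] = callback  # done by basic_consume in Kombu

    def _deliver(self, message, queue):
        try:
            callback = self._callbacks[queue]
        except KeyError:
            # no consumer bound to this queue; the real code also rejects
            logger.warning('Message for queue %r without consumers', queue)
        else:
            callback(message)

t = MiniTransport()
seen = []
t.register('celery', seen.append)
t._deliver({'headers': {'task': 'myTest.add'}}, 'celery')
# seen == [{'headers': {'task': 'myTest.add'}}]
```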
The logic diagram at this point:
+-------------+ +-------------------+ +-------------------------+
| hub | 1 | Transport | 2 |MultiChannelPoller |
| | fileno | | cycle.on_readable(fileno) | |
| cb +--------------> on_readable +-------------------------------------> _fd_to_chan[fileno] |
| | | | | |
| poll | | +-<---------------+ | chan.handlers[type]+------------+
+-------------+ | _callbacks[queue]| | | | |
| | | +-------------------------+ |
| | | |
+-------------------+ | |
| |
| +-----------------+ |
| | Channel | 3 |
| | | _brpop_read |
| | | |
+----------------+ connection | <--------------+
_deliver(message, queue)| |
4 | |
+-----------------+
2.6 Continuing the callback -- Channel in kombu
The callback chain continues into kombu/transport/virtual/base.py.
This is the per-queue callback installed by basic_consume: now that the channel has the redis message for this queue, it invokes the callback registered for the queue, which in turn calls the Consumer's callback.
def basic_consume(self, queue, no_ack, callback, consumer_tag, **kwargs):
"""Consume from `queue`."""
self._tag_to_queue[consumer_tag] = queue
self._active_queues.append(queue)
def _callback(raw_message):
message = self.Message(raw_message, channel=self)
if not no_ack:
self.qos.append(message, message.delivery_tag)
return callback(message)
self.connection._callbacks[queue] = _callback
self._consumers.add(consumer_tag)
self._reset_cycle()
The variables at this point:
callback = {method} <bound method Consumer._receive_callback of <Consumer: [<Queue celery -> <Exchange celery(direct) bound to chan:1> -> celery bound to chan:1>]>>
message = {Message} <Message object at 0x7fcbaef3eaf8 with details {'state': 'RECEIVED', 'content_type': 'application/json', 'delivery_tag': 'cfa3a261-c9b4-4d7e-819c-37608c0bb0cc', 'body_length': 82, 'properties': {'correlation_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'}, 'd
raw_message = {dict: 5} {'body': 'W1syLCAxN10sIHt9LCB7ImNhbGxiYWNrcyI6IG51bGwsICJlcnJiYWNrcyI6IG51bGwsICJjaGFpbiI6IG51bGwsICJjaG9yZCI6IG51bGx9XQ==', 'content-encoding': 'utf-8', 'content-type': 'application/json', 'headers': {'lang': 'py', 'task': 'myTest.add', 'id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'parent_id': None, 'argsrepr': '(2, 17)', 'kwargsrepr': '{}', 'origin': 'gen19806@ demini'}, 'properties': {'correlation_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'reply_to': 'ef1b446d-e3a9-3345-b027-b7bd8a93aa93', 'delivery_mode': 2, 'delivery_info': {'exchange': '', 'routing_key': 'celery'}, 'priority': 0, 'body_encoding': 'base64', 'delivery_tag': 'cfa3a261-c9b4-4d7e-819c-37608c0bb0cc'}}
self = {Channel} <kombu.transport.redis.Channel object at 0x7fcbaeeb68d0>
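The closure trick in basic_consume — wrap the consumer's callback so every raw message is first promoted to a Message object and, unless no_ack is set, tracked for acknowledgement — looks like this in isolation (class and field names are illustrative):

```python
class Message:
    """Minimal stand-in for kombu's Message wrapper."""
    def __init__(self, raw, channel):
        self.payload = raw
        self.channel = channel
        self.delivery_tag = raw.get('properties', {}).get('delivery_tag')

class MiniChannel:
    def __init__(self):
        self.unacked = {}       # plays the role of the qos bookkeeping
        self.callbacks = {}     # queue -> wrapped callback

    def basic_consume(self, queue, no_ack, callback):
        def _callback(raw_message):
            message = Message(raw_message, channel=self)
            if not no_ack:
                self.unacked[message.delivery_tag] = message
            return callback(message)
        self.callbacks[queue] = _callback

chan = MiniChannel()
got = []
chan.basic_consume('celery', no_ack=False, callback=got.append)
chan.callbacks['celery']({'properties': {'delivery_tag': 'tag-1'}})
# got[0] is a Message; 'tag-1' is now awaiting acknowledgement
```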
The logic diagram at this point:
+-------------+ +-------------------+ +-------------------------+
| hub | 1 | Transport | 2 |MultiChannelPoller |
| | fileno | | cycle.on_readable(fileno) | |
| cb +--------------> on_readable +-------------------------------------> _fd_to_chan[fileno] |
| | | | | |
| poll | | +-<---------------+ | chan.handlers[type]+---------------+
+-------------+ | _callbacks[queue]| | | | |
| + | | +-------------------------+ |
| | | | |
+-------------------+ | |
| | |
| | +-----------------------+ |
| | | Channel | 3 |
| | | | _brpop_read |
| | | | |
| +----------------+ connection +<------------+
| _deliver(message, queue)| |
| 5 4 | |
| callback(message) | |
+----------------------------------------------> callback(message)+--------------->
+-----------------------+
2.7 Invoking the callback -- Consumer in kombu
The Kombu Consumer callback is located in kombu/messaging.py.
It invokes the callback that the user registered on the Kombu Consumer. Note that the "user" of the Kombu Consumer is Celery itself, so this immediately calls into the callback that Celery registered earlier.
def _receive_callback(self, message):
accept = self.accept
on_m, channel, decoded = self.on_message, self.channel, None
try:
m2p = getattr(channel, 'message_to_python', None)
if m2p:
message = m2p(message)
if accept is not None:
message.accept = accept
if message.errors:
return message._reraise_error(self.on_decode_error)
decoded = None if on_m else message.decode()
except Exception as exc:
if not self.on_decode_error:
raise
self.on_decode_error(message, exc)
else:
return on_m(message) if on_m else self.receive(decoded, message)
The variables:
on_m = {function} <function Consumer.create_task_handler.<locals>.on_task_received at 0x7fcbaef562f0>
accept = {set: 1} {'application/json'}
channel = {Channel} <kombu.transport.redis.Channel object at 0x7fcbaeeb68d0>
m2p = {method} <bound method Channel.message_to_python of <kombu.transport.redis.Channel object at 0x7fcbaeeb68d0>>
message = {Message} <Message object at 0x7fcbaef3eaf8 with details {'state': 'RECEIVED', 'content_type': 'application/json', 'delivery_tag': 'cfa3a261-c9b4-4d7e-819c-37608c0bb0cc', 'body_length': 82, 'properties': {'correlation_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'}, 'delivery_info': {'exchange': '', 'routing_key': 'celery'}}>
self = {Consumer} <Consumer: [<Queue celery -> <Exchange celery(direct) bound to chan:1> -> celery bound to chan:1>]>
The logic at this point:
+-------------+ +-------------------+ +-------------------------+
| hub | 1 | Transport | 2 |MultiChannelPoller |
| | fileno | | cycle.on_readable(fileno) | |
| cb +--------------> on_readable +-------------------------------------> _fd_to_chan[fileno] |
| | | | | |
| poll | | +-<---------------+ | chan.handlers[type]+---------------+
+-------------+ | _callbacks[queue]| | | | |
| + | | +-------------------------+ |
| | | | |
+-------------------+ | |
| | |
| | +-----------------------+ |
| | | Channel | 3 |
| | | | _brpop_read |
| | | | |
| +----------------+ connection +<------------+
| _deliver(message, queue)| |
| 5 4 | |
| callback(message) | |
+----------------------------------------------> callback(message)+---------------+
+-----------------------+ |
|
+----------------------+ |
| Consumer | |
on_m(message) | | |
+---------------------------+ on_message | <------------+
| | | _receive_callback
| +----------------------+ 6
|
+-----------------------------------------------------------------------------------------------------------------------+
|
v
2.8 Entering the Celery domain -- Consumer in Celery
Since the callback registered by Celery is now being invoked, we have effectively entered Celery territory.
2.8.1 Configuring the callback
Let's recall when Celery configured this callback.
In celery/worker/loops.py, the following code wires the consumer up for the callback:
def asynloop(obj, connection, consumer, blueprint, hub, qos,
heartbeat, clock, hbrate=2.0):
"""Non-blocking event loop."""
consumer.on_message = on_task_received
2.8.2 The callback function
The callback lives in celery/worker/consumer/consumer.py.
As we can see, create_task_handler returns on_task_received, which is the callback.
def create_task_handler(self, promise=promise):
strategies = self.strategies
on_unknown_message = self.on_unknown_message
on_unknown_task = self.on_unknown_task
on_invalid_task = self.on_invalid_task
callbacks = self.on_task_message
call_soon = self.call_soon
def on_task_received(message):
# payload will only be set for v1 protocol, since v2
# will defer deserializing the message body to the pool.
payload = None
try:
type_ = message.headers['task'] # protocol v2
except TypeError:
return on_unknown_message(None, message)
except KeyError:
try:
payload = message.decode()
except Exception as exc: # pylint: disable=broad-except
return self.on_decode_error(message, exc)
try:
type_, payload = payload['task'], payload # protocol v1
except (TypeError, KeyError):
return on_unknown_message(payload, message)
try:
strategy = strategies[type_]
except KeyError as exc:
return on_unknown_task(None, message, exc)
else:
try:
strategy(
message, payload,
promise(call_soon, (message.ack_log_error,)),
promise(call_soon, (message.reject_log_error,)),
callbacks,
)
except (InvalidTaskError, ContentDisallowed) as exc:
return on_invalid_task(payload, message, exc)
except DecodeError as exc:
return self.on_decode_error(message, exc)
return on_task_received
The variables at this point:
call_soon = {method} <bound method Consumer.call_soon of <Consumer: celery@ demini (running)>>
callbacks = {set: 0} set()
message = {Message} <Message object at 0x7fcbaef3eaf8 with details {'state': 'RECEIVED', 'content_type': 'application/json', 'delivery_tag': 'cfa3a261-c9b4-4d7e-819c-37608c0bb0cc', 'body_length': 82, 'properties': {'correlation_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'}, 'delivery_info': {'exchange': '', 'routing_key': 'celery'}}>
on_invalid_task = {method} <bound method Consumer.on_invalid_task of <Consumer: celery@ demini (running)>>
on_unknown_message = {method} <bound method Consumer.on_unknown_message of <Consumer: celery@ demini (running)>>
on_unknown_task = {method} <bound method Consumer.on_unknown_task of <Consumer: celery@ demini (running)>>
self = {Consumer} <Consumer: celery@ demini (running)>
strategies = {dict: 10} {'celery.chunks': <function default.<locals>.task_message_handler at 0x7fcbaef230d0>, 'celery.backend_cleanup': <function default.<locals>.task_message_handler at 0x7fcbaef23620>, 'celery.chord_unlock': <function default.<locals>.task_message_handler at 0x7fcbaef238c8>, 'celery.group': <function default.<locals>.task_message_handler at 0x7fcbaef23b70>, 'celery.map': <function default.<locals>.task_message_handler at 0x7fcbaef23e18>, 'celery.chain': <function default.<locals>.task_message_handler at 0x7fcbaef48158>, 'celery.starmap': <function default.<locals>.task_message_handler at 0x7fcbaef48400>, 'celery.chord': <function default.<locals>.task_message_handler at 0x7fcbaef486a8>, 'myTest.add': <function default.<locals>.task_message_handler at 0x7fcbaef48950>, 'celery.accumulate': <function default.<locals>.task_message_handler at 0x7fcbaef48bf8>}
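Stripped of the protocol-version and error branches, on_task_received is a dictionary dispatch on the task name. A minimal sketch (handler names here are illustrative, not Celery's):

```python
def make_task_handler(strategies, on_unknown_task):
    """Return a callback that routes messages by their 'task' header."""
    def on_task_received(message):
        type_ = message['headers']['task']
        try:
            strategy = strategies[type_]
        except KeyError:
            return on_unknown_task(message)
        return strategy(message)
    return on_task_received

calls = []
handler = make_task_handler(
    {'myTest.add': lambda m: calls.append(('add', m))},
    on_unknown_task=lambda m: calls.append(('unknown', m)),
)
handler({'headers': {'task': 'myTest.add'}})
handler({'headers': {'task': 'no.such.task'}})
# calls[0][0] == 'add'; calls[1][0] == 'unknown'
```

Note how the real code hoists self.strategies and the error handlers into closure locals before defining on_task_received: attribute lookups are paid once at setup time, not per message.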
The logic at this point:
+-------------+ +-------------------+ +-------------------------+
| hub | 1 | Transport | 2 |MultiChannelPoller |
| | fileno | | cycle.on_readable(fileno) | |
| cb +--------------> on_readable +-------------------------------------> _fd_to_chan[fileno] |
| | | | | |
| poll | | +-<---------------+ | chan.handlers[type]+---------------+
+-------------+ | _callbacks[queue]| | | | |
| + | | +-------------------------+ |
| | | | |
+-------------------+ | |
| | |
| | +-----------------------+ |
| | | Channel | 3 |
| | | | _brpop_read |
| | | | |
| +----------------+ connection +<------------+
| _deliver(message, queue)| |
| 5 4 | |
| callback(message) | |
+----------------------------------------------> callback(message)+---------------+
+-----------------------+ |
|
+----------------------+ |
| Consumer | |
on_m(message) | | |
+---------------------------+ on_message | <------------+
| | | _receive_callback
kombu | +----------------------+ 6
|
+-----------------------------------------------------------------------------------------------------------------------+
|
Celery |
+---------------------------+
| Consumer | |
| | |
| v |
| on_task_received |
| |
| |
+---------------------------+
0x03 Logic in Celery
From here on we are operating inside Celery.
3.1 Logical entry point --- Consumer in Celery
First we arrive at Celery's Consumer component, which is conceptually the entry point for consumption.
The Celery Consumer code is in celery/worker/consumer/consumer.py. Its job is to:
- parse the message, taking the task name from the header, e.g. 'myTest.add';
- look up the strategy for that task name;
- invoke the strategy;
The code:
def create_task_handler(self, promise=promise):
strategies = self.strategies
on_unknown_message = self.on_unknown_message
on_unknown_task = self.on_unknown_task
on_invalid_task = self.on_invalid_task
callbacks = self.on_task_message
call_soon = self.call_soon
def on_task_received(message):
# payload will only be set for v1 protocol, since v2
# will defer deserializing the message body to the pool.
payload = None
try:
type_ = message.headers['task'] # protocol v2
except TypeError:
return on_unknown_message(None, message)
except KeyError:
try:
payload = message.decode()
except Exception as exc: # pylint: disable=broad-except
return self.on_decode_error(message, exc)
try:
type_, payload = payload['task'], payload # protocol v1
except (TypeError, KeyError):
return on_unknown_message(payload, message)
try:
strategy = strategies[type_]
except KeyError as exc:
return on_unknown_task(None, message, exc)
else:
try:
strategy(
message, payload,
promise(call_soon, (message.ack_log_error,)),
promise(call_soon, (message.reject_log_error,)),
callbacks,
)
except (InvalidTaskError, ContentDisallowed) as exc:
return on_invalid_task(payload, message, exc)
except DecodeError as exc:
return self.on_decode_error(message, exc)
return on_task_received
The variables:
self.app.tasks = {TaskRegistry: 10} {'celery.chunks': <@task: celery.chunks of myTest at 0x7fcbade229e8>, 'celery.backend_cleanup': <@task: celery.backend_cleanup of myTest at 0x7fcbade229e8>, 'celery.chord_unlock': <@task: celery.chord_unlock of myTest at 0x7fcbade229e8>, 'celery.group': <@
call_soon = {method} <bound method Consumer.call_soon of <Consumer: celery@ demini (running)>>
callbacks = {set: 0} set()
message = {Message} <Message object at 0x7fcbaef3eaf8 with details {'state': 'RECEIVED', 'content_type': 'application/json', 'delivery_tag': 'cfa3a261-c9b4-4d7e-819c-37608c0bb0cc', 'body_length': 82, 'properties': {'correlation_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'}, 'delivery_info': {'exchange': '', 'routing_key': 'celery'}}>
on_invalid_task = {method} <bound method Consumer.on_invalid_task of <Consumer: celery@ demini (running)>>
on_unknown_message = {method} <bound method Consumer.on_unknown_message of <Consumer: celery@ demini (running)>>
on_unknown_task = {method} <bound method Consumer.on_unknown_task of <Consumer: celery@ demini (running)>>
self = {Consumer} <Consumer: celery@ demini (running)>
strategies = {dict: 10}
{'celery.chunks': <function default.<locals>.task_message_handler at 0x7fcbaef230d0>, 'celery.backend_cleanup': <function default.<locals>.task_message_handler at 0x7fcbaef23620>, 'celery.chord_unlock': <function default.<locals>.task_message_handler at 0x7fcbaef238c8>, 'celery.group': <function default.<locals>.task_message_handler at 0x7fcbaef23b70>, 'celery.map': <function default.<locals>.task_message_handler at 0x7fcbaef23e18>, 'celery.chain': <function default.<locals>.task_message_handler at 0x7fcbaef48158>, 'celery.starmap': <function default.<locals>.task_message_handler at 0x7fcbaef48400>, 'celery.chord': <function default.<locals>.task_message_handler at 0x7fcbaef486a8>, 'myTest.add': <function default.<locals>.task_message_handler at 0x7fcbaef48950>, 'celery.accumulate': <function default.<locals>.task_message_handler at 0x7fcbaef48bf8>}
3.1.1 Parsing the message
The following line determines which task the message targets, here 'myTest.add'.
type_ = message.headers['task']
message.headers is shown below; it gives a sense of what aspects a message definition needs to cover.
message.headers = {dict: 15}
'lang' = {str} 'py'
'task' = {str} 'myTest.add'
'id' = {str} '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'
'shadow' = {NoneType} None
'eta' = {NoneType} None
'expires' = {NoneType} None
'group' = {NoneType} None
'group_index' = {NoneType} None
'retries' = {int} 0
'timelimit' = {list: 2} [None, None]
'root_id' = {str} '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'
'parent_id' = {NoneType} None
'argsrepr' = {str} '(2, 17)'
'kwargsrepr' = {str} '{}'
'origin' = {str} 'gen19806@ demini'
__len__ = {int} 15
3.1.2 Looking up the strategy
Using the task name, here 'myTest.add', we fetch the corresponding handler function from strategies; that handler starts the processing of the task message.
strategies = {dict: 10}
'celery.chunks' = {function} <function default.<locals>.task_message_handler at 0x7fcbaef230d0>
'celery.backend_cleanup' = {function} <function default.<locals>.task_message_handler at 0x7fcbaef23620>
'celery.chord_unlock' = {function} <function default.<locals>.task_message_handler at 0x7fcbaef238c8>
'celery.group' = {function} <function default.<locals>.task_message_handler at 0x7fcbaef23b70>
'celery.map' = {function} <function default.<locals>.task_message_handler at 0x7fcbaef23e18>
'celery.chain' = {function} <function default.<locals>.task_message_handler at 0x7fcbaef48158>
'celery.starmap' = {function} <function default.<locals>.task_message_handler at 0x7fcbaef48400>
'celery.chord' = {function} <function default.<locals>.task_message_handler at 0x7fcbaef486a8>
'myTest.add' = {function} <function default.<locals>.task_message_handler at 0x7fcbaef48950>
'celery.accumulate' = {function} <function default.<locals>.task_message_handler at 0x7fcbaef48bf8>
__len__ = {int} 10
3.1.3 Invoking the strategy
Having obtained the strategy, e.g.:
<function default.<locals>.task_message_handler at 0x7fcbaef48950>
it is invoked as follows:
strategy(
message, payload,
promise(call_soon, (message.ack_log_error,)),
promise(call_soon, (message.reject_log_error,)),
callbacks,
)
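The two promise(...) arguments defer call_soon(message.ack_log_error) and call_soon(message.reject_log_error) until the strategy (or, later, the Request) decides to fire them. For illustration, functools.partial gives the same deferred-call shape (vine's promise additionally supports chaining and error handling; the function names below are stand-ins):

```python
from functools import partial

fired = []

def call_soon(fn):
    # In Celery this schedules fn on the event loop; here we run it directly.
    fired.append(fn())

def ack_log_error():
    return 'acked'

def reject_log_error():
    return 'rejected'

# Built eagerly, like promise(call_soon, (message.ack_log_error,)) ...
on_ack = partial(call_soon, ack_log_error)
on_reject = partial(call_soon, reject_log_error)

on_ack()  # ... but only invoked once the task outcome is known
# fired == ['acked']
```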
3.2 Strategy --- strategy
Strategy acts as a middle layer between Consumer and Worker, applying different handling under different conditions, which is exactly what "strategy" means.
3.2.1 Logic in strategy
The code is in celery/worker/strategy.py.
Concretely, it:
- parses the message further;
- builds an internal Request (Req) from the message;
- sends a 'task-received' event if event sending is enabled;
- handles the eta (scheduled execution time), if any;
- applies qos and rate-limit handling;
- invokes the Request, i.e. hands off to the Worker;
The details:
def task_message_handler(message, body, ack, reject, callbacks,
to_timestamp=to_timestamp):
if body is None and 'args' not in message.payload:
body, headers, decoded, utc = (
message.body, message.headers, False, app.uses_utc_timezone(),
)
else:
if 'args' in message.payload:
body, headers, decoded, utc = hybrid_to_proto2(message,
message.payload)
else:
body, headers, decoded, utc = proto1_to_proto2(message, body)
req = Req(
message,
on_ack=ack, on_reject=reject, app=app, hostname=hostname,
eventer=eventer, task=task, connection_errors=connection_errors,
body=body, headers=headers, decoded=decoded, utc=utc,
)
if (req.expires or req.id in revoked_tasks) and req.revoked():
return
signals.task_received.send(sender=consumer, request=req)
if task_sends_events:
send_event(
'task-received',
uuid=req.id, name=req.name,
args=req.argsrepr, kwargs=req.kwargsrepr,
root_id=req.root_id, parent_id=req.parent_id,
retries=req.request_dict.get('retries', 0),
eta=req.eta and req.eta.isoformat(),
expires=req.expires and req.expires.isoformat(),
)
bucket = None
eta = None
if req.eta:
try:
if req.utc:
eta = to_timestamp(to_system_tz(req.eta))
else:
eta = to_timestamp(req.eta, app.timezone)
except (OverflowError, ValueError) as exc:
error("Couldn't convert ETA %r to timestamp: %r. Task: %r",
req.eta, exc, req.info(safe=True), exc_info=True)
req.reject(requeue=False)
if rate_limits_enabled:
bucket = get_bucket(task.name)
if eta and bucket:
consumer.qos.increment_eventually()
return call_at(eta, limit_post_eta, (req, bucket, 1),
priority=6)
if eta:
consumer.qos.increment_eventually()
call_at(eta, apply_eta_task, (req,), priority=6)
return task_message_handler
if bucket:
return limit_task(req, bucket, 1)
task_reserved(req)
if callbacks:
[callback(req) for callback in callbacks]
handle(req)  # the hand-off happens here
return task_message_handler
A few details are worth a closer look.
3.2.2 Building the Request
In the Strategy, the following builds a Request from the task instance, tying together the broker message, the consumer, and multiprocessing.
Specifically, Request.execute_using_pool is where the multiprocessing machinery comes in, e.g. the connection to the consumer's process pool.
Req = create_request_cls(Request, task, consumer.pool, hostname, eventer)
The task instance is:
myTest.add[863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f]
The code that obtains the Request class:
def create_request_cls(base, task, pool, hostname, eventer,
ref=ref, revoked_tasks=revoked_tasks,
task_ready=task_ready, trace=trace_task_ret):
default_time_limit = task.time_limit
default_soft_time_limit = task.soft_time_limit
apply_async = pool.apply_async
acks_late = task.acks_late
events = eventer and eventer.enabled
class Request(base):
def execute_using_pool(self, pool, **kwargs):
task_id = self.task_id
if (self.expires or task_id in revoked_tasks) and self.revoked():
raise TaskRevokedError(task_id)
time_limit, soft_time_limit = self.time_limits
result = apply_async(
trace,
args=(self.type, task_id, self.request_dict, self.body,
self.content_type, self.content_encoding),
accept_callback=self.on_accepted,
timeout_callback=self.on_timeout,
callback=self.on_success,
error_callback=self.on_failure,
soft_timeout=soft_time_limit or default_soft_time_limit,
timeout=time_limit or default_time_limit,
correlation_id=task_id,
)
# cannot create weakref to None
# pylint: disable=attribute-defined-outside-init
self._apply_result = maybe(ref, result)
return result
def on_success(self, failed__retval__runtime, **kwargs):
failed, retval, runtime = failed__retval__runtime
if failed:
if isinstance(retval.exception, (
SystemExit, KeyboardInterrupt)):
raise retval.exception
return self.on_failure(retval, return_ok=True)
task_ready(self)
if acks_late:
self.acknowledge()
if events:
self.send_event(
'task-succeeded', result=retval, runtime=runtime,
)
return Request
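The factory pattern in create_request_cls — build a Request subclass inside a closure so that per-task defaults (time limits, the pool's apply_async) are captured once as locals instead of being looked up on every call — can be sketched like this (all names are illustrative):

```python
class BaseRequest:
    def __init__(self, task_id):
        self.task_id = task_id
        self.time_limits = (None, None)

def create_request_cls(base, pool_apply_async, default_time_limit):
    # Captured once here, then read as fast closure variables per request.
    apply_async = pool_apply_async

    class Request(base):
        def execute_using_pool(self):
            time_limit, _soft = self.time_limits
            return apply_async(self.task_id,
                               timeout=time_limit or default_time_limit)
    return Request

submitted = []
Req = create_request_cls(
    BaseRequest,
    pool_apply_async=lambda tid, timeout: submitted.append((tid, timeout)),
    default_time_limit=30,
)
Req('863cf9b2').execute_using_pool()
# submitted == [('863cf9b2', 30)]
```

One Request class is specialized per task type at consumer startup, which is why the strategy can hand every incoming message of that type to the pool with no per-message attribute lookups.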
The logic at this point:
+
Consumer |
message |
v strategy +------------------------------------+
+------------+------+ | strategies |
| on_task_received | <--------+ | |
| | |[myTest.add : task_message_handler] |
+------------+------+ +------------------------------------+
|
|
+---------------------------------------------------------------------------------------+
|
strategy |
|
v Request [myTest.add]
+------------+-------------+ +---------------------+
| task_message_handler | <-------------------+ | create_request_cls |
| | | |
+--------------------------+ +---------------------+
3.2.3 Invoking the Request
task_message_handler finally calls handle(req), which starts executing the Request.
The handle function actually corresponds to WorkController._process_task_sem.
The code:
def task_message_handler(message, body, ack, reject, callbacks,
to_timestamp=to_timestamp):
req = Req(
message,
on_ack=ack, on_reject=reject, app=app, hostname=hostname,
eventer=eventer, task=task, connection_errors=connection_errors,
body=body, headers=headers, decoded=decoded, utc=utc,
)
task_reserved(req)
if callbacks:
[callback(req) for callback in callbacks]
handle(req)
return task_message_handler
The Request is:
req = {Request} myTest.add[863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f]
acknowledged = {bool} False
app = {Celery} <Celery myTest at 0x7fcbade229e8>
args = {list: 2} [2, 17]
argsrepr = {str} '(2, 17)'
body = {bytes: 82} b'[[2, 17], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]'
chord = {NoneType} None
connection_errors = {tuple: 8} (<class 'amqp.exceptions.ConnectionError'>, <class 'kombu.exceptions.InconsistencyError'>, <class 'OSError'>, <class 'OSError'>, <class 'OSError'>, <class 'redis.exceptions.ConnectionError'>, <class 'redis.exceptions.AuthenticationError'>, <class 'redis.exceptions.TimeoutError'>)
content_encoding = {str} 'utf-8'
content_type = {str} 'application/json'
correlation_id = {str} '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'
delivery_info = {dict: 4} {'exchange': '', 'routing_key': 'celery', 'priority': 0, 'redelivered': None}
errbacks = {NoneType} None
eta = {NoneType} None
eventer = {EventDispatcher} <celery.events.dispatcher.EventDispatcher object at 0x7fcbaeef31d0>
expires = {NoneType} None
group = {NoneType} None
group_index = {NoneType} None
hostname = {str} 'celery@ demini'
id = {str} '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'
kwargs = {dict: 0} {}
kwargsrepr = {str} '{}'
message = {Message} <Message object at 0x7fcbaef3eaf8 with details {'state': 'RECEIVED', 'content_type': 'application/json', 'delivery_tag': 'cfa3a261-c9b4-4d7e-819c-37608c0bb0cc', 'body_length': 82, 'properties': {'correlation_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'}, 'delivery_info': {'exchange': '', 'routing_key': 'celery'}}>
name = {str} 'myTest.add'
on_ack = {promise} <promise@0x7fcbaeecc210 --> <bound method Consumer.call_soon of <Consumer: celery@ demini (running)>>>
on_reject = {promise} <promise@0x7fcbaeeccf20 --> <bound method Consumer.call_soon of <Consumer: celery@ demini (running)>>>
parent_id = {NoneType} None
reply_to = {str} 'ef1b446d-e3a9-3345-b027-b7bd8a93aa93'
request_dict = {dict: 25} {'lang': 'py', 'task': 'myTest.add', 'id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'parent_id': None, 'argsrepr': '(2, 17)', 'kwargsrepr': '{}', 'origin': 'gen19806@ demini', 'reply_to': 'ef1b446d-e3a9-3345-b027-b7bd8a93aa93', 'correlation_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'hostname': 'celery@ demini', 'delivery_info': {'exchange': '', 'routing_key': 'celery', 'priority': 0, 'redelivered': None}, 'args': [2, 17], 'kwargs': {}, 'callbacks': None, 'errbacks': None, 'chain': None, 'chord': None}
root_id = {str} '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'
store_errors = {bool} True
task = {add} <@task: myTest.add of myTest at 0x7fcbade229e8>
task_id = {str} '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f'
task_name = {str} 'myTest.add'
time_limits = {list: 2} [None, None]
time_start = {NoneType} None
type = {str} 'myTest.add'
tzlocal = {NoneType} None
utc = {bool} True
worker_pid = {NoneType} None
The handle is:
handle = {method} <bound method WorkController._process_task_sem of <Worker: celery@ demini (running)>>
headers = {dict: 25} {'lang': 'py', 'task': 'myTest.add', 'id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'parent_id': None, 'argsrepr': '(2, 17)', 'kwargsrepr': '{}', 'origin': 'gen19806@ demini', 'reply_to': 'ef1b446d-e3a9-3345-b027-b7bd8a93aa93', 'correlation_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'hostname': 'celery@ demini', 'delivery_info': {'exchange': '', 'routing_key': 'celery', 'priority': 0, 'redelivered': None}, 'args': [2, 17], 'kwargs': {}, 'callbacks': None, 'errbacks': None, 'chain': None, 'chord': None}
The logic at this point is as follows:
+
Consumer |
message |
v strategy +------------------------------------+
+------------+------+ | strategies |
| on_task_received | <--------+ | |
| | |[myTest.add : task_message_handler] |
+------------+------+ +------------------------------------+
|
|
+------------------------------------------------------------------------------------+
strategy |
|
|
v Request [myTest.add]
+------------+-------------+ +---------------------+
| task_message_handler | <-------------------+ | create_request_cls |
| | | |
+------------+-------------+ +---------------------+
| _process_task_sem
|
+--------------------------------------------------------------------------------------+
Worker | req[{Request} myTest.add]
v
+--------+-------+
| WorkController |
+----------------+
3.3 Worker in Celery
Execution now arrives at the Worker in Celery. The Worker is where the task actually gets executed.
The code is in celery/worker/worker.py.
As we can see:
- _process_task_sem calls _process_task;
- _process_task calls req.execute_using_pool(self.pool);
The details are as follows:
class WorkController:
    """Unmanaged worker instance."""

    def register_with_event_loop(self, hub):
        self.blueprint.send_all(
            self, 'register_with_event_loop', args=(hub,),
            description='hub.register',
        )

    def _process_task_sem(self, req):
        return self._quick_acquire(self._process_task, req)

    def _process_task(self, req):
        """Process task by sending it to the pool of workers."""
        try:
            req.execute_using_pool(self.pool)
        except TaskRevokedError:
            try:
                self._quick_release()  # Issue 877
            except AttributeError:
                pass
The variables are:
req = {Request} myTest.add[863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f]
self = {Worker} celery
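The `_quick_acquire` above is a semaphore-style gate: if the pool has a free slot, the callback runs immediately; otherwise it is queued until `release()` frees a slot. Below is a minimal, hypothetical sketch of this pattern (a heavily simplified stand-in for kombu's LaxBoundedSemaphore, not Celery's actual implementation):

```python
import threading

class LaxBoundedSemaphoreSketch:
    """Sketch: acquire(cb, *args) calls cb at once when a slot is
    free, otherwise queues it; release() hands the slot to the
    oldest waiter (all names here are illustrative)."""
    def __init__(self, value):
        self._value = value
        self._waiting = []
        self._lock = threading.Lock()

    def acquire(self, callback, *args):
        with self._lock:
            if self._value > 0:
                self._value -= 1
                callback(*args)   # slot free: run immediately
                return True
            self._waiting.append((callback, args))
            return False

    def release(self):
        with self._lock:
            if self._waiting:
                # transfer the slot directly to the next waiter
                callback, args = self._waiting.pop(0)
                callback(*args)
            else:
                self._value += 1

processed = []
sem = LaxBoundedSemaphoreSketch(1)
sem.acquire(processed.append, 'task-1')  # runs immediately
sem.acquire(processed.append, 'task-2')  # queued: no free slot
sem.release()                            # slot freed: task-2 runs
print(processed)  # ['task-1', 'task-2']
```

This is why `_process_task_sem` exists: it throttles how many requests are handed to the pool at once, queueing the rest.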
3.3.1 Request in Celery
Execution now reaches Request in Celery. The code is in celery/worker/request.py.
Because the generated request class binds:
apply_async = pool.apply_async
the call ends up at pool.apply_async.
The variables are:
apply_async = {method} <bound method BasePool.apply_async of <celery.concurrency.prefork.TaskPool object at 0x7fcbaddfa2e8>>
pool = {TaskPool} <celery.concurrency.prefork.TaskPool object at 0x7fcbaddfa2e8>
revoked_tasks = {LimitedSet: 0} <LimitedSet(0): maxlen=50000, expires=10800, minlen=0>
self = {Request} myTest.add[863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f]
The code is:
class Request(base):
    def execute_using_pool(self, pool, **kwargs):
        task_id = self.task_id  # get the task id
        if (self.expires or task_id in revoked_tasks) and self.revoked():
            # check whether the task has expired or been revoked
            raise TaskRevokedError(task_id)
        time_limit, soft_time_limit = self.time_limits  # get the time limits
        result = apply_async(  # run the corresponding func in the pool
            trace,
            args=(self.type, task_id, self.request_dict, self.body,
                  self.content_type, self.content_encoding),
            accept_callback=self.on_accepted,
            timeout_callback=self.on_timeout,
            callback=self.on_success,
            error_callback=self.on_failure,
            soft_timeout=soft_time_limit or default_soft_time_limit,
            timeout=time_limit or default_time_limit,
            correlation_id=task_id,
        )
        # cannot create weakref to None
        # pylint: disable=attribute-defined-outside-init
        self._apply_result = maybe(ref, result)
        return result
The logic at this point is:
+
Consumer |
message |
v strategy +------------------------------------+
+------------+------+ | strategies |
| on_task_received | <--------+ | |
| | |[myTest.add : task_message_handler] |
+------------+------+ +------------------------------------+
|
|
+------------------------------------------------------------------------------------+
strategy |
|
|
v Request [myTest.add]
+------------+-------------+ +---------------------+
| task_message_handler | <-------------------+ | create_request_cls |
| | | |
+------------+-------------+ +---------------------+
| _process_task_sem
|
+--------------------------------------------------------------------------------------+
Worker | req[{Request} myTest.add]
v
+--------+-----------+
| WorkController |
| |
| pool +-------------------------+
+--------+-----------+ |
| |
| apply_async v
+-----------+----------+ +---+-------+
|{Request} myTest.add | +---------------> | TaskPool |
+----------------------+ +-----------+
myTest.add
3.3.2 BasePool in Celery
With apply_async, execution reaches Celery's Pool. Note that this is not yet the concrete multiprocessing implementation, only the entry point to it.
At this point the task information is passed on to the Pool, for example:
args = {tuple: 6}
0 = {str} 'myTest.add'
1 = {str} 'af6ed084-efc6-4608-a13a-d3065f457cd5'
2 = {dict: 21} {'lang': 'py', 'task': 'myTest.add', 'id': 'af6ed084-efc6-4608-a13a-d3065f457cd5', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': 'af6ed084-efc6-4608-a13a-d3065f457cd5', 'parent_id': None, 'argsrepr': '(2, 8)', 'kwargsrepr': '{}', 'origin': 'gen1100@DESKTOP-0GO3RPO', 'reply_to': 'afb85541-d08c-3191-b89d-918e15f9e0bf', 'correlation_id': 'af6ed084-efc6-4608-a13a-d3065f457cd5', 'hostname': 'celery@DESKTOP-0GO3RPO', 'delivery_info': {'exchange': '', 'routing_key': 'celery', 'priority': 0, 'redelivered': None}, 'args': [2, 8], 'kwargs': {}}
3 = {bytes: 81} b'[[2, 8], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]'
4 = {str} 'application/json'
5 = {str} 'utf-8'
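Index 3 of this tuple is the serialized task body, which under the default (v2) message protocol decodes into a triple of (args, kwargs, embed). A quick sketch of decoding the JSON body shown above:

```python
import json

# The body at index 3 of the args tuple: a JSON array of
# [args, kwargs, embed] (embed holds callbacks/errbacks/chain/chord).
body = b'[[2, 8], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]'
args, kwargs, embed = json.loads(body)
print(args)            # [2, 8]
print(kwargs)          # {}
print(embed['chord'])  # None
```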
The file is celery/concurrency/base.py:
class BasePool:
    """Task pool."""

    def apply_async(self, target, args=None, kwargs=None, **options):
        """Equivalent of the :func:`apply` built-in function.

        Callbacks should optimally return as soon as possible since
        otherwise the thread which handles the result will get blocked.
        """
        kwargs = {} if not kwargs else kwargs
        args = [] if not args else args
        return self.on_apply(target, args, kwargs,
                             waitforslot=self.putlocks,
                             callbacks_propagate=self.callbacks_propagate,
                             **options)
The variables at this point are:
options = {dict: 7} {'accept_callback': <bound method Request.on_accepted of <Request: myTest.add[863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f] (2, 17) {}>>, 'timeout_callback': <bound method Request.on_timeout of <Request: myTest.add[863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f] (2, 17) {}>
self = {TaskPool} <celery.concurrency.prefork.TaskPool object at 0x7fcbaddfa2e8>
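The pattern here is a template method: `apply_async` only normalizes the arguments and delegates to `on_apply`, which each concrete pool (prefork, eventlet, solo, ...) overrides. A minimal sketch under assumed class names (the inline behavior mimics the solo pool, which simply runs the target in-process):

```python
class BasePoolSketch:
    """Sketch of the delegation pattern in celery/concurrency/base.py:
    apply_async normalizes args and forwards to on_apply."""
    def apply_async(self, target, args=None, kwargs=None, **options):
        kwargs = {} if not kwargs else kwargs
        args = [] if not args else args
        return self.on_apply(target, args, kwargs, **options)

    def on_apply(self, target, args, kwargs, **options):
        raise NotImplementedError  # each concrete pool overrides this

class SoloPoolSketch(BasePoolSketch):
    # a "solo"-style pool: run the target inline, no subprocess
    def on_apply(self, target, args, kwargs, **options):
        return target(*args, **kwargs)

result = SoloPoolSketch().apply_async(lambda a, b: a + b, (2, 17))
print(result)  # 19
```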
3.3.3 AsynPool in Celery
The apply_async code here is in billiard/pool.py (AsynPool inherits it from billiard.pool.Pool).
In __init__, self._initargs = initargs is (<Celery myTest at 0x2663db3fe48>, 'celery@DESKTOP-0GO3RPO'), which is how the Celery application gets passed in.
Depending on the operating system, either self._taskqueue.put or self._quick_put is called to send the task message to the multiprocessing pool.
def apply_async(self, func, args=(), kwds={},
                callback=None, error_callback=None, accept_callback=None,
                timeout_callback=None, waitforslot=None,
                soft_timeout=None, timeout=None, lost_worker_timeout=None,
                callbacks_propagate=(),
                correlation_id=None):
    '''Asynchronous equivalent of the `apply()` method.

    Callback is called when the function's return value is ready.
    The accept callback is called when the job is accepted to be executed.

    Simplified, the flow is like this:

        >>> def apply_async(func, args, kwds, callback, accept_callback):
        ...     if accept_callback:
        ...         accept_callback()
        ...     retval = func(*args, **kwds)
        ...     if callback:
        ...         callback(retval)
    '''
    if self._state == RUN:
        waitforslot = self.putlocks if waitforslot is None else waitforslot
        if waitforslot and self._putlock is not None:
            self._putlock.acquire()
        result = ApplyResult(
            self._cache, callback, accept_callback, timeout_callback,
            error_callback, soft_timeout, timeout, lost_worker_timeout,
            on_timeout_set=self.on_timeout_set,
            on_timeout_cancel=self.on_timeout_cancel,
            callbacks_propagate=callbacks_propagate,
            send_ack=self.send_ack if self.synack else None,
            correlation_id=correlation_id,
        )
        if timeout or soft_timeout:
            # start the timeout handler thread when required.
            self._start_timeout_handler()
        if self.threads:
            self._taskqueue.put(([(TASK, (result._job, None,
                                          func, args, kwds))], None))
        else:
            self._quick_put((TASK, (result._job, None, func, args, kwds)))
        return result
The variables are:
accept_callback = {method} <bound method Request.on_accepted of <Request: myTest.add[863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f] (2, 17) {}>>
args = {tuple: 6} ('myTest.add', '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', {'lang': 'py', 'task': 'myTest.add', 'id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'parent_id': None, 'argsrepr': '(2, 17)', 'kwargsrepr': '{}', 'origin': 'gen19806@ demini', 'reply_to': 'ef1b446d-e3a9-3345-b027-b7bd8a93aa93', 'correlation_id': '863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f', 'hostname': 'celery@ demini', 'delivery_info': {'exchange': '', 'routing_key': 'celery', 'priority': 0, 'redelivered': None}, 'args': [2, 17], 'kwargs': {}, 'callbacks': None, 'errbacks': None, 'chain': None, 'chord': None}, b'[[2, 17], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]', 'application/json', 'utf-8')
callback = {method} <bound method create_request_cls.<locals>.Request.on_success of <Request: myTest.add[863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f] (2, 17) {}>>
error_callback = {method} <bound method Request.on_failure of <Request: myTest.add[863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f] (2, 17) {}>>
self = {AsynPool} <celery.concurrency.asynpool.AsynPool object at 0x7fcbaee2ea20>
timeout_callback = {method} <bound method Request.on_timeout of <Request: myTest.add[863cf9b2-8440-4ea2-8ac4-06b3dcd2fd1f] (2, 17) {}>>
waitforslot = {bool} False
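The simplified flow shown in the docstring of apply_async can actually be run. The sketch below executes the target inline instead of in a child process, purely to illustrate when each callback fires (all names here are illustrative, not the real pool API):

```python
def apply_async_sketch(func, args, kwds, callback=None, accept_callback=None):
    """Sketch of the simplified flow: accept_callback fires when the
    job is accepted, callback when the return value is ready. The
    real pool does func(*args) in a child process; we run inline."""
    if accept_callback:
        accept_callback()
    retval = func(*args, **kwds)
    if callback:
        callback(retval)
    return retval

events = []
result = apply_async_sketch(
    lambda a, b: a + b, (2, 17), {},
    callback=lambda r: events.append(('done', r)),
    accept_callback=lambda: events.append('accepted'),
)
print(result)  # 19
print(events)  # ['accepted', ('done', 19)]
```

This matches the ordering seen in the variable dump above: `on_accepted` is wired as the accept callback and `on_success` as the result callback.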
3.3.3.1 Presetting some variables
First, let us look at how some of these variables were set up earlier.
For example, the following code contains:
inq, outq, synq = self.get_process_queues() and self._process_register_queues(w, (inq, outq, synq))
which set up the pipes between the parent process and the child processes.
def _create_worker_process(self, i):
    sentinel = self._ctx.Event() if self.allow_restart else None
    inq, outq, synq = self.get_process_queues()
    on_ready_counter = self._ctx.Value('i')
    w = self.WorkerProcess(self.Worker(
        inq, outq, synq, self._initializer, self._initargs,
        self._maxtasksperchild, sentinel, self._on_process_exit,
        # Need to handle all signals if using the ipc semaphore,
        # to make sure the semaphore is released.
        sigprotection=self.threads,
        wrap_exception=self._wrap_exception,
        max_memory_per_child=self._max_memory_per_child,
        on_ready_counter=on_ready_counter,
    ))
    self._pool.append(w)
    self._process_register_queues(w, (inq, outq, synq))
    w.name = w.name.replace('Process', 'PoolWorker')
    w.daemon = True
    w.index = i
    w.start()
    self._poolctrl[w.pid] = sentinel
    self._on_ready_counters[w.pid] = on_ready_counter
    if self.on_process_up:
        self.on_process_up(w)
    return w
For example, here is how the pipes are created.
def _setup_queues(self):
    self._inqueue = self._ctx.SimpleQueue()
    self._outqueue = self._ctx.SimpleQueue()
    self._quick_put = self._inqueue._writer.send
    self._quick_get = self._outqueue._reader.recv
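The `_quick_put = self._inqueue._writer.send` trick binds the underlying pipe's `send` method directly, skipping the queue wrapper's per-call overhead. A sketch of the same idea with a raw `multiprocessing.Pipe` (illustrative only; the real code reaches into SimpleQueue internals):

```python
import multiprocessing as mp

# A SimpleQueue is backed by a Pipe; binding the writer's send and
# the reader's recv gives "quick" put/get functions with one less
# indirection per message.
reader, writer = mp.Pipe(duplex=False)
quick_put = writer.send
quick_get = reader.recv

quick_put(('TASK', (1, None, 'myTest.add', (2, 17), {})))
job = quick_get()
print(job[0])  # 'TASK'
```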
And here is how the pipe-related write handlers are established.
def _create_write_handlers(self, hub,
                           pack=pack, dumps=_pickle.dumps,
                           protocol=HIGHEST_PROTOCOL):
    """Create handlers used to write data to child processes."""
    fileno_to_inq = self._fileno_to_inq
    fileno_to_synq = self._fileno_to_synq
    outbound = self.outbound_buffer
    pop_message = outbound.popleft
    put_message = outbound.append
So the final preset variables are as follows:
self._taskqueue = {Queue} <queue.Queue object at 0x7fcbaee57b00>
self._quick_put = {function} <function AsynPool._create_write_handlers.<locals>.send_job at 0x7fcbaef569d8>
self._outqueue = {NoneType} None
self._inqueue = {NoneType} None
self._fileno_to_synq = {dict: 1} {None: <ForkProcess(ForkPoolWorker-4, started daemon)>}
self._quick_get = {NoneType} None
self._fileno_to_inq = {dict: 0} {}
self.outbound_buffer = {deque: 1} deque([<%s: 0 ack:False ready:False>])
self = {Pool} <billiard.pool.Pool object at 0x000002663FD6E948>
ResultHandler = {type} <class 'billiard.pool.ResultHandler'>
SoftTimeLimitExceeded = {type} <class 'billiard.exceptions.SoftTimeLimitExceeded'>
Supervisor = {type} <class 'billiard.pool.Supervisor'>
TaskHandler = {type} <class 'billiard.pool.TaskHandler'>
TimeoutHandler = {type} <class 'billiard.pool.TimeoutHandler'>
Worker = {type} <class 'billiard.pool.Worker'>
3.3.3.2 Sending to the child process
On Windows (the self.threads branch) it is:
if self.threads: self._taskqueue.put(([(TASK, (result._job, None, func, args, kwds))], None))
On *nix it is the branch below: a job is created and then sent to the child process pipe via put_message(job).
def send_job(tup):
    # Schedule writing job request for when one of the process
    # inqueues are writable.
    body = dumps(tup, protocol=protocol)
    body_size = len(body)
    header = pack('>I', body_size)
    # index 1,0 is the job ID.
    job = get_job(tup[1][0])
    job._payload = buf_t(header), buf_t(body), body_size
    put_message(job)
self._quick_put = send_job
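`send_job` frames each job as a 4-byte big-endian length header followed by the pickled body, so the child process knows exactly how many bytes to read from the pipe. A standalone sketch of this framing (illustrative; `get_job`, `put_message`, and the actual hub-driven write are omitted):

```python
import pickle
from struct import pack, unpack

# Frame a job tuple the way send_job does: '>I' length header + body.
tup = ('TASK', (0, None, 'myTest.add', (2, 17), {}))
body = pickle.dumps(tup, protocol=pickle.HIGHEST_PROTOCOL)
header = pack('>I', len(body))
payload = header + body

# The reading side consumes the 4-byte header first, then the body.
(body_size,) = unpack('>I', payload[:4])
decoded = pickle.loads(payload[4:4 + body_size])
print(decoded[0])  # 'TASK'
```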
The logic now is:
+
Consumer |
message |
v strategy +------------------------------------+
+------------+------+ | strategies |
| on_task_received | <--------+ | |
| | |[myTest.add : task_message_handler] |
+------------+------+ +------------------------------------+
|
|
+------------------------------------------------------------------------------------+
strategy |
|
|
v Request [myTest.add]
+------------+-------------+ +---------------------+
| task_message_handler | <-------------------+ | create_request_cls |
| | | |
+------------+-------------+ +---------------------+
| _process_task_sem
|
+------------------------------------------------------------------------------------+
Worker | req[{Request} myTest.add]
v
+--------+-----------+
| WorkController |
| |
| pool +-------------------------+
+--------+-----------+ |
| |
| apply_async v
+-----------+----------+ +---+-------------------+
|{Request} myTest.add | +---------------> | TaskPool |
+----------------------+ +----+------------------+
myTest.add |
|
+--------------------------------------------------------------------------------------+
|
v
+----+------------------+
| billiard.pool.Pool |
+-------+---------------+
|
|
Pool +---------------------------+ |
| TaskHandler | |
| | | self._taskqueue.put
| _taskqueue | <---------------+
| |
+------------+--------------+
|
| put(task)
|
+--------------------------------------------------------------------------------------+
|
Sub process |
v
Starting from the next article, we will formally dive into how the multiprocessing pool handles these messages.
0xEE Articles in this series
The articles in this series so far:
[Source code analysis] Parallel distributed framework Celery: architecture (1)
[Source code analysis] Parallel distributed framework Celery: architecture (2)
[Source code analysis] Parallel distributed framework Celery: worker startup (1)
[Source code analysis] Parallel distributed framework Celery: worker startup (2)
[Source code analysis] Distributed task queue Celery: starting the Consumer
[Source code analysis] Parallel distributed task queue Celery: what is a Task