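Python Enhanced Generator - Coroutine

Published on 2017-07-25

This article covers Python's enhanced generators, i.e. coroutines: basic syntax, use cases, caveats, and how they compare with coroutine implementations in other languages.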

enhanced generator

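The previous article covered the use cases and caveats of yield and generators, but it used only the generator's next method. Generators can in fact do much more. PEP 342 added a set of methods that make a generator behave more like a coroutine. The main change is that the original yield could only return a value (the generator was purely a producer of data), whereas the new send method lets a generator consume a value when it resumes. In addition, the caller (the code that drives the generator) can use throw to raise an exception at the point where the generator is suspended.

First, the enhanced form of yield. Its syntax is: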

back_data = yield cur_ret

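This statement means: when execution reaches it, cur_ret is returned to the caller; when the generator is later resumed by next() or send(some_data), the sent value some_data is assigned to back_data (None if it was resumed by next()). For example: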

def gen(data):
    print 'before yield', data
    back_data = yield data
    print 'after resume', back_data
    
if __name__ == '__main__':
    g = gen(1)
    print g.next()
    try:
        g.send(0)
    except StopIteration:
        pass

Output:
before yield 1
1
after resume 0

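Two points to note:

(1) next() is equivalent to send(None).
(2) The first call must be next() or send(None). Sending a non-None value at that point raises TypeError, because the generator has not yet reached a yield expression that could receive the value.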
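
The two points above can be checked with a short sketch. It is written in Python 3 syntax (next(g) instead of g.next()), which is an assumption on top of the article's 2.7 listings, and the generator name echo is made up for illustration.

```python
def echo():
    """Yield back whatever value was last sent in."""
    received = None
    while True:
        received = yield received


g = echo()

# A just-created generator has not reached a yield yet, so a non-None
# value has nowhere to go and Python raises TypeError.
try:
    g.send("too early")
    first_send_error = None
except TypeError as exc:
    first_send_error = exc

# Priming with next(g), equivalent to g.send(None), runs to the first yield.
primed_value = next(g)       # the first yield hands back None
echoed = g.send("hello")     # resumes the generator, which yields "hello" back
after_none = g.send(None)    # same effect as next(g)
```

The failed send leaves the generator unstarted, so it can still be primed and used afterwards.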

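Use cases

Once a generator can accept data when it resumes from suspension, rather than only return data, it gains the ability to consume data (a push model). The following example is borrowed from another source: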

word_map = {}
def consume_data_from_file(file_name, consumer):
    for line in file(file_name):
        consumer.send(line)

def consume_words(consumer):
    while True:
        line = yield
        for word in (w for w in line.split() if w.strip()):
            consumer.send(word)

def count_words_consumer():
    while True:
        word  = yield
        if word not in word_map:
            word_map[word] = 0
        word_map[word] += 1
    print word_map  # never reached: the while loop above has no exit

if __name__ == '__main__':
    cons = count_words_consumer()
    cons.next()
    cons_inner = consume_words(cons)
    cons_inner.next()
    c = consume_data_from_file('test.txt', cons_inner)
    print word_map

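In the code above, the real consumer of the data is count_words_consumer and the original producer is consume_data_from_file; data is pushed from the producer towards the consumer. Note that lines 22 and 24 each call next() to prime a generator (cons.next() and cons_inner.next()); this step can be wrapped in a decorator.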

def consumer(func):
    def wrapper(*args,**kw):
        gen = func(*args, **kw)
        gen.next()
        return gen
    wrapper.__name__ = func.__name__
    wrapper.__dict__ = func.__dict__
    wrapper.__doc__  = func.__doc__
    return wrapper

The revised code:

def consumer(func):
    def wrapper(*args,**kw):
        gen = func(*args, **kw)
        gen.next()
        return gen
    wrapper.__name__ = func.__name__
    wrapper.__dict__ = func.__dict__
    wrapper.__doc__  = func.__doc__
    return wrapper

word_map = {}
def consume_data_from_file(file_name, consumer):
    for line in file(file_name):
        consumer.send(line)

@consumer
def consume_words(consumer):
    while True:
        line = yield
        for word in (w for w in line.split() if w.strip()):
            consumer.send(word)

@consumer
def count_words_consumer():
    while True:
        word  = yield
        if word not in word_map:
            word_map[word] = 0
        word_map[word] += 1
    print word_map  # never reached: the while loop above has no exit

if __name__ == '__main__':
    cons = count_words_consumer()
    cons_inner = consume_words(cons)
    c = consume_data_from_file('test.txt', cons_inner)
    print word_map
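
For comparison, here is a sketch of the same pipeline in Python 3 syntax. It is a variation, not the original author's code: functools.wraps replaces the three manual attribute assignments, the lines come from an in-memory list instead of test.txt, and the names split_words, count_words and feed_lines are made up for illustration.

```python
import functools


def consumer(func):
    """Prime a generator-based consumer so it is ready for send()."""
    @functools.wraps(func)  # copies __name__, __doc__, __dict__ and more
    def wrapper(*args, **kw):
        gen = func(*args, **kw)
        next(gen)  # advance to the first yield
        return gen
    return wrapper


@consumer
def count_words(word_map):
    """Count every word sent in, storing the totals in word_map."""
    while True:
        word = yield
        word_map[word] = word_map.get(word, 0) + 1


@consumer
def split_words(target):
    """Split every line sent in and push each word to target."""
    while True:
        line = yield
        for word in line.split():
            target.send(word)


def feed_lines(lines, target):
    """Producer: push each line into the pipeline."""
    for line in lines:
        target.send(line)


word_map = {}
feed_lines(["a b a", "b c"], split_words(count_words(word_map)))
print(word_map)
```

Passing word_map in as an argument instead of using a module-level global keeps each counter independent, which is a design choice of this sketch.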

generator throw

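Besides next and send, generators provide two more useful methods, throw and close, which give the caller more control over the generator. send passes a value into the generator, throw raises an exception at the point where the generator is suspended, and close ends the generator normally (after which it can no longer be resumed with next or send). The throw method is described in detail below.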

throw(type[, value[, traceback]])

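throw raises an exception of the given type at the yield where the generator is suspended, and returns the next value the generator yields. If the generator does not catch the exception, it propagates to the caller. If the generator catches it but finishes without yielding another value, StopIteration is raised to the caller: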

@consumer
def gen_throw():
    value = yield 
    try:
        yield value
    except Exception as e:
        yield str(e) # without this yield, the caller gets StopIteration

if __name__ == '__main__':
    g = gen_throw()
    assert g.send(5) == 5
    assert g.throw(Exception, 'throw Exception') == 'throw Exception'

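The first send returns value (5) and the generator suspends at line 5. The exception raised by throw is then caught at line 6. If line 7 did not yield again, StopIteration would be raised to the caller instead.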
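
The three outcomes of throw described above (caught and followed by a yield, generator finished, exception not caught) can be sketched as follows. This is an illustration in Python 3 syntax, using the single-argument form throw(exception_instance); the generator names resilient and fragile are made up.

```python
def resilient():
    """Turn an exception thrown in at the yield into a yielded value."""
    value = yield
    try:
        yield value
    except Exception as exc:
        yield "caught: %s" % exc


g = resilient()
next(g)                                  # prime: run to the first yield
echoed = g.send(5)                       # now suspended at `yield value`
recovered = g.throw(ValueError("boom"))  # raised at that yield, then caught

# The generator is suspended at its last yield; resuming it lets the body
# finish, which shows up as StopIteration in the caller.
try:
    next(g)
    finished = False
except StopIteration:
    finished = True


def fragile():
    """No except clause here, so a thrown exception reaches the caller."""
    yield 1


h = fragile()
next(h)
try:
    h.throw(KeyError("unhandled"))
    propagated = None
except KeyError as exc:
    propagated = exc
```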

Caveats

If a generator has started running through send (or next), it cannot be resumed again, even from another generator, until it reaches its next yield:

@consumer
def funcA():
    while True:
        data = yield
        print 'funcA receive', data
        fb.send(data * 2)

@consumer
def funcB():
    while True:
        data = yield
        print 'funcB receive', data
        fa.send(data * 2)  # fa is still running inside its own send()

fa = funcA()
fb = funcB()
if __name__ == '__main__':
    fa.send(10)

Output:

funcA receive 10
funcB receive 20
ValueError: generator already executing
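
One possible way around this restriction, sketched here as an illustration rather than taken from the original article, is to keep the generators from calling send on each other at all. Each generator yields a (target name, value) pair and a small driver loop performs the next send, so every generator is already suspended when it is resumed. The names ping, pong, run and max_steps are hypothetical, and the code uses Python 3 syntax.

```python
def ping(log):
    """Record each value received, then ask the driver to pass it on."""
    data = yield                         # wait for the first value
    while True:
        log.append(("ping", data))
        data = yield ("pong", data * 2)  # hand control back to the driver


def pong(log):
    """Same as ping, but forwards to the other side."""
    data = yield
    while True:
        log.append(("pong", data))
        data = yield ("ping", data * 2)


def run(coroutines, start, value, max_steps):
    """Drive the coroutines: only this loop ever calls send()."""
    for gen in coroutines.values():
        next(gen)                        # prime each generator
    target = start
    for _ in range(max_steps):
        target, value = coroutines[target].send(value)


log = []
run({"ping": ping(log), "pong": pong(log)}, "ping", 10, max_steps=4)
print(log)
```

The max_steps bound is only there so that the sketch terminates; the two generators would otherwise bounce values back and forth indefinitely.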

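Generator and Coroutine

Back to coroutines: see the Wikipedia article for a full explanation. My own understanding is fairly simple (and perhaps one-sided): a coroutine is a concurrent flow of execution that the programmer controls. Processes and threads are switched by the operating system's scheduler, whereas with coroutines the programmer decides when to switch out and when to switch back. Coroutines are much lighter than processes and threads, and they reduce the cost of context switching. Because the programmer controls scheduling, they also avoid, to some extent, a task being interrupted halfway through. As for where coroutines fit, I would summarize it as scenarios that need non-blocking waits, such as game programming, asynchronous I/O, and event-driven code.
In Python, the send and throw methods make a generator look a lot like a coroutine, but a generator is only a semicoroutine. The Python documentation puts it this way: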

“All of this makes generator functions quite similar to coroutines; they yield multiple times, they have more than one entry point and their execution can be suspended. The only difference is that a generator function cannot control where should the execution continue after it yields; the control is always transferred to the generator’s caller.”

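Even so, enhanced generators make more powerful things possible. Take the yield_dec example from the previous article: it could only wait passively and continue once the time was up. In some cases, for instance when an event fires, we want to resume the flow immediately, and we also care which event it was; that is when the caller should send to the generator. In another case we need to terminate the flow, so we can call close and do some cleanup in the code. Pseudocode: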

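# yield_dec is the decorator from the previous article
@yield_dec
def do(a):
    print 'do', a
    try:
        event = yield 5
        print 'post_do', a, event
    finally:
        print 'do sth'  # cleanup; also runs when the caller calls close()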
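
A runnable sketch of the two paths described above, without the yield_dec decorator (which belongs to the previous article and is not reproduced here). It uses Python 3 syntax; the name task and the log lists are made up for illustration.

```python
def task(log):
    """Wait for an event; always run cleanup, however the task ends."""
    log.append("start")
    try:
        event = yield 5               # suspend, yielding 5 as in the pseudocode
        log.append("event: %s" % event)
    finally:
        log.append("cleanup")         # runs on normal finish and on close()


# Case 1: an event arrives, so the caller resumes the task with send().
log_sent = []
t = task(log_sent)
timeout = next(t)                     # run to the yield; returns 5
try:
    t.send("clicked")
except StopIteration:
    pass                              # the task ran to completion

# Case 2: the caller cancels the task instead.
log_closed = []
t = task(log_closed)
next(t)
t.close()                             # raises GeneratorExit at the yield
```

In the cancelled case the line after the yield never runs, but the finally block still does, which is what makes close a clean way to stop a flow.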

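The other example mentioned earlier, asynchronous calls between services (processes), is also a very good fit for coroutines. Callbacks fragment the code and scatter one piece of logic across several functions; coroutines are much better, at least for reading the code. In other languages such as C# and Go, coroutines are part of the standard implementation, and in Go in particular they are the foundation of high concurrency. Python 3.x also added coroutine support through asyncio and async/await. In the 2.7 environment I use, greenlet is an option as well; a later post will cover it.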
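
As a rough illustration of the readability point, here is a minimal Python 3 sketch using asyncio and async/await. The functions fetch_user and greet are hypothetical stand-ins for a remote service call and its caller, and asyncio.sleep(0) only marks the place where real I/O would wait.

```python
import asyncio


async def fetch_user(user_id):
    """Stand-in for a remote call; the sleep marks where real I/O would wait."""
    await asyncio.sleep(0)
    return {"id": user_id, "name": "user-%d" % user_id}


async def greet(user_id):
    """The request and the use of its reply sit next to each other."""
    user = await fetch_user(user_id)   # suspends here until the reply arrives
    return "hello, %s" % user["name"]


async def main():
    # Run two calls concurrently and collect their results in order.
    return await asyncio.gather(greet(1), greet(2))


greetings = asyncio.run(main())
print(greetings)
```

With callbacks, the code that uses the reply would live in a separate function from the code that issues the request; here both stay in greet.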

References

https://www.python.org/dev/peps/pep-0342/
http://www.dabeaz.com/coroutines/
https://en.wikipedia.org/wiki/Coroutine#Implementations_for_Python
