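Introduction
functools, itertools, and operator are the three modules in the Python standard library that support functional programming. Used well, they let us write more concise, readable, Pythonic code. The examples below walk through how each of the three modules is used.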
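Using functools
functools is an important Python module that provides a number of very useful higher-order functions. A higher-order function is a function that accepts functions as arguments or returns a function as its result. Because functions are objects in Python, this functional style is easy to support.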
partial
    >>> from functools import partial
    >>> basetwo = partial(int, base=2)
    >>> basetwo('10010')
    18

basetwo('10010') is equivalent to calling int('10010', base=2). When a function takes too many arguments, functools.partial can create a new function with some of them already filled in, which simplifies the logic and makes the code more readable. Internally, partial behaves roughly like the following simple closure:

    def partial(func, *args, **keywords):
        def newfunc(*fargs, **fkeywords):
            newkeywords = keywords.copy()
            newkeywords.update(fkeywords)
            return func(*args, *fargs, **newkeywords)
        newfunc.func = func
        newfunc.args = args
        newfunc.keywords = keywords
        return newfunc
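To make the closure above a little more concrete, here is a minimal sketch that freezes one keyword argument and one positional argument, then inspects the func, args, and keywords attributes. The power function and the names square and two_to_the are hypothetical and exist only for this illustration.

```python
from functools import partial

def power(base, exponent):
    """Hypothetical helper used only for this illustration."""
    return base ** exponent

# Freeze a keyword argument: square(x) behaves like power(x, exponent=2).
square = partial(power, exponent=2)

# Freeze a positional argument: two_to_the(n) behaves like power(2, n).
two_to_the = partial(power, 2)

print(square(7))        # 49
print(two_to_the(10))   # 1024

# The frozen pieces stay inspectable on the partial object.
print(square.func is power, square.args, square.keywords)
```

Frozen positional arguments always come first, so partial(power, 2) fixes base rather than exponent.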
partialmethod
partialmethod is similar to partial, but it is designed to be used as a method definition: when it is looked up on an instance, the instance is passed in as self. A partial object stored in a class body is not bound this way, so when wrapping a function that is not already a method of the object, partialmethod is the one to use. The following example shows the difference between the two.

    from functools import partial, partialmethod

    def standalone(self, a=1, b=2):
        "Standalone function"
        print(' called standalone with:', (self, a, b))
        if self is not None:
            print(' self.attr =', self.attr)

    class MyClass:
        "Demonstration class for functools"
        def __init__(self):
            self.attr = 'instance attribute'
        method1 = partialmethod(standalone)  # uses partialmethod
        method2 = partial(standalone)        # uses partial

    >>> o = MyClass()
    >>> o.method1()
     called standalone with: (<__main__.MyClass object at 0x7f46d40cc550>, 1, 2)
     self.attr = instance attribute
    # partial cannot be used this way
    >>> o.method2()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: standalone() missing 1 required positional argument: 'self'
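A common use of partialmethod is to define several convenience methods on top of one general method. The sketch below assumes a hypothetical Cell class with a set_state method; none of these names come from the example above.

```python
from functools import partialmethod

class Cell:
    """Hypothetical class, used only to illustrate partialmethod."""
    def __init__(self):
        self.alive = False

    def set_state(self, state):
        self.alive = bool(state)

    # Each name becomes a real method: the instance is still passed as self,
    # and the state argument is filled in ahead of time.
    set_alive = partialmethod(set_state, True)
    set_dead = partialmethod(set_state, False)

c = Cell()
c.set_alive()
print(c.alive)
c.set_dead()
print(c.alive)
```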
singledispatch
Although Python does not support defining several methods with the same name that differ only in argument type, we can use singledispatch to choose the implementation dynamically based on the type of the first argument, instead of putting type checks inside the function body and hurting readability.

    from functools import singledispatch

    class TestClass(object):
        @singledispatch
        def test_method(arg, verbose=False):
            if verbose:
                print("Let me just say,", end=" ")
            print(arg)

        @test_method.register(int)
        def _(arg):
            print("Strength in numbers, eh?", end=" ")
            print(arg)

        @test_method.register(list)
        def _(arg):
            print("Enumerate this:")
            for i, elem in enumerate(arg):
                print(i, elem)

In the code above, @test_method.register(int) and @test_method.register(list) specify that when the first argument of test_method is an int or a list, a different function is called to handle it.

    >>> TestClass.test_method(55555)  # calls the @test_method.register(int) version
    Strength in numbers, eh? 55555
    >>> TestClass.test_method([33, 22, 11])  # calls the @test_method.register(list) version
    Enumerate this:
    0 33
    1 22
    2 11
    >>> TestClass.test_method('hello world', verbose=True)  # calls the default
    Let me just say, hello world
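The example above calls test_method on the class, so dispatch happens on the first argument that is passed in. For ordinary instance methods, where the first argument is self, Python 3.8 and later provide functools.singledispatchmethod, which dispatches on the first argument after self. The following is a minimal sketch; the Formatter class and its describe method are hypothetical names.

```python
from functools import singledispatchmethod

class Formatter:
    """Hypothetical class used only for this illustration."""

    @singledispatchmethod
    def describe(self, arg):
        # Fallback for types without a registered implementation.
        return "something else: {!r}".format(arg)

    @describe.register(int)
    def _(self, arg):
        return "an int: {}".format(arg)

    @describe.register(list)
    def _(self, arg):
        return "a list of {} items".format(len(arg))

f = Formatter()
print(f.describe(3))
print(f.describe([1, 2]))
print(f.describe("hi"))
```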
wraps
A decorator loses the decorated function's __name__, __doc__, and similar attributes; @wraps restores them.

    from functools import wraps

    def my_decorator(f):
        @wraps(f)
        def wrapper():
            """wrapper_doc"""
            print('Calling decorated function')
            return f()
        return wrapper

    @my_decorator
    def example():
        """example_doc"""
        print('Called example function')

    >>> example.__name__
    'example'
    >>> example.__doc__
    'example_doc'
    # Try removing @wraps(f) and running again: example's own __name__ and __doc__ are lost
    >>> example.__name__
    'wrapper'
    >>> example.__doc__
    'wrapper_doc'
We can also write this with update_wrapper.

    from functools import update_wrapper, wraps

    def g():
        ...
    g = update_wrapper(g, f)

    # equivalent to
    @wraps(f)
    def g():
        ...

@wraps is in fact implemented on top of update_wrapper.

    from functools import WRAPPER_ASSIGNMENTS, WRAPPER_UPDATES, update_wrapper

    def wraps(wrapped, assigned=WRAPPER_ASSIGNMENTS, updated=WRAPPER_UPDATES):
        def decorator(wrapper):
            return update_wrapper(wrapper, wrapped=wrapped,
                                  assigned=assigned, updated=updated)
        return decorator
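In practice a decorator usually has to forward arbitrary arguments, and update_wrapper also records the original function in a __wrapped__ attribute. The sketch below shows both; the logged decorator, its calls counter, and the add function are hypothetical names made up for this illustration.

```python
from functools import wraps

def logged(func):
    """Hypothetical decorator used only for this illustration."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Count the call, then forward all arguments unchanged.
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@logged
def add(a, b=0):
    """Return a + b."""
    return a + b

print(add(1, b=2))
print(add.__name__, add.__doc__)
print(add.__wrapped__(1, 2))   # calls the undecorated function directly
print(add.calls)
```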
total_ordering
In Python 2, objects could be compared by defining __cmp__, which returns 0, -1, or 1. Python 3 removed __cmp__, but we can apply total_ordering and define the rich comparison methods __lt__(), __le__(), __gt__(), __ge__(), __eq__(), and __ne__() to customize how a class is compared. P.S.: to use total_ordering, the class must define at least one of __lt__(), __le__(), __gt__(), or __ge__(), and should also provide an __eq__() method; the remaining comparisons are derived from those.

    import functools

    @functools.total_ordering
    class MyObject:
        def __init__(self, val):
            self.val = val

        def __eq__(self, other):
            print(' testing __eq__({}, {})'.format(self.val, other.val))
            return self.val == other.val

        def __gt__(self, other):
            print(' testing __gt__({}, {})'.format(self.val, other.val))
            return self.val > other.val

    a = MyObject(1)
    b = MyObject(2)

    for expr in ['a < b', 'a <= b', 'a == b', 'a >= b', 'a > b']:
        print('\n{:<6}:'.format(expr))
        result = eval(expr)
        print(' result of {}: {}'.format(expr, result))

Here is the output:

    a < b :
     testing __gt__(1, 2)
     testing __eq__(1, 2)
     result of a < b: True

    a <= b:
     testing __gt__(1, 2)
     result of a <= b: True

    a == b:
     testing __eq__(1, 2)
     result of a == b: False

    a >= b:
     testing __gt__(1, 2)
     testing __eq__(1, 2)
     result of a >= b: False

    a > b :
     testing __gt__(1, 2)
     result of a > b: False
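Once a class has a total ordering, it works with sorted(), min(), and max() without further effort. The following sketch assumes a hypothetical Version class that defines only __eq__ and __lt__, and returns NotImplemented for unrelated types so that Python can fall back to its default handling.

```python
from functools import total_ordering

@total_ordering
class Version:
    """Hypothetical class used only for this illustration."""
    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __repr__(self):
        return "Version({}, {})".format(self.major, self.minor)

versions = [Version(1, 10), Version(1, 2), Version(0, 9)]
print(sorted(versions))
print(Version(1, 2) >= Version(1, 2))   # derived from __lt__
print(max(versions))
```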
Using itertools
itertools provides very useful functions for working with iterables.
Infinite iterators
count
count(start=0, step=1) returns an infinite iterator of evenly spaced numbers, increasing by step (1 by default) each time. The starting number is optional and defaults to 0.

    >>> from itertools import count
    >>> for i in zip(count(1), ['a', 'b', 'c']):
    ...     print(i, end=' ')
    ...
    (1, 'a') (2, 'b') (3, 'c')
cycle
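cycle(iterable) repeats the items of the given iterable endlessly. It takes no second argument; to limit the number of items, pair it with a finite iterable in zip, as in the example here, or slice it with islice.

    >>> from itertools import cycle
    >>> for i in zip(range(6), cycle(['a', 'b', 'c'])):
    ...     print(i, end=' ')
    ...
    (0, 'a') (1, 'b') (2, 'c') (3, 'a') (4, 'b') (5, 'c')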
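Because cycle itself accepts only the iterable, another way to bound its output is to wrap it in islice. A minimal sketch, using a made-up colors list:

```python
from itertools import cycle, islice

colors = ['red', 'green', 'blue']   # hypothetical sample data

# cycle() never stops on its own, so take a fixed number of items with islice().
first_seven = list(islice(cycle(colors), 7))
print(first_seven)
```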
repeat
repeat(object[, times]) returns an iterator that yields the same object over and over. An optional second argument limits the number of repetitions.

    >>> from itertools import count, repeat
    >>> for i, s in zip(count(1), repeat('over-and-over', 5)):
    ...     print(i, s)
    ...
    1 over-and-over
    2 over-and-over
    3 over-and-over
    4 over-and-over
    5 over-and-over
Iterators terminating on the shortest input sequence
accumulate
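accumulate(iterable[, func]) returns an iterator of accumulated results: running sums by default, or the running results of the two-argument function func when one is given.

    >>> from itertools import accumulate
    >>> import operator
    >>> list(accumulate([1, 2, 3, 4, 5], operator.add))
    [1, 3, 6, 10, 15]
    >>> list(accumulate([1, 2, 3, 4, 5], operator.mul))
    [1, 2, 6, 24, 120]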
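The function passed to accumulate does not have to come from operator; any callable that takes two arguments will do. As a small sketch with made-up data, the built-in max gives a running maximum:

```python
from itertools import accumulate

data = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical sample data

# With no function, accumulate() yields running sums.
running_sum = list(accumulate(data))

# Any two-argument function works; max gives a running maximum.
running_max = list(accumulate(data, max))

print(running_sum)
print(running_max)
```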
chain
itertools.chain(*iterables) combines several iterables into a single iterator.

    >>> from itertools import chain
    >>> list(chain([1, 2, 3], ['a', 'b', 'c']))
    [1, 2, 3, 'a', 'b', 'c']

chain works roughly as follows:

    def chain(*iterables):
        # chain('ABC', 'DEF') --> A B C D E F
        for it in iterables:
            for element in it:
                yield element
chain.from_iterable
chain.from_iterable(iterable) is similar to chain, but it takes a single iterable whose elements are themselves iterables, and chains those elements into one iterator.

    >>> from itertools import chain
    >>> list(chain.from_iterable(['ABC', 'DEF']))
    ['A', 'B', 'C', 'D', 'E', 'F']

Its implementation is similar to that of chain.

    def from_iterable(iterables):
        # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
        for it in iterables:
            for element in it:
                yield element
compress
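compress(data, selectors) takes two iterables and returns only the items of data whose corresponding element in selectors is true. It stops when either data or selectors is exhausted.

    >>> from itertools import compress
    >>> list(compress([1, 2, 3, 4, 5], [True, True, False, False, True]))
    [1, 2, 5]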
zip_longest
zip_longest(*iterables, fillvalue=None) is similar to zip. The limitation of zip is that as soon as any one of the iterables is exhausted, the whole iteration stops. The following example shows the difference:

    from itertools import zip_longest

    r1 = range(3)
    r2 = range(2)

    print('zip stops early:')
    print(list(zip(r1, r2)))

    r1 = range(3)
    r2 = range(2)

    print('\nzip_longest processes all of the values:')
    print(list(zip_longest(r1, r2)))

Here is the output:

    zip stops early:
    [(0, 0), (1, 1)]

    zip_longest processes all of the values:
    [(0, 0), (1, 1), (2, None)]
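The fillvalue parameter controls what is used for the missing positions. A brief sketch with made-up names and scores lists:

```python
from itertools import zip_longest

names = ['a', 'b', 'c']    # hypothetical sample data
scores = [90, 85]

# Missing positions are filled with fillvalue instead of None.
pairs = list(zip_longest(names, scores, fillvalue=0))
print(pairs)
```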
islice
islice(iterable, stop) or islice(iterable, start, stop[, step]) is somewhat like slicing a Python string or list, except that negative values cannot be used for start, stop, or step.

    >>> from itertools import islice
    >>> for i in islice(range(100), 0, 100, 10):
    ...     print(i, end=' ')
    ...
    0 10 20 30 40 50 60 70 80 90
tee
tee(iterable, n=2) returns n independent iterators; n defaults to 2.

    from itertools import count, islice, tee

    r = islice(count(), 5)
    i1, i2 = tee(r)

    print('i1:', list(i1))
    print('i2:', list(i2))

    for i in r:
        print(i, end=' ')
        if i > 1:
            break

Here is the output. Note that once the iterators returned by tee(r) have been consumed, r itself is exhausted, so the for loop prints nothing:

    i1: [0, 1, 2, 3, 4]
    i2: [0, 1, 2, 3, 4]
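A classic use of tee is to walk over an iterable in overlapping pairs by advancing one of the two copies by a single step. The pairwise helper below is a sketch written for this illustration; Python 3.10 and later ship a built-in itertools.pairwise that serves the same purpose.

```python
from itertools import tee

def pairwise(iterable):
    """Yield overlapping pairs: s -> (s0, s1), (s1, s2), ..."""
    a, b = tee(iterable)
    next(b, None)          # advance the second copy by one item
    return zip(a, b)

print(list(pairwise([1, 2, 4, 7])))
```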
starmap
starmap(func, iterable) assumes that the iterable yields a stream of tuples, and calls func with each tuple unpacked as its arguments:

    >>> from itertools import starmap
    >>> import os
    >>> iterator = starmap(os.path.join,
    ...                    [('/bin', 'python'), ('/usr', 'bin', 'java'),
    ...                     ('/usr', 'bin', 'perl'), ('/usr', 'bin', 'ruby')])
    >>> list(iterator)
    ['/bin/python', '/usr/bin/java', '/usr/bin/perl', '/usr/bin/ruby']
filterfalse
filterfalse(predicate, iterable) is the opposite of filter(): it returns all elements for which predicate returns False.

    def is_even(x):
        return (x % 2) == 0

    itertools.filterfalse(is_even, itertools.count()) =>
      1, 3, 5, 7, 9, 11, 13, 15, ...
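Since count() is infinite, a runnable version of this has to cut the stream off somewhere. The sketch below defines its own is_even predicate and uses islice to take the first five results from filterfalse and, for comparison, from the built-in filter.

```python
from itertools import count, filterfalse, islice

def is_even(x):
    return x % 2 == 0

# count() is infinite, so islice() takes just the first five results.
odds = list(islice(filterfalse(is_even, count()), 5))
evens = list(islice(filter(is_even, count()), 5))
print(odds)
print(evens)
```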
takewhile
takewhile(predicate, iterable) keeps returning elements from the iterable for as long as predicate returns True. As soon as predicate returns False, the iteration ends.

    def less_than_10(x):
        return x < 10

    itertools.takewhile(less_than_10, itertools.count()) =>
      0, 1, 2, 3, 4, 5, 6, 7, 8, 9

    itertools.takewhile(is_even, itertools.count()) =>
      0
dropwhile
dropwhile(predicate, iterable) discards elements while predicate returns True, and then returns the rest of the iteration:

    itertools.dropwhile(less_than_10, itertools.count()) =>
      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...

    itertools.dropwhile(is_even, itertools.count()) =>
      1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
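Both functions look only at the leading run of elements: the predicate is not consulted again once it has failed. A small sketch on a finite, made-up list makes this visible, since values below 10 that appear later are not treated specially.

```python
from itertools import dropwhile, takewhile

readings = [1, 4, 6, 12, 3, 15, 2]   # hypothetical sample data

def less_than_10(x):
    return x < 10

head = list(takewhile(less_than_10, readings))
tail = list(dropwhile(less_than_10, readings))
print(head)   # stops at 12, even though 3 and 2 come later
print(tail)   # everything from 12 onward, including 3 and 2
```

Together the two results split the input at the first element that fails the predicate, so head + tail reproduces the original list.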
groupby
groupby(iterable, key=None) collects consecutive elements of the iterable that share the same key into groups. P.S.: the input sequence needs to be sorted on the key value in order for the groupings to work out as expected.
- [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
- [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D

    >>> import itertools
    >>> for key, group in itertools.groupby('AAAABBBCCDAABBB'):
    ...     print(key, list(group))
    ...
    A ['A', 'A', 'A', 'A']
    B ['B', 'B', 'B']
    C ['C', 'C']
    D ['D']
    A ['A', 'A']
    B ['B', 'B', 'B']

    city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
                 ('Anchorage', 'AK'), ('Nome', 'AK'),
                 ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
                 ...
                ]

    def get_state(city_state):
        return city_state[1]

    itertools.groupby(city_list, get_state) =>
      ('AL', iterator-1),
      ('AK', iterator-2),
      ('AZ', iterator-3), ...

    iterator-1 =>
      ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
    iterator-2 =>
      ('Anchorage', 'AK'), ('Nome', 'AK')
    iterator-3 =>
      ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
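The note about sorting is easy to overlook, so here is a minimal sketch of what happens with and without it. The words list and the first_letter key function are made up for this illustration.

```python
from itertools import groupby

words = ['apple', 'bob', 'avocado', 'cherry', 'banana', 'cat']   # hypothetical data

def first_letter(word):
    return word[0]

# Unsorted input: each key shows up as more than one group.
unsorted_keys = [k for k, _ in groupby(words, key=first_letter)]

# Sorting by the same key first makes each key appear exactly once.
ordered = sorted(words, key=first_letter)
grouped = {k: list(g) for k, g in groupby(ordered, key=first_letter)}

print(unsorted_keys)
print(grouped)
```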
Combinatoric generators
product
product(*iterables, repeat=1)
- product(A, B) returns the same as ((x,y) for x in A for y in B)
- product(A, repeat=4) means the same as product(A, A, A, A)
    from itertools import product

    def show(iterable):
        for i, item in enumerate(iterable, 1):
            print(item, end=' ')
            if (i % 3) == 0:
                print()
        print()

    print('Repeat 2:\n')
    show(product(range(3), repeat=2))

    print('Repeat 3:\n')
    show(product(range(3), repeat=3))

    Repeat 2:

    (0, 0) (0, 1) (0, 2)
    (1, 0) (1, 1) (1, 2)
    (2, 0) (2, 1) (2, 2)

    Repeat 3:

    (0, 0, 0) (0, 0, 1) (0, 0, 2)
    (0, 1, 0) (0, 1, 1) (0, 1, 2)
    (0, 2, 0) (0, 2, 1) (0, 2, 2)
    (1, 0, 0) (1, 0, 1) (1, 0, 2)
    (1, 1, 0) (1, 1, 1) (1, 1, 2)
    (1, 2, 0) (1, 2, 1) (1, 2, 2)
    (2, 0, 0) (2, 0, 1) (2, 0, 2)
    (2, 1, 0) (2, 1, 1) (2, 1, 2)
    (2, 2, 0) (2, 2, 1) (2, 2, 2)
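The two equivalences stated for product can be checked directly. The sketch below uses made-up sizes and colors lists and compares product against a nested comprehension and against passing the same iterable twice.

```python
from itertools import product

sizes = ['S', 'M']          # hypothetical sample data
colors = ['red', 'blue']

# product(A, B) yields the same pairs, in the same order, as a nested loop.
nested = [(s, c) for s in sizes for c in colors]
flat = list(product(sizes, colors))
print(flat)
print(flat == nested)

# repeat=n is shorthand for passing the same iterable n times.
print(list(product('01', repeat=2)) == list(product('01', '01')))
```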
permutations
permutations(iterable, r=None) returns all possible orderings of length r drawn from the iterable; when r is omitted, it defaults to the length of the iterable:

    from itertools import permutations

    def show(iterable):
        first = None
        for i, item in enumerate(iterable, 1):
            if first != item[0]:
                if first is not None:
                    print()
                first = item[0]
            print(''.join(item), end=' ')
        print()

    print('All permutations:\n')
    show(permutations('abcd'))

    print('\nPairs:\n')
    show(permutations('abcd', r=2))

Here is the output:

    All permutations:

    abcd abdc acbd acdb adbc adcb
    bacd badc bcad bcda bdac bdca
    cabd cadb cbad cbda cdab cdba
    dabc dacb dbac dbca dcab dcba

    Pairs:

    ab ac ad
    ba bc bd
    ca cb cd
    da db dc
combinations
combinations(iterable, r) returns an iterator over all r-tuples that can be formed from the elements of the iterable. The elements within each tuple keep the same order in which the iterable produced them. In the example below, unlike permutations above, a always comes before b, c, and d, b always comes before c and d, and c always comes before d.

    from itertools import combinations

    def show(iterable):
        first = None
        for i, item in enumerate(iterable, 1):
            if first != item[0]:
                if first is not None:
                    print()
                first = item[0]
            print(''.join(item), end=' ')
        print()

    print('Unique pairs:\n')
    show(combinations('abcd', r=2))

Here is the output:

    Unique pairs:

    ab ac ad
    bc bd
    cd
combinations_with_replacement
combinations_with_replacement(iterable, r) relaxes a different constraint: an element may be repeated within a single tuple, so combinations such as aa, bb, cc, and dd can appear:

    from itertools import combinations_with_replacement

    def show(iterable):
        first = None
        for i, item in enumerate(iterable, 1):
            if first != item[0]:
                if first is not None:
                    print()
                first = item[0]
            print(''.join(item), end=' ')
        print()

    print('Unique pairs:\n')
    show(combinations_with_replacement('abcd', r=2))

Here is the output:

    Unique pairs:

    aa ab ac ad
    bb bc bd
    cc cd
    dd
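One way to keep the three generators apart is to compare how many results each produces for the same input. The sketch below counts the pairs drawn from 'abcd' and checks them against the standard counting formulas, using math.perm and math.comb, which are available in Python 3.8 and later.

```python
from itertools import combinations, combinations_with_replacement, permutations
from math import comb, perm

items = 'abcd'
n, r = len(items), 2

n_perm = len(list(permutations(items, r)))
n_comb = len(list(combinations(items, r)))
n_cwr = len(list(combinations_with_replacement(items, r)))

print(n_perm, perm(n, r))          # order matters, no repeats
print(n_comb, comb(n, r))          # order ignored, no repeats
print(n_cwr, comb(n + r - 1, r))   # order ignored, repeats allowed
```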
Using operator
attrgetter
operator.attrgetter(attr) and operator.attrgetter(*attrs)
- After f = attrgetter('name'), the call f(b) returns b.name.
- After f = attrgetter('name', 'date'), the call f(b) returns (b.name, b.date).
- After f = attrgetter('name.first', 'name.last'), the call f(b) returns (b.name.first, b.name.last).
The following example shows how attrgetter is used:

    >>> class Student:
    ...     def __init__(self, name, grade, age):
    ...         self.name = name
    ...         self.grade = grade
    ...         self.age = age
    ...     def __repr__(self):
    ...         return repr((self.name, self.grade, self.age))
    >>> student_objects = [
    ...     Student('john', 'A', 15),
    ...     Student('jane', 'B', 12),
    ...     Student('dave', 'B', 10),
    ... ]
    >>> sorted(student_objects, key=lambda student: student.age)  # the traditional lambda approach
    [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
    >>> from operator import itemgetter, attrgetter
    >>> sorted(student_objects, key=attrgetter('age'))
    [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
    # For a two-level sort like the one below, a lambda would have to build the tuple by hand; attrgetter does it directly
    >>> sorted(student_objects, key=attrgetter('grade', 'age'))
    [('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]

attrgetter works roughly as follows:

    def attrgetter(*items):
        if any(not isinstance(item, str) for item in items):
            raise TypeError('attribute name must be a string')
        if len(items) == 1:
            attr = items[0]
            def g(obj):
                return resolve_attr(obj, attr)
        else:
            def g(obj):
                return tuple(resolve_attr(obj, attr) for attr in items)
        return g

    def resolve_attr(obj, attr):
        for name in attr.split("."):
            obj = getattr(obj, name)
        return obj
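The dotted form handled by resolve_attr is worth seeing in action. The sketch below builds a few made-up records with types.SimpleNamespace, sorts them by a nested attribute, and fetches two nested attributes at once.

```python
from operator import attrgetter
from types import SimpleNamespace

# Hypothetical records with a nested name object.
people = [
    SimpleNamespace(name=SimpleNamespace(first='Grace', last='Hopper'), year=1906),
    SimpleNamespace(name=SimpleNamespace(first='Alan', last='Turing'), year=1912),
    SimpleNamespace(name=SimpleNamespace(first='Ada', last='Lovelace'), year=1815),
]

# A dotted name walks through nested attributes.
by_last = sorted(people, key=attrgetter('name.last'))
print([p.name.last for p in by_last])

# Several names return a tuple of values.
get_full = attrgetter('name.first', 'name.last')
print(get_full(people[0]))
```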
itemgetter
operator.itemgetter(item) and operator.itemgetter(*items)
- After f = itemgetter(2), the call f(r) returns r[2].
- After g = itemgetter(2, 5, 3), the call g(r) returns (r[2], r[5], r[3]).
The following example shows how itemgetter is used:

    >>> student_tuples = [
    ...     ('john', 'A', 15),
    ...     ('jane', 'B', 12),
    ...     ('dave', 'B', 10),
    ... ]
    >>> sorted(student_tuples, key=lambda student: student[2])  # the traditional lambda approach
    [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
    >>> from operator import itemgetter
    >>> sorted(student_tuples, key=itemgetter(2))
    [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)]
    # For a two-level sort like the one below, a lambda would have to build the tuple by hand; itemgetter does it directly
    >>> sorted(student_tuples, key=itemgetter(1, 2))
    [('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]

itemgetter works roughly as follows:

    def itemgetter(*items):
        if len(items) == 1:
            item = items[0]
            def g(obj):
                return obj[item]
        else:
            def g(obj):
                return tuple(obj[item] for item in items)
        return g
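Because the implementation only relies on obj[item], itemgetter works with any object that supports indexing, including dictionaries. A minimal sketch with made-up rows:

```python
from operator import itemgetter

# Hypothetical records stored as dictionaries instead of tuples.
rows = [
    {'name': 'john', 'grade': 'A', 'age': 15},
    {'name': 'jane', 'grade': 'B', 'age': 12},
    {'name': 'dave', 'grade': 'B', 'age': 10},
]

# A single key returns the value itself.
youngest = min(rows, key=itemgetter('age'))
print(youngest['name'])

# Several keys return a tuple of values.
name_and_age = itemgetter('name', 'age')
print([name_and_age(r) for r in rows])
```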
methodcaller
operator.methodcaller(name[, args...])
- After f = methodcaller('name'), the call f(b) returns b.name().
- After f = methodcaller('name', 'foo', bar=1), the call f(b) returns b.name('foo', bar=1).
methodcaller works roughly as follows:

    def methodcaller(name, *args, **kwargs):
        def caller(obj):
            return getattr(obj, name)(*args, **kwargs)
        return caller
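The section gives no usage example for methodcaller, so here is a short sketch using ordinary string methods. The words list is made up for this illustration.

```python
from operator import methodcaller

words = ['  Alpha ', 'beta', ' GAMMA']   # hypothetical sample data

# Call the same zero-argument method on every item.
strip = methodcaller('strip')
print(list(map(strip, words)))

# Extra arguments are passed through to the method on every call.
split_on_comma = methodcaller('split', ',', maxsplit=1)
print(split_on_comma('a,b,c'))
```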