Python Module of the Week | heapq

Posted by yongxinz on 2018-12-11

Original URL: https://juejin.im/post/5c0f9c1fe51d451ac27c470a

The heapq module implements a min-heap sort algorithm suitable for use with Python lists.

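A heap is a tree-like data structure in which each child node has a sort-order relationship with its parent. A binary heap can be represented using a list or array organized so that the children of element N are at positions 2 * N + 1 and 2 * N + 2 (for zero-based indexes). This layout makes it possible to rearrange the heap in place, so it is not necessary to reallocate as much memory when adding or removing items.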
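The index arithmetic can be checked directly. The following is a minimal sketch; the helper names children and is_min_heap are made up for this illustration and are not part of heapq.

```python
import heapq


def children(n):
    """Return the list positions of the two children of position n."""
    return 2 * n + 1, 2 * n + 2


def is_min_heap(items):
    """Check that every parent is <= each child that exists."""
    for n in range(len(items)):
        for c in children(n):
            if c < len(items) and items[n] > items[c]:
                return False
    return True


values = [19, 9, 4, 10, 11]
print(is_min_heap(values))   # False
heapq.heapify(values)        # rearranges the same list in place
print(is_min_heap(values))   # True
```

Only the order of the values inside the list changes; no separate tree object is ever built.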

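A max-heap ensures that the parent is larger than or equal to both of its children. A min-heap requires that the parent be less than or equal to its children. Python's heapq module implements a min-heap.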
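Since only a min-heap is provided, one common workaround for numeric data is to store negated values. This is a sketch of that trick, not something the article itself describes, and it assumes the values can be negated.

```python
import heapq

data = [19, 9, 4, 10, 11]

# Store negated values so the largest original value sits at the root.
max_heap = [-n for n in data]
heapq.heapify(max_heap)

# Negate again on the way out to recover the original value.
largest = -heapq.heappop(max_heap)
print(largest)  # 19
```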

Example Data

The examples in this section use the data in heapq_heapdata.py.

# heapq_heapdata.py 
# This data was generated with the random module.

data = [19, 9, 4, 10, 11]

The heap output is printed using heapq_showtree.py.

# heapq_showtree.py 
import math
from io import StringIO


def show_tree(tree, total_width=36, fill=' '):
    """Pretty-print a tree."""
    output = StringIO()
    last_row = -1
    for i, n in enumerate(tree):
        if i:
            row = int(math.floor(math.log(i + 1, 2)))
        else:
            row = 0
        if row != last_row:
            output.write('\n')
        columns = 2 ** row
        col_width = int(math.floor(total_width / columns))
        output.write(str(n).center(col_width, fill))
        last_row = row
    print(output.getvalue())
    print('-' * total_width)
    print()

Creating a Heap

There are two basic ways to create a heap: heappush() and heapify().

import heapq
from heapq_showtree import show_tree
from heapq_heapdata import data

heap = []
print('random :', data)
print()

for n in data:
    print('add {:>3}:'.format(n))
    heapq.heappush(heap, n)
    show_tree(heap)
    
# output
# random : [19, 9, 4, 10, 11]
# 
# add  19:
# 
#                  19
# ------------------------------------
# 
# add   9:
# 
#                  9
#         19
# ------------------------------------
# 
# add   4:
# 
#                  4
#         19                9
# ------------------------------------
# 
# add  10:
# 
#                  4
#         10                9
#     19
# ------------------------------------
# 
# add  11:
# 
#                  4
#         10                9
#     19       11
# ------------------------------------

When heappush() is used, the heap order of the elements is maintained as each new item is added.

If the data is already in memory, it is more efficient to use heapify() to rearrange the items of the list in place.

import heapq
from heapq_showtree import show_tree
from heapq_heapdata import data

print('random    :', data)
heapq.heapify(data)
print('heapified :')
show_tree(data)

# output
# random    : [19, 9, 4, 10, 11]
# heapified :
# 
#                  4
#         9                 19
#     10       11
# ------------------------------------

Accessing the Contents of a Heap

Once the heap is organized correctly, use heappop() to remove the element with the lowest value.

import heapq
from heapq_showtree import show_tree
from heapq_heapdata import data

print('random    :', data)
heapq.heapify(data)
print('heapified :')
show_tree(data)
print()

for i in range(2):
    smallest = heapq.heappop(data)
    print('pop    {:>3}:'.format(smallest))
    show_tree(data)
    
# output
# random    : [19, 9, 4, 10, 11]
# heapified :
# 
#                  4
#         9                 19
#     10       11
# ------------------------------------
# 
# 
# pop      4:
# 
#                  9
#         10                19
#     11
# ------------------------------------
# 
# pop      9:
# 
#                  10
#         11                19
# ------------------------------------

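In this example, heapify() and heappop() are used to pull the two smallest values out of the list in sorted order. Repeating heappop() until the heap is empty would produce the fully sorted sequence.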
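Carrying the same idea through to the end gives a complete sort. The sketch below assumes a hypothetical helper named heap_sort, introduced here only for illustration.

```python
import heapq


def heap_sort(iterable):
    """Return a new sorted list built by popping from a heap."""
    heap = list(iterable)   # copy so the caller's data is untouched
    heapq.heapify(heap)
    # Each heappop() removes the smallest remaining value.
    return [heapq.heappop(heap) for _ in range(len(heap))]


print(heap_sort([19, 9, 4, 10, 11]))  # [4, 9, 10, 11, 19]
```

In everyday code sorted() is the simpler choice; the point here is only to show that popping repeatedly yields the values in ascending order.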

To remove the smallest element and replace it with a new value in a single operation, use heapreplace().

import heapq
from heapq_showtree import show_tree
from heapq_heapdata import data

heapq.heapify(data)
print('start:')
show_tree(data)

for n in [0, 13]:
    smallest = heapq.heapreplace(data, n)
    print('replace {:>2} with {:>2}:'.format(smallest, n))
    show_tree(data)
    
# output
# start:
# 
#                  4
#         9                 19
#     10       11
# ------------------------------------
# 
# replace  4 with  0:
# 
#                  0
#         9                 19
#     10       11
# ------------------------------------
# 
# replace  0 with 13:
# 
#                  9
#         10                19
#     13       11
# ------------------------------------

Replacing elements in place makes it possible to maintain a fixed-size heap, such as a queue of jobs ordered by priority.
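One way a fixed-size heap can be used is to remember only the k largest values from a stream. This is a rough sketch under that assumption; keep_largest is a hypothetical helper, not a heapq function.

```python
import heapq


def keep_largest(stream, k):
    """Return the k largest values seen in stream, in ascending order."""
    if k <= 0:
        return []
    heap = []
    for value in stream:
        if len(heap) < k:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # heap[0] is the smallest value kept so far; swap it out
            # in one step so the heap never grows past k items.
            heapq.heapreplace(heap, value)
    return sorted(heap)


print(keep_largest([19, 9, 4, 10, 11, 0, 13], 3))  # [11, 13, 19]
```

Because the heap never holds more than k items, memory use stays bounded no matter how long the stream is.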

Data Extremes from a Heap

heapq also includes two functions to examine an iterable and find a range of the largest or smallest values it contains.

import heapq
from heapq_heapdata import data

print('all       :', data)
print('3 largest :', heapq.nlargest(3, data))
print('from sort :', list(reversed(sorted(data)[-3:])))
print('3 smallest:', heapq.nsmallest(3, data))
print('from sort :', sorted(data)[:3])

# output
# all       : [19, 9, 4, 10, 11]
# 3 largest : [19, 11, 10]
# from sort : [19, 11, 10]
# 3 smallest: [4, 9, 10]
# from sort : [4, 9, 10]

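Using nlargest() and nsmallest() is efficient only for relatively small values of n > 1, but can still come in handy in a few cases. When n is 1, the built-in min() and max() are a better fit, and when n is close to the size of the input, sorted() with a slice is usually more efficient.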
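Both functions also accept a key argument, which helps when the items are records rather than plain numbers. The job records below are made up for this sketch.

```python
import heapq

# Hypothetical (name, priority) records, invented for this example.
jobs = [('backup', 3), ('deploy', 9), ('lint', 1), ('test', 7)]

# Rank by the second field of each record.
top_two = heapq.nlargest(2, jobs, key=lambda job: job[1])
print(top_two)  # [('deploy', 9), ('test', 7)]

# For a single extreme value, the built-ins are the simpler choice.
lowest = min(jobs, key=lambda job: job[1])
print(lowest)  # ('lint', 1)
```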

Merging Sorted Sequences Efficiently

Combining several sorted sequences into one new sequence is easy for small data sets.

import itertools
sorted(itertools.chain(*data))  # sorted() already returns a list

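For larger data sets, this technique can use a considerable amount of memory. Instead of sorting the entire combined sequence, use merge() to generate the new sequence one item at a time.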

import heapq
import random


random.seed(2016)

data = []
for i in range(4):
    new_data = list(random.sample(range(1, 101), 5))
    new_data.sort()
    data.append(new_data)

for i, d in enumerate(data):
    print('{}: {}'.format(i, d))

print('\nMerged:')
for i in heapq.merge(*data):
    print(i, end=' ')
print()

# output
# 0: [33, 58, 71, 88, 95]
# 1: [10, 11, 17, 38, 91]
# 2: [13, 18, 39, 61, 63]
# 3: [20, 27, 31, 42, 45]
# 
# Merged:
# 10 11 13 17 18 20 27 31 33 38 39 42 45 58 61 63 71 88 91 95

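Because the implementation of merge() uses a heap, it consumes memory based on the number of sequences being merged, rather than the total number of items in those sequences.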
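A quick way to see this is to merge inputs that never end. In the sketch below, multiples is a hypothetical helper written for this illustration; merge() returns an iterator, so only the items actually requested are produced.

```python
import heapq
import itertools


def multiples(step):
    """Return an endless, already sorted iterator: step, 2*step, ..."""
    return itertools.count(step, step)


# Two infinite sorted inputs can be merged as long as only part of
# the result is consumed.
merged = heapq.merge(multiples(3), multiples(5))
first_eight = list(itertools.islice(merged, 8))
print(first_eight)  # [3, 5, 6, 9, 10, 12, 15, 15]
```

Sorting the chained inputs would never finish here, while merge() only needs to look at the current head of each input.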

Related documentation:

pymotw.com/3/heapq/ind…

