[PY3] Solving Top-N/Bottom-N and Sorting Problems

Posted by Jelly_lyj on 2017-03-18

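Sorting

Requirements

Find the top N items of a sequence of length K
Find the bottom N items of a sequence of length K
Sorting problems

Solutions

heapq.nlargest(), heapq.nsmallest()
sorted() + slicing
max(), min()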

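Summary and comparison

1) In a Top-N problem with N=1, use max(iterable)/min(iterable) directly (the most efficient option).

2) If N is large, close to the number of elements in the collection, sorting and then slicing is more efficient, for example:

 The N largest elements: sorted(iterable, key=key, reverse=True)[:N]

 The N smallest elements: sorted(iterable, key=key)[:N]

3) When the number of elements to find is small relative to the size of the collection, nlargest() and nsmallest() are a good fit.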
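The three rules above can be folded into one small helper. This is only a sketch: the name `top_n` and the "half the length" cut-off are assumptions made for illustration, not something the standard library prescribes.

```python
import heapq

def top_n(iterable, n, key=None):
    """Return the n largest items, largest first (illustrative helper)."""
    items = list(iterable)
    if n <= 0 or not items:
        return []
    if n == 1:
        # Rule 1: a single pass with max() is enough.
        return [max(items, key=key)]
    if n >= len(items) // 2:
        # Rule 2: N is close to K, so sort everything and slice.
        # The cut-off of half the length is an arbitrary assumption.
        return sorted(items, key=key, reverse=True)[:n]
    # Rule 3: N is small relative to K, so use the heap-based function.
    return heapq.nlargest(n, items, key=key)

nums = [16, 7, 3, 20, 17, 8, -1]
print(top_n(nums, 1))  # [20]
print(top_n(nums, 2))  # [20, 17]
print(top_n(nums, 6))  # [20, 17, 16, 8, 7, 3]
```

Where exactly the cut-off should sit depends on the data and the Python version, so it is worth measuring before relying on it.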

max()/min() in detail

Top-N/Bottom-N on a simple sequence (N=1)
```
lst = [1, 2, 3, 4, 5]

print(max(lst))
# 5
```
Use the key argument to supply a function that defines the comparison criterion
```
a = [-9, -8, 1, 3, -4, 6]

print(max(a, key=abs))
# -9
```

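Find the entry with the largest value in a dictionary

    prices = {
        'A': 123,
        'B': 450.1,
        'C': 12,
        'E': 444,
    }

    # Operations on a dictionary work on its keys by default, not its values.
    # Use zip to swap the keys and values first, then use max to pick the entry with the largest value.
    max_prices = max(zip(prices.values(), prices.keys()))
    print(max_prices)
    # (450.1, 'B')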
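As an alternative to swapping keys and values with zip, max() can be given a key function that looks up the value. A minimal sketch on the same prices data; the names `best_key` and `best_item` are illustrative only:

```python
prices = {'A': 123, 'B': 450.1, 'C': 12, 'E': 444}

# Iterate over the keys, but compare them by their looked-up values.
best_key = max(prices, key=prices.get)

# Iterate over (key, value) pairs and compare by the value.
best_item = max(prices.items(), key=lambda kv: kv[1])

print(best_key)   # B
print(best_item)  # ('B', 450.1)
```

One difference from the zip version: when two values are equal, zip falls back to comparing the keys, whereas a key function makes max() return the first maximal entry it encounters.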

nlargest()/nsmallest() in detail

nlargest(n, iterable) returns the top N of iterable | nsmallest(n, iterable) returns the bottom N of iterable

    import heapq

    nums = [16, 7, 3, 20, 17, 8, -1]
    print(heapq.nlargest(3, nums))
    print(heapq.nsmallest(3, nums))
    # [20, 17, 16]
    # [-1, 3, 7]
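To see why a heap suits a small N, it can help to write the idea out by hand: keep a min-heap holding at most N items, and replace its smallest item whenever a larger one arrives. The function below is a conceptual sketch with a hypothetical name (`n_largest_sketch`); it is not claimed to be how heapq implements nlargest() internally.

```python
import heapq

def n_largest_sketch(n, iterable):
    """Keep the n largest items seen so far in a min-heap of size n."""
    if n <= 0:
        return []
    heap = []
    for item in iterable:
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # heap[0] is the smallest of the current top n; swap it out.
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)

nums = [16, 7, 3, 20, 17, 8, -1]
print(n_largest_sketch(3, nums))  # [20, 17, 16]
```

Each element costs at most one O(log N) heap operation in this sketch, so the whole pass is O(K log N), which is cheap when N is much smaller than K.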

nlargest(n, iterable, key=...) | nsmallest(n, iterable, key=...): key is a keyword argument that takes a function, which is useful for more complex data structures

    def print_price(records):
        # Print the 'price' entry of every dictionary in the list
        for record in records:
            for k, v in record.items():
                if k == 'price':
                    print(k, v)

    portfolio = [
        {'name': 'IBM', 'shares': 100, 'price': 91.1},
        {'name': 'AAPL', 'shares': 50, 'price': 543.22},
        {'name': 'FB', 'shares': 200, 'price': 21.09},
        {'name': 'HPQ', 'shares': 35, 'price': 31.75},
        {'name': 'YHOO', 'shares': 45, 'price': 16.35},
        {'name': 'ACME', 'shares': 75, 'price': 115.65},
    ]

    cheap = heapq.nsmallest(3, portfolio, key=lambda x: x['price'])
    expensive = heapq.nlargest(3, portfolio, key=lambda y: y['price'])

    print_price(cheap)
    print_price(expensive)
    # price 16.35
    # price 21.09
    # price 31.75
    # price 543.22
    # price 115.65
    # price 91.1

sorted() in detail

sorted(iterable, key=None, reverse=False)

reverse=True sorts in descending order

    nums = [16, 7, 3, 20, 17, 8, -1]
    print(sorted(nums))
    print(sorted(nums, reverse=True))
    # [-1, 3, 7, 8, 16, 17, 20]
    # [20, 17, 16, 8, 7, 3, -1]

    # Named chars rather than str, to avoid shadowing the built-in str type
    chars = ['b', 'a', 'A', 's']
    print(sorted(chars))
    print(sorted(chars, reverse=True))
    # ['A', 'a', 'b', 's']
    # ['s', 'b', 'a', 'A']

key takes a function, and that function must accept exactly one element.
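Any one-argument callable works as a key, including built-ins and unbound methods. A short sketch (the sample lists are made up for illustration):

```python
chars = ['b', 'a', 'A', 's']

# str.lower is called once per element; elements are ordered by its result,
# which gives a case-insensitive sort.
by_lower = sorted(chars, key=str.lower)
print(by_lower)  # ['a', 'A', 'b', 's']

words = ['pear', 'fig', 'banana']
# len is also a one-argument function: sort by string length.
by_length = sorted(words, key=len)
print(by_length)  # ['fig', 'pear', 'banana']
```

Because sorted() is stable, 'a' stays ahead of 'A' here: both map to the same key, so they keep their original relative order.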

How should a key with multiple criteria be written?

    # Sort by length
    L = [{1: 5, 3: 4}, {1: 3, 6: 3}, {1: 1, 2: 4, 5: 6}, {1: 9}]
    print(sorted(L, key=len))
    # [{1: 9}, {1: 5, 3: 4}, {1: 3, 6: 3}, {1: 1, 2: 4, 5: 6}]

    # Sort by a specified value (for example, one field of a tuple or one key of a dictionary)
    L = [
        ('john', 'A', 15),
        ('jane', 'B', 10),
        ('dave', 'B', 12),
    ]
    print(sorted(L, key=lambda x: x[2], reverse=True))
    # [('john', 'A', 15), ('dave', 'B', 12), ('jane', 'B', 10)]

    portfolio = [
        {'name': 'IBM', 'shares': 100, 'price': 91.1},
        {'name': 'AAPL', 'shares': 50, 'price': 543.22},
        {'name': 'FB', 'shares': 200, 'price': 21.09},
        {'name': 'HPQ', 'shares': 35, 'price': 31.75},
        {'name': 'YHOO', 'shares': 45, 'price': 16.35},
        {'name': 'ACME', 'shares': 75, 'price': 115.65},
    ]
    print(sorted(portfolio, key=lambda x: x['price']))
    # Output (keys appear in insertion order on Python 3.7+):
    # [{'name': 'YHOO', 'shares': 45, 'price': 16.35},
    #  {'name': 'FB', 'shares': 200, 'price': 21.09},
    #  {'name': 'HPQ', 'shares': 35, 'price': 31.75},
    #  {'name': 'IBM', 'shares': 100, 'price': 91.1},
    #  {'name': 'ACME', 'shares': 75, 'price': 115.65},
    #  {'name': 'AAPL', 'shares': 50, 'price': 543.22}]

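    # An irregular string, sorted in the order "lowercase, uppercase, odd digits, even digits"
    s = 'asdf234GDSdsf23'
    print("".join(sorted(s, key=lambda x: (x.isdigit(), x.isdigit() and int(x) % 2 == 0, x.isupper(), x))))
    # addffssDGS33224

    # An interview question. Requirements: 1. non-negative numbers first, negative numbers last;
    # 2. non-negative numbers in ascending order; 3. negative numbers in descending order
    list1 = [7, -8, 5, 4, 0, -2, -5]
    print(sorted(list1, key=lambda x: (x < 0, abs(x))))
    # [0, 4, 5, 7, -2, -5, -8]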
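The last two keys return tuples. Python compares tuples element by element, and False sorts before True, so the first element acts as the primary criterion and the later ones break ties. A minimal sketch with made-up student records (the data and the name `ranked` are illustrative), negating a numeric field to sort that level in descending order:

```python
students = [
    ('john', 'A', 15),
    ('jane', 'B', 10),
    ('dave', 'B', 12),
    ('anna', 'A', 9),
]

# Primary criterion: grade ascending.
# Secondary criterion: score descending (achieved by negating the number).
ranked = sorted(students, key=lambda s: (s[1], -s[2]))
print(ranked)
# [('john', 'A', 15), ('anna', 'A', 9), ('dave', 'B', 12), ('jane', 'B', 10)]

# A boolean as the first element pushes one group behind the other:
# x < 0 is False for non-negative numbers, so they come first.
signed = sorted([7, -8, 5, 4, 0, -2, -5], key=lambda x: (x < 0, abs(x)))
print(signed)
# [0, 4, 5, 7, -2, -5, -8]
```

Negation only works for numeric fields; for other types, a common alternative is to sort twice, relying on the stability of sorted().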

Using functions from operator to speed things up and to do multi-level sorting

    from operator import itemgetter, attrgetter

Not covered here for now.
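Although the post leaves this topic aside, a brief sketch may show what the two imports are for. itemgetter and attrgetter build the key function for you and are often somewhat faster than an equivalent lambda. The sample data and the `Stock` record type below are assumptions made for illustration.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

portfolio = [
    {'name': 'IBM', 'shares': 100, 'price': 91.1},
    {'name': 'AAPL', 'shares': 50, 'price': 543.22},
    {'name': 'FB', 'shares': 200, 'price': 21.09},
]
# itemgetter('price') behaves like lambda d: d['price'].
by_price = sorted(portfolio, key=itemgetter('price'))
print([p['name'] for p in by_price])  # ['FB', 'IBM', 'AAPL']

rows = [('john', 'A', 15), ('jane', 'B', 10), ('dave', 'B', 12)]
# itemgetter(1, 2) returns a (grade, score) tuple: a two-level sort.
by_grade_then_score = sorted(rows, key=itemgetter(1, 2))
print(by_grade_then_score)
# [('john', 'A', 15), ('jane', 'B', 10), ('dave', 'B', 12)]

Stock = namedtuple('Stock', ['name', 'price'])  # hypothetical record type
stocks = [Stock('IBM', 91.1), Stock('FB', 21.09)]
# attrgetter('price') behaves like lambda s: s.price.
by_attr = sorted(stocks, key=attrgetter('price'))
print(by_attr)
# [Stock(name='FB', price=21.09), Stock(name='IBM', price=91.1)]
```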

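Comparing the efficiency of the three methods

Note: %time on a line by itself times an empty statement, not the call that precedes it, so the wall times recorded below do not measure the calls themselves. Putting the statement on the same line (for example, %timeit max(nums)) gives a meaningful measurement.

For TopN=1/BtmN=1 only, comparing max() with nlargest()

    In [8]: nums=random.sample(range(1,10000),999)
    In [9]: print(max(nums))
    9999
    In [10]: %time
    CPU times: user 0 ns, sys: 0 ns, total: 0 ns
    Wall time: 13.1 µs
    In [11]: heapq.nlargest(1,nums)
    Out[11]: [9999]
    In [12]: %time
    CPU times: user 0 ns, sys: 0 ns, total: 0 ns
    Wall time: 14.1 µs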

With K = 10 and N = 9 (that is, N very close to K), comparing sorted() + slicing with nlargest()

    In [23]: nums=random.sample(range(1,10000),10)
    In [24]: sorted(nums,reverse=True)[:9]
    Out[24]: [8814, 7551, 7318, 5597, 5257, 4437, 4211, 2776, 2440]
    In [25]: %time
    CPU times: user 0 ns, sys: 0 ns, total: 0 ns
    Wall time: 11.4 µs
    In [26]: heapq.nlargest(9,nums)
    Out[26]: [8814, 7551, 7318, 5597, 5257, 4437, 4211, 2776, 2440]
    In [27]: %time
    CPU times: user 0 ns, sys: 0 ns, total: 0 ns
    Wall time: 154 µs

With a small N, comparing nlargest() with sorted() + slicing

    In [18]: nums=[16,7,3,20,17,8,-1]
    In [19]: heapq.nlargest(3,nums)
    Out[19]: [20, 17, 16]
    In [20]: %time
    CPU times: user 0 ns, sys: 0 ns, total: 0 ns
    Wall time: 4.05 µs
    In [21]: sorted(nums,reverse=True)[:3]
    Out[21]: [20, 17, 16]
    In [22]: %time
    CPU times: user 0 ns, sys: 0 ns, total: 0 ns
    Wall time: 5.48 µs
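For a repeatable comparison outside IPython, the same three scenarios can be timed with the standard timeit module. This is a sketch under assumptions: the helper name `bench`, the fixed seed, the list size and the repeat count are all arbitrary choices. No expected timings are shown because they vary by machine and Python version.

```python
import heapq
import random
import timeit

random.seed(0)  # fixed seed so every run uses the same data
nums = random.sample(range(1, 10000), 999)

def bench(label, func, number=200):
    """Call func() `number` times and print the average time per call."""
    total = timeit.timeit(func, number=number)
    print(f"{label:<28}{total / number * 1e6:10.1f} µs per call")

# Scenario 1: N = 1
bench("max(nums)", lambda: max(nums))
bench("nlargest(1, nums)", lambda: heapq.nlargest(1, nums))

# Scenario 2: small N
bench("nlargest(3, nums)", lambda: heapq.nlargest(3, nums))
bench("sorted(nums)[:3]", lambda: sorted(nums, reverse=True)[:3])

# Scenario 3: N close to K
n = len(nums) - 1
bench("nlargest(K-1, nums)", lambda: heapq.nlargest(n, nums))
bench("sorted(nums)[:K-1]", lambda: sorted(nums, reverse=True)[:n])
```

Averaging over many calls matters here: a single call on a list this small takes only microseconds, which is within the noise of one measurement.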

Imports used by the code above, and show_tree()

    import math
    import io
    from io import StringIO
    import heapq
    import random
    import time
    from functools import wraps

    def show_tree(tree, total_width=36, fill=' '):
        """Print a heap (stored as a list) laid out as a binary tree."""
        output = io.StringIO()  # create a StringIO object to buffer the output
        last_row = -1
        for i, n in enumerate(tree):
            # The element at index i sits on row floor(log2(i + 1)) of the tree.
            # math.log2 is used because it is exact for powers of two.
            if i:
                row = int(math.floor(math.log2(i + 1)))
            else:
                row = 0
            if row != last_row:
                output.write('\n')  # start a new line for each new row
            columns = 2 ** row
            col_width = int(math.floor((total_width * 1.0) / columns))
            output.write(str(n).center(col_width, fill))
            last_row = row
        print(output.getvalue())
        print('-' * total_width)
        print(' ')
