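Created: 2019/12/08
Updated: None
Software:
Win 10 | Python 3.7.2 |
Note: All content may be quoted freely as long as the source and author are credited. If anything in this article is wrong or poorly worded, corrections are welcome.
Topic: 004.01 Searching in Different Python Data Types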
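While working on a data search-and-match project recently, I found that lookups on large amounts of data became extremely slow, to the point of being completely unacceptable. I wanted results 'immediately', but ended up waiting several hours. Ugh! Python certainly cannot compete with C or assembly language on speed, but I still had to find a way to speed things up. Below are several ways to look up 10,000 values in a data set of 10,000 records, along with the time each one took. The last method, index_list_sort(), is much faster than the others, but I still feel it is not fast enough, and this is only an integer search. What about strings? What about substrings? If you know a better method, please share it. Thank you!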
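On the question of strings and substrings, here is a minimal sketch rather than a measured answer. It assumes a small, made-up list of records; the names `records`, `queries`, `position` and `find_substring` are hypothetical and not part of the benchmark. Exact string matches can use the same hash-based lookup as integers, while substring matches fall back to scanning every record.

```python
# Hypothetical sample data; these names are illustrative only.
records = ["apple pie", "banana split", "cherry tart", "apple tart"]
queries = ["cherry tart", "lemon curd"]

# Exact string match: strings are hashable, so a dict lookup works
# the same way as it does for integers (-1 means "not found").
position = {text: i for i, text in enumerate(records)}
exact = {q: position.get(q, -1) for q in queries}

# Substring match: hashing does not help here, so each query
# scans every record with the `in` operator.
def find_substring(needle, haystack):
    """Return the indices of all records that contain `needle`."""
    return [i for i, text in enumerate(haystack) if needle in text]

print(exact)
print(find_substring("apple", records))
```

The substring scan costs one pass over the records per query, so for many queries on large data it would likely need a dedicated index, which is beyond this sketch.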
Results:
0:00:04.734338 : index_sequence
0:00:01.139984 : index_list
0:00:00.330116 : index_np
0:00:00.233343 : index_np_sort
0:00:00.223401 : index_dict
0:00:00.213462 : index_set
0:00:00.007977 : index_list_sort
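Several of the timings above sit close together at around 0.2 s. One possible explanation is that they include a fixed cost shared by every method, namely the do_something() placeholder that runs once per match. The sketch below is one way to check this, assuming a set lookup as the test case; `lookup_only` and `lookup_and_work` are hypothetical helper names, and no particular result is claimed here.

```python
import timeit

def do_something():
    # Same placeholder workload as in the listing: 1000 empty iterations.
    for i in range(1000):
        pass

size = 10000
db_set = set(range(size))

def lookup_only():
    # Set membership test without the placeholder workload.
    hits = 0
    for v in range(size):
        if v in db_set:
            hits += 1
    return hits

def lookup_and_work():
    # Same lookup, but calling the placeholder on every hit.
    hits = 0
    for v in range(size):
        if v in db_set:
            do_something()
            hits += 1
    return hits

# Time one full pass of each variant.
t_lookup = timeit.timeit(lookup_only, number=1)
t_total = timeit.timeit(lookup_and_work, number=1)
print("lookup only   :", t_lookup)
print("lookup + work :", t_total)
```

If the two numbers differ widely, the gap is the cost of the placeholder rather than of the search itself, which matters when comparing the faster methods.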
Code:
from datetime import datetime
import numpy as np
import bisect
import random

size = 10000
# The data set: the integers 0..size-1 in random order
db = random.sample(range(size), size)
# Sorted copy of the data set
db_sort = sorted(db)
db_set = set(db)
# Map each value to its position in db
db_dict = {db[i]: i for i in range(size)}
db_np = np.array(db)
# The values to look up
value = [i for i in range(size)]
def call(func):
    # Call the function, measure its execution time, then print the duration and the function name
    start_time = datetime.now()
    func()
    print(datetime.now() - start_time, ':', func.__name__)
def do_something():
    # Placeholder work for each match; it may make the durations differ when a multi-loop method is used
    for i in range(1000):
        pass
def index_sequence():
    # Unsorted list, plain nested loops with no methods or built-in functions
    for i in range(size):
        for j in range(size):
            if value[j] == db[i]:
                index = j
                do_something()
                break
def index_list():
    # Unsorted list, using list.index()
    for i in range(size):
        try:
            index = db.index(value[i])
        except ValueError:
            # list.index() raises ValueError when the value is absent
            index = -1
        if index >= 0:
            do_something()
def index_np():
    # NumPy array, using np.where()
    for i in range(size):
        result = np.where(db_np == value[i])
        if len(result[0]) != 0:
            do_something()
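# Sorted copy of the NumPy array; np.searchsorted() requires sorted input
db_np_sort = np.sort(db_np)

def index_np_sort():
    # Sorted NumPy array, using np.searchsorted()
    for i in range(size):
        result = np.searchsorted(db_np_sort, value[i])
        # searchsorted returns an insertion point, so confirm the value is really there
        if result < size and db_np_sort[result] == value[i]:
            do_something()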
def index_list_sort():
    # Sorted list, using the bisect module
    for i in range(size):
        index = bisect.bisect_left(db_sort, value[i])
        # bisect_left returns an insertion point, so confirm the value is really there
        if index < size and db_sort[index] == value[i]:
            do_something()
def index_set():
    # Set search
    for i in range(size):
        if value[i] in db_set:
            do_something()
def index_dict():
    # Dictionary search
    for i in range(size):
        try:
            index = db_dict[value[i]]
        except KeyError:
            # A missing key raises KeyError
            index = -1
        if index >= 0:
            do_something()
# Test execution time
call(index_sequence)
call(index_list)
call(index_np)
call(index_np_sort)
call(index_dict)
call(index_set)
call(index_list_sort)
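bisect.bisect_left() assumes its input list is already sorted and returns an insertion point rather than a found/not-found answer, so a lookup built on it has to confirm the element at that position. If that confirmation is wrong, matches can be missed and do_something() skipped, which makes the method look faster than it is. The following is a minimal sketch of a checked lookup, cross-checked against a set. `index_sorted` is a hypothetical helper name, and the small sample size and fixed seed are assumptions for illustration.

```python
import bisect
import random

def index_sorted(sorted_list, target):
    """Return the index of target in sorted_list, or -1 if it is absent."""
    i = bisect.bisect_left(sorted_list, target)
    # i is only an insertion point; check that the element there is the target.
    if i < len(sorted_list) and sorted_list[i] == target:
        return i
    return -1

random.seed(0)                         # fixed seed, for a repeatable run
db = random.sample(range(20), 10)      # 10 distinct values out of 0..19
db_sort = sorted(db)
db_set = set(db)

# The sorted-list lookup should agree with the set lookup for every value.
for v in range(20):
    found = index_sorted(db_sort, v) >= 0
    assert found == (v in db_set)

print("sorted-list lookup agrees with set lookup")
```

Running a cross-check like this before timing helps ensure every method finds the same matches, so the durations compare like with like.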
This work is licensed under the "CC License". Reposts must credit the author and link to this article.