Using Django QuerySets Effectively

Posted by oschina on 2013-06-20

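Object-relational mapping (ORM) makes interacting with SQL databases much simpler, but it also has a reputation for being inefficient and slower than raw SQL.

Using an ORM effectively means understanding at least a little about how it queries the database. In this article I focus on how to use the Django ORM efficiently to work with medium to large datasets.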

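Django querysets are lazy

A Django queryset represents a number of rows in the database, optionally filtered by a query. For example, the following code represents all people in the database whose first name is 'Dave':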

person_set = Person.objects.filter(first_name="Dave")

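The code above does not run any database queries. You can take person_set, apply further filters to it, or pass it to a function, and nothing is sent to the database. This is a good thing, because database queries are one of the factors that most affect the performance of a web application.

To actually fetch data from the database, you need to iterate over the queryset: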

for person in person_set:
    print(person.last_name)

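Django querysets have a cache

When you iterate over a queryset, all matching rows are fetched from the database and converted into Django models. This is called evaluation. The models are stored in the queryset's built-in cache, so if you iterate over the same queryset again, the same query does not have to be run a second time.

For example, the following code runs only one database query: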

pet_set = Pet.objects.filter(species="Dog")
# The query is executed and cached.
for pet in pet_set:
    print(pet.first_name)
# The cache is used for subsequent iteration.
for pet in pet_set:
    print(pet.last_name)

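if statements trigger queryset evaluation

The most useful thing about the queryset cache is that it lets you efficiently test whether the queryset contains any rows, and iterate over them only if it does:

restaurant_set = Restaurant.objects.filter(cuisine="Indian")
# The `if` statement evaluates the queryset.
if restaurant_set:
    # The iteration uses the cached rows.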
    for restaurant in restaurant_set:
        print(restaurant.name)
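
The three behaviours described so far (laziness, the result cache, and evaluation on truth-testing) can be imitated with a small pure-Python class. This is only a toy model, not Django's implementation: `ToyQuerySet`, the `fetch_count` counter, and the list standing in for a database table are all assumptions made for illustration.

```python
class ToyQuerySet:
    """Toy model of a lazy, caching queryset. Not Django's actual code."""

    def __init__(self, table, predicates=(), stats=None):
        self._table = table                   # a list standing in for a database table
        self._predicates = tuple(predicates)  # filters accumulated so far
        self._cache = None                    # filled on first evaluation
        self.stats = stats if stats is not None else {"fetch_count": 0}

    def filter(self, predicate):
        # Returns a new lazy object; nothing is fetched here.
        return ToyQuerySet(self._table, self._predicates + (predicate,), self.stats)

    def _evaluate(self):
        # Simulate one database query on first use, then reuse the stored rows.
        if self._cache is None:
            self.stats["fetch_count"] += 1
            self._cache = [
                row for row in self._table
                if all(pred(row) for pred in self._predicates)
            ]
        return self._cache

    def __iter__(self):
        return iter(self._evaluate())

    def __bool__(self):
        # Truth-testing evaluates the whole result set, like `if queryset:`.
        return bool(self._evaluate())


table = [
    {"name": "Spice House", "cuisine": "Indian"},
    {"name": "Pasta Bar", "cuisine": "Italian"},
    {"name": "Curry Corner", "cuisine": "Indian"},
]
indian = ToyQuerySet(table).filter(lambda row: row["cuisine"] == "Indian")
fetches_before = indian.stats["fetch_count"]  # filter() alone fetched nothing

names = []
if indian:                  # triggers the single simulated query
    for row in indian:      # served from the cache
        names.append(row["name"])
fetches_after = indian.stats["fetch_count"]
```

In this model the `if` test and the `for` loop together cost one simulated fetch, which mirrors the pattern in the restaurant example.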

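The queryset cache can be a problem if you don't need all the data

Sometimes you only want to know whether any rows exist, without iterating over all of them. In that case, a plain if statement still evaluates the entire queryset and puts the rows in the cache, even though you never use them!

city_set = City.objects.filter(name="Cambridge")
# The `if` statement evaluates the queryset.
if city_set:
    # We don't need the rows, but the ORM still fetched all of them!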
    print("At least one city called Cambridge still stands!")

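To avoid this, use the exists() method to check whether any rows were found:

tree_set = Tree.objects.filter(type="deciduous")
# The `exists()` check avoids populating the queryset cache.
if tree_set.exists():
    # No rows were fetched from the database, saving bandwidth and memory.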
    print("There are still hardwood trees in the world!")
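
To see why an existence check is cheaper, it helps to look at the SQL level. The sketch below uses Python's built-in `sqlite3` module instead of Django, with a hypothetical `tree` table. The `SELECT 1 ... LIMIT 1` query is an assumption about the general shape of an existence check; the exact SQL that `exists()` emits depends on the Django version and the database backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, name TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO tree (name, type) VALUES (?, ?)",
    [("oak", "deciduous"), ("pine", "evergreen"), ("maple", "deciduous")],
)

# Truth-testing a full result set: every matching row is transferred.
all_rows = conn.execute(
    "SELECT id, name, type FROM tree WHERE type = ?", ("deciduous",)
).fetchall()
has_rows_expensive = bool(all_rows)

# Existence check: ask for at most one constant value and no row data.
probe = conn.execute(
    "SELECT 1 FROM tree WHERE type = ? LIMIT 1", ("deciduous",)
).fetchone()
has_rows_cheap = probe is not None
```

Both checks give the same answer, but the second one never moves the matching rows out of the database.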

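The queryset cache becomes a problem when the queryset is huge

When processing thousands of records, loading them all into memory at once is wasteful. Worse, a huge queryset can lock up system processes and bring your application to the brink of collapse.

To iterate over the data without building up the queryset cache, use the iterator() method, which fetches the data and discards each row once it has been processed.

star_set = Star.objects.all()
# `iterator()` fetches only a small number of rows from the database at a time, saving memory.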
for star in star_set.iterator():
    print(star.name)

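Of course, using iterator() to prevent the cache from being built means that iterating over the same queryset again runs the query again. So use iterator() with care, and make sure your code does not run the same query repeatedly when working with a large queryset.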
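
This trade-off can be illustrated without Django. The following sketch uses the standard `sqlite3` module and a hypothetical helper, `iter_rows`, that streams rows in small batches and keeps no cache. It only illustrates the idea and is not how `iterator()` is implemented; the `star` table and the `query_log` list are likewise assumptions.

```python
import sqlite3


def iter_rows(conn, sql, params=(), batch_size=2, query_log=None):
    """Yield rows one at a time, fetching them from the cursor in batches."""
    if query_log is not None:
        query_log.append(sql)          # record that a query is being run
    cursor = conn.execute(sql, params)
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            return
        yield from batch               # rows are handed over, never stored


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE star (name TEXT)")
conn.executemany(
    "INSERT INTO star (name) VALUES (?)",
    [("Sirius",), ("Vega",), ("Rigel",)],
)

query = "SELECT name FROM star ORDER BY name"
log = []
first_pass = [name for (name,) in iter_rows(conn, query, query_log=log)]
# Nothing was cached, so a second pass has to run the query again.
second_pass = [name for (name,) in iter_rows(conn, query, query_log=log)]
queries_run = len(log)
```

Memory use stays bounded by the batch size, at the price of one query per pass over the data.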

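if statements are a problem when the queryset is huge

As described earlier, the queryset cache is powerful for combining if statements with for statements, because it allows conditional loops over a queryset. For very large querysets, however, relying on the queryset cache is not appropriate.

The simplest solution is to combine exists() with iterator(), avoiding the queryset cache at the cost of two database queries.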

molecule_set = Molecule.objects.all()
# One database query to test if any rows exist.
if molecule_set.exists():
    # Another database query to start fetching the rows in batches.
    for molecule in molecule_set.iterator():
        print(molecule.velocity)

A slightly more complex solution is to use Python's "advanced iteration methods" to peek at the first item of iterator() before deciding whether to loop.

from itertools import chain

atom_set = Atom.objects.all()
# One database query to start fetching the rows in batches.
atom_iterator = atom_set.iterator()
# Peek at the first item in the iterator.
try:
    first_atom = next(atom_iterator)
except StopIteration:
    # No rows were found, so do nothing.
    pass
else:
    # At least one row was found, so iterate over all the rows,
    # including the first one. Chain onto atom_iterator (not atom_set)
    # so the rows are not fetched and cached a second time.
    for atom in chain([first_atom], atom_iterator):
        print(atom.mass)
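
The peek-then-chain pattern is independent of Django and can be wrapped in a small helper. The sketch below uses a hypothetical function name, `peek`, and a plain Python generator in place of `iterator()`.

```python
from itertools import chain


def peek(iterable):
    """Return (has_items, iterator) without losing the first item.

    The returned iterator yields every item of `iterable`, including
    the one that was consumed to test for emptiness.
    """
    iterator = iter(iterable)
    try:
        first = next(iterator)
    except StopIteration:
        return False, iter(())
    # Put the consumed item back in front of the remaining ones.
    return True, chain([first], iterator)


masses = (m for m in [1.008, 4.0026, 6.94])  # a one-shot generator
has_items, items = peek(masses)
collected = list(items) if has_items else []

empty_has_items, _ = peek(iter([]))
```

With a queryset, the call would look like `has_rows, rows = peek(atom_set.iterator())`, followed by a loop over `rows` only when `has_rows` is true.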

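Beware of naive optimization

The queryset cache exists to reduce the number of database queries your program makes; under normal usage it ensures that the database is queried only when necessary.

The exists() and iterator() methods let you optimize your program's memory usage. However, because they do not populate the queryset cache, they can cause extra database queries.

So take care when coding: if your program starts to slow down, find out where the bottleneck is and whether a few small optimizations could help.

Original English article: Using Django querysets effectively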
