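Among Python's many HTTP clients, the best known are requests, aiohttp, and httpx. Without help from other third-party libraries, requests can only send synchronous requests, aiohttp can only send asynchronous requests, and httpx can send both.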
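A synchronous request means that, in single-process, single-threaded code, once a request has been sent, the next one cannot be sent until the response comes back. An asynchronous request means that, in the same single-process, single-threaded code, more requests can be sent while the program is still waiting for the site to respond to earlier ones.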
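The difference can be seen without touching the network. The following is a minimal sketch under the assumption that every request takes a fixed 50 ms: `time.sleep` and `asyncio.sleep` stand in for waiting on a server, and `LATENCY`, `fake_request_sync`, `fake_request_async`, `run_sync`, and `run_async` are hypothetical names used only for illustration.

```python
import asyncio
import time

LATENCY = 0.05  # assumed: every simulated request takes 50 ms to come back


def fake_request_sync() -> int:
    time.sleep(LATENCY)  # blocks the only thread until the "response" arrives
    return 200


async def fake_request_async() -> int:
    await asyncio.sleep(LATENCY)  # yields to the event loop while waiting
    return 200


def run_sync(n: int) -> float:
    """Send n simulated requests one after another and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(n):
        fake_request_sync()
    return time.perf_counter() - start


async def run_async(n: int) -> float:
    """Start n simulated requests at once and return the elapsed time."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_request_async() for _ in range(n)))
    return time.perf_counter() - start


if __name__ == '__main__':
    print(f'sync:  {run_sync(10):.2f}s')
    print(f'async: {asyncio.run(run_async(10)):.2f}s')
```

The synchronous loop pays the latency once per request, while the asynchronous version overlaps the waiting, so its total time stays close to a single round trip.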
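Today we will do a shallow benchmark and compare the performance of these three libraries from one angle only: sending many GET requests.
The results naturally depend on network speed, but when all tests are run on the same network within the same period of time, the differences are still clear.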
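Because a single run can be skewed by a momentary network hiccup, one way to reduce noise is to repeat each measurement and look at the median. The sketch below is not part of the original benchmark; `time_runs` is a hypothetical helper, and the `time.sleep` workload is only a stand-in for one of the benchmark functions. It uses `time.perf_counter`, which is a monotonic, high-resolution clock.

```python
import statistics
import time
from typing import Callable


def time_runs(func: Callable[[], object], repeats: int = 5) -> list[float]:
    """Call func `repeats` times and return each run's wall-clock duration."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == '__main__':
    # Stand-in workload; in practice this would be one of the benchmark functions.
    runs = time_runs(lambda: time.sleep(0.01), repeats=5)
    print(f'median: {statistics.median(runs):.4f}s, min: {min(runs):.4f}s')
```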
Sending a request with requests

import requests

url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


def main():
    res = requests.get(url, headers=headers)
    print(res.status_code)


if __name__ == '__main__':
    main()
Sending a request with httpx
Sending a synchronous request with httpx:

import httpx

url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


def main():
    res = httpx.get(url, headers=headers)
    print(res.status_code)


if __name__ == '__main__':
    main()
The code for httpx's synchronous mode overlaps 99% with the requests code: replacing requests with httpx is all it takes to make it run.
Sending an asynchronous request with httpx:

import asyncio
import httpx

url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


async def main():
    async with httpx.AsyncClient() as client:
        resp = await client.get(url, headers=headers)
        print(resp.status_code)


if __name__ == '__main__':
    asyncio.run(main())
Sending a request with aiohttp

import asyncio
import aiohttp

url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


async def main():
    async with aiohttp.ClientSession() as client:
        async with client.get(url, headers=headers) as resp:
            # resp.text() is a coroutine, so it must be awaited to get the body
            print(await resp.text())
            print(resp.status)


if __name__ == '__main__':
    asyncio.run(main())
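The aiohttp code overlaps about 90% with the code for httpx's asynchronous mode: AsyncClient becomes ClientSession, the request itself is wrapped in async with, and the status code is read from resp.status instead of resp.status_code.
Calling a top-level function such as requests.get or requests.post opens a new connection every time, which is slow. If you create a Session first, requests keeps the connection alive and reuses it, which speeds up requests considerably. For that reason, this benchmark tests both cases.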
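The effect of keeping a connection alive can be observed offline. The sketch below does not use requests; it swaps in the standard library's `http.client` against a throwaway local server so that it runs without network access, and it counts how many TCP connections the server sees. `CountingHandler`, `fetch_without_reuse`, `fetch_with_reuse`, and `count_connections` are hypothetical names introduced for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    protocol_version = 'HTTP/1.1'  # lets the server keep connections alive
    connections = 0                # one handler instance is created per TCP connection

    def setup(self):
        super().setup()
        CountingHandler.connections += 1

    def do_GET(self):
        body = b'ok'
        self.send_response(200)
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass


def fetch_without_reuse(port: int, n: int) -> None:
    """Open a fresh connection for every request."""
    for _ in range(n):
        conn = http.client.HTTPConnection('127.0.0.1', port)
        conn.request('GET', '/')
        conn.getresponse().read()
        conn.close()


def fetch_with_reuse(port: int, n: int) -> None:
    """Send every request over one kept-alive connection."""
    conn = http.client.HTTPConnection('127.0.0.1', port)
    for _ in range(n):
        conn.request('GET', '/')
        conn.getresponse().read()  # drain the body before reusing the connection
    conn.close()


def count_connections(fetch, n: int = 10) -> int:
    """Run fetch against a temporary local server and return the connection count."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(('127.0.0.1', 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        fetch(server.server_address[1], n)
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


if __name__ == '__main__':
    print('without reuse:', count_connections(fetch_without_reuse))
    print('with reuse:   ', count_connections(fetch_with_reuse))
```

On a real HTTPS site every extra connection also costs a TCP and TLS handshake, which is the overhead a session avoids.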
requests
requests without a persistent connection

import time
import requests

url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


def make_request():
    # requests.get opens a new connection on every call
    resp = requests.get(url, headers=headers)
    print(resp.status_code)


def main():
    start = time.time()
    for _ in range(100):
        make_request()
    end = time.time()
    print(f'Sent 100 requests, elapsed: {end - start}')


if __name__ == '__main__':
    main()

Sent 100 requests, elapsed: 10.295854091644287
requests with a persistent connection

import time
import requests

# The session keeps the connection alive and reuses it across calls
session = requests.session()
url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


def make_request():
    resp = session.get(url, headers=headers)
    print(resp.status_code)


def main():
    start = time.time()
    for _ in range(100):
        make_request()
    end = time.time()
    print(f'Sent 100 requests, elapsed: {end - start}')


if __name__ == '__main__':
    main()

Sent 100 requests, elapsed: 4.679062128067017. That is clearly faster, by about 5.6 s.
httpx
httpx synchronous mode

import time
import httpx

url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


def make_request():
    # Like requests.get, httpx.get does not reuse connections between calls
    resp = httpx.get(url, headers=headers)
    print(resp.status_code)


def main():
    start = time.time()
    for _ in range(100):
        make_request()
    end = time.time()
    print(f'Sent 100 requests, elapsed: {end - start}')


if __name__ == '__main__':
    main()

Sent 100 requests, elapsed: 16.60569405555725
httpx asynchronous mode: create httpx.AsyncClient() only once

import asyncio
import time
import httpx

url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


async def make_request(client):
    resp = await client.get(url, headers=headers)
    print(resp.status_code)


async def main():
    # One client is shared by all 100 tasks
    async with httpx.AsyncClient() as client:
        start = time.time()
        tasks = [asyncio.create_task(make_request(client)) for _ in range(100)]
        await asyncio.gather(*tasks)
        end = time.time()
    print(f'Sent 100 requests, elapsed: {end - start}')


if __name__ == '__main__':
    asyncio.run(main())

Sent 100 requests, elapsed: 4.359861135482788
httpx asynchronous mode: create a new httpx.AsyncClient() for every request

import asyncio
import time
import httpx

url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


async def make_request():
    # Each task builds and tears down its own client
    async with httpx.AsyncClient() as client:
        resp = await client.get(url, headers=headers)
        print(resp.status_code)


async def main():
    start = time.time()
    tasks = [asyncio.create_task(make_request()) for _ in range(100)]
    await asyncio.gather(*tasks)
    end = time.time()
    print(f'Sent 100 requests, elapsed: {end - start}')


if __name__ == '__main__':
    asyncio.run(main())

Sent 100 requests, elapsed: 6.378381013870239
aiohttp
aiohttp: create aiohttp.ClientSession() only once

import asyncio
import time
import aiohttp

url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


async def make_request(client):
    async with client.get(url, headers=headers) as resp:
        print(resp.status)


async def main():
    # One session is shared by all 100 tasks
    async with aiohttp.ClientSession() as client:
        start = time.time()
        tasks = [asyncio.create_task(make_request(client)) for _ in range(100)]
        await asyncio.gather(*tasks)
        end = time.time()
    print(f'Sent 100 requests, elapsed: {end - start}')


if __name__ == '__main__':
    asyncio.run(main())

Sent 100 requests, elapsed: 2.235464334487915
aiohttp: create a new aiohttp.ClientSession() for every request

import asyncio
import time
import aiohttp

url = 'https://www.baidu.com/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36'
}


async def make_request():
    # Each task builds and tears down its own session
    async with aiohttp.ClientSession() as client:
        async with client.get(url, headers=headers) as resp:
            print(resp.status)


async def main():
    # asyncio.run() plus create_task() matches the other examples and avoids
    # calling ensure_future()/get_event_loop() without a running event loop
    start = time.time()
    tasks = [asyncio.create_task(make_request()) for _ in range(100)]
    await asyncio.gather(*tasks)
    end = time.time()
    print(f'Sent 100 requests, elapsed: {end - start}')


if __name__ == '__main__':
    asyncio.run(main())

Sent 100 requests, elapsed: 2.6662471294403076
Speed ranking for 100 requests
aiohttp (client created once) > aiohttp (new client for every request) > httpx async (client created once) > requests.session > httpx async (new client for every request) > requests > httpx sync
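The ordering follows directly from the elapsed times printed by the seven runs. As a small sketch, the figures can be sorted and expressed relative to the fastest run; the dictionary labels are shorthand chosen here for illustration, and the numbers are copied from the outputs above.

```python
# Elapsed seconds for 100 GET requests, copied from the runs above.
timings = {
    'aiohttp (one ClientSession)': 2.235464334487915,
    'aiohttp (new ClientSession per request)': 2.6662471294403076,
    'httpx async (one AsyncClient)': 4.359861135482788,
    'requests.session': 4.679062128067017,
    'httpx async (new AsyncClient per request)': 6.378381013870239,
    'requests': 10.295854091644287,
    'httpx sync': 16.60569405555725,
}

# Sort from fastest to slowest and use the fastest run as the baseline.
ranking = sorted(timings.items(), key=lambda item: item[1])
fastest = ranking[0][1]

for name, seconds in ranking:
    print(f'{name:<45} {seconds:6.2f}s  {seconds / fastest:4.1f}x')
```

These are single measurements against one site, so the ratios are indicative rather than precise.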
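Summary
- If you only send a few requests, requests or httpx's synchronous mode gives you the simplest code.
- Whether requests keeps the connection alive through a session makes a big difference in speed. When there is no anti-scraping protection and speed is all you care about, use requests.session().
- If you need to send many requests, some of them synchronously and some asynchronously, httpx is the most convenient choice.
- If you need to send many requests and want them to be as fast as possible, aiohttp is the fastest.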
This work is licensed under a CC license; reposts must credit the author and link to this article.