Python Crawler Proxy Pool

Mandy. Published 2019-03-09

Original URL: https://blog.csdn.net/weixin_43751840/article/details/88372777

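First, find a website that lists free proxies (this post uses https://www.xicidaili.com/nn/), then note the URL that the listing page is requested from.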

View the page source and work out the extraction rules.

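The table rows alternate between two classes: odd rows carry class="odd" and even rows have an empty class attribute. Select each group with its own XPath expression, then concatenate the two lists. The merged list is no longer in page order, which does not matter for this purpose.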
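As a minimal sketch of the odd/even selection, the snippet below runs the same two XPath expressions against a small hand-written table. The HTML is made up for illustration and only mimics the structure the expressions expect (an element with id="ip_list" whose rows have class "odd" or an empty class); the addresses are placeholders. The last expression, `tr[td]`, is an alternative that is not part of the original post: it selects every data row in one pass and keeps page order.

```python
from lxml import etree

# Hypothetical sample; the real page has more columns and many more rows.
SAMPLE_HTML = """
<table id="ip_list">
  <tr><th>flag</th><th>IP</th></tr>
  <tr class="odd"><td></td><td>192.0.2.1</td></tr>
  <tr class=""><td></td><td>192.0.2.2</td></tr>
  <tr class="odd"><td></td><td>192.0.2.3</td></tr>
  <tr class=""><td></td><td>192.0.2.4</td></tr>
</table>
"""

root = etree.HTML(SAMPLE_HTML)

# Same two expressions as in the post: odd rows first, then even rows.
odd_rows = root.xpath('//*[@id="ip_list"]/tr[@class="odd"]')
even_rows = root.xpath('//*[@id="ip_list"]/tr[@class=""]')
rows = odd_rows + even_rows  # the header row has no class attribute, so it is skipped

merged_ips = [row.xpath('./td[2]/text()')[0] for row in rows]
print(merged_ips)

# Alternative (an assumption, not the author's method): any row that contains
# a <td> cell is a data row, so one expression returns them in page order.
ordered_rows = root.xpath('//*[@id="ip_list"]/tr[td]')
ordered_ips = [row.xpath('./td[2]/text()')[0] for row in ordered_rows]
print(ordered_ips)
```

Either list can feed the filtering step; the single-expression form only matters if the original ordering of the page needs to be preserved.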

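Set up the detailed extraction rules for each row: the IP address is the text of the second cell, and the connection speed is the title attribute of the div in the seventh cell.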

Set a filter condition: proxies that are too slow are discarded.

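The code simply checks whether the first character of the speed string is 0, because a speed under one second is scraped as a string such as 0.177.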
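The first-character check works for a fixed one-second cutoff. If a different cutoff is wanted, one option is to parse the number and compare it against a threshold. The sketch below is an illustration rather than the author's code: the helper names `parse_speed_seconds` and `is_fast` are hypothetical, and the trailing unit in one of the samples is an assumption about what the scraped string might contain.

```python
import re


def parse_speed_seconds(title):
    """Return the leading number of a speed string as a float, or None."""
    match = re.match(r'\s*(\d+(?:\.\d+)?)', title or '')
    return float(match.group(1)) if match else None


def is_fast(title, max_seconds=1.0):
    """True when the string starts with a number below max_seconds."""
    seconds = parse_speed_seconds(title)
    return seconds is not None and seconds < max_seconds


# Sample strings; the one with a trailing unit is an assumed variant.
samples = ['0.177', '0.177秒', '1.532', '']
results = [is_fast(s) for s in samples]
print(results)
```

With the default threshold this accepts the same rows as the first-character check, while `is_fast(title, max_seconds=0.5)` would tighten the filter without changing the parsing.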

Finally, the result: scraping one page yielded a little over 70 proxies that passed the speed filter.

The complete code:

# Proxy pool
import requests
from lxml import etree

url = 'https://www.xicidaili.com/nn/'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36'
}
response = requests.get(url, headers=headers, timeout=10)  # fetch one page only; enough for now
eroot = etree.HTML(response.text)
# Rows alternate between class="odd" and class="", so select each group separately.
ip_odd_element_list = eroot.xpath('//*[@id="ip_list"]/tr[@class="odd"]')
ip_even_element_list = eroot.xpath('//*[@id="ip_list"]/tr[@class=""]')
# Keep the whole row elements, because the speed is read from the same row as the IP.
ip_element_raw = ip_odd_element_list + ip_even_element_list

ip_list = []
for ip_element in ip_element_raw:
    ip = ip_element.xpath('./td[2]/text()')
    speed = ip_element.xpath('./td[7]/div[1]/@title')
    if not ip or not speed:       # skip rows that lack either field
        continue
    if speed[0].startswith('0'):  # keep only speeds under one second, e.g. '0.177'
        ip_list.append(ip[0])     # xpath() returns a list; store the string itself

print(ip_list)
print(len(ip_list))
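The collected addresses only become useful once they are handed to a request. The sketch below shows one way that might look, under stated assumptions: a proxy needs a port as well as an IP, and the script above collects only the IP, so the `(ip, port)` pairs here are hypothetical placeholders (on the source page the port would presumably be read from a neighbouring cell, which is an assumption). The helper names `to_proxies` and `pick_proxy` are also made up for illustration. The dictionary shape is the one accepted by the `proxies` argument of `requests.get()`.

```python
import random


def to_proxies(ip, port):
    """Build the mapping accepted by the `proxies` argument of requests.get()."""
    address = f'http://{ip}:{port}'
    return {'http': address, 'https': address}


def pick_proxy(pool, rng=random):
    """Choose one (ip, port) pair at random and format it for requests."""
    ip, port = rng.choice(pool)
    return to_proxies(ip, port)


# Hypothetical pool of (ip, port) pairs; placeholders, not real proxies.
pool = [('192.0.2.1', '8080'), ('192.0.2.2', '3128')]

proxies = pick_proxy(pool)
print(proxies)

# Usage, not executed here because it needs a live proxy:
# requests.get(target_url, headers=headers, proxies=proxies, timeout=5)
```

Picking a proxy at random per request spreads traffic across the pool; free proxies fail often, so a real crawler would likely wrap the request in a try/except and drop entries that time out.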
