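Preface
Project: https://github.com/jhao104/proxy_pool
This GitHub project is a proxy pool built on Python crawlers: it periodically fetches free proxies, keeps the ones that are usable, and stores them in a pool.
Let's walk through setting it up.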
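Steps
1. Install and configure Redis
The automatically crawled proxies are stored in a Redis database, so Redis has to be installed first.
Redis officially recommends installing on Linux. There are two main ways to do it: through the package manager, or manually from the source archive.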
- Package install
apt-get install redis-server
- Manual install
Download the latest Redis archive from the official site and copy it to the Linux machine.
tar -zxvf redis-6.2.6.tar.gz
cd redis-6.2.6/
make
make install
cd /usr/local/bin
mkdir config
cp /opt/redis-6.2.6/redis.conf config # assumes the archive was unpacked under /opt
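Editing the configuration file
Edit the Redis configuration file. Its location depends on the install method: /etc/redis/redis.conf for the package install, /opt/redis-6.2.6/redis.conf for the manual install. If you copied the file into /usr/local/bin/config as above, edit that copy, since it is the one passed to redis-server when starting Redis. Make the following changes: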
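# run Redis as a daemon
daemonize yes
# turn off protected mode
protected-mode no
# the bind line restricts Redis to local access, so it must be commented out
# bind 127.0.0.1 ::1
# port Redis listens on (on a server with a firewall, this port has to be opened)
port 6379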
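Before restarting Redis, it can help to confirm that the edited file really holds the intended values. The following is a minimal sketch of such a check, not part of proxy_pool or Redis: `parse_redis_conf`, `check_conf` and `EXPECTED` are hypothetical names, and the parser only understands simple `directive value` lines with comments on their own lines.

```python
# Hypothetical helper: verify the four settings discussed above in redis.conf text.
EXPECTED = {"daemonize": "yes", "protected-mode": "no", "port": "6379"}


def parse_redis_conf(text):
    """Return {directive: value} for non-comment lines; later lines win."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and commented-out directives such as "# bind ..."
        name, _, value = line.partition(" ")
        settings[name.lower()] = value.strip()
    return settings


def check_conf(text, expected=EXPECTED):
    """List the problems found: wrong values, or a bind line that is still active."""
    settings = parse_redis_conf(text)
    problems = [
        f"{name} should be {want!r}, found {settings.get(name)!r}"
        for name, want in expected.items()
        if settings.get(name) != want
    ]
    if "bind" in settings:
        problems.append("bind is still active; comment it out to allow remote access")
    return problems


sample = """
daemonize yes
protected-mode no
# bind 127.0.0.1 ::1
port 6379
"""
print(check_conf(sample))  # an empty list means nothing needs changing
```

To check a real file, pass its contents instead of the sample, for example `check_conf(open("/etc/redis/redis.conf").read())`.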
Start Redis (run from /usr/local/bin, where the config directory was created)
redis-server config/redis.conf
redis-cli
To stop Redis, enter the following at the redis-cli prompt:
shutdown
exit
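Since the proxy pool will connect to Redis over the network, it is worth checking from the other machine that the port is actually reachable through the firewall. The sketch below uses only the standard library; `is_port_open` is a hypothetical helper name and the address in the example is a placeholder.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


if __name__ == "__main__":
    # Placeholder address: replace it with the server that runs Redis.
    print(is_port_open("127.0.0.1", 6379))
```

A True result only shows that something accepts connections on that port; running `redis-cli -h [ip] -p [port] ping` confirms that it is Redis.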
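2. Pull and run the project
According to the project documentation, it can be configured manually or deployed with Docker (recommended).
For how to use Docker, see my other blog post.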
docker pull jhao104/proxy_pool
docker run --env DB_CONN=redis://:[password]@[ip]:[port]/[db] -p 5010:5010 jhao104/proxy_pool:latest
password can be left empty if Redis has no password set
db defaults to 0
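The DB_CONN value is easy to mistype, especially the `:` before the password and the empty-password case. As an illustration only, the sketch below assembles the string and the full command from their parts; `build_db_conn` and `docker_run_command` are hypothetical helpers, the address is a placeholder, and passwords containing characters such as `@` or `/` are not handled.

```python
def build_db_conn(host, port=6379, password="", db=0):
    """Build redis://:[password]@[ip]:[port]/[db]; an empty password leaves ':@'."""
    return f"redis://:{password}@{host}:{port}/{db}"


def docker_run_command(db_conn, api_port=5010):
    """Return the docker run line from above with DB_CONN filled in."""
    return (
        f"docker run --env DB_CONN={db_conn} "
        f"-p {api_port}:5010 jhao104/proxy_pool:latest"
    )


# Placeholder address, no password, default db 0.
print(docker_run_command(build_db_conn("192.0.2.10")))
```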
[Figure: output of a successful run]
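Besides reading the container output, the running service can be queried over HTTP on the published port 5010. The sketch below assumes the `count` and `get` endpoints described in the project README and a container published on the local machine; `api_url` and `fetch_json` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def api_url(base, endpoint):
    """Join the service base URL and an endpoint name such as 'get' or 'count'."""
    return f"{base.rstrip('/')}/{endpoint.strip('/')}/"


def fetch_json(base, endpoint, timeout=10):
    """GET an endpoint of the running service and decode its JSON body."""
    with urlopen(api_url(base, endpoint), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    base = "http://127.0.0.1:5010"  # assumption: container published locally
    try:
        print(fetch_json(base, "count"))  # pool size information
        print(fetch_json(base, "get"))    # one proxy record
    except OSError as exc:  # URLError is a subclass of OSError
        print(f"service not reachable: {exc}")
```

If `get` keeps returning nothing right after startup, the crawler may simply not have filled the pool yet.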
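3. Generate a profile and import it into Proxifier
First install the redis package with pip:
pip install redis
Run the following script, after changing the host and port in the redis.ConnectionPool call to those of your Redis server.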
# -*- coding: utf-8 -*-
import json
from xml.etree import ElementTree

import redis


def RedisProxyGet():
    """Read every record from the use_proxy hash and return them as dicts."""
    ConnectString = []
    # Replace [ip] and [port] with the address of your Redis server.
    pool = redis.ConnectionPool(host='[ip]', port=[port], db=0, decode_responses=True)
    use_proxy = redis.Redis(connection_pool=pool)
    key = use_proxy.hkeys('use_proxy')
    for temp in key:
        try:
            ConnectString.append(json.loads(use_proxy.hget('use_proxy', temp)))
        except (json.JSONDecodeError, TypeError):  # skip missing or non-JSON values
            pass
    return ConnectString


def xmlOutputs(data):
    """Write the proxy records to ProxifierConf.ppx as a Proxifier profile."""
    i = 101
    ProxyIDList = []
    ProxifierProfile = ElementTree.Element("ProxifierProfile")
    ProxifierProfile.set("version", str(i))
    ProxifierProfile.set("platform", "Windows")
    ProxifierProfile.set("product_id", "0")
    ProxifierProfile.set("product_minver", "310")

    # Global options
    Options = ElementTree.SubElement(ProxifierProfile, "Options")
    Resolve = ElementTree.SubElement(Options, "Resolve")
    AutoModeDetection = ElementTree.SubElement(Resolve, "AutoModeDetection")
    AutoModeDetection.set("enabled", "false")
    ViaProxy = ElementTree.SubElement(Resolve, "ViaProxy")
    ViaProxy.set("enabled", "false")
    TryLocalDnsFirst = ElementTree.SubElement(ViaProxy, "TryLocalDnsFirst")
    TryLocalDnsFirst.set("enabled", "false")
    ExclusionList = ElementTree.SubElement(Resolve, "ExclusionList")
    ExclusionList.text = "%ComputerName%; localhost; *.local"
    Encryption = ElementTree.SubElement(Options, "Encryption")
    Encryption.set("mode", 'basic')
    for tag, enabled in (
        ("HttpProxiesSupport", "true"),
        ("HandleDirectConnections", "false"),
        ("ConnectionLoopDetection", "true"),
        ("ProcessServices", "false"),
        ("ProcessOtherUsers", "false"),
    ):
        ElementTree.SubElement(Options, tag).set("enabled", enabled)

    # One <Proxy> entry per record read from Redis
    ProxyList = ElementTree.SubElement(ProxifierProfile, "ProxyList")
    for temp in data:
        i += 1  # i starts at 101, so the first proxy gets id 102
        Proxy = ElementTree.SubElement(ProxyList, "Proxy")
        Proxy.set("id", str(i))
        if not temp['https']:
            Proxy.set("type", "HTTP")
        else:
            Proxy.set("type", "HTTPS")
        Proxy.text = str(i)
        ProxyIDList.append(i)
        host, port = temp['proxy'].split(":", 1)  # records look like "host:port"
        Address = ElementTree.SubElement(Proxy, "Address")
        Address.text = host
        Port = ElementTree.SubElement(Proxy, "Port")
        Port.text = port
        ProxyOptions = ElementTree.SubElement(Proxy, "Options")
        ProxyOptions.text = "48"

    # A chain named AgentPool that lists every proxy
    ChainList = ElementTree.SubElement(ProxifierProfile, "ChainList")
    Chain = ElementTree.SubElement(ChainList, "Chain")
    Chain.set("id", str(i + 1))  # an id that no proxy uses
    Chain.set("type", "simple")
    Name = ElementTree.SubElement(Chain, "Name")
    Name.text = "AgentPool"
    for temp_id in ProxyIDList:
        Proxy = ElementTree.SubElement(Chain, "Proxy")
        Proxy.set("enabled", "true")
        Proxy.text = str(temp_id)

    RuleList = ElementTree.SubElement(ProxifierProfile, "RuleList")
    # Rule: this application connects directly
    Rule = ElementTree.SubElement(RuleList, "Rule")
    Rule.set("enabled", "true")
    Name = ElementTree.SubElement(Rule, "Name")
    Applications = ElementTree.SubElement(Rule, "Applications")
    Action = ElementTree.SubElement(Rule, "Action")
    Name.text = "御劍後臺掃描工具.exe [auto-created]"
    Applications.text = "御劍後臺掃描工具.exe"
    Action.set("type", "Direct")
    # Rule: local addresses connect directly
    Rule = ElementTree.SubElement(RuleList, "Rule")
    Rule.set("enabled", "true")
    Name = ElementTree.SubElement(Rule, "Name")
    Targets = ElementTree.SubElement(Rule, "Targets")
    Action = ElementTree.SubElement(Rule, "Action")
    Name.text = "Localhost"
    Targets.text = "localhost; 127.0.0.1; %ComputerName%"
    Action.set("type", "Direct")
    # Rule: everything else goes through the proxy with id 102 (the first one)
    Rule = ElementTree.SubElement(RuleList, "Rule")
    Rule.set("enabled", "true")
    Name = ElementTree.SubElement(Rule, "Name")
    Action = ElementTree.SubElement(Rule, "Action")
    Name.text = "Default"
    Action.text = "102"
    Action.set("type", "Proxy")

    tree = ElementTree.ElementTree(ProxifierProfile)
    tree.write("ProxifierConf.ppx", encoding="UTF-8", xml_declaration=True)


if __name__ == '__main__':
    proxy_data = RedisProxyGet()
    xmlOutputs(proxy_data)
    print("ProxifierConf.ppx profile created.")
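The script relies on each value in the use_proxy hash being a JSON object with a `proxy` field of the form `host:port` and an `https` flag. The sketch below isolates that step so it can be tried without Redis. It is an illustration only: `parse_record` is a hypothetical helper and the sample record is made up.

```python
import json


def parse_record(raw):
    """Turn one JSON record into (proxy_type, host, port), or None if unusable."""
    try:
        record = json.loads(raw)
        host, port = record["proxy"].split(":", 1)
    except (TypeError, ValueError, KeyError, AttributeError):
        return None  # not JSON, no "proxy" field, or no "host:port" separator
    if not port.isdigit():
        return None
    proxy_type = "HTTPS" if record.get("https") else "HTTP"
    return proxy_type, host, port


sample = '{"proxy": "203.0.113.5:8080", "https": false}'  # made-up record
print(parse_record(sample))
```

Filtering records this way before building the XML would keep one malformed entry from aborting the whole export.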
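A successful run produces the file ProxifierConf.ppx. Double-click it to import it into Proxifier.
The Proxifier version must not be too new, otherwise the import fails with an error; 3.3.1 is recommended.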
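If the import fails, it can be useful to rule out a broken file before blaming the Proxifier version. The following sketch parses a profile with the standard library and reports how many proxies of each type it contains and which rules it defines. `summarize_profile` is a hypothetical helper, and the sample profile is a shortened, made-up stand-in for the generated file.

```python
from xml.etree import ElementTree


def summarize_profile(xml_data):
    """Count proxies by type and collect rule names from a Proxifier profile."""
    root = ElementTree.fromstring(xml_data)
    counts = {}
    for proxy in root.findall("./ProxyList/Proxy"):
        kind = proxy.get("type", "unknown")
        counts[kind] = counts.get(kind, 0) + 1
    rules = [rule.findtext("Name", default="") for rule in root.findall("./RuleList/Rule")]
    return counts, rules


sample = """<ProxifierProfile version="101" platform="Windows">
  <ProxyList>
    <Proxy id="102" type="HTTP"><Address>203.0.113.5</Address><Port>8080</Port></Proxy>
    <Proxy id="103" type="HTTPS"><Address>203.0.113.6</Address><Port>443</Port></Proxy>
  </ProxyList>
  <RuleList>
    <Rule enabled="true"><Name>Default</Name><Action type="Proxy">102</Action></Rule>
  </RuleList>
</ProxifierProfile>"""
print(summarize_profile(sample))
```

For the real file, read it in binary mode, for example `summarize_profile(open("ProxifierConf.ppx", "rb").read())`, because the file starts with an XML declaration that names its encoding.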