Python Crawler Study Notes 2: The Requests Library

Posted by bypass on 2017-05-20

Requests is an elegant and simple HTTP library for Python. It is simpler to use and more capable than Python's built-in urllib library.

0X01 Basic Usage

To install Requests, run this command in your terminal:

pip install requests

Basic HTTP request types:

import requests
r = requests.get("http://httpbin.org/get")
r = requests.post("http://httpbin.org/post")
r = requests.put("http://httpbin.org/put")
r = requests.delete("http://httpbin.org/delete")
r = requests.head("http://httpbin.org/get")
r = requests.options("http://httpbin.org/get")

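A simple request:

import requests
r = requests.get('http://192.168.125.129/config/sql.php?id=1')
print(r.headers)      # response headers
print(r.status_code)  # HTTP status code
print(r.url)          # final URL of the request
print(r.text)         # body decoded to text
print(r.content)      # body as raw bytes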
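The request above assumes the server answers promptly and successfully. A minimal sketch of a more defensive version follows. The `fetch` helper and its 5-second default timeout are assumptions made for illustration and are not part of the original notes.

```python
import requests

def fetch(url, timeout=5):
    """Return the response body as text, or None if the request fails."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()  # turn 4xx/5xx status codes into exceptions
        return r.text
    except requests.exceptions.RequestException as exc:
        # Covers connection errors, timeouts, invalid URLs and HTTP errors
        print('request failed:', exc)
        return None

# No scheme in the URL, so Requests rejects it before connecting anywhere
result = fetch('not-a-valid-url')
```

Catching `requests.exceptions.RequestException` works here because the library's own exceptions derive from it, so one handler covers them all.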

GET request:

import requests
payload = {'id': 1}
# params is encoded into the query string of the URL
r = requests.get('http://192.168.125.129/config/sql.php', params=payload)
print(r.url)
print(r.content)
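To see how `params` ends up in the URL without sending anything, the request can be built and prepared locally. This is a small sketch, and the list-valued `id` is an assumed extra case that does not appear in the original example.

```python
import requests

url = 'http://192.168.125.129/config/sql.php'

# Build the request without sending it, then look at the final URL
prepared = requests.Request('GET', url, params={'id': 1}).prepare()
print(prepared.url)

# A list value is expanded into one key=value pair per item
multi = requests.Request('GET', url, params={'id': [1, 2]}).prepare()
print(multi.url)
```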

POST request:

import requests
payload = {'id': 1}
# data is sent form-encoded in the request body
r = requests.post('http://192.168.125.129/config/sql.php', data=payload)
print(r.content)
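The difference between a form-encoded body (`data=`) and a JSON body (`json=`) can be checked on a prepared request, again without contacting the server. The `json=` variant is not used in the original notes and is shown only for comparison.

```python
import json
import requests

url = 'http://192.168.125.129/config/sql.php'

# data= produces a form-encoded body
form = requests.Request('POST', url, data={'id': 1}).prepare()
print(form.headers['Content-Type'])
print(form.body)

# json= serializes the payload as JSON and sets the matching Content-Type
as_json = requests.Request('POST', url, json={'id': 1}).prepare()
print(as_json.headers['Content-Type'])
print(json.loads(as_json.body))
```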

0X02 Advanced Usage

1. Setting headers

import requests
url = 'http://192.168.125.129/config/sql.php?id=1'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0'}
r = requests.get(url, headers=headers)
print(r.text)
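When many requests need the same User-Agent, one option is to set it once on a session instead of passing `headers` every time. The sketch below prepares a request locally to show the merge. The `Referer` value is an assumption added for illustration.

```python
import requests

ua = 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0'
session = requests.Session()
session.headers.update({'User-Agent': ua})  # default for every request on this session

url = 'http://192.168.125.129/config/sql.php?id=1'
# Per-request headers are merged on top of the session defaults
req = requests.Request('GET', url, headers={'Referer': 'http://192.168.125.129/'})
prepared = session.prepare_request(req)
print(prepared.headers['User-Agent'])
print(prepared.headers['Referer'])
```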

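2. A simple example of simulating a login and scraping data

import requests
s = requests.Session()
data = {'user': 'username', 'passdw': 'password'}
# Replace with the login URL; the session keeps the cookies the server sets
res = s.post('http://www.xxx.com/login.php', data=data)
# Replace with the URL to scrape; the stored cookies are sent automatically
page = s.get('http://www.xxx.com/admin/config.php')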
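The login flow works because a session carries its cookies into later requests. The sketch below shows that offline: the cookie is set by hand as a stand-in for what a real login response would set, and its value is made up.

```python
import requests

s = requests.Session()
# Stand-in for the cookie a real login response would set (made-up value)
s.cookies.set('PHPSESSID', 'example-session-id')

# Prepare a follow-up request through the session without sending it
req = requests.Request('GET', 'http://www.xxx.com/admin/config.php')
prepared = s.prepare_request(req)
print(prepared.headers.get('Cookie'))
```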

3. Logging in with a known cookie

import requests
raw_cookies = "PHPSESSID=0c1e5a748e064e93e91cca1714708339; security=impossible"
cookies = {}
for line in raw_cookies.split(';'):
    # strip() removes the space left after each ';'
    key, value = line.strip().split('=', 1)
    cookies[key] = value
testurl = 'http://192.168.125.129/vulnerabilities/upload/'
s = requests.get(testurl, cookies=cookies)
print(s.text)
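As an alternative to splitting the string by hand, the standard library's `http.cookies.SimpleCookie` can parse a raw cookie string. This is a sketch of that swapped-in approach, not the method used in the original notes. The resulting dict can be passed to `cookies=` in the same way.

```python
from http.cookies import SimpleCookie

raw_cookies = "PHPSESSID=0c1e5a748e064e93e91cca1714708339; security=impossible"

jar = SimpleCookie()
jar.load(raw_cookies)  # parses the "name=value; name=value" pairs
cookies = {name: morsel.value for name, morsel in jar.items()}
print(cookies)
```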

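4. SSL certificate verification

import requests
result = requests.get('https://www.v2ex.com', verify=False)

Passing verify=False skips SSL certificate verification. Without it, the request raises an error whenever the site's certificate cannot be verified.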
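With `verify=False`, urllib3 emits an InsecureRequestWarning for each unverified HTTPS request. The sketch below shows two common options: silencing that warning, or pointing `verify` at a CA bundle so verification stays on. The bundle path is hypothetical.

```python
import requests
import urllib3

# Silence the InsecureRequestWarning that urllib3 emits for unverified HTTPS requests
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

insecure = requests.Session()
insecure.verify = False  # session-wide default: skip certificate checks

# Safer alternative when the certificate's issuer is known
trusted = requests.Session()
trusted.verify = '/path/to/ca-bundle.pem'  # hypothetical path to a CA bundle file
```

Disabling verification removes protection against man-in-the-middle attacks, so it is best kept to test environments.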

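5. 302 redirects

# allow_redirects=False stops Requests from following the 302, so the login response itself can be inspected
# (s is a requests.Session; loginUrl, postdata and header are defined by the caller)
result=s.post(loginUrl,data=postdata,headers=header,verify=False,allow_redirects=False)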
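To make the effect of `allow_redirects=False` concrete, the sketch below starts a throwaway local server that answers a POST with a 302, then compares the two behaviours. The handler, the `/login` and `/home` paths, and the form data are all assumptions for the demo.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests

class LoginHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and discard the form body, then redirect as a login page might
        self.rfile.read(int(self.headers.get('Content-Length', 0)))
        self.send_response(302)
        self.send_header('Location', '/home')
        self.send_header('Content-Length', '0')
        self.end_headers()

    def do_GET(self):
        body = b'welcome'
        self.send_response(200)
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet

server = HTTPServer(('127.0.0.1', 0), LoginHandler)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
login_url = 'http://127.0.0.1:%d/login' % server.server_port

s = requests.Session()
s.trust_env = False  # ignore proxy environment variables for this local demo

# Redirect not followed: the 302 response itself is returned
stopped = s.post(login_url, data={'user': 'test'}, allow_redirects=False)
print(stopped.status_code, stopped.headers['Location'])

# Default behaviour: the redirect is followed and the 302 lands in .history
followed = s.post(login_url, data={'user': 'test'})
print(followed.status_code, [r.status_code for r in followed.history])

server.shutdown()
server.server_close()
```

Stopping at the 302 is useful when the login response itself carries what is needed, such as the `Location` header or a `Set-Cookie` header.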

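6. Uploading form data and files with Python Requests

import requests
url = "http://www.xxx.cn/upload.php"
# (None, value) sends a plain form field; (filename, file object, content type) sends a file
files = {"username": (None, "test"),
         'filename': ('1.jpg', open('1.jpg', 'rb'), 'image/jpeg'),
         "password": (None, "test123!")}
res = requests.post(url, files=files)
print(res.request.body)
print(res.request.headers)

The request body and request headers are printed as follows: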

--5e800fd12507423aa2e4a024db7b1fa1
Content-Disposition: form-data; name="username"

test
--5e800fd12507423aa2e4a024db7b1fa1
Content-Disposition: form-data; name="password"

test123!
--5e800fd12507423aa2e4a024db7b1fa1
Content-Disposition: form-data; name="filename"; filename="1.jpg"
Content-Type: image/jpeg


11111111111111111
1111111111111
11111111111111111

--5e800fd12507423aa2e4a024db7b1fa1--

{'Content-Length': '667', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'User-Agent': 'python-requests/2.12.4', 'Connection': 'keep-alive', 'Content-Type': 'multipart/form-data; boundary=5e800fd12507423aa2e4a024db7b1fa1'}
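The multipart body can also be inspected without a server or a real file by preparing the request locally. In this sketch an in-memory `io.BytesIO` object with made-up bytes stands in for `open('1.jpg', 'rb')`.

```python
import io
import requests

url = 'http://www.xxx.cn/upload.php'
fake_image = io.BytesIO(b'not really a JPEG')  # in-memory stand-in for the image file
files = {
    'username': (None, 'test'),                       # plain form field
    'filename': ('1.jpg', fake_image, 'image/jpeg'),  # file part
}

# Prepare the request without sending it, then inspect headers and body
prepared = requests.Request('POST', url, files=files).prepare()
print(prepared.headers['Content-Type'])
print(prepared.headers['Content-Length'], len(prepared.body))
```

The boundary in the Content-Type header is generated per request, so it will differ from the one in the output above.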

 

 

References:

   http://cn.python-requests.org/zh_CN/latest/user/quickstart.html

 

