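[Learning Python] Crawlers, crawlers, crawlers, crawlers~

Posted by 愛夕 on 2018-05-03

Original URL: https://blog.csdn.net/Aixixxx/article/details/80174089

Day 8
So many of the crawlers online are written for py2 (:з」∠)
Today I found a py3 crawler and tried using it to scrape my school's portal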

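import io
import sys
import urllib.request

# Request headers: a browser User-Agent plus the Cookie copied from a logged-in session
web_header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36',
    'Cookie': 'iPlanetDirectoryPro=AQIC5wM2LY4SfczGmS5S1wsHjs3f8d%2FvQadvCPz780%2B9%2B1o%3D%40AAJTSQACMDI%3D%23; JSESSIONID=0000g4W05n040WWJHMgWwYK6u41:172u4qcnp',
}
# Re-wrap stdout so that printed text is encoded as UTF-8
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')
# Page shown after a successful portal login
url_mh = 'http://xxxx.xxxx.xxx.xx/index.portal'
req = urllib.request.Request(url=url_mh, headers=web_header)
# No data argument is passed, so this is sent as a GET request
resp = urllib.request.urlopen(req)
data = resp.read()  # raw bytes of the response body
print(data.decode('utf-8'))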

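Breaking this code down: web_header holds the request headers, including the cookie. Before this I always handled cookies separately with the cookie module; this way is much simpler = =(:з」∠)
sys.stdout is re-wrapped so that printed output is encoded as UTF-8
url_mh is the page shown after a successful portal login
req is the Request object, built from the URL and the custom headers
resp is the response to that request (a GET here, since no data is passed); its body is fetched with the read() method
The body is then decoded as UTF-8 and printed
Both UTF-8 settings have to be right: one is the encoding used for output, the other is the encoding used to decode the response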
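
The points above can be checked without touching the network. The following is only a sketch: `EXAMPLE_URL`, `EXAMPLE_HEADERS`, `build_request` and `encode_for_output` are names made up for illustration, and the URL and cookie are placeholders rather than the real portal values. It shows that a Request built without data uses GET, that the cookie simply rides along as a header, and that decoding the response and encoding the output are two separate UTF-8 steps.

```python
import io
import urllib.request

# Placeholder values for illustration only; not the real portal URL or cookie.
EXAMPLE_URL = "http://portal.example.edu/index.portal"
EXAMPLE_HEADERS = {
    "User-Agent": "Mozilla/5.0",
    "Cookie": "JSESSIONID=placeholder",
}


def build_request(url, headers):
    """Build a Request carrying custom headers; nothing is sent yet."""
    return urllib.request.Request(url=url, headers=headers)


def encode_for_output(text):
    """Return the bytes a UTF-8 TextIOWrapper writes for `text`."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf8")
    wrapper.write(text)
    wrapper.flush()
    return raw.getvalue()


req = build_request(EXAMPLE_URL, EXAMPLE_HEADERS)
method = req.get_method()          # GET, because no data was supplied
cookie = req.get_header("Cookie")  # the cookie is just another header

body = "公告".encode("utf-8")       # stand-in for what resp.read() returns
text = body.decode("utf-8")        # step 1: decode the response bytes
written = encode_for_output(text)  # step 2: encode again when printing
```

If either step used a different encoding than the page or the terminal expects, the Chinese text would come out garbled, which is why both settings matter.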

After a lot of debugging, I can now scrape the announcements

import io
import sys
import urllib.request
import re
import requests

web_header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36',
    'Cookie': 'iPlanetDirectoryPro=AQIC5wM2LY4SfczGmS5S1wsHjs3f8d%2FvQadvCPz780%2B9%2B1o%3D%40AAJTSQACMDI%3D%23; JSESSIONID=0000g4W05n040WWJHMgWwYK6u41:172u4qcnp',
}
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')
url_mh = 'http://XXX/index.portal'
# Earlier urllib version, kept for reference
req = urllib.request.Request(url=url_mh, headers=web_header)
# resp = urllib.request.urlopen(req)
# data = resp.read()
# print(data.decode('utf8'))
# Same request made with requests; resp.text is decoded using resp.encoding
resp = requests.get(url=url_mh, headers=web_header)
resp.encoding = 'utf-8'
# print(resp.text)
# Note: the adjacent string literals below are concatenated, so the quotes
# around (.*?) are not part of the pattern
gonggao = re.findall('<img src="images/s.gif" alt="" /></a><a   title='"(.*?)"' class="rss-title" onclick=', resp.text, re.S)
for each in gonggao:
    print(each)

That regex wasn't written well; written like this it works fine

import io
import sys
import urllib.request
import re
import requests

web_header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36',
    'Cookie': 'iPlanetDirectoryPro=AQIC5wM2LY4SfczGmS5S1wsHjs3f8d%2FvQadvCPz780%2B9%2B1o%3D%40AAJTSQACMDI%3D%23; JSESSIONID=0000g4W05n040WWJHMgWwYK6u41:172u4qcnp',
}
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')
url_mh = 'http://xxx.cn/index.portal'
# Earlier urllib version, kept for reference
req = urllib.request.Request(url=url_mh, headers=web_header)
# resp = urllib.request.urlopen(req)
# data = resp.read()
# print(data.decode('utf8'))
# Same request made with requests; resp.text is decoded using resp.encoding
resp = requests.get(url=url_mh, headers=web_header)
resp.encoding = 'utf-8'
# print(resp.text)
# Match only the <a> tag itself and capture the text between the single quotes
gonggao = re.findall('<a   title=\'(.*?)\' class="rss-title"', resp.text, re.S)
for each in gonggao:
    print(each)
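
To see how the two patterns differ, here is a small sketch run against a made-up HTML fragment. `SAMPLE_HTML` is an assumption (the real portal markup is not shown in this post), and `onclick` is spelled in plain ASCII. I can't say for certain what went wrong on the real page, but two properties of the first pattern stand out. Python concatenates adjacent string literals, so the quotes around `(.*?)` never make it into the pattern and the captured title keeps its quotes. The pattern is also tied to a long stretch of surrounding markup, so a single differing space is enough to lose a match.

```python
import re

# Hypothetical markup for illustration; the second item has no space before "/>".
SAMPLE_HTML = """
<li><a href="#"><img src="images/s.gif" alt="" /></a><a   title='Exam schedule' class="rss-title" onclick="open(1)">Exam...</a></li>
<li><a href="#"><img src="images/s.gif" alt=""/></a><a   title='Library hours' class="rss-title" onclick="open(2)">Library...</a></li>
"""

# First attempt: three adjacent string literals that Python joins into one pattern,
# so the pattern contains title=(.*?) with no quotes around the group.
LONG_PATTERN = '<img src="images/s.gif" alt="" /></a><a   title='"(.*?)"' class="rss-title" onclick='

# Second attempt: anchored only on the <a> tag, with the single quotes escaped.
SHORT_PATTERN = '<a   title=\'(.*?)\' class="rss-title"'

long_matches = re.findall(LONG_PATTERN, SAMPLE_HTML, re.S)
short_matches = re.findall(SHORT_PATTERN, SAMPLE_HTML, re.S)

print(long_matches)   # the one title found still carries its quotes
print(short_matches)  # both titles, without quotes
```

The shorter pattern depends on less of the page, so small changes in the markup around the link are less likely to break it.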

It's getting late today, so I won't do the export to text; next time
