Scraping Qiushibaike jokes with Python

#!/usr/bin/python
# -*- coding:utf-8 -*-
import urllib
import urllib2
import re
import sys
reload(sys)
sys.setdefaultencoding('utf8')

page = 1
url = 'http://www.qiushibaike.com/hot/page/' + str(page)
user_agent = 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'
headers = {'User-Agent': user_agent}
try:
    request = urllib2.Request(url,headers = headers)
    response  = urllib2.urlopen(request)
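    # print response.read()

    # ---- match strings ----

    content = response.read().decode('utf-8')
    # Each (.*?) below is a piece of content we want to extract.
    # For readers who are not very familiar with regular expressions:
    # 1. "." is a wildcard, "*" means zero or more repetitions, and "?" makes the
    #    match non-greedy, so ".*?" together matches as little as possible.
    # 2. (.*?) is a group, or more precisely a capturing group.
    # 3. The re.S flag turns on dot-all mode, in which "." can also match a newline.
    pattern = re.compile(r'<div.*?author clearfix".*?<img.*?<h2>(.*?)</h2>.*?<div.*?' +
                         'content">.*?<span>(.*?)</span>.*?<div class="stats.*?class="number">(.*?)</i>', re.S)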
    items = re.findall(pattern,content)
    # print("%s"%items)
    for item in items:
        print("author:%s" % item[0])
        print("content:%s" % item[1])
        print("likes:%s\n" % item[2])
 
except urllib2.URLError as e:
    # HTTPError carries a status code; a plain URLError only has a reason
    if hasattr(e, 'code'):
        print e.code
    if hasattr(e, 'reason'):
        print e.reason
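
The script above targets Python 2 (`urllib2`, the `print` statement). As a rough Python 3 sketch of the same two steps, the block below builds the request with `urllib.request` and runs the same pattern over a small hand-written HTML sample, so it can be tried without touching the network. `SAMPLE_HTML`, `ITEM_PATTERN`, `build_request`, and `extract_items` are illustrative names that do not appear in the original, and the sample markup is an assumption shaped only to fit the pattern; the live page may look different.

```python
import re
import urllib.request

USER_AGENT = ("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36")

# Same pattern as in the script above; re.S lets "." cross line breaks.
ITEM_PATTERN = re.compile(
    r'<div.*?author clearfix".*?<img.*?<h2>(.*?)</h2>.*?<div.*?'
    r'content">.*?<span>(.*?)</span>.*?<div class="stats.*?class="number">(.*?)</i>',
    re.S)

# Assumed markup, written by hand to fit the pattern (not copied from the site).
SAMPLE_HTML = """
<div class="article block">
<div class="author clearfix">
<a href="/users/1/"><img src="a.jpg"></a>
<h2>Alice</h2>
</div>
<div class="content">
<span>First line
second line</span>
</div>
<div class="stats"><span class="stats-vote"><i class="number">42</i> likes</span></div>
</div>
<div class="article block">
<div class="author clearfix">
<a href="/users/2/"><img src="b.jpg"></a>
<h2>Bob</h2>
</div>
<div class="content">
<span>Another joke</span>
</div>
<div class="stats"><span class="stats-vote"><i class="number">7</i> likes</span></div>
</div>
"""


def build_request(page):
    """Build (but do not send) the request for one page of the hot list."""
    url = "http://www.qiushibaike.com/hot/page/" + str(page)
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def extract_items(html, pattern=ITEM_PATTERN):
    """Return a list of (author, content, likes) tuples found in html."""
    return pattern.findall(html)


for author, text, likes in extract_items(SAMPLE_HTML):
    print("author:%s" % author)
    print("content:%s" % text)
    print("likes:%s\n" % likes)

# Greedy ".*" runs to the last </h2>; non-greedy ".*?" stops at the first one.
print(re.findall(r"<h2>(.*)</h2>", "<h2>a</h2><h2>b</h2>"))
print(re.findall(r"<h2>(.*?)</h2>", "<h2>a</h2><h2>b</h2>"))

# Without re.S, "." cannot cross the line breaks in the sample, so nothing matches.
print(re.compile(ITEM_PATTERN.pattern).findall(SAMPLE_HTML))
```

To try it against a live page, the response body could be read with `urllib.request.urlopen(build_request(1)).read().decode("utf-8")` and passed to `extract_items`. In Python 3 the exception to catch is `urllib.error.URLError`. Whether the pattern still matches depends on the site's current markup.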
