Python Web Scraping: Downloading Images from a Website

Posted by BaiXuePrincess on 2020-11-19

Original URL: https://blog.csdn.net/BaiXuePrincess/article/details/109814986

Preface

This article uses requests to fetch the page and Beautiful Soup to parse out the image URLs.

Steps

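1. Press F12 to open the developer tools, select the Network tab, and click the "Load more…" button. The request URL that the button triggers shows up in the Network panel.
Now we can request that page with requests: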

import requests
# The cookies and headers values are omitted here
cookies = {}
headers = {}
params = {'page': '2'}

# This is a GET request; to send query parameters with get(), pass them as params=<dict>
response = requests.get('https://github.com/topics', headers=headers, params=params, cookies=cookies)

print(response.text)
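
To see how the params dictionary ends up in the request URL, here is a minimal sketch that builds the same GET request without sending it. It relies on requests.Request(...).prepare(), which the original post does not use, so treat it as an illustration rather than part of the author's workflow.

```python
import requests

# Build (but do not send) the GET request from step 1 so the final URL can be inspected.
request = requests.Request("GET", "https://github.com/topics", params={"page": "2"})
prepared = request.prepare()

# The params dictionary is URL-encoded and appended as the query string.
print(prepared.url)  # → https://github.com/topics?page=2
```

When the request is actually sent, calling response.raise_for_status() before parsing is a common way to stop early on a 4xx or 5xx reply.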

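2. Click the small arrow (the element picker) in the developer tools, then click one of the images on the page to find its address in the HTML.
Parse the response with the Beautiful Soup module to extract the image addresses: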

from bs4 import BeautifulSoup

soup = BeautifulSoup(response.content, "lxml")
# The list items live inside the <ul class="list-style-none"> container
pngs = soup.find("ul", {"class": "list-style-none"}).find_all("li", {"class": "py-4 border-bottom"})
print(len(pngs))
for each in pngs:
    png_tag = each.find("img", {"class": "rounded-1 mr-3"})
    if not png_tag:
        # This list item has no matching <img>
        png_url = ""
    else:
        png_url = png_tag.get("src")
        print(png_url)
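
The parsing step can be tried offline with a small sketch. The SAMPLE_HTML below is a made-up fragment that only mimics the class names used above (it is not the real page), the helper name extract_image_urls is hypothetical, and the built-in "html.parser" is swapped in for "lxml" so no extra package is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment that mimics the class names used in this article.
SAMPLE_HTML = """
<ul class="list-style-none">
  <li class="py-4 border-bottom"><img class="rounded-1 mr-3" src="https://example.com/a.png"></li>
  <li class="py-4 border-bottom"><span>no image here</span></li>
  <li class="py-4 border-bottom"><img class="rounded-1 mr-3" src="https://example.com/b.png"></li>
</ul>
"""


def extract_image_urls(html):
    """Return the src of every matching <img>, skipping list items without one."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("ul", {"class": "list-style-none"})
    if container is None:
        # The page layout changed or the request failed; nothing to extract.
        return []
    urls = []
    for item in container.find_all("li", {"class": "py-4 border-bottom"}):
        img = item.find("img", {"class": "rounded-1 mr-3"})
        if img is not None and img.get("src"):
            urls.append(img["src"])
    return urls


print(extract_image_urls(SAMPLE_HTML))
# → ['https://example.com/a.png', 'https://example.com/b.png']
```

Checking the container for None avoids an AttributeError when the expected <ul> is missing, which the chained find(...).find_all(...) call would otherwise raise.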

3. Once you have an image address, you can save the image locally.
Here I save each image under its original file name:

import urllib.request
filename = png_url.split('/')[-1]
print(filename)
urllib.request.urlretrieve(png_url, 'E://images/'+filename)
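
Taking the last "/"-separated piece works when the address ends in a plain file name, but it keeps any query string. As a hedged alternative, the sketch below derives the name from the URL path only. The helper names filename_from_url and local_path, the example URLs, and the "images" directory are all assumptions made for illustration.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="image"):
    """Return the last path component of url, ignoring query string and fragment."""
    path = unquote(urlsplit(url).path)
    name = PurePosixPath(path).name
    # Fall back to a default when the URL path ends in "/" or is empty.
    return name or default


def local_path(url, directory="images"):
    """Join a (hypothetical) target directory with the derived file name."""
    return Path(directory) / filename_from_url(url)


print(filename_from_url("https://example.com/img/logo.png?size=64"))  # → logo.png
print(filename_from_url("https://example.com/"))                      # → image
```

With the same first URL, url.split('/')[-1] would return "logo.png?size=64", which is not a valid file name on Windows.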

4. The complete code is as follows:

import requests
from bs4 import BeautifulSoup
import urllib.request

def main():
    # The cookies and headers values are omitted here
    cookies = {}
    headers = {}
    params = {'page': '2'}

    # Fetch the page; params becomes the ?page=2 query string
    response = requests.get('https://github.com/topics', headers=headers, params=params, cookies=cookies)

    # Parse the HTML and collect the list items
    soup = BeautifulSoup(response.content, "lxml")
    pngs = soup.find("ul", {"class": "list-style-none"}).find_all("li", {"class": "py-4 border-bottom"})
    print(len(pngs))
    for each in pngs:
        png_tag = each.find("img", {"class": "rounded-1 mr-3"})
        if not png_tag:
            # This list item has no matching <img>
            png_url = ""
        else:
            png_url = png_tag.get("src")
            print(png_url)
            # Save under the image's original file name; the target directory must already exist
            filename = png_url.split('/')[-1]
            print(filename)
            urllib.request.urlretrieve(png_url, 'E://images/'+filename)
            # Alternative: download with requests instead of urllib
            # response = requests.get(png_url, stream=True)
            # with open('E://images/'+filename, 'wb') as fd:
            #     fd.write(response.content)
            #     print(filename + " download success")

if __name__ == '__main__':
    main()
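
The commented-out lines in the listing hint at downloading with requests instead of urllib. A minimal sketch of that variant follows, assuming a chunked write through response.iter_content. The function names save_chunks and download_image, the 8192-byte chunk size, and the 10-second timeout are illustrative choices, not the author's.

```python
import os
import tempfile

import requests


def save_chunks(chunks, dest_path):
    """Write an iterable of byte chunks to dest_path and return the bytes written."""
    written = 0
    with open(dest_path, "wb") as fd:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                fd.write(chunk)
                written += len(chunk)
    return written


def download_image(url, dest_dir, timeout=10):
    """Stream url into dest_dir under its original file name and return the path."""
    os.makedirs(dest_dir, exist_ok=True)  # create the target directory if missing
    dest_path = os.path.join(dest_dir, url.split("/")[-1])
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()  # fail early on 4xx/5xx replies
        save_chunks(response.iter_content(chunk_size=8192), dest_path)
    return dest_path


# Offline demonstration of the writing step with fake chunks (no network needed).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.bin")
    demo_written = save_chunks([b"ab", b"", b"cde"], demo_path)
    print(demo_written)  # → 5
```

Inside the loop of main(), a call such as download_image(png_url, 'E://images') could stand in for the urlretrieve line. Streaming keeps memory use flat for large files, whereas response.content loads the whole body at once.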
