Python: Scraping Wallpapers from Wallpaper Abyss
Today I came across a new wallpaper site, Wallpaper Abyss, and it is quite good.
My first reaction, though, was to scrape it.
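Start with the page that lists all the categories.
After clicking into one of the categories:
Every item has a thumbnail and a sub-tag. The sub-tag will serve as the subdirectory name. Now compare the thumbnail URL with the full-resolution image URL.
The two differ only in the file name, so the full-resolution image address can be derived directly from the thumbnail page.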
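The thumbnail-to-full-size mapping can be sketched as a small pure function. This is a minimal sketch that assumes the thumbnail file name has the shape `thumb-<size>-<id>.<ext>` and that the full-size image sits in the same directory as `<id>.<ext>`. The helper name `thumb_to_full` and the example URL are hypothetical and only illustrate that shape, so check them against the real page source.

```python
def thumb_to_full(thumb_url: str) -> str:
    """Turn .../thumb-<size>-<id>.<ext> into .../<id>.<ext> (assumed URL shape)."""
    # Split off the file name so hyphens elsewhere in the URL are left alone.
    directory, _, filename = thumb_url.rpartition('/')
    # Keep only the part after the last '-', i.e. '<id>.<ext>'.
    full_name = filename.split('-')[-1]
    return '{}/{}'.format(directory, full_name)


# Hypothetical thumbnail URL, used only to illustrate the transformation.
example = 'https://images.example.com/123/thumb-350-123456.jpg'
print(thumb_to_full(example))
```

Splitting only the file name, rather than the whole URL, keeps the function from being confused by a hyphen in the host or path.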
Each top-level category is split across multiple pages.
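Walking through the pages of one category can be sketched as follows. It assumes, as the full script does, that page N of a category is reached by appending `&page=N` to the category URL, and that the total page count has already been read from the pagination bar. `page_urls` is a hypothetical helper name.

```python
def page_urls(category_url, total_pages):
    """Yield the URL of every page in a category, starting with page 1."""
    # Page 1 is the category URL itself.
    yield category_url
    for page in range(2, total_pages + 1):
        yield '{}&page={}'.format(category_url, page)


base = 'https://wall.alphacoders.com/by_category.php?id=3&name=Anime+Wallpapers'
for url in page_urls(base, 3):
    print(url)
```

Keeping URL generation separate from downloading makes it easy to test the paging logic without touching the network.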
That completes the analysis.
The complete test code is as follows:
import os
import re
import time
from concurrent.futures import ThreadPoolExecutor

import requests
from bs4 import BeautifulSoup

rootrurl = 'https://wall.alphacoders.com/'
save_dir = 'D:/estimages/'
# Request headers that make the client look like a browser
headers = {
    "Referer": rootrurl,
    'User-Agent': "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
    'Accept-Language': 'en-US,en;q=0.8',
    'Cache-Control': 'max-age=0',
    'Connection': 'keep-alive'
}

def saveOneImg(target_dir, img_url):
    # Build fresh browser-like headers for every image. Setting the Referer to the
    # image URL itself is meant to avoid HTTP 403 responses from hotlink protection.
    new_headers = {
        "Referer": img_url,
        'User-Agent': "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
        'Accept-Language': 'en-US,en;q=0.8',
        'Cache-Control': 'max-age=0',
        'Connection': 'keep-alive'
    }
    try:
        img = requests.get(img_url, headers=new_headers)  # Request the actual image URL
        if img.status_code == 200:
            # Write the image bytes to a local file named after the last URL segment
            with open('{}/{}.jpg'.format(target_dir, img_url.split('/')[-1].split('?')[0]), 'wb') as jpg:
                jpg.write(img.content)
            print(img_url)
            return True
        else:
            return False
    except Exception as e:
        print('exception occurs: ' + img_url)
        print(e)
        return False

def getAllTags():
    # Map each category name to its URL, read from the "finding wallpapers" page
    tags = {}
    html = BeautifulSoup(requests.get(rootrurl + 'finding_wallpapers.php', headers=headers).text,
                         features="html.parser")
    div = html.find('div', {'class': 'row'}).find_all('div')[1]
    a_s = div.find_all('a')[1:]
    for a in a_s:
        tags[a.get('title').split(' ')[0]] = rootrurl + a.get('href')
    return tags

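def getSubTitleName(name):
    # Match every character that is not Chinese, an ASCII letter, or a digit
    cop = re.compile("[^\u4e00-\u9fa5a-zA-Z0-9]")
    # Replace each matched character with an underscore
    return cop.sub('_', name)


def getSubDir(p):
    # The text of the last link in the info block is used as the subdirectory name
    return getSubTitleName(p.find_all('a')[-1].get_text())


def getImgUrl(p):
    # Derive the full-size URL from the thumbnail URL: drop the last five characters
    # of the first piece (presumably the "thumb" prefix) and append the last piece
    parts = p.find('img').get('src').split('-')
    return parts[0][:-5] + parts[-1]
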
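def processOnePage(tag, a_s, span):
    # Each thumbnail box is paired with the info block that names its sub-tag
    for box, info in zip(a_s, span):
        subdir = getSubDir(info)
        img = getImgUrl(box)
        tmpDir = '{}{}/{}'.format(save_dir, tag, subdir)
        # exist_ok avoids an error if another thread creates the directory first
        os.makedirs(tmpDir, exist_ok=True)
        saveOneImg(tmpDir, img)
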
def oneSpider(tag, url):
    # Read the total number of pages from the pagination bar
    html = BeautifulSoup(requests.get(url, headers=headers).text, features="html.parser")
    total = int(html.find('ul', {'class': 'pagination'}).find_all('a')[-2].string)

    a_s = html.find_all('div', {'class': 'boxgrid'})
    span = html.find_all('span', {'class': 'thumb-info-big'})
    print('----- current page is 1. ------')
    processOnePage(tag, a_s, span)

    for i in range(2, total + 1):
        html = BeautifulSoup(requests.get('{}&page={}'.format(url, i), headers=headers).text,
                             features="html.parser")
        a_s = html.find_all('div', {'class': 'boxgrid'})
        span = html.find_all('span', {'class': 'thumb-info-big'})
        print('----- current page is %d. ------' % i)
        processOnePage(tag, a_s, span)

if __name__ == '__main__':
    taglist = getAllTags()
    # print(taglist)

    # Give each category its own worker thread
    with ThreadPoolExecutor(max_workers=31) as t:  # Thread pool with at most 31 workers
        for tag, url in taglist.items():
            t.submit(oneSpider, tag, url)
    # Leaving the with block waits until all submitted tasks have finished,
    # so no extra polling loop is needed here.

    # just for test
    # oneSpider('Anime', 'https://wall.alphacoders.com/by_category.php?id=3&name=Anime+Wallpapers')
    # test
    # for tag, url in taglist.items():
    #     oneSpider(tag, url)
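A note on waiting for the workers: leaving a `with ThreadPoolExecutor(...)` block already blocks until every submitted task has finished, so no polling loop is needed. An exception raised inside a worker is stored on its future and stays silent unless the result is read. The sketch below shows one way to surface such failures. `fake_spider` is a stand-in for `oneSpider`, and `run_all` and the tag names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_spider(tag, url):
    """Stand-in for oneSpider: fails for one tag to show error handling."""
    if tag == 'Broken':
        raise ValueError('no pagination found for ' + url)
    return tag


def run_all(tags, max_workers=4):
    """Run one task per tag; return (finished tags, {failed tag: error})."""
    done, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fake_spider, tag, url): tag
                   for tag, url in tags.items()}
        for future in as_completed(futures):
            tag = futures[future]
            try:
                done.append(future.result())  # re-raises the worker's exception
            except Exception as exc:
                failed[tag] = exc
    # Reaching this line means every task has finished.
    return sorted(done), failed


done, failed = run_all({'Anime': 'url-a', 'Broken': 'url-b', 'Game': 'url-c'})
print(done, list(failed))
```

Collecting failures per tag makes it possible to retry only the categories that broke instead of restarting the whole crawl.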
The result looks like this: