Scraping Books from the Web with Python on a Phone

Posted by qq_20575249 on 2020-12-19

Original URL: https://blog.csdn.net/qq_20575249/article/details/111398296

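I like tinkering with Python, and since I couldn't sleep I decided to play around with a web scraper.
A big project is beyond me, so I settled on a small e-book scraper instead.
I'm not a real programmer, and even I find my own code ugly. Experts, please go easy on me!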

Anyway, enough rambling. I don't really know what else to say.
Let's look at the result first.

[Image: screenshot of the result]

The code is as follows (example):

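# coding:utf-8
import time
import os
from threading import Thread

import requests
import requests.packages.urllib3.util.ssl_
from bs4 import BeautifulSoup

# Allow every TLS cipher so that sites with old TLS setups still connect.
# Newer urllib3 releases may ignore this setting.
requests.packages.urllib3.util.ssl_.DEFAULT_CIPHERS = 'ALL'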
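def Get(url):        # Fetch a page and parse it; wrapped in a function for reuse
    time.sleep(3)    # Wait 3 seconds before each request
    webdata = requests.get(url, timeout=30).text   # timeout so a stalled server cannot hang the thread
    Data = BeautifulSoup(webdata,'html.parser')
    return Data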


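def 小說目錄(url,書名):
    # Fetch the book's table of contents and start one thread per chapter
    webdata = Get(url)
    li_all = webdata.find(attrs={'class':'book-list clearfix'}).find('ul').find_all('a')
    print(li_all)
    目錄 = []
    for i, link in enumerate(li_all):
        # Download and save chapter i in the background
        t = Thread(target = 解析章節,args=(link.get('href'),書名,i))
        t.start()
        # Table-of-contents entry pointing at the local file i.html
        Data = '◇<A href="'+str(i)+'.html'+'" >'+link.text+'</A><BR>'
        目錄.append(Data)
    目錄頁儲存(書名,目錄)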

def 目錄頁儲存(書籍,html_links):  # Save the book's table-of-contents page
    htmlLink ='<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><style type="text/css">@font-face{font-family:lianbi;src:url(font/連筆簽名字型.ttf)}.lianbi{font-family:lianbi;font-size:50px;text-shadow:none}</style></head><body><td width="68"><a href="index.html"><img border="0" src="../上頁.gif" width="64" height="23"></a></td><td width="68"><a href="../../../main.html"><img border="0" src="../目錄.gif" width="64" height="23"></a></td><td width="68"><a href="index.html"><img border="0" src="../下頁.gif" width="64" height="23"></a></td><center><p align="center"><b><font size="7" face="lianbi">目錄</font></b></p><table border="0" cellpadding="0" cellspacing="0"><td><font face="lianbi" size="5" color="#000000">'
    Data = "".join(html_links)
    HHH = htmlLink+Data+'</font></td></center></body></html>'
    # Write index.html into a folder named after the book
    mdpath = os.path.join(os.getcwd(), 書籍)
    os.makedirs(mdpath, exist_ok=True)   # safe even if a chapter thread created it first
    with open(os.path.join(mdpath, 'index.html'), 'w', encoding='UTF-8') as HtmlFile:
        HtmlFile.write(HHH)

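def 解析章節(url,書名,章節標題):
    # Note: 章節標題 receives the chapter index (0, 1, 2, ...), not the title text
    書籍 = 書名
    標題 = 章節標題
    Data = Get(url)
    # Chapter body text
    li = Data.find(id = 'nr1')
    print('Chapter content:',li)
    t = Thread(target = 章節儲存,args=(書籍,標題,str(li)))
    t.start()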
def 章節儲存(書籍,i,內容):   # Save one chapter as an HTML page
    # Previous/next page numbers; the first chapter's "previous" link points to itself
    shangyiye = max(i - 1, 0)
    xiayiye = i + 1
    line= '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /><style type="text/css">@font-face{font-family:lianbi;src:url(font/連筆簽名字型.ttf)}.lianbi{font-family:lianbi;font-size:50px;text-shadow:none}</style></head><body><td width="68"><a href="'+str(shangyiye)+'.html"><img border="0" src="../上頁.gif" width="64" height="23"></a></td><td width="68"><a href="index.html"><img border="0" src="../目錄.gif" width="64" height="23"></a></td><td width="68"><a href="'+str(xiayiye)+'.html"><img border="0" src="../下頁.gif" width="64" height="23"></a></td><font face="lianbi" size="5" color="#000000"><p align="center"><b><font size="6" >'+str(i)+'</font></b></p>'+內容+'</font></body></html>'
    mdpath = os.path.join(os.getcwd(), 書籍)
    os.makedirs(mdpath, exist_ok=True)   # several threads may reach this line at once
    with open(os.path.join(mdpath, str(i)+'.html'), 'w', encoding='UTF-8') as HtmlFile:
        HtmlFile.write(line)
if __name__ == '__main__':
    書名 = "阿Q正傳"
  
    url = "https://www.xingyueboke.com/aqzhengzhuan/"
    小說目錄(url,書名)
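
The script above starts one thread per chapter, so a long book sends all of its requests at roughly the same moment. Below is a minimal sketch of one possible alternative, assuming a bounded thread pool from the standard library's concurrent.futures is acceptable. The names fetch_all, fetch, and max_workers are hypothetical and do not appear in the original script, and the download function is passed in so that the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=4):
    """Run fetch(url) for every URL with at most max_workers threads.

    Results come back in the same order as urls, so the list index can
    double as the chapter number used for the file names.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even when tasks finish out of order
        return list(pool.map(fetch, urls))


if __name__ == '__main__':
    # Stand-in for a real download function (hypothetical), so this runs offline
    def fake_fetch(url):
        return 'content of ' + url

    pages = fetch_all(['a.html', 'b.html', 'c.html'], fake_fetch)
    for number, page in enumerate(pages):
        print(number, page)
```

In the original script, a function such as Get could be passed as fetch, with the chapter links taken from the table-of-contents page as urls. This has not been tested against the site used above.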
