I Built a Douban User Book Short-Review Downloader in Python

Published 2015-10-11

Python

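Introduction

A friend asked me whether I could build a tool to download the short reviews he had written on Douban Books, so I made this "Douban user book short-review downloader".

GitHub link: https://github.com/xiaff/dbc-downloader

The tool is written in Python 3.4. Its workflow is:

The user enters their Douban ID;

the tool fetches the user's review list pages;

it parses those pages;

it saves the review information as a Markdown file;

it converts the Markdown file to HTML.

The main libraries used are: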

urllib.request

BeautifulSoup4

markdown

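Fetching the pages

The information to fetch lives on pages like this one: http://book.douban.com/people/ahbei/collect?sort=time&start=0&filter=all&mode=grid&tags_sort=count. The URL contains the user ID (after people/) and the review index (start=), among other parameters.

url_1='http://book.douban.com/people/'

url_2='/collect?sort=time&start='

url_3='&filter=all&mode=grid&tags_sort=count'

The full address is url=url_1+uId+url_2+str(index)+url_3, where uId is the Douban user ID and index is the review index. Indices start at 0 and each page shows 15 entries, so the indices in successive URLs are 0, 15, 30, ..., 15*i. The largest i is the page count minus 1, and the page count can be read while parsing the first page.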
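The paging rule can be sketched as a small helper. This is only an illustration: `page_urls` and `PAGE_SIZE` are hypothetical names that do not appear in the tool itself, and the page count is assumed to be known already (the tool reads it from the first page).

```python
# Sketch: build the list-page URLs for one user, 15 entries per page.
URL_1 = 'http://book.douban.com/people/'
URL_2 = '/collect?sort=time&start='
URL_3 = '&filter=all&mode=grid&tags_sort=count'
PAGE_SIZE = 15  # entries shown per page, as described above


def page_urls(u_id, page_count):
    """Return the URL of every list page, in order (hypothetical helper)."""
    return [URL_1 + u_id + URL_2 + str(i * PAGE_SIZE) + URL_3
            for i in range(page_count)]


if __name__ == '__main__':
    # Three pages give start=0, start=15 and start=30.
    for url in page_urls('ahbei', 3):
        print(url)
```

Note that the index has to be converted with str() before concatenation, otherwise Python raises a TypeError.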

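You can optionally route requests through a proxy server, which urllib.request can set up:

proxyInfo=input('Please type in your HTTP Proxy: ')

proxySupport=urllib.request.ProxyHandler({'http':proxyInfo})

opener=urllib.request.build_opener(proxySupport)

urllib.request.install_opener(opener)

However, if you configure only the proxy and then request a user's book review list, Douban returns 403 Forbidden.

The fix is to add request headers so that the request looks like it comes from a browser. To find them, open the page in a browser, press F12 to open the developer tools, and look up the Request Headers in the Network tab.

For example, these are the request headers my Edge browser sent when visiting Douban: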

head= {
   'Accept':'text/html, application/xhtml+xml, image/jxr, */*',
   'Accept-Language': 'zh-Hans-CN, zh-Hans; q=0.5',
   'Connection':'Keep-Alive',
   'Cookie':'bid=lkpO8Id/Kbs; __utma=30149280.1824146216.1438612767.1440248573.1440319237.13; __utmz=30149280.1438612767.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); as=http://book.douban.com/people/133476248/; ll=108288; viewed=26274009_1051580; ap=1; ps=y; ct=y; __utmb=30149280.23.10.1440319237; __utmc=30149280; __utmt_douban=1; _pk_id.100001.3ac3=b288f385b4d73e38.1438657126.3.1440319394.1440248628.; __utma=81379588.142106303.1438657126.1440248573.1440319240.3; __utmz=81379588.1440319240.3.2.utmcsr=movie.douban.com|utmccn=(referral)|utmcmd=referral|utmcct=/; _pk_ses.100001.3ac3=*; __utmb=81379588.23.10.1440319240; __utmt=1; __utmc=81379588; _pk_ref.100001.3ac3=%5B%22%22%2C%22%22%2C1440319240%2C%22http%3A%2F%2Fmovie.douban.com%2F%22%5D',
   'Host':'book.douban.com',
   'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.10240'}

Then attach the headers when requesting the page:

full_url=urllib.request.Request(url,headers=head)
response=urllib.request.urlopen(full_url)
html=response.read()

With that, the page content is fetched correctly.
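Because network requests fail now and then, it may help to wrap the request in a small retry loop. The following is a minimal sketch, not the tool's exact code: `build_request`, `fetch`, and the `opener` parameter are hypothetical names introduced here, and the one-entry `HEADERS` dict is a placeholder for the real headers copied from your browser.

```python
import urllib.request
from urllib.error import URLError

# Placeholder headers; in practice, copy the real ones from your browser.
HEADERS = {'User-Agent': 'Mozilla/5.0'}


def build_request(url, headers=HEADERS):
    """Attach browser-like headers to a request without sending it."""
    return urllib.request.Request(url, headers=headers)


def fetch(url, tries=3, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if every attempt fails.

    `opener` is any callable that takes a Request and returns a
    file-like response; it defaults to urllib.request.urlopen.
    """
    request = build_request(url)
    for attempt in range(tries):
        try:
            with opener(request) as response:
                return response.read()
        except URLError as e:  # HTTPError is a subclass of URLError
            print('Attempt %d failed: %s' % (attempt + 1, e))
    return None
```

Passing the opener in as a parameter is a design choice made for this sketch: it lets you try the retry logic with a fake opener, without touching the network. Returning None leaves it to the caller to decide what to do when all attempts fail.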

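Parsing the pages

I already covered parsing pages with BeautifulSoup in an earlier article, 《從豆瓣電影批量獲取看過某部電影的使用者列表》 ("Batch-fetching the list of users who have watched a given film from Douban Movies"). The official documentation is enough to get started, so I won't repeat the details here.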
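For readers who want a feel for what the parsing step does, here is a rough sketch that pulls the comment text out of a list page. Two caveats: it uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra installs, and the sample markup is invented for illustration. Only the class names (subject-item, info, comment) come from the selectors in the full source code; `CommentParser` and `SAMPLE` are hypothetical names.

```python
from html.parser import HTMLParser


class CommentParser(HTMLParser):
    """Collect the text of every <p class="comment"> element."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self._buffer = None  # text chunks while inside a comment <p>

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get('class') or '').split()
        if tag == 'p' and 'comment' in classes:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == 'p' and self._buffer is not None:
            self.comments.append(''.join(self._buffer).strip())
            self._buffer = None


# Invented sample markup; the real page is more elaborate.
SAMPLE = """
<ul>
  <li class="subject-item"><div class="info">
    <a title="Example Book">Example Book</a>
    <p class="comment"> Worth rereading. </p>
  </div></li>
  <li class="subject-item"><div class="info">
    <a title="No Comment Here">No Comment Here</a>
  </div></li>
</ul>
"""

parser = CommentParser()
parser.feed(SAMPLE)
print(parser.comments)
```

The second book has no comment paragraph, so it contributes nothing to the result. The tool behaves the same way: it keeps only books for which a comment was found.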

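Converting Markdown to HTML

The last step converts the file saved in Markdown format into an HTML file, so that people unfamiliar with Markdown can view it directly in a browser or save it as a PDF.

The markdown package handles this:

md = markdown.markdown(contents)
html = '<html><meta charset="UTF-8">'
html+='<title>'+title+'</title>'
html += "<body>" + md + "</body></html>"

markdown.markdown(contents) returns an HTML fragment without the <html> tags, so you have to add those tags yourself before saving.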
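The wrapping step can also be written as a small function. The sketch below is a variation on the snippet above, not the tool's own code: `wrap_html` is a hypothetical helper, and it additionally escapes the title and adds a doctype and a head element, which the original snippet does not do.

```python
import html


def wrap_html(body, title):
    """Wrap an HTML fragment in a minimal, complete HTML document."""
    return ('<!DOCTYPE html>\n'
            '<html><head><meta charset="UTF-8">'
            '<title>' + html.escape(title) + '</title></head>\n'
            '<body>' + body + '</body></html>')


if __name__ == '__main__':
    # Stand-in for the fragment that markdown.markdown(contents) returns.
    fragment = '<h2>Example Book</h2>\n<p>Worth rereading.</p>'
    print(wrap_html(fragment, 'Reviews & notes'))
```

Escaping the title matters when the page title contains characters such as & or <, which would otherwise end up as raw markup inside the title element.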

Source code

#coding=utf-8
#Python 3.4
##Fetch all of a user's book short reviews from Douban pages

##Page URL pattern: http://book.douban.com/people/1000001/collect?sort=time&start=0&filter=all&mode=grid&tags_sort=count
##                  http://book.douban.com/people/1000001/collect?sort=time&start=15&filter=all&mode=grid&tags_sort=count

from bs4 import BeautifulSoup
import time
import urllib.request,urllib.parse
from urllib.error import URLError,HTTPError
import os
import markdown

#Line separator
lineSep='\n'

#Set up the HTTP proxy
ans=input('Do you want to use an HTTP Proxy (N/y)? ')
ans=ans.lower()
if ans=='y' or ans=='yes':
    print('HTTP Proxy format: IP:PORT \nExample: 127.0.0.1:80')
    print('Do NOT include any unnecessary characters.')
    proxyInfo=input('Please type in your HTTP Proxy: ')
    proxySupport=urllib.request.ProxyHandler({'http':proxyInfo})
    opener=urllib.request.build_opener(proxySupport)
    urllib.request.install_opener(opener)

#Request headers
head= {
   'Accept':'text/html, application/xhtml+xml, image/jxr, */*',
   'Accept-Language': 'zh-Hans-CN, zh-Hans; q=0.5',
   'Connection':'Keep-Alive',
   'Cookie':'bid=lkpO8Id/Kbs; __utma=30149280.1824146216.1438612767.1440248573.1440319237.13; __utmz=30149280.1438612767.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); as=http://book.douban.com/people/133476248/; ll=108288; viewed=26274009_1051580; ap=1; ps=y; ct=y; __utmb=30149280.23.10.1440319237; __utmc=30149280; __utmt_douban=1; _pk_id.100001.3ac3=b288f385b4d73e38.1438657126.3.1440319394.1440248628.; __utma=81379588.142106303.1438657126.1440248573.1440319240.3; __utmz=81379588.1440319240.3.2.utmcsr=movie.douban.com|utmccn=(referral)|utmcmd=referral|utmcct=/; _pk_ses.100001.3ac3=*; __utmb=81379588.23.10.1440319240; __utmt=1; __utmc=81379588; _pk_ref.100001.3ac3=%5B%22%22%2C%22%22%2C1440319240%2C%22http%3A%2F%2Fmovie.douban.com%2F%22%5D',
   'Host':'book.douban.com',
   'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.10240'}

#url=url_1+uId+url_2+Index+url_3
url_1='http://book.douban.com/people/'
url_2='/collect?sort=time&start='
url_3='&filter=all&mode=grid&tags_sort=count'

def is_chinese(uchar):
    """Return True if the character is a Chinese character (U+4E00 to U+9FA5).
    """
    return u'\u4e00' <= uchar <= u'\u9fa5'

def isChineseBook(title):
    """Return True if the title contains at least one Chinese character.
    """
    for c in title:
        if is_chinese(c):
            return True
    return False

def getHtml(url):
    """Return the content of the given page, trying up to 3 times.
    """
    print('Loading: '+url+'......')
    full_url=urllib.request.Request(url,headers=head)
    TRY_TIMES=3
    response=None
    while TRY_TIMES>0 and response is None:
        TRY_TIMES-=1
        try:
            response=urllib.request.urlopen(full_url)
        except HTTPError as e:
            print('HTTP Error:',e.code)
        except URLError as e:
            print('URL Error: ',e.reason)
    if response is None:
        print('Error!')
        os.system("pause")   #Windows only: wait for a key press
        exit()
    html=response.read()
    return html

def getBookComment(html):
    """Parse a page and return 5 lists:
    titles, publication info, marked dates, tags, comments.
    Only books that have a comment are included.
    """
    titleList=[]    #titles
    pubList=[]      #publication info
    dateList=[]     #marked dates
    tagsList=[]     #tags
    commentList=[]  #comments

    soup=BeautifulSoup(html,'html.parser')
    lis=soup.find_all('li','subject-item')
    for li in lis:
        infoDiv=li.find('div','info')
        commentP=infoDiv.find('p','comment')
        if commentP is not None:
            a=infoDiv.a
            #Title: main title plus optional subtitle
            title1=a.get('title').strip()
            title2Span=a.span
            if title2Span is not None:
                title2=a.span.text.strip()
            else:
                title2=''
            title=title1+title2
            #Chinese titles get book-title marks
            if isChineseBook(title):
                title=u'《'+title+u'》'
            else:   #other titles are set in italics
                title='*'+title+'*'
            titleList.append(title)
            #Publication info
            pubDiv=infoDiv.find('div','pub')
            pub=pubDiv.text.strip()
            pubList.append(pub)
            #Marked date
            dateSpan=infoDiv.find('span','date')
            words=dateSpan.text.split('\n')
            date=words[0]+words[1]
            dateList.append(date)
            #Tags
            tagsSpan=infoDiv.find('span','tags')
            if tagsSpan is not None:
                tags=tagsSpan.text.strip()
            else:
                tags=''
            tagsList.append(tags)
            #Comment
            comment=commentP.text.strip()
            commentList.append(comment)
    return (titleList,pubList,dateList,tagsList,commentList)

def getHtmlTitle(html):
    """
    Return the title of the page
    """
    soup=BeautifulSoup(html,'html.parser')
    title=soup.head.title.text
    return title

def clearOldFile(uId):
    """
    Empty the file left over from a previous run
    """
    fileName='booksComments_'+uId+'.md'
    with open(fileName,'w',encoding='utf-8'):
        pass

def saveBookComment(titleList,pubList,dateList,tagsList,commentList,uId):
    """Append the reviews to the Markdown file and return its name
    """
    fileName='booksComments_'+uId+'.md'
    with open(fileName,mode='a',encoding='utf-8') as wf:
        size=len(titleList)
        for i in range(size):
            title=titleList[i]
            pub=pubList[i]
            date=dateList[i]
            tags=tagsList[i]
            comment=commentList[i]
            wf.write('## '+title+lineSep)
            #two trailing spaces force a Markdown line break
            wf.write(pub+'  '+lineSep)
            wf.write(date+'  '+lineSep)
            wf.write(tags+lineSep+lineSep)
            wf.write(comment+lineSep+lineSep)
    return fileName

def getPageNum(html):
    """Parse the first page and return the number of pages for this user
    """
    soup=BeautifulSoup(html,'html.parser')
    paginator=soup.find('div','paginator')
    pas=paginator.find_all('a')
    #the second-to-last link holds the last page number
    num=int(pas[-2].text)
    return num

def convertMd2Html(mdName,title):
    """
    Convert the Markdown file to an HTML file
    """
    htmlName=mdName.replace('.md','.html')
    with open(mdName,'r',encoding='utf-8') as mdFile:
        contents=mdFile.read()
    md = markdown.markdown(contents)
    html = '<html><meta charset="UTF-8">'
    html+='<title>'+title+'</title>'
    html += "<body>" + md + "</body></html>"
    with open(htmlName,'w',encoding='utf-8') as htmlFile:
        htmlFile.write(html)
    return htmlName

#Ask for the User-Id
print('\nYou can find User-Id in the url.')
print('E.g. if someone\'s homepage URL is http://book.douban.com/people/1000001/ , the User-Id is 1000001 .')
uId=input('User-Id: ')
while(uId==''):
    uId=input('User-Id: ')
#Counter
count=0

#Read the first page
index=0
url=url_1+uId+url_2+str(index)+url_3
html=getHtml(url)
(titleList,pubList,dateList,tagsList,commentList)=getBookComment(html)
htmlTitle=getHtmlTitle(html)
clearOldFile(uId)
fileName=saveBookComment(titleList,pubList,dateList,tagsList,commentList,uId)

count+=len(titleList)
try:
    pageNum=getPageNum(html)    #number of pages of books the user has read
except Exception:
    #no paginator found: assume a single page
    pageNum=1
index+=1
#Read the remaining pages
for i in range(index*15,15*pageNum,15):
    print('Sleep for 5 seconds.')
    time.sleep(5)
    print('%d/%d' %(i//15+1,pageNum))
    url=url_1+uId+url_2+str(i)+url_3
    html=getHtml(url)
    (titleList,pubList,dateList,tagsList,commentList)=getBookComment(html)
    count+=len(titleList)
    saveBookComment(titleList,pubList,dateList,tagsList,commentList,uId)
print('\nMission accomplished!')
print('%d comments have been saved to %s.' %(count,fileName))
ans=input('\nDo you want to convert Markdown file to html file(Y/n)?')
ans=ans.lower()
if ans!='n':
    htmlName=convertMd2Html(fileName,htmlTitle)
    print('Convert success: %s' %htmlName)
os.system("pause")
