JD.com Product Information Crawler

Posted by weixin_33976072 on 2017-08-14

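I have been at home with little to do lately, so I started reading books about web crawlers. It turned out to be a lot of fun, so I wrote quite a bit of code and crawled a number of websites. Today I am sharing the source code of my JD.com crawler.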

# JD.com product information crawler
# Crawls JD.com product information and saves it to a CSV file
# 2017-7-23


import os
import requests
import csv
from bs4 import BeautifulSoup
  
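# Request a search results page
def gethtml(kind, page):
    """Fetch the HTML of one search results page for the keyword `kind`."""
    pagenum = str(2 * page)  # value sent as the site's `page` parameter
    try:
        r = requests.get(
            'https://search.jd.com/Search',
            params={'keyword': kind, 'enc': 'utf-8', 'page': pagenum},
            timeout=10,  # do not hang forever on a stalled connection
        )
        r.raise_for_status()
        r.encoding = r.apparent_encoding  # decode with the detected encoding
        print('Crawling page {}:'.format(page))
        return r.text  # return the HTML
    except requests.RequestException:
        print('Connection error!')
        return ''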
  
# Locate the product data
def findhtml(html, httplist):
    """Parse product entries from `html` and append them to `httplist`."""
    soup = BeautifulSoup(html, 'lxml')
    links = soup.find_all('div', class_='gl-i-wrap')  # one <div> per product
    for link in links:
        # Product name and link
        namediv = link.find('div', class_='p-name p-name-type-2')
        if namediv is None or namediv.a is None:
            continue  # skip entries without a name block
        title = namediv.a.get('title', '')
        href = namediv.a.get('href', '')
        # Product price; left empty when the attribute is missing
        price = ''
        pricediv = link.find('div', class_='p-price')
        if pricediv is not None and pricediv.strong is not None:
            price = pricediv.strong.get('data-price', '')
        # Prefix links that lack a scheme
        if not href.startswith('http'):
            href = 'https:' + href
        # Review count; left empty when the markup differs
        try:
            commentdiv = link.find('div', class_='p-commit')
            number = commentdiv.strong.contents[1].string
        except (AttributeError, IndexError):
            number = ''
        # Row order: name, price, link, review count
        httplist.append([title, price, href, number])
        if price:
            print('{:^10s}:{:<} yuan'.format(title, price))
        else:
            print('{:^10s}'.format(title))
  
  
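# Save the results
def savehtml(ul):
    """Write the collected rows to a CSV file."""
    path = 'D:/資料/'  # output directory used in the original post
    os.makedirs(path, exist_ok=True)  # create the directory if it is missing
    # newline='' is what the csv module expects; utf-8-sig lets Excel detect the encoding
    with open(path + '京東商品資訊爬蟲.csv', 'w', newline='', encoding='utf-8-sig') as f:
        writer = csv.writer(f)
        writer.writerow(['Product', 'Price', 'Link', 'Reviews'])
        for row in ul:
            if row:
                writer.writerow(row[:4])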
  
  
  
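# Main program
if __name__ == '__main__':
    goods = input('Enter the product to search for: ')
    yeshu = int(input('Enter the number of pages to crawl: '))
    ulist = []
    for i in range(1, yeshu + 1):
        try:
            text = gethtml(goods, i)
            findhtml(text, ulist)
            savehtml(ulist)  # save after every page so partial results are kept
        except Exception as e:
            print('Stopped on page {}: {}'.format(i, e))
            break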
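
The script sends the keyword and page number as query parameters of https://search.jd.com/Search and prefixes 'https:' to product links that lack a scheme. As a minimal sketch of those two steps using only the standard library, the helpers below URL-encode the keyword with urllib.parse.urlencode and resolve scheme-less links with urllib.parse.urljoin. The names build_search_url and normalize_link are hypothetical and not part of the original script, the 2 * page mapping is simply copied from it, and the item link in the usage lines is a made-up example.

```python
from urllib.parse import urlencode, urljoin

SEARCH_URL = 'https://search.jd.com/Search'  # endpoint used by the script


def build_search_url(keyword, page):
    """Return the search URL for `keyword` and a 1-based `page` (hypothetical helper)."""
    query = urlencode({
        'keyword': keyword,   # urlencode percent-encodes non-ASCII text as UTF-8
        'enc': 'utf-8',
        'page': 2 * page,     # same page mapping as the script
    })
    return SEARCH_URL + '?' + query


def normalize_link(href, base=SEARCH_URL):
    """Turn a scheme-less link such as '//host/path' into an absolute URL (hypothetical helper)."""
    # urljoin takes the scheme from `base` when `href` starts with '//',
    # and returns `href` unchanged when it is already absolute.
    return urljoin(base, href)


if __name__ == '__main__':
    print(build_search_url('phone', 1))
    print(normalize_link('//item.jd.com/123.html'))  # made-up item link
```

Neither helper touches the network, so both can be checked in isolation before being used inside gethtml and findhtml.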
