[Python] Implementing a Baidu Keyword Rank Query

Author: 勵志要做一隻牛逼的程式媛. Posted 2018-12-03

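Python version: 3.7.1

Dependencies: requests, re, urllib, bs4, ... (re and urllib are part of the standard library and need no installation; the script below also imports pymysql).

How to install: open the Python installation directory, find the Scripts directory, hold Shift and right-click to open a command window there, run pip list to see which packages are already installed, then run pip install for the ones you still need.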
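
As a complement to pip list, the check can also be done from Python itself. This is only a minimal sketch: the REQUIRED list is my assumption, taken from the third-party imports in the script below plus bs4 from the dependency list, and missing_packages is a hypothetical helper name.

```python
import importlib.util

# Third-party packages (assumed list); re and urllib ship with Python.
# bs4 is installed with "pip install beautifulsoup4".
REQUIRED = ["requests", "pymysql", "bs4"]


def missing_packages(names):
    """Return the names that cannot be imported by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Anything reported as missing can then be installed with pip install from the Scripts directory as described above.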

 

Reference: https://blog.csdn.net/Ryuchong/article/details/80687447

# -*- coding:utf8 -*-
import requests
import re
import pymysql

# Keyword, company site, and search URLs
keyword = input("Enter the keyword to look up: ")
site = input("Enter the site URL to look for: ")
site_baidu = "http://www.baidu.com/s?wd=%s&pn=%d0"
site_360 = "https://hao.360.cn/"


# Look up the rank
i = 0  # running result count across pages, updated by KeywordRank
# Example inputs:
#word = "體檢行業爆醜聞"
#site = "https://baijiahao.baidu.com"
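def KeywordRank(searchTxt, webUrl):
    """Return the overall rank of the first result whose URL contains webUrl, or None."""
    global i
    if searchTxt is None:  # the request failed, nothing to parse
        return None
    try:
        # searchTxt is bytes (response.content), so the pattern must be bytes too
        pattern = re.compile(b'class="c-showurl" style="text-decoration:none;">(.*?)&nbsp', re.S)
        result = pattern.findall(searchTxt)
        for item in result:
            item_str = str(item, encoding="utf8")
            i = i + 1
            print("rank %d: %s" % (i, item_str))
            if webUrl in item_str:  # compare against the parameter, not the global site
                return i
    except Exception as e:
        print(e)
    return None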
 
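# content: the keyword to search for, page: zero-based results page (pn = page * 10)
def BaiduSearch(content, page):
    try:
        url = site_baidu % (content, page)
        # timeout so a stalled request cannot hang the script
        data = requests.get(url, timeout=10)
        return data.content
    except Exception as e:
        print(e)
        return None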
     
if __name__ == "__main__":
    rank = None
    for page in range(10):  # check at most 10 pages
        searchTxt = BaiduSearch(keyword, page)
        rank = KeywordRank(searchTxt, site)
        if rank is not None:
            print("The keyword ranks at position %d" % rank)
            break


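# Connect to the database and save the result

if rank is not None:
    conn = pymysql.connect(
        host='127.0.0.1',
        port=3306,
        user='root',
        password='root',
        db='test',
        charset='utf8'
    )

    cursor = conn.cursor()

    # Parameterized query: the driver escapes the values itself.
    # rank is a reserved word in MySQL 8, so the column name is backquoted.
    # Assumes seo.id is AUTO_INCREMENT, so it is left out of the column list.
    sql_insert = "insert into seo(site, word, `rank`) values(%s, %s, %s)"
    cursor.execute(sql_insert, (site, keyword, rank))
    conn.commit()
    cursor.close()
    conn.close()
else:
    print("No rank found in the pages checked; nothing saved.")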
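
A side note on the ranking step: the counting across pages can also be written without the global counter. The following is only a sketch of that idea, not the method from the reference article. find_rank and fake_result are hypothetical names, the HTML fragments are made up to match the pattern, and the pattern itself depends on Baidu's markup at the time, so it may not match the live page today.

```python
import re

# Same pattern as in the script; a bytes pattern, because the pages are bytes.
SHOWURL = re.compile(
    rb'class="c-showurl" style="text-decoration:none;">(.*?)&nbsp', re.S)


def find_rank(pages, target):
    """Return the 1-based position of the first result containing target,
    counting across all pages in order, or None if it never appears."""
    position = 0
    for html in pages:
        if html is None:  # a failed request: skip it (its results go uncounted)
            continue
        for raw in SHOWURL.findall(html):
            position += 1
            if target in raw.decode("utf8", errors="replace"):
                return position
    return None


def fake_result(url):
    """Build a made-up HTML fragment shaped like one search result."""
    return (b'<a class="c-showurl" style="text-decoration:none;">'
            + url + b'&nbsp;</a>')


# Two made-up result pages standing in for real responses.
page1 = fake_result(b"www.example.com/") + fake_result(b"news.example.org/")
page2 = fake_result(b"baijiahao.baidu.com/")

print(find_rank([page1, page2], "baijiahao.baidu.com"))  # prints 3
```

Keeping the counter local means the function can be called more than once in the same process without carrying over a stale count.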

Result of running the full script:

 

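The reference link explains the approach clearly, so here I will only stress two things: remember to handle the encoding explicitly, and watch out for the incompatibilities between Python 2 and Python 3, in particular the syntax changes.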
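
To make the encoding point concrete, here is a small sketch of the Python 3 behaviour the script relies on. The sample string is made up and stands in for response.content; nothing here touches the network.

```python
import re

# In Python 3, text (str) and raw data (bytes) are separate types.
# requests returns bytes in response.content; this is a made-up stand-in.
raw = "百度 rank".encode("utf8")

# Searching bytes needs a bytes pattern.
matches = re.findall(rb"rank", raw)

# A str pattern on bytes raises TypeError instead of silently working.
try:
    re.findall(r"rank", raw)
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Decode explicitly to get text; str(raw, encoding="utf8") is equivalent.
text = raw.decode("utf8")

# print is a function in Python 3 (Python 2 allowed: print "rank", 1),
# and input() replaces Python 2's raw_input().
print("rank %d: %s" % (1, text))
```

This is why the script compiles its pattern from a bytes literal and converts each match with str(item, encoding="utf8") before comparing it with the site string.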
