Python Crawlers (4): Auto-Filling a Campus Network Report Form with selenium

Posted by jerry_liufeng on 2020-10-25

Original URL: https://blog.csdn.net/jerry_liufeng/article/details/109272715

Python crawlers

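Note: the site that this selenium script fills in requires personal information, a password, and the campus VPN, so I have hidden everything related to personal information and URLs. The post mainly shows the method and the workflow.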

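I. Automated form filling with selenium

1. Workflow

1) Log in to the site
2) Jump to the report page
3) Fill in the fields and submit the form
4) Close the submission page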

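2. Analysis

1) Logging in and filling in the form both require sending input to the page, which is convenient to drive with selenium.
2) After logging in you do not have to click your way to the report page; you can jump straight to its URL, because by then the site's backend has already stored your login session.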

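3. Main code

I did not write this many comments originally; I added them afterwards to make the code easier to follow. If anything is still unclear, or you think something is wrong, corrections are welcome.
Overall structure:

Log in to the site (enter personal information, click the login button)
Jump to the report page (choose campus, dormitory building, room number, personal status, and so on)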

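from selenium import webdriver
from selenium.common.exceptions import TimeoutException, WebDriverException
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
import time

vpn_username = "username"
vpn_password = "password"  # fill in your own


class check_in(object):
    # Fill in your own path; chromedriver.exe is usually placed here
    driver_path = r'C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe'

    def __init__(self):
        option = webdriver.ChromeOptions()
        # Run headless, without showing the browser window (while testing, comment this
        # line out so you can see which step goes wrong)
        option.add_argument('--headless')
        # Do not load images, to speed up page loading
        prefs = {'profile.managed_default_content_settings.images': 2}
        option.add_experimental_option('prefs', prefs)
        # Selenium 4 form; on Selenium 3 use
        # webdriver.Chrome(executable_path=check_in.driver_path, options=option)
        service = Service(executable_path=check_in.driver_path)
        self.driver = webdriver.Chrome(service=service, options=option)
        # Set maximum wait times to avoid blocking
        self.driver.set_page_load_timeout(20)
        self.driver.set_script_timeout(20)

        self.url = 'https://****'  # URL of the login page
        self.url_apply = 'https://****'  # URL of the report page

    def run(self):
        '''
        Simulate logging in to the site.
        Note that this login has no CAPTCHA; if it had one, the CAPTCHA would have to be
        solved as well (today's CAPTCHAs are fairly troublesome).
        '''
        try:
            self.driver.get(self.url)
            self.driver.find_element(By.NAME, "username").send_keys(vpn_username)  # enter the username
            self.driver.find_element(By.NAME, "password").send_keys(vpn_password)  # enter the password
            self.driver.find_element(By.NAME, "login_submit").click()              # click login
        except TimeoutException:
            print("The page is loading too slowly; stopping the load and moving on to the next step")
            # Stop the load so a never-ending page does not block the code that follows
            self.driver.execute_script("window.stop()")

    def tianxie(self):
        '''
        Yes, you read that right, this method is called tianxie ("fill in"), hhh
        Fill in the fields on the report page.
        '''
        self.driver.execute_script("window.open('%s')" % self.url_apply)  # open a new page
        self.driver.switch_to.window(self.driver.window_handles[1])       # switch to the newly opened page
        # It is best to pause before each step so the page has time to open. Automation
        # differs from crawling: speed is not the goal here (for this example only, no nitpicking please)
        time.sleep(2)
        # Single choice: select the dormitory campus
        select_xiaoqu = Select(self.driver.find_element(By.NAME, "fieldSQxq"))
        select_xiaoqu.select_by_value("6")
        # Single choice: select the dormitory building
        select_gongyu = Select(self.driver.find_element(By.NAME, 'fieldSQgyl'))
        select_gongyu.select_by_value("60")
        # Fill in the room number
        try:
            self.driver.find_element(By.NAME, 'fieldSQqsh').clear()
        except WebDriverException:
            print('The room number seems to be empty already!!')
        self.driver.find_element(By.NAME, 'fieldSQqsh').send_keys('318')
        # Select the personal status
        try:
            self.driver.find_element(By.ID, 'V2_CTRL28').click()
            self.driver.find_element(By.ID, 'V2_CTRL19').click()
            self.driver.find_element(By.ID, 'V2_CTRL23').click()
        except WebDriverException:
            print("No luck, check-in is not possible yet")
        # Submit the form
        self.driver.find_element(By.CLASS_NAME, 'command_button').click()
        time.sleep(1)
        self.driver.find_element(By.XPATH, "//button[@class='dialog_button default fr']").click()
        time.sleep(1)
        self.driver.find_element(By.XPATH, "//button[@class='dialog_button default fr']").click()
        time.sleep(1)
        # Very important (closes all pages)
        self.driver.quit()


if __name__ == '__main__':
    auto_checkin = check_in()
    print("Logging in to the report platform...")
    auto_checkin.run()
    print("Logged in...\nJumping to the report page...")
    auto_checkin.tianxie()
    print("Report submitted!!!")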

4. Notes (errors you may run into)

selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable: Element is not currently visible and may not be manipulated
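Cause: the element was not yet available when the script tried to use it, which means the page had not finished loading. Try adding time.sleep(2) before the failing line so the page has some time to load before the following steps run.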
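A fixed time.sleep(2) can be either too long or still too short. As a rough sketch of an alternative, the helper below polls until a lookup succeeds or a deadline passes. The name `wait_for` and its parameters are hypothetical and not part of selenium; selenium ships its own `WebDriverWait` class for the same purpose.

```python
import time


def wait_for(find, timeout=10.0, poll=0.5):
    """Call find() until it returns a truthy value or `timeout` seconds pass.

    Hypothetical helper for illustration; `find` is any zero-argument callable.
    """
    deadline = time.monotonic() + timeout
    last_error = None
    while True:
        try:
            result = find()
            if result:
                return result
        except Exception as exc:  # e.g. the element is not in the page yet
            last_error = exc
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout) from last_error
        time.sleep(poll)
```

Inside the class above it could be used as `wait_for(lambda: self.driver.find_element("name", "fieldSQxq").is_displayed())` before touching the campus dropdown, so the script waits only as long as it has to.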
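The page keeps loading, the selenium code after it never runs, and the program is killed automatically after 300 s.
The site probably loads content dynamically, so selenium cannot keep waiting for loading to finish; after it has loaded for a while, simply stop the load.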

self.driver.set_page_load_timeout(20)
self.driver.set_script_timeout(20)

selenium.common.exceptions.SessionNotCreatedException: Message: session not created: This version of ChromeDriver only supports Chrome version 86
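Your chromedriver.exe no longer matches your Chrome version; download a new chromedriver.exe and replace the old one.
Address: http://chromedriver.storage.googleapis.com/index.html
Put chromedriver.exe at C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe on your machine, or somewhere else (I have not tried other locations).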
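To spot the mismatch before the script fails, the major version numbers can be compared. The sketch below only parses version strings; `major_version` and `versions_match` are hypothetical helper names, and the sample strings are illustrative. The driver string is what `chromedriver --version` prints, and Chrome's own version is shown on the chrome://version page.

```python
import re


def major_version(version_text):
    """Return the major number of the first dotted version found in the text."""
    match = re.search(r"(\d+)\.\d+\.\d+(?:\.\d+)?", version_text)
    if match is None:
        raise ValueError("no version number in %r" % version_text)
    return int(match.group(1))


def versions_match(chrome_text, driver_text):
    """True when Chrome and chromedriver share the same major version."""
    return major_version(chrome_text) == major_version(driver_text)


# Illustrative strings only; substitute the ones from your own machine.
print(versions_match("Google Chrome 87.0.4280.66", "ChromeDriver 86.0.4240.22"))
```

When the two major numbers differ, download the chromedriver build whose major version equals Chrome's.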
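An element on the page cannot be located (locating elements is the most basic step and also a critical one, so it is worth studying the related topics: HTML + CSS + XPath + regular expressions):
1) Locate it by tag attributes with driver.find_element_by_name and similar methods
2) Locate it with a CSS selector
3) Locate it with XPath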
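The three approaches can be tried in turn against the same element. This is a sketch under assumptions: the room-number field is taken to be an `<input>` tag (only its name, fieldSQqsh, comes from the script above), and `find_first` is a hypothetical helper. The strategy strings are the values of selenium's `By.NAME`, `By.CSS_SELECTOR` and `By.XPATH` constants.

```python
# Strategy strings, equal to By.NAME, By.CSS_SELECTOR and By.XPATH in selenium
BY_NAME, BY_CSS, BY_XPATH = "name", "css selector", "xpath"

# Three ways to address the same (assumed) <input name="fieldSQqsh"> element
ROOM_LOCATORS = [
    (BY_NAME, "fieldSQqsh"),
    (BY_CSS, "input[name='fieldSQqsh']"),
    (BY_XPATH, "//input[@name='fieldSQqsh']"),
]


def find_first(driver, locators):
    """Return the element from the first locator that matches.

    `driver` is any object with a find_element(by, value) method,
    such as a selenium WebDriver.
    """
    last_error = None
    for by, value in locators:
        try:
            return driver.find_element(by, value)
        except Exception as exc:  # this strategy failed, try the next one
            last_error = exc
    raise LookupError("none of the locators matched") from last_error
```

Calling `find_first(self.driver, ROOM_LOCATORS)` would return the room-number field through whichever strategy works first; in practice one reliable locator per element is enough, and the fallback list mostly helps while debugging.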

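II. Running the program automatically on Windows 10

I use Windows 10's background scheduling to run the script automatically, following other people's tutorials:
1) "How to set up a scheduled task on Windows 10 to run a python script automatically", by JJLiu天姿
This tutorial covers most of the procedure, but it is a little thin on where the paths go at the end; 2) fills that gap.
2) "Creating a scheduled task on Windows to run a Python script", by CodingDang

With that in place the report is filled in automatically by the system, a must-have for the lazy...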
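The tutorials above go through the Task Scheduler GUI. As an alternative sketch, the same daily task can probably be registered from Python through the Windows `schtasks` command. The task name, script path and start time below are placeholders I made up, and `build_schtasks_command` and `register_task` are hypothetical helper names.

```python
import subprocess
import sys


def build_schtasks_command(task_name, script_path, start_time="08:00",
                           python_exe=sys.executable):
    """Build a `schtasks /Create` command that runs a script once a day."""
    # Quote both paths so that spaces in them survive
    action = '"%s" "%s"' % (python_exe, script_path)
    return [
        "schtasks", "/Create",
        "/SC", "DAILY",      # schedule type: every day
        "/TN", task_name,    # task name shown in Task Scheduler
        "/TR", action,       # command the task runs
        "/ST", start_time,   # start time, HH:MM in 24-hour format
        "/F",                # overwrite an existing task with the same name
    ]


def register_task(task_name, script_path, start_time="08:00"):
    """Create the task; only meaningful on Windows."""
    command = build_schtasks_command(task_name, script_path, start_time)
    subprocess.run(command, check=True)
```

For example, `register_task("campus_check_in", r"C:\path\to\check_in.py", "07:30")` would ask Windows to run the script every day at 07:30; replace the placeholder path with the location of your own script.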
