[Python crawler] UnicodeEncodeError: 'gbk' codec can't encode character '\u2022' when scraping with Selenium

Posted by 獨愛Python on 2020-11-27

While scraping data from Lagou (拉勾網) with Selenium today, I ran into the following error:

UnicodeEncodeError: 'gbk' codec can't encode character '\u2022' in position 131907: illegal multibyte sequence

Solution:

import io
import sys
sys.stdout = io.TextIOWrapper(sys.stdout.buffer,encoding='utf-8')

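After some reading, I found the cause: print() cannot output every Unicode character. It writes to sys.stdout, which encodes text with the console's encoding, and gbk has no byte sequence for '\u2022' (the bullet character).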
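The failure can be reproduced without Selenium or a particular console, since it comes down to whether a codec can represent the character. The snippet below is a minimal sketch; the helper name try_encode is made up for illustration and is not part of the original script.

```python
import sys

bullet = "\u2022"  # the character named in the error message

# The encoding print() uses for this process's standard output.
print("stdout encoding:", sys.stdout.encoding)


def try_encode(text, encoding):
    """Return the encoded bytes, or None if the codec cannot represent text.

    Hypothetical helper, used only for this demonstration.
    """
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


gbk_result = try_encode(bullet, "gbk")     # gbk has no mapping for U+2022
utf8_result = try_encode(bullet, "utf-8")  # UTF-8 can encode any character

print("gbk:", gbk_result)
print("utf-8:", utf8_result)
```

If the first line reports gbk (or cp936), any print() of text containing such a character will raise the same UnicodeEncodeError.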

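In fact, the limitation of print() is the limitation of Python's default output encoding. My system is Windows 7, where that default is not 'utf-8', so changing the encoding of standard output to 'utf-8' fixes the error: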

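import io
import sys
import requests  # third-party package: pip install requests

# Change the encoding of standard output to UTF-8
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf8')

url = "https://www.baidu.com/"
# headers must be a dict, not a tuple
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:83.0) Gecko/20100101 Firefox/83.0"}
response = requests.get(url, headers=headers)
print(response.text)  # print the page body; print(response) would only show the status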

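You may find that this script prints garbled text when run in cmd, yet works fine in IDLE.

The reason is that cmd does not handle utf8 well: on a Simplified Chinese system its default code page is 936 (gbk), so UTF-8 bytes are displayed as the wrong characters. IDLE does handle it; when running there you do not even need to change the encoding of standard output, because it already defaults to utf8. If you must run the script in cmd, switch to an encoding the console understands. For example, after I changed it to "gb18030", the output displayed correctly: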

sys.stdout = io.TextIOWrapper(sys.stdout.buffer,encoding='gb18030')
# Change the encoding of standard output
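A possible variation, not taken from the original post: io.TextIOWrapper also accepts an errors argument, so characters the chosen codec cannot represent are replaced with '?' instead of raising. The sketch below assumes that losing those characters in console output is acceptable; wrap_for_console is a hypothetical helper name, and the demonstration writes to an in-memory buffer rather than the real console.

```python
import io


def wrap_for_console(binary_stream, encoding="gb18030", errors="replace"):
    """Wrap a binary stream as text; unencodable characters become '?'.

    Hypothetical helper for illustration only.
    """
    return io.TextIOWrapper(binary_stream, encoding=encoding, errors=errors,
                            line_buffering=True)


# Demonstration against an in-memory buffer standing in for the console.
fake_console = io.BytesIO()
out = wrap_for_console(fake_console, encoding="gbk")
print("bullet: \u2022", file=out)  # gbk cannot encode U+2022, so it is replaced
out.flush()
captured = fake_console.getvalue()

# In a real script (assumption: sys.stdout exposes .buffer, as it does in cmd):
# import sys
# sys.stdout = wrap_for_console(sys.stdout.buffer, encoding="gbk")
```

With encoding="gb18030" no replacement should be needed, since that codec covers all of Unicode; errors="replace" mainly matters when the console forces a narrower codec such as gbk.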

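Here are the names of some common encodings related to Chinese. Assign each one to encoding in turn to see the different effects:

Encoding name	Usage
utf8	All languages
gbk	Simplified Chinese
gb2312	Simplified Chinese
gb18030	Simplified Chinese
big5	Traditional Chinese
big5hkscs	Traditional Chinese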
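Rather than editing the script once per encoding, the names in the table can be checked in a loop. This is a rough sketch: the function name encodable_with is an assumption made for illustration, and it only tests whether Python's codec can encode the text, not whether a given console will display the result correctly.

```python
# Codec names taken from the table above.
ENCODINGS = ["utf8", "gbk", "gb2312", "gb18030", "big5", "big5hkscs"]


def encodable_with(text, encodings=ENCODINGS):
    """Map each codec name to True if it can encode text, else False.

    Hypothetical helper for illustration only.
    """
    result = {}
    for name in encodings:
        try:
            text.encode(name)
            result[name] = True
        except UnicodeEncodeError:
            result[name] = False
    return result


report = encodable_with("\u2022")  # the character from the error message
for name, ok in report.items():
    print(f"{name:10} {'ok' if ok else 'cannot encode'}")
```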

Reference: https://blog.csdn.net/jim7424994/article/details/22675759
