The bytes/str dichotomy in Python 3

Posted by veldts on 2012-02-26

Original article: The bytes/str dichotomy in Python 3

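Arguably the most important new feature of Python 3 is a much cleaner separation between text and binary data. Text is always Unicode and is represented by the str type, while binary data is represented by the bytes type. What makes the separation particularly clean is that Python 3 never mixes str and bytes implicitly. You cannot concatenate a string and a bytes object, you cannot search for a string inside bytes (or vice versa), and you cannot pass a string to a function that expects bytes (or vice versa). This is a good thing.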
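The three restrictions above can be checked directly. The following is a minimal sketch, not taken from the original article; the variable names (text, data, errors, combined) are illustrative only.

```python
text = "price: "
data = b"\xe2\x82\xac20"  # UTF-8 bytes for '€20'

errors = []

# Concatenating str and bytes raises TypeError.
try:
    text + data
except TypeError as exc:
    errors.append(("concat", type(exc).__name__))

# Searching for a str inside bytes raises TypeError as well.
try:
    "20" in data
except TypeError as exc:
    errors.append(("search", type(exc).__name__))

# Passing a str to a bytes method that expects bytes raises TypeError.
try:
    data.startswith("20")
except TypeError as exc:
    errors.append(("argument", type(exc).__name__))

# The explicit route: decode the bytes first, then work with text only.
combined = text + data.decode("utf-8")

print(errors)
print(combined)
```

In each case the fix is the same: convert explicitly at the boundary, then stay on one side of it.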

Either way, the boundary between strings and bytes is unavoidable, and the diagram below is important enough to keep firmly in mind:

[Diagram: str is turned into bytes by encode; bytes is turned into str by decode]

Strings can be encoded to bytes, and bytes can be decoded back to strings.

>>> '€20'.encode('utf-8')
b'\xe2\x82\xac20'
>>> b'\xe2\x82\xac20'.decode('utf-8')
'€20'

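Think of it this way: a string is an abstract representation of text. A string consists of characters, which are abstract entities not tied to any particular binary representation. When manipulating strings, we live in blissful ignorance. We can split and slice them, concatenate them and search inside them. We do not care how they are represented internally or how many bytes each character takes to store. We only start to care when encoding strings to bytes (for example, to send them over a communication channel) or decoding strings from bytes (the reverse operation).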
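A small sketch may make the difference between characters and bytes concrete. It is an illustration added here, not part of the original article, and the variable names are arbitrary. It reuses the two encodings that appear in this article.

```python
text = "\u20ac20"  # the string '€20'

# At the text level we count and slice characters.
char_count = len(text)       # 3 characters
first_char = text[0]         # '€'
rest = text[1:]              # '20'
text_pos = text.find("20")   # index 1, counted in characters

# Once encoded, we count and index bytes instead.
utf8 = text.encode("utf-8")
latin9 = text.encode("iso-8859-15")
utf8_count = len(utf8)       # 5 bytes: '€' takes three bytes in UTF-8
latin9_count = len(latin9)   # 3 bytes: '€' takes one byte in ISO-8859-15
first_byte = utf8[0]         # an int (0xe2), not a one-character string
byte_pos = utf8.find(b"20")  # index 3, counted in bytes

print(char_count, utf8_count, latin9_count)
print(text_pos, byte_pos)
```

The same three-character string occupies a different number of bytes depending on the encoding, which is exactly the detail that string operations let us ignore.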

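The argument passed to encode and decode is the encoding (or codec). An encoding is a way of representing abstract characters as binary data. There are many encodings. UTF-8, shown above, is one of them; here is another: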

>>> '€20'.encode('iso-8859-15')
b'\xa420'
>>> b'\xa420'.decode('iso-8859-15')
'€20'

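The encoding is a crucial part of this conversion. Without it, the bytes object b'\xa420' is just a bunch of bits. The encoding gives it meaning. With a different encoding, the same bits take on a very different meaning: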

>>> b'\xa420'.decode('windows-1255')
'₪20'

It is said that 80% of all lost money is down to using the wrong encoding, so be careful.
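As a rough illustration of what can go wrong, the sketch below decodes the same bytes with several codecs. The list of candidate codecs and the results dictionary are choices made for this example, not something from the original article. Some wrong codecs fail loudly with UnicodeDecodeError, while others quietly return different text.

```python
raw = b"\xa420"

candidates = ["iso-8859-15", "windows-1255", "latin-1", "utf-8"]
results = {}
for name in candidates:
    try:
        results[name] = raw.decode(name)
    except UnicodeDecodeError as exc:
        # 0xa4 is not a valid first byte of a UTF-8 sequence,
        # so the utf-8 attempt ends up here.
        results[name] = "cannot decode: " + exc.reason

for name, value in results.items():
    print(name, "->", value)
```

The silent cases are the dangerous ones: iso-8859-15, windows-1255 and latin-1 all accept these bytes, yet each yields a different currency symbol in front of the 20.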

Further reading

Unicode for dummies: Encoding
