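Understanding Python Source Code: The Difference Between '+=' and 'xx = xx + xx'

Published 2017-08-07

Original URL: http://python.jobbole.com/88282/

Python source code

Appetizer

When writing Python, we use the + operator all the time. For example: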

a = 1 + 2
print a
# Output
3

It is not limited to numeric addition. It plays an equally important role in string concatenation:

a = 'abc' + 'efg'
print a
# Output
abcefg

Likewise, it works on lists:

a = [1, 2, 3] + [4, 5, 6]
print a
# Output
[1, 2, 3, 4, 5, 6]

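Why does the same + behave differently for different kinds of objects? That comes down to overloading of the + operator. It is not the focus of this article, though. Everything above was just the appetizer.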
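As a brief aside on the overloading mentioned above, here is a minimal sketch of how a class can define its own behavior for +. The Money class and its cents field are hypothetical names invented for illustration, and the code is written to run on Python 3 rather than the Python 2 used in the article's snippets.

```python
class Money(object):
    """Hypothetical value type used only to illustrate + overloading."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Called for `self + other`; returns a brand-new object.
        if not isinstance(other, Money):
            return NotImplemented
        return Money(self.cents + other.cents)


total = Money(150) + Money(250)
print(total.cents)
```

Each built-in type (int, str, list) provides its own version of this hook, which is why one operator can mean addition, concatenation, or list joining.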

Main Course

Let's start with an example:

num = 123
num = num + 4
print num
# Output
127

The purpose of this code is obvious: it simply adds two numbers. But it feels verbose and not Pythonic at all, so we write this instead:

num = 123
num += 4
print num
# Output
127

Much more Pythonic! But is this form always the better choice? Not necessarily. Consider this example:

# coding: utf8
l = [1, 2]
l = l + [3, 4]
print l
# Output
[1, 2, 3, 4]

# ------------------------------------------
l = [1, 2]
l += [3, 4]  # += is overloaded for lists; the right operand must be iterable, otherwise a TypeError is raised
print l
# Output
[1, 2, 3, 4]

The results look identical, but are they really? Let's change the code and look again:

# coding: utf8
l = [1, 2]
print 'id of l before: ', id(l)
l = l + [3, 4]
print 'id of l after: ', id(l)
# Output
id of l before: 40270024
id of l after: 40389000

# ------------------------------------------
l = [1, 2]
print 'id of l before: ', id(l)
l += [3, 4]  # += is overloaded for lists; the right operand must be iterable, otherwise a TypeError is raised
print 'id of l after: ', id(l)
# Output
id of l before: 40270024
id of l after: 40270024

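See the difference? The final values are the same, but the id values show that with the first form the name refers to a different object after the operation, while with the second form it is still the same object. Why?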
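The practical consequence of this id difference shows up when two names refer to the same list. The sketch below is written for Python 3, with illustrative variable names. It keeps a second reference to the list and checks which form affects it.

```python
original = [1, 2]
alias = original                  # second name for the same list object

original = original + [3, 4]      # builds a new list and rebinds `original`
rebound_is_alias = original is alias
alias_after_plus = list(alias)

original = [1, 2]
alias = original

original += [3, 4]                # the list that `alias` refers to changes too
inplace_is_alias = original is alias
alias_after_iadd = list(alias)

print(rebound_is_alias, alias_after_plus)
print(inplace_is_alias, alias_after_iadd)
```

With `l = l + [...]` the alias keeps seeing the old contents, whereas with `l += [...]` every name bound to the list observes the change.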

Analyzing the Result

Let's look at the bytecode first:

[root@test1 ~]# cat 2.py 
# coding: utf8
l = [1, 2]
l = l + [3, 4]
print l

l = [1, 2]
l += [3, 4]  
print l
[root@test1 ~]# python -m dis 2.py 
  2           0 LOAD_CONST               0 (1)
              3 LOAD_CONST               1 (2)
              6 BUILD_LIST               2
              9 STORE_NAME               0 (l)

  3          12 LOAD_NAME                0 (l)
             15 LOAD_CONST               2 (3)
             18 LOAD_CONST               3 (4)
             21 BUILD_LIST               2
             24 BINARY_ADD          
             25 STORE_NAME               0 (l)

  4          28 LOAD_NAME                0 (l)
             31 PRINT_ITEM          
             32 PRINT_NEWLINE

  7          33 LOAD_CONST               0 (1)
             36 LOAD_CONST               1 (2)
             39 BUILD_LIST               2
             42 STORE_NAME               0 (l)

  8          45 LOAD_NAME                0 (l)
             48 LOAD_CONST               2 (3)
             51 LOAD_CONST               3 (4)
             54 BUILD_LIST               2
             57 INPLACE_ADD         
             58 STORE_NAME               0 (l)

  9          61 LOAD_NAME                0 (l)
             64 PRINT_ITEM          
             65 PRINT_NEWLINE       
             66 LOAD_CONST               4 (None)
             69 RETURN_VALUE

In the bytecode above, the two instructions to focus on are BINARY_ADD and INPLACE_ADD.

Clearly:
l = l + [3, 4]      compiles to BINARY_ADD
l += [3, 4]         compiles to INPLACE_ADD
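The same comparison can be reproduced from inside Python with the dis module. This is a sketch for Python 3, and opcode names vary by interpreter version: newer CPython releases fold both forms into a generic BINARY_OP whose argument distinguishes + from +=. The helper add_ops is a name made up here, and it only collects whichever add-related instructions are present.

```python
import dis


def add_ops(source):
    """Return (opname, argrepr) pairs for the add instructions in `source`."""
    code = compile(source, "<demo>", "exec")
    return [
        (ins.opname, ins.argrepr)
        for ins in dis.get_instructions(code)
        # Older CPython: BINARY_ADD / INPLACE_ADD; newer CPython: BINARY_OP.
        if ins.opname in ("BINARY_ADD", "INPLACE_ADD", "BINARY_OP")
    ]


plain = add_ops("l = [1, 2]\nl = l + [3, 4]")
inplace = add_ops("l = [1, 2]\nl += [3, 4]")
print(plain)
print(inplace)
```

Whatever the version, the two statements should produce different instructions, which is the point the article's Python 2 dump makes.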

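Digging Deeper

Although the two names look quite different, the two instructions behave very similarly, at least in their first part. To see why, look at the source: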

# From ceval.c
# BINARY_ADD
TARGET_NOARG(BINARY_ADD)
        {
            w = POP();
            v = TOP();
            if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {    // check whether both operands are int
                /* INLINE: int + int */
                register long a, b, i;
                a = PyInt_AS_LONG(v);
                b = PyInt_AS_LONG(w);
                /* cast to avoid undefined behaviour
                   on overflow */
                i = (long)((unsigned long)a + b);
                if ((i^a) < 0 && (i^b) < 0)
                    goto slow_add;
                x = PyInt_FromLong(i);
            }
            else if (PyString_CheckExact(v) &&
                     PyString_CheckExact(w)) {                   // check whether both operands are string
                x = string_concatenate(v, w, f, next_instr);
                /* string_concatenate consumed the ref to v */
                goto skip_decref_vx;
            }
            else {
              slow_add:                                          // neither case applies: take this path
                x = PyNumber_Add(v, w);
            }
           ... (omitted)

# INPLACE_ADD
TARGET_NOARG(INPLACE_ADD)
        {
            w = POP();
            v = TOP();
            if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {   // check whether both operands are int
                /* INLINE: int + int */
                register long a, b, i;
                a = PyInt_AS_LONG(v);
                b = PyInt_AS_LONG(w);
                i = a + b;
                if ((i^a) < 0 && (i^b) < 0)
                    goto slow_iadd;
                x = PyInt_FromLong(i);
            }
            else if (PyString_CheckExact(v) &&
                     PyString_CheckExact(w)) {                 // check whether both operands are string
                x = string_concatenate(v, w, f, next_instr);
                /* string_concatenate consumed the ref to v */
                goto skip_decref_v;
            }
            else {
              slow_iadd:                           
                x = PyNumber_InPlaceAdd(v, w);                 // neither case applies: take this path
            }
           ... (omitted)

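As the source shows, both BINARY_ADD and INPLACE_ADD begin with the same checks:

1. Check whether both operands are `int`. If so, return their sum directly.
2. Check whether both operands are `string`. If so, return the concatenated string directly.

Because the two behave so similarly, this article focuses on INPLACE_ADD. Readers interested in BINARY_ADD can search for PyNumber_Add in the source file abstract.c. In practice it differs only in lacking the in-place step for objects such as lists.

Let's continue. First, the source: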

PyObject *
PyNumber_InPlaceAdd(PyObject *v, PyObject *w)
{
    PyObject *result = binary_iop1(v, w, NB_SLOT(nb_inplace_add),
                                   NB_SLOT(nb_add));
    if (result == Py_NotImplemented) {
        PySequenceMethods *m = v->ob_type->tp_as_sequence;
        Py_DECREF(result);
        if (m != NULL) {
            binaryfunc f = NULL;
            if (HASINPLACE(v))
                f = m->sq_inplace_concat;
            if (f == NULL)
                f = m->sq_concat;
            if (f != NULL)
                return (*f)(v, w);
        }
        result = binop_type_error(v, w, "+=");
    }
    return result;
}

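INPLACE_ADD essentially maps to the PyNumber_InPlaceAdd function in abstract.c. This function first calls binary_iop1, which in turn calls binary_op1. Most of the code in these two functions deals with ob_type->tp_as_number, and since our object is a list, most of it does not apply to us. Because it does not apply, the calls end up returning Py_NotImplemented. What is that for? It matters a great deal: it is the key to list addition.

Because binary_iop1 returns Py_NotImplemented, the condition that follows is true, and the function goes on to look up the ob_type->tp_as_sequence field of the object (l in the demo code).

Since our object l is a list, we need to look in PyList_Type for the answer: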

# From listobject.c
PyTypeObject PyList_Type = {
    ... (omitted)
    &list_as_sequence,                          /* tp_as_sequence */
    ... (omitted)
};

As you can see, it simply takes list_as_sequence. And what is that? It is a struct of function pointers that holds part of the list's functionality.

static PySequenceMethods list_as_sequence = {
    (lenfunc)list_length,                       /* sq_length */
    (binaryfunc)list_concat,                    /* sq_concat */
    (ssizeargfunc)list_repeat,                  /* sq_repeat */
    (ssizeargfunc)list_item,                    /* sq_item */
    (ssizessizeargfunc)list_slice,              /* sq_slice */
    (ssizeobjargproc)list_ass_item,             /* sq_ass_item */
    (ssizessizeobjargproc)list_ass_slice,       /* sq_ass_slice */
    (objobjproc)list_contains,                  /* sq_contains */
    (binaryfunc)list_inplace_concat,            /* sq_inplace_concat */
    (ssizeargfunc)list_inplace_repeat,          /* sq_inplace_repeat */
};

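Next comes a check of whether our object l has the Py_TPFLAGS_HAVE_INPLACEOPS flag. It clearly does, so the sq_inplace_concat function from the struct obtained in the previous step is called. What next? Naturally, let's see what that function does: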

static PyObject *
list_inplace_concat(PyListObject *self, PyObject *other)
{
    PyObject *result;

    result = listextend(self, other);   /* the key step */
    if (result == NULL)
        return result;
    Py_DECREF(result);
    Py_INCREF(self);
    return (PyObject *)self;
}

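At last we have found the key: in the end it just calls listextend, which is the C counterpart of the list's extend method at the Python level. We won't go into its details here.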
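At the Python level, the effect of list_inplace_concat can be sketched roughly as follows. The function name inplace_concat is a stand-in chosen here, not an actual API. The point is that the list is extended and the same object is handed back. The sketch, written for Python 3, also shows a visible side effect of this design: += accepts any iterable, while + on a list requires another list.

```python
def inplace_concat(self, other):
    """Rough Python model of list_inplace_concat: extend, then return self."""
    self.extend(other)   # mirrors the listextend(self, other) call
    return self          # the same object is handed back to the caller


l = [1, 2]
result = inplace_concat(l, [3, 4])
print(result is l, l)

# += takes any iterable because it goes through extend ...
chars = [1, 2]
chars += "ab"
print(chars)

# ... whereas list + str is rejected with a TypeError.
try:
    [1, 2] + "ab"
    plus_failed = False
except TypeError:
    plus_failed = True
print(plus_failed)
```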

The call flow behind INPLACE_ADD and PyNumber_InPlaceAdd can be summarized as:

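INPLACE_ADD (bytecode)
    -> both operands are int: return the sum directly
    -> both operands are string: return the result of `string_concatenate`
    -> neither: PyNumber_InPlaceAdd
        -> binary_iop1 (handles numeric types; otherwise returns Py_NotImplemented)
            -> binary_op1 (handles numeric types; otherwise returns Py_NotImplemented)
        -> is the result Py_NotImplemented?
            -> yes:
                -> does the object have Py_TPFLAGS_HAVE_INPLACEOPS and sq_inplace_concat?
                    -> yes: call the object's sq_inplace_concat
                    -> no: call the object's sq_concat
                    -> neither function exists: raise a TypeError
            -> no: return the result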
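The fallback order in this flow can be approximated in pure Python. This is only a sketch as seen from Python code, using the special methods __iadd__ and __add__ in place of the C slots. The function name inplace_add is made up for illustration, and details such as reflected addition (__radd__) and the int/string fast paths are deliberately left out. It is written for Python 3.

```python
def inplace_add(v, w):
    """Simplified model of `v += w`: try in-place first, then plain add."""
    iadd = getattr(type(v), "__iadd__", None)
    if iadd is not None:
        result = iadd(v, w)
        if result is not NotImplemented:
            return result            # in-place path (e.g. list)
    add = getattr(type(v), "__add__", None)
    if add is not None:
        result = add(v, w)
        if result is not NotImplemented:
            return result            # fallback: behaves like plain +
    raise TypeError("unsupported operand type(s) for +=")


lst = [1, 2]
same = inplace_add(lst, [3, 4])      # list defines __iadd__
tup = (1, 2)
new = inplace_add(tup, (3, 4))       # tuple only defines __add__
print(same is lst, same)
print(new is tup, new)
```

The list comes back as the same object, while the tuple, which has no in-place hook, yields a new object just as + would.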

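So in the results above, the id did not change for the second form, l += [3, 4], because += goes through sq_inplace_concat to call the list's listextend function, which appends the new elements to the existing list instead of building a new one.

Conclusion

Now we have a rough idea of what += really does: it can be seen as an enhanced +, because on top of what + does it can write the result back into the object itself. Whether that write-back happens depends on whether the object supports it, that is, whether it has the Py_TPFLAGS_HAVE_INPLACEOPS flag and provides sq_inplace_concat. Only then does it operate in place. Otherwise it behaves just like +.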
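To see this conclusion from the Python side, here is a small sketch with two hypothetical classes invented for this example: one defines only __add__, the other also defines __iadd__. It is written for Python 3.

```python
class PlusOnly(object):
    """Supports + but has no in-place hook."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return PlusOnly(self.items + list(other))


class WithInPlace(PlusOnly):
    """Adds an in-place hook that mutates and returns the same object."""

    def __iadd__(self, other):
        self.items.extend(other)
        return self


a = PlusOnly([1, 2])
a_before = a
a += [3, 4]          # falls back to __add__: `a` now names a new object

b = WithInPlace([1, 2])
b_before = b
b += [3, 4]          # uses __iadd__: `b` is still the same object

print(a is a_before, a.items, a_before.items)
print(b is b_before, b.items)
```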
