Building a bytecode tracer to understand Python's execution from the inside

Posted by jasper on 2015-06-22

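I have recently been studying Python's execution model. I am curious about Python's internals, for example: how opcodes such as YIELD_VALUE and YIELD_FROM are implemented; how list comprehensions, generator expressions and other interesting Python features are compiled; and what happens at the bytecode level when an exception is raised.

Reading the CPython source is very instructive, but I felt that reading code alone was far from enough to fully understand how bytecode executes and how the stack changes. GDB is a good option, but I am lazy and only want to write Python code against a high-level interface.

So I set out to build a bytecode-level tracing API, similar to what sys.settrace provides but with finer granularity. The exercise was also excellent practice at carrying ideas from C over to Python. We need the following:

A new opcode in the CPython interpreter
A way to inject the opcode into Python bytecode
Some Python code to handle the opcode on the Python side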
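
For comparison, here is a minimal sketch of the line-level granularity that sys.settrace already offers. The function add and the events list are illustrative names chosen for this example and do not come from the original post.

```python
import sys

events = []  # (event name, line offset inside add) pairs, filled by the tracer

def tracer(frame, event, arg):
    # Only record events for the function we care about.
    if frame.f_code.co_name == "add":
        events.append((event, frame.f_lineno - frame.f_code.co_firstlineno))
    return tracer  # keep tracing inside this frame

def add(a, b):
    total = a + b
    return total

previous = sys.gettrace()
sys.settrace(tracer)
try:
    add(1, 2)
finally:
    sys.settrace(previous)  # restore whatever tracer was installed before

print(events)
```

The tracer is called once per call, line and return, but never per opcode, which is the gap the rest of this article tries to fill.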

Note: this article uses Python 3.5.

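A new CPython opcode

Our new opcode: DEBUG_OP

This new opcode, DEBUG_OP, is my first attempt at writing C code inside CPython, so I will try to keep it as simple as possible.

The goal is to have a way to call some Python code every time our opcode executes, while also retrieving some data about the execution context. The opcode passes this information to our callback function as arguments. The useful information I identified is:

The contents of the stack
The frame object that executed DEBUG_OP

So all DEBUG_OP has to do is:

Find the callback function
Build a list with the contents of the stack
Call the callback function, passing the stack list and the current frame as arguments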
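
The three steps above can be modelled in pure Python before touching any C. This is only a sketch: run_debug_op and its parameters are hypothetical names, the frame is faked with a string, and skipping silently when no callback exists is an assumption of this model.

```python
def run_debug_op(frame_globals, stack_snapshot, frame):
    """Model of DEBUG_OP: look up the callback by its fixed name and call it."""
    target = frame_globals.get("op_target")
    if target is None:
        # Assumption: with no callback defined, the opcode does nothing.
        return False
    target(stack_snapshot, frame)
    return True

calls = []

def op_target(stack, frame):
    # The callback simply records what it was given.
    calls.append((stack, frame))

handled = run_debug_op({"op_target": op_target}, [10, range], "fake-frame")
skipped = run_debug_op({}, [], "fake-frame")
print(handled, skipped, calls)
```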

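Sounds simple, so let's get started!

Disclaimer: the explanations and code below are the result of a lot of segfaults.

The first thing to do is to give our opcode a name and a value, so we add the following to Include/opcode.h: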

/** My own comments begin by '**' **/
/** From: Include/opcode.h **/

/* Instruction opcodes for compiled code */

/** We just have to define our opcode with a free value
    0 was the first one I found **/
#define DEBUG_OP                0

#define POP_TOP                 1
#define ROT_TWO                 2
#define ROT_THREE               3

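That was the easy part. Now we have to actually write the opcode.

Implementing DEBUG_OP

Before thinking about implementing DEBUG_OP, the first question to ask is: "What should my interface look like?"

Having a new opcode that can call other code is cool, but which code will it actually call? How does the opcode find the callback function? I chose what looked like the simplest solution: a hard-coded function name looked up in the frame's globals.

The question then becomes: "How do I look up a constant C string in a dict?"

To answer it, we can look at identifiers already used in Python's main loop, namely __enter__ and __exit__, which are related to context managers.

We can see that these identifiers are used in the SETUP_WITH opcode.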

/** From: Python/ceval.c **/
TARGET(SETUP_WITH) {
_Py_IDENTIFIER(__exit__);
_Py_IDENTIFIER(__enter__);
PyObject *mgr = TOP();
PyObject *exit = special_lookup(mgr, &PyId___exit__), *enter;
PyObject *res;

Now, let's look at the definition of the _Py_IDENTIFIER macro:

/** From: Include/object.h **/

/********************* String Literals ****************************************/
/* This structure helps managing static strings. The basic usage goes like this:
   Instead of doing

       r = PyObject_CallMethod(o, "foo", "args", ...);

   do

       _Py_IDENTIFIER(foo);
       ...
       r = _PyObject_CallMethodId(o, &PyId_foo, "args", ...);

   PyId_foo is a static variable, either on block level or file level. On first
   usage, the string "foo" is interned, and the structures are linked. On interpreter
   shutdown, all strings are released (through _PyUnicode_ClearStaticStrings).

   Alternatively, _Py_static_string allows to choose the variable name.
   _PyUnicode_FromId returns a borrowed reference to the interned string.
   _PyObject_{Get,Set,Has}AttrId are __getattr__ versions using _Py_Identifier*.
*/
typedef struct _Py_Identifier {
    struct _Py_Identifier *next;
    const char* string;
    PyObject *object;
} _Py_Identifier;

#define _Py_static_string_init(value) { 0, value, 0 }
#define _Py_static_string(varname, value)  static _Py_Identifier varname = _Py_static_string_init(value)
#define _Py_IDENTIFIER(varname) _Py_static_string(PyId_##varname, #varname)
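
As a rough Python analogy of what the comment above describes (a static structure whose string is interned on first use), the following sketch may help. StaticIdentifier and dict_get_item_id are hypothetical names made up for this illustration. They are not CPython APIs.

```python
import sys

class StaticIdentifier:
    """Hypothetical Python analogue of the _Py_Identifier struct."""

    def __init__(self, string):
        self.string = string   # the C string literal
        self.object = None     # interned str object, created lazily

    def get(self):
        # The string is interned once, on first use, then reused.
        if self.object is None:
            self.object = sys.intern(self.string)
        return self.object

# Plays the role of: _Py_IDENTIFIER(op_target);
PyId_op_target = StaticIdentifier("op_target")

def dict_get_item_id(mapping, identifier):
    """Look up a dict entry through a static identifier; None if missing."""
    return mapping.get(identifier.get())

print(dict_get_item_id({"op_target": print}, PyId_op_target))
```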

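Good, at least the comments are clear. After some searching, we find _PyDict_GetItemId, a function that looks up such a static string in a dict, so the lookup part of our opcode looks like this: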

/** Our callback function will be named op_target **/
PyObject *target = NULL;
_Py_IDENTIFIER(op_target);
target = _PyDict_GetItemId(f->f_globals, &PyId_op_target);
if (target == NULL && _PyErr_OCCURRED()) {
    if (!PyErr_ExceptionMatches(PyExc_KeyError))
        goto error;
    PyErr_Clear();
    DISPATCH();
}

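To make this easier to follow, here is what the code does:

f is the current frame, and f->f_globals is its globals
If we did not find op_target, we need to check whether the exception is a KeyError
goto error; is the way to raise an exception in the main loop
PyErr_Clear() suppresses the current exception, and DISPATCH() starts executing the next opcode

The next step is to collect the stack information we want.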

/** This code creates a list with all the values on the current stack **/
PyObject *value = PyList_New(0);
for (i = 1 ; i <= STACK_LEVEL(); i++) {
    tmp = PEEK(i);
    if (tmp == NULL) {
        tmp = Py_None;
    }
    PyList_Append(value, tmp);
}
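
To make the ordering of that list concrete, here is a small Python model of the loop. It assumes a plain list whose last element is the top of the stack. NULL is a hypothetical sentinel standing in for an empty C slot.

```python
NULL = object()  # hypothetical stand-in for a NULL slot on the C stack

def snapshot_stack(value_stack):
    """Return the stack contents top-first, like PEEK(1) .. PEEK(STACK_LEVEL())."""
    snapshot = []
    for i in range(1, len(value_stack) + 1):
        item = value_stack[-i]  # PEEK(i): the i-th item from the top
        snapshot.append(None if item is NULL else item)
    return snapshot

# range was pushed first, then 10, so 10 is on top.
print(snapshot_stack([range, 10]))
```

The top of the stack therefore comes first in the list handed to the callback.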

The last step is to call the callback function. We need call_function for that, and we can learn how to use it by studying the CALL_FUNCTION opcode.

/** From: Python/ceval.c **/
TARGET(CALL_FUNCTION) {
    PyObject **sp, *res;
    /** stack_pointer is a local of the main loop.
        It's the pointer to the stacktop of our frame **/
    sp = stack_pointer;
    res = call_function(&sp, oparg);
    /** call_function handles the args it consumed on the stack for us **/
    stack_pointer = sp;
    PUSH(res);
    /** Standard exception handling **/
    if (res == NULL)
        goto error;
    DISPATCH();
}

With all this information, we can now write DEBUG_OP:

TARGET(DEBUG_OP) {
    PyObject *value = NULL;
    PyObject *target = NULL;
    PyObject *res = NULL;
    PyObject **sp = NULL;
    PyObject *tmp;
    int i;
    _Py_IDENTIFIER(op_target);

    target = _PyDict_GetItemId(f->f_globals, &PyId_op_target);
    if (target == NULL && _PyErr_OCCURRED()) {
        if (!PyErr_ExceptionMatches(PyExc_KeyError))
            goto error;
        PyErr_Clear();
        DISPATCH();
    }
    value = PyList_New(0);
    Py_INCREF(target);
    for (i = 1 ; i <= STACK_LEVEL(); i++) {
        tmp = PEEK(i);
        if (tmp == NULL)
            tmp = Py_None;
        PyList_Append(value, tmp);
    }

    PUSH(target);
    PUSH(value);
    Py_INCREF(f);
    PUSH(f);
    sp = stack_pointer;
    res = call_function(&sp, 2);
    stack_pointer = sp;
    if (res == NULL)
        goto error;
    Py_DECREF(res);
    DISPATCH();
}
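
The calling convention used at the end of DEBUG_OP (push the callable, push its arguments, then call with the argument count) can be sketched in Python. This model is deliberately simplified: it handles positional arguments only, and the names are reused purely for illustration.

```python
def call_function(stack, nargs):
    """Pop nargs positional arguments and the callable below them, then call it."""
    args = [stack.pop() for _ in range(nargs)]
    args.reverse()  # the last pushed value is the last argument
    func = stack.pop()
    return func(*args)

def op_target(stack_list, frame):
    return ("called", stack_list, frame)

stack = ["existing item"]
stack.append(op_target)     # PUSH(target)
stack.append([])            # PUSH(value)
stack.append("fake-frame")  # PUSH(f)
result = call_function(stack, 2)
print(result, stack)
```

After the call, the three pushed values are gone and the rest of the stack is untouched, which is why DEBUG_OP can run between any two instructions without disturbing them.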

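I do not have much experience writing C code for CPython, so I may have missed something (your feedback is welcome).

It compiles! Done!

Everything looks fine, but when we try to execute DEBUG_OP, it fails. Since 2008, Python has used computed gotos for opcode dispatch, so we also need to update the goto jump table. The only change required is in Python/opcode_targets.h: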

/** From: Python/opcode_targets.h **/
/** Easy change since DEBUG_OP is the opcode number 0 **/
static void *opcode_targets[256] = {
    //&&_unknown_opcode,
    &&TARGET_DEBUG_OP,
    &&TARGET_POP_TOP,
    /** ... **/
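
The idea behind that table can be illustrated with a Python list of handlers indexed by opcode value. This is only an analogy under simplifying assumptions: the real table holds label addresses, and the handler functions below are placeholders.

```python
def unknown_opcode():
    return "unknown"

def debug_op():
    return "DEBUG_OP"

def pop_top():
    return "POP_TOP"

# One slot per possible opcode value, like opcode_targets[256].
opcode_targets = [unknown_opcode] * 256
opcode_targets[0] = debug_op  # the slot that used to hold _unknown_opcode
opcode_targets[1] = pop_top

def dispatch(opcode):
    # The opcode value is used directly as an index into the table.
    return opcode_targets[opcode]()

print(dispatch(0), dispatch(1), dispatch(200))
```

If slot 0 still pointed at the unknown-opcode handler, defining the opcode in the header alone would not be enough, which matches the failure described above.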

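Done. We now have a brand new, working opcode. The only problem is that it will never be called, because it does not appear in any compiled bytecode. We now need to inject DEBUG_OP into the bytecode of some functions.

Injecting DEBUG_OP into Python bytecode

There are several ways to insert a new opcode into Python bytecode:

We can use the peephole optimizer, as Quarkslab did
We can change how the bytecode is generated
We can simply modify the bytecode of some functions at runtime (this is what we are going to do)

The C code above is all we need for the new opcode, so let's go back to where we started: understanding strange and wonderful Python!

So here is what we are going to do:

Get the code object we want to trace
Rewrite its bytecode to inject DEBUG_OP
Put the new code object back in place

A reminder about code objects

If you have never heard of code objects, there is a short introduction in my first article. There is also documentation available online; searching a page for "code objects" with Ctrl+F works well.

One thing to keep in mind for this article is that code objects are immutable: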

Python 3.4.2 (default, Oct  8 2014, 10:45:20)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> x = lambda y : 2
>>> x.__code__
<code object <lambda> at 0x7f481fd88390, file "<stdin>", line 1>
>>> x.__code__.co_name
'<lambda>'
>>> x.__code__.co_name = 'truc'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: readonly attribute
>>> x.__code__.co_consts = ('truc',)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: readonly attribute

But don't worry, we will find a way around this.
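
As a side note for readers on newer interpreters: Python 3.8 and later, which is newer than the version used in this article, provide a replace() method on code objects that builds a modified copy. The sketch below assumes such an interpreter and is not the approach taken in the rest of this article.

```python
x = lambda y: 2

# Direct assignment is still refused: code objects are read-only.
try:
    x.__code__.co_name = "truc"
    assigned = True
except AttributeError:
    assigned = False

# Build a modified copy instead, then swap it into the function.
new_code = x.__code__.replace(co_name="truc")  # requires Python 3.8+
x.__code__ = new_code

print(assigned, x.__code__.co_name, x(0))
```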

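Tools

To modify the bytecode, we need a few tools:

The dis module, to disassemble and analyze bytecode
dis.Bytecode, new in Python 3.4, which is particularly useful for disassembling and analyzing bytecode
A way to easily modify code objects

dis.Bytecode disassembles a code object and gives us useful information about each opcode, its argument and its context.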

# Python3.4
>>> import dis
>>> f = lambda x: x + 3
>>> for i in dis.Bytecode(f.__code__): print (i)
...
Instruction(opname='LOAD_FAST', opcode=124, arg=0, argval='x', argrepr='x', offset=0, starts_line=1, is_jump_target=False)
Instruction(opname='LOAD_CONST', opcode=100, arg=1, argval=3, argrepr='3', offset=3, starts_line=None, is_jump_target=False)
Instruction(opname='BINARY_ADD', opcode=23, arg=None, argval=None, argrepr='', offset=6, starts_line=None, is_jump_target=False)
Instruction(opname='RETURN_VALUE', opcode=83, arg=None, argval=None, argrepr='', offset=7, starts_line=None, is_jump_target=False)
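
For the injection we mostly care about three fields of each Instruction. The sketch below pulls them out. Opcode names and offsets differ between Python versions, so the exact rows printed here will not necessarily match the Python 3.4 output above.

```python
import dis

f = lambda x: x + 3

# Keep only the fields needed later: where the instruction sits,
# what it is, and the decoded value of its argument.
rows = [(instr.offset, instr.opname, instr.argval)
        for instr in dis.Bytecode(f.__code__)]
offsets = [offset for offset, _, _ in rows]

for row in rows:
    print(row)
```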

To be able to modify code objects, I created a class that copies a code object, lets us change the values we need, and then generates a new code object.

class MutableCodeObject(object):
    args_name = ("co_argcount", "co_kwonlyargcount", "co_nlocals", "co_stacksize", "co_flags", "co_code",
                  "co_consts", "co_names", "co_varnames", "co_filename", "co_name", "co_firstlineno",
                   "co_lnotab", "co_freevars", "co_cellvars")

    def __init__(self, initial_code):
        self.initial_code = initial_code
        for attr_name in self.args_name:
            attr = getattr(self.initial_code, attr_name)
            if isinstance(attr, tuple):
                attr = list(attr)
            setattr(self, attr_name, attr)

    def get_code(self):
        args = []
        for attr_name in self.args_name:
            attr = getattr(self, attr_name)
            if isinstance(attr, list):
                attr = tuple(attr)
            args.append(attr)
        return self.initial_code.__class__(*args)

It is easy to use, and it solves the immutability problem described above:

>>> x = lambda y : 2
>>> m = MutableCodeObject(x.__code__)
>>> m
<new_code.MutableCodeObject object at 0x7f3f0ea546a0>
>>> m.co_consts
[None, 2]
>>> m.co_consts[1] = '3'
>>> m.co_name = 'truc'
>>> m.get_code()
<code object truc at 0x7f3f0ea2bc90, file "<stdin>", line 1>

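Testing the new opcode

Now that we have the basic tools to inject DEBUG_OP, let's check that our implementation works.

We add the opcode to the simplest possible function: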

from new_code import MutableCodeObject

def op_target(*args):
    print("WOOT")
    print("op_target called with args <{0}>".format(args))

def nop():
    pass

new_nop_code = MutableCodeObject(nop.__code__)
new_nop_code.co_code = b"\x00" + new_nop_code.co_code[0:3] + b"\x00" + new_nop_code.co_code[-1:]
new_nop_code.co_stacksize += 3

nop.__code__ = new_nop_code.get_code()

import dis
dis.dis(nop)
nop()

# Don't forget that ./python is our custom Python implementing DEBUG_OP
hakril@computer ~/python/CPython3.5 % ./python proof.py
  8           0 <0>
              1 LOAD_CONST               0 (None)
              4 <0>
              5 RETURN_VALUE
WOOT
op_target called with args <([], <frame object at 0x7fde9eaebdb0>)>
WOOT
op_target called with args <([None], <frame object at 0x7fde9eaebdb0>)>

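It seems to work! One line needs some explanation, new_nop_code.co_stacksize += 3:

co_stacksize is the stack size required by the code object
DEBUG_OP pushes 3 values onto the stack, so we need to reserve room for them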

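We can now inject our opcode into every Python function!

Rewriting the bytecode

As the previous example shows, rewriting Python bytecode sounds easy! To inject DEBUG_OP between every opcode, we have to get the offset of every opcode (injecting our opcode into the middle of an argument would break things), then insert our opcode at each of those offsets. The offsets are easy to get with dis.Bytecode.

Like this: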

def add_debug_op_everywhere(code_obj):
    # We get every instruction offset in the code object
    offsets = [instr.offset for instr in dis.Bytecode(code_obj)]
    # And insert a DEBUG_OP at every offset
    return insert_op_debug_list(code_obj, offsets)

def insert_op_debug_list(code, offsets):
    # We insert the DEBUG_OP one by one
    for nb, off in enumerate(sorted(offsets)):
        # Need to adjust the offsets by the number of opcodes already inserted before
        # That's why we sort our offsets!
        code = insert_op_debug(code, off + nb)
    return code
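
The off + nb adjustment is easy to check on a plain byte string. In this sketch, insert_markers is a hypothetical helper that only splices bytes and ignores jumps entirely. The letters stand for instructions: an upper-case opcode byte followed by lower-case argument bytes.

```python
def insert_markers(code_bytes, offsets, marker=b"\x00"):
    """Insert one marker byte before each original offset."""
    out = bytearray(code_bytes)
    for nb, off in enumerate(sorted(offsets)):
        # nb markers were already inserted before this offset,
        # so the original position has shifted right by nb bytes.
        out[off + nb:off + nb] = marker
    return bytes(out)

# Four "instructions" at offsets 0, 3, 6 and 7, like the lambda disassembled earlier.
print(insert_markers(b"AaaBbbCD", [0, 3, 6, 7]))
```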

# Last problem: what does insert_op_debug look like?

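Based on the example above, one might think that insert_op_debug just adds a "\x00" at the given offset, but that is a TRAP! In our first DEBUG_OP injection, the target function had no branches. For insert_op_debug to work properly, we need to take branching opcodes into account.

Python has two kinds of branches:

Absolute branches, which look like Instruction_Pointer = argument(instruction)
Relative branches, which look like Instruction_Pointer += argument(instruction); relative branches always jump forward

We want these branches to keep working after we insert our opcode, so we need to modify some instruction arguments. Here is the logic I use:

For every relative branch located before the insertion offset:

If the destination is strictly greater than the insertion offset, add 1 to the instruction argument
If it is equal, there is nothing to add, so that DEBUG_OP is executed between the jump and its destination
If it is less, inserting DEBUG_OP does not change the distance between the jump and its destination

For every absolute branch in the code object:

If the destination is strictly greater than the insertion offset, add 1 to the instruction argument
If it is equal, no change is needed, for the same reason as for relative branches
If it is less, inserting DEBUG_OP does not move the destination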
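
The rules above can be condensed into two small functions. This is a sketch with hypothetical names (adjust_relative, adjust_absolute). It assumes the 3-byte jump instructions of the bytecode format used in this article, and it handles a single one-byte insertion.

```python
JUMP_SIZE = 3  # opcode byte + 2-byte argument in the bytecode format used here

def adjust_relative(jump_offset, arg, insert_offset):
    """New argument of a relative jump after one byte is inserted at insert_offset."""
    target = jump_offset + JUMP_SIZE + arg
    # Only jumps located before the insertion point whose destination
    # lies strictly after it get one byte longer.
    if jump_offset < insert_offset < target:
        return arg + 1
    return arg

def adjust_absolute(arg, insert_offset):
    """New argument of an absolute jump after one byte is inserted at insert_offset."""
    # The destination moves only if it lies strictly after the insertion point.
    if arg > insert_offset:
        return arg + 1
    return arg

# A relative jump at offset 0 with argument 36 (destination 39),
# with one byte inserted at offset 3:
print(adjust_relative(0, 36, 3))
# An absolute jump to 13, with one byte inserted exactly at 13:
print(adjust_absolute(13, 13))
```

When the insertion offset equals the destination, the argument is left alone, so the jump lands on the inserted DEBUG_OP and the callback runs before the destination instruction.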

Here is the implementation:

import dis
import struct

# Helper
def bytecode_to_string(bytecode):
    if bytecode.arg is not None:
        # One opcode byte followed by an unsigned 16-bit little-endian argument
        return struct.pack("<BH", bytecode.opcode, bytecode.arg)
    return struct.pack("<B", bytecode.opcode)

# Dummy class for bytecode_to_string
class DummyInstr:
    def __init__(self, opcode, arg):
        self.opcode = opcode
        self.arg = arg

def insert_op_debug(code, offset):
    opcode_jump_rel = ['FOR_ITER', 'JUMP_FORWARD', 'SETUP_LOOP', 'SETUP_WITH', 'SETUP_EXCEPT', 'SETUP_FINALLY']
    opcode_jump_abs = ['POP_JUMP_IF_TRUE', 'POP_JUMP_IF_FALSE', 'JUMP_ABSOLUTE']
    res_codestring = b""
    inserted = False
    for instr in dis.Bytecode(code):
        if instr.offset == offset:
            # Emit DEBUG_OP (opcode 0) just before the instruction at this offset
            res_codestring += b"\x00"
            inserted = True
        if instr.opname in opcode_jump_rel and not inserted: # relative jumps are always forward
            if offset < instr.offset + 3 + instr.arg: # inserted between jump and dest: add 1 to dest (3 for size)
                # If equal: jump on DEBUG_OP to get info before exec instr
                res_codestring += bytecode_to_string(DummyInstr(instr.opcode, instr.arg + 1))
                continue
        if instr.opname in opcode_jump_abs:
            if instr.arg > offset:
                res_codestring += bytecode_to_string(DummyInstr(instr.opcode, instr.arg + 1))
                continue
        res_codestring += bytecode_to_string(instr)
    # replace_bytecode just replaces the original code co_code
    return replace_bytecode(code, res_codestring)

Here is the result:

>>> def lol(x):
...     for i in range(10):
...         if x == i:
...             break

>>> dis.dis(lol)
101           0 SETUP_LOOP              36 (to 39)
              3 LOAD_GLOBAL              0 (range)
              6 LOAD_CONST               1 (10)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 GET_ITER
        >>   13 FOR_ITER                22 (to 38)
             16 STORE_FAST               1 (i)

102          19 LOAD_FAST                0 (x)
             22 LOAD_FAST                1 (i)
             25 COMPARE_OP               2 (==)
             28 POP_JUMP_IF_FALSE       13

103          31 BREAK_LOOP
             32 JUMP_ABSOLUTE           13
             35 JUMP_ABSOLUTE           13
        >>   38 POP_BLOCK
        >>   39 LOAD_CONST               0 (None)
             42 RETURN_VALUE
>>> lol.__code__ = transform_code(lol.__code__, add_debug_op_everywhere, add_stacksize=3)

>>> dis.dis(lol)
101           0 <0>
              1 SETUP_LOOP              50 (to 54)
              4 <0>
              5 LOAD_GLOBAL              0 (range)
              8 <0>
              9 LOAD_CONST               1 (10)
             12 <0>
             13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             16 <0>
             17 GET_ITER
        >>   18 <0>

102          19 FOR_ITER                30 (to 52)
             22 <0>
             23 STORE_FAST               1 (i)
             26 <0>
             27 LOAD_FAST                0 (x)
             30 <0>

103          31 LOAD_FAST                1 (i)
             34 <0>
             35 COMPARE_OP               2 (==)
             38 <0>
             39 POP_JUMP_IF_FALSE       18
             42 <0>
             43 BREAK_LOOP
             44 <0>
             45 JUMP_ABSOLUTE           18
             48 <0>
             49 JUMP_ABSOLUTE           18
        >>   52 <0>
             53 POP_BLOCK
        >>   54 <0>
             55 LOAD_CONST               0 (None)
             58 <0>
             59 RETURN_VALUE

# Setup the simplest handler EVER
>>> def op_target(stack, frame):
...     print (stack)

# GO
>>> lol(2)
[]
[]
[<class 'range'>]
[10, <class 'range'>]
[range(0, 10)]
[<range_iterator object at 0x7f1349afab80>]
[0, <range_iterator object at 0x7f1349afab80>]
[<range_iterator object at 0x7f1349afab80>]
[2, <range_iterator object at 0x7f1349afab80>]
[0, 2, <range_iterator object at 0x7f1349afab80>]
[False, <range_iterator object at 0x7f1349afab80>]
[<range_iterator object at 0x7f1349afab80>]
[1, <range_iterator object at 0x7f1349afab80>]
[<range_iterator object at 0x7f1349afab80>]
[2, <range_iterator object at 0x7f1349afab80>]
[1, 2, <range_iterator object at 0x7f1349afab80>]
[False, <range_iterator object at 0x7f1349afab80>]
[<range_iterator object at 0x7f1349afab80>]
[2, <range_iterator object at 0x7f1349afab80>]
[<range_iterator object at 0x7f1349afab80>]
[2, <range_iterator object at 0x7f1349afab80>]
[2, 2, <range_iterator object at 0x7f1349afab80>]
[True, <range_iterator object at 0x7f1349afab80>]
[<range_iterator object at 0x7f1349afab80>]
[]
[None]
