Python Secure Coding Guide

Published on 2015-11-14

Python

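0x00 Preface

This PDF analyzes Python's core libraries in depth and discusses the issues judged most critical over two years of security code review, then proposes solutions and mitigations. I added some notes of my own while verifying the findings, so please point out any mistakes.

The following figure illustrates their methodology:

[Figure not preserved: methodology diagram]

The scenarios examined are:

Input data of "unknown" type and size
Libraries built according to RFC specifications
Data processed without proper validation
Logic changed to be independent of the operating system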

0x01 Date and time —> time, datetime, os

time

asctime

#!python
import time 
initial_struct_time = [tm for tm in time.localtime()]       

# Example on how time object will cause an overflow 
# Same for: Year, Month, Day, minutes, seconds  
invalid_time = (2**63)      

# change 'Hours' to a value bigger than 32bit/64bit limit
initial_struct_time[3] = invalid_time       

overflow_time = time.asctime(initial_struct_time)

asctime() converts a tuple or struct_time representing a time into a string such as Sun Jun 20 23:21:05 1993; you can check this with time.asctime(time.localtime()). Setting any field of time.struct_time(tm_year=2015, tm_mon=11, tm_mday=7, tm_hour=20, tm_min=58, tm_sec=57, tm_wday=5, tm_yday=311, tm_isdst=0) to invalid_time triggers an overflow error.

Python 2.6.x raises OverflowError: long int too large to convert to int
Python 2.7.x raises one of:
- OverflowError: Python int too large to convert to C long
- OverflowError: signed integer is greater than maximum

I also tested this on 64-bit Ubuntu with Python 2.7.6; the output was:

[-] hour:
    [+] OverflowError begins at 31: signed integer is greater than maximum
    [+] OverflowError begins at 63: Python int too large to convert to C long
...
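
One way to avoid reaching the overflow at all is to range-check each field before calling time.asctime(). The following is a minimal Python 3 sketch; the helper name checked_asctime and the bounds table are assumptions made for illustration, not part of the original study.

```python
import time

# Assumed per-field bounds for the nine struct_time fields.
FIELD_BOUNDS = [
    ("tm_year", 1, 9999),
    ("tm_mon", 1, 12),
    ("tm_mday", 1, 31),
    ("tm_hour", 0, 23),
    ("tm_min", 0, 59),
    ("tm_sec", 0, 61),
    ("tm_wday", 0, 6),
    ("tm_yday", 1, 366),
    ("tm_isdst", -1, 1),
]


def checked_asctime(fields):
    """Validate every field, then hand the tuple to time.asctime()."""
    values = tuple(fields)
    if len(values) != len(FIELD_BOUNDS):
        raise ValueError("expected 9 time fields, got %d" % len(values))
    for (name, low, high), value in zip(FIELD_BOUNDS, values):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("%s must be an int" % name)
        if not low <= value <= high:
            raise ValueError("%s=%r is outside %d..%d" % (name, value, low, high))
    return time.asctime(values)
```

A caller would catch ValueError or TypeError from the helper instead of an interpreter-specific OverflowError.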

gmtime

#!python
import time 
print time.gmtime(-2**64)   
print time.gmtime(2**63)

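time.gmtime() converts a number of seconds into a struct_time and checks the value against the platform's time_t. Passing very large values, as in the code above, raises ValueError: timestamp out of range for platform time_t. For values between -2^63 and -2^56, or between 2^55 and 2^62, a different error is raised: ValueError: (84, 'Value too large to be stored in data type'). My own test produced the following output: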

[-] 2 power:
    [+] ValueError begins at 56: (75, 'Value too large for defined data type')
    [+] ValueError begins at 63: timestamp out of range for platform time_t
[-] -2 power:
    [+] ValueError begins at 56: (75, 'Value too large for defined data type')
    [+] ValueError begins at 64: timestamp out of range for platform time_t
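
Because the exception type and message vary by platform and version, a wrapper can apply an application-level bound first and normalize whatever the platform still raises. This is a Python 3 sketch; checked_gmtime, MIN_TS and MAX_TS are hypothetical names, and the bounds are an assumed policy.

```python
import time

MIN_TS = 0                # assumed policy: no pre-1970 timestamps
MAX_TS = 253402300799     # assumed policy: 9999-12-31 23:59:59 UTC


def checked_gmtime(seconds):
    """Bound-check a timestamp, then convert it with time.gmtime()."""
    if isinstance(seconds, bool) or not isinstance(seconds, (int, float)):
        raise TypeError("timestamp must be a number")
    # NaN fails both comparisons and is rejected here as well.
    if not MIN_TS <= seconds <= MAX_TS:
        raise ValueError("timestamp %r is outside the accepted range" % (seconds,))
    try:
        return time.gmtime(seconds)
    except (OverflowError, OSError, ValueError) as exc:
        # Normalize the platform-dependent failure into one exception type.
        raise ValueError("platform rejected timestamp %r" % (seconds,)) from exc
```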

os

#!python
import os   
TESTFILE = 'temp.bin'       

validtime = 2**55   
os.utime(TESTFILE,(-2147483648, validtime)) 
stinfo = os.stat(TESTFILE)  
print(stinfo)       

invalidtime = 2**63 
os.utime(TESTFILE,(-2147483648, invalidtime))   
stinfo = os.stat(TESTFILE)  
print(stinfo)

os.utime(path, times) sets the access and modification times of a file; the times are passed as an (atime, mtime) tuple. Setting the modification time too large in the code above also raises an error.

Python 2.6.x raises OverflowError: long int too large to convert to int
Python 2.7.x and Python 3.1 raise OverflowError: Python int too large to convert to C long

If we set the modification time to 2^55, ls and stat then show:

#!bash
$ ls -la temp.bin   
-rw-r--r-- 1 user01 user01 5 13 Jun 1141709097 temp.bin
$ stat temp.bin 
A:"Oct 10 16:31:45 2015"    
M:"Jun 13 01:26:08 1141709097"  
C:"Oct 10 16:31:42 2015"

On some operating systems, setting the value to 2^56 produces the following output (and risks crashing the system and losing data):

#!bash
$ ls -la temp.bin   
Segmentation fault: 11  
$ stat temp.bin 
A:"Oct 10 16:32:50 2015"    
M:"Dec 31 19:00:00 1969"    
C:"Oct 10 16:32:50 2015"

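Modules usually do not check or test for invalid input. For example, on a 64-bit operating system the largest representable value is 2^63-1, but using such values in different contexts produces different errors, and any number outside the valid bounds causes an overflow. Valid ranges therefore have to be checked explicitly.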
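
As an illustration of such a check for os.utime(), the Python 3 sketch below rejects timestamps outside an assumed range before touching the file. The name checked_utime and the MAX_FILE_TIME limit are assumptions for illustration; a real limit depends on the file systems you target.

```python
import os

MAX_FILE_TIME = 2 ** 31 - 1  # assumed policy: stay within signed 32-bit seconds


def checked_utime(path, atime, mtime):
    """Set file times only when both values are inside the accepted range."""
    for label, value in (("atime", atime), ("mtime", mtime)):
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("%s must be a number" % label)
        if not 0 <= value <= MAX_FILE_TIME:
            raise ValueError("%s=%r is outside the accepted range" % (label, value))
    os.utime(path, (atime, mtime))
```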

0x02 Numbers —> ctypes, xrange, len, decimal

ctypes

ctypes is Python's foreign function library and provides C-compatible data types; see the official documentation for details.

Test code:

#!python
import ctypes       

#32-bit test with max 32bit integer 2147483647  
ctypes.c_char * int(2147483647)     

#32-bit test with max 32bit integer 2147483647 + 1  
ctypes.c_char * int(2147483648)     

#64-bit test with max 64bit integer 9223372036854775807 
ctypes.c_char * int(9223372036854775807)        

#64-bit test with max 64bit integer 9223372036854775807 + 1 
ctypes.c_char * int(9223372036854775808)

For example, an overflow can be triggered on a 64-bit operating system:

#!python
>>> ctypes.c_char * int(9223372036854775808)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: cannot fit 'long' into an index-sized integer

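The data types available through Python ctypes are:

[Table not preserved: ctypes data types]

The problems are:

ctypes places no limit on memory size
It performs no overflow checking either

An overflow can therefore be triggered on both 32-bit and 64-bit operating systems. The fix is, again, to validate the data and check for overflow.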
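
A size check of that kind might look like the Python 3 sketch below. The helper bounded_char_array and the 1 MiB default limit are assumptions chosen for illustration.

```python
import ctypes

MAX_BUFFER_BYTES = 1 << 20  # assumed application limit: 1 MiB


def bounded_char_array(length, limit=MAX_BUFFER_BYTES):
    """Create a zero-filled c_char array only for a sane, bounded length."""
    if isinstance(length, bool) or not isinstance(length, int):
        raise TypeError("length must be an int")
    if not 0 < length <= limit:
        raise ValueError("length %r is outside 1..%d" % (length, limit))
    return (ctypes.c_char * length)()
```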

xrange()

Demo code:

#!python
valid = (2 ** 63) -1    
invalid = 2 ** 63       

for n in xrange(invalid):   
    print n

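The error is OverflowError: Python int too large to convert to C long. Although this behavior is "intentional" and expected, the value is still not checked, which leads to numeric overflow. This is because xrange uses plain integer objects and cannot accept objects of arbitrary length. The fix is to use Python's long integer objects, which allow numbers of arbitrary length; the limit then becomes the amount of memory available.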
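
A counter built on Python's arbitrary-precision integers avoids the C long limit entirely. The generator below is a minimal sketch (long_range is a hypothetical name); it is written for Python 3, where the built-in range already accepts large integers, but the same loop also works in Python 2.

```python
import itertools


def long_range(start, stop):
    """Yield start, start + 1, ... up to stop - 1 using unbounded integers."""
    current = start
    while current < stop:
        yield current
        current += 1


# Take only the first three values; the full range is far too long to exhaust.
first_three = list(itertools.islice(long_range(2 ** 63, 2 ** 64), 3))
```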

len()

Demo code:

#!python
valid = (2**63)-1   
invalid = 2**63     

class A(object):    
    def __len__(self):  
        return invalid      

print len(A())

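This also raises an error: OverflowError: long int too large to convert to int. len() does not check the length of the object and does not use Python int objects (which would remove the limit), so when an object can carry a ".length" attribute, an overflow error becomes possible. The fix is likewise to use Python int objects.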
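
One possible pattern, sketched here for Python 3, is to expose the true size through an ordinary method that returns an unbounded int and to let __len__ fail explicitly when the size does not fit. The class BigCollection and its size() method are hypothetical.

```python
import sys


class BigCollection(object):
    """Container whose logical size may exceed what len() can report."""

    def __init__(self, count):
        self._count = count

    def size(self):
        # Plain Python int: no platform-dependent upper bound.
        return self._count

    def __len__(self):
        if self._count > sys.maxsize:
            raise OverflowError("size %d does not fit in len()" % self._count)
        return self._count
```

Callers that may meet very large objects would use size() and reserve len() for objects known to be small.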

Decimal

#!python
from decimal import Decimal 
try:    
    # Decimal built from the string "1172837136.0800"
    x = Decimal("1172837136.0800")
    # float with the same nominal value
    y = 1172837136.0800
    if y > x:   
        print("ERROR: FLOAT seems comparable with DECIMAL") 
    else:   
        print("ERROR: FLOAT seems comparable with DECIMAL") 
except Exception as e:  
    print("OK: FLOAT is NOT comparable with DECIMAL")

The code above compares a Decimal instance with a float. If a Python version cannot compare them, the except clause catches the exception. The results are:

Python 2.6.5, 2.7.4 and 2.7.10 print ERROR: FLOAT seems comparable with DECIMAL (WRONG)
Python 3.1.2 prints OK: FLOAT is NOT comparable with DECIMAL (CORRECT)

Type Comparison

#!python
try:    
    # STRING 1234567890 
    x = "1234567890"    
    # FLOAT 1172837136.0800
    y = 1172837136.0800 
    if y > x:   
        print("ERROR: FLOAT seems comparable with STRING")  
    else:   
        print("ERROR: FLOAT seems comparable with STRING")  
except Exception as e:  
    print("OK: FLOAT is NOT comparable with STRING")

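The code above compares a string with a float. If a Python version cannot compare them, the except clause catches the exception. The results are:

Python 2.6.5, 2.7.4 and 2.7.10 print ERROR: FLOAT seems comparable with STRING (WRONG)
Python 3.1.2 prints OK: FLOAT is NOT comparable with STRING (CORRECT)

Python's built-in comparison performs no validation beyond comparing objects of the same type. In the two examples above Python does not know how to compare a STRING with a FLOAT, so it simply returns False instead of raising an error. The same problem occurs when comparing DECIMAL with FLOAT. The fix is to use strong type checking and data validation.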
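
A strict comparison helper is one way to apply that advice. The Python 3 sketch below refuses to compare operands of different types; strict_greater is a hypothetical name, and callers are assumed to convert values explicitly (for example a float to Decimal via str) before comparing.

```python
from decimal import Decimal


def strict_greater(left, right):
    """Return left > right, but only when both operands share one exact type."""
    if type(left) is not type(right):
        raise TypeError(
            "refusing to compare %s with %s"
            % (type(left).__name__, type(right).__name__)
        )
    return left > right


# Explicit conversion makes the intended comparison unambiguous.
x = Decimal("1172837136.0800")
y = Decimal(str(1172837136.08))
same_value = not strict_greater(x, y) and not strict_greater(y, x)
```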

0x03 Strings —> input, eval, codecs, os, ctypes

eval()

#!python
import os   
try:    
    # Linux/Unix    
    eval("__import__('os').system('clear')", {})    
    # Windows   
    #eval("__import__('os').system('cls')", {})
    print "Module OS loaded by eval"    
except Exception as e:  
    print repr(e)

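On eval(): an earlier article about the potential risks of eval in Python covers the same ground. Importing os through __import__ and combining it with eval() makes it possible to execute commands. Once a user can feed strings to the interpreter this way, arbitrary commands can be executed without restriction.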
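
When the goal is only to parse data such as numbers, strings, lists or dicts, the standard library's ast.literal_eval() can stand in for eval(), since it rejects function calls and attribute access. A minimal Python 3 sketch, where parse_literal is a hypothetical wrapper name:

```python
import ast


def parse_literal(text):
    """Parse a Python literal from text without executing any code."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError("not a plain literal: %r" % (text,)) from exc
```

The payload from the example above is refused instead of being executed.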

input()

#!python
Secret = "42"       

value = input("Answer to everything is ? ")     

print "The answer to everything is %s" % (value,)

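In the code above, input() takes the user's raw input and, in Python 2, evaluates it as an expression. If the user types dir(), the print statement shows the result of dir(), which lists the names in the current scope: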

#!python
Answer to everything is ? dir() 
The answer to everything is 
['Secret', '__builtins__', '__doc__', '__file__', '__name__',
'__package__']

The listing reveals an object named Secret, and the program's own functionality can then be used to read its value:

#!python
Answer to everything is ? Secret    
The answer to everything is 42
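
The usual mitigation is to read the input as plain text (raw_input() in Python 2, input() in Python 3) and convert it explicitly. The sketch below shows only the conversion step; parse_answer and the 10-digit cap are assumptions for illustration.

```python
def parse_answer(raw_text, max_digits=10):
    """Convert user-supplied text to an int without evaluating it."""
    text = raw_text.strip()
    # Accept only a short run of ASCII digits; everything else is rejected.
    if not (0 < len(text) <= max_digits and text.isascii() and text.isdigit()):
        raise ValueError("expected a number of at most %d digits" % max_digits)
    return int(text)
```

With this helper, inputs such as dir() or Secret are rejected as invalid numbers and never reach the interpreter.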

codecs

#!python
import codecs   
import io       

b = b'\x41\xF5\x42\x43\xF4'
print("Correct-String %r") % ((repr(b.decode('utf8', 'replace'))))      

with open('temp.bin', 'wb') as fout:    
    fout.write(b)   
with codecs.open('temp.bin', encoding='utf8', errors='replace') as fin:
    print("CODECS-String %r") % (repr(fin.read()))  
with io.open('temp.bin', 'rt', encoding='utf8', errors='replace') as fin:
    print("IO-String %r") % (repr(fin.read()))

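The code above writes the bytes \x41\xF5\x42\x43\xF4 to a file in binary mode, then reads them back with the codecs and io modules using UTF-8. \xF5 and \xF4 cannot be decoded, so with errors='replace' they become \ufffd. The results are: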

Correct-String —> "u'A\ufffdBC\ufffd'"
CODECS-String —> "u'A\ufffdBC'" (WRONG)
IO-String —> "u'A\ufffdBC\ufffd'" (OK)

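When codecs reads the byte string \x41\xF5\x42\x43\xF4, it expects a 4-byte sequence: on reading \xF4 it keeps waiting for 3 more bytes instead of decoding, so part of the string is dropped. A better and safer approach is to use the os module to read the whole data stream and then decode it. The fix is to use the io module, or to inspect and validate the string to detect malformed characters.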
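
Reading the whole stream as bytes and decoding it in one step can be sketched as follows in Python 3. The helper names read_text_replacing and is_valid_utf8 are hypothetical; the second one shows how malformed input could be detected instead of silently replaced.

```python
def read_text_replacing(path):
    """Read the entire file as bytes, then decode once with replacement."""
    with open(path, "rb") as handle:
        raw = handle.read()
    return raw.decode("utf-8", errors="replace")


def is_valid_utf8(raw):
    """Report whether the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")  # strict decoding raises on malformed input
    except UnicodeDecodeError:
        return False
    return True
```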

os

#!python
import os   
os.environ['a=b'] = 'c' 
try:    
    os.environ.clear()  
    print("PASS => os.environ.clear removed variable 'a=b'")    
except: 
    print("FAIL => os.environ.clear removed variable 'a=b'")    
    raise

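The names and syntax of environment variables follow different rules on different platforms. Python does not follow the same logic; it tries to offer one generic interface compatible with most operating systems. Favoring compatibility over security in this way leaves the environment variable logic flawed.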

#!bash
$ env -i =value python -c 'import pprint, os;
pprint.pprint(os.environ); del os.environ[""]'      

environ({'': 'value'})  
Traceback (most recent call last):  
    File "<string>", line 1, in <module>
    File "Lib/os.py", line 662, in __delitem__  
        self.unsetenv(encodedkey)   
OSError: [Errno 22] Invalid argument

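The command above uses env -i to start from an empty environment, sets an environment variable whose key is empty and whose value is value, prints it from Python, and then deletes it. An environment variable with an empty key can thus be defined. A key containing "=" can also be set, but it then cannot be removed: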

#!bash
$ env -i python -c 'import pprint, posix, os;
os.environ["a="]="1"; print(os.environ); posix.unsetenv("a=")'      

environ({'a=': '1'})
Traceback (most recent call last):
    File "<string>", line 1, in <module>
OSError: [Errno 22] Invalid argument

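Python reacts differently depending on the version:

Python 2.6 -> NO ERRORS, the invalid operation is allowed!
Python 2.7 -> OSError: [Errno 22] Invalid argument
Python 3.1 -> NO ERRORS, the invalid operation is allowed!

The fix is to detect the infrastructure and operating system, validate the key-value pairs used for environment variables, and block keys that are empty or invalid for the operating system.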
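
A validation layer in front of os.environ could look like the Python 3 sketch below. The rules shown (non-empty name, no "=" and no NUL byte) are a conservative assumption rather than a complete description of any one platform, and checked_setenv is a hypothetical name. The mapping is passed in, so the helper works with os.environ or with a plain dict.

```python
def checked_setenv(env, name, value):
    """Store name=value in the mapping only if the name is safe to unset later."""
    if not isinstance(name, str) or not isinstance(value, str):
        raise TypeError("name and value must be strings")
    if name == "" or "=" in name or "\x00" in name:
        raise ValueError("invalid environment variable name: %r" % (name,))
    if "\x00" in value:
        raise ValueError("environment values must not contain NUL bytes")
    env[name] = value
```

In a real program the first argument would be os.environ.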

ctypes

#!python
import ctypes

buffer=ctypes.create_string_buffer(8)

buffer.value='a\x00bc1234'

print "Original value => %r" % (buffer.raw,)    
print "Interpreted value => %r" % (buffer.value,)

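The ctypes module truncates strings that contain null characters. The code above prints:

Original value => 'a\x00bc1234'
Interpreted value => 'a'

This matches how C handles strings, where a null character terminates the string. Using ctypes in Python inherits the same logic, so the string is truncated. The fix is to validate the data, strip null characters from strings to protect them, or avoid ctypes altogether.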
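
Rejecting embedded null bytes before the data reaches the buffer is one such validation. A minimal Python 3 sketch, where store_c_string is a hypothetical helper:

```python
import ctypes


def store_c_string(buffer, data):
    """Copy data into a ctypes char buffer, refusing input C would truncate."""
    if not isinstance(data, bytes):
        raise TypeError("data must be bytes")
    if b"\x00" in data:
        raise ValueError("embedded NUL byte would truncate the C string")
    # Leave room for the terminating NUL byte.
    if len(data) >= ctypes.sizeof(buffer):
        raise ValueError("data does not fit into the buffer")
    buffer.value = data
```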

Python Interpreter

#!python
try:    
    if 0:   
        yield 5 
    print("T1-FAIL")    
except Exception as e:  
    print("T1-PASS")    
    pass        

try:    
    if False:   
        yield 5 
    print("T2-FAIL")    
except Exception as e:  
    print(repr(e))  
    pass

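The test code above should produce a syntax error: SyntaxError: 'yield' outside function. The results on different Python versions are as follows:

[Table not preserved: results by Python version]

This issue has been fixed in the latest Python 2.7.x releases. Constructs such as "if 0:", "if False:", "while 0:" and "while False:" should also be avoided.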

0x04 Files -> sys, os, io, pickle, cPickle

pickle

#!python
import pickle   
import io   
badstring = "cos\nsystem\n(S'ls -la /'\ntR."
badfile = "./pickle.sec"    
with io.open(badfile, 'wb') as w:   
    w.write(badstring)  
obj = pickle.load(open(badfile))    
print "== Object =="    
print repr(obj)

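Here a malicious serialized string is constructed and written to a file in binary mode. pickle.load() then deserializes it to rebuild the original Python object, which ends up calling os.system() to run the command "ls -la /". Because pickle is insecure by design in this way, it can be used to execute commands. The code prints: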

Linux

total 104
drwxr-xr-x  23 root root  4096 Oct 20 11:19 .
drwxr-xr-x  23 root root  4096 Oct 20 11:19 ..
drwxr-xr-x   2 root root  4096 Oct  4 00:05 bin
drwxr-xr-x   4 root root  4096 Oct  4 00:07 boot
...

Mac OS X

total 16492 
drwxr-xr-x    31 root wheel     1122 12 Oct 18:58 . 
drwxr-xr-x    31 root wheel     1122 12 Oct 18:58 ..    
drwxrwxr-x+  122 root wheel     4148 10 Oct 15:19 Applications
drwxr-xr-x+   68 root wheel     2312  3 Sep 10:47 Library
...
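
If pickle has to be used on data that is not fully trusted, the standard library allows overriding Unpickler.find_class to control which globals may be loaded. The Python 3 sketch below refuses every global, so only plain containers and scalars can be restored; the names NoGlobalsUnpickler and restricted_loads are hypothetical, and this reduces rather than eliminates the risk.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any module-level name."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name)
        )


def restricted_loads(data):
    """Deserialize bytes while blocking payloads that reference callables."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()
```

A payload like the one above names os.system as a global, so it is rejected before anything is called.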

pickle / cPickle

#!python
import cPickle
import os
import traceback    
import sys  
# bignum = int((2**31)-1) # 2147483647 -> OK    
bignum = int(2**31) # 2147483648 -> Max 32bit -> Crash  
random_string = os.urandom(bignum)  
print ("STRING-LENGTH-1=%r") % (len(random_string)) 
fout = open('test.pickle', 'wb')    
try:    
    cPickle.dump(random_string, fout)   
except Exception as e:  
    print "###### ERROR-WRITE ######"   
    print sys.exc_info()[0] 
    raise   
fout.close()    
fin = open('test.pickle', 'rb') 
try:    
    random_string2 = cPickle.load(fin)  
except Exception as e:  
    print "###### ERROR-READ ######"    
    print sys.exc_info()[0] 
    raise   
print ("STRING-LENGTH-2=%r") % (len(random_string2))    
print random_string == random_string2   
sys.exit(0)

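In the code above, depending on the Python version, pickle or cPickle either saves truncated data without raising an error or saves only the part that fits within the 32-bit limit. Depending on how Python was compiled when installed on the operating system, it may also return an error about the size of the random data requested, or report an OS error for an invalid argument: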

cPickle (debian 7 x64)

#!python
STRING-LENGTH-1=2147483648  
###### ERROR-WRITE ######   

Traceback (most recent call last):  
....    
    pickle.dump(random_string, fout)    
SystemError: error return without exception set

pickle (debian 7 x64)

#!python
STRING-LENGTH-1=2147483648  
###### ERROR-WRITE ######   

Traceback (most recent call last):  
....    
File "/usr/lib/python2.7/pickle.py", line 488, in save_string
    self.write(STRING + repr(obj)+ '\n')
MemoryError

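The fix is to validate data rigorously so that dangerous operations are never performed, and to limit data to 32-bit sizes even on 64-bit operating systems.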

File Open

#!python
import os   
import sys  
FPATH = 'bug2091.test'  
# ==========================    
print 'wa (1)_write1'   
with open(FPATH, 'wa') as fp:   
    fp.write('test1-')  
with open(FPATH, 'rb') as fp:   
    print repr(fp.read())   
# ==========================    
print 'rU+_write2'  
with open(FPATH, 'rU+') as fp:  
    fp.write('test2-')  
with open(FPATH, 'rb') as fp:   
    print repr(fp.read())   
# ==========================    
print 'wa (2)_write3'   
with open(FPATH, 'wa+') as fp:  
    fp.write('test3-')  
with open(FPATH, 'rb') as fp:   
    print repr(fp.read())   
# ==========================    
print 'aw_write4'   
with open(FPATH, 'aw') as fp:   
    fp.write('test4-')  
with open(FPATH, 'rb') as fp:   
    print repr(fp.read())   
# ==========================    
print 'rU+_read1',  
with open(FPATH, 'rU+') as fp:  
    print repr(fp.read())   
# ==========================    
print 'read_2', 
with open(FPATH, 'read') as fp: 
    print repr(fp.read())   
# ==========================    
os.unlink(FPATH)    
sys.exit(0)

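The code above tests various file open modes. U opens the file in universal newline mode (deprecated). The results on each platform are as follows:

[Figures not preserved: results on Linux and Mac OS X, and on Windows]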
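
Since nonsensical mode strings such as 'wa', 'aw' or 'read' may be accepted differently across platforms and versions, one defensive option is to allow only a fixed set of modes. A minimal Python 3 sketch; open_checked and the ALLOWED_MODES set are assumptions for illustration.

```python
ALLOWED_MODES = frozenset([
    "r", "rb", "r+", "rb+",
    "w", "wb", "w+", "wb+",
    "a", "ab", "a+", "ab+",
])


def open_checked(path, mode="r"):
    """Open a file only when the mode string is one of the known-good values."""
    if mode not in ALLOWED_MODES:
        raise ValueError("unsupported file mode: %r" % (mode,))
    return open(path, mode)
```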

INVALID stream operations – Linux / OS X

#!python
import sys  
import io   
fd = io.open(sys.stdout.fileno(), 'wb') 
fd.close()  
try:    
    sys.stdout.write("test for error")  
except Exception:   
    raise

The code obtains the file descriptor of sys.stdout with fileno(), wraps it in a new file object, and closes that object. After this, nothing more can be written to standard output. The output is:

In Python 2.6.5 and 2.7.4:

#!python
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr

In Python 2.7.10:

#!python
Traceback (most recent call last):
    File "tester.py", line 6, in <module>
        sys.stdout.write("test for error")  
IOError: [Errno 9] Bad file descriptor

INVALID stream operations – Windows

#!python
import io
import sys    

fd = io.open(sys.stdout.fileno(), 'wb')
fd.close()
sys.stdout.write("Crash")

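The behavior on Windows is similar, as the figure shows:

[Figure not preserved: crash on Windows]

In short, the file and stream libraries do not follow OS conventions and instead use one generic logic. For each OS it is necessary to use a library capable of handling its streams and to set up the correct sequence of calls.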
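
One way to keep the original descriptor usable is to wrap a duplicate obtained with os.dup(), so that closing the wrapper closes only the copy. The Python 3 sketch below demonstrates this on an arbitrary descriptor; write_via_duplicate is a hypothetical helper name.

```python
import io
import os


def write_via_duplicate(fd, payload):
    """Write bytes through a duplicate descriptor and close only the duplicate."""
    duplicate = os.dup(fd)
    # Closing the wrapper closes the duplicate; the caller's fd stays open.
    with io.open(duplicate, "wb") as stream:
        stream.write(payload)
```

Passing closefd=False to io.open() is another standard option when the wrapper should never close the descriptor it is given.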

File Write

#!python
import os
import sys  
testfile = 'tempA'  
with open(testfile, "ab") as f: 
    f.write(b"abcd")    
    f.write(b"x" * (1024 ** 2)) 
#########################################   
import io   
testfilea = 'tempB' 
with io.open(testfilea, "ab") as f: 
    f.write(b"abcd")    
    f.write(b"x" * (1024 ** 2))

On Linux we use strace python -OOBRttu script.py to observe how Python writes the file.

We want to write 4 + 1048576 = 1048580 characters. Comparing a call to open() with the io module across versions:

PYTHON 2.6
- Output when calling open():

  write(3, "abcdxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 4096) = 4096
  write(3, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 1044480) = 1044480

  The first call is buffered: it writes not only the 4 characters (abcd) but also 4092 x characters. The second call writes 1044480 x characters. The total is 1044480 + 4096 = 1048576, which is 4 x characters short of 1048580. Waiting 5 seconds resolves this, because the operating system flushes the cache.
- Output when using the io module:

  write(3, "abcd", 4) = 4
  write(3, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 1048576) = 1048576

  Everything behaves as expected.
PYTHON 2.7
- Output when using open():

  write(3, "abcdxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 4096) = 4096
  write(3, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 1044480) = 1044480
  write(3, "xxxx", 4) = 4

  Three calls are made here; the last one writes the remaining 4 x characters, so the data is complete. The issue is that three calls are used instead of the two we expected.
- With the io module everything behaves as expected.
PYTHON 3.x
- With both open() and the io module everything behaves as expected.

Python 2 does not provide atomic operations here; the core library reads and writes through a buffer. The io module should therefore be preferred.
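
When the data must be on disk before the program continues, the write can be followed by an explicit flush() and os.fsync(). This Python 3 sketch uses the io module as recommended above; write_all is a hypothetical helper name.

```python
import io
import os


def write_all(path, chunks):
    """Append all chunks, then force buffered data out to the file."""
    total = 0
    with io.open(path, "ab") as handle:
        for chunk in chunks:
            handle.write(chunk)
            total += len(chunk)
        handle.flush()             # empty Python's own buffer
        os.fsync(handle.fileno())  # ask the OS to commit its cache
    return total
```

Comparing the returned byte count with the file size afterwards gives a cheap integrity check.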

0x05 Protocols —> socket, poplib, urllib, urllib2

httplib, smtplib, ftplib…

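The core libraries are independent of the operating system, so developers must know how to build a suitable communication channel for each operating system, and these libraries will perform operations that are unsafe and incorrect.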

#!python
import SimpleHTTPServer 
import SocketServer 
PORT = 45678    
def do_GET(self):   
    self.send_response(200) 
    self.end_headers()  
Handler = SimpleHTTPServer.SimpleHTTPRequestHandler 
Handler.do_GET = do_GET 
httpd = SocketServer.TCPServer(("", PORT), Handler) 
httpd.serve_forever()

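The code above builds an HTTP server. If a client connects and the server is then shut down, Python does not release the resources and the operating system does not release the socket, which results in socket.error: [Errno 48] Address already in use. The following code fixes this: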

#!python
import socket   
import SimpleHTTPServer 
import SocketServer 
PORT = 8080 
# ESSENTIAL: socket reuse is set up BEFORE it is bound.
# This will avoid TIME_WAIT issues and socket in use errors 
class MyTCPServer(SocketServer.TCPServer):  
    def server_bind(self):  
        self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)   
        self.socket.bind(self.server_address)   
def do_GET(self):   
    self.send_response(200) 
    self.end_headers()  
Handler = SimpleHTTPServer.SimpleHTTPRequestHandler 
Handler.do_GET = do_GET 
httpd = MyTCPServer(("", PORT), Handler)    
httpd.serve_forever()

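The fix is for every protocol library to be wrapped by a library that sets up and tears down communication properly for each OS and protocol, and releases resources.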
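
In Python 3 the same effect can be had without overriding server_bind(), because socketserver.TCPServer sets SO_REUSEADDR itself when the class attribute allow_reuse_address is true. The sketch below is a Python 3 variant, not the original authors' code; the names ReusableTCPServer, OkHandler and make_server are hypothetical.

```python
import http.server
import socketserver


class ReusableTCPServer(socketserver.TCPServer):
    # Makes server_bind() set SO_REUSEADDR before binding.
    allow_reuse_address = True


class OkHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()


def make_server(host="127.0.0.1", port=0):
    """Create the server; port 0 lets the OS pick a free port."""
    return ReusableTCPServer((host, port), OkHandler)
```

The caller would run serve_forever() and call server_close() on shutdown so the listening socket is released.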

poplib, httplib …

Server:

#!python
import socket   
HOST = '127.0.0.1'  
PORT = 45678    
NULLS = '\0' * (1024 * 1024) # 1 MB
try:    
    sock = socket.socket()  
    sock.bind((HOST, PORT)) 
    sock.listen(1)  
    while 1:    
        print "Waiting connection..."   
        conn, _ = sock.accept() 
        print "Sending welcome..."  
        conn.sendall("+OK THIS IS A TEST\r\n")
        conn.recv(4096) 
        DATA = NULLS    
        try:    
            while 1:    
                print "Sending 1 GB..." 
                for _ in xrange(1024):  
                    conn.sendall(DATA)  
        except IOError, ex: 
            print "Error: %r" % str(ex) 
        print "End session."    
        print   
finally:    
    sock.close()    
print "End server."

Client:

#!python
import poplib   
import sys  
HOST = '127.0.0.1'  
PORT = 45678    
try:    
    print "Connecting to %r:%d..." % (HOST, PORT)   
    pop = poplib.POP3(HOST, PORT)   
    print "Welcome:", repr(pop.welcome) 
    print "Listing..."  
    reply = pop.list()  
    print "LIST:", repr(reply)  
except Exception, ex:   
    print "Error: %r" % str(ex) 
print "End."    
sys.exit(0)

In the code above, a dummy server is started first and the client connects to it. The server then starts sending null characters and the client keeps receiving them until its memory fills up and the system crashes. The output is:

Server

#!python
Waiting connection...
Sending welcome...
Sending 1 GB...
Error: '[Errno 54] Connection reset by peer'
End session.

Client
- Python >= 2.7.9, 3.3

  #!python
  Connecting to '127.0.0.1':45678...
  Welcome: '+OK THIS IS A TEST'
  Listing...
  Error: 'line too long'
  End.
- Python (earlier versions)

  #!python
  Client!
  Connecting to '127.0.0.1':45678...
  Welcome: '+OK THIS IS A TEST'
  ........
  Error: 'out of memory'

The fix, if the type and size of the data cannot be controlled and checked, is to use Python > 2.7.9 or Python > 3.3.
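
Where the protocol code is under your control, the same protection can be applied by capping how many bytes are accepted for a single line. The Python 3 sketch below reads one CRLF-terminated line from a socket; recv_line and the 2048-byte default are assumptions for illustration, and the byte-at-a-time loop favors clarity over speed.

```python
def recv_line(sock, max_bytes=2048):
    """Read one CRLF-terminated line, refusing lines longer than max_bytes."""
    line = bytearray()
    while not line.endswith(b"\r\n"):
        chunk = sock.recv(1)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-line")
        line += chunk
        if len(line) > max_bytes:
            raise ValueError("line too long")
    return bytes(line)
```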

Libraries that place no limit on data:

[Table not preserved: libraries without data limits]

urllib, urllib2

#!python
import io   
import os   
import urllib2 #but all fine with urllib    
domain = 'ftp://ftp.ripe.net'   
location = '/pub/stats/ripencc/'    
file = 'delegated-ripencc-extended-latest'  
url = domain + location + file  
data = urllib2.urlopen(url).read()  
with io.open(file, 'wb') as w:  
    w.write(data)   
file_size = os.stat(file).st_size   
print "Filesize: %s" % (file_size)

urllib2 has no proper logic for handling the data stream and fails every time. Running the code above three times gives a wrong file size each time:

Filesize: 65536
Filesize: 32768
Filesize: 49152

The following code produces the correct output instead:

#!python
import os   
import io   
import urllib2  
domain = 'ftp://ftp.ripe.net'   
location = '/pub/stats/ripencc/'    
file = 'delegated-ripencc-extended-latest'  
with io.open(file, 'wb') as w:  
    url = domain + location + file  
    response = urllib2.urlopen(url) 
    data = response.read()  
    w.write(data)   
file_size = os.stat(file).st_size   
print "Filesize: %s" % (file_size)

Output:

Filesize: 6598450
Filesize: 6598450
Filesize: 6598450

As the examples above show, the fix is to rely on the operating system to guarantee the integrity of the data stream.
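
Independently of which library fetches the data, the download can be copied in chunks until end of stream and the byte count verified afterwards. The Python 3 sketch below works with any object that has a read(size) method, such as the response returned by urlopen(); save_stream and its parameters are hypothetical names.

```python
import io


def save_stream(response, path, chunk_size=64 * 1024, expected_size=None):
    """Copy a readable stream to a file in chunks and verify the byte count."""
    written = 0
    with io.open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # empty bytes signals end of stream
                break
            out.write(chunk)
            written += len(chunk)
    if expected_size is not None and written != expected_size:
        raise IOError("expected %d bytes, got %d" % (expected_size, written))
    return written
```

When the server announces a length, passing it as expected_size turns a silently truncated download into an explicit error.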

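Libraries known to be unsafe:

[Table not preserved: list of unsafe libraries]

Finally, even when millions of people use something, never assume it will always behave the way you expect, and never assume it is safe to use.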
