Outrageous! He Actually Used Python to Get Past the "CAPTCHA"

Posted by Tynam.Yang on 2019-03-14

Original URL: https://www.cnblogs.com/tynam/p/10532679.html

Python

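CAPTCHAs show up all the time on web pages, and for someone like me who loves web automation testing they are a real headache. So I went digging through every resource I could find for a way to crack simple CAPTCHAs.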

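Recognizing a CAPTCHA

The process breaks down into roughly these steps:

    1. Get the CAPTCHA image

    2. Convert to grayscale

    3. Increase the contrast

    4. Remove noise

    5. Recognize the text

>>>>Getting the CAPTCHA

Use whatever method works to obtain the image containing the CAPTCHA and store it locally.

The approach here: take a screenshot of the current web page, look up the position of the CAPTCHA within the page, then crop the CAPTCHA out of the screenshot using that position.

The 163 Mail registration page serves as the example.

Libraries used: selenium, PIL

On Python 2.x, run pip install PIL. On Python 3.x, PIL has been superseded by the Pillow fork, so install it with pip install pillow; the import name is still PIL (for example, from PIL import Image).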

from PIL import Image

import time
from selenium import webdriver
from selenium.webdriver.common.by import By


def get_code_img(driver):
    # Give the page a moment to finish rendering
    time.sleep(1)

    # Take a screenshot of the browser window
    driver.save_screenshot('webImg.png')

    # Locate the CAPTCHA element
    # (find_element_by_id was removed in recent Selenium 4 releases;
    # find_element(By.ID, ...) works in both Selenium 3 and 4)
    code_element = driver.find_element(By.ID, 'vcodeImg')

    # Compute the bounding box of the CAPTCHA image
    left_location = code_element.location['x']
    top_location = code_element.location['y']

    right_location = code_element.size['width'] + left_location
    below_location = code_element.size['height'] + top_location

    # Crop the CAPTCHA out of the page screenshot
    web_img = Image.open("webImg.png")
    code_img = web_img.crop((left_location, top_location, right_location, below_location))
    code_img.save("codeImg.png")

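save_screenshot: a method provided by webdriver that saves a screenshot of the current browser window.

code_element.location: returns the position of an element.

For example, print(code_element.location) outputs: {'x': 632, 'y': 511}

The origin is the top-left corner of the page; x increases to the right and y increases downward.

code_element.size: returns the element's dimensions (here, the size of the CAPTCHA image).

crop: a method of Image that takes a box of four coordinates (left, upper, right, lower) and returns that region as a new image.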
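The box arithmetic used in get_code_img can be pulled out into a small helper. This is a minimal sketch: crop_box and its scale parameter are hypothetical names, not part of selenium or PIL. The scale parameter rests on the assumption that, on high-DPI displays, the screenshot may contain more pixels than the page coordinates selenium reports; with the default of 1.0 it reproduces the calculation in get_code_img.

```python
def crop_box(location, size, scale=1.0):
    """Return (left, upper, right, lower) for Image.crop.

    location and size are dicts shaped like selenium's element.location
    and element.size. scale is a hypothetical correction factor for
    screenshots whose pixel size differs from the page's coordinate size.
    """
    left = location['x'] * scale
    upper = location['y'] * scale
    right = (location['x'] + size['width']) * scale
    lower = (location['y'] + size['height']) * scale
    return (int(left), int(upper), int(right), int(lower))


# The location matches the example output above; the size is made up for illustration.
box = crop_box({'x': 632, 'y': 511}, {'width': 100, 'height': 40})
print(box)
```

The resulting tuple can be passed straight to web_img.crop(box).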

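Running the code

if __name__ == '__main__':

    base_url = 'http://reg.email.163.com/unireg/call.do?cmd=register.entrance&from=126mail'

    driver = webdriver.Chrome()
    driver.maximize_window()
    driver.get(base_url)
    get_code_img(driver)
    driver.close()

Running it produces two images, webImg.png and codeImg.png. codeImg.png looks like this: [image]

>>>>Grayscale conversion / contrast enhancement

Convert the image to grayscale and raise its contrast so that recognition has less irrelevant interference to deal with.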

from PIL import Image, ImageEnhance


def gray_img(img):
    code_img = Image.open(img)
    # Convert to grayscale
    gray = code_img.convert('L')
    # Boost the contrast (factor 3; a factor of 1.0 leaves the image unchanged)
    enhancer = ImageEnhance.Contrast(gray)
    return enhancer.enhance(3)


if __name__ == '__main__':

    gray_img('codeImg.png').show()

Result after running: [image]
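To make the two operations concrete, here is a pure-Python sketch of what they do to pixel values. It assumes the luma weights Pillow documents for convert('L') and models Contrast.enhance as pushing each pixel away from the image's mean gray level by the given factor. The names to_gray and enhance_contrast are illustrative, and Pillow's exact rounding may differ slightly.

```python
def to_gray(r, g, b):
    """Approximate Pillow's RGB -> 'L' conversion (ITU-R 601-2 luma weights)."""
    return (r * 299 + g * 587 + b * 114) // 1000


def enhance_contrast(pixels, factor):
    """Move each gray value away from the mean by `factor`, clamped to 0..255."""
    mean = int(sum(pixels) / len(pixels) + 0.5)
    return [max(0, min(255, round(mean + factor * (p - mean)))) for p in pixels]


print(to_gray(255, 0, 0))                    # pure red becomes a fairly dark gray
print(enhance_contrast([100, 120, 140], 3))  # values spread out around the mean
print(enhance_contrast([50, 200], 3))        # large differences saturate at 0 and 255
```

Values above the mean move towards white and values below it move towards black, which makes the later white-versus-dark test in the denoising step more clear-cut.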

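>>>>Noise removal

For each pixel A, look at the pixels surrounding it and count how many of them are close to white. A threshold N is fixed in advance; when more than N of A's neighbors are near-white, A is treated as an isolated noise point and erased (set to white). The code below checks the 8 surrounding pixels of the grayscale image, treats values above 245 as white, and uses N = 4.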

def clear_noise(img):
    # Pixel access object; writes go straight through to img
    noise_img = img.load()
    # Image dimensions
    w, h = img.size

    # Offsets of the 8 neighbors around a pixel
    neighbors = [(-1, -1), (0, -1), (1, -1),
                 (-1, 0), (1, 0),
                 (-1, 1), (0, 1), (1, 1)]

    # Skip the 1-pixel border so that every neighbor exists
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Count the near-white neighbors
            count = 0
            for dx, dy in neighbors:
                if noise_img[x + dx, y + dy] > 245:
                    count = count + 1
            # Mostly white surroundings: treat the pixel as noise
            if count > 4:
                noise_img[x, y] = 255
    return img


if __name__ == '__main__':
    img = gray_img('codeImg.png')
    clear_noise(img).show()

Result after running: [image]
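The effect of the rule is easiest to see on a tiny example. The sketch below applies the same 8-neighbor count to a plain list-of-lists grid instead of a PIL image, so it runs without any image file. The name clear_noise_grid and its parameters are illustrative, not part of the original code.

```python
def clear_noise_grid(grid, white=245, limit=4):
    """Apply the 8-neighbor rule in place to a grid of gray values (grid[y][x])."""
    h, w = len(grid), len(grid[0])
    offsets = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            count = sum(1 for dx, dy in offsets if grid[y + dy][x + dx] > white)
            if count > limit:
                grid[y][x] = 255
    return grid


# A lone dark pixel on a white background is removed...
speck = [[255] * 5 for _ in range(5)]
speck[2][2] = 0
clear_noise_grid(speck)

# ...while a 3-pixel-thick dark stroke keeps its pixels.
stroke = [[255] * 7 for _ in range(5)]
for y in (1, 2, 3):
    stroke[y] = [0] * 7
clear_noise_grid(stroke)

print(speck[2][2], stroke[1][3], stroke[2][3])
```

The threshold is a trade-off: strokes that are only one pixel wide also have mostly white neighbors, so this rule erases them along with the noise.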

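>>>>Recognition

Recognition uses the pytesseract package.

pytesseract depends on the Tesseract engine, so both need to be installed.

For details, see:

    tesseract: https://github.com/sirfz/tesserocr (this repository is tesserocr, another Python wrapper around the Tesseract engine)

    pytesseract: https://github.com/madmaze/pytesseract

import pytesseract

text = pytesseract.image_to_string(img)
print(text)

Unfortunately, the image above was not recognized.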
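Two small things sometimes help with short CAPTCHA strings, though neither is guaranteed to rescue a failed recognition. Tesseract accepts a page segmentation mode, which pytesseract passes through the config argument of image_to_string, and the raw result often carries whitespace or stray punctuation. The sketch below assumes the CAPTCHA contains only ASCII letters and digits; OCR_CONFIG and clean_ocr_text are illustrative names.

```python
import re

# '--psm 7' asks Tesseract to treat the image as a single line of text.
# It would be passed as: pytesseract.image_to_string(img, config=OCR_CONFIG)
OCR_CONFIG = '--psm 7'


def clean_ocr_text(text):
    """Keep only ASCII letters and digits from raw OCR output."""
    return re.sub(r'[^0-9A-Za-z]', '', text)


print(clean_ocr_text(' a B3,\n\x0c'))
```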

Running recognition with the complete code

Using the CAPTCHA image below as the example: [image]

from PIL import Image, ImageEnhance
import time
import pytesseract
from selenium import webdriver
from selenium.webdriver.common.by import By


def clear_noise(img):
    # Pixel access object; writes go straight through to img
    noise_img = img.load()
    # Image dimensions
    w, h = img.size

    # Offsets of the 8 neighbors around a pixel
    neighbors = [(-1, -1), (0, -1), (1, -1),
                 (-1, 0), (1, 0),
                 (-1, 1), (0, 1), (1, 1)]

    # Skip the 1-pixel border so that every neighbor exists
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            count = 0
            for dx, dy in neighbors:
                if noise_img[x + dx, y + dy] > 245:
                    count = count + 1
            # Mostly white surroundings: treat the pixel as noise
            if count > 4:
                noise_img[x, y] = 255
    return img


def get_code_img(driver):
    # Give the page a moment to finish rendering
    time.sleep(1)

    # Take a screenshot of the browser window
    driver.save_screenshot('webImg.png')

    # Locate the CAPTCHA element
    # (find_element_by_id was removed in recent Selenium 4 releases)
    code_element = driver.find_element(By.ID, 'vcodeImg')

    # Compute the bounding box of the CAPTCHA image
    left_location = code_element.location['x']
    top_location = code_element.location['y']

    right_location = code_element.size['width'] + left_location
    below_location = code_element.size['height'] + top_location

    # Crop the CAPTCHA out of the page screenshot
    web_img = Image.open("webImg.png")
    code_img = web_img.crop((left_location, top_location, right_location, below_location))
    code_img.save("codeImg.png")


def gray_img(img):
    code_img = Image.open(img)
    # Convert to grayscale
    gray = code_img.convert('L')
    # Boost the contrast
    enhancer = ImageEnhance.Contrast(gray)
    return enhancer.enhance(3)


if __name__ == '__main__':

    # base_url = 'http://reg.email.163.com/unireg/call.do?cmd=register.entrance&from=126mail'
    #
    # driver = webdriver.Chrome()
    # driver.maximize_window()
    # driver.get(base_url)
    # get_code_img(driver)
    # driver.close()
    img = gray_img('d.png')
    img = clear_noise(img)
    img.show()
    text = pytesseract.image_to_string(img)
    print(text)

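Run result: [image]

It still failed, but at least it came close...

This was only a simple first attempt at CAPTCHA recognition, and the recognition rate of this method is not very high. There are plenty of paid recognition platforms online whose recognition rates, after extensive training, are very high; take a look if you are interested.

Now that I have gotten to know CAPTCHAs, my curiosity is piqued and I want to find out how they are actually generated... to be shared next time (just teasing).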

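Generating CAPTCHAs with Python

After getting thoroughly beaten by CAPTCHA recognition, I decided to look at how CAPTCHAs are generated.

Rough steps:

1. Create the image

2. Fill in the background pixels

3. Write the CAPTCHA text

4. Add interference lines

5. Apply a filter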

Libraries used

import random
import string

from PIL import Image, ImageFont, ImageDraw, ImageFilter

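Before starting, here is a look at the basic attributes of an image opened with Image

print(Image.open('img.jpeg'))

Result: <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x291 at 0x103BA3FD0>

    What is printed: the image format (as the class name), mode (the color mode), and size (the dimensions)

The same attributes can also be read directly

img = Image.open('img.jpeg')

print(img.size, img.format, img.mode)

    Result: (500, 291) JPEG RGB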

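Now to generate the CAPTCHA

>>>>Creating the image

from PIL import Image

width = 240
height = 60

# Create the image
image = Image.new('RGB', (width, height), color='red')
image.show()

new() creates an image. The first argument is the image mode, i.e. the color mode;

the second argument is the image size;

the third argument is the fill color of the image.

show() displays the image.

Result after running: [image]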
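show() opens an external viewer, which is awkward in scripts or on a server. As a minimal sketch, the same image can be checked programmatically and written to an in-memory buffer instead. Using io.BytesIO here is my own choice for illustration, not something the original code does.

```python
import io

from PIL import Image

width = 240
height = 60

image = Image.new('RGB', (width, height), color='red')

# Inspect the image without opening a viewer
print(image.size, image.mode, image.getpixel((0, 0)))

# Save as PNG into memory rather than to disk
buffer = io.BytesIO()
image.save(buffer, format='PNG')
png_bytes = buffer.getvalue()

# Re-open from the bytes to confirm the round trip
reloaded = Image.open(io.BytesIO(png_bytes))
print(reloaded.format, reloaded.size)
```

Bytes like these are what a web application would typically send to the browser as the CAPTCHA image.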

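>>>>Filling in the background pixels

# Create a drawing object for the image
draw = ImageDraw.Draw(image)

# Fill every pixel with a random color
for i in range(width):
    for j in range(height):
        draw.point((i, j), fill=random_bgcolor())

random_bgcolor(): a custom function that generates a random color.

def random_bgcolor():
    return (random.randint(60, 200), random.randint(60, 200), random.randint(60, 200))

It returns an RGB color value; adjust the value range to suit your needs.

Result: [image]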

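>>>>Writing the CAPTCHA text

draw = ImageDraw.Draw(image)
# Write the characters
for i in range(4):
    draw.text((60*i+10, 10), get_random(1,4), font=font, fill=random_color())

ImageDraw.Draw(image) creates a drawing object (a "pen") on the image.

The for loop runs four times, producing one random digit or letter each time.

draw.text() writes the content:

    the first argument is the coordinate; work it out from the image size so that the characters are laid out sensibly;

    the second argument is the text to write; here a custom function returns a random character, and you can define that function however you like;

    the third, font, is the font; the font file specified must exist;

    the fourth is the color of the text; here a custom function returns a random color, and again you can define it yourself.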

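The function used for the second argument:

def get_random(num, many):
    # Returns a random string of `num` characters (digits or ASCII letters).
    # `many` is kept so the existing call get_random(1, 4) still works,
    # but it has no effect on the result.
    s = ""
    for j in range(num):
        n = random.randint(1, 2)  # n == 1: digit, n == 2: letter
        if n == 1:
            s += str(random.randint(0, 9))
        else:
            s += random.choice(string.ascii_letters)
    return s

The font for the third argument:

font = ImageFont.truetype('Arial.ttf', 36)

The function used for the fourth argument:

It simply returns an RGB color value

def random_color():
    return (random.randint(64, 255), random.randint(64, 255), random.randint(64, 255))

Result of running the code above: [image]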
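One thing the drawing loop does not do is keep the characters it draws, so the program never knows the correct answer. A minimal sketch of one way around that, using only the standard library, is to generate the whole string first and then pair each character with its position. make_code and char_positions are hypothetical helpers, and the 60*i+10 layout is generalized on the assumption that each character gets an equal-width slot.

```python
import random
import string


def make_code(length=4):
    """Return a random string of digits and ASCII letters."""
    alphabet = string.digits + string.ascii_letters
    return ''.join(random.choice(alphabet) for _ in range(length))


def char_positions(width, count, margin=10, top=10):
    """Return one (x, y) per character, each in an equal-width slot."""
    slot = width // count
    return [(slot * i + margin, top) for i in range(count)]


code = make_code(4)
positions = char_positions(240, 4)

# Each pair would be drawn with: draw.text(pos, ch, font=font, fill=random_color())
for ch, pos in zip(code, positions):
    print(ch, pos)
```

The value of code can then be stored and compared with what the user types. Unlike get_random, this sketch picks uniformly from all 62 characters rather than choosing between a digit and a letter first.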

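>>>>Adding interference lines

Add interference lines (two in this example) on top of the generated CAPTCHA image

for i in range(2):
    x1 = random.randint(0, width)
    y1 = random.randint(0, height)
    x2 = random.randint(0, width)
    y2 = random.randint(0, height)
    draw.line((x1, y1, x2, y2), fill=random_bgcolor(), width=3)

draw.line() is the line-drawing method

First argument: the coordinates of the line, i.e. its position. Above, both endpoints are random positions within the image

Second argument: the color of the line, again generated randomly

Third argument: the width of the line; if not set, the default is 0

Result: [image]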
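With both endpoints fully random, a line can end up short or tucked into a corner where it touches no character. One possible variation, sketched below under my own assumptions, starts each line in the left part of the image and ends it in the right part. crossing_line and its edge parameter are hypothetical names. The sketch also keeps coordinates within 0..width-1 and 0..height-1, since randint includes its upper bound.

```python
import random


def crossing_line(width, height, edge=0.2):
    """Return (x1, y1, x2, y2) running from the left part to the right part."""
    band = int(width * edge)
    x1 = random.randint(0, band)
    x2 = random.randint(width - 1 - band, width - 1)
    y1 = random.randint(0, height - 1)
    y2 = random.randint(0, height - 1)
    return (x1, y1, x2, y2)


line = crossing_line(240, 60)
print(line)
```

It would be used as draw.line(crossing_line(width, height), fill=random_bgcolor(), width=3).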

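>>>>Applying a filter

Adding a filter introduces some extra variation in the colors

It is simple; one line of code does it

image = image.filter(ImageFilter.BLUR)

The result: [image]

Apologies: the ranges I chose for the random colors were not tuned well, so the background and text colors ended up too close and the result does not look great.

The filter is not required, though; it can be left out.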
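The color clash mentioned above comes from the two ranges overlapping: random_color() draws each channel from 64 to 255 and random_bgcolor() from 60 to 200. A minimal sketch of one fix is to keep the ranges disjoint, for example light backgrounds and dark text. The function names and the specific ranges below are assumptions chosen for illustration.

```python
import random


def random_light_color():
    """Background color: every channel in the light range 180..255."""
    return tuple(random.randint(180, 255) for _ in range(3))


def random_dark_color():
    """Text color: every channel in the dark range 0..100."""
    return tuple(random.randint(0, 100) for _ in range(3))


def brightness(color):
    """Rough brightness using the luma weights documented for convert('L')."""
    r, g, b = color
    return (r * 299 + g * 587 + b * 114) // 1000


bg, fg = random_light_color(), random_dark_color()
print(bg, fg, brightness(bg) - brightness(fg))
```

Because the channel ranges never overlap, the background is always at least 80 brightness levels lighter than the text.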

The complete code

import string

import random
from PIL import Image, ImageFont, ImageDraw, ImageFilter


# Generate a random string of `num` digits or letters.
# `many` is kept so the call get_random(1, 4) still works; it has no effect.
def get_random(num, many):
    s = ""
    for j in range(num):
        n = random.randint(1, 2)  # n == 1: digit, n == 2: letter
        if n == 1:
            s += str(random.randint(0, 9))
        else:
            s += random.choice(string.ascii_letters)
    return s


# Random text color (RGB)
def random_color():
    return (random.randint(64, 255), random.randint(64, 255), random.randint(64, 255))


# Random background color (RGB)
def random_bgcolor():
    return (random.randint(60, 200), random.randint(60, 200), random.randint(60, 200))


# Font and font size (the font file must exist)
font = ImageFont.truetype('Arial.ttf', 36)

# Image dimensions
width = 240
height = 60

# Create the image
image = Image.new('RGB', (width, height), color='red')

# Create the drawing object
draw = ImageDraw.Draw(image)

# Fill the background with random colors
for i in range(width):
    for j in range(height):
        draw.point((i, j), fill=random_bgcolor())

# Write the characters
for i in range(4):
    draw.text((60*i+10, 10), get_random(1, 4), font=font, fill=random_color())

# Add interference lines
for i in range(2):
    x1 = random.randint(0, width)
    y1 = random.randint(0, height)
    x2 = random.randint(0, width)
    y2 = random.randint(0, height)
    draw.line((x1, y1, x2, y2), fill=random_bgcolor(), width=3)

# Apply the filter
image = image.filter(ImageFilter.BLUR)

# Display the image
image.show()

# Save it
image.save('code.png')

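Originally published on the WeChat public account 自動化軟體測試 (Automated Software Testing); feel free to follow it

Original article: https://mp.weixin.qq.com/s/x3QT8njMX2wKPXKxqDPRyg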
