Getting started with PaddleOCR and using it to build a question-search tool

Posted by 陝西顏值扛把子 on 2021-09-04

Original URL: https://www.cnblogs.com/puzhiwei/p/15227450.html

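Introduction

PaddleOCR is an OCR toolkit built on Baidu's PaddlePaddle. It includes an ultra-lightweight Chinese OCR system whose models total only 8.6M, and a single model can recognize mixed Chinese, English and digits, vertical text, and long text. It also supports multiple training algorithms for text detection and text recognition.

This tutorial covers the basic usage of PaddleOCR and shows how to use it to build a small tool that searches for quiz questions automatically.

Project repositories: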

https://gitee.com/puzhiweizuishuai/OCR-CopyText-And-Search

https://github.com/PuZhiweizuishuai/OCR-CopyText-And-Search

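Installation

Although PaddleOCR supports server-side deployment and provides a recognition API, what we need here is a local, offline OCR environment, so this tutorial only covers installing PaddleOCR and running recognition locally.

Installing the PaddlePaddle framework

1. Environment preparation

1.1 Environments currently supported by PaddlePaddle

Windows 7/8/10 Professional/Enterprise (64bit)

The GPU version supports CUDA 10.1/10.2/11.0/11.2, single GPU only

Python 3.6+/3.7+/3.8+/3.9+ (64 bit)

pip 20.2.2 or later (64 bit)

2. Install command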

pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple

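(Note: this installs the CPU version. For the GPU version, see the PaddlePaddle documentation.)

After installation, run python to enter the Python interpreter, type import paddle, then type paddle.utils.run_check()

If the message PaddlePaddle is installed successfully! appears, the installation succeeded.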

Installing PaddleOCR

pip install "paddleocr>=2.0.1" # version 2.0.1 or later is recommended
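Before going further it can help to confirm that both packages are visible to the interpreter you will actually run. The sketch below is a hypothetical helper (is_installed is not part of PaddlePaddle or PaddleOCR) that uses only the standard library. It assumes the import names are paddle and paddleocr, as used elsewhere in this article.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the import system can locate a top-level package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Report the status of the two packages this tutorial depends on.
    for name in ("paddle", "paddleocr"):
        status = "installed" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

This only checks that the package can be found. It does not replace paddle.utils.run_check(), which verifies that the framework actually works.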

Using it in code

Once installation is complete, you can run a simple functional test with the following code


from paddleocr import PaddleOCR, draw_ocr

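# PaddleOCR currently supports Chinese+English, English, French, German, Korean and Japanese; switch between them with the lang parameter
# The corresponding values are `ch`, `en`, `french`, `german`, `korean`, `japan`.
ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # need to run only once to download and load model into memory
# Path of the image to recognize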
img_path = '11.jpg'
result = ocr.ocr(img_path, cls=True)
for line in result:
    print(line)

# Draw the result and save it as an image
from PIL import Image

image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

The result is a list. Each item contains the text box (four corner points), the recognized text, and the recognition confidence

[[[24.0, 36.0], [304.0, 34.0], [304.0, 72.0], [24.0, 74.0]], ['純臻營養護髮素', 0.964739]]
[[[24.0, 80.0], [172.0, 80.0], [172.0, 104.0], [24.0, 104.0]], ['產品資訊/引數', 0.98069626]]
[[[24.0, 109.0], [333.0, 109.0], [333.0, 136.0], [24.0, 136.0]], ['（45元/每公斤，100公斤起訂）', 0.9676722]]
......
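Since each item pairs a box with a (text, confidence) pair, the recognized lines can be filtered by confidence before they are joined. This is a minimal sketch, assuming the flat result layout shown above; if your installed PaddleOCR version nests the result differently, adjust the loop accordingly. The helper name extract_text and the min_score threshold are illustrative and not part of PaddleOCR.

```python
def extract_text(result, min_score=0.9):
    """Join the text of every line whose confidence is at least min_score."""
    texts = []
    for _box, (text, score) in result:
        if score >= min_score:
            texts.append(text)
    return "".join(texts)


# Two items copied from the sample output above.
sample = [
    [[[24.0, 36.0], [304.0, 34.0], [304.0, 72.0], [24.0, 74.0]], ["純臻營養護髮素", 0.964739]],
    [[[24.0, 80.0], [172.0, 80.0], [172.0, 104.0], [24.0, 104.0]], ["產品資訊/引數", 0.98069626]],
]

# Only the second line passes a 0.97 threshold.
print(extract_text(sample, min_score=0.97))
```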

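Visualization result

That covers the basic usage of PaddleOCR, which is enough to build an OCR-based question-search tool.

For more usage details, see: https://aistudio.baidu.com/aistudio/projectdetail/507159

Question-search tool

There are now many quiz-competition mini games in which players compete on accuracy within a time limit. Some workplaces also run training competitions that require answering questions online. Typing each question by hand to search for it is slow and inefficient, so we can write a script to do the searching for us.

The basic idea: capture the current screen through ADB, crop the region that contains the question, extract the question text with PaddleOCR, then open a search engine or a question bank and search for it.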

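Installing ADB

Download and install ADB, then add it to your PATH environment variable.

After configuring the environment variable, type adb in a terminal. If output like the following appears, adb is installed.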

Android Debug Bridge version 1.0.41
Version 31.0.3-7562133
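The same check can be done from Python before the script tries to take a screenshot. This is a sketch under the assumption that adb is expected on PATH. The helper name tool_version is hypothetical, and it relies on the adb version subcommand printing a banner like the one above.

```python
import shutil
import subprocess


def tool_version(command="adb"):
    """Return the first line printed by `<command> version`, or None if unavailable."""
    if shutil.which(command) is None:
        return None
    try:
        completed = subprocess.run(
            [command, "version"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    lines = completed.stdout.splitlines()
    return lines[0] if lines else None


print(tool_version("adb") or "adb was not found on PATH")
```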

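Take a screenshot and save the question region

import os
from PIL import Image

# Take a screenshot on the device and pull it into the current directory
def pull_screenshot():
    os.system('adb shell screencap -p /sdcard/screenshot.png')
    os.system('adb pull /sdcard/screenshot.png .')

pull_screenshot()
img = Image.open("./screenshot.png")
# Crop the question region
# The box is (left, upper, right, lower) in pixels; adjust it to your screen
question = img.crop((10, 400, 1060, 1000))
# Save the question region
question.save("./question.png")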
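Pillow's Image.crop takes a (left, upper, right, lower) box rather than a position plus a size, which is easy to mix up when measuring the question region on a screenshot. The sketch below uses two hypothetical helpers, box_from_xywh and clamp_box, to convert a position-and-size measurement into a crop box and to keep the box inside the image. The 1080x900 image size in the example is only an assumption for illustration.

```python
def box_from_xywh(x, y, width, height):
    """Convert (x, y, width, height) into a (left, upper, right, lower) crop box."""
    return (x, y, x + width, y + height)


def clamp_box(box, image_size):
    """Clip a (left, upper, right, lower) box to an image of size (width, height)."""
    left, upper, right, lower = box
    img_w, img_h = image_size
    return (max(0, left), max(0, upper), min(img_w, right), min(img_h, lower))


# A region starting at (10, 400) that is 1050 px wide and 600 px tall
# gives the same box as the one passed to img.crop above.
box = box_from_xywh(10, 400, 1050, 600)
print(box)
# On a shorter image the lower edge is clipped to the image height.
print(clamp_box(box, (1080, 900)))
```

With a real screenshot, the result can be passed straight to crop, for example img.crop(clamp_box(box, img.size)).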

OCR recognition: get the question text

from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=False, 
                        lang="ch", 
                        show_log=False
                        )  # need to run only once to download and load model into memory
img_path = 'question.png'
result = ocr.ocr(img_path, cls=False)

# Collect the recognized text of each line
questionList = [line[1][0] for line in result]
# Join the list into a single string
text = "".join(questionList)
print(text)

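Open the browser and search

import urllib.parse
import webbrowser

# `text` is the question string produced by the OCR step
webbrowser.open('https://baidu.com/s?wd=' + urllib.parse.quote(text))

You can then review the search results.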
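Question text often contains spaces, punctuation, or non-ASCII characters, so it has to be percent-encoded before it goes into the query string. A minimal sketch of building the URL separately from opening it is shown below. The helper name build_search_url and its base and param arguments are illustrative assumptions, not part of the original script.

```python
import urllib.parse


def build_search_url(query, base="https://baidu.com/s", param="wd"):
    """Return a search URL with the query percent-encoded as UTF-8."""
    # safe="" also encodes "/" so the whole query stays inside one parameter.
    return f"{base}?{param}={urllib.parse.quote(query, safe='')}"


print(build_search_url("a b&c"))
```

The returned string can then be passed to webbrowser.open. Keeping the URL construction in its own function makes it easy to swap in another search engine.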

If you have a question bank, you can also use pyautogui to simulate mouse and keyboard input and drive software such as Word to search within it.

Complete code

# -*- coding: utf-8 -*-

# @Author  : Pu Zhiwei
# @Time    : 2021-09-02 20:29

from PIL import Image
import os
from paddleocr import PaddleOCR
import pyperclip
import pyautogui
import time
import webbrowser
import urllib.parse


# Mouse position
currentMouseX, currentMouseY = 60, 282

# Take a screenshot of the current question
def pull_screenshot():
    os.system('adb shell screencap -p /sdcard/screenshot.png')
    os.system('adb pull /sdcard/screenshot.png .')

# Move the mouse to the search box and search
def MoveMouseToSearch():
    # duration: time taken for the move, i.e. 0.1 s to reach the target position
    pyautogui.moveTo(currentMouseX, currentMouseY, duration=0.1)
    # Left-click twice
    pyautogui.click()
    pyautogui.click()
    # Simulate the paste key combination
    pyautogui.hotkey('ctrl', 'v')

# Extend the question with the fourth recognized line, if there is one
def AddText(lines, length, text):
    if length > 3:
        return text + lines[3]
    else:
        return text

# Open the browser
def open_webbrowser(question):
    webbrowser.open('https://baidu.com/s?wd=' + urllib.parse.quote(question))


# Print the full recognized question text
def ShowAllQuestionText(lines):
    print("".join(lines))



if __name__ == "__main__":
    while True:
        print("\n\nPlace the mouse over the Word search box; its position will be captured in three seconds.\n\n")
        # Wait three seconds before reading the mouse position
        time.sleep(3)
        # Get the current mouse position
        currentMouseX, currentMouseY = pyautogui.position()
        print('Current mouse position: {0} , {1}'.format(currentMouseX, currentMouseY))
        start = input("Enter y to start, or anything else to capture the search box position again: ")
        if start == 'y':
            break

    # Load PaddleOCR once, before the loop, so the model is not reloaded for every question
    # PaddleOCR currently supports Chinese+English, English, French, German, Korean and Japanese; switch with the lang parameter
    # The corresponding values are `ch`, `en`, `french`, `german`, `korean`, `japan`.

    # Custom model paths
    # det_model_dir='./inference/ch_ppocr_server_v2.0_det_train', 
    #                rec_model_dir='./inference/ch_ppocr_server_v2.0_rec_pre',
    #                cls_model_dir='./inference/ch_ppocr_mobile_v2.0_cls_train',
    ocr = PaddleOCR(use_angle_cls=False, 
                    lang="ch", 
                    show_log=False
                    )  # need to run only once to download and load model into memory

    while True:
        t = time.perf_counter()
        pull_screenshot()
        img = Image.open("./screenshot.png")
        # Crop the question region
        # The box is (left, upper, right, lower) in pixels; adjust it to your screen
        question = img.crop((10, 400, 1060, 1000))
        # Save the question region
        question.save("./question.png")

        img_path = 'question.png'
        result = ocr.ocr(img_path, cls=False)

        # Treat a missing result as "no text recognized"
        questionList = [line[1][0] for line in (result or [])]
        length = len(questionList)
        # Skip the first line when there is more than one; leave text empty if nothing was recognized
        text = ""
        if length == 1:
            text = questionList[0]
        elif length == 2:
            text = questionList[1]
        elif length > 2:
            text = questionList[1] + questionList[2]

        print('\n\n')
        ShowAllQuestionText(questionList)
        # Copy the result to the clipboard
        pyperclip.copy(text)
        # Click the search box and paste
        MoveMouseToSearch()
        
        # Report the elapsed time
        print('\n\n')
        end_time3 = time.perf_counter()
        print('Elapsed: {0}'.format(end_time3 - t))
        
        go = input('Press Enter to continue, e to search in the browser, a to extend the question text, n to quit: ')
        if go == 'n':
            break
  
        if go == 'a':
            text = AddText(questionList, length, text)
            pyperclip.copy(text)
            # Click the search box and paste
            MoveMouseToSearch()
            stop = input("Press Enter to continue")
        elif go == 'e':
            # Open the browser
            open_webbrowser(text)
            stop = input("Press Enter to continue")

        print('\n------------------------\n\n')
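The way the script chooses which recognized lines to search for can be isolated into a small pure function, which makes it testable without a phone or OCR model. This is a sketch, assuming (as the script's indexing suggests) that when several lines are recognized the first one is a header such as a question number and is skipped. The name pick_question_text and the extra flag are hypothetical.

```python
def pick_question_text(lines, extra=False):
    """Choose the search text from OCR lines, skipping a leading header line.

    With extra=True, the fourth line is appended when it exists, mirroring
    the script's "a" option for extending the question.
    """
    if not lines:
        return ""
    if len(lines) == 1:
        return lines[0]
    text = "".join(lines[1:3])
    if extra and len(lines) > 3:
        text += lines[3]
    return text


print(pick_question_text(["1/10", "Which planet is ", "closest to the sun?"]))
```

Handling the empty list explicitly avoids an IndexError when the cropped region contains no recognizable text.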
