Code repositories
Hyperf
tesseract-ocr-for-php
Create the project
composer create-project hyperf/biz-skeleton tesseract-demo
Install the package
composer require thiagoalessio/tesseract_ocr
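Write the endpoint
Write the unit tests
First, we write unit tests for the endpoint. When no url is passed, the response carries error code 1000; when an image URL is passed, it returns the text recognized from that image.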
<?php

declare(strict_types=1);
/**
 * This file is part of Hyperf.
 *
 * @link     https://www.hyperf.io
 * @document https://doc.hyperf.io
 * @contact  group@hyperf.io
 * @license  https://github.com/hyperf-cloud/hyperf/blob/master/LICENSE
 */

namespace HyperfTest\Cases;

use HyperfTest\HttpTestCase;

/**
 * @internal
 * @coversNothing
 */
class ImageTest extends HttpTestCase
{
    public function testImageReadFailed()
    {
        $res = $this->json('/image/read');

        $this->assertSame(1000, $res['code']);
    }

    public function testImageRead()
    {
        $res = $this->json('/image/read', [
            'url' => 'https://raw.githubusercontent.com/thiagoalessio/tesseract-ocr-for-php/master/tests/EndToEnd/images/8055.png',
        ]);

        $this->assertSame(0, $res['code']);
        $this->assertSame('8055', $res['data']);
    }
}
Add a Task
Because we cannot be sure that the thiagoalessio/tesseract_ocr library can be scheduled by coroutines, we run the actual logic inside a Task.
composer require hyperf/task
Then update our server.php (config/autoload/server.php in the skeleton):
<?php

declare(strict_types=1);

use Hyperf\Server\SwooleEvent;

return [
    // Other, unrelated options are omitted here
    'settings' => [
        // Number of Task workers; pick a value that suits your server
        'task_worker_num' => 8,
        // Tasks mainly run code that cannot be made coroutine-friendly, so `false` is recommended here to avoid data getting mixed up between coroutines
        'task_enable_coroutine' => false,
    ],
    'callbacks' => [
        // Task callbacks
        SwooleEvent::ON_TASK => [Hyperf\Framework\Bootstrap\TaskCallback::class, 'onTask'],
        SwooleEvent::ON_FINISH => [Hyperf\Framework\Bootstrap\FinishCallback::class, 'onFinish'],
    ],
];
Write the implementation
The implementation is simple: we save the remote image to local disk, then read it with tesseract.
<?php

declare(strict_types=1);
/**
 * This file is part of Hyperf.
 *
 * @link     https://www.hyperf.io
 * @document https://doc.hyperf.io
 * @contact  group@hyperf.io
 * @license  https://github.com/hyperf-cloud/hyperf/blob/master/LICENSE
 */

namespace App\Service;

use Hyperf\Guzzle\ClientFactory;
use Hyperf\Task\Annotation\Task;
use thiagoalessio\TesseractOCR\TesseractOCR;

class ImageService
{
    public function read(string $url)
    {
        $path = $this->save($url);

        return $this->tesseract($path);
    }

    /**
     * @Task
     */
    public function tesseract(string $path)
    {
        return (new TesseractOCR($path))->run();
    }

    protected function save(string $url): string
    {
        $client = di()->get(ClientFactory::class)->create();
        $content = $client->get($url)->getBody()->getContents();

        $path = BASE_PATH . '/runtime/' . uniqid();
        file_put_contents($path, $content);

        return $path;
    }
}
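Note that save() above writes every download into runtime/ under a uniqid() name and never removes it, so files pile up over time. One possible way to handle that, shown here only as a minimal sketch, is to wrap the work in a helper that always deletes the file afterwards. The function name withTempFile and its parameters are hypothetical and are not part of Hyperf or tesseract_ocr; the callback stands in for the OCR call.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (an assumption, not from the original article):
 * writes $content to a temporary file, hands the path to $callback,
 * and deletes the file whether the callback returns or throws.
 */
function withTempFile(string $content, callable $callback, ?string $dir = null)
{
    $path = tempnam($dir ?? sys_get_temp_dir(), 'ocr_');
    if ($path === false) {
        throw new RuntimeException('Unable to create a temporary file.');
    }

    try {
        if (file_put_contents($path, $content) === false) {
            throw new RuntimeException('Unable to write the temporary file.');
        }

        return $callback($path);
    } finally {
        // Runs on success and on exception, so no file is left behind.
        if (is_file($path)) {
            unlink($path);
        }
    }
}

// Usage: the closure stands in for the call to tesseract($path).
$seen = null;
$length = withTempFile('fake image bytes', function (string $path) use (&$seen) {
    $seen = $path;

    return strlen(file_get_contents($path));
});
```

Inside ImageService, the closure would call $this->tesseract($path) and BASE_PATH . '/runtime' could be passed as the directory. The sketch avoids arrow functions so that it also runs on the PHP 7.2 image used later in this article.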
The controller code is as follows:
<?php

declare(strict_types=1);
/**
 * This file is part of Hyperf.
 *
 * @link     https://www.hyperf.io
 * @document https://doc.hyperf.io
 * @contact  group@hyperf.io
 * @license  https://github.com/hyperf-cloud/hyperf/blob/master/LICENSE
 */

namespace App\Controller;

use App\Constants\ErrorCode;
use App\Exception\BusinessException;
use App\Service\ImageService;
use Hyperf\Di\Annotation\Inject;
use Hyperf\HttpServer\Annotation\AutoController;

/**
 * @AutoController(prefix="/image")
 */
class ImageController extends Controller
{
    /**
     * @Inject
     * @var ImageService
     */
    protected $service;

    public function read()
    {
        $url = $this->request->input('url');
        if (empty($url)) {
            throw new BusinessException(ErrorCode::PARAMS_INVALID);
        }

        $result = $this->service->read($url);

        return $this->response->success($result);
    }
}
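The controller above only rejects an empty url. Since the service then downloads whatever address it is given, it may be worth checking the value a little more strictly before passing it on. The following is a minimal sketch of such a check; the function name isHttpUrl is hypothetical, and it only validates syntax and scheme, not whether the host is reachable or safe to fetch from.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (an assumption, not from the original article):
 * returns true only for syntactically valid http or https URLs.
 */
function isHttpUrl($value): bool
{
    if (!is_string($value) || filter_var($value, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // FILTER_VALIDATE_URL accepts other schemes too, so check it explicitly.
    $scheme = strtolower((string) parse_url($value, PHP_URL_SCHEME));

    return in_array($scheme, ['http', 'https'], true);
}

// Usage with a few representative inputs.
$ok = isHttpUrl('https://example.com/8055.png');
$badScheme = isHttpUrl('file:///etc/passwd');
$empty = isHttpUrl('');
```

In the controller, the condition empty($url) could then become ! isHttpUrl($url), which keeps the existing PARAMS_INVALID response for both missing and malformed values.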
Run the unit tests
$ composer test
> co-phpunit -c phpunit.xml --colors=always
Scanning ...
Scan completed.
[DEBUG] Event Hyperf\Framework\Event\BootApplication handled by Hyperf\Di\Listener\BootApplicationListener listener.
[DEBUG] Event Hyperf\Framework\Event\BootApplication handled by Hyperf\Config\Listener\RegisterPropertyHandlerListener listener.
[DEBUG] Event Hyperf\Framework\Event\BootApplication handled by Hyperf\Paginator\Listener\PageResolverListener listener.
PHPUnit 7.5.14 by Sebastian Bergmann and contributors.
... 3 / 3 (100%)
Time: 1.11 seconds, Memory: 42.00 MB
OK (3 tests, 13 assertions)
Of course, you can also call the endpoint with curl and check the result:
$ curl http://127.0.0.1:9501/image/read -H 'Content-Type:application/json' -d '{"url":"https://raw.githubusercontent.com/thiagoalessio/tesseract-ocr-for-php/master/tests/EndToEnd/images/8055.png"}'
{"code":0,"data":"8055"}
Modify the Dockerfile
The default Docker environment does not include tesseract, so we modify the Dockerfile:
# Default Dockerfile
#
# @link https://www.hyperf.io
# @document https://doc.hyperf.io
# @contact group@hyperf.io
# @license https://github.com/hyperf-cloud/hyperf/blob/master/LICENSE
FROM hyperf/hyperf:7.2-alpine-cli
LABEL maintainer="Hyperf Developers <group@hyperf.io>" version="1.0" license="MIT"
##
# ---------- env settings ----------
##
# --build-arg timezone=Asia/Shanghai
ARG timezone
ENV TIMEZONE=${timezone:-"Asia/Shanghai"} \
    COMPOSER_VERSION=1.8.6 \
    APP_ENV=prod
# update
RUN set -ex \
    && apk update \
    # install tesseract-ocr
    && apk add tesseract-ocr \
    # install composer
    && cd /tmp \
    && wget https://github.com/composer/composer/releases/download/${COMPOSER_VERSION}/composer.phar \
    && chmod u+x composer.phar \
    && mv composer.phar /usr/local/bin/composer \
    # show php version and extensions
    && php -v \
    && php -m \
    #  ---------- some config ----------
    && cd /etc/php7 \
    # - config PHP
    && { \
        echo "upload_max_filesize=100M"; \
        echo "post_max_size=108M"; \
        echo "memory_limit=1024M"; \
        echo "date.timezone=${TIMEZONE}"; \
    } | tee conf.d/99-overrides.ini \
    # - config timezone
    && ln -sf /usr/share/zoneinfo/${TIMEZONE} /etc/localtime \
    && echo "${TIMEZONE}" > /etc/timezone \
    # ---------- clear works ----------
    && rm -rf /var/cache/apk/* /tmp/* /usr/share/man \
    && echo -e "\033[42;37m Build Completed :).\033[0m\n"
WORKDIR /opt/www
COPY . /opt/www
RUN composer install --no-dev \
    && composer dump-autoload -o \
    && php /opt/www/bin/hyperf.php di:init-proxy
EXPOSE 9501
ENTRYPOINT ["php", "/opt/www/bin/hyperf.php", "start"]
Build the image
docker build -t tesseract-demo .
Start the container
docker run --rm -p 9501:9501 -d --name tesseract-demo tesseract-demo
Test
$ curl http://127.0.0.1:9501/image/read -H 'Content-Type:application/json' -d '{"url":"https://raw.githubusercontent.com/thiagoalessio/tesseract-ocr-for-php/master/tests/EndToEnd/images/8055.png"}'
{"code":0,"data":"8055"}
This work is licensed under a CC license; reprints must credit the author and link to this article.