Author: 閆永強 (Yan Yongqiang), algorithm engineer, Datawhale member
This article builds a hand-gesture recognition pipeline: a self-built hand dataset is used to train a YOLOv5s detector, an open-source dataset is used to train a SqueezeNet model that predicts the hand keypoints, and the angles between finger vectors are then used to decide the specific gesture and display it. Part 4 gives the complete ncnn inference implementation in C++ (the code is long, so feel free to save it and read later).
I. Training YOLOv5 for Hand Detection
The training and deployment approach is similar to the expression-recognition project: the handpose dataset labels are collapsed into a single class so that only the hand itself is detected, which simplifies the pipeline and makes it easier to get started. A rough relabeling sketch is given right below.
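As a hedged illustration of that relabeling step (not the script used in the article): assuming the annotations have already been exported to YOLOv5 txt format ("class cx cy w h", one object per line) under a hypothetical labels directory, collapsing everything into one hand class just means rewriting the class id:

import glob
import os

label_dir = "handpose/labels"   # hypothetical path to the YOLO-format txt labels

for txt_path in glob.glob(os.path.join(label_dir, "*.txt")):
    with open(txt_path) as f:
        lines = f.readlines()
    with open(txt_path, "w") as f:
        for line in lines:
            parts = line.split()
            if not parts:
                continue
            parts[0] = "0"  # every original class becomes the single "hand" class (id 0)
            f.write(" ".join(parts) + "\n")

After this, the YOLOv5 data yaml only needs nc: 1 and names: ['hand'].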
The dataset for this part is hosted on Graviti: https://gas.graviti.cn/dataset/datawhale/HandPose. The detection results look like this:
Training environment used in this tutorial:
- OS: Ubuntu 16.04
- CUDA: 10.2
- cuDNN: 7.6.5
- PyTorch: 1.6.0
- Python: 3.8

Deployment environment:
- Compiler: VS2015
- Dependencies: OpenCV, ncnn
- Peripheral: a standard USB camera
II. Hand Keypoint Detection
1. Dependencies
Same as the YOLOv5 hand-detection setup in Part 1.
2. Keypoint Dataset Preparation
The dataset consists of web-crawled images plus images filtered from the <Large-scale Multiview 3D Hand Pose Dataset> (keeping only poses with low repetition), about 50k samples in total. The official page of the <Large-scale Multiview 3D Hand Pose Dataset> is http://www.rovit.ua.es/dataset/mhpdataset/. An example annotation file is shown in Figure 2.
The ready-to-train dataset is hosted on the open data platform Graviti: https://gas.graviti.com/dataset/datawhale/HandPoseKeyPoints
3. Using the Dataset Online
Step 1: install the Graviti platform SDK
pip install tensorbay
Step 2: data preprocessing
To use the already-processed, ready-to-train dataset:
a. Open the dataset link for this article, https://gas.graviti.cn/dataset/datawhale/HandPose, and fork the dataset into your own account from the dataset page;
b. Click Developer Tools at the top of the page --> AccessKey --> create a new AccessKey --> copy the key: KEY = 'Acces...........'
With Graviti we can preprocess the data without downloading the whole dataset and save the results locally. The example below uses the HandPose dataset; the HandPoseKeyPoints dataset is used in exactly the same way.
Dataset link:
https://gas.graviti.com/dataset/datawhale/HandPoseKeyPoints
Full project code:
https://github.com/datawhalechina/HandPoseKeyPoints
import numpy as np
from PIL import Image
from tensorbay import GAS
from tensorbay.dataset import Dataset

def read_gas_image(data):
    with data.open() as fp:
        image = Image.open(fp)
        image.load()
    return np.array(image)

# Authorize a GAS client.
gas = GAS('fill in your AccessKey')
# Get a dataset.
dataset = Dataset("HandPose", gas)
dataset.enable_cache("data")
# List dataset segments.
segments = dataset.keys()
# Get a segment by name.
segment = dataset["train"]
for data in segment:
    # Image data
    image = read_gas_image(data)
    # Label data
    # Use the data as you like.
    for label_box2d in data.label.box2d:
        xmin = label_box2d.xmin
        ymin = label_box2d.ymin
        xmax = label_box2d.xmax
        ymax = label_box2d.ymax
        box2d_category = label_box2d.category
    break
Visualization on the dataset page:
# Dataset split
print(segments)
# ("train",'val')
print(len(dataset["train"]), "images in train dataset")
print(len(dataset["val"]), "images in valid dataset")
# 1306 images in train dataset
# 14 images in valid dataset
4. How Keypoint Detection Works
The keypoint-detection pipeline is:
1) each input image is labeled with the hand's 42 coordinate values (21 keypoints × x and y);
2) the backbone can be any classification network; SqueezeNet is used here, with wing loss as the loss function (recalled below);
3) during training, the original image is fed through SqueezeNet to regress the 42 coordinate values, the wing loss drives the weight updates, and once the loss reaches the target threshold the final model is obtained.
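For reference (this is the standard definition, not something specific to this project), the wing loss of Feng et al. (2018) applied to each coordinate error x is

$$
\mathrm{wing}(x)=
\begin{cases}
w\,\ln\!\left(1+\frac{|x|}{\epsilon}\right), & |x|<w\\
|x|-C, & \text{otherwise}
\end{cases}
\qquad C=w-w\,\ln\!\left(1+\frac{w}{\epsilon}\right)
$$

where w bounds the non-linear region and ε controls its curvature. Unlike L2, it keeps a useful gradient for small and medium errors, which is why it suits landmark regression.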
5. Training the Hand Keypoint Model
The hand keypoint model uses the open-source code at: https://gitcode.net/EricLee/handpose_x
1) Pretrained model
A pretrained model is available via the network-drive link in the repository above and can be downloaded directly. If you prefer not to use it, you can train from the original weights of the chosen classification network instead.
2) Training the model
The figure below explains the training parameters; the meaning of each one is given in the annotations in the figure.
To train, simply run the training command with the parameters you want, as shown in the figure below:
6. Converting the Hand Keypoint Model
1) Install the dependencies
pip install onnx coremltools onnx-simplifier
2) Export the ONNX model
python model2onnx.py --model_path squeezenet1_1-size-256-loss-wing_loss-model_epoch-2999.pth --model squeezenet1_1
The output looks like the following figure.
The model2onnx.py script is part of the repository linked above. After it runs, the exported ONNX model appears in the current directory. A quick sanity check of the export is sketched below.
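As an optional check (a minimal sketch, not part of the original tutorial), you can validate the exported graph and run one dummy forward pass, assuming onnx and onnxruntime are installed and the file name matches the one used in the simplification step below:

import numpy as np
import onnx
import onnxruntime as ort

model_path = "squeezenet1_1_size-256.onnx"  # assumed to be the file exported by model2onnx.py

# Structural check of the ONNX graph.
onnx.checker.check_model(onnx.load(model_path))

# One dummy forward pass: 1x3x256x256 input, expecting 42 outputs (21 keypoints x 2).
sess = ort.InferenceSession(model_path)
input_name = sess.get_inputs()[0].name
dummy = np.random.rand(1, 3, 256, 256).astype(np.float32)
out = sess.run(None, {input_name: dummy})[0]
print(out.shape)  # expected: (1, 42)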
3) Simplify the model with onnx-simplifier
Why simplify?
After training a PyTorch or TensorFlow model, you often need to convert it to ONNX, but the exported graph frequently contains nodes that are not needed for inference, such as Cast and Identity nodes. Removing them simplifies the graph and makes it much easier to convert the model to on-device formats such as ncnn or MNN.
python -m onnxsim squeezenet1_1_size-256.onnx squeezenet1_1_sim.onnx
You will see output like the figure below.
Once this completes, the simplified model squeezenet1_1_sim.onnx has been generated.
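The same simplification can also be done from Python instead of the command line; a minimal sketch, assuming the onnx-simplifier (onnxsim) package is installed:

import onnx
from onnxsim import simplify

model = onnx.load("squeezenet1_1_size-256.onnx")
model_simp, check = simplify(model)  # returns the simplified model and a validation flag
assert check, "simplified ONNX model failed the validation check"
onnx.save(model_simp, "squeezenet1_1_sim.onnx")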
4) Convert the model to ncnn
You can convert the model directly with the online tool at https://convertmodel.com/. The page looks like this:
Choose ncnn as the target format and onnx as the input format, click Select and pick the simplified local model, then click Convert. When conversion succeeds, the two files shown below are the converted model files.
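Alternatively, if you have built ncnn's converter tools locally, the same conversion can be done offline with the onnx2ncnn tool instead of the web page (the output names below are chosen to match what the C++ code in Part 4 loads):

onnx2ncnn squeezenet1_1_sim.onnx squeezenet1_1.param squeezenet1_1.bin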
III. Gesture Recognition from Keypoint Angles
By computing the angles between vectors formed by the detected hand keypoints, simple gestures can be recognized. For example, for the thumb we take the angle between the vector from keypoint 0 to keypoint 2 and the vector from keypoint 3 to keypoint 4: if the angle exceeds an empirical threshold the finger is considered bent, and if it is below the threshold it is considered straight (the angle formula is recalled below). The effect is shown in the three images below.
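Concretely, for two 2D vectors $v_1$ and $v_2$ the angle is the usual

$$
\theta=\frac{180}{\pi}\,\arccos\!\left(\frac{v_1\cdot v_2}{\lVert v_1\rVert\,\lVert v_2\rVert}\right)
$$

which is exactly what the vector_2d_angle function computes in the C++ code of Part 4, with 65535 used as a sentinel for invalid angles.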
IV. End-to-End Inference Deployment
To summarize the whole keypoint-based gesture-recognition process: first the object-detection model locates the hand; then the hand-keypoint model predicts the exact keypoint positions, and the keypoints and the connections between them are drawn; finally the angles between simple vectors are used to recognize the gesture.
The complete ncnn inference implementation in C++:
#include <string>
#include <vector>
#include "iostream"
#include<cmath>
// ncnn
#include "ncnn/layer.h"
#include "ncnn/net.h"
#include "ncnn/benchmark.h"
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <opencv2/imgproc.hpp>
#include "opencv2/opencv.hpp"
using namespace std;
using namespace cv;
static ncnn::UnlockedPoolAllocator g_blob_pool_allocator;
static ncnn::PoolAllocator g_workspace_pool_allocator;
static ncnn::Net yolov5;
static ncnn::Net hand_keyPoints;
class YoloV5Focus : public ncnn::Layer
{
public:
YoloV5Focus()
{
one_blob_only = true;
}
virtual int forward(const ncnn::Mat& bottom_blob, ncnn::Mat& top_blob, const ncnn::Option& opt) const
{
int w = bottom_blob.w;
int h = bottom_blob.h;
int channels = bottom_blob.c;
int outw = w / 2;
int outh = h / 2;
int outc = channels * 4;
top_blob.create(outw, outh, outc, 4u, 1, opt.blob_allocator);
if (top_blob.empty())
return -100;
#pragma omp parallel for num_threads(opt.num_threads)
for (int p = 0; p < outc; p++)
{
const float* ptr = bottom_blob.channel(p % channels).row((p / channels) % 2) + ((p / channels) / 2);
float* outptr = top_blob.channel(p);
for (int i = 0; i < outh; i++)
{
for (int j = 0; j < outw; j++)
{
*outptr = *ptr;
outptr += 1;
ptr += 2;
}
ptr += w;
}
}
return 0;
}
};
DEFINE_LAYER_CREATOR(YoloV5Focus)
struct Object
{
float x;
float y;
float w;
float h;
int label;
float prob;
};
static inline float intersection_area(const Object& a, const Object& b)
{
if (a.x > b.x + b.w || a.x + a.w < b.x || a.y > b.y + b.h || a.y + a.h < b.y)
{
// no intersection
return 0.f;
}
float inter_width = std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x);
float inter_height = std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y);
return inter_width * inter_height;
}
static void qsort_descent_inplace(std::vector<Object>& faceobjects, int left, int right)
{
int i = left;
int j = right;
float p = faceobjects[(left + right) / 2].prob;
while (i <= j)
{
while (faceobjects[i].prob > p)
i++;
while (faceobjects[j].prob < p)
j--;
if (i <= j)
{
std::swap(faceobjects[i], faceobjects[j]);
i++;
j--;
}
}
#pragma omp parallel sections
{
#pragma omp section
{
if (left < j) qsort_descent_inplace(faceobjects, left, j);
}
#pragma omp section
{
if (i < right) qsort_descent_inplace(faceobjects, i, right);
}
}
}
static void qsort_descent_inplace(std::vector<Object>& faceobjects)
{
if (faceobjects.empty())
return;
qsort_descent_inplace(faceobjects, 0, faceobjects.size() - 1);
}
static void nms_sorted_bboxes(const std::vector<Object>& faceobjects, std::vector<int>& picked, float nms_threshold)
{
picked.clear();
const int n = faceobjects.size();
std::vector<float> areas(n);
for (int i = 0; i < n; i++)
{
areas[i] = faceobjects[i].w * faceobjects[i].h;
}
for (int i = 0; i < n; i++)
{
const Object& a = faceobjects[i];
int keep = 1;
for (int j = 0; j < (int)picked.size(); j++)
{
const Object& b = faceobjects[picked[j]];
float inter_area = intersection_area(a, b);
float union_area = areas[i] + areas[picked[j]] - inter_area;
// float IoU = inter_area / union_area
if (inter_area / union_area > nms_threshold)
keep = 0;
}
if (keep)
picked.push_back(i);
}
}
static inline float sigmoid(float x)
{
return static_cast<float>(1.f / (1.f + exp(-x)));
}
static void generate_proposals(const ncnn::Mat& anchors, int stride, const ncnn::Mat& in_pad, const ncnn::Mat& feat_blob, float prob_threshold, std::vector<Object>& objects)
{
const int num_grid = feat_blob.h;
int num_grid_x;
int num_grid_y;
if (in_pad.w > in_pad.h)
{
num_grid_x = in_pad.w / stride;
num_grid_y = num_grid / num_grid_x;
}
else
{
num_grid_y = in_pad.h / stride;
num_grid_x = num_grid / num_grid_y;
}
const int num_class = feat_blob.w - 5;
const int num_anchors = anchors.w / 2;
for (int q = 0; q < num_anchors; q++)
{
const float anchor_w = anchors[q * 2];
const float anchor_h = anchors[q * 2 + 1];
const ncnn::Mat feat = feat_blob.channel(q);
for (int i = 0; i < num_grid_y; i++)
{
for (int j = 0; j < num_grid_x; j++)
{
const float* featptr = feat.row(i * num_grid_x + j);
// find class index with max class score
int class_index = 0;
float class_score = -FLT_MAX;
for (int k = 0; k < num_class; k++)
{
float score = featptr[5 + k];
if (score > class_score)
{
class_index = k;
class_score = score;
}
}
float box_score = featptr[4];
float confidence = sigmoid(box_score) * sigmoid(class_score);
if (confidence >= prob_threshold)
{
float dx = sigmoid(featptr[0]);
float dy = sigmoid(featptr[1]);
float dw = sigmoid(featptr[2]);
float dh = sigmoid(featptr[3]);
float pb_cx = (dx * 2.f - 0.5f + j) * stride;
float pb_cy = (dy * 2.f - 0.5f + i) * stride;
float pb_w = pow(dw * 2.f, 2) * anchor_w;
float pb_h = pow(dh * 2.f, 2) * anchor_h;
float x0 = pb_cx - pb_w * 0.5f;
float y0 = pb_cy - pb_h * 0.5f;
float x1 = pb_cx + pb_w * 0.5f;
float y1 = pb_cy + pb_h * 0.5f;
Object obj;
obj.x = x0;
obj.y = y0;
obj.w = x1 - x0;
obj.h = y1 - y0;
obj.label = class_index;
obj.prob = confidence;
objects.push_back(obj);
}
}
}
}
}
extern "C" {
void release()
{
fprintf(stderr, "YoloV5Ncnn finished!");
}
int init_handKeyPoint() {
ncnn::Option opt;
opt.lightmode = true;
opt.num_threads = 4;
opt.blob_allocator = &g_blob_pool_allocator;
opt.workspace_allocator = &g_workspace_pool_allocator;
opt.use_packing_layout = true;
fprintf(stderr, "handKeyPoint init!\n");
hand_keyPoints.opt = opt;
int ret_hand = hand_keyPoints.load_param("squeezenet1_1.param"); //squeezenet1_1 resnet_50
if (ret_hand != 0) {
std::cout << "ret_hand:" << ret_hand << std::endl;
}
ret_hand = hand_keyPoints.load_model("squeezenet1_1.bin"); //squeezenet1_1 resnet_50
if (ret_hand != 0) {
std::cout << "ret_hand:" << ret_hand << std::endl;
}
return 0;
}
int init()
{
fprintf(stderr, "YoloV5sNcnn init!\n");
ncnn::Option opt;
opt.lightmode = true;
opt.num_threads = 4;
opt.blob_allocator = &g_blob_pool_allocator;
opt.workspace_allocator = &g_workspace_pool_allocator;
opt.use_packing_layout = true;
yolov5.opt = opt;
yolov5.register_custom_layer("YoloV5Focus", YoloV5Focus_layer_creator);
// init param
{
int ret = yolov5.load_param("yolov5s.param");
if (ret != 0)
{
std::cout << "ret= " << ret << std::endl;
fprintf(stderr, "YoloV5Ncnn, load_param failed");
return -301;
}
}
// init bin
{
int ret = yolov5.load_model("yolov5s.bin");
if (ret != 0)
{
fprintf(stderr, "YoloV5Ncnn, load_model failed");
return -301;
}
}
return 0;
}
int detect(cv::Mat img, std::vector<Object> &objects)
{
double start_time = ncnn::get_current_time();
const int target_size = 320;
const int width = img.cols;
const int height = img.rows;
int w = img.cols;
int h = img.rows;
float scale = 1.f;
if (w > h)
{
scale = (float)target_size / w;
w = target_size;
h = h * scale;
}
else
{
scale = (float)target_size / h;
h = target_size;
w = w * scale;
}
cv::resize(img, img, cv::Size(w, h));
ncnn::Mat in = ncnn::Mat::from_pixels(img.data, ncnn::Mat::PIXEL_BGR2RGB, w, h);
int wpad = (w + 31) / 32 * 32 - w;
int hpad = (h + 31) / 32 * 32 - h;
ncnn::Mat in_pad;
ncnn::copy_make_border(in, in_pad, hpad / 2, hpad - hpad / 2, wpad / 2, wpad - wpad / 2, ncnn::BORDER_CONSTANT, 114.f);
{
const float prob_threshold = 0.4f;
const float nms_threshold = 0.51f;
const float norm_vals[3] = { 1 / 255.f, 1 / 255.f, 1 / 255.f };
in_pad.substract_mean_normalize(0, norm_vals);
ncnn::Extractor ex = yolov5.create_extractor();
ex.input("images", in_pad);
std::vector<Object> proposals;
{
ncnn::Mat out;
ex.extract("output", out);
ncnn::Mat anchors(6);
anchors[0] = 10.f;
anchors[1] = 13.f;
anchors[2] = 16.f;
anchors[3] = 30.f;
anchors[4] = 33.f;
anchors[5] = 23.f;
std::vector<Object> objects8;
generate_proposals(anchors, 8, in_pad, out, prob_threshold, objects8);
proposals.insert(proposals.end(), objects8.begin(), objects8.end());
}
{
ncnn::Mat out;
ex.extract("771", out);
ncnn::Mat anchors(6);
anchors[0] = 30.f;
anchors[1] = 61.f;
anchors[2] = 62.f;
anchors[3] = 45.f;
anchors[4] = 59.f;
anchors[5] = 119.f;
std::vector<Object> objects16;
generate_proposals(anchors, 16, in_pad, out, prob_threshold, objects16);
proposals.insert(proposals.end(), objects16.begin(), objects16.end());
}
{
ncnn::Mat out;
ex.extract("791", out);
ncnn::Mat anchors(6);
anchors[0] = 116.f;
anchors[1] = 90.f;
anchors[2] = 156.f;
anchors[3] = 198.f;
anchors[4] = 373.f;
anchors[5] = 326.f;
std::vector<Object> objects32;
generate_proposals(anchors, 32, in_pad, out, prob_threshold, objects32);
proposals.insert(proposals.end(), objects32.begin(), objects32.end());
}
// sort all proposals by score from highest to lowest
qsort_descent_inplace(proposals);
std::vector<int> picked;
nms_sorted_bboxes(proposals, picked, nms_threshold);
int count = picked.size();
objects.resize(count);
for (int i = 0; i < count; i++)
{
objects[i] = proposals[picked[i]];
float x0 = (objects[i].x - (wpad / 2)) / scale;
float y0 = (objects[i].y - (hpad / 2)) / scale;
float x1 = (objects[i].x + objects[i].w - (wpad / 2)) / scale;
float y1 = (objects[i].y + objects[i].h - (hpad / 2)) / scale;
// clip
x0 = std::max(std::min(x0, (float)(width - 1)), 0.f);
y0 = std::max(std::min(y0, (float)(height - 1)), 0.f);
x1 = std::max(std::min(x1, (float)(width - 1)), 0.f);
y1 = std::max(std::min(y1, (float)(height - 1)), 0.f);
objects[i].x = x0;
objects[i].y = y0;
objects[i].w = x1; // note: from here on w and h store the bottom-right corner (x1, y1), not width/height
objects[i].h = y1;
}
}
return 0;
}
}
static const char* class_names[] = {"hand"};
void draw_face_box(cv::Mat& bgr, std::vector<Object> object)
{
for (int i = 0; i < object.size(); i++)
{
const auto obj = object[i];
cv::rectangle(bgr, cv::Point(obj.x, obj.y), cv::Point(obj.w, obj.h), cv::Scalar(0, 255, 0), 3, 8, 0);
std::cout << "label:" << class_names[obj.label] << std::endl;
string emoji_path = "emoji\\" + string(class_names[obj.label]) + ".png";
cv::Mat logo = cv::imread(emoji_path);
if (logo.empty()) {
std::cout << "imread logo failed!!!" << std::endl;
return;
}
resize(logo, logo, cv::Size(80, 80));
cv::Mat imageROI = bgr(cv::Range(obj.y, obj.y + logo.rows), cv::Range(obj.x, obj.x + logo.cols)); // cv::Range order is (rows, cols), i.e. the y range comes first
logo.copyTo(imageROI);
}
}
static int detect_resnet(const cv::Mat& bgr,std::vector<float>& output) {
ncnn::Mat in = ncnn::Mat::from_pixels_resize(bgr.data,ncnn::Mat::PIXEL_RGB,bgr.cols,bgr.rows,256,256);
const float mean_vals[3] = { 104.f,117.f,123.f };//
const float norm_vals[3] = { 1/255.f, 1/255.f, 1/255.f };//1/255.f
in.substract_mean_normalize(mean_vals, norm_vals); //0 mean_vals, norm_vals
ncnn::Extractor ex = hand_keyPoints.create_extractor();
ex.input("input", in);
ncnn::Mat out;
ex.extract("output",out);
std::cout << "out.w:" << out.w <<" out.h: "<< out.h <<std::endl;
output.resize(out.w);
for (int i = 0; i < out.w; i++) {
output[i] = out[i];
}
return 0;
}
float vector_2d_angle(cv::Point p1,cv::Point p2) {
//angle between two 2D vectors, in degrees
float angle = 0.0;
try {
float radian_value = acos((p1.x*p2.x+p1.y*p2.y)/(sqrt(p1.x*p1.x+p1.y*p1.y)*sqrt(p2.x*p2.x+p2.y*p2.y)));
angle = 180*radian_value/3.1415;
}catch(...){
angle = 65535.;
}
if (angle > 180. || std::isnan(angle)) { // degenerate vectors produce NaN; map those to the sentinel as well
angle = 65535.;
}
return angle;
}
std::vector<float> hand_angle(std::vector<int>& hand_x,std::vector<int>& hand_y) {
//compute the 2D angles of the per-finger vectors; the gesture is decided from these angles
float angle = 0.0;
std::vector<float> angle_list;
//------------------- thumb angle
angle = vector_2d_angle(cv::Point((hand_x[0]-hand_x[2]),(hand_y[0]-hand_y[2])),cv::Point((hand_x[3]-hand_x[4]),(hand_y[3]-hand_y[4])));
angle_list.push_back(angle);
//-------------------- index finger angle
angle = vector_2d_angle(cv::Point((hand_x[0] - hand_x[6]), (hand_y[0] - hand_y[6])), cv::Point((hand_x[7] - hand_x[8]), (hand_y[7] - hand_y[8])));
angle_list.push_back(angle);
//--------------------- middle finger angle
angle = vector_2d_angle(cv::Point((hand_x[0] - hand_x[10]), (hand_y[0] - hand_y[10])), cv::Point((hand_x[11] - hand_x[12]), (hand_y[11] - hand_y[12])));
angle_list.push_back(angle);
//---------------------- ring finger angle
angle = vector_2d_angle(cv::Point((hand_x[0] - hand_x[14]), (hand_y[0] - hand_y[14])), cv::Point((hand_x[15] - hand_x[16]), (hand_y[15] - hand_y[16])));
angle_list.push_back(angle);
//----------------------- pinky finger angle
angle = vector_2d_angle(cv::Point((hand_x[0] - hand_x[18]), (hand_y[0] - hand_y[18])), cv::Point((hand_x[19] - hand_x[20]), (hand_y[19] - hand_y[20])));
angle_list.push_back(angle);
return angle_list;
}
string h_gestrue(std::vector<float>& angle_lists) {
//gestures defined via 2D angle constraints
//fist five gun love one six three thumbup yeah
float thr_angle = 65.;
float thr_angle_thumb = 53.;
float thr_angle_s = 49.;
string gesture_str;
bool flag = false;
for (int i = 0; i < angle_lists.size(); i++) {
if (abs(65535 - int(angle_lists[i])) > 0) {
flag = true; // at least one angle is valid, so gesture classification can proceed
}
}
std::cout << "flag:" << flag << std::endl;
if (flag) {
if (angle_lists[0] > thr_angle_thumb && angle_lists[1] > thr_angle
&& angle_lists[2] > thr_angle && angle_lists[3] > thr_angle
&& angle_lists[4] > thr_angle) {
gesture_str = "fist";
}
else if (angle_lists[0] < thr_angle_s && angle_lists[1] < thr_angle_s
&& angle_lists[2] < thr_angle_s && angle_lists[3] < thr_angle_s
&& angle_lists[4] < thr_angle_s) {
gesture_str = "five";
}
else if(angle_lists[0] < thr_angle_s && angle_lists[1] < thr_angle_s
&& angle_lists[2] > thr_angle && angle_lists[3] > thr_angle
&& angle_lists[4] > thr_angle){
gesture_str = "gun";
}
else if (angle_lists[0] < thr_angle_s && angle_lists[1] < thr_angle_s
&& angle_lists[2] > thr_angle && angle_lists[3] > thr_angle
&& angle_lists[4] < thr_angle_s) {
gesture_str = "love";
}
else if (angle_lists[0] < 5 && angle_lists[1] < thr_angle_s
&& angle_lists[2] > thr_angle && angle_lists[3] > thr_angle
&& angle_lists[4] > thr_angle) {
gesture_str = "one";
}
else if (angle_lists[0] < thr_angle_s && angle_lists[1] > thr_angle
&& angle_lists[2] > thr_angle && angle_lists[3] > thr_angle
&& angle_lists[4] < thr_angle_s) {
gesture_str = "six";
}
else if (angle_lists[0] > thr_angle_thumb && angle_lists[1] < thr_angle_s
&& angle_lists[2] < thr_angle_s && angle_lists[3] < thr_angle_s
&& angle_lists[4] > thr_angle) {
gesture_str = "three";
}
else if (angle_lists[0] < thr_angle_s && angle_lists[1] > thr_angle
&& angle_lists[2] > thr_angle && angle_lists[3] > thr_angle
&& angle_lists[4] > thr_angle) {
gesture_str = "thumbUp";
}
else if (angle_lists[0] > thr_angle_thumb && angle_lists[1] < thr_angle_s
&& angle_lists[2] < thr_angle_s && angle_lists[3] > thr_angle
&& angle_lists[4] > thr_angle) {
gesture_str = "two";
}
}
return gesture_str;
}
int main()
{
Mat frame;
VideoCapture capture(0);
init();
init_handKeyPoint();
while (true)
{
capture >> frame;
if (!frame.empty()) {
std::vector<Object> objects;
detect(frame, objects);
std::vector<float> hand_output;
for (int j = 0; j < objects.size(); ++j) {
cv::Mat handRoi;
int x, y, w, h;
try {
x = (int)objects[j].x < 0 ? 0 : (int)objects[j].x;
y = (int)objects[j].y < 0 ? 0 : (int)objects[j].y;
w = (int)objects[j].w < 0 ? 0 : (int)objects[j].w;
h = (int)objects[j].h < 0 ? 0 : (int)objects[j].h;
if (w > frame.cols){
w = frame.cols;
}
if (h > frame.rows) {
h = frame.rows;
}
}
catch (cv::Exception e) {
}
//expand the hand region outward by 30 pixels
x = max(0, x - 30);
y = max(0, y - 30);
// after detect(), w and h actually hold the box's bottom-right corner (x1, y1), not width/height
int w_ = min(w - x + 30, frame.cols - x); // clamp the expanded ROI to the frame to avoid an out-of-range crop
int h_ = min(h - y + 30, frame.rows - y);
cv::Rect roi(x, y, w_, h_);
handRoi = frame(roi);
cv::resize(handRoi,handRoi,cv::Size(256,256));
detect_resnet(handRoi, hand_output);
std::vector<float> angle_lists;
string gesture_string;
std::vector<int> hand_points_x; // keypoint x coordinates inside the hand ROI
std::vector<int> hand_points_y; // keypoint y coordinates inside the hand ROI
for (int k = 0; k < hand_output.size()/2; k++) {
int x = int(hand_output[k * 2 + 0] * handRoi.cols);//+int(roi.x)-1;
int y = int(hand_output[k * 2 + 1] * handRoi.rows);// +int(roi.y) - 1;
hand_points_x.push_back(x);
hand_points_y.push_back(y);
std::cout << "x1: " << x << " x2: " << y << std::endl;
cv::circle(handRoi, cv::Point(x,y), 3, (0, 255, 0), 3);
cv::circle(handRoi, cv::Point(x,y), 3, (0, 255, 0), 3);
}
cv::line(handRoi, cv::Point(hand_points_x[0], hand_points_y[0]), cv::Point(hand_points_x[1], hand_points_y[1]), cv::Scalar(255, 0, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[1], hand_points_y[1]), cv::Point(hand_points_x[2], hand_points_y[2]), cv::Scalar(255, 0, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[2], hand_points_y[2]), cv::Point(hand_points_x[3], hand_points_y[3]), cv::Scalar(255, 0, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[3], hand_points_y[3]), cv::Point(hand_points_x[4], hand_points_y[4]), cv::Scalar(255, 0, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[0], hand_points_y[0]), cv::Point(hand_points_x[5], hand_points_y[5]), cv::Scalar(0, 255, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[5], hand_points_y[5]), cv::Point(hand_points_x[6], hand_points_y[6]), cv::Scalar(0, 255, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[6], hand_points_y[6]), cv::Point(hand_points_x[7], hand_points_y[7]), cv::Scalar(0, 255, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[7], hand_points_y[7]), cv::Point(hand_points_x[8], hand_points_y[8]), cv::Scalar(0, 255, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[0], hand_points_y[0]), cv::Point(hand_points_x[9], hand_points_y[9]), cv::Scalar(0, 0, 255), 3);
cv::line(handRoi, cv::Point(hand_points_x[9], hand_points_y[9]), cv::Point(hand_points_x[10], hand_points_y[10]), cv::Scalar(0, 0, 255), 3);
cv::line(handRoi, cv::Point(hand_points_x[10], hand_points_y[10]), cv::Point(hand_points_x[11], hand_points_y[11]), cv::Scalar(0, 0, 255), 3);
cv::line(handRoi, cv::Point(hand_points_x[11], hand_points_y[11]), cv::Point(hand_points_x[12], hand_points_y[12]), cv::Scalar(0, 0, 255), 3);
cv::line(handRoi, cv::Point(hand_points_x[0], hand_points_y[0]), cv::Point(hand_points_x[13], hand_points_y[13]), cv::Scalar(255, 0, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[13], hand_points_y[13]), cv::Point(hand_points_x[14], hand_points_y[14]), cv::Scalar(255, 0, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[14], hand_points_y[14]), cv::Point(hand_points_x[15], hand_points_y[15]), cv::Scalar(255, 0, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[15], hand_points_y[15]), cv::Point(hand_points_x[16], hand_points_y[16]), cv::Scalar(255, 0, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[0], hand_points_y[0]), cv::Point(hand_points_x[17], hand_points_y[17]), cv::Scalar(0, 255, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[17], hand_points_y[17]), cv::Point(hand_points_x[18], hand_points_y[18]), cv::Scalar(0, 255, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[18], hand_points_y[18]), cv::Point(hand_points_x[19], hand_points_y[19]), cv::Scalar(0, 255, 0), 3);
cv::line(handRoi, cv::Point(hand_points_x[19], hand_points_y[19]), cv::Point(hand_points_x[20], hand_points_y[20]), cv::Scalar(0, 255, 0), 3);
angle_lists = hand_angle(hand_points_x, hand_points_y);
gesture_string = h_gestrue(angle_lists);
std::cout << "getsture_string:" << gesture_string << std::endl;
cv::putText(handRoi,gesture_string,cv::Point(30,30),cv::FONT_HERSHEY_COMPLEX,1, cv::Scalar(0, 255, 255), 1, 1, 0);
cv::imshow("handRoi", handRoi);
cv::waitKey(10);
angle_lists.clear();
hand_points_x.clear();
hand_points_y.clear();
}
}
if (cv::waitKey(20) == 'q')
break;
}
capture.release();
return 0;
}
For more information, visit the Graviti official website.