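Deep Learning-Based Object Detection with OpenCV and YOLOv3

Posted by 峻峰飛陽 on 2019-04-16

Source URL: https://blog.csdn.net/while0/article/details/89337575

This article is a translation of "Deep Learning based Object Detection using YOLOv3 with OpenCV ( Python / C++ )".

In this article we learn how to use YOLOv3, a state-of-the-art object detector, with OpenCV.

YOLOv3 is the latest variant of the popular object detection algorithm YOLO (You Only Look Once). The published model recognizes 80 kinds of objects in images and videos. More importantly, it is fast, and its accuracy is close to that of the Single Shot MultiBox Detector (SSD).

Starting with OpenCV 3.4.2, the YOLOv3 model can easily be used in an OpenCV application (that is, OpenCV 3.4.2 is the first release that supports the YOLOv3 network).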

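How does YOLO work?

Object detection can be viewed as a combination of object localization and object recognition.

Traditional computer vision approaches use a sliding window to look for objects at different locations and scales. Because this is an expensive operation, the aspect ratio of the object is usually assumed to be fixed.

Early deep learning based object detection algorithms, such as R-CNN and Fast R-CNN, used a method called Selective Search to narrow down the number of bounding boxes that had to be tested. (In this article, a bounding box is the rectangle drawn on the image around an object once it has been predicted.)

Another approach, called Overfeat, scans the image at multiple scales using a sliding window that is computed convolutionally.

This was followed by Faster R-CNN, which uses a Region Proposal Network (RPN) to identify the bounding boxes that need to be tested. By clever design, the features extracted for recognizing objects are also used by the RPN to propose potential bounding boxes, which saves a lot of computation.

YOLO, however, approaches the object detection problem in a completely different way. It forwards the whole image through the neural network only once. SSD is another method that detects objects in a single forward pass of the network, but YOLOv3 is faster than SSD while achieving comparable accuracy. On GPUs such as the M40, TitanX and 1080 Ti, YOLOv3 delivers faster-than-real-time results.

Let's see how YOLO detects the objects in a given image.

First, it divides the image into a 13x13 grid of cells. The size of these 169 cells depends on the size of the input. For a 416x416 pixel input, each cell is 32x32 pixels. Each cell is then responsible for predicting a number of bounding boxes in the image.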
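
The grid arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 13x13 example in the text; the helper names cell_size and cell_index are made up for this sketch and are not part of OpenCV or Darknet.

```python
def cell_size(input_size, grid=13):
    """Side length, in pixels, of one grid cell for a square input."""
    return input_size / grid


def cell_index(x, y, input_size, grid=13):
    """(column, row) of the grid cell that contains pixel (x, y)."""
    size = cell_size(input_size, grid)
    # Clamp so that pixels on the far edge still map to the last cell.
    col = min(int(x // size), grid - 1)
    row = min(int(y // size), grid - 1)
    return col, row


print(cell_size(416))            # 32.0
print(cell_index(200, 50, 416))  # (6, 1)
```

For a 416x416 input, pixel (200, 50) falls into the cell in column 6, row 1, counting from zero.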

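For each bounding box, the network predicts the confidence that the box actually encloses an object, as well as the probability that the enclosed object belongs to a particular class.

Non-maximum suppression eliminates the bounding boxes with low confidence and, where several high-confidence boxes enclose the same object, reduces them to a single box.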
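
To decide whether two boxes enclose the same object, non-maximum suppression needs a measure of how much they overlap. The article does not name the measure; a common choice, assumed here, is intersection over union (IoU). The function below is a minimal sketch for boxes given as (left, top, width, height), the same layout the code later in this article uses.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, width, height) boxes."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh

    # Width and height of the overlapping rectangle (0 if the boxes are disjoint).
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 10, 10), (0, 0, 10, 10)))    # 1.0, identical boxes
print(iou((0, 0, 10, 10), (5, 0, 10, 10)))    # about 0.33, half-shifted box
print(iou((0, 0, 10, 10), (20, 20, 5, 5)))    # 0.0, no overlap
```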

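The authors of YOLOv3, Joseph Redmon and Ali Farhadi, made YOLOv3 more accurate than their previous work, YOLOv2, while keeping it fast. YOLOv3 also handles objects at multiple scales better. They improved the network by making it bigger and by moving it towards residual networks through added shortcut connections.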

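Why use OpenCV for YOLO?

Here are three reasons.

Easy integration with an existing OpenCV application: if your application already uses OpenCV and you simply want to use YOLOv3, you don't have to worry about compiling and building the Darknet source code.
The OpenCV CPU version is 9x faster than Darknet + OpenMP: the CPU implementation of OpenCV's DNN module is very fast. For example, Darknet with OpenMP takes about 2 seconds on a CPU to process a single image, whereas OpenCV's implementation takes only 0.22 seconds. See the table below for details.
Python support: Darknet is written in C and does not officially support Python, whereas OpenCV does. Python ports of Darknet are available, though.

Speed test for YOLOv3 on Darknet and OpenCV

The table below shows the performance of YOLOv3 on Darknet versus OpenCV. The input size in all cases is 416x416. Not surprisingly, the GPU version of Darknet outperforms everything else. It is also no surprise that Darknet with OpenMP does much better than Darknet without it, because OpenMP makes use of multiple CPU cores.

What is surprising is that OpenCV's CPU implementation of DNN is about 9x faster than Darknet with OpenMP.

Table 1. Speed comparison of YOLOv3 on Darknet vs. OpenCV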
OS	Framework	CPU/GPU	Time(ms)/Frame
Linux 16.04	Darknet	12x Intel Core i7-6850K CPU @ 3.60GHz	9370
Linux 16.04	Darknet + OpenMP	12x Intel Core i7-6850K CPU @ 3.60GHz	1942
Linux 16.04	OpenCV [CPU]	12x Intel Core i7-6850K CPU @ 3.60GHz	220
Linux 16.04	Darknet	NVIDIA GeForce 1080 Ti GPU	23
macOS	Darknet	2.5 GHz Intel Core i7 CPU	7260
macOS	OpenCV [CPU]	2.5 GHz Intel Core i7 CPU	400
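
The "9x" figure can be checked directly from the Linux rows of Table 1. The snippet below is a small sketch that copies the times from the table and derives frames per second and speed-up ratios; the dictionary keys are labels chosen for this sketch.

```python
# Time per frame in milliseconds, copied from the Linux rows of Table 1.
times_ms = {
    "Darknet (CPU)": 9370,
    "Darknet + OpenMP (CPU)": 1942,
    "OpenCV (CPU)": 220,
    "Darknet (GPU)": 23,
}


def speedup(slower, faster):
    """How many times faster `faster` is than `slower`."""
    return times_ms[slower] / times_ms[faster]


for name, ms in times_ms.items():
    print(f"{name}: {1000 / ms:.2f} frames/s")

print(round(speedup("Darknet + OpenMP (CPU)", "OpenCV (CPU)"), 1))  # 8.8
print(round(speedup("Darknet (CPU)", "OpenCV (CPU)"), 1))           # 42.6
```

So OpenCV on the CPU is roughly 8.8 times faster than Darknet with OpenMP on the same machine, which the article rounds to 9x.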

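Note: we ran into problems running the DNN module with OpenCV's GPU implementation. It has only been tested with Intel GPUs, so if you do not have an Intel GPU the program automatically falls back to the CPU.

Object detection using YOLOv3 in C++/Python

Let's now see how to use YOLOv3 in OpenCV to perform object detection.

Step 1: Download the models

We start by running the script getModels.sh from the command line.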

sudo chmod a+x getModels.sh
./getModels.sh

// Added by the translator:
Alternative for Windows:
1. Install wget from http://gnuwin32.sourceforge.net/packages/wget.htm
Then cd to the wget installation directory and run:
wget https://pjreddie.com/media/files/yolov3.weights
wget https://github.com/pjreddie/darknet/blob/master/cfg/yolov3.cfg?raw=true -O ./yolov3.cfg
wget https://github.com/pjreddie/darknet/blob/master/data/coco.names?raw=true -O ./coco.names

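Running these commands downloads the yolov3.weights file (the weights of the pre-trained network), the yolov3.cfg file (the network configuration) and the coco.names file (the names of the 80 object classes in the COCO dataset).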
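
Before moving on, it can help to confirm that all three files actually arrived. The following is a minimal sketch using only the Python standard library; the helper names missing_model_files and count_class_names are made up for this example, and the check assumes the files sit in the current directory.

```python
import os

# The three files this tutorial expects after Step 1.
REQUIRED_FILES = ("yolov3.weights", "yolov3.cfg", "coco.names")


def missing_model_files(directory="."):
    """Return the required model files that are not present in `directory`."""
    return [name for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(directory, name))]


def count_class_names(path):
    """Count the non-empty lines of a class-name file such as coco.names."""
    with open(path, "rt") as f:
        return sum(1 for line in f if line.strip())


# An empty list means every file was found.
print(missing_model_files("."))
```

If coco.names downloaded correctly, count_class_names("coco.names") should report the 80 classes mentioned above.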

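Step 2: Initialize the parameters

The YOLOv3 algorithm produces bounding boxes as its predicted detection outputs. Every predicted box is associated with a confidence score. In the first stage, all boxes whose confidence is below the confidence threshold are discarded.

Non-maximum suppression is then applied to the remaining boxes to remove redundant overlapping boxes. It is controlled by the parameter nmsThreshold. You can try changing this value and see how the number of output boxes changes.

Next, the width (inpWidth) and height (inpHeight) of the network's input image are set. We set both to 416 so that our runs can be compared with the Darknet C code provided by the YOLOv3 authors. You can change both to 320 for faster results or to 608 for more accurate results.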
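
To get a feel for the speed/accuracy trade-off, the sketch below computes how many grid cells each suggested input size produces. It assumes that the 32-pixel cell size from the 416x416 example earlier in the article carries over to the other input sizes; that assumption, and the helper name grid_for, are not taken from the article.

```python
CELL_SIZE = 32  # pixels per grid cell, taken from the 416x416 -> 13x13 example


def grid_for(input_size):
    """Return (cells per side, total cells) for a square network input."""
    if input_size % CELL_SIZE != 0:
        raise ValueError("input size should be a multiple of %d" % CELL_SIZE)
    per_side = input_size // CELL_SIZE
    return per_side, per_side * per_side


for size in (320, 416, 608):
    per_side, total = grid_for(size)
    print(f"{size}x{size} input -> {per_side}x{per_side} grid, {total} cells")
```

Under this assumption a 320x320 input yields 100 cells and a 608x608 input yields 361, which gives an intuition for why the smaller input is faster and the larger one can resolve more detail.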

Python code:

# Initialize the parameters
confThreshold = 0.5 #Confidence threshold
nmsThreshold = 0.4 #Non-maximum suppression threshold
inpWidth = 416 #Width of network's input image
inpHeight = 416 #Height of network's input image

C++ code:

// Initialize the parameters
float confThreshold = 0.5; // Confidence threshold
float nmsThreshold = 0.4; // Non-maximum suppression threshold
int inpWidth = 416; // Width of network's input image
int inpHeight = 416; // Height of network's input image

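Step 3: Load the model and classes

The file coco.names contains the names of all the objects the trained model can recognize. We read the class names from it.

Next, we load the network, which has two parts:

yolov3.weights: the pre-trained weights.
yolov3.cfg: the configuration file.

We set the DNN backend to OpenCV and the target to CPU. You can try setting the preferable target to cv.dnn.DNN_TARGET_OPENCL to run it on a GPU. Keep in mind, however, that the current OpenCV version is tested only with Intel GPUs; without an Intel GPU it automatically switches to the CPU.

Python: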

import cv2 as cv  # the Python snippets in this article refer to OpenCV as cv

# Load names of classes
classesFile = "coco.names"
classes = None
with open(classesFile, 'rt') as f:
    classes = f.read().rstrip('\n').split('\n')

# Give the configuration and weight files for the model and load the network using them.
modelConfiguration = "yolov3.cfg"
modelWeights = "yolov3.weights"
net = cv.dnn.readNetFromDarknet(modelConfiguration, modelWeights)
net.setPreferableBackend(cv.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv.dnn.DNN_TARGET_CPU)

C++

// Load names of classes
string classesFile = "coco.names";
ifstream ifs(classesFile.c_str());
string line;
while (getline(ifs, line)) classes.push_back(line);
// Give the configuration and weight files for the model
String modelConfiguration = "yolov3.cfg";
String modelWeights = "yolov3.weights";
// Load the network
Net net = readNetFromDarknet(modelConfiguration, modelWeights);
net.setPreferableBackend(DNN_BACKEND_OPENCV);
net.setPreferableTarget(DNN_TARGET_CPU);

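Step 4: Read the input

In this step we read the image, video stream or webcam. In addition, we open a VideoWriter (an OpenCV class) to save the frames with the detected bounding boxes as a video.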
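
The listings for this step build the output file name by cutting the last four characters off the input path, which assumes a three-letter extension such as .jpg or .avi. As a hedged alternative, the sketch below uses os.path.splitext from the standard library; the helper name output_name and its parameters are made up for this example and are not part of the original code.

```python
import os


def output_name(input_path, suffix="_yolo_out_py", ext=None):
    """Insert `suffix` before the extension; optionally force a new extension."""
    root, old_ext = os.path.splitext(input_path)
    return root + suffix + (ext if ext is not None else old_ext)


print(output_name("bird.jpg"))             # bird_yolo_out_py.jpg
print(output_name("photo.jpeg"))           # photo_yolo_out_py.jpeg
print(output_name("run.mp4", ext=".avi"))  # run_yolo_out_py.avi

# For comparison, fixed-width slicing leaves a stray dot for a 4-letter extension:
print("photo.jpeg"[:-4] + "_yolo_out_py.jpg")  # photo._yolo_out_py.jpg
```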

Python

outputFile = "yolo_out_py.avi"
if (args.image):
    # Open the image file
    if not os.path.isfile(args.image):
        print("Input image file ", args.image, " doesn't exist")
        sys.exit(1)
    cap = cv.VideoCapture(args.image)
    outputFile = args.image[:-4]+'_yolo_out_py.jpg'
elif (args.video):
    # Open the video file
    if not os.path.isfile(args.video):
        print("Input video file ", args.video, " doesn't exist")
        sys.exit(1)
    cap = cv.VideoCapture(args.video)
    outputFile = args.video[:-4]+'_yolo_out_py.avi'
else:
    # Webcam input
    cap = cv.VideoCapture(0)

# Get the video writer initialized to save the output video
if (not args.image):
    vid_writer = cv.VideoWriter(outputFile, cv.VideoWriter_fourcc('M','J','P','G'), 30, (round(cap.get(cv.CAP_PROP_FRAME_WIDTH)),round(cap.get(cv.CAP_PROP_FRAME_HEIGHT))))

C++

outputFile = "yolo_out_cpp.avi";
if (parser.has("image"))
{
    // Open the image file
    str = parser.get<String>("image");
    ifstream ifile(str);
    if (!ifile) throw("error");
    cap.open(str);
    str.replace(str.end()-4, str.end(), "_yolo_out.jpg");
    outputFile = str;
}
else if (parser.has("video"))
{
    // Open the video file
    str = parser.get<String>("video");
    ifstream ifile(str);
    if (!ifile) throw("error");
    cap.open(str);
    str.replace(str.end()-4, str.end(), "_yolo_out.avi");
    outputFile = str;
}
// Open the webcam
else cap.open(parser.get<int>("device"));

// Get the video writer initialized to save the output video
if (!parser.has("image")) {
    video.open(outputFile, VideoWriter::fourcc('M','J','P','G'), 28, Size(cap.get(CAP_PROP_FRAME_WIDTH), cap.get(CAP_PROP_FRAME_HEIGHT)));
}

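Step 5: Process each frame

The input image to a neural network needs to be in a certain format called a blob.

After a frame is read from the input image or video stream, it is passed through the blobFromImage() function to convert it into an input blob for the neural network. In this process, the pixel values are scaled to the range 0 to 1 using a scale factor of 1/255. The image is also resized to 416x416 without cropping. Note that we do not perform any mean subtraction here, so we pass [0,0,0] as the mean parameter, and we set the swapRB parameter to 1 so that OpenCV's BGR channel order is swapped to RGB.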
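
To make the blob conversion more concrete, here is a small pure-Python sketch of three of the operations described above: scaling by 1/255, swapping the B and R channels, and rearranging the pixels into a 4D batch x channel x height x width layout. It is not OpenCV's implementation; the function name to_blob is made up, nested lists stand in for arrays, and resizing to the network input size is deliberately left out.

```python
def to_blob(image_bgr, scale=1 / 255, swap_rb=True):
    """Turn an HxWx3 nested list of BGR pixels into a 1x3xHxW nested list.

    Illustrative only: resizing and mean subtraction are omitted.
    """
    height, width = len(image_bgr), len(image_bgr[0])
    # With swap_rb the output planes are ordered R, G, B instead of B, G, R.
    order = (2, 1, 0) if swap_rb else (0, 1, 2)
    planes = [
        [[image_bgr[y][x][c] * scale for x in range(width)] for y in range(height)]
        for c in order
    ]
    return [planes]  # leading batch dimension of size 1


# A made-up 2x2 image; each pixel is (B, G, R).
image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (51, 102, 204)]]
blob = to_blob(image)

# Shape: batch, channels, height, width
print(len(blob), len(blob[0]), len(blob[0][0]), len(blob[0][0][0]))  # 1 3 2 2
print(blob[0][0][0][0])  # R plane, top-left pixel: pure blue has R = 0, so 0.0
print(blob[0][2][0][0])  # B plane, top-left pixel: approximately 1.0
```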

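The output blob is then passed to the network as its input, and a forward pass is run to obtain a list of predicted bounding boxes as the network's output. These boxes go through a post-processing step that filters out the ones with low confidence scores; we go through the post-processing step in more detail below. We print the inference time for each frame at the top left. The image with the final bounding boxes is then saved to disk, either as an image for an image input or through the VideoWriter for a video stream input.

Python: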

while cv.waitKey(1) < 0:
    # get frame from the video
    hasFrame, frame = cap.read()

    # Stop the program if reached end of video
    if not hasFrame:
        print("Done processing !!!")
        print("Output file is stored as ", outputFile)
        cv.waitKey(3000)
        break

    # Create a 4D blob from a frame.
    blob = cv.dnn.blobFromImage(frame, 1/255, (inpWidth, inpHeight), [0,0,0], 1, crop=False)

    # Sets the input to the network
    net.setInput(blob)

    # Runs the forward pass to get output of the output layers
    outs = net.forward(getOutputsNames(net))

    # Remove the bounding boxes with low confidence
    postprocess(frame, outs)

    # Put efficiency information. The function getPerfProfile returns the
    # overall time for inference(t) and the timings for each of the layers(in layersTimes)
    t, _ = net.getPerfProfile()
    label = 'Inference time: %.2f ms' % (t * 1000.0 / cv.getTickFrequency())
    cv.putText(frame, label, (0, 15), cv.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255))

    # Write the frame with the detection boxes
    if (args.image):
        cv.imwrite(outputFile, frame.astype(np.uint8))
    else:
        vid_writer.write(frame.astype(np.uint8))

C++

// Process frames.
while (waitKey(1) < 0)
{
    // get frame from the video
    cap >> frame;

    // Stop the program if reached end of video
    if (frame.empty()) {
        cout << "Done processing !!!" << endl;
        cout << "Output file is stored as " << outputFile << endl;
        waitKey(3000);
        break;
    }

    // Create a 4D blob from a frame.
    blobFromImage(frame, blob, 1/255.0, Size(inpWidth, inpHeight), Scalar(0,0,0), true, false);

    // Sets the input to the network
    net.setInput(blob);

    // Runs the forward pass to get output of the output layers
    vector<Mat> outs;
    net.forward(outs, getOutputsNames(net));

    // Remove the bounding boxes with low confidence
    postprocess(frame, outs);

    // Put efficiency information. The function getPerfProfile returns the
    // overall time for inference(t) and the timings for each of the layers(in layersTimes)
    vector<double> layersTimes;
    double freq = getTickFrequency() / 1000;
    double t = net.getPerfProfile(layersTimes) / freq;
    string label = format("Inference time for a frame : %.2f ms", t);
    putText(frame, label, Point(0, 15), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 255));

    // Write the frame with the detection boxes
    Mat detectedFrame;
    frame.convertTo(detectedFrame, CV_8U);
    if (parser.has("image")) imwrite(outputFile, detectedFrame);
    else video.write(detectedFrame);
}

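Now, let's go through some of the function calls used above in more detail.

Step 5a: Getting the names of the output layers

The forward function of OpenCV's Net class needs to know the ending layer up to which it should run. Since we want to run through the whole network, we need to identify its last layers. We do that with getUnconnectedOutLayers(), which returns the indices of the layers with unconnected outputs; these are essentially the last layers of the network, and we look up their names from the list of all layer names. We then run the forward pass to get the outputs of those layers, as in the previous code snippet (net.forward(getOutputsNames(net))).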
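
The index-to-name lookup can be tried without a real network. In the sketch below the layer names and the index list are invented for illustration; only the "index minus one" lookup mirrors the code in this article. The flatten helper is an addition of this sketch so that both a nested list such as [[5]] and a flat list such as [5] are accepted.

```python
def flatten(indices):
    """Accept both [[3], [5]] and [3, 5] style index lists."""
    flat = []
    for item in indices:
        if isinstance(item, (list, tuple)):
            flat.extend(item)
        else:
            flat.append(item)
    return flat


def output_names(layer_names, one_based_indices):
    """Map 1-based layer indices to layer names."""
    return [layer_names[i - 1] for i in flatten(one_based_indices)]


# Hypothetical layer names, not the real YOLOv3 layer list.
layer_names = ["conv_0", "bn_0", "relu_0", "conv_1", "yolo_2"]

print(output_names(layer_names, [[3], [5]]))  # ['relu_0', 'yolo_2']
print(output_names(layer_names, [5]))         # ['yolo_2']
```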
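
Python:

# Get the names of the output layers
def getOutputsNames(net):
    # Get the names of all the layers in the network
    layersNames = net.getLayerNames()
    # Get the names of the output layers, i.e. the layers with unconnected outputs.
    # The indices are 1-based. Flattening handles both the Nx1 array returned by
    # older OpenCV releases and the flat array returned by newer ones.
    return [layersNames[i - 1] for i in np.array(net.getUnconnectedOutLayers()).flatten()]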

C++

// Get the names of the output layers
vector<String> getOutputsNames(const Net& net)
{
    static vector<String> names;
    if (names.empty())
    {
        // Get the indices of the output layers, i.e. the layers with unconnected outputs
        vector<int> outLayers = net.getUnconnectedOutLayers();

        // Get the names of all the layers in the network
        vector<String> layersNames = net.getLayerNames();

        // Get the names of the output layers in names
        names.resize(outLayers.size());
        for (size_t i = 0; i < outLayers.size(); ++i)
            names[i] = layersNames[outLayers[i] - 1];
    }
    return names;
}

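Step 5b: Post-processing the network's output

Each bounding box output by the network is represented by a vector of (number of classes + 5) elements.

The first four elements are center_x, center_y, width and height. The fifth element is the confidence that the bounding box encloses an object.

The remaining elements are the confidences associated with each class (i.e. object type). The box is assigned to the class with the highest score.

The highest score for a box is also called its confidence. If the confidence of a box is below the given threshold, the box is dropped and not processed further.

The boxes whose confidence is equal to or greater than the confidence threshold are then subjected to non-maximum suppression, which reduces the number of overlapping boxes.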
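
The layout of one detection vector can be illustrated with a hand-made example. The sketch below decodes a single vector for a hypothetical 3-class model using plain Python lists; the numbers are invented, and the function name decode is not part of the article's code. As in the listings that follow, the box coordinates are treated as fractions of the frame size.

```python
def decode(detection, frame_width, frame_height, conf_threshold=0.5):
    """Decode [center_x, center_y, width, height, objectness, class scores...].

    Returns (class_id, confidence, (left, top, width, height)) in pixels,
    or None if the best class score does not exceed the threshold.
    """
    cx, cy, w, h = detection[:4]
    scores = detection[5:]
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    confidence = scores[class_id]
    if confidence <= conf_threshold:
        return None
    width = int(w * frame_width)
    height = int(h * frame_height)
    left = int(cx * frame_width - width / 2)
    top = int(cy * frame_height - height / 2)
    return class_id, confidence, (left, top, width, height)


# Made-up detection: box centred in the frame, class 1 scores highest.
detection = [0.5, 0.5, 0.25, 0.5, 0.9, 0.05, 0.8, 0.1]
print(decode(detection, 640, 480))  # (1, 0.8, (240, 120, 160, 240))

# The same box with weak class scores is discarded.
print(decode([0.5, 0.5, 0.25, 0.5, 0.9, 0.05, 0.3, 0.1], 640, 480))  # None
```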

Python

# Remove the bounding boxes with low confidence using non-maxima suppression
def postprocess(frame, outs):
    frameHeight = frame.shape[0]
    frameWidth = frame.shape[1]

    # Scan through all the bounding boxes output from the network and keep only the
    # ones with high confidence scores. Assign the box's class label as the class with the highest score.
    classIds = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            classId = np.argmax(scores)
            confidence = scores[classId]
            if confidence > confThreshold:
                center_x = int(detection[0] * frameWidth)
                center_y = int(detection[1] * frameHeight)
                width = int(detection[2] * frameWidth)
                height = int(detection[3] * frameHeight)
                left = int(center_x - width / 2)
                top = int(center_y - height / 2)
                classIds.append(classId)
                confidences.append(float(confidence))
                boxes.append([left, top, width, height])

    # Perform non maximum suppression to eliminate redundant overlapping boxes with
    # lower confidences.
    indices = cv.dnn.NMSBoxes(boxes, confidences, confThreshold, nmsThreshold)
    # Flatten so that both Nx1 (older OpenCV) and flat (newer OpenCV) results work
    for i in np.array(indices).flatten():
        box = boxes[i]
        left = box[0]
        top = box[1]
        width = box[2]
        height = box[3]
        drawPred(classIds[i], confidences[i], left, top, left + width, top + height)

C++

// Remove the bounding boxes with low confidence using non-maxima suppression
void postprocess(Mat& frame, const vector<Mat>& outs)
{
    vector<int> classIds;
    vector<float> confidences;
    vector<Rect> boxes;

    for (size_t i = 0; i < outs.size(); ++i)
    {
        // Scan through all the bounding boxes output from the network and keep only the
        // ones with high confidence scores. Assign the box's class label as the class
        // with the highest score for the box.
        float* data = (float*)outs[i].data;
        for (int j = 0; j < outs[i].rows; ++j, data += outs[i].cols)
        {
            Mat scores = outs[i].row(j).colRange(5, outs[i].cols);
            Point classIdPoint;
            double confidence;
            // Get the value and location of the maximum score
            minMaxLoc(scores, 0, &confidence, 0, &classIdPoint);
            if (confidence > confThreshold)
            {
                int centerX = (int)(data[0] * frame.cols);
                int centerY = (int)(data[1] * frame.rows);
                int width = (int)(data[2] * frame.cols);
                int height = (int)(data[3] * frame.rows);
                int left = centerX - width / 2;
                int top = centerY - height / 2;

                classIds.push_back(classIdPoint.x);
                confidences.push_back((float)confidence);
                boxes.push_back(Rect(left, top, width, height));
            }
        }
    }

    // Perform non maximum suppression to eliminate redundant overlapping boxes with
    // lower confidences
    vector<int> indices;
    NMSBoxes(boxes, confidences, confThreshold, nmsThreshold, indices);
    for (size_t i = 0; i < indices.size(); ++i)
    {
        int idx = indices[i];
        Rect box = boxes[idx];
        drawPred(classIds[idx], confidences[idx], box.x, box.y,
                 box.x + box.width, box.y + box.height, frame);
    }
}

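Non-maximum suppression is controlled by the parameter nmsThreshold. If nmsThreshold is set too low, e.g. 0.1, we might not detect overlapping objects of the same or different classes. If it is set too high, e.g. 1, we get multiple boxes for the same object. So we used an intermediate value of 0.4 in the code above. The GIF in the original article shows the effect of varying the NMS threshold.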
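
The effect of the threshold can be reproduced with a simplified greedy non-maximum suppression. This is a sketch for intuition only, not OpenCV's NMSBoxes implementation: it assumes that overlap is measured as intersection over union and that a box is dropped when its overlap with an already kept, higher-scoring box exceeds the threshold. The boxes and scores are invented.

```python
def overlap(a, b):
    """Intersection over union of two (left, top, width, height) boxes."""
    iw = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, nms_threshold):
    """Greedy NMS: visit boxes by descending score, keep a box only if it does
    not overlap any already kept box by more than nms_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(overlap(boxes[i], boxes[k]) <= nms_threshold for k in kept):
            kept.append(i)
    return kept


boxes = [(100, 100, 100, 100),   # object A
         (105, 105, 100, 100),   # near-duplicate detection of A
         (160, 100, 100, 100)]   # a second object that partly overlaps A
scores = [0.9, 0.8, 0.7]

for threshold in (0.1, 0.4, 1.0):
    print(threshold, nms(boxes, scores, threshold))
# 0.1 [0]        the overlapping second object is lost
# 0.4 [0, 2]     duplicate removed, both objects kept
# 1.0 [0, 1, 2]  nothing is suppressed, A is boxed twice
```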

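Step 5c: Drawing the predicted boxes

Finally, we draw the boxes that survive non-maximum suppression on the input frame, together with their assigned class label and confidence score.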

Python

# Draw the predicted bounding box
def drawPred(classId, conf, left, top, right, bottom):
    # Draw a bounding box.
    cv.rectangle(frame, (left, top), (right, bottom), (0, 0, 255))

    label = '%.2f' % conf

    # Get the label for the class name and its confidence
    if classes:
        assert(classId < len(classes))
        label = '%s:%s' % (classes[classId], label)

    # Display the label at the top of the bounding box
    labelSize, baseLine = cv.getTextSize(label, cv.FONT_HERSHEY_SIMPLEX, 0.5, 1)
    top = max(top, labelSize[1])
    cv.putText(frame, label, (left, top), cv.FONT_HERSHEY_SIMPLEX, 0.5, (255,255,255))

C++

// Draw the predicted bounding box
void drawPred(int classId, float conf, int left, int top, int right, int bottom, Mat& frame)
{
    // Draw a rectangle displaying the bounding box
    rectangle(frame, Point(left, top), Point(right, bottom), Scalar(0, 0, 255));

    // Get the label for the class name and its confidence
    string label = format("%.2f", conf);
    if (!classes.empty())
    {
        CV_Assert(classId < (int)classes.size());
        label = classes[classId] + ":" + label;
    }

    // Display the label at the top of the bounding box
    int baseLine;
    Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
    top = max(top, labelSize.height);
    putText(frame, label, Point(left, top), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(255,255,255));
}

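That is the full content of the original article.

Original article: https://www.learnopencv.com/deep-learning-based-object-detection-using-yolov3-with-opencv-python-c/

Author: Sunita Nayak

See also: the YOLOv3 Tech Report for background related to this article.

Translator's note: a few sentences were translated with the help of machine translation and were not marked at the time. The text was polished on 18 September 2018 and some translation errors were fixed. Near the end of the first pass, pressing Backspace and then Space made the browser navigate back and the content was lost, which was painful. The second pass was done with less patience, so please bear with any errors or omissions.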
