Fast & Furious face detection with OpenCV
Posted on : 18-06-2009 | By : Rhondasw | In : OpenCV
OpenCV/Samples includes a facedetect program that detects faces in images and video. It’s great fun, but its speed leaves much to be desired =(. Of course, with OpenMP it runs faster, and on an Intel Core Duo at 2.7 GHz it is fast, but will it be fast on ARM? I have big doubts. I compiled facedetect without OpenMP, and on average it takes 600 ms to find one face at 640×480 resolution. I wanted to find out whether this time can be improved by software means… After some investigation, code refactoring and improvements, facedetect started to work 2.5 times faster, even on ARM. Of course, without a big loss in quality =)
I started the investigation by profiling cvHaarDetectObjects on a 640×480 image. The function cvRunHaarClassifierCascade took 70% of the computation time. But cvRunHaarClassifierCascade is not that heavy, so why does it take so much time? The 20×20 scanning window is moved in the X direction, the Y direction and the scale direction, and cvRunHaarClassifierCascade is called for every window position. In total, that makes 160000 calls!
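To get a feeling for how quickly the triple loop adds up, here is a minimal sketch that only counts window positions. The function name count_scan_windows and the stepping rule (a step of max(min_step, scale) pixels in both directions) are assumptions made for illustration. The real detector has its own stepping and pruning rules, so this sketch is not expected to reproduce the 160000 figure exactly.

```c
/* Count sliding-window positions over all scales (illustrative sketch).
   Assumption: the window moves by max(min_step, factor) pixels in X and Y,
   and grows by scale_factor until it no longer fits into the image. */
long count_scan_windows(int img_w, int img_h, int win_w, int win_h,
                        double scale_factor, double min_step)
{
    long total = 0;
    double factor;

    /* Guard against parameters that would never terminate. */
    if (scale_factor <= 1.0 || min_step <= 0.0 || win_w <= 0 || win_h <= 0)
        return 0;

    for (factor = 1.0;
         factor * win_w <= img_w && factor * win_h <= img_h;
         factor *= scale_factor) {
        int w = (int)(factor * win_w + 0.5);   /* scaled window width  */
        int h = (int)(factor * win_h + 0.5);   /* scaled window height */
        double step = factor > min_step ? factor : min_step;
        long nx = (long)((img_w - w) / step) + 1;  /* positions along X */
        long ny = (long)((img_h - h) / step) + 1;  /* positions along Y */
        total += nx * ny;                          /* one classifier call each */
    }
    return total;
}
```

Calling count_scan_windows(640, 480, 20, 20, 1.1, 2.0) and then the same with a scale factor of 1.2 shows how strongly the number of classifier calls depends on the scale step.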
So to reduce the time, we need to optimize this triple loop. I know several ways:
- change the parameters of the cvHaarDetectObjects function. Sometimes it really helps, but let’s resort to such shamanism another time. I used the “default” parameters: 1.1 scale factor, 20×20 window.
- use fixed point in the algorithm. We did it here
- optimize the default OpenCV frontal face cascade. Cascade generation takes a lot of time, and who knows whether the result will be good or not =)
- somehow reduce the number of cvRunHaarClassifierCascade calls. An image contains only a few real faces, not 160000, so this makes sense.
We researched many approaches and combinations of the ways above and got the following result (Intel Core Duo 2.7 GHz):
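As an illustration of the fixed-point idea from the list above, here is a minimal sketch that evaluates one weighted feature with integer arithmetic only. The Q16.16 format, the names fix16, fix_from_double and feature_passes, and the way the threshold is passed are all assumptions for this example; they are not taken from OpenCV or from our optimized code.

```c
#include <stdint.h>

#define FIX_SHIFT 16                        /* Q16.16: 16 fractional bits */
#define FIX_ONE   ((int64_t)1 << FIX_SHIFT)

typedef int32_t fix16;                      /* hypothetical fixed-point type */

/* Convert a floating-point weight to Q16.16, rounding to nearest.
   Done once when the cascade is loaded, not per window. */
fix16 fix_from_double(double v)
{
    return (fix16)(v * (double)FIX_ONE + (v >= 0.0 ? 0.5 : -0.5));
}

/* Weighted sum of rectangle pixel sums compared with a threshold.
   rect_sums are plain integers (e.g. taken from an integral image),
   weights are Q16.16, threshold is given in plain pixel-sum units.
   Returns 1 if the weighted sum reaches the threshold, otherwise 0. */
int feature_passes(const int32_t *rect_sums, const fix16 *weights,
                   int n, int32_t threshold)
{
    int64_t acc = 0;                        /* 64-bit accumulator avoids overflow */
    int i;
    for (i = 0; i < n; i++)
        acc += (int64_t)rect_sums[i] * weights[i];   /* int * Q16.16 -> Q16.16 */
    return acc >= (int64_t)threshold * FIX_ONE;
}
```

The point of the design is that no floating-point operation remains in the per-window path, which matters on ARM cores without a fast FPU.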
 | Original face detect | | | Fast face detect | |
Data set | ColorFERET frontal | LabeledFaces InTheWild | NoFaces (Negative) | ColorFERET frontal | LabeledFaces InTheWild | NoFaces (Negative)
Image size | 512×768 | 250×250 | up to 1280×1024 | 512×768 | 250×250 | up to 1280×1024
Total | 5444 | 1872 | 1748 | 5276 | 1872 | 1748
Found | 5420 | 1765 | | 5191 | 1685 |
Hit rate | 99.6% | 94.3% | | 98.4% | 90.0% |
FP (incorrect found) | 57 | 12 | 37 | 18 | 10 | 13
False alarm rate | | | 2.1% | | | 0.7%
FN (not found) | 23 | 107 | | 85 | 187 |
Average time, ms | | | | | |
not found | 623.98 | 85.43 | 775.23 | 139.07 | 39.48 | 287.98
one face found | 629.53 | 87.07 | 1053.49 | 248.26 | 42.99 | 455.31
two or more faces found | 632.32 | 88.04 | | 245.12 | 43.39 |
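The derived rows of the table can be rechecked with a few lines of arithmetic. The helper names percent and speedup below are made up for this sketch; the input numbers are the ones from the table.

```c
/* Share of `part` in `total`, in percent (0 if total is not positive).
   Used for both the hit rate (found / total) and the false alarm rate
   (false positives on NoFaces / number of NoFaces images). */
double percent(long part, long total)
{
    return total > 0 ? 100.0 * (double)part / (double)total : 0.0;
}

/* How many times faster the fast detector is for the same data set. */
double speedup(double original_ms, double fast_ms)
{
    return fast_ms > 0.0 ? original_ms / fast_ms : 0.0;
}
```

For example, percent(5420, 5444) gives about 99.56, which rounds to the 99.6% hit rate of the original detector on ColorFERET, and speedup(629.53, 248.26) gives about 2.54, which is the "2.5 times" for one face found on the same data set.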
What approach did you use? In the previous edition you mentioned a skin filter…?
Hi Snik,
Yes, we are using the skin filter. The skin filter reduces false positives a lot, but it improves performance by only up to 1.3 times, which is not enough for our goals. So we found another significant technique: besides the filter, we use original heuristics that allow us to reach the speed mentioned in this article. Unfortunately, our management does not allow us to open our technologies or share code and such innovative techniques, apart from results in scientific style, but you can discuss it with them (http://www.rhonda.ru/eng/feedback) and get the full version (API library or even our code) if you need it (of course, it could require something from you as well).
Sorry that we cannot open it… it does not depend on us.
Aleksey
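For readers wondering what a skin pre-filter can look like in general, here is a minimal sketch. It is not the filter used in our detector: the thresholds are a widely quoted RGB rule of thumb for skin under daylight, and the names is_skin_rgb and skin_fraction are hypothetical. The idea is to skip the cascade for windows that contain too few skin-like pixels.

```c
#include <stdlib.h>

/* Rule-of-thumb RGB skin test (assumed thresholds, not our filter).
   Returns 1 for a skin-like pixel, 0 otherwise. */
int is_skin_rgb(int r, int g, int b)
{
    int hi = r > g ? (r > b ? r : b) : (g > b ? g : b);   /* max channel */
    int lo = r < g ? (r < b ? r : b) : (g < b ? g : b);   /* min channel */
    return r > 95 && g > 40 && b > 20 &&
           (hi - lo) > 15 && abs(r - g) > 15 && r > g && r > b;
}

/* Fraction (0..1) of skin-like pixels inside the window (x, y, w, h) of an
   interleaved 8-bit image in B, G, R channel order; stride is the number
   of bytes per image row. The caller must keep the window inside the image. */
double skin_fraction(const unsigned char *bgr, int stride,
                     int x, int y, int w, int h)
{
    long skin = 0;
    int row, col;
    if (w <= 0 || h <= 0)
        return 0.0;
    for (row = y; row < y + h; row++) {
        const unsigned char *p = bgr + (long)row * stride + 3L * x;
        for (col = 0; col < w; col++, p += 3)
            skin += is_skin_rgb(p[2], p[1], p[0]);   /* p = {B, G, R} */
    }
    return (double)skin / ((double)w * (double)h);
}
```

A scanning loop could then call the cascade only when skin_fraction(...) exceeds some threshold; the right threshold is a tuning question and depends on lighting and camera.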
> Unfortunately, our management doesn’t allow to open our technologies,
> share code and such innovation techniques except result in scientific style
Are these results published anywhere in the scientific periodicals?
Could you provide me with a reference, or, the best, a copy of your paper?
Thanks.
No, this blog is the only place where we have published our results.
I’ve found another performance problem in facedetect.c (more precisely, in cvHaarDetectObjects).
It’s not about limited platforms: it reproduces on Win32 and FreeBSD, and it is related to multithreading.
I modified facedetect.c so that it prints the result only to the console. Facedetect.exe finds the faces in lena.jpg in ~0.5 sec on my computer.
But if I run three (3) facedetect.exe instances simultaneously (same lena.jpg), it takes ~1.6 sec!
I expected ~0.5 sec (+ thread overhead).
I.e. it behaves as if the three instances ran sequentially.
My little investigation suggests the cause is in the cvHaarDetectObjects(..) function.
Something happens inside it that blocks the system process (or thread).
This happens with OpenCV 1.0 and with the latest SVN snapshot.
I’ve compiled the source with VS 2008 and MinGW, with the same result. I also built facedetect on FreeBSD using the current port version, with the same bad news.
What is the problem? It has been my headache for the last four days..
Thanks for any suggestions.
I have just compiled facedetect.c with the cvsample project from OpenCV 1.1. I only removed the code which shows the image on screen. My time is 699 ms for three instances. One instance takes 297 ms. So I didn’t see the same problem on a PC (1 core, P4 3.0 GHz).
Please try using cvsample and facedetect.c without any modifications.
Please let me know your result.
Aleksey
Thanks for the reply!
But I guess I made a mistake.. I forgot that face detection is a very high-load operation.
Yes, I get a result like yours for scale=1.2, min_neighbors=2, flags=0 (AMD 3500, ~2.2 GHz).
For now these options are my best for speed and quality.
And if I understand correctly, there is no further way to increase speed (for ~the same detection quality) without editing the source?
(I know about CV_HAAR_FIND_BIGGEST_OBJECT)
Hi Andrey,
I was puzzled by your question, but I was not able to reproduce the problem and thought that something was missing… Good that your problem is resolved!
>>no more way increase speed
>>(for ~same detect quality) without source editing?
[Aleksey] Basically, you are right. But your approach has affected the quality. To keep the quality, I would suggest using the default OpenCV parameters for HaarDetect, i.e. setting scale to 1.1 and min_neighbors to at least 3 (or more), but that will reduce the speed substantially!
So I can only recommend our solution, as described in this article, as the best way (sorry for the self-marketing, but it is true). If you would like to get our code for this, you can ask our management/marketing … see the “about” page for details.
Let me know if you have any more questions.
Aleksey
I have one question about your solution. Is it basically based on Haar-like features combined with some other program?
Very interesting.. Why did you remove my comment?
Hiding a real or shameful problem?
>>Very interesting.. Why did you remove my comment?
>>Hiding a real or shameful problem?
The comments are reviewed by the admin to avoid spam. Your previous comment is approved. Let me know if you posted other comments and they were removed.
Aleksey
[...] We have changed facedetector and get about 15 fps – which is real time. You can see results here and [...]
Hi Aleksey, can you tell me how you got the 99.6% hit rate with the cascade(s) shipped with OpenCV? I ran an experiment with 800 images, each containing exactly one frontal face, and got a hit rate of 30%. Here are the main lines of my program:
char* cascade_file_name = "c:\\program files\\opencv\\data\\haarcascades\\haarcascade_frontalface_alt_tree.xml";
CvHaarClassifierCascade* cascade = (CvHaarClassifierCascade*)cvLoad(cascade_file_name,0,0,0);
…
image = cvLoadImage(image_file_name);
gray = cvCreateImage(cvSize(image->width,image->height),8,1);
cvCvtColor(image,gray,CV_BGR2GRAY);
faces = cvHaarDetectObjects(gray,cascade,storage,1.1,2,0,cvSize(30,30));
Hi Anh,
We use another cascade + unique parameters.
Hi Aleksey
Thank you for kindly sharing your experiment in this blog. Btw, I have some questions about it.
1. What about the image size in the training data set? Do all of them (both positive and negative samples) have size 20×20?
2. I have run my first experiment. The positive data set comes from the FRGC database (700 images of size 24×24) and the negative set comes from background images (1394 images of size 160×120). I use 40 stages, but unfortunately the 13th stage takes a very long time (more than 2 days), so would you please tell me what my problem is?
Thanks for your help
regards
Hi, please see http://www.computer-vision-software.com/blog/2009/11/faq-opencv-haartraining/ (FAQ: OpenCV Haartraining). In short:
1. Don’t build 40 stages; use 24, 20 or fewer.
2. All your positive images will be rescaled to the same size when the vec file is created, so you can use positive images of any size as long as the proportions are maintained.
3. Negative images must be much bigger than the positive samples. A size of 160×120 is insufficient; use 1280×1024 or more. If you take small background images, haartraining will not be able to extract negative samples for the higher stages. The more stages you use, the bigger the negative images you need.
4. Unfortunately, haartraining can fall into an infinite loop. Try to stop it, change the negative images and restart the program. It will continue from the last successful stage.
Hi Andrey
Great, first of all thanks for replying to my comment.
Btw, I still have a question about your comment.
Why can haartraining fall into an infinite loop? In my experiment it happens if the results of the previous stages are HR = 1 and FA = 0. So I tried free code from MATLAB Central to check the AdaBoost algorithm. But as far as I know, it never loops forever, even with HR = 1 and FA = 0. So I don’t know what the problem is. Would you please tell me which part of the haartraining code makes it loop forever?
Thanks for your help
regards
Hi there, thanks for your article! I’ve been porting the OpenCV Haar detection algorithm to our company’s MCU. In your approach for optimizing the detector, did you use the CV_HAAR_SCALE_IMAGE parameter? For our particular system it was more favorable to scale the images instead of the features, but is there any difference in quality?
Also in cvSetImagesForHaarClassifierCascade did you turn on the CV_ADJUST_FEATURES and CV_ADJUST_WEIGHTS options?
For CV_ADJUST_FEATURES the source code’s comment said something about aligning blocks, is that really necessary? And I couldn’t quite understand what CV_ADJUST_WEIGHTS was trying to do at all, is there any significant difference in quality with these options?
Thanks in advance.
Hi Aleksey, can you tell me how to use “performance.exe”? I run it like this: "performance.exe -data test.xml -info test.txt -w 32 -h 22 -sf 1.1", but I get the result: 0 hits, 0 missed, 0 false. I think there is something wrong with my test.txt. I am really not sure how to create the test.txt. I am annoyed! Can you help me?
Thanks for your help!
regards
Hi Aleksey,
You got a 2.5 times gain on ARM. Was it on a dual core or a quad core? And is this gain only due to parallelization?
Regards,
Deepak
Hi Deepak,
“facedetect started to work 2.5 time faster, even on ARM”: it was on ARM11 and ARM Cortex-A8, both of which are single core (no parallel calculation).
Aleksey