Author: Tony Fu
Date: August 12, 2023
Device: MacBook Pro 16-inch, Late 2021 (M1 Pro)
Reference: Chapter 6 of Qt 5 and OpenCV 4 Computer Vision Projects by Zhuo Qingliang
- **Camera Privacy Error on Startup (Repeated from Chapter 3)**:
  - Problem: The app crashes with an error about accessing privacy-sensitive data.
  - Solution: Added the following to the `Info.plist` file:

    ```xml
    <key>NSCameraUsageDescription</key>
    <string>We need access to the camera to capture video for motion detection.</string>
    ```
- **Failure to Build OpenCV 3.4.5**: This step builds the executables `opencv_createsamples` and `opencv_traincascade`, two deprecated tools needed to train the Haar Cascade classifier for no-entry sign detection. I first downloaded OpenCV 3.4.5 from this link. Then I created a `build` directory. Inside it, I ran the following command:

  ```shell
  cmake -D CMAKE_BUILD_TYPE=RELEASE \
        -D CMAKE_INSTALL_PREFIX=/Users/tonyfu/Desktop/OnlineCourses/OpenCV-Qt-App/06_ObjectDetection/opencv-3.4.5/build \
        -D BUILD_opencv_apps=yes \
        -D JPEG_INCLUDE_DIR=/opt/homebrew/include \
        -D JPEG_LIBRARY=/path/to/jpeg/library \
        ..
  ```

  Next, I ran `make` and got the error:

  ```
  [ 50%] Linking CXX shared library ../../lib/libopencv_imgcodecs.dylib
  Undefined symbols for architecture arm64:
    "_jpeg_default_qtables", referenced from:
        cv::JpegEncoder::write(cv::Mat const&, std::__1::vector<int, std::__1::allocator<int> > const&) in grfmt_jpeg.cpp.o
  ld: symbol(s) not found for architecture arm64
  clang: error: linker command failed with exit code 1 (use -v to see invocation)
  make[2]: *** [lib/libopencv_imgcodecs.3.4.5.dylib] Error 1
  make[1]: *** [modules/imgcodecs/CMakeFiles/opencv_imgcodecs.dir/all] Error 2
  make: *** [all] Error 2
  ```
  - Solution: Currently under investigation. There may be OpenCV 4 equivalents for the two executables.
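Since the undefined symbols all come from libjpeg, one guess (untested here, and the library path below is an assumption, not a verified location) is that CMake linked against a wrong-architecture or incomplete libjpeg; pointing `JPEG_LIBRARY` directly at Homebrew's arm64 build might resolve it:

  ```shell
  # Hypothetical fix: pass Homebrew's arm64 libjpeg to CMake explicitly.
  # The exact .dylib path is an assumption and should be checked locally.
  cmake -D CMAKE_BUILD_TYPE=RELEASE \
        -D BUILD_opencv_apps=yes \
        -D JPEG_INCLUDE_DIR=/opt/homebrew/include \
        -D JPEG_LIBRARY=/opt/homebrew/opt/jpeg/lib/libjpeg.dylib \
        ..
  ```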
- **Undefined Template 'std::basic_ifstream'**: This error occurs when compiling the `capture_thread.cpp` file:

  ```
  capture_thread.cpp:121:18: error: implicit instantiation of undefined template 'std::basic_ifstream<char>'
      ifstream ifs(namesFile.c_str());
  ```

  - Solution: Include the appropriate header file:

    ```cpp
    #include <fstream>
    ```

  This part is very similar to the Face Detection app in Chapter 4. The only difference is that we are using a different classifier.
- **Classifier Storage**: Haar cascade classifiers are stored in XML format under `/opt/homebrew/share/opencv4/haarcascades/`.
- **Haar Cascade Implementation**: To integrate the Haar Cascade classifiers:
  - Append `-lopencv_objdetect` to `LIBS` in the `.pro` file.
  - Incorporate the macro

    ```
    DEFINES += OPENCV_DATA_DIR=\\\"/opt/homebrew/share/opencv4/\\\"
    ```

    This will be referenced later during classifier loading.

  Here's a simple guide to face detection:

  ```cpp
  #include "opencv2/objdetect.hpp"

  cv::CascadeClassifier *classifier = new cv::CascadeClassifier(
      OPENCV_DATA_DIR "haarcascades/haarcascade_frontalcatface_extended.xml");

  while (running) {
      cap >> tmp_frame;

      // Face detection process
      vector<cv::Rect> faces;
      cv::Mat gray_frame;
      cv::cvtColor(tmp_frame, gray_frame, cv::COLOR_BGR2GRAY);
      classifier->detectMultiScale(gray_frame, faces, 1.3, 5);

      // Drawing red bounding boxes around detected faces
      cv::Scalar color = cv::Scalar(0, 0, 255); // red
      for (size_t i = 0; i < faces.size(); i++) {
          cv::rectangle(tmp_frame, faces[i], color, 1);
      }

      // Continuation of the code (frame update, signal emission, etc.)
  }
  ```
- **Results**:
OpenCV 4.0.0 and above supports loading neural network models from various frameworks. OpenCV cannot train its own DNNs, but it can load pretrained models in inference mode from the following formats:
- Caffe: `.prototxt`, `.caffemodel`
- TensorFlow: `.pb`, `.pbtxt`
- Torch: `.t7`, `.net`, TorchScript
- Darknet (YOLO): `.cfg`, `.weights`
- ONNX: `.onnx`
- DLA: custom format
- OpenVINO: `.xml`, `.bin`
- Apache MXNet: `.json`, `.params`
- Apple Core ML: `.mlmodel`
- Chainer: `.npz`
The following walkthrough demonstrates how to use YOLOv3 (You Only Look Once, version 3) for object detection with OpenCV.
- **Include OpenCV's Deep Neural Network (DNN) module**: This is required to work with neural networks, including YOLO.

  ```cpp
  #include "opencv2/dnn.hpp"
  ```
- **Set Input Dimensions**: Define the width and height that the model expects for its input.

  ```cpp
  int inputWidth = 416;
  int inputHeight = 416;
  ```
- **Load the YOLO Model**: Provide the paths to the configuration and weights files.

  ```cpp
  string modelConfig = "data/yolov3.cfg";
  string modelWeights = "data/yolov3.weights";
  cv::dnn::Net net = cv::dnn::readNetFromDarknet(modelConfig, modelWeights);
  ```
- **Load Class Names**: Read the class names from a file and store them in a vector. There are 80 categories available for detection; you can view them in the coco.names file. `ifstream` stands for "input file stream," and an `ifstream` object is used to open and read from files.

  ```cpp
  #include <fstream>

  vector<string> objectClasses;
  string name;
  string namesFile = "data/coco.names";
  ifstream ifs(namesFile.c_str());
  while (getline(ifs, name))
      objectClasses.push_back(name);
  ```
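To see this reading pattern in isolation, here is a self-contained sketch that writes a tiny stand-in for coco.names (a made-up three-class subset, not the real 80-class file) and reads it back the same way:

  ```cpp
  #include <fstream>
  #include <iostream>
  #include <string>
  #include <vector>

  int main() {
      // Write a small stand-in for coco.names (hypothetical subset of the 80 classes).
      {
          std::ofstream out("coco_sample.names");
          out << "person\nbicycle\ncar\n";
      }

      // Read the class names back line by line, as in the detection code.
      std::vector<std::string> objectClasses;
      std::ifstream ifs("coco_sample.names");
      std::string name;
      while (std::getline(ifs, name))
          objectClasses.push_back(name);

      std::cout << objectClasses.size() << " classes, first: " << objectClasses[0] << "\n";
      return 0;
  }
  ```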
- **Preprocess the Image**: Convert the image to a blob (binary large object), the format required for feeding into the neural network.

  ```cpp
  cv::Mat blob;
  cv::dnn::blobFromImage(frame, blob, 1 / 255.0, cv::Size(inputWidth, inputHeight),
                         cv::Scalar(0, 0, 0), true, false);
  net.setInput(blob);
  ```
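As a sanity check on what `blobFromImage` does with these arguments (scale by 1/255, `swapRB=true`, zero mean), here is a pure-C++ sketch of the same transformation, from interleaved HWC BGR bytes to a planar NCHW float blob; the 2x2 image is made up for illustration:

  ```cpp
  #include <cstdio>
  #include <vector>

  int main() {
      const int H = 2, W = 2, C = 3;
      // A tiny BGR image, interleaved: pixel(y, x) = {B, G, R}.
      std::vector<unsigned char> bgr = {
          255, 0, 0,    0, 255, 0,   // blue, green
          0, 0, 255,    255, 255, 255, // red, white
      };

      // NCHW with N=1: [R plane][G plane][B plane] after the R/B swap.
      std::vector<float> blob(C * H * W);
      for (int y = 0; y < H; ++y)
          for (int x = 0; x < W; ++x)
              for (int c = 0; c < C; ++c) {
                  int src = (y * W + x) * C + c; // interleaved BGR index
                  int dstC = 2 - c;              // swapRB: B -> plane 2, R -> plane 0
                  blob[dstC * H * W + y * W + x] = bgr[src] / 255.0f;
              }

      // Pixel (0,0) was pure blue, so the R plane holds 0 and the B plane holds 1 there.
      printf("R(0,0)=%.1f B(0,0)=%.1f\n", blob[0], blob[2 * H * W]);
      return 0;
  }
  ```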
- **Run Forward Pass**: Pass the blob through the network to get the detection results.

  ```cpp
  vector<cv::Mat> outs;
  net.forward(outs, getOutputsNames(net));
  ```
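`getOutputsNames` is a helper not defined in this snippet; OpenCV's own YOLO sample implements it roughly as below (reproduced from memory, so treat it as a sketch rather than the book's exact code). It collects the names of the unconnected output layers, which for yolov3 are its three detection layers:

  ```cpp
  #include "opencv2/dnn.hpp"
  #include <vector>

  std::vector<cv::String> getOutputsNames(const cv::dnn::Net &net) {
      static std::vector<cv::String> names;
      if (names.empty()) {
          // Indices of layers with unconnected outputs (the true output layers).
          std::vector<int> outLayers = net.getUnconnectedOutLayers();
          std::vector<cv::String> layersNames = net.getLayerNames();
          names.resize(outLayers.size());
          for (size_t i = 0; i < outLayers.size(); ++i)
              names[i] = layersNames[outLayers[i] - 1]; // layer ids are 1-based
      }
      return names;
  }
  ```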
- **Decode Outputs**: Process the network's output to extract information about the detected objects.

  ```cpp
  vector<int> outClassIds;
  vector<float> outConfidences;
  vector<cv::Rect> outBoxes;
  decodeOutLayers(frame, outs, outClassIds, outConfidences, outBoxes);
  ```
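The book's `decodeOutLayers` helper is not shown here; as a hedged, self-contained sketch of what such a helper does for a single YOLO output row (the row values and threshold below are made up for illustration): each row is `[cx, cy, w, h, objectness, classScore0, classScore1, ...]`, with coordinates normalized to [0, 1] relative to the frame, and decoding means picking the best-scoring class and converting the center-based box to a corner-based one:

  ```cpp
  #include <algorithm>
  #include <cstdio>
  #include <vector>

  int main() {
      const int frameW = 640, frameH = 480;
      const float confThreshold = 0.5f; // assumed threshold, not the book's value

      // One fake detection row: a centered box half the frame wide and tall,
      // with class 2 scoring highest among three classes.
      std::vector<float> row = {0.5f, 0.5f, 0.5f, 0.5f, 0.9f, 0.1f, 0.2f, 0.8f};

      // Class scores start at index 5; pick the maximum.
      auto best = std::max_element(row.begin() + 5, row.end());
      int classId = int(best - (row.begin() + 5));
      float confidence = *best;

      if (confidence > confThreshold) {
          // Convert normalized center/size to pixel left/top/width/height.
          int w = int(row[2] * frameW), h = int(row[3] * frameH);
          int left = int(row[0] * frameW) - w / 2;
          int top  = int(row[1] * frameH) - h / 2;
          printf("class=%d conf=%.1f box=(%d,%d,%d,%d)\n",
                 classId, confidence, left, top, w, h);
      }
      return 0;
  }
  ```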
- **Draw Bounding Boxes and Labels**: Iterate through the detected objects, draw bounding boxes around them, and display the class name and confidence level.

  ```cpp
  for (size_t i = 0; i < outClassIds.size(); i++) {
      // drawing and labeling code here
  }
  ```
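The loop body is elided above; a hypothetical fill-in (modeled on OpenCV's YOLO sample, not the book's exact code, and relying on the `frame`, `objectClasses`, `outClassIds`, `outConfidences`, and `outBoxes` variables from the earlier steps) could look like:

  ```cpp
  for (size_t i = 0; i < outClassIds.size(); i++) {
      // Red box around the detection.
      cv::rectangle(frame, outBoxes[i], cv::Scalar(0, 0, 255), 2);

      // Label: "<class name>: <confidence>" drawn just above the box.
      cv::String label = cv::format("%s: %.2f",
          objectClasses[outClassIds[i]].c_str(), outConfidences[i]);
      int top = std::max(outBoxes[i].y, 15);
      cv::putText(frame, label, cv::Point(outBoxes[i].x, top - 4),
                  cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(0, 0, 255), 1);
  }
  ```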
- **Results**:
  - Detection time on a single frame: 101-110 ms
  - YOLO: inference time on a single frame: 98-105 ms