create face detection demo app #29
related / kinda a superset of #27

---
Let me add some info and background from my side, in particular regarding the ML stuff. In ML, software is often split into two pieces: a) training/learning and b) detection/run-time. The "face detection component" would be b) above: it needs to load/access an already trained model, apply that model to new incoming data, and output a prediction ("Is this picture/video frame a human face, yes or no?"). So we actually should have two components for the ML part: a training component and a detection run-time component.
For the demo, pattern == human face is perfect, but we should design the components in a way that generalizes to any pattern (see below). The specific ML algorithm that we should use for this is "Haar cascades". A good intro can be found here: http://www.willberger.org/cascade-haar-explained/ The output (when using OpenCV for Haar cascades via `cv2.CascadeClassifier`) is a set of bounding rectangles, one per detected object. The OpenCV project provides a bunch of ready-to-use trained models here: https://github.com/opencv/opencv/blob/master/data/haarcascades/ One model provided is a frontal-face detector (`haarcascade_frontalface_default.xml`), which fits the demo.
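For illustration, a minimal sketch of applying one of those pre-trained models to a single image, assuming the `opencv-python` package and a local `test.jpg` (both are assumptions, not part of the plan above):

```python
import cv2

# Load the pre-trained frontal-face model shipped with OpenCV
# (cv2.data.haarcascades points at the installed haarcascades directory).
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("test.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# detectMultiScale returns (x, y, w, h) bounding rectangles,
# one per detected face -- this is the "output" described above.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    print("face at ({}, {}), size {}x{}".format(x, y, w, h))
```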
The training component (different from the run-time component) essentially needs to: take a set of positive samples (images containing the pattern) and negative samples (background images), train a Haar cascade on them, and write out the resulting model file; see the sketch after this paragraph.
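A rough sketch of what that component would wrap, assuming OpenCV's `opencv_traincascade` command-line tool (available in OpenCV 3.x; the cascade training tools were dropped in OpenCV 4) and a hypothetical directory layout:

```python
import subprocess

# Hypothetical inputs: positives.vec (created with opencv_createsamples),
# bg.txt listing negative/background images, and an output directory.
subprocess.run([
    "opencv_traincascade",
    "-data", "model/",        # output directory for the trained cascade XML
    "-vec", "positives.vec",  # positive samples (the pattern to learn)
    "-bg", "bg.txt",          # negative (background) images
    "-numPos", "900",
    "-numNeg", "450",
    "-numStages", "12",
    "-w", "24", "-h", "24",   # sample window size
], check=True)
```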
The detection run-time component (processing the live video frames) needs to: load a trained model, apply it to each incoming frame, and emit the resulting detections (bounding rectangles). Roughly:
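A minimal sketch of that loop, again assuming `opencv-python` and using a local webcam as the frame source (in the actual demo the frames would arrive from the camera capture component instead):

```python
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    # In the demo, these rectangles would be published to the
    # video display component instead of printed.
    for (x, y, w, h) in faces:
        print("face at ({}, {}), size {}x{}".format(x, y, w, h))
cap.release()
```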
We could for example have WAMP procedures in the ML run-time component to load/replace the current model and to start/stop detection on a given stream.
And then e.g. have the ML training component call into these procedures, say to push a freshly trained model to the run-time. A sketch of both sides follows.
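A minimal sketch using Autobahn|Python with Twisted; the procedure URIs (`com.example.ml.*`) are hypothetical placeholders, not an agreed naming scheme:

```python
from twisted.internet.defer import inlineCallbacks
from autobahn.twisted.wamp import ApplicationSession


class MLRuntime(ApplicationSession):
    """ML run-time component: exposes control procedures over WAMP."""

    @inlineCallbacks
    def onJoin(self, details):
        yield self.register(self.load_model, u"com.example.ml.load_model")
        yield self.register(self.start_detection, u"com.example.ml.start_detection")

    def load_model(self, model_path):
        # e.g. replace the current cv2.CascadeClassifier instance
        pass

    def start_detection(self, stream_id):
        # begin applying the model to frames of the given stream
        pass


# The training component (another ApplicationSession) could then push a
# freshly trained model with:
#
#     yield self.call(u"com.example.ml.load_model", "model/cascade.xml")
```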
---
The reason for the above generalization (pattern instead of only faces, plus separate run-time and training components) is that doing so makes this actually much more than a demo! E.g. further down the line we could add a UI that allows an end user to upload and define training sets of arbitrary pictures/images for other applications. Because: face detection is obviously not something an industrial user would practically do, however "broken parts vs. ok parts" is actually very, very relevant.

---
For the initial version, we can use the existing pre-trained model that OpenCV provides.
---
We want an application that detects human faces in a live video stream and shows the result in a browser-based frontend.
This is microservice-based, and I see three components as part of this: a camera capture component, a face detection component, and a video display component.
Interaction: the camera capture component publishes the video frames, the face detection component subscribes to that frame stream and publishes the detected face positions, and the video display component subscribes to one or both of those streams. A capture-side sketch follows.
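A minimal sketch of the capture side over WAMP, assuming Autobahn|Python; the topic URI is a hypothetical placeholder, and binary frame payloads need a serializer that supports raw bytes (e.g. MessagePack or CBOR):

```python
import time

import cv2
from autobahn.twisted.wamp import ApplicationSession
from twisted.internet.task import LoopingCall


class CameraCapture(ApplicationSession):
    """Camera capture component: publishes frames tagged with a timecode."""

    def onJoin(self, details):
        self._cap = cv2.VideoCapture(0)
        # Grab and publish ~10 frames/second without blocking the reactor.
        LoopingCall(self._grab_frame).start(0.1)

    def _grab_frame(self):
        ok, frame = self._cap.read()
        if not ok:
            return
        timecode = time.monotonic()
        ok, buf = cv2.imencode(".jpg", frame)  # serialize frame as JPEG bytes
        self.publish(u"com.example.video.frame", timecode, buf.tobytes())
```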
With this, we can use the video display component twice, once to show the raw image stream, once to show the face positions as part of our demo.
My initial (naive) assumption regarding coordination between the two data streams is that this happens via a timecode generated by the camera capture component, which is then also attached to the face detection data stream. The video display component can then buffer either stream until the required pairs are present; see the sketch below.
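In the actual demo this pairing would live in the browser frontend, but the logic is small enough to sketch generically; the class and callback names here are hypothetical:

```python
class StreamPairer:
    """Buffers raw frames and detection results until both are present
    for the same timecode, then hands matched pairs to the display."""

    def __init__(self, on_pair):
        self._frames = {}      # timecode -> frame
        self._detections = {}  # timecode -> list of face rectangles
        self._on_pair = on_pair

    def add_frame(self, timecode, frame):
        self._frames[timecode] = frame
        self._try_emit(timecode)

    def add_detections(self, timecode, faces):
        self._detections[timecode] = faces
        self._try_emit(timecode)

    def _try_emit(self, timecode):
        # Emit only once both halves of the pair have arrived.
        if timecode in self._frames and timecode in self._detections:
            self._on_pair(self._frames.pop(timecode),
                          self._detections.pop(timecode))


# Usage sketch: subscribe add_frame to the frame topic and add_detections
# to the face-position topic; on_pair draws the rectangles over the frame.
```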
@om26er - does the above sound reasonable?