
Commit 70848f5

Add FaceDetection TFJS implementation (#912)
* Add FaceDetection TFJS implementation
* Add predictIrises default
* Rename classes to clarify use cases
* Add TODO for changing face landmarks config
1 parent fe3db69

27 files changed: +2271 −637 lines

face-landmarks-detection/.npmignore (new file, +21 lines)

@@ -0,0 +1,21 @@
+.DS_Store
+.yalc/
+.vscode/
+.rpt2_cache/
+demos/
+scripts/
+src/
+test_data/
+coverage/
+node_modules/
+karma.conf.js
+*.tgz
+.travis.yml
+.npmignore
+tslint.json
+yarn.lock
+yalc.lock
+cloudbuild.yml
+dist/*_test.js
+dist/*_test.js.map
+dist/test_util*

face-landmarks-detection/README.md (+3 −3)

@@ -59,8 +59,8 @@ Example output:
     height: 246.87222836072945
   },
   keypoints: [
-    {x: 406.53152857172876, y: 256.8054528661723, name: "lips"},
-    {x: 406.544237446397, y: 230.06933367750395},
+    {x: 406.53152857172876, y: 256.8054528661723, z: 10.2, name: "lips"},
+    {x: 406.544237446397, y: 230.06933367750395, z: 8},
     ...
   ],
 }
@@ -69,7 +69,7 @@ Example output:
 
 The `box` represents the bounding box of the face in the image pixel space, with `xMin`, `xMax` denoting the x-bounds, `yMin`, `yMax` denoting the y-bounds, and `width`, `height` are the dimensions of the bounding box.
 
-For the `keypoints`, x and y represent the actual keypoint position in the image pixel space.
+For the `keypoints`, x and y represent the actual keypoint position in the image pixel space. z represents the depth with the center of the head being the origin, and the smaller the value the closer the keypoint is to the camera. The magnitude of z uses roughly the same scale as x.
 
 The name provides a label for some keypoint, such as 'lips', 'leftEye', etc. Note that not each keypoint will have a label.
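Since a smaller z means the keypoint sits closer to the camera, downstream code can rank keypoints by depth. A minimal sketch (the helper is hypothetical, not part of this package; the `Keypoint` shape matches the example output above):

```ts
interface Keypoint { x: number; y: number; z?: number; name?: string; }

// Return the n keypoints closest to the camera (smallest z first).
function closestKeypoints(keypoints: Keypoint[], n = 5): Keypoint[] {
  return keypoints
      .filter(k => k.z != null)
      .sort((a, b) => a.z! - b.z!)
      .slice(0, n);
}
```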

face-landmarks-detection/src/create_detector.ts (+9 −6)

@@ -15,9 +15,11 @@
  * =============================================================================
  */
 
-import {FaceDetector} from './face_detector';
-import {load as loadMediaPipeFaceMeshMediaPipeDetector} from './mediapipe/detector';
+import {FaceLandmarksDetector} from './face_landmarks_detector';
+import {load as loadMediaPipeFaceMeshMediaPipeLandmarksDetector} from './mediapipe/detector';
 import {MediaPipeFaceMeshMediaPipeModelConfig, MediaPipeFaceMeshModelConfig} from './mediapipe/types';
+import {loadMeshModel as loadMediaPipeFaceMeshTfjsLandmarksDetector} from './tfjs/detector';
+import {MediaPipeFaceMeshTfjsModelConfig} from './tfjs/types';
 import {SupportedModels} from './types';
 
 /**
@@ -28,18 +30,19 @@ import {SupportedModels} from './types';
  */
 export async function createDetector(
     model: SupportedModels,
-    modelConfig?: MediaPipeFaceMeshMediaPipeModelConfig):
-    Promise<FaceDetector> {
+    modelConfig?: MediaPipeFaceMeshMediaPipeModelConfig|
+    MediaPipeFaceMeshTfjsModelConfig): Promise<FaceLandmarksDetector> {
   switch (model) {
     case SupportedModels.MediaPipeFaceMesh:
       const config = modelConfig as MediaPipeFaceMeshModelConfig;
       let runtime;
       if (config != null) {
         if (config.runtime === 'tfjs') {
-          throw new Error('TFJS runtime is not yet supported.');
+          return loadMediaPipeFaceMeshTfjsLandmarksDetector(
+              config as MediaPipeFaceMeshTfjsModelConfig);
         }
         if (config.runtime === 'mediapipe') {
-          return loadMediaPipeFaceMeshMediaPipeDetector(
+          return loadMediaPipeFaceMeshMediaPipeLandmarksDetector(
               config as MediaPipeFaceMeshMediaPipeModelConfig);
         }
         runtime = config.runtime;
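With this change, `createDetector` dispatches on `config.runtime` instead of throwing for 'tfjs'. A usage sketch (not part of this commit; the config fields follow the defaults documented in the READMEs below):

```ts
import '@tensorflow/tfjs-backend-webgl';
import * as faceLandmarksDetection from '@tensorflow-models/face-landmarks-detection';

async function makeDetector() {
  const model = faceLandmarksDetection.SupportedModels.MediaPipeFaceMesh;
  // Before this commit, runtime: 'tfjs' threw
  // 'TFJS runtime is not yet supported.'; it now loads the TFJS detector.
  return faceLandmarksDetection.createDetector(
      model, {runtime: 'tfjs', maxFaces: 1, refineLandmarks: false});
}
```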

face-landmarks-detection/src/face_detector.ts renamed to face-landmarks-detection/src/face_landmarks_detector.ts (+6 −5)

@@ -15,12 +15,13 @@
  * =============================================================================
  */
 import {MediaPipeFaceMeshMediaPipeEstimationConfig} from './mediapipe/types';
-import {Face, FaceDetectorInput} from './types';
+import {MediaPipeFaceMeshTfjsEstimationConfig} from './tfjs/types';
+import {Face, FaceLandmarksDetectorInput} from './types';
 
 /**
  * User-facing interface for all face pose detectors.
  */
-export interface FaceDetector {
+export interface FaceLandmarksDetector {
   /**
    * Finds faces in the input image.
    *
@@ -29,9 +30,9 @@ export interface FaceDetector {
    * @param estimationConfig common config for `estimateFaces`.
    */
   estimateFaces(
-      input: FaceDetectorInput,
-      estimationConfig?: MediaPipeFaceMeshMediaPipeEstimationConfig):
-      Promise<Face[]>;
+      input: FaceLandmarksDetectorInput,
+      estimationConfig?: MediaPipeFaceMeshMediaPipeEstimationConfig|
+      MediaPipeFaceMeshTfjsEstimationConfig): Promise<Face[]>;
 
   /**
    * Dispose the underlying models from memory.
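The renamed interface keeps the same estimate/dispose lifecycle. A consumer-side sketch (assuming a detector obtained from `createDetector` above and a `<video>` element):

```ts
import * as faceLandmarksDetection from '@tensorflow-models/face-landmarks-detection';

async function detectLoop(
    detector: faceLandmarksDetection.FaceLandmarksDetector,
    video: HTMLVideoElement) {
  // Either runtime's estimation config is accepted here.
  const faces = await detector.estimateFaces(video, {flipHorizontal: false});
  console.log(`detected ${faces.length} face(s)`);
  requestAnimationFrame(() => detectLoop(detector, video));
}

// When finished, free the underlying model memory with detector.dispose().
```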

face-landmarks-detection/src/index.ts (+3 −2)

@@ -16,10 +16,11 @@
  */
 
 export {createDetector} from './create_detector';
-// FaceDetector class.
-export {FaceDetector} from './face_detector';
+// FaceLandmarksDetector class.
+export {FaceLandmarksDetector} from './face_landmarks_detector';
 // Entry point to create a new detector instance.
 export {MediaPipeFaceMeshMediaPipeEstimationConfig, MediaPipeFaceMeshMediaPipeModelConfig} from './mediapipe/types';
+export {MediaPipeFaceMeshTfjsEstimationConfig, MediaPipeFaceMeshTfjsModelConfig} from './tfjs/types';
 
 // Supported models enum.
 export * from './types';
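A sketch of consumer-side imports after this change (using the names exported above):

```ts
import {
  createDetector,
  FaceLandmarksDetector,
  MediaPipeFaceMeshTfjsModelConfig,
  SupportedModels,
} from '@tensorflow-models/face-landmarks-detection';

// The TFJS-runtime config type is now exported alongside the detector.
const config: MediaPipeFaceMeshTfjsModelConfig = {
  runtime: 'tfjs',
  maxFaces: 1,
  refineLandmarks: false,
};

async function init(): Promise<FaceLandmarksDetector> {
  return createDetector(SupportedModels.MediaPipeFaceMesh, config);
}
```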

face-landmarks-detection/src/mediapipe/README.md (+1 −1)

@@ -63,7 +63,7 @@ Pass in `faceLandmarksDetection.SupportedModels.MediaPipeFaceMesh` from the
 
 * *maxFaces*: Defaults to 1. The maximum number of faces that will be detected by the model. The number of returned faces can be less than the maximum (for example when no faces are present in the input). It is highly recommended to set this value to the expected max number of faces, otherwise the model will continue to search for the missing faces which can slow down the performance.
 
-* *predictIrises*: If set to true, refines the landmark coordinates around the eyes and lips, and output additional landmarks around the irises.
+* *refineLandmarks*: Defaults to false. If set to true, refines the landmark coordinates around the eyes and lips, and output additional landmarks around the irises.
 
 * *solutionPath*: The path to where the wasm binary and model files are located.
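For example, enabling the renamed option under the MediaPipe runtime (a sketch, assuming the imports from the Usage section of this README; the jsdelivr `solutionPath` mirrors its installation instructions):

```ts
const detector = await faceLandmarksDetection.createDetector(
    faceLandmarksDetection.SupportedModels.MediaPipeFaceMesh, {
      runtime: 'mediapipe',
      maxFaces: 1,
      refineLandmarks: true,  // formerly `predictIrises`
      solutionPath: 'https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh',
    });
```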

face-landmarks-detection/src/mediapipe/constants.ts (+1 −1)

@@ -20,7 +20,7 @@ export const DEFAULT_FACE_MESH_MODEL_CONFIG:
     MediaPipeFaceMeshMediaPipeModelConfig = {
       runtime: 'mediapipe',
       maxFaces: 1,
-      predictIrises: false
+      refineLandmarks: false
     };
 
 export const DEFAULT_FACE_MESH_ESTIMATION_CONFIG:

face-landmarks-detection/src/mediapipe/detector.ts (+9 −7)

@@ -18,18 +18,19 @@ import * as faceMesh from '@mediapipe/face_mesh';
 import * as tf from '@tensorflow/tfjs-core';
 
 import {MEDIAPIPE_KEYPOINTS} from '../constants';
-import {FaceDetector} from '../face_detector';
+import {FaceLandmarksDetector} from '../face_landmarks_detector';
 import {Keypoint} from '../shared/calculators/interfaces/common_interfaces';
 import {landmarksToDetection} from '../shared/calculators/landmarks_to_detection';
-import {Face, FaceDetectorInput} from '../types';
+import {Face, FaceLandmarksDetectorInput} from '../types';
 
 import {validateModelConfig} from './detector_utils';
 import {MediaPipeFaceMeshMediaPipeEstimationConfig, MediaPipeFaceMeshMediaPipeModelConfig} from './types';
 
 /**
  * MediaPipe detector class.
  */
-class MediaPipeFaceMeshMediaPipeDetector implements FaceDetector {
+class MediaPipeFaceMeshMediaPipeLandmarksDetector implements
+    FaceLandmarksDetector {
   private readonly faceMeshSolution: faceMesh.FaceMesh;
 
   // This will be filled out by asynchronous calls to onResults. They will be
@@ -52,7 +53,7 @@ class MediaPipeFaceMeshMediaPipeDetector implements FaceDetector {
       }
     });
     this.faceMeshSolution.setOptions({
-      refineLandmarks: config.predictIrises,
+      refineLandmarks: config.refineLandmarks,
       selfieMode: this.selfieMode,
       maxNumFaces: config.maxFaces,
     });
@@ -80,6 +81,7 @@ class MediaPipeFaceMeshMediaPipeDetector implements FaceDetector {
       const keypoint: Keypoint = {
         x: landmark.x * this.width,
         y: landmark.y * this.height,
+        z: landmark.z * this.width,
       };
 
       const name = MEDIAPIPE_KEYPOINTS.get(i);
@@ -113,7 +115,7 @@
    * @return An array of `Face`s.
    */
   async estimateFaces(
-      input: FaceDetectorInput,
+      input: FaceLandmarksDetectorInput,
       estimationConfig?: MediaPipeFaceMeshMediaPipeEstimationConfig):
       Promise<Face[]> {
     if (estimationConfig && estimationConfig.flipHorizontal &&
@@ -158,9 +160,9 @@
  * `MediaPipeFaceMeshMediaPipeModelConfig` interface.
  */
 export async function load(modelConfig: MediaPipeFaceMeshMediaPipeModelConfig):
-    Promise<FaceDetector> {
+    Promise<FaceLandmarksDetector> {
   const config = validateModelConfig(modelConfig);
-  const detector = new MediaPipeFaceMeshMediaPipeDetector(config);
+  const detector = new MediaPipeFaceMeshMediaPipeLandmarksDetector(config);
   await detector.initialize();
   return detector;
 }
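The new z scaling mirrors x: MediaPipe returns normalized landmarks, and the detector maps x and z through the image width and y through the image height. The conversion, reduced to a standalone sketch:

```ts
interface NormalizedLandmark { x: number; y: number; z: number; }
interface Keypoint { x: number; y: number; z?: number; name?: string; }

// Map a normalized MediaPipe landmark into image pixel space.
// z is scaled by width, so its magnitude is comparable to x.
function toPixelSpace(
    landmark: NormalizedLandmark, width: number, height: number): Keypoint {
  return {
    x: landmark.x * width,
    y: landmark.y * height,
    z: landmark.z * width,
  };
}
```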

face-landmarks-detection/src/mediapipe/detector_utils.ts (+2 −2)

@@ -33,8 +33,8 @@ export function validateModelConfig(
     config.maxFaces = DEFAULT_FACE_MESH_MODEL_CONFIG.maxFaces;
   }
 
-  if (config.predictIrises == null) {
-    config.predictIrises = DEFAULT_FACE_MESH_MODEL_CONFIG.predictIrises;
+  if (config.refineLandmarks == null) {
+    config.refineLandmarks = DEFAULT_FACE_MESH_MODEL_CONFIG.refineLandmarks;
   }
 
   return config;
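The fill-in-defaults pattern used here, as a standalone sketch (names are illustrative, not the package's):

```ts
interface Config { maxFaces?: number; refineLandmarks?: boolean; }

const DEFAULTS: Required<Config> = {maxFaces: 1, refineLandmarks: false};

// Copy the user config, then fill any missing (null/undefined) fields.
function validate(userConfig: Config = {}): Required<Config> {
  const config = {...userConfig};
  if (config.maxFaces == null) config.maxFaces = DEFAULTS.maxFaces;
  if (config.refineLandmarks == null) {
    config.refineLandmarks = DEFAULTS.refineLandmarks;
  }
  return config as Required<Config>;
}
```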

face-landmarks-detection/src/mediapipe/mediapipe_test.ts (+9 −9)

@@ -93,8 +93,8 @@ const EXPECTED_BOX: BoundingBox = {
 };
 
 export async function expectFaceMesh(
-    detector: faceDetection.FaceDetector, image: HTMLImageElement,
-    staticImageMode: boolean, predictIrises: boolean, numFrames: number,
+    detector: faceDetection.FaceLandmarksDetector, image: HTMLImageElement,
+    staticImageMode: boolean, refineLandmarks: boolean, numFrames: number,
     epsilon: number) {
   for (let i = 0; i < numFrames; ++i) {
     const result = await detector.estimateFaces(image, {staticImageMode});
@@ -112,14 +112,14 @@
         result[0].keypoints.map(keypoint => [keypoint.x, keypoint.y]);
     expect(keypoints.length)
         .toBe(
-            predictIrises ? MEDIAPIPE_FACEMESH_NUM_KEYPOINTS_WITH_IRISES :
-                            MEDIAPIPE_FACEMESH_NUM_KEYPOINTS);
+            refineLandmarks ? MEDIAPIPE_FACEMESH_NUM_KEYPOINTS_WITH_IRISES :
+                              MEDIAPIPE_FACEMESH_NUM_KEYPOINTS);
 
     for (const [eyeIdx, gtLds] of EYE_INDICES_TO_LANDMARKS) {
       expectArraysClose(keypoints[eyeIdx], gtLds, epsilon);
     }
 
-    if (predictIrises) {
+    if (refineLandmarks) {
       for (const [irisIdx, gtLds] of IRIS_INDICES_TO_LANDMARKS) {
         expectArraysClose(keypoints[irisIdx], gtLds, epsilon);
       }
@@ -142,15 +142,15 @@ describeWithFlags('MediaPipe FaceMesh ', BROWSER_ENVS, () => {
   });
 
   async function expectMediaPipeFaceMesh(
-      image: HTMLImageElement, staticImageMode: boolean, predictIrises: boolean,
-      numFrames: number) {
+      image: HTMLImageElement, staticImageMode: boolean,
+      refineLandmarks: boolean, numFrames: number) {
     // Note: this makes a network request for model assets.
     const model = faceDetection.SupportedModels.MediaPipeFaceMesh;
     const detector = await faceDetection.createDetector(
-        model, {...MEDIAPIPE_MODEL_CONFIG, predictIrises});
+        model, {...MEDIAPIPE_MODEL_CONFIG, refineLandmarks});
 
     await expectFaceMesh(
-        detector, image, staticImageMode, predictIrises, numFrames,
+        detector, image, staticImageMode, refineLandmarks, numFrames,
         EPSILON_IMAGE);
   }
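The expected keypoint count in these tests depends only on the renamed flag. A condensed sketch (the constants' values are not shown in this diff; 468 and 478 are the standard FaceMesh counts without and with iris refinement, stated here as an assumption):

```ts
const MEDIAPIPE_FACEMESH_NUM_KEYPOINTS = 468;
const MEDIAPIPE_FACEMESH_NUM_KEYPOINTS_WITH_IRISES = 478;

// refineLandmarks adds 10 iris keypoints on top of the base mesh.
function expectedKeypointCount(refineLandmarks: boolean): number {
  return refineLandmarks ? MEDIAPIPE_FACEMESH_NUM_KEYPOINTS_WITH_IRISES :
                           MEDIAPIPE_FACEMESH_NUM_KEYPOINTS;
}
```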

face-landmarks-detection/src/mediapipe/types.ts (+2 −2)

@@ -22,7 +22,7 @@ import {EstimationConfig, ModelConfig} from '../types';
  */
 export interface MediaPipeFaceMeshModelConfig extends ModelConfig {
   runtime: 'mediapipe'|'tfjs';
-  predictIrises: boolean;
+  refineLandmarks: boolean;
 }
 
 export interface MediaPipeFaceMeshEstimationConfig extends EstimationConfig {}
@@ -32,7 +32,7 @@ export interface MediaPipeFaceMeshEstimationConfig extends EstimationConfig {}
  *
  * `runtime`: Must set to be 'mediapipe'.
  *
- * `predictIrises`: If set to true, refines the landmark coordinates around
+ * `refineLandmarks`: If set to true, refines the landmark coordinates around
  * the eyes and lips, and output additional landmarks around the irises.
  *
  * `solutionPath`: Optional. The path to where the wasm binary and model files
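Since `runtime` discriminates the two config variants, callers can narrow the union before dispatch. A sketch using the types exported from the package entry point:

```ts
import {
  MediaPipeFaceMeshMediaPipeModelConfig,
  MediaPipeFaceMeshTfjsModelConfig,
} from '@tensorflow-models/face-landmarks-detection';

type AnyFaceMeshModelConfig =
    MediaPipeFaceMeshMediaPipeModelConfig|MediaPipeFaceMeshTfjsModelConfig;

// User-defined type guard keyed on the runtime field.
function isTfjsConfig(config: AnyFaceMeshModelConfig):
    config is MediaPipeFaceMeshTfjsModelConfig {
  return config.runtime === 'tfjs';
}
```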
face-landmarks-detection/src/tfjs/README.md (new file, +117 lines)
# MediaPipeFaceMesh

MediaPipeFaceMesh-TFJS uses the TF.js runtime to execute the model, the preprocessing and the postprocessing steps.

Please try our live [demo](https://storage.googleapis.com/tfjs-models/demos/face-landmarks-detection/index.html?model=mediapipe_facemesh).
In the runtime-backend dropdown, choose 'tfjs-webgl'.

--------------------------------------------------------------------------------

## Table of Contents

1. [Installation](#installation)
2. [Usage](#usage)

## Installation

To use MediaPipeFaceMesh, you need to first select a runtime (TensorFlow.js or MediaPipe). This guide is for the TensorFlow.js runtime. The guide for the MediaPipe runtime can be found [here](https://github.com/tensorflow/tfjs-models/tree/master/face-landmarks-detection/src/mediapipe).

Via script tags:

```html
<!-- Require the peer dependencies of face-landmarks-detection. -->
<script src="https://cdn.jsdelivr.net/npm/@mediapipe/face_mesh"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-core"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-converter"></script>

<!-- You must explicitly require a TF.js backend if you're not using the TF.js union bundle. -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-webgl"></script>

<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/face-landmarks-detection"></script>
```

Via npm:

```sh
yarn add @tensorflow-models/face-landmarks-detection
yarn add @tensorflow/tfjs-core @tensorflow/tfjs-converter
yarn add @tensorflow/tfjs-backend-webgl
yarn add @mediapipe/face_mesh
```

-----------------------------------------------------------------------
## Usage

If you are using the face-landmarks-detection API via npm, you need to import the libraries first.

### Import the libraries

```javascript
import * as faceLandmarksDetection from '@tensorflow-models/face-landmarks-detection';
import '@tensorflow/tfjs-core';
// Register WebGL backend.
import '@tensorflow/tfjs-backend-webgl';
import '@mediapipe/face_mesh';
```

### Create a detector

Pass in `faceLandmarksDetection.SupportedModels.MediaPipeFaceMesh` from the
`faceLandmarksDetection.SupportedModels` enum list along with a `detectorConfig` to the
`createDetector` method to load and initialize the model.

`detectorConfig` is an object that defines MediaPipeFaceMesh specific configurations for `MediaPipeFaceMeshTfjsModelConfig`:

* *runtime*: Must be set to 'tfjs'.

* *maxFaces*: Defaults to 1. The maximum number of faces that will be detected by the model. The number of returned faces can be less than the maximum (for example when no faces are present in the input). It is highly recommended to set this value to the expected max number of faces, otherwise the model will continue to search for the missing faces which can slow down the performance.

* *refineLandmarks*: Defaults to false. If set to true, refines the landmark coordinates around the eyes and lips, and outputs additional landmarks around the irises.

* *detectorModelUrl*: An optional string that specifies a custom URL for the detector model. This is useful for areas/countries that don't have access to the model hosted on tf.hub. It also accepts an `io.IOHandler`, which can be used with [tfjs-react-native](https://github.com/tensorflow/tfjs/tree/master/tfjs-react-native) to load the model from the app bundle directory using [bundleResourceIO](https://github.com/tensorflow/tfjs/blob/master/tfjs-react-native/src/bundle_resource_io.ts#L169).

* *landmarkModelUrl*: An optional string that specifies a custom URL for the landmark model. This is useful for areas/countries that don't have access to the model hosted on tf.hub. It also accepts an `io.IOHandler`, which can be used with [tfjs-react-native](https://github.com/tensorflow/tfjs/tree/master/tfjs-react-native) to load the model from the app bundle directory using [bundleResourceIO](https://github.com/tensorflow/tfjs/blob/master/tfjs-react-native/src/bundle_resource_io.ts#L169).

```javascript
const model = faceLandmarksDetection.SupportedModels.MediaPipeFaceMesh;
const detectorConfig = {
  runtime: 'tfjs',
};
detector = await faceLandmarksDetection.createDetector(model, detectorConfig);
```

### Run inference

Now you can use the detector to detect faces. The `estimateFaces` method accepts both image and video in many formats, including: `tf.Tensor3D`, `HTMLVideoElement`, `HTMLImageElement` and `HTMLCanvasElement`. If you want more options, you can pass in a second `estimationConfig` parameter.

`estimationConfig` is an object that defines MediaPipeFaceMesh specific configurations for `MediaPipeFaceMeshTfjsEstimationConfig`:

* *flipHorizontal*: Optional. Defaults to false. When the image data comes from a camera, the result has to be flipped horizontally.

* *staticImageMode*: Optional. Defaults to false. If set to true, face detection runs on every input image. If set to false, detection runs once and the model then simply tracks those landmarks without invoking another detection until it loses track of any of the faces (ideal for videos).

The following code snippet demonstrates how to run the model inference:

```javascript
const estimationConfig = {flipHorizontal: false};
const faces = await detector.estimateFaces(image, estimationConfig);
```

Please refer to the Face API [README](https://github.com/tensorflow/tfjs-models/blob/master/face-landmarks-detection/README.md#how-to-run-it) for the structure of the returned `faces` array.
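For instance, the returned array can be walked directly (a sketch; `lips` is one of the labeled keypoints described there):

```ts
const faces = await detector.estimateFaces(image);
for (const face of faces) {
  console.log('bounding box:', face.box);
  // Only some keypoints carry a name label.
  const lips = face.keypoints.filter(k => k.name === 'lips');
  console.log(`lip keypoints: ${lips.length}`);
}
```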
