FaceDetection

Introduction
Benchmark and Model Zoo
Quick Start
Face key-point detection
Algorithm Description
Contributing

Introduction

The goal of FaceDetection is to provide efficient and high-speed face detection solutions, including cutting-edge and classic models.

Benchmark and Model Zoo

The architectures supported by PaddleDetection are shown in the table below. Please refer to Algorithm Description for details of each algorithm.

	Original	Lite ¹	NAS ²
BlazeFace	✓	✓	✓
FaceBoxes	✓	✓	x

[1] The Lite edition reduces the number of network layers and channels.
[2] The NAS edition uses a Neural Architecture Search algorithm to optimize the network structure.

Model Zoo

mAP in WIDER FACE

Architecture	Type	Size	Img/gpu	Lr schd	Easy Set	Medium Set	Hard Set	Download	Configs
BlazeFace	Original	640	8	32w	0.915	0.892	0.797	model	config
BlazeFace	Lite	640	8	32w	0.909	0.885	0.781	model	config
BlazeFace	NAS	640	8	32w	0.837	0.807	0.658	model	config
BlazeFace	NAS_V2	640	8	32w	0.870	0.837	0.685	model	config
FaceBoxes	Original	640	8	32w	0.878	0.851	0.576	model	config
FaceBoxes	Lite	640	8	32w	0.901	0.875	0.760	model	config

NOTES:

mAP on the Easy/Medium/Hard sets is obtained by multi-scale evaluation with tools/face_eval.py. For details, refer to Evaluation.
BlazeFace-Lite training and testing use the blazeface.yml config file with lite_edition: true set.
In the Lr schd column, 32w denotes 320,000 training iterations (w stands for 10,000).

mAP in FDDB

Architecture	Type	Size	DiscROC	ContROC
BlazeFace	Original	640	0.992	0.762
BlazeFace	Lite	640	0.990	0.756
BlazeFace	NAS	640	0.981	0.741
FaceBoxes	Original	640	0.987	0.736
FaceBoxes	Lite	640	0.988	0.751

NOTES:

The results are obtained by multi-scale evaluation on the FDDB dataset. For details, refer to Evaluation.

Infer Time and Model Size comparison

Architecture	Type	Size	P4(trt32) (ms)	CPU (ms)	CPU (ms)(enable_mkldnn)	Qualcomm SnapDragon 855(armv8) (ms)	Model size (MB)
BlazeFace	Original	128	1.387	23.461	4.92	6.036	0.777
BlazeFace	Lite	128	1.323	12.802	7.16	6.193	0.68
BlazeFace	NAS	128	1.03	6.714	3.641	2.7152	0.234
BlazeFace	NAS_V2	128	0.909	9.58	7.903	3.499	0.383
FaceBoxes	Original	128	3.144	14.972	9.852	19.2196	3.6
FaceBoxes	Lite	128	2.295	11.276	6.969	8.5278	2
BlazeFace	Original	320	3.01	132.408	20.762	70.6916	0.777
BlazeFace	Lite	320	2.535	69.964	35.612	69.9438	0.68
BlazeFace	NAS	320	2.392	36.962	14.443	39.8086	0.234
BlazeFace	NAS_V2	320	1.487	52.038	38.693	56.137	0.383
FaceBoxes	Original	320	7.556	84.531	48.465	52.1022	3.6
FaceBoxes	Lite	320	18.605	78.862	46.488	59.8996	2
BlazeFace	Original	640	8.885	519.364	78.825	149.896	0.777
BlazeFace	Lite	640	6.988	284.13	131.385	149.902	0.68
BlazeFace	NAS	640	7.448	142.91	56.725	69.8266	0.234
BlazeFace	NAS_V2	640	4.201	197.695	153.626	88.278	0.383
FaceBoxes	Original	640	78.201	394.043	239.201	169.877	3.6
FaceBoxes	Lite	640	59.47	313.683	168.73	139.918	2

NOTES:

CPU: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz.
P4(trt32) and CPU tests are based on PaddlePaddle 1.8.0.
ARM test environment:
- Qualcomm SnapDragon 855(armv8);
- Single thread;
- Paddle-Lite version: develop.

Quick Start

Data Pipeline

We use the WIDER FACE dataset for training and testing the models; the official website gives a detailed introduction to the data.

WIDER FACE data source:
The wider_face dataset type is loaded from a directory structure like this:

dataset/wider_face/
├── wider_face_split
│   ├── wider_face_train_bbx_gt.txt
│   ├── wider_face_val_bbx_gt.txt
├── WIDER_train
│   ├── images
│   │   ├── 0--Parade
│   │   │   ├── 0_Parade_marchingband_1_100.jpg
│   │   │   ├── 0_Parade_marchingband_1_381.jpg
│   │   │   │   ...
│   │   ├── 10--People_Marching
│   │   │   ...
├── WIDER_val
│   ├── images
│   │   ├── 0--Parade
│   │   │   ├── 0_Parade_marchingband_1_1004.jpg
│   │   │   ├── 0_Parade_marchingband_1_1045.jpg
│   │   │   │   ...
│   │   ├── 10--People_Marching
│   │   │   ...
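
The two files under wider_face_split hold the ground-truth boxes. As a rough illustration of how such a file can be read, the sketch below assumes the commonly distributed WIDER FACE layout: an image path line, a face-count line, then one row per face whose first four values are x, y, width and height. The function name parse_wider_annotations and the sample values are hypothetical, and this is not PaddleDetection's own loader.

```python
def parse_wider_annotations(lines):
    """Map each image path to a list of (x, y, w, h) boxes."""
    records = {}
    it = iter(line.strip() for line in lines if line.strip())
    for name in it:
        count = int(next(it))
        # Assumption: an image with zero faces is still followed by
        # one placeholder row, so at least one row is always consumed.
        rows = [next(it).split() for _ in range(max(count, 1))]
        records[name] = [tuple(int(v) for v in row[:4]) for row in rows[:count]]
    return records


# Hypothetical sample in the assumed layout (values are made up).
SAMPLE = """\
0--Parade/0_Parade_marchingband_1_100.jpg
2
10 20 30 40 0 0 0 0 0 0
50 60 70 80 1 0 0 0 0 0
0--Parade/0_Parade_marchingband_1_381.jpg
0
0 0 0 0 0 0 0 0 0 0
"""

annotations = parse_wider_annotations(SAMPLE.splitlines())
```

For a real file, the same function could be called on open(path).readlines(); the remaining columns of each row (blur, occlusion and so on) are ignored here.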

Download dataset manually:
To download the WIDER FACE dataset, run the following commands:

cd dataset/wider_face && ./download.sh

Download dataset automatically: If a training session is started but the dataset is not set up properly (e.g., not found in dataset/wider_face), PaddleDetection can automatically download it from the WIDER FACE website. The decompressed dataset is cached in ~/.cache/paddle/dataset/ and is discovered automatically in subsequent runs.

Data Augmentation

Data-anchor-sampling: Randomly rescales the image to one of a range of scales, which greatly increases the variation in face scale. Specifically, a face is selected at random and v=\sqrt{width * height} is computed from its width and height; v is then located within the intervals defined by [16,32,64,128]. For example, if v=45, then 32<v<64, and one value from [16,32,64] is selected with uniform probability. If 64 is selected, the target face size is then sampled from the interval [64 / 2, min(v * 2, 64 * 2)].
Other methods: RandomDistort, ExpandImage, RandomInterpImage, RandomFlipImage, etc. Please refer to READER.md for details.
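
The data-anchor-sampling steps above can be sketched in a few lines of Python. This is only an illustration of the description in this section, not the PaddleDetection operator: the function name sample_target_face_size is hypothetical, and the handling of faces larger than 128 or smaller than 4 pixels is an assumption.

```python
import math
import random

# Scale boundaries quoted in the description above.
ANCHOR_SCALES = [16, 32, 64, 128]


def sample_target_face_size(face_w, face_h, rng=random):
    """Return (chosen_scale, target_size, resize_ratio) for one face."""
    v = math.sqrt(face_w * face_h)
    # Index of the first scale >= v, i.e. the upper end of v's interval.
    # Assumption: faces larger than 128 px fall back to the last scale.
    upper = next(
        (i for i, s in enumerate(ANCHOR_SCALES) if v <= s),
        len(ANCHOR_SCALES) - 1,
    )
    # Pick one candidate scale uniformly, e.g. from [16, 32, 64] when v = 45.
    chosen = rng.choice(ANCHOR_SCALES[: upper + 1])
    low = chosen / 2
    # Assumption: clamp so the interval stays valid for very small faces.
    high = max(low, min(v * 2, chosen * 2))
    target = rng.uniform(low, high)
    # The image would then be resized by target / v.
    return chosen, target, target / v


chosen, target, ratio = sample_target_face_size(45, 45, random.Random(0))
```

With a 45x45 face and 64 chosen, the target size falls in [32, 90], matching the interval given in the text.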

Training and Inference

For training and inference, please refer to GETTING_STARTED.md.
NOTES:

BlazeFace and FaceBoxes are trained on 4 GPUs with batch_size=8 per GPU (total batch size of 32) for 320000 iterations. (If your GPU count is not 4, please refer to the table of calculation rules for training parameters.)
Evaluation during training is currently not supported.

Evaluation

Evaluation on the WIDER FACE and FDDB datasets is currently supported. First run tools/face_eval.py to generate the evaluation result file, then use MATLAB (WIDER FACE) or OpenCV (FDDB) to calculate the evaluation metrics.
The optional arguments for tools/face_eval.py are as follows:

-f or --output_eval: Evaluation file directory; default is output/pred.
-e or --eval_mode: Evaluation mode, either widerface or fddb; default is widerface.
--multi_scale: If this flag is given, multi-scale evaluation is used. The default is False, which selects single-scale evaluation.
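
To make the defaults concrete, here is a minimal argparse sketch that mirrors the three options listed above. It is an illustration only and not the actual source of tools/face_eval.py; build_parser is a hypothetical name, and the -c and -o options used in the commands below are left out.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(description="Sketch of face evaluation options")
    parser.add_argument(
        "-f", "--output_eval", type=str, default="output/pred",
        help="Evaluation file directory.",
    )
    parser.add_argument(
        "-e", "--eval_mode", type=str, default="widerface",
        choices=["widerface", "fddb"],
        help="Evaluation mode.",
    )
    # A flag: absent means single-scale, present means multi-scale.
    parser.add_argument(
        "--multi_scale", action="store_true",
        help="Use multi-scale evaluation.",
    )
    return parser


defaults = build_parser().parse_args([])
custom = build_parser().parse_args(["--eval_mode=fddb", "--multi_scale"])
```

Here defaults carries the documented default values, while custom selects FDDB mode with multi-scale evaluation enabled.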

Evaluate on WIDER FACE

Evaluate and generate results files:

export CUDA_VISIBLE_DEVICES=0
python -u tools/face_eval.py -c configs/face_detection/blazeface.yml \
       -o weights=output/blazeface/model_final \
       --eval_mode=widerface

After the evaluation is completed, the test results are generated in txt format in output/pred.

Download the official evaluation script to evaluate the AP metrics:

wget http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/eval_script/eval_tools.zip
unzip eval_tools.zip && rm -f eval_tools.zip

Modify the result path and the name of the curve to be drawn in eval_tools/wider_eval.m:

% Modify the folder name where the result is stored.
pred_dir = './pred';
% Modify the name of the curve to be drawn.
legend_name = 'Fluid-BlazeFace';

wider_eval.m is the main execution program of the evaluation module. The run command is as follows:

matlab -nodesktop -nosplash -nojvm -r "run wider_eval.m;quit;"

Evaluate on FDDB

We provide an evaluation process for the FDDB dataset (currently Linux only). Please refer to the FDDB official website for other details.

1) Download and install OpenCV:
Download OpenCV: go to the OpenCV library page and download it manually.
Install OpenCV: please refer to the official OpenCV installation tutorial to install from source.
2) Download datasets, evaluation code, and formatted data:

./dataset/fddb/download.sh

3) Compile the FDDB evaluation code: go to the dataset/fddb/evaluation directory and modify the Makefile as follows:

evaluate: $(OBJS)
    $(CC) $(OBJS) -o $@ $(LIBS)

Modify the content in common.hpp to the following form:

#define __IMAGE_FORMAT__ ".jpg"
//#define __IMAGE_FORMAT__ ".ppm"
#define __CVLOADIMAGE_WORKING__

Use grep -r "CV_RGB" to find the code segments containing CV_RGB, change CV_RGB to Scalar, add using namespace cv; to the cpp files, and then compile:

make clean && make

4) Start evaluation:
Modify the dataset_dir and anno_path fields in the config file:

EvalReader:
  ...
  dataset:
    dataset_dir: dataset/fddb
    anno_path: FDDB-folds/fddb_annotFile.txt
    ...

Evaluate and generate results files:

python -u tools/face_eval.py -c configs/face_detection/blazeface.yml \
       -o weights=output/blazeface/model_final \
       --eval_mode=fddb

After the evaluation is completed, the test results are written in txt format to output/pred/pred_fddb_res.txt.
Generate ContROC and DiscROC data:

cd dataset/fddb/evaluation
./evaluate -a ./FDDB-folds/fddb_annotFile.txt \
           -f 0 -i ./ -l ./FDDB-folds/filePath.txt -z .jpg \
           -d {RESULT_FILE} \
           -r {OUTPUT_DIR}

NOTES:
(1) RESULT_FILE is the FDDB prediction result file output by tools/face_eval.py.
(2) OUTPUT_DIR is the prefix of the FDDB evaluation output files; two files are generated: {OUTPUT_DIR}ContROC.txt and {OUTPUT_DIR}DiscROC.txt.
(3) Run ./evaluate --help for an explanation of the arguments.

Face key-point detection

(1) Download the face key-point annotation file for the WIDER FACE dataset (Link) and copy it to the folder wider_face/wider_face_split:

cd dataset/wider_face/wider_face_split/
wget https://dataset.bj.bcebos.com/wider_face/wider_face_train_bbx_lmk_gt.txt

(2) Use the configs/face_detection/blazeface_keypoint.yml configuration file for training and evaluation; usage is the same as in the previous section.

Evaluation

Architecture	Size	Img/gpu	Lr schd	Easy Set	Medium Set	Hard Set	Download	Configs
BlazeFace Keypoint	640	16	16w	0.852	0.816	0.662	download	config

Algorithm Description

BlazeFace

Introduction:
BlazeFace is a face detection model published by Google Research. It is lightweight yet performs well, and is tailored for mobile GPU inference. It runs at 200-1000+ FPS on flagship devices.

Key features:

The anchor scheme stops at 8×8 (for a 128×128 input), with 6 anchors per pixel at that resolution.
5 single and 6 double BlazeBlocks: 5×5 depthwise convs give the same accuracy with fewer layers.
The non-maximum suppression algorithm is replaced with a blending strategy that estimates the regression parameters of a bounding box as a weighted mean of the overlapping predictions.
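
The blending strategy in the last item can be illustrated with a small pure-Python sketch. It assumes boxes in (x1, y1, x2, y2) form with positive scores, uses the scores as blending weights, and groups boxes by an IoU threshold of 0.3; the threshold, the weighting choice and the function names are assumptions made for illustration, not the settings used in this repository.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def blend_detections(boxes, scores, iou_threshold=0.3):
    """Replace each group of overlapping boxes by their score-weighted mean."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    blended = []
    while order:
        top = order[0]
        # The top box plus every remaining box that overlaps it enough.
        cluster = [top] + [
            i for i in order[1:] if iou(boxes[top], boxes[i]) > iou_threshold
        ]
        total = sum(scores[i] for i in cluster)
        box = tuple(
            sum(scores[i] * boxes[i][k] for i in cluster) / total for k in range(4)
        )
        blended.append((box, scores[top]))
        order = [i for i in order if i not in cluster]
    return blended


result = blend_detections(
    [(0, 0, 10, 10), (1, 1, 11, 11), (100, 100, 110, 110)],
    [0.75, 0.25, 0.5],
)
```

Unlike plain non-maximum suppression, which would keep only the highest-scoring box of the first two, the blended box sits between them and is pulled toward the higher score.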

Edition information:

Original: Reproduction of the original paper.
Lite: Replaces 5x5 convs with 3x3 convs, with fewer network layers and conv channels.
NAS: Uses a Neural Architecture Search algorithm to optimize the network structure, with fewer network layers and conv channels than Lite.
NAS_V2: The architecture is searched by SANAS in PaddleSlim, starting from BlazeFace-NAS. Its average precision is 3% higher than BlazeFace-NAS, and its latency on the SnapDragon 855 is only 5% higher.

FaceBoxes

Introduction:
FaceBoxes, from the paper "A CPU Real-time Face Detector with High Accuracy", is a face detector proposed by Shifeng Zhang et al. that performs well in both speed and accuracy. The paper was published at IJCB 2017.

Key features:

The anchor scheme stops at 20x20, 10x10 and 5x5 for a 640x640 network input, with 3, 1 and 1 anchors per pixel at each resolution respectively. The corresponding densities are 1, 2, 4 (20x20), 4 (10x10) and 4 (5x5).
2 convs with CReLU, 2 poolings, 3 inceptions and 2 convs with ReLU.
Uses density prior boxes to improve detection accuracy.
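
The idea behind density prior boxes is that one feature-map cell hosts a density x density grid of anchor centers instead of a single center. The sketch below shows one way to tile those centers, assuming they are spaced evenly inside the cell; the exact offsets used by this repository or by the original FaceBoxes code may differ, and both function names are hypothetical.

```python
def density_anchor_centers(col, row, stride, density):
    """Centers (in input pixels) of a density x density anchor grid in one cell."""
    centers = []
    for j in range(density):
        for i in range(density):
            # Assumption: centers are evenly spaced within the cell.
            cx = (col + (i + 0.5) / density) * stride
            cy = (row + (j + 0.5) / density) * stride
            centers.append((cx, cy))
    return centers


def anchors_per_cell(densities):
    """Anchors per cell when each anchor size has its own density."""
    return sum(d * d for d in densities)


# A 640x640 input with a 20x20 feature map gives a stride of 32 pixels.
single = density_anchor_centers(0, 0, 32, 1)
dense = density_anchor_centers(0, 0, 32, 2)
per_cell = anchors_per_cell([1, 2, 4])
```

With densities 1, 2 and 4 on the same map, each cell carries 1 + 4 + 16 = 21 anchors, so smaller anchor sizes are tiled more densely than larger ones.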

Edition information:

Original: Reproduction of the original paper.
Lite: 2 convs with CReLU, 1 pooling, 2 convs with ReLU, 3 inceptions and 2 convs with ReLU. The anchor scheme stops at 80x80 and 40x40, with 3 and 1 anchors per pixel at each resolution respectively. The corresponding densities are 1, 2, 4 (80x80) and 4 (40x40), using fewer conv channels than the original.

Contributing

Contributions are highly welcome, and we would really appreciate your feedback!
