Project 5 / Face Detection with a Sliding Window

The following parameters are held fixed across all of the iterations below.

num_negative_examples = 10000

lambda = 0.0001

template_size = 36

detection_threshold = 0.75

Iteration 1: Single Scale

In this iteration, only a single scale of 1 is used, which means the size of the detected face is fixed. This is the basic face detection implementation, and it produces the following HoG template, in which only a rough face-like shape is visible. The average precision in this iteration is 0.311.

With this implementation, many faces go undetected because the face size is fixed. In the image below, some faces are detected, but faces of slightly different sizes are missed.
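To make the fixed-size limitation concrete, here is a minimal sketch of how a sliding window could enumerate candidate boxes at scale 1. It assumes the window advances one HoG cell per step, that the cell size is 6 (the value named in Iteration 2), and that the cell grid is obtained by floor division. The function name `window_boxes` and the 100 x 80 image size are hypothetical and not taken from the project code.

```python
def window_boxes(img_h, img_w, template_size=36, cell_size=6):
    """List (x_min, y_min, x_max, y_max) boxes for a template slid over a HoG cell grid.

    Assumptions (not from the project code): the window moves one HoG cell
    per step, so the pixel stride equals cell_size; the cell grid is computed
    with floor division; coordinates are 0-based and inclusive.
    """
    cells_per_side = template_size // cell_size  # template width measured in cells
    rows = img_h // cell_size                    # HoG cells along the image height
    cols = img_w // cell_size                    # HoG cells along the image width
    boxes = []
    for r in range(rows - cells_per_side + 1):
        for c in range(cols - cells_per_side + 1):
            x_min, y_min = c * cell_size, r * cell_size
            boxes.append((x_min, y_min,
                          x_min + template_size - 1,
                          y_min + template_size - 1))
    return boxes


# Hypothetical 100 x 80 pixel image (width x height).
boxes = window_boxes(80, 100)
print(len(boxes), boxes[0], boxes[-1])
```

Every box returned is exactly 36 x 36 pixels, so a face that is noticeably larger or smaller than the template never lines up with any window. This is consistent with the low average precision at a single scale.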

Iteration 2: Multiple Scale with cell size of 6

In this iteration, scales of [1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1] are used to detect faces of different sizes. The HoG template remains unchanged. The average precision in this iteration is 0.837.

With multiple scales, the faces are almost always detected, but the number of false positives also increases. Sometimes we get bounding boxes that are much bigger than the ground truth bounding box. As the later iterations with smaller cell sizes show, faces like these are no longer detected there.
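The link between scale and detected face size can be sketched with a little arithmetic. The example below assumes the image is resized by each scale factor before the fixed 36-pixel template is applied, so a hit at scale s corresponds to a box of about 36 / s pixels in the original image. The helper names `face_size_at_scale` and `to_original_coords`, and the sample box, are hypothetical.

```python
SCALES = [1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
TEMPLATE_SIZE = 36


def face_size_at_scale(scale, template_size=TEMPLATE_SIZE):
    """Side length, in original-image pixels, of a face that fills the
    template once the image has been resized by `scale`."""
    return template_size / scale


def to_original_coords(box, scale):
    """Map a box found in the resized image back to the original image.

    Assumption: dividing each coordinate by the scale and rounding is a
    close enough approximation for illustration.
    """
    return tuple(round(v / scale) for v in box)


for s in SCALES:
    print(f"scale {s:.1f}: faces about {face_size_at_scale(s):.0f} px wide")

# A hypothetical hit at scale 0.5 maps to a box twice as large in the original.
print(to_original_coords((12, 24, 47, 59), 0.5))
```

Under these assumptions the smallest scale, 0.1, yields boxes about 360 pixels wide. That is one plausible source of detections much larger than the ground truth boxes.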

Iteration 3&4: Multiple Scale with cell size of 4 & 3

In these two iterations, the same scales were used as before, but the cell size was reduced to 4 and then to 3. Because the cell size passed to vl_hog is smaller, the features on the HoG template are much easier to make out. With a cell size of 4, we can clearly see the features for the eyes, nose, and mouth. With a cell size of 3, the features are more detailed still.
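A short calculation shows why smaller cells give a more detailed template. The sketch below assumes vl_hog is called with its default UoCTTI variant, which produces 31 values per cell, and that the 36-pixel template divides evenly into cells. The function name `template_shape` is hypothetical.

```python
TEMPLATE_SIZE = 36
DIMS_PER_CELL = 31  # assumption: vl_hog's default UoCTTI variant is used


def template_shape(cell_size, template_size=TEMPLATE_SIZE, dims=DIMS_PER_CELL):
    """Return (cells_y, cells_x, dims, total_feature_length) for one template."""
    if template_size % cell_size != 0:
        raise ValueError("template_size must be a multiple of cell_size")
    cells = template_size // cell_size
    return cells, cells, dims, cells * cells * dims


for cell_size in (6, 4, 3):
    cy, cx, d, total = template_shape(cell_size)
    print(f"cell size {cell_size}: {cy} x {cx} x {d} = {total} features")
```

Under these assumptions the template grows from 6 x 6 cells at a cell size of 6 to 9 x 9 at 4 and 12 x 12 at 3, so each cell covers a smaller patch of the face and finer structure becomes visible. The feature vector also grows from 1116 to 2511 to 4464 values, which makes each window more expensive to score.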

The average precision also went up in these iterations: 0.88 with a cell size of 4 and 0.898 with a cell size of 3, compared with 0.837 at a cell size of 6.

With a cell size of 4, the number of false positives decreased significantly. With the smaller cell size of 3, the number of false positives increased significantly, and in some images it was even higher than with a cell size of 6.

Detections with cell size of 4.

Detections with cell size of 3.

Also, with a smaller cell size, some faces that are not perfectly frontal were not detected.