Project 5 / Face Detection with a Sliding Window

Fig 1. Learned HOG template for face detection.

This project implements a sliding window face detector based on the HOG approach described by Dalal and Triggs (2005).

Each training patch is 36x36 pixels and is divided into a 6x6 grid of HOG cells (6 pixels per cell). A linear SVM is trained on the resulting HOG features with the regularization parameter lambda set to 0.00001. The trained SVM achieves 99.9% accuracy on the training data.
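As a rough illustration of what lambda controls, here is a minimal sketch of a regularized hinge-loss objective and the training-accuracy measurement. It assumes the common formulation lambda/2 * ||w||^2 plus the mean hinge loss; the write-up does not state which solver or exact objective was used, and the function names `svm_objective` and `training_accuracy` are hypothetical.

```python
def dot(w, x):
    """Inner product of two equal-length sequences."""
    return sum(wi * xi for wi, xi in zip(w, x))

def svm_objective(w, b, samples, labels, lam=0.00001):
    """Assumed objective: lam/2 * ||w||^2 + mean hinge loss.

    samples: list of feature vectors (e.g. flattened HOG descriptors)
    labels:  list of +1 (face) / -1 (non-face)
    """
    regularizer = 0.5 * lam * dot(w, w)
    hinge = sum(max(0.0, 1.0 - y * (dot(w, x) + b))
                for x, y in zip(samples, labels)) / len(samples)
    return regularizer + hinge

def training_accuracy(w, b, samples, labels):
    """Fraction of samples whose predicted sign matches the label."""
    correct = sum(1 for x, y in zip(samples, labels)
                  if (dot(w, x) + b >= 0) == (y > 0))
    return correct / len(samples)
```

With lambda as small as 0.00001 the regularization term contributes very little, so under this assumed formulation the fit is driven almost entirely by the hinge loss on the training patches.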

The detector runs at multiple scales, sliding the window in steps of one HOG cell (6 pixels in the scaled image). Scales range from 0.25x to 2x in increments of 0.25x. If the test image would be smaller than 36x36 at 0.25x, the minimum scale is raised to the smallest value that keeps the scaled image at least 36x36.
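The scale schedule and window stepping described above can be sketched as follows. This is an illustration only, not the project's code: the function names are hypothetical, and it assumes the raised minimum scale is computed from the shorter image side and that the 0.25x increments start from that raised minimum.

```python
TEMPLATE_SIZE = 36  # template side in pixels (from the text)
CELL_SIZE = 6       # HOG cell side in pixels (from the text)

def detection_scales(height, width, min_scale=0.25, max_scale=2.0, step=0.25):
    """Return the list of scales to search for an image of the given size."""
    # Smallest scale that keeps the shorter side at least TEMPLATE_SIZE pixels.
    required = TEMPLATE_SIZE / min(height, width)
    scale = max(min_scale, required)
    scales = []
    while scale <= max_scale + 1e-9:  # small tolerance for float accumulation
        scales.append(scale)
        scale += step
    return scales

def window_origins(scaled_height, scaled_width):
    """Top-left pixel corners of every 36x36 window, stepping one cell at a time."""
    cells_y = scaled_height // CELL_SIZE
    cells_x = scaled_width // CELL_SIZE
    cells_per_side = TEMPLATE_SIZE // CELL_SIZE  # 6 cells per template side
    return [(cy * CELL_SIZE, cx * CELL_SIZE)
            for cy in range(cells_y - cells_per_side + 1)
            for cx in range(cells_x - cells_per_side + 1)]
```

For a 480x640 image this yields the eight scales 0.25, 0.5, ..., 2.0. A scaled image of exactly 36x36 produces a single window at (0, 0), and anything smaller produces none.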

The detector achieved an average precision of 57.6%, as shown in figure 2.

Fig 2. Average precision plot.

Fig 3. Example detection results.