Project 5 / Face Detection with a Sliding Window


I implemented a face detection algorithm using the sliding window model. The sliding window model is conceptually simple: independently classify all image patches as being object or non-object. Sliding window classification is the dominant paradigm in object detection, and for one object category in particular, faces, it is one of the most noticeable successes of computer vision. The pipeline has three parts: feature extraction, classifier training, and sliding window detection.

Feature Extraction

Features are represented as HoG (histogram of oriented gradients) descriptors. Positive examples are 36x36 portraits, from which HoG features are extracted directly. For negative examples, 36x36 patches are randomly sampled at different scales from non-face images, and HoG features are extracted from each patch.
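The negative sampling step could be sketched roughly as follows. This is a minimal illustration, not the code used in the project: the function names, the scale set `(1.0, 0.75, 0.5)`, and the nearest-neighbour resize are all assumptions, and the HoG extraction that would run on each returned patch is left out.

```python
import numpy as np

def resize_nearest(image, scale):
    """Nearest-neighbour resize of a 2D grayscale image (assumed helper)."""
    h, w = image.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]

def sample_negative_patches(images, num_samples, patch_size=36,
                            scales=(1.0, 0.75, 0.5), seed=0):
    """Randomly crop patch_size x patch_size patches from non-face images.

    Each patch comes from a random image at a random scale. The scale
    set is an assumption; the write-up only says "different scales".
    """
    # Guard against looping forever when no image can hold a patch.
    if not any(min(img.shape[:2]) * max(scales) >= patch_size for img in images):
        raise ValueError("no image is large enough for a patch")
    rng = np.random.default_rng(seed)
    patches = []
    while len(patches) < num_samples:
        image = images[rng.integers(len(images))]
        scale = scales[rng.integers(len(scales))]
        scaled = resize_nearest(image, scale)
        h, w = scaled.shape[:2]
        if h < patch_size or w < patch_size:
            continue  # this image is too small at this scale, draw again
        top = rng.integers(h - patch_size + 1)
        left = rng.integers(w - patch_size + 1)
        patches.append(scaled[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)
```

Each returned patch would then go through the same HoG extractor as the positive examples, so that both classes share one feature dimension.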

Train Classifier

I trained a linear SVM on the positive and negative examples with regularization parameter lambda = 0.00001.
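As an illustration of what the training step does, here is a small NumPy sketch of a linear SVM fitted by subgradient descent. It assumes the objective is lambda/2 * ||w||^2 plus the mean hinge loss, which is a common convention but is not confirmed by this write-up; the solver, learning rate, and epoch count are also assumptions and not the library actually used.

```python
import numpy as np

def train_linear_svm(X, y, lam=1e-5, epochs=200, lr=0.1):
    """Fit w, b by subgradient descent on a regularized hinge loss.

    X: (n, d) feature matrix, y: (n,) labels in {+1, -1}.
    Assumed objective: lam/2 * ||w||^2 + mean(max(0, 1 - y * (X @ w + b))).
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1  # examples that violate the margin
        grad_w = lam * w - X[active].T @ y[active] / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

With a lambda this small the regularizer barely constrains `w`, so the fit is driven almost entirely by the hinge loss on the training examples.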

Sliding Window Detection

Compute the HoG features of the image, then for each 36x36 pixel window, apply the linear classifier and report a face when the score exceeds a threshold. The image is then repeatedly rescaled by a factor, and the search is run again at each scale.
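The per-window scoring at one scale might look like the sketch below. It assumes the HoG features are already available as a grid of cells with shape `(rows, cols, D)`, so that with a 6-pixel cell a 36x36 template covers 6x6 cells; the function name, the box format `(x_min, y_min, x_max, y_max)`, and the default threshold are illustrative assumptions.

```python
import numpy as np

def detect_single_scale(feat, w, b, cell_size=6, template_size=36,
                        threshold=0.0):
    """Score every template-sized window of a HoG cell grid.

    feat: (rows, cols, D) array of per-cell HoG features (assumed layout).
    w, b: linear classifier over the flattened window features.
    Returns pixel-space boxes and their scores for windows above threshold.
    """
    cells = template_size // cell_size  # template width measured in cells
    rows, cols, _ = feat.shape
    boxes, scores = [], []
    for r in range(rows - cells + 1):
        for c in range(cols - cells + 1):
            window = feat[r:r + cells, c:c + cells, :].ravel()
            score = float(window @ w + b)
            if score > threshold:
                x_min, y_min = c * cell_size, r * cell_size
                boxes.append((x_min, y_min,
                              x_min + template_size, y_min + template_size))
                scores.append(score)
    if not boxes:
        return np.zeros((0, 4)), np.zeros(0)
    return np.array(boxes, dtype=float), np.array(scores)
```

Stepping one cell at a time means the window moves in strides of `cell_size` pixels, so a smaller cell gives a denser search at a higher cost.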

Single Scale Detector

This is the baseline implementation. It uses about 10000 negative samples, a 6-pixel cell size, and lambda = 0.00001. The average precision is about 0.357. Because the detector runs at a single scale, it only detects faces that are about 36x36 pixels, which keeps the average precision low.

MultiScale Detector

Instead of running only on the original image, the detector runs on the image rescaled to multiple sizes.
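A rough sketch of the multiscale loop is shown below. The names are hypothetical: `detect_fn` stands for any single-scale detector that takes an image and returns boxes `(x_min, y_min, x_max, y_max)` with scores, and the scale step of 0.9 is an assumed value, since the factor is not stated here.

```python
import numpy as np

def resize_nearest(image, scale):
    """Nearest-neighbour resize of a 2D grayscale image (assumed helper)."""
    h, w = image.shape[:2]
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return image[rows][:, cols]

def detect_multiscale(image, detect_fn, scale_step=0.9, template_size=36):
    """Run detect_fn on progressively smaller copies of the image.

    Boxes found at scale s are divided by s so that all detections are
    reported in the coordinates of the original image.
    """
    all_boxes, all_scores = [], []
    scale = 1.0
    # Stop once the template no longer fits inside the shrunken image.
    while min(image.shape[:2]) * scale >= template_size:
        scaled = resize_nearest(image, scale)
        boxes, scores = detect_fn(scaled)
        if len(boxes):
            all_boxes.append(np.asarray(boxes, dtype=float) / scale)
            all_scores.append(np.asarray(scores, dtype=float))
        scale *= scale_step
    if not all_boxes:
        return np.zeros((0, 4)), np.zeros(0)
    return np.vstack(all_boxes), np.concatenate(all_scores)
```

Shrinking the image while keeping the template fixed at 36x36 is what lets the detector respond to faces larger than the template.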

Face template HoG visualizations for different cell sizes.

6-pixel cell, 4-pixel cell, 3-pixel cell

Precision-recall curves for different cell sizes.

Hard Negative

Negative examples are first picked randomly from the non-face training set. The non-face images are then searched exhaustively for false positives, which are added to the negative set, and the classifier is re-trained.

Initially, 5000 negative examples are randomly chosen. Then about 9000 false positives are detected. The false positive rate and average precision improve slightly.
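One round of hard negative mining can be sketched as follows. This is an illustration under assumptions rather than the project's code: `train_fn` is a hypothetical trainer that returns `(w, b)`, `candidates` stands for the features of all windows extracted from the non-face images, and a single mining round is assumed.

```python
import numpy as np

def mine_hard_negatives(window_feats, w, b, threshold=0.0):
    """Return the non-face windows that the classifier scores as faces."""
    scores = window_feats @ w + b
    return window_feats[scores > threshold]

def retrain_with_hard_negatives(pos, neg, candidates, train_fn,
                                threshold=0.0):
    """Train, collect false positives from non-face windows, then re-train.

    pos, neg: feature matrices of the initial positive/negative examples.
    candidates: features of windows taken from non-face images, so any
    window scoring above the threshold is by construction a false positive.
    """
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    w, b = train_fn(X, y)
    hard = mine_hard_negatives(candidates, w, b, threshold)
    if len(hard) == 0:
        return w, b, hard  # nothing to add, keep the first classifier
    X = np.vstack([X, hard])
    y = np.concatenate([y, -np.ones(len(hard))])
    w, b = train_fn(X, y)
    return w, b, hard
```

The mined windows are the ones the current classifier gets wrong, so they tend to be more informative than additional random negatives.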

20000 negative samples; 5000 negative samples + 15000 hard negatives

Sample Results