I implemented a face detection algorithm using the sliding window model. The model is conceptually simple: independently classify every image patch as object or non-object. Sliding window classification is the dominant paradigm in object detection, and for one object category in particular, faces, it is one of the most noticeable successes of computer vision. The pipeline has three parts: feature extraction, classifier training, and sliding window detection.
Features are represented as HoG (histogram of oriented gradients). Positive examples are 36x36 portraits, and HoG features are extracted from each one directly. For negative examples, 36x36 image patches are randomly sampled at different scales from non-face images, and HoG features are extracted from each patch.
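The negative sampling step can be sketched roughly as below. This is an illustrative NumPy sketch, not the code used for this project: the function names, the scale set `(1.0, 0.75, 0.5)`, and the nearest-neighbour rescaling are all assumptions, and the HoG extraction that would follow on each patch is left out.

```python
import numpy as np

PATCH = 36  # template size used in this write-up


def rescale_nearest(img, scale):
    """Nearest-neighbour rescale of a 2-D grayscale image (assumed helper)."""
    h, w = img.shape
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[np.ix_(rows, cols)]


def sample_negative_patches(images, n_samples, scales=(1.0, 0.75, 0.5), seed=0):
    """Draw n_samples random PATCH x PATCH crops from non-face images."""
    if not any(min(im.shape) * max(scales) >= PATCH for im in images):
        raise ValueError("no image is large enough for a 36x36 patch")
    rng = np.random.default_rng(seed)
    patches = np.empty((n_samples, PATCH, PATCH), dtype=float)
    count = 0
    while count < n_samples:
        # Pick a random image and a random scale (the scale set is an assumption).
        img = images[rng.integers(len(images))]
        scaled = rescale_nearest(img, scales[rng.integers(len(scales))])
        h, w = scaled.shape
        if h < PATCH or w < PATCH:
            continue  # too small at this scale; draw again
        top = rng.integers(h - PATCH + 1)
        left = rng.integers(w - PATCH + 1)
        patches[count] = scaled[top:top + PATCH, left:left + PATCH]
        count += 1
    return patches
```

Each returned patch would then be passed to the HoG extractor to produce one negative feature vector.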
I trained a linear SVM on the positive and negative features with lambda=0.00001.
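For readers unfamiliar with the role of lambda, here is a minimal sketch of a linear SVM trained by full-batch subgradient descent on the regularized hinge loss, lambda/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b))). It is not the solver used for this project; the learning rate, epoch count, and function name are assumptions chosen for illustration.

```python
import numpy as np


def train_linear_svm(X, y, lam=1e-5, lr=0.1, epochs=200):
    """Subgradient descent on lam/2*||w||^2 + mean hinge loss.

    X: (n, d) feature matrix, y: (n,) labels in {+1, -1}.
    lr and epochs are illustrative values, not tuned settings.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1  # samples that violate the margin
        grad_w = lam * w - X[active].T @ y[active] / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def svm_scores(X, w, b):
    """Signed confidence; positive means 'face' under this convention."""
    return X @ w + b
```

A small lambda such as 0.00001 means the regularization term contributes very little, so the classifier is fitted closely to the training data.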
Compute the HoG features of the image, then for each 36x36 pixel box, apply the linear classifier and compare its score against a threshold to decide whether the box is a face. Then rescale the image by a fixed factor and repeat.
Instead of running only on the original image, the detector runs on the image rescaled and sampled multiple times.
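The detection loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the project code: `cell_means` is a stand-in that returns one mean-intensity value per cell instead of real HoG features, and the scale step of 0.8, the threshold of 0, and all function names are hypothetical.

```python
import numpy as np


def cell_means(img, cell):
    """Stand-in feature map (NOT real HoG): mean intensity per cell."""
    gh, gw = img.shape[0] // cell, img.shape[1] // cell
    img = img[:gh * cell, :gw * cell]
    return img.reshape(gh, cell, gw, cell).mean(axis=(1, 3))[..., None]


def rescale_nearest(img, scale):
    """Nearest-neighbour rescale of a 2-D grayscale image (assumed helper)."""
    h, w = img.shape
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return img[np.ix_(rows, cols)]


def detect_multiscale(img, w, b, feature_map=cell_means, cell=6, win=36,
                      scale_step=0.8, threshold=0.0):
    """Return (x0, y0, x1, y1, score) boxes in original-image coordinates."""
    n = win // cell  # window size measured in cells
    detections = []
    scale, cur = 1.0, img
    while min(cur.shape) >= win:
        fmap = feature_map(cur, cell)  # features computed once per scale
        for i in range(fmap.shape[0] - n + 1):
            for j in range(fmap.shape[1] - n + 1):
                score = fmap[i:i + n, j:j + n].ravel() @ w + b
                if score > threshold:
                    # Map the window back to the original resolution.
                    x0, y0 = int(j * cell / scale), int(i * cell / scale)
                    size = int(win / scale)
                    detections.append((x0, y0, x0 + size, y0 + size, float(score)))
        scale *= scale_step
        cur = rescale_nearest(img, scale)
    return detections
```

Because the window slides over the cell grid, a smaller cell size gives a finer step between neighbouring windows at the cost of more windows to score.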
Face template HoG visualization with different cell sizes.
6 pixel cell | 4 pixel cell | 3 pixel cell |
Precision-recall curves with different cell sizes.
Negative examples are first picked randomly from the non-face training set. The non-face images are then searched exhaustively for false positives, which are added to the negative set, and the classifier is retrained.
Initially, 5000 negative examples are randomly chosen. About 9000 false positives are then detected. The false positive rate and average precision improve slightly.
20000 random negative samples | 5000 random negative samples + 15000 hard negatives |
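One round of the hard negative mining described above might look like the sketch below. It works on precomputed feature vectors and reuses a simple subgradient-descent SVM. The trainer, its learning rate and epoch count, the zero threshold, and all function names are assumptions for illustration, not the project's implementation.

```python
import numpy as np


def train_linear_svm(X, y, lam=1e-5, lr=0.1, epochs=200):
    """Subgradient descent on lam/2*||w||^2 + mean hinge loss (illustrative)."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1  # margin violators
        w -= lr * (lam * w - X[active].T @ y[active] / n)
        b -= lr * (-y[active].sum() / n)
    return w, b


def mine_hard_negatives(w, b, neg_pool, threshold=0.0):
    """Non-face features the current classifier wrongly scores as faces."""
    return neg_pool[neg_pool @ w + b > threshold]


def train_with_hard_negatives(pos, neg_init, neg_pool, lam=1e-5):
    """Train, collect false positives from the non-face pool, then retrain."""
    X = np.vstack([pos, neg_init])
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg_init))])
    first = train_linear_svm(X, y, lam)

    hard = mine_hard_negatives(*first, neg_pool)

    # Append the false positives as extra negatives and retrain from scratch.
    X2 = np.vstack([X, hard])
    y2 = np.concatenate([y, -np.ones(len(hard))])
    second = train_linear_svm(X2, y2, lam)
    return first, second, hard
```

Here `neg_pool` stands for the features of every window found by exhaustively scanning the non-face images, so `hard` contains exactly the windows the first classifier gets wrong.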