Project 5 / Face Detection with a Sliding Window

The goal of this project is to implement a Dalal-Triggs style face detector. The sample faces used for training are from the Caltech Web Faces project, and the sample non-faces are from Wu et al. and the SUN scene database. The test images come from the CMU+MIT face detection test set. For the project, I implemented the following sections.

  1. Get positive features
  2. Get random negative features
  3. SVM classification
  4. Run detector

Get positive features

Get positive features converts a database of faces into histogram of oriented gradients (HoG) features. The input faces are all greyscale and of size 36x36. I performed the following steps to accomplish this.

  1. Load each image in a file directory
  2. For each image, use VLFeat's vl_hog method to convert the image into a histogram of gradients. VLFeat's vl_hog method takes as input an image and cell size. I used a cell size of 6 when implementing this project.
  3. vl_hog returns one 31-bin histogram per cell, so each 36x36 face yields a 6x6 grid of cells (6 x 6 x 31 = 1116 values).
  4. Store all of the generated histograms in a matrix and return it.
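
As a quick check of the sizes involved, the descriptor length can be worked out in a few lines. This is a Python sketch rather than the MATLAB/VLFeat code used for the project; the function name `hog_feature_length` is hypothetical, and it assumes the template side is an exact multiple of the cell size.

```python
def hog_feature_length(template_size: int, cell_size: int, bins_per_cell: int = 31) -> int:
    """Length of the flattened HoG descriptor for a square template.

    Assumes the template side is an exact multiple of the cell size,
    as with the 36x36 faces and cell size 6 described above.
    """
    if template_size % cell_size != 0:
        raise ValueError("template_size must be a multiple of cell_size")
    cells_per_side = template_size // cell_size
    return cells_per_side * cells_per_side * bins_per_cell


# 36 / 6 = 6 cells per side, so 6 * 6 * 31 values per face.
print(hog_feature_length(36, 6))
```

Each row of the returned feature matrix would then hold one such flattened descriptor.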

Get random negative features

Get random negative features takes patches from non-face images at different scales and converts them to histograms of gradients of the same size as in "get_positive_features". This required performing the following steps.

  1. Load a random image in the directory and convert it to greyscale.
  2. For scales 1, .8, and .64 of the image, convert the image into a histogram of gradients with VLFeat's vl_hog method using the same cell size of 6.
  3. Slide through the histograms taking non-overlapping patches, so that each patch of histograms matches the size of the histograms obtained from the templates in "get_positive_features".
  4. Count the histogram patches as they are collected and stop once exactly "num_samples" patches have been cut out. "num_samples" is an input to the "get_random_negative_features" function.
  5. Store all of the generated histograms in a matrix and return it.
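
The sampling loop above can be sketched as follows. This is a Python illustration of the bookkeeping only, not the project's MATLAB code: it works on image sizes instead of pixels, approximates the HoG grid size by integer division, and all function names and the `max_draws` guard are assumptions introduced for the example.

```python
import random


def scaled_hog_grid(height, width, scale, cell_size):
    """Approximate HoG grid size (rows, cols) of an image resized by `scale`."""
    return (int(height * scale) // cell_size, int(width * scale) // cell_size)


def non_overlapping_windows(grid_rows, grid_cols, cells_per_side):
    """Top-left cell coordinates of non-overlapping square windows."""
    return [(r, c)
            for r in range(0, grid_rows - cells_per_side + 1, cells_per_side)
            for c in range(0, grid_cols - cells_per_side + 1, cells_per_side)]


def sample_negative_windows(image_sizes, num_samples, cell_size=6, template_size=36,
                            scales=(1.0, 0.8, 0.64), seed=0, max_draws=10000):
    """Pick random images and collect windows until num_samples are found.

    Returns tuples (image_index, scale, row, col). `max_draws` is a
    hypothetical guard against looping forever on images that are too small.
    """
    rng = random.Random(seed)
    cells_per_side = template_size // cell_size
    samples = []
    if num_samples <= 0:
        return samples
    for _ in range(max_draws):
        index = rng.randrange(len(image_sizes))
        height, width = image_sizes[index]
        for scale in scales:
            rows, cols = scaled_hog_grid(height, width, scale, cell_size)
            for row, col in non_overlapping_windows(rows, cols, cells_per_side):
                samples.append((index, scale, row, col))
                if len(samples) == num_samples:
                    return samples  # stop at exactly num_samples patches
    return samples
```

In the real function each `(row, col)` window would be cut out of the vl_hog output and flattened into one row of the negative feature matrix.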

SVM classification

This step trains a model that assigns each image patch a confidence of being a face. It uses a linear support vector machine, trained with the following steps.

  1. Use the results from "get_positive_features" and "get_random_negative_features" to create the training data.
  2. Select a lambda value for the SVM classification. I found that .0001 gave decent results and used that for the implementation.
  3. Use VLFeat's vl_svmtrain to create the hyperplane model that will be used to evaluate images.
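
To make the role of lambda concrete, here is a small Python sketch of how the training set can be labelled (+1 for faces, -1 for non-faces) and of the regularized hinge-loss objective of a linear SVM. It is an illustration, not the project's code: the function names are hypothetical, and the assumption that this is the same form vl_svmtrain minimizes should be checked against the VLFeat documentation.

```python
def build_training_set(features_pos, features_neg):
    """Stack positive and negative descriptors and label them +1 / -1."""
    samples = list(features_pos) + list(features_neg)
    labels = [1.0] * len(features_pos) + [-1.0] * len(features_neg)
    return samples, labels


def svm_objective(w, b, samples, labels, lam):
    """Regularized hinge loss: lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))."""
    regularizer = 0.5 * lam * sum(wi * wi for wi in w)
    hinge_total = 0.0
    for x, y in zip(samples, labels):
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        hinge_total += max(0.0, 1.0 - margin)
    return regularizer + hinge_total / len(samples)


# Toy two-dimensional example; real descriptors have 1116 values each.
X, y = build_training_set([[2.0, 0.0]], [[-2.0, 0.0]])
print(svm_objective([1.0, 0.0], 0.0, X, y, lam=0.0001))
```

A smaller lambda puts less weight on the regularizer, so the fitted hyperplane follows the training data more closely.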

Run detector

  1. Load each image to search for faces in and convert it to greyscale
  2. Convert multiple scales of the image into cells of histograms of gradients using VLFeat's vl_hog method. I used the same cell size of 6 as before. Each scale is 0.8 times the previous one: 1, .8, .64, .512, .4096, .32768, .262144, .2097152.
  3. Slide through the histograms taking overlapping patches, so that each patch of histograms matches the size of the histograms obtained from the templates in "get_positive_features".
  4. Use the SVM model to compute a confidence score for each patch from the learned weights w and bias b:
    		score = w'*hogPatch(:) + b;
    	
  5. If the score is above some threshold, store the bounding box, the confidence, and the source image into matrices. I tried several thresholds and obtained a good balance between precision and recall at a threshold of .5.
  6. Run non-maximum suppression on all the candidate faces in each image. The detections from all scales of a single image are suppressed together in one pass.
  7. Return the bounding boxes and confidences that have passed through non-maximum suppression.
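
The detector steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's MATLAB code: the HoG grid is a nested list indexed `[row][col][bin]`, coordinates are 0-based pixels, all function names are hypothetical, and the greedy IoU-based suppression (with an assumed overlap threshold of 0.3) is a generic stand-in that may differ from the non-maximum suppression routine used in the project.

```python
def detection_scales(factor=0.8, count=8):
    """Geometric series of scales: 1, .8, .64, ..."""
    return [factor ** k for k in range(count)]


def score_patch(w, b, patch):
    """Linear SVM confidence, the same form as score = w'*hogPatch(:) + b."""
    return sum(wi * xi for wi, xi in zip(w, patch)) + b


def window_to_bbox(row, col, scale, cell_size=6, template_size=36):
    """Map a window's top-left HoG cell back to original-image pixels."""
    x_min = col * cell_size / scale
    y_min = row * cell_size / scale
    side = template_size / scale
    return (x_min, y_min, x_min + side, y_min + side)


def slide_and_score(hog, w, b, scale, threshold, cells_per_side=6, cell_size=6):
    """Score every overlapping window (stride of one cell) and keep the strong ones."""
    rows, cols = len(hog), len(hog[0])
    detections = []
    for r in range(rows - cells_per_side + 1):
        for c in range(cols - cells_per_side + 1):
            patch = [value
                     for rr in range(r, r + cells_per_side)
                     for cc in range(c, c + cells_per_side)
                     for value in hog[rr][cc]]
            score = score_patch(w, b, patch)
            if score > threshold:
                box = window_to_bbox(r, c, scale, cell_size, cells_per_side * cell_size)
                detections.append((box, score))
    return detections


def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: visit boxes by descending score, drop heavy overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept
```

Because every box is mapped back to original-image coordinates before suppression, detections found at different scales can be compared directly in a single pass.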

Results

In the end, I used the following parameters to obtain my results.

I obtained the following results.

Face template HoG visualization for the starter code. This is completely random, but it should look like a face once a reasonable classifier is trained.
Precision-recall curve for the starter code.
Example of detection on the test set from the starter code.
Example of detection on the Class Test example