Project 5: Face detection with a sliding window

The objective of this project is to detect faces in images. To this end, we first train a linear SVM (with lambda=0.0001) to distinguish faces from non-faces, using their HoG features as input. We then evaluate the trained SVM on the test images using a sliding window approach. The following functions were implemented for this project.

  1. get_positive_features.m: This function returns the HoG features from the positive training set.
  2. get_random_negative_features.m: This function returns the HoG features from the negative training set. A fixed number of features is extracted at random from each image, and the set of all such features is returned.
  3. run_detector.m: This is the implementation of the sliding window for face detection. The window performs a horizontal search across the image for features which are identified as 'positive' by the trained SVM.
  4. run_detector_w_neg_mining.m: This is the extra credit implementation of the above with hard negative mining.
  5. proj5.m: This is the script used to run the project. The SVM training is also implemented by hand in this file.
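
The sliding window search in run_detector.m can be illustrated with a minimal sketch. It is written in Python rather than the project's MATLAB, and every name in it (slide_window, score_fn, mean_brightness, the step and threshold values) is a hypothetical stand-in, not the project code. A real detector would score HoG features of each window with the trained SVM, and would typically repeat the search over several image scales.

```python
def slide_window(image, win, step, score_fn, threshold=0.0):
    """Return (row, col, score) for every window whose score exceeds threshold.

    image    : 2-D list of numbers (rows of pixel values)
    win      : side length of the square window, in pixels
    step     : stride between neighbouring windows, in pixels
    score_fn : callable mapping a win x win patch to a confidence
               (stand-in for HoG extraction followed by the SVM score)
    """
    rows, cols = len(image), len(image[0])
    detections = []
    for r in range(0, rows - win + 1, step):
        for c in range(0, cols - win + 1, step):
            patch = [row[c:c + win] for row in image[r:r + win]]
            score = score_fn(patch)
            if score > threshold:
                detections.append((r, c, score))
    return detections


def mean_brightness(patch):
    """Toy scorer: average pixel value minus 0.5 (hypothetical, not an SVM)."""
    total = sum(sum(row) for row in patch)
    return total / (len(patch) * len(patch[0])) - 0.5


# 6 x 6 image with a bright 2 x 2 blob at rows 2-3, columns 2-3.
image = [[0.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (2, 3):
        image[r][c] = 1.0

hits = slide_window(image, win=2, step=1, score_fn=mean_brightness)
print(hits)  # [(2, 2, 0.5)]
```

Only the window that fully covers the blob passes the threshold. Lowering the threshold admits more windows, which trades missed detections for false positives.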

Hard negative mining

For each image, and for each horizontal search pass over it, the features that the SVM identifies as negative are used to re-train the SVM.
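
For comparison, here is a minimal sketch of hard negative mining in its usual formulation: windows taken from face-free images that the current SVM scores as faces are added to the negative set, and the SVM is trained again. This may differ from the procedure in run_detector_w_neg_mining.m. The sketch is in Python rather than MATLAB, the trainer is a simple full-batch subgradient method assumed for illustration, and all names and the toy 2-D features are hypothetical.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(X, y, lam=0.0001, epochs=200, lr=0.1):
    """Full-batch subgradient descent on lam/2 * ||w||^2 + mean hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # margin violated
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wi - lr * gj for wi, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def mine_hard_negatives(w, b, candidates, threshold=0.0):
    """Return the face-free features that the current model scores as faces."""
    return [x for x in candidates if dot(w, x) + b > threshold]


# Toy 2-D "features" (real HoG features have thousands of dimensions).
pos = [[2.0, 2.0], [3.0, 2.5]]
neg = [[-2.0, -2.0], [-3.0, -1.5]]
# Candidate windows, all assumed to come from images without faces.
pool = [[1.5, 1.0], [-2.5, -2.0], [0.8, 1.2]]

# Round 1: train on the initial positives and random negatives.
w, b = train_linear_svm(pos + neg, [1, 1, -1, -1])

# Mine: every pool window scored as a face is a false positive by construction.
hard = mine_hard_negatives(w, b, pool)
print(hard)  # [[1.5, 1.0], [0.8, 1.2]]

# Round 2: add the hard negatives with label -1 and train again.
X2 = pos + neg + hard
y2 = [1] * len(pos) + [-1] * (len(neg) + len(hard))
w2, b2 = train_linear_svm(X2, y2)
```

Whether the retrained model rejects every mined window depends on the data and on the optimizer settings, so the mining and retraining steps are often repeated for a few rounds.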

Results

The baseline detector performs well, with average precision between 84.1% and 92.0% depending on the settings. The average precisions are summarized below:

  1. HoG cell size = 6, num_negative_examples=10000, Average Precision=87.1%
  2. HoG cell size = 6, num_negative_examples=100000, Average Precision=84.1%
  3. HoG cell size = 4, num_negative_examples=10000, Average Precision=89.7%
  4. HoG cell size = 3, num_negative_examples=10000, Average Precision=92.0%
  5. HoG cell size = 6 (with hard negative mining), num_negative_examples=10000, Average Precision=65.6%
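
To make the metric concrete, the sketch below computes a non-interpolated average precision from a ranked list of detections. It is an illustration in Python under stated assumptions, not the evaluation code used for the numbers above, which may use a different variant (for example an interpolated precision-recall curve). The function name and the toy detections are hypothetical.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Non-interpolated AP: mean precision at the rank of each true positive.

    scores           : confidence of each detection
    is_true_positive : True if the detection matches a ground-truth face
    num_ground_truth : total number of faces in the test set
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if is_true_positive[i]:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truth


# Four detections, two of them correct, two faces in total.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [True, False, True, False]
ap = average_precision(scores, labels, num_ground_truth=2)
print(round(ap, 4))  # 0.8333
```

A false positive ranked above a true positive lowers the precision at that true positive's rank, and a missed face lowers the result through the division by num_ground_truth. Both effects reduce average precision.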

Observations

For the baseline detector, smaller HoG cell sizes lead to higher average precision at the cost of computation time. Increasing num_negative_examples also increases computation time, though less significantly; in the runs above it did not improve average precision (84.1% with 100000 examples versus 87.1% with 10000). With hard negative mining, fewer false positives are observed, but average precision is lower. Hard negative mining also takes a long time to run, so further experiments were not pursued.
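
One reason smaller cells cost more time is that the feature vector for each window grows quickly as the cell size shrinks. The short calculation below assumes a 36 x 36 pixel template and 31 values per HoG cell (as in the UoCTTI HoG variant). Neither value is stated in this report, so both are assumptions made for illustration, and the function name is hypothetical.

```python
def hog_feature_length(template_size, cell_size, dims_per_cell=31):
    """Length of the HoG descriptor for one square template (assumed layout)."""
    cells_per_side = template_size // cell_size
    return cells_per_side * cells_per_side * dims_per_cell


for cell_size in (6, 4, 3):
    print(cell_size, hog_feature_length(36, cell_size))
# 6 1116
# 4 2511
# 3 4464
```

Under these assumptions, going from a cell size of 6 to 3 makes each descriptor four times longer. If the window also moves one cell at a time, the number of windows per image grows as well.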

Extra Credit

The following extra credit attempts were made.

  1. Reducing the HoG cell size: the smallest size used is 3.
  2. Implementing hard negative mining.

HoG feature (Contour of a face)

The curves for cases 1, 3, and 4 listed under Results are shown below.

Below are a few examples of the results returned by the detection algorithm. In most images the faces are correctly detected (true positives), along with a number of false positives.