The aim of this project is to detect faces in images. We use a sliding-window approach: a window is slid across the image, and the image patch at each window position is passed to a classifier that decides whether the window contains a face. This is repeated at several scales because faces appear at different sizes in different images. The following components were implemented and tested.
The best average precision I achieved on the test dataset was 0.833.
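The multi-scale scan described above can be sketched as follows. This is a minimal illustration and not the project's detector: the stride, the scale factors, and the score_patch function are assumptions introduced only for this example, the classifier is replaced by a stand-in that scores raw pixels, and the image is downsampled by plain index selection so that the sketch needs no toolbox.

```matlab
% Minimal sketch of a multi-scale sliding-window scan (assumed parameters).
img = rand(120, 160, 'single');        % stand-in grayscale image
template_size = 36;                    % window side in pixels, as in the report
step = 6;                              % ASSUMED stride in pixels
scales = [1.0 0.8 0.64];               % ASSUMED scale factors
score_patch = @(p) mean(p(:)) - 0.5;   % HYPOTHETICAL stand-in for the classifier

bboxes = zeros(0, 4);                  % [x_min y_min x_max y_max], original coordinates
confidences = zeros(0, 1);
for s = scales
    % Nearest-neighbour downsampling by index selection.
    rows = round(linspace(1, size(img, 1), round(size(img, 1) * s)));
    cols = round(linspace(1, size(img, 2), round(size(img, 2) * s)));
    scaled = img(rows, cols);
    [h, w] = size(scaled);
    for y = 1:step:(h - template_size + 1)
        for x = 1:step:(w - template_size + 1)
            patch = scaled(y:y+template_size-1, x:x+template_size-1);
            c = score_patch(patch);
            if c > 0
                % Map the window back to the coordinates of the original image.
                bboxes(end+1, :) = round([x, y, x+template_size-1, y+template_size-1] / s);
                confidences(end+1, 1) = c;
            end
        end
    end
end
fprintf('%d candidate windows kept\n', size(bboxes, 1));
```

In a HoG-based detector the window would normally step over HoG cells of the scaled image instead of raw pixels, and overlapping detections would still need to be merged afterwards.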
HoG (Histogram of Oriented Gradients) features are commonly used for object detection. The image is divided into small square cells, a histogram of oriented gradients is computed in each cell, and the results are normalized and combined into a final descriptor. The HoG cell size used is 6 pixels. Each input image is 36×36 pixels, so the descriptor returned by vl_hog is 6×6×31, which is then reshaped into a vector of size 1×1116.
for i = 1:num_images
    img_path = strcat(train_path_pos, '/', image_files(i).name);
    img = im2single(imread(img_path));
    if size(img, 3) > 1
        img = rgb2gray(img); % convert color images to grayscale
    end
    % Flatten the HoG descriptor of the face image into one row of length D.
    features_pos(i, :) = reshape(vl_hog(img, hog_cell_size), 1, D);
end
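As a quick check of the descriptor size quoted above, the arithmetic can be written out as below. The values 36 and 6 come from the report. The breakdown of the 31 channels assumes vl_hog's default variant with 9 orientations (18 directed bins, 9 undirected bins, and 4 texture channels), and the variable names are chosen only for this sketch.

```matlab
% Descriptor length for one 36x36 template (sketch; names are illustrative).
template_size = 36;                          % window side in pixels (from the report)
hog_cell_size = 6;                           % HoG cell side in pixels (from the report)
num_orientations = 9;                        % ASSUMED: vl_hog default
n_channels = 3 * num_orientations + 4;       % 18 directed + 9 undirected + 4 texture = 31

cells_per_side = template_size / hog_cell_size;   % 6 cells along each side
D = cells_per_side^2 * n_channels;                % 6 * 6 * 31 = 1116
fprintf('%d x %d x %d -> 1 x %d\n', cells_per_side, cells_per_side, n_channels, D);
```

With these values the script prints 6 x 6 x 31 -> 1 x 1116, which matches the length D of each feature row.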
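The initial accuracy on the training data is 1.0, which may indicate that the model is overfitting the training data. The four rates reported below are fractions of all training samples, which is why the true positive and true negative rates sum to 1.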
accuracy: 1.000
true positive rate: 0.398
false positive rate: 0.000
true negative rate: 0.602
false negative rate: 0.000
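The figures above are consistent with each rate being computed as a fraction of all training samples. The sketch below shows one way such numbers could be produced. It is an illustration on toy data, not the project's evaluation code: the label convention (+1 for face, -1 for non-face), the zero threshold, and all variable names are assumptions.

```matlab
% Toy evaluation sketch (ASSUMED: +1 = face, -1 = non-face, threshold at 0).
labels      = [ 1;   1;  -1;   -1;   -1 ];
confidences = [ 2.1; 0.3; -1.2; -0.4; -3.0 ];   % hypothetical classifier scores

n = numel(labels);
pred_pos = confidences >= 0;   % predicted as face
is_pos   = labels > 0;         % actually a face

accuracy = sum(pred_pos == is_pos) / n;
tp_rate  = sum( pred_pos &  is_pos) / n;   % faces classified as faces
fp_rate  = sum( pred_pos & ~is_pos) / n;   % non-faces classified as faces
tn_rate  = sum(~pred_pos & ~is_pos) / n;   % non-faces classified as non-faces
fn_rate  = sum(~pred_pos &  is_pos) / n;   % faces classified as non-faces

fprintf('accuracy: %.3f\n', accuracy);
fprintf('true positive rate: %.3f\n', tp_rate);
fprintf('false positive rate: %.3f\n', fp_rate);
fprintf('true negative rate: %.3f\n', tn_rate);
fprintf('false negative rate: %.3f\n', fn_rate);
```

On this toy data the true positive rate is 0.400 and the true negative rate is 0.600, which are simply the shares of positive and negative samples. Read the same way, the reported 0.398 and 0.602 would mean that about 39.8% of the training samples are faces.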
num_images = length(image_files);
template_size = feature_params.template_size;
hog_cell_size = feature_params.hog_cell_size;
D = (template_size / hog_cell_size)^2 * 31; % length of one flattened HoG descriptor
features_neg = []; % grows to (number of sampled patches) x D
num_samples_per_image = ceil(num_samples / num_images);
for i = 1:num_images
    img_path = strcat(non_face_scn_path, '/', image_files(i).name);
    img = im2single(imread(img_path));
    if size(img, 3) > 1
        img = rgb2gray(img); % convert color images to grayscale
    end
    [height, width] = size(img);
    % Draw random top-left corners so that every patch lies inside the image.
    x = randi([1 width - template_size], 1, num_samples_per_image);
    y = randi([1 height - template_size], 1, num_samples_per_image);
    for j = 1:num_samples_per_image
        % Crop the j-th random patch and append its flattened HoG descriptor.
        sample = img(y(j):y(j)+template_size-1, x(j):x(j)+template_size-1);
        features_neg = [features_neg; reshape(vl_hog(sample, hog_cell_size), 1, D)];
    end