Project 5 / Face Detection with a Sliding Window

This figure shows one of the images that went through the face detection algorithm. As you can see, the detected faces are indicated by the green squares.

This project required me to develop a face detection algorithm. For example, it scans a photo, like the one above, and detects all the faces in the picture. This type of AI is currently used in modern cameras, which instantly show the user where faces are and then calibrate the camera to focus on those faces rather than on the background. Since I had already completed a SIFT descriptor in a previous project, this project focused on the face detection process itself.

Results

1. Get positive features function

This function required me to take in a set of photos with faces in them and convert them to HoG templates. I began by initializing features_pos to the number of images by the template dimensionality, given by the equation:

(feature_params.template_size / feature_params.hog_cell_size)^2 * 31
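
For example, with a 36 by 36 template and 6-pixel HoG cells (6 here is just an example value; the actual cell size is whatever feature_params specifies), this comes out to (36/6)^2 * 31 = 1116 dimensions per template.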

Once that was set up, I needed to loop through all the images, read the file names, and use vl_hog to calculate the HoG of each. After that, we flattened these values so they are easy for the SVM to classify. It is worth noting that each image is already cropped to 36 by 36, which makes it easier to work with. Below is the loop that flattened out the values for easy classification by the SVM.

% Copy the flattened HoG vector into row `index` of features_pos
for current = 1:valueSize(2)
    features_pos(index, current) = reshapedValue(1, current);
end
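
For context, here is a minimal sketch of how the full loop might look. The names num_images, image_files, and train_path_pos are placeholders of mine, not necessarily those in the starter code:

% Sketch: build the positive feature matrix, one flattened HoG per row.
% num_images, image_files, and train_path_pos are assumed names.
template_dim = (feature_params.template_size / feature_params.hog_cell_size)^2 * 31;
features_pos = zeros(num_images, template_dim, 'single');

for index = 1:num_images
    img = im2single(imread(fullfile(train_path_pos, image_files(index).name)));
    hog = vl_hog(img, feature_params.hog_cell_size);   % e.g. 6x6x31 for a 36x36 crop with 6-pixel cells
    features_pos(index, :) = reshape(hog, 1, []);      % flatten for the SVM
end
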
2. Get random negative features function

This function required me to take in a set of images without faces in them and return negative training features sampled from them. Before using these images, they should be converted to grayscale, since the positive training data is grayscale; I used rgb2gray to turn each file from color to grayscale. This may have made my results a bit less accurate, since many of the images probably ended up looking more or less like each other, but it was necessary for this project. Next, we cut a 36 by 36 patch out of each image; knowing the width and height of the image along with the patch size allows us to choose a section to use. Below is the code that does that:

getImageSize = size(image);
height = getImageSize(1);
width = getImageSize(2);

% Copy a featureSize x featureSize (36x36) patch out of the
% bottom-right region of the image, one pixel at a time
newValuesCut = single(zeros(36,36));
for a = 1:featureSize
    for b = 1:featureSize
        newValuesCut(a,b) = image(height-featureSize+a-1, width-featureSize+b-1);
    end
end

Finally, we use vl_hog to get our negative feature values. vl_hog took in newValuesCut and the cell size, as directed by the API.
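
To make the patch selection concrete, here is a hedged sketch that samples a random 36 by 36 patch rather than the fixed corner crop shown above; image_path and the loop index i are placeholder names of mine:

% Sketch: grab one random grayscale patch and convert it to HoG.
% image_path and i are assumed names, not from the starter code.
img = imread(image_path);
if size(img, 3) == 3
    img = rgb2gray(img);               % match the grayscale positives
end
img = im2single(img);

D = feature_params.template_size;      % 36
row = randi(size(img, 1) - D + 1);     % random top-left corner
col = randi(size(img, 2) - D + 1);
patch = img(row:row+D-1, col:col+D-1);

hog = vl_hog(patch, feature_params.hog_cell_size);
features_neg(i, :) = reshape(hog, 1, []);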

3. Classifier training

For this step, we use vl_svmtrain to get the linear classifier. For vl_svmtrain, we needed data and labels to input into the function. The data was obtained by concatenating features_pos and features_neg into one array. The labels were created by filling an array with ones for the positive features and -1 for the negative features. The lambda used was the recommended 0.0001. I tried 0.001 and 0.00001, but they did not seem to perform as well as 0.0001.
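
As a sketch, the training call looks roughly like this, assuming features_pos and features_neg are laid out one example per row as built above (vl_svmtrain expects one example per column, hence the transpose):

% Sketch: stack the features, build +/-1 labels, and train.
X = [features_pos; features_neg]';      % D x N, one column per example
Y = [ ones(size(features_pos, 1), 1);
     -ones(size(features_neg, 1), 1)];  % +1 = face, -1 = non-face
lambda = 0.0001;
[w, b] = vl_svmtrain(X, Y, lambda);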

4. Run detector

This function runs the detector on all of the images in a given path. I began by keeping the scale fixed at 1 each time a test image was converted to HoG feature space; vl_hog(img, feature_params.hog_cell_size*currentScale) was used to initially get the values into HoG. I then iterated over the dimensions of the HoG array (minus the template size in cells, so as not to exceed the matrix dimensions). After extracting the HoG values at the indices in the loops, we checked whether:

transpose(w)*transpose(hogReShaped)+b

was larger than the threshold, which was set above to -1. If the value was greater than -1, I recorded that window and updated cur_bboxes and cur_confidences for detection i.
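
A sketch of that inner check is below; test_hog, cells_per_template, and threshold are my own names for the HoG array of the current image, the template size in cells, and the detection threshold:

% Sketch: slide a template-sized window over the HoG array and score it.
% test_hog, cells_per_template, and threshold are assumed names.
cells_per_template = feature_params.template_size / feature_params.hog_cell_size;
threshold = -1;

for r = 1:(size(test_hog, 1) - cells_per_template + 1)
    for c = 1:(size(test_hog, 2) - cells_per_template + 1)
        window = test_hog(r:r+cells_per_template-1, c:c+cells_per_template-1, :);
        hogReShaped = reshape(window, 1, []);   % 1 x D row vector
        score = hogReShaped * w + b;            % equals transpose(w)*transpose(hogReShaped)+b
        if score > threshold
            % record this window: convert (r, c) back to pixel coordinates,
            % then append to cur_bboxes and cur_confidences
        end
    end
end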

Below are some of the results in a table.

Results in a table