Project 2: Local Feature Matching

Key points in two images of Mount Rushmore

The goal of this project was to detect key points in an image using the Harris corner detector and to implement a SIFT pipeline for feature matching between two images. The general technique for matching points between two images, as described in the lecture 8 slides, is as follows:

  1. Find a set of distinctive key points
  2. Define a region around each key point
  3. Compute a local descriptor from the normalized region
  4. Match local descriptors

Finding Distinctive Points

To gather the interest points of the images, I implemented the Harris detector, which scores points based on the eigenvalues of the auto-correlation matrix built from products of the image gradients.

Figure 4.8 in Szeliski

The horizontal and vertical gradients of the second image of the Notre Dame Cathedral

To find the points where both eigenvalues are large, I took the x and y gradients using MATLAB's imgradientxy function and computed the equivalent of the determinant minus alpha times the trace squared, using the function given in the lecture slides:


h = (g_ix2.*g_iy2) - (g_ixiy.^2) - (alpha.*(g_ix2 + g_iy2).^2);

where alpha is assigned some value between 0.04 and 0.06. I then thresholded the image and performed non-maximum suppression using the colfilt function to produce an image where the interest points are the only non-zero entries.
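The Harris step above can be sketched in Python with NumPy/SciPy. This is a re-sketch, not the original code: the MATLAB version used imgradientxy and colfilt, while here Sobel gradients, the Gaussian window size, and the threshold fraction are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def harris_response(img, alpha=0.05, sigma=1.0):
    """Harris corner response: det(M) - alpha * trace(M)^2."""
    # Image gradients (the MATLAB code uses imgradientxy instead of Sobel).
    ix = ndimage.sobel(img, axis=1, mode='reflect')
    iy = ndimage.sobel(img, axis=0, mode='reflect')
    # Gaussian-weighted sums of gradient products: the entries of the
    # auto-correlation matrix M at each pixel.
    ix2 = ndimage.gaussian_filter(ix * ix, sigma)
    iy2 = ndimage.gaussian_filter(iy * iy, sigma)
    ixiy = ndimage.gaussian_filter(ix * iy, sigma)
    return ix2 * iy2 - ixiy ** 2 - alpha * (ix2 + iy2) ** 2

def interest_points(img, threshold=0.1, nms_size=7):
    """Threshold the response, then keep only local maxima (non-maximum
    suppression), mirroring the colfilt-based suppression in MATLAB."""
    h = harris_response(img)
    h[h < threshold * h.max()] = 0
    # A pixel survives only if it equals the maximum in its neighbourhood.
    local_max = ndimage.maximum_filter(h, size=nms_size)
    ys, xs = np.nonzero((h == local_max) & (h > 0))
    return xs, ys
```

On a synthetic image containing a bright square, the surviving points cluster at the four corners: edges score negative (alpha times the trace squared dominates), so only corner responses pass the threshold.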

For the second image of the Notre Dame cathedral: (on the left) the value of h, (on the right) thresholded and non-maximum suppressed.

Defining Features of Points

Figure 4.18 in Szeliski

Using the SIFT pipeline as described in Szeliski, I first computed the gradient magnitude and direction of the entire image. Then, for each interest point, I took a patch of the specified feature width, divided it into a 4x4 grid of cells, and added each pixel's gradient magnitude to the 8-bin orientation histogram of its cell. To down-weight magnitudes farther from the key point, I blurred the patch with a Gaussian filter, and after assigning each magnitude, filtered the resulting feature with a Gaussian again.

% map a gradient direction in [-180, 180] degrees to one of the
% 8 orientation bins (1..8)
dir_idx = mod(floor(direction/45), 8) + 1;
cell_x = floor(4*(j-1)/feature_width)+1;
cell_y = floor(4*(k-1)/feature_width)+1;

histograms(cell_x, cell_y, dir_idx)...
	= histograms(cell_x, cell_y, dir_idx) + magnitude;
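The histogram accumulation can be sketched end-to-end in Python. The Gaussian weighting steps are omitted for brevity, and the final normalization is my own assumption (standard in SIFT, but not stated above); the cell and bin indexing mirrors the MATLAB snippet.

```python
import numpy as np

def sift_histograms(magnitude, direction, feature_width=16):
    """Accumulate gradient magnitudes into a 4x4 grid of 8-bin
    orientation histograms for one feature_width x feature_width patch.

    `direction` is in degrees in [-180, 180); the result is flattened
    into the usual 4*4*8 = 128-dimensional descriptor."""
    histograms = np.zeros((4, 4, 8))
    for j in range(feature_width):        # patch row
        for k in range(feature_width):    # patch column
            cell_x = 4 * j // feature_width
            cell_y = 4 * k // feature_width
            # 45-degree orientation bins
            dir_idx = int(np.floor(direction[j, k] / 45)) % 8
            histograms[cell_x, cell_y, dir_idx] += magnitude[j, k]
    descriptor = histograms.ravel()
    # Normalize so descriptors are comparable across patches
    # (an assumption; not shown in the MATLAB snippet above).
    norm = np.linalg.norm(descriptor)
    return descriptor / norm if norm > 0 else descriptor
```

For a patch whose gradients all point the same way, exactly one bin per cell is populated, giving 16 non-zero entries in the 128-dimensional descriptor.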

Matching Features

Finally, to match the features, I found the nearest neighbor to each feature based on its 128 attributes. I used the pdist2 function in MATLAB, sorted the distance matrix, and measured the confidence of each match as the distance to the second-closest neighbor over the distance to the closest neighbor (the inverse of the nearest-neighbor distance ratio).


% distance between every pair of descriptors
dist = pdist2(features1, features2);
% sort each column: row 1 holds the nearest features1 distance
% for each features2 descriptor
[sorted, idx] = sort(dist);

for i = 1:size(features2,1)
    % index into features1
    matches(i,1) = idx(1,i);
    % index into features2
    matches(i,2) = i;
    % inverse ratio test: second-closest distance over closest
    confidences(i) = 1/(sorted(1,i)/sorted(2,i));
end
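The same matching logic can be sketched in Python, with scipy's cdist standing in for MATLAB's pdist2; the epsilon guard against zero distances is my own addition, and variable names are illustrative.

```python
import numpy as np
from scipy.spatial.distance import cdist

def match_features(features1, features2):
    """Match each descriptor in features2 to its nearest neighbour in
    features1, with confidence = d2 / d1 (inverse ratio test)."""
    dist = cdist(features1, features2)   # rows: features1, cols: features2
    order = np.argsort(dist, axis=0)     # sort each column, as in MATLAB's sort
    cols = np.arange(dist.shape[1])
    d1 = dist[order[0], cols]            # closest distance per features2 descriptor
    d2 = dist[order[1], cols]            # second-closest distance
    matches = np.column_stack([order[0], cols])
    confidences = d2 / np.maximum(d1, 1e-12)  # guard against zero distance
    return matches, confidences
```

A match is more trustworthy when its nearest neighbour is much closer than the runner-up, so sorting by these confidences in descending order gives the "most confident matches" used in the accuracy figures below.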

With the two images of Notre Dame, I was able to reach 84% accuracy on the 100 most confident matches, and 73% for Mount Rushmore after tuning the threshold. The Episcopal Palace pair unfortunately yielded 0%, which leads me to believe that another method is needed to accommodate the scale difference between those two images.