Project 2: Local Feature Matching

This report mainly contains 3 parts :

  1. Feature detection
  2. Feature description
  3. Feature matching

Each of the above steps have multiple methods of implementation. Report contains observations / notes from those variants tried out.

Feature detection

Feature detection has been performed using Harris Corner Detection. Harris corner detection is based on interpreting the moment matrix consisting of gradient values of the image with multiple variants in terms of thresholding and evaluation criteria. The implementation uses determinant and trace value of the moment matrix to achieve

  1. Rotation invariance
  2. Differentiate edges from features

Code snippet

Harris corner detection using single threshold


% Corner response
R = zeros(size(image));
k = 0.04;
for row = 1:size(image,1)
    for col = 1:size(image,2)
        pixel_moment_mat = [Ix2(row,col) Ixy(row,col); Ixy(row,col) Iy2(row,col)];
        R(row,col) = det(pixel_moment_mat) - k * trace(pixel_moment_mat) * trace(pixel_moment_mat);
    end
end

% thresholding
% threshold = (mean2(R) + std2(R)*.1) * 2;
threshold = 0.00001;
R(R < threshold) = 0;
R([1:(feature_width-1), end-feature_width:end], :) = 0;
R(:,[1:(feature_width-1),end-feature_width:end]) = 0;
R_maximas = colfilt(R,[feature_width/4, feature_width/4],'sliding',@max);
non_suppressed_corners = R.*(R == R_maximas);
[y,x] = find(non_suppressed_corners);

Results

Image Threshold Accuracy
Notre Dame 0.00001 91
Mount Rushmore 0.0002 90
Episcopal Gaudi - Max - 30%

Image observations

Mount Rushmore corner detection

Notre Dame corner detection

Bells and whistles - Adaptive Non maximum suppression

The problem with using a single threshold is

Other issues - The objective is to detect the most stable corners to the extent possible. Instead of setting a hard threshold of a single value, we should be able to detect corners based on other corner response relatively. Rather than using a strict maximum in the neighbourhood region, we try to suppress corners in an adaptive fashion. This is called Adaptive non maximum suppression.

The technique followed here tries to detect those pixels as corners which are not only the local maxima but also 1.1 times the corner response of neighbouring pixels in a radius of r. The radius r is inferred from the corner response matrix by sorting the corner responses in descending order. [1]Radius r is taken as the maximum radius in which the current point is still going to be considered a corner i.e distance between current point and next highest corner response point. An alternate adaptive approach [2] suggests 2 step thresholding. The first step discards all interest points lesser than a lower limit. This way extremely weak points are taken out of consideration. First step also directly includes interest points above the upper threshold as a definitive corner. This way computation can be reduced for the number of points for which we calculate the radius and adaptively approximate a good working threshold.

Spatially well distributed features detected using Adaptive Non Maximum Suppression

Feature description

`Feature description has been done using 128 dimension SIFT vectors. The standard SIFT implementation has been followed. Weighting of the histogram, detecting features at various scales and rotation invariance consideration could have helped improve the local feature matching of the episcopal gaudi image. Normalized patches of images too work as features but not as good as SIFT features. In SIFT , we consider the gradient (magnitude and direction) in each 16*16 patch of the image. The 16*16 is further divided into 4x4 quadrants. The gradient direction in the quadrant contributes to the histogram of 8 bins (0-360 degrees -45 degree intervals). The 16 * 8 histogram bins put together form the 128 dimension SIFT feature vector. All features are normalized, clamped to upper limit 0.2 and re-normalized.

Code snippet

Harris corner detection using single threshold


% Corner response
R = zeros(size(image));
k = 0.04;
for row = 1:size(image,1)
    for col = 1:size(image,2)
        pixel_moment_mat = [Ix2(row,col) Ixy(row,col); Ixy(row,col) Iy2(row,col)];
        R(row,col) = det(pixel_moment_mat) - k * trace(pixel_moment_mat) * trace(pixel_moment_mat);
    end
end

% thresholding
% threshold = (mean2(R) + std2(R)*.1) * 2;
threshold = 0.00001;
R(R < threshold) = 0;
R([1:(feature_width-1), end-feature_width:end], :) = 0;
R(:,[1:(feature_width-1),end-feature_width:end]) = 0;
R_maximas = colfilt(R,[feature_width/4, feature_width/4],'sliding',@max);
non_suppressed_corners = R.*(R == R_maximas);
[y,x] = find(non_suppressed_corners);

Feature detection

Visualizing matching of features

Percentage matched features is 91% for Notre Dame

Feature matching

Feature matching is done using Nearest Neighbour Distance Ratio as the criteria. The 2 nearest neighbours of each feature are calculated in terms of euclidean distance. The ratio between the distances is taken. The ratio is considered against a threshold and relatively a confidence measure is attached to each feature. Since the nearest neighbour distance ratio is indicative of negative confidence, the confidence value is 1-ratio. Matching of neighbours has a high running time due to the search for the first 2 nearest neighbours. While the speed up can be achieved using matlab built-in functions like knnsearch (commented out in code), it reduces accuracy. Hence it is better to stick to plain vanilla implementation of nearest neighbours while continuing to use euclidean distances. The threshold for the ratio was inferred from trial and error.

References

  1. David LOWE SIFT
  2. Brown and Szeliski : Adaptive Non Maximum Suppression
  3. Gioacchino Vino and Angel D. Sappa : Revisiting Harris Corner Detector Algorithm: A Gradual Thresholding Approach