The aim here is to create a local feature matching pipeline, intended to work for instance-level matching of multiple views of the same physical scene. It involves the following three steps:
The method 'get_interest_points' computes the gradients of the input image along the x and y directions in order to form the second moment matrix M. For a point to be a corner, both eigenvalues of M must be large. We therefore compute the Harris corner response R = det(M) - alpha*trace(M)^2, using alpha=0.06, and threshold the response at each pixel, since corners (good interest points) have large response values, i.e. high cornerness.
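The response computation above can be sketched in NumPy as follows (the project code itself is in MATLAB; function names and the Gaussian window size here are illustrative assumptions):

```python
import numpy as np
from scipy.ndimage import sobel, gaussian_filter

def harris_response(image, alpha=0.06, sigma=1.0):
    """Harris corner response R = det(M) - alpha * trace(M)^2 (illustrative sketch)."""
    Ix = sobel(image, axis=1)               # gradient along x
    Iy = sobel(image, axis=0)               # gradient along y
    # Entries of the second moment matrix M, smoothed with a Gaussian window
    Ix2 = gaussian_filter(Ix * Ix, sigma)
    Iy2 = gaussian_filter(Iy * Iy, sigma)
    Ixy = gaussian_filter(Ix * Iy, sigma)
    det_M = Ix2 * Iy2 - Ixy ** 2
    trace_M = Ix2 + Iy2
    return det_M - alpha * trace_M ** 2
```

Because M is positive semi-definite, det(M) is only large when both eigenvalues are large, which is exactly the corner condition stated above.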
Since we want to keep only the local maxima among the interest points, we perform non-maximum suppression by sliding a 16x16 window over the entire image, keeping the maximum value within the window and setting all other values to zero.
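A minimal NumPy sketch of this sliding-window non-maximum suppression (an illustrative equivalent of the MATLAB step, using a maximum filter to emulate the sliding 16x16 window):

```python
import numpy as np
from scipy.ndimage import maximum_filter

def non_max_suppression(response, window=16):
    """Keep a response value only where it equals the local window maximum."""
    local_max = maximum_filter(response, size=window)
    # Pixels that are not their neighbourhood's maximum are zeroed out
    return np.where(response == local_max, response, 0.0)
```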
The second step is 'get_features'. The simplest, although not very accurate, method of generating the features is to take a normalised patch around each interest point. Here I took a normalised 16x16 patch for the initial implementation of get_features. The final implementation of SIFT-like local feature descriptors computes the gradient at each pixel in a 16x16 window around the detected interest point; for each 4x4 cell, a gradient orientation histogram is formed by adding the gradient magnitude to one of eight orientation bins. In a step-wise implementation, I first used a simple count-based binning, counting the number of gradients falling in each bin; I then weighted each bin by the magnitudes of the 16x16 gradients. A flattened feature of length 4x4x8 = 128 is thus obtained for each interest point.
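The descriptor construction can be sketched as follows (a NumPy illustration of the MATLAB step; it assumes gradient directions in degrees in [-180, 180), as MATLAB's imgradient returns):

```python
import numpy as np

def sift_like_descriptor(gmag, gdir, y, x, feature_width=16):
    """4x4x8 = 128-dim magnitude-weighted orientation histogram (illustrative sketch)."""
    half = feature_width // 2
    patch_mag = gmag[y - half:y + half, x - half:x + half]
    patch_dir = gdir[y - half:y + half, x - half:x + half]
    # Quantise each direction into one of eight 45-degree orientation bins
    bins = ((patch_dir + 180.0) // 45.0).astype(int) % 8
    cell = feature_width // 4
    hist = np.zeros((4, 4, 8))
    for r in range(4):
        for c in range(4):
            cell_bins = bins[r*cell:(r+1)*cell, c*cell:(c+1)*cell]
            cell_mags = patch_mag[r*cell:(r+1)*cell, c*cell:(c+1)*cell]
            for b in range(8):
                # Add the gradient magnitudes falling into bin b of this cell
                hist[r, c, b] = cell_mags[cell_bins == b].sum()
    feat = hist.ravel()
    n = np.linalg.norm(feat)
    return feat / n if n > 0 else feat
```

Replacing `cell_mags` with 1 in the sum recovers the earlier count-based binning variant.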
The final step is 'match_features'. We must find the best-matching features between the sets of features obtained from the two (or more) input images. The method computes the Euclidean distance between each pair of features using a double for loop, MATLAB's pdist2 method, as well as k-nearest neighbours, for comparison. Then, for each feature of image1, the ratio of distances to its two closest features in image2 is computed (NNDR = nearest neighbor distance ratio). Confidence for each match is defined as the inverse of the NNDR. We keep the 100 most confident matches (those with the smallest NNDR values).
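The NNDR matching logic can be sketched as follows (an illustrative NumPy version of the MATLAB step; the small epsilon terms are assumptions to guard against division by zero):

```python
import numpy as np

def match_by_nndr(f1, f2, top_k=100):
    """Match features by nearest-neighbor distance ratio; confidence = 1/NNDR."""
    # Pairwise Euclidean distances, shape (n1, n2)
    d = np.linalg.norm(f1[:, None, :] - f2[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    rows = np.arange(len(f1))
    d1 = d[rows, order[:, 0]]            # distance to nearest neighbour
    d2 = d[rows, order[:, 1]]            # distance to second-nearest neighbour
    nndr = d1 / (d2 + 1e-12)
    conf = 1.0 / (nndr + 1e-12)
    keep = np.argsort(nndr)[:top_k]      # most confident = smallest NNDR
    matches = np.stack([keep, order[keep, 0]], axis=1)
    return matches, conf[keep]
```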
%example code- get_interest_points
for i=feature_width/2:ymax-feature_width/2
    for j=feature_width/2:xmax-feature_width/2
        % Harris response R = det(M) - alpha*trace(M)^2, alpha = 0.06
        response(i,j)=(Ix2(i,j)*Iy2(i,j)-Ixy(i,j)^2)-0.06*(Ix2(i,j)+Iy2(i,j))^2;
    end
end
%example code- get_features
[Gmag,Gdir] = imgradient(image);   % gradient magnitude and direction at every pixel
% Bounds of the feature_width x feature_width patch around interest point k
xmin=floor(x(k)-(feature_width/2));
xmax=floor(x(k)+feature_width/2-1);
ymin=floor(y(k)-(feature_width/2));
ymax=floor(y(k)+feature_width/2-1);
temp_gmag=Gmag(ymin:ymax,xmin:xmax);   % patch of gradient magnitudes
temp_gdir=Gdir(ymin:ymax,xmin:xmax);   % patch of gradient directions
%example code- match_features
[Sorted_Distance,Indices]=sort(Distance);   % sort distances to image2 features, per image1 feature
corresp_i=Indices(1:2,:);                   % indices of the two nearest neighbours
matches_all(1,:)=1:size(features1,1);       % each feature of image1 ...
matches_all(2,:)=corresp_i(1,:);            % ... paired with its nearest neighbour in image2
Sacre Coeur: Accuracy = 100% (approximately)
Sleeping Beauty Castle Paris: Accuracy = 92% (approximately)
Notre Dame: Accuracy = 81%
Rushmore: Accuracy = 78%
Accuracy using normalized patches: 57%
Accuracy using SIFT-like features: 77%
Accuracy using SIFT-like features with normalizing and clamping: 81%
Result with normalized patches
Result with SIFT-like features with normalizing and clamping
Threshold = 0.0075: Accuracy = 80%
Threshold = 0.0009: Accuracy = 80%
Threshold = 0.001:  Accuracy = 81%
Threshold = 0.0011: Accuracy = 79%
The aim here is to reduce the feature descriptor to a lower-dimensional one, in order to cut the running time of match_features(). The following steps are required:
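A minimal sketch of PCA-based dimensionality reduction (an illustrative NumPy version; the target dimension of 32 is an assumption, not the report's chosen value):

```python
import numpy as np

def pca_reduce(features, n_components=32):
    """Project descriptors onto the top principal components (illustrative sketch)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # SVD of the centered data; rows of vt are the principal directions
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    return centered @ basis.T, basis, mean
```

The same basis and mean must be applied to both images' features so that distances in the reduced space remain comparable.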
Result with dimension reduction using PCA
The aim here is to use a space-partitioning data structure (k-d tree) to approximate the nearest neighbour of each feature and thereby accelerate matching. The following steps are included:
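A k-d tree replaces the exhaustive pairwise distance computation with logarithmic-time neighbour queries; a minimal sketch using SciPy's cKDTree (an illustrative stand-in for the report's MATLAB implementation):

```python
import numpy as np
from scipy.spatial import cKDTree

def match_with_kdtree(f1, f2, top_k=100):
    """NNDR matching via a k-d tree built over image2's features (illustrative sketch)."""
    tree = cKDTree(f2)
    # Query the two nearest neighbours of every image1 feature at once
    dists, idx = tree.query(f1, k=2)
    nndr = dists[:, 0] / (dists[:, 1] + 1e-12)
    keep = np.argsort(nndr)[:top_k]      # smallest NNDR = most confident
    return np.stack([keep, idx[keep, 0]], axis=1)
```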
Runtime variation