Project 2: Local Feature Matching

Results obtained on the Notre Dame image pair

Local feature detection and matching is important for many computer vision applications, such as 3D reconstruction, robot navigation and panorama stitching. The purpose of local invariant features is to provide a representation that allows local structures to be matched efficiently between images. The feature matching pipeline is as follows:

  1. Detection: Identify the interest points.
  2. Description: Extract a feature descriptor vector from the region surrounding each interest point.
  3. Matching: Determine correspondences between descriptors in two views.

Project Implementation

  1. Keypoint Detection: Implemented the Harris corner detector. Selected points whose surrounding windows give a large corner response.
  2. Generate Feature Descriptor: Implemented a SIFT-like local feature algorithm. Took a 16x16 window around each detected corner, divided it into a 4x4 grid of cells (4x4 pixels each), and generated an 8-bin histogram of gradient orientations for each cell. This gives 128 (8x4x4) values for each feature.
  3. Feature Matching: Computed the pairwise Euclidean distance between the features of both images and performed the nearest neighbor distance ratio test to find the 100 most confident matches.
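
The matching step can be sketched as follows. This is a minimal, dependency-free illustration rather than the project code: the function name match_features comes from the report, but its signature, the max_matches parameter and the tuple layout of the result are assumptions.

```python
import math

def match_features(features1, features2, max_matches=100):
    """Match descriptors with the nearest neighbor distance ratio test.

    Returns (index1, index2, ratio) tuples, most confident (lowest ratio) first.
    The signature and return layout are assumptions made for this sketch.
    """
    matches = []
    for i, f1 in enumerate(features1):
        # Euclidean distance from f1 to every descriptor of the second image.
        dists = sorted((math.dist(f1, f2), j) for j, f2 in enumerate(features2))
        if len(dists) < 2:
            continue  # the ratio needs a second-nearest neighbor
        (d1, j1), (d2, _) = dists[0], dists[1]
        # A small ratio means the best match is much closer than the runner-up.
        ratio = d1 / d2 if d2 > 0 else 1.0
        matches.append((i, j1, ratio))
    matches.sort(key=lambda m: m[2])
    return matches[:max_matches]
```

Sorting by the ratio rather than by the raw distance favors matches that are unambiguous, which is what "most confident" means here.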

Local features are detected and matched between a pair of images taken from different views.

Notre Dame: Accuracy = 83%

Mount Rushmore: Accuracy = 88%

Extra Credits

Weighted Bins

Weighted each histogram bin contribution by the gradient magnitude while generating feature descriptors.
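
A minimal sketch of a magnitude-weighted histogram for a single cell, assuming the x and y gradients are already available as 2D lists. The function name cell_histogram and its arguments are hypothetical and not taken from the project code.

```python
import math

def cell_histogram(dx, dy, num_bins=8):
    """Magnitude-weighted orientation histogram for one cell (hypothetical helper).

    dx, dy: equal-sized 2D lists holding the x and y gradients of the cell.
    """
    hist = [0.0] * num_bins
    bin_width = 2 * math.pi / num_bins
    for row_dx, row_dy in zip(dx, dy):
        for gx, gy in zip(row_dx, row_dy):
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2 * math.pi)  # map to [0, 2*pi)
            b = int(angle / bin_width) % num_bins
            hist[b] += magnitude  # weighted vote instead of a plain count
    return hist
```

With plain counts every pixel votes equally. With magnitude weighting, strong edges dominate the descriptor and near-flat pixels contribute almost nothing.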

Feature Normalization

Each generated feature is normalized before being passed to the match_features function.
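
The report does not state which normalization was used. The sketch below assumes scaling each descriptor to unit Euclidean length, a common choice for SIFT-like features. The function name normalize_feature is hypothetical.

```python
import math

def normalize_feature(feature, eps=1e-12):
    """Scale a descriptor to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(v * v for v in feature))
    if norm < eps:
        return list(feature)  # leave an all-zero descriptor unchanged
    return [v / norm for v in feature]
```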

Parameter Tuning

Gaussian Filter

  1. Used a Gaussian filter of size=13 and sigma=3 when computing interest points, which gave the best results.
  2. Before computing the gradient, blurred the image with a Gaussian filter of size=3 and sigma=1.
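
For reference, a filter with a given size and sigma can be built as in the sketch below. It assumes a square kernel centered on the middle pixel and normalized to sum to 1. The function name gaussian_kernel is hypothetical and is not the routine used in the project.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel whose entries sum to 1."""
    half = (size - 1) / 2.0  # coordinate of the kernel center
    kernel = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]

# The two settings mentioned above.
interest_point_filter = gaussian_kernel(13, 3)
pre_gradient_filter = gaussian_kernel(3, 1)
```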

Binning

Gradients whose orientation fell very close to a bin boundary were assigned to the next bin, which gave better results for Notre Dame.
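
One possible reading of this rule is sketched below. It assumes that "very close" means within a small tolerance below the upper boundary of a bin. The function name orientation_bin and the tolerance value are hypothetical, since the report does not give them.

```python
import math

def orientation_bin(angle, num_bins=8, boundary_tol=0.01):
    """Bin index for an angle in radians.

    Angles within boundary_tol radians below the next bin boundary are
    moved up to the next bin (the tolerance value is an assumption).
    """
    bin_width = 2 * math.pi / num_bins
    angle = angle % (2 * math.pi)
    b = int(angle / bin_width)
    if (b + 1) * bin_width - angle < boundary_tol:
        b += 1  # push borderline gradients into the next bin
    return b % num_bins  # the last bin wraps around to bin 0
```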

Non-Maximum Suppression

Used colfilt with size=2, which gave the best results.
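
The idea behind this step is to keep a pixel only if it is the maximum of its local neighborhood. The sketch below illustrates it on a 2D list of corner responses. It is a generic sliding-window version, not a reproduction of the colfilt call, and the function name and radius parameter are hypothetical.

```python
def non_max_suppression(response, radius=1):
    """Return (row, col) of pixels that are the maximum of their neighborhood."""
    rows, cols = len(response), len(response[0])
    keep = []
    for r in range(rows):
        for c in range(cols):
            value = response[r][c]
            if value <= 0:
                continue  # skip pixels already removed by thresholding
            # Collect the window around (r, c), clipped at the image border.
            neighborhood = [
                response[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            if value >= max(neighborhood):
                keep.append((r, c))
    return keep
```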

Other Parameters

Set the Harris detector's alpha to 0.04 for Notre Dame and 0.05 for Mount Rushmore, and the threshold to 0.05 and 0.025 respectively, to get high accuracy.
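
For context, alpha enters the standard Harris corner response as shown in the sketch below, which evaluates the response at a single pixel from the smoothed gradient products. The function and argument names are hypothetical. How the threshold is applied to the response (absolute or relative to the maximum) is not stated in this report.

```python
def harris_response(ixx, iyy, ixy, alpha=0.04):
    """Harris corner response at one pixel.

    ixx, iyy, ixy: Gaussian-weighted sums of Ix*Ix, Iy*Iy and Ix*Iy
    (argument names are assumptions made for this sketch).
    """
    det = ixx * iyy - ixy * ixy   # product of the two eigenvalues
    trace = ixx + iyy             # sum of the two eigenvalues
    return det - alpha * trace * trace

corner = harris_response(1.0, 1.0, 0.0)  # gradients in both directions
edge = harris_response(1.0, 0.0, 0.0)    # gradient in one direction only
flat = harris_response(0.0, 0.0, 0.0)    # no gradient
```

The response is positive for the corner, negative for the edge and zero for the flat region. A larger alpha penalizes edge-like structures more strongly.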

PCA

Created a low-dimensional feature vector to speed up the match_features function. Obtained a 1.5x speedup with an accuracy loss of 10% for Notre Dame and 8% for Mount Rushmore.

Used the extra_data images (91 in total) to compute a PCA basis from SIFT features obtained using the VLFeat library. Selected the first 32 principal components of the PCA basis to project each 128-dimensional feature vector to a 32-dimensional space.
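
The projection can be written as follows. The symbols are introduced here for illustration, and the mean-centering step is an assumption, since the report does not say whether the features were centered before projection.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
C = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \mu)(x_i - \mu)^{\top} = W \Lambda W^{\top}, \qquad
z = W_{32}^{\top}\,(x - \mu)
```

Here the x_i are the N training SIFT descriptors in R^128, the columns of W are the eigenvectors of C ordered by decreasing eigenvalue, W_32 is the 128x32 matrix of the first 32 columns, and z is the 32-dimensional feature used for matching. Each pairwise distance then sums 32 squared differences instead of 128.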

Adaptive Non-maximal suppression (ANMS)

Tried an ANMS implementation from an open-source repository to get a spatially diverse set of interest points. Observed a significant drop in accuracy. The reason might be the need for additional parameter tuning.
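
For readers unfamiliar with ANMS, the general idea is sketched below. This is a generic brute-force version, not the open-source implementation that was tried. The function name, the (x, y, response) point layout and the c_robust value are assumptions.

```python
import math

def anms(points, num_keep, c_robust=0.9):
    """Adaptive non-maximal suppression (generic sketch).

    points: list of (x, y, response) tuples. Returns the num_keep points with
    the largest suppression radii, i.e. strong and spatially spread out.
    """
    radii = []
    for i, (xi, yi, ri) in enumerate(points):
        radius = math.inf  # the global maximum is never suppressed
        for j, (xj, yj, rj) in enumerate(points):
            # Only significantly stronger points can suppress point i.
            if j != i and ri < c_robust * rj:
                radius = min(radius, math.hypot(xi - xj, yi - yj))
        radii.append((radius, i))
    radii.sort(key=lambda t: -t[0])  # largest radius first
    return [points[i] for _, i in radii[:num_keep]]
```

A weak point that is far from any stronger point can outrank a strong point that sits right next to an even stronger one, which is what spreads the selection across the image.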

Conclusion

Implemented a SIFT-like pipeline for local feature matching in this project. Tuned parameters to achieve an accuracy of 83% for the Notre Dame image pair and 88% for Mount Rushmore. Used PCA for dimensionality reduction, which gave a 1.5x speedup in the match_features function for both image pairs, with an accuracy loss of 10% for Notre Dame and 8% for Mount Rushmore. Also tried Adaptive Non-maximal Suppression, but it reduced accuracy.