Project 2: Local Feature Matching

The objective of this project is to create a local feature matching algorithm to find matching local features in a pair of images. The baseline of this project follows a simplified version of the SIFT pipeline, which consists of three steps:

  • Interest point detection
  • Local feature descriptor - Build a SIFT descriptor for the interest points found.
  • Feature matching to find suitable matched pairs of features.

    Interest Point Detection

    The first step in the local feature matching algorithm is to find distinctive keypoints in the image. The keypoint detector used here is the Harris corner detector, a widely used corner detection algorithm. It is implemented as follows:

  • Compute the image derivatives in the x and y directions to obtain Ix and Iy
  • Compute the squares of the derivatives and convolve them with a Gaussian filter
  • Compute the corner response using the Harris formula:

        R = g(Ix^2) * g(Iy^2) - [g(Ix * Iy)]^2 - alpha * [g(Ix^2) + g(Iy^2)]^2

    where g(.) denotes Gaussian smoothing and alpha is a small empirical constant (commonly 0.04-0.06).
  • Zero out pixels close to the image border
  • Perform non-maximum suppression by using a 3x3 sliding window to eliminate non-maximal points inside the window.
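The steps above can be sketched with NumPy and SciPy as follows. This is a minimal illustration, not the project's exact implementation: the smoothing sigma, border width, alpha, and response threshold are all assumed values.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel, maximum_filter

def harris_corners(image, alpha=0.06, sigma=1.0, border=16, threshold=0.01):
    """Sketch of the Harris corner detection steps described above."""
    # Image derivatives in the x and y directions
    Ix = sobel(image, axis=1)
    Iy = sobel(image, axis=0)
    # Squared derivatives, smoothed with a Gaussian filter
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    # Harris corner response
    R = Sxx * Syy - Sxy**2 - alpha * (Sxx + Syy)**2
    # Zero out responses close to the image border
    R[:border] = R[-border:] = 0
    R[:, :border] = R[:, -border:] = 0
    # Non-maximum suppression with a 3x3 sliding window
    local_max = (R == maximum_filter(R, size=3))
    ys, xs = np.nonzero(local_max & (R > threshold * R.max()))
    return xs, ys
```

On a synthetic image containing a bright square, the detector fires near the square's corners (edges are suppressed because their response is negative).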

    Keypoints detected in the Notre Dame image pair using the Harris corner detector.

    Local Feature Descriptor

    Here, a SIFT-like local feature descriptor, as described by David Lowe, is used. After keypoint detection, we build a local feature around each keypoint that stores information about the gradient magnitudes and orientations. The algorithm used is as follows:

  • Extract a 16x16 image patch (or one of the desired feature width) around the keypoint.
  • Compute the gradient magnitude and direction of the window.
  • Use a Gaussian fall-off to weight the magnitudes of the patch.
  • Divide the window into blocks of size 4x4 and compute the dominant direction. In each block, or cell, assign every pixel to one of 8 bins according to its direction relative to the dominant direction. The resulting feature vector therefore has 4x4x8 = 128 dimensions.
  • Normalize the feature vector.
  • As described in the paper, clamp any value greater than 0.2 to 0.2. Normalize the feature vector once more.
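A minimal sketch of the descriptor steps is shown below. For brevity it bins absolute gradient orientations rather than orientations relative to the dominant direction, and the Gaussian fall-off width is an assumption.

```python
import numpy as np

def sift_descriptor(patch):
    """Build a 128-D SIFT-like descriptor from a 16x16 patch around a keypoint."""
    size = patch.shape[0]
    # Gradient magnitude and orientation of the patch
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx)  # in [-pi, pi]
    # Gaussian fall-off centered on the patch to weight the magnitudes
    yy, xx = np.mgrid[:size, :size] - (size - 1) / 2
    mag = mag * np.exp(-(xx**2 + yy**2) / (2 * (size / 2) ** 2))
    # 4x4 cells with 8 orientation bins each -> 4*4*8 = 128 dimensions
    desc = np.zeros((4, 4, 8))
    bins = ((ori + np.pi) / (2 * np.pi) * 8).astype(int) % 8
    cell = size // 4
    for i in range(size):
        for j in range(size):
            desc[i // cell, j // cell, bins[i, j]] += mag[i, j]
    desc = desc.ravel()
    # Normalize, clamp values above 0.2 (as in Lowe's paper), normalize again
    desc /= np.linalg.norm(desc) + 1e-12
    desc = np.minimum(desc, 0.2)
    desc /= np.linalg.norm(desc) + 1e-12
    return desc
```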

    Feature Matching

    Feature matching was performed using the ratio test. Pairwise distances between the feature vectors of the two images were computed. For each feature, the two smallest distances (the two nearest neighbors) are selected and their ratio is computed. If the ratio is below a threshold, the pair is considered a match; otherwise there is no match. Confidence is the inverse of the ratio, and the matches are sorted in descending order of their confidences.
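The ratio test can be sketched as follows; the 0.8 threshold is an assumed value, not necessarily the one used in this project.

```python
import numpy as np
from scipy.spatial.distance import cdist

def match_features(feats1, feats2, ratio_threshold=0.8):
    """Nearest-neighbor distance ratio test, as described above."""
    # Pairwise distances between every descriptor in image 1 and image 2
    dists = cdist(feats1, feats2)
    # Two smallest distances (two nearest neighbors) for each feature
    nn = np.argsort(dists, axis=1)[:, :2]
    rows = np.arange(len(feats1))
    d1 = dists[rows, nn[:, 0]]
    d2 = dists[rows, nn[:, 1]]
    ratios = d1 / (d2 + 1e-12)
    # Keep a match only when the ratio is below the threshold
    keep = ratios < ratio_threshold
    matches = np.column_stack([np.nonzero(keep)[0], nn[keep, 0]])
    # Confidence is the inverse ratio; sort by descending confidence
    confidences = 1.0 / (ratios[keep] + 1e-12)
    order = np.argsort(-confidences)
    return matches[order], confidences[order]
```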

    Results

    Local feature matching was performed on two images of the Notre Dame cathedral.


    Results of feature matching on the Notre Dame pair.

    Evaluation of the 100 most confident matches. Accuracy = 97%

    Results: Out of the 100 most confident matches, 3 points are matched incorrectly, giving an accuracy of 97%.


    Consider another test case: Two images of Mount Rushmore. The results of feature matching are shown below:


    Keypoints detected in the Mount Rushmore image pair using Harris Corner Detector


    Results of feature matching on the Mount Rushmore pair.


    Evaluation of the 100 most confident matches. Accuracy = 95%

    For the Mount Rushmore pair, accuracy is 95%, with 5 points out of the 100 most confident matches matched incorrectly.

    Local feature matching performed on two images of Sleeping Beauty's Castle:


    Evaluation of the 100 most confident matches.

    Dimensionality Reduction using PCA

    Principal Component Analysis (PCA) is an algorithm that reduces the dimensionality of data while retaining as much of the variance in the dataset as possible. It identifies the directions along which the data varies most, called the principal components, which are ordered by the amount of variance they explain. Projecting the data onto the first k principal components yields new, lower-dimensional features. This works very well for high-dimensional data: for example, if the descriptor has 64 dimensions instead of 128, the distance computation is roughly twice as fast and the descriptors require about half the memory.

    Algorithm:
    Given the set of feature vectors:

  • Find the covariance matrix
  • Find the eigenvectors (V) and eigenvalues (D) of the covariance matrix
  • Sort the eigenvalues in descending order
  • Reorder the eigenvectors to correspond to the sorted eigenvalues
  • The eigenvectors are the principal components, and the eigenvalues are the latent variances
  • Compute the scores S as features * V
  • Recompute the features: new features = S * V'
  • Compute the distance between the first k dimensions of the two feature vectors
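The PCA steps above can be sketched as follows. This version projects the descriptors onto the principal components and keeps the leading k score dimensions directly, which is equivalent to computing the full scores and truncating them for the distance computation.

```python
import numpy as np

def pca_reduce(features, k):
    """Reduce N x D feature vectors to N x k via PCA, as outlined above."""
    # Center the data and compute the covariance matrix
    X = features - features.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Eigenvectors (V) and eigenvalues (D) of the covariance matrix
    D, V = np.linalg.eigh(cov)
    # Sort eigenvalues in descending order; reorder eigenvectors to match
    order = np.argsort(D)[::-1]
    V = V[:, order]
    # Scores: projection of the features onto the principal components
    S = X @ V
    # Keep only the first k dimensions for the distance computations
    return S[:, :k]
```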

    Results of local feature matching after implementing PCA

    We observe that as the dimensionality is reduced from 128 to some value k, accuracy drops, but computation time and memory requirements decrease.



    Notre Dame pair: accuracy = 81% with k = 64.

    Notre Dame pair: accuracy = 94% with k = 96.


    Mount Rushmore pair: accuracy = 91% with k = 96.

    Results Summary:

    Image Pair        Accuracy    Accuracy with PCA, k = 64    Accuracy with PCA, k = 96
    Notre Dame        97%         81%                          94%
    Mount Rushmore    95%         80%                          91%