
Project 2: Local Feature Matching

Design Decisions

Interest Points

Interest points were extracted from each image pair using the Harris corner detector. Gaussian derivative filters in the x and y directions were applied to the image, and a cornerness score was computed at each pixel. Corner candidates were then thresholded to create a binary image, and connected components (groups of adjacent corner candidates) were extracted from it. For each connected component, the pixel with the highest cornerness score was selected as the interest point. This suppresses clusters of interest points that lie only a few pixels apart.
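The sketch below illustrates this stage, assuming a grayscale image in a float NumPy array; the function name and the sigma, alpha, and threshold parameters are illustrative assumptions, not the project's actual values.

```python
import numpy as np
from scipy import ndimage

def harris_interest_points(image, sigma=1.0, alpha=0.06, rel_threshold=0.01):
    # Gaussian derivative filters in the x and y directions.
    Ix = ndimage.gaussian_filter(image, sigma, order=(0, 1))
    Iy = ndimage.gaussian_filter(image, sigma, order=(1, 0))

    # Entries of the second-moment matrix, smoothed with a wider Gaussian.
    Ixx = ndimage.gaussian_filter(Ix * Ix, 2 * sigma)
    Iyy = ndimage.gaussian_filter(Iy * Iy, 2 * sigma)
    Ixy = ndimage.gaussian_filter(Ix * Iy, 2 * sigma)

    # Harris cornerness score: det(M) - alpha * trace(M)^2.
    cornerness = Ixx * Iyy - Ixy ** 2 - alpha * (Ixx + Iyy) ** 2

    # Threshold corner candidates into a binary image.
    candidates = cornerness > rel_threshold * cornerness.max()

    # Label connected components and keep the strongest pixel from each,
    # suppressing clusters of interest points only a few pixels apart.
    labels, num = ndimage.label(candidates)
    positions = ndimage.maximum_position(cornerness, labels, np.arange(1, num + 1))
    return np.array(positions)  # one (row, col) interest point per component
```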

Feature Descriptors

For each interest point, a 128-dimensional feature descriptor was generated following most of the SIFT algorithm. Trilinear interpolation and the Gaussian falloff weighting were omitted to reduce computation time. While SIFT recommends clipping normalized descriptor values at 0.2, performance gains were realized by clipping at 0.25 instead.
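A sketch of this descriptor stage is shown below, assuming precomputed gradient magnitude and orientation arrays and an interest point at least eight pixels from the image border; the function name and window layout follow standard SIFT conventions but are assumptions, not the project's code.

```python
import numpy as np

def sift_like_descriptor(magnitude, orientation, x, y, width=16):
    # 16x16 window around the interest point, split into a 4x4 grid of cells;
    # each cell contributes an 8-bin orientation histogram, giving 128 values.
    half = width // 2
    mag = magnitude[y - half:y + half, x - half:x + half]
    ori = orientation[y - half:y + half, x - half:x + half]

    cell = width // 4
    descriptor = []
    for i in range(4):
        for j in range(4):
            m = mag[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            o = ori[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            # Histogram of gradient orientations, weighted by magnitude;
            # no trilinear interpolation or Gaussian falloff, as noted above.
            hist, _ = np.histogram(o, bins=8, range=(-np.pi, np.pi), weights=m)
            descriptor.extend(hist)
    descriptor = np.asarray(descriptor, dtype=float)

    # Normalize, clip at 0.25 (SIFT suggests 0.2), then renormalize.
    descriptor /= np.linalg.norm(descriptor) + 1e-8
    descriptor = np.minimum(descriptor, 0.25)
    descriptor /= np.linalg.norm(descriptor) + 1e-8
    return descriptor
```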

Matching Features

Pairs of features were matched by minimizing the Euclidean distance between their descriptors; a smaller distance indicates a closer match. The nearest-neighbor distance ratio (NNDR) was also used to filter match candidates and to report confidence. Each candidate was filtered on both absolute distance and NNDR; if it failed either criterion, it was not reported as a match. Each reported match was also assigned a confidence level according to the function:

confidence = 1 - NNDR
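A sketch of the matcher is below, assuming each image's descriptors are stacked in a NumPy array; the max_distance and max_nndr thresholds are illustrative placeholders, not the tuned values.

```python
import numpy as np

def match_features(desc1, desc2, max_distance=0.8, max_nndr=0.9):
    matches, confidences = [], []
    for i, d in enumerate(desc1):
        # Euclidean distance from this feature to every feature in image 2.
        dists = np.linalg.norm(desc2 - d, axis=1)
        nearest, second = np.argsort(dists)[:2]

        # Nearest-neighbor distance ratio: near 1 means an ambiguous match.
        nndr = dists[nearest] / (dists[second] + 1e-8)

        # Filter on both absolute distance and NNDR; failing either rejects it.
        if dists[nearest] < max_distance and nndr < max_nndr:
            matches.append((i, nearest))
            confidences.append(1.0 - nndr)  # confidence = 1 - NNDR
    return np.array(matches), np.array(confidences)
```

Requiring both tests means an ambiguous candidate is rejected even when its absolute distance is small, and a distant candidate is rejected even when it is unambiguous.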

Image Matching Results

The performance of the image matching pipeline was assessed by evaluating the 100 most confident matches between each pair of images. This provides a consistent benchmark for comparing performance across image pairs.
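Given the matches and confidences returned by a matcher like the one sketched above, selecting this evaluation set is a simple sort; top_k_matches is an illustrative helper, not project code.

```python
import numpy as np

def top_k_matches(matches, confidences, k=100):
    # Sort by descending confidence and keep the k most confident matches.
    order = np.argsort(-confidences)[:k]
    return matches[order], confidences[order]
```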

Each circle represents a feature that is one of the top 100 most confident matches. A green border indicates a match that agrees with the ground truth; a red border indicates one that does not.

Notre Dame - 97% accuracy.

The image matching pipeline truly excelled when matching these Notre Dame images, reaching 97% accuracy. There is minimal scale and orientation difference between the two images, which contributes to the high accuracy.

The most significant difference between the images is the tree present in the right image. However, the image matching pipeline still found correct matches around this obstruction.

Mount Rushmore - 97% accuracy.

These images of Mount Rushmore exhibited more differences in illumination, scale, and content than the images of Notre Dame. However, the image matching pipeline was still able to achieve very good accuracy over the top 100 matches.

Episcopal Gaudi - 7% accuracy.

This pair of images proved especially troublesome for the image matching pipeline. The significant differences in orientation and scale resulted in very poor feature matching, especially when 100 feature matches were required. In addition, the right image contains a spurious structure along its left edge; because this structure does not appear in the left image, it produced many false-positive matches.