Ahmet Cecen
This script showcases self-implemented local feature matching functions and uses them on test cases to match image pairs with each other.
Part 1 - Baseline Implementation using SIFT-Like Features
These are the baseline local feature matching functions, implemented following the prompts in the assignment exactly.
- Harris Corner Detection using Algorithm 4.1 in the Szeliski book.
- SIFT like features using gradient histograms.
- Feature matching using the ratio test of Eq. 4.18 in the Szeliski book.
Loading the Test Cases
Harris Corner Detection
Extracts corner keypoints without any scale or orientation information using gradient information.
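The original implementation is in MATLAB; the core of Algorithm 4.1 can be sketched in NumPy as follows. The `sigma` and `alpha` values are illustrative assumptions, not the ones used in the project, and the separable blur is a stand-in for a proper Gaussian filtering routine.

```python
import numpy as np

def _smooth(img, sigma):
    # Separable Gaussian blur (assumed stand-in for a filtering routine)
    r = max(1, int(3 * sigma))
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    tmp = np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 0, tmp)

def harris_response(img, sigma=1.0, alpha=0.06):
    # 1. Image gradients
    Iy, Ix = np.gradient(img.astype(float))
    # 2. Gaussian-weighted entries of the second-moment matrix A
    Sxx, Syy, Sxy = (_smooth(p, sigma) for p in (Ix * Ix, Iy * Iy, Ix * Iy))
    # 3. Cornerness: det(A) - alpha * trace(A)^2
    return Sxx * Syy - Sxy**2 - alpha * (Sxx + Syy)**2
```

Thresholding this response map and keeping local maxima then yields the corner keypoints.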
Feature Description
Spatially bins the local neighbourhood into 4x4-pixel cells over a 16x16 window.
Bins the orientation space to 8 principal directions.
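The binning described above can be sketched in NumPy (the project code itself is MATLAB). This is a simplified sketch that omits the Gaussian weighting and trilinear interpolation of full SIFT; the function name is my own.

```python
import numpy as np

def describe(patch):
    """128-D SIFT-like descriptor from a 16x16 patch:
    4x4 spatial cells x 8 orientation bins, weighted by gradient magnitude."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)            # angles in [0, 2*pi)
    obin = np.minimum((ang * 8 / (2 * np.pi)).astype(int), 7)
    hist = np.zeros((4, 4, 8))
    for i in range(16):
        for j in range(16):
            hist[i // 4, j // 4, obin[i, j]] += mag[i, j]  # vote into cell/orientation bin
    v = hist.ravel()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v
```

The result is a normalized 128-dimensional vector (16 cells x 8 orientations) per keypoint.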
Feature Matching
Finds the 2 nearest neighbors using MATLAB's built-in KNN function, then applies the nearest-neighbor distance ratio test to return a match list ordered by confidence.
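The ratio-test matching can be sketched as follows, with brute-force distances standing in for MATLAB's `knnsearch`; the 0.8 threshold is an illustrative assumption.

```python
import numpy as np

def match_features(f1, f2, ratio_thresh=0.8):
    """For each row of f1, find its 2 nearest neighbours in f2 and keep the
    match if d1/d2 is below the threshold; return matches sorted by confidence."""
    d = np.linalg.norm(f1[:, None, :] - f2[None, :, :], axis=2)
    nn = np.argsort(d, axis=1)[:, :2]                       # indices of 2 nearest
    rows = np.arange(len(f1))
    ratio = d[rows, nn[:, 0]] / np.maximum(d[rows, nn[:, 1]], 1e-12)
    keep = ratio < ratio_thresh
    order = np.argsort(ratio[keep])                         # most confident first
    matches = np.column_stack([rows[keep], nn[keep, 0]])[order]
    return matches, ratio[keep][order]
```

A low ratio means the best match is much closer than the runner-up, hence more distinctive.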
Results
Here I run the baseline framework on all 3 image pairs and evaluate performance.
Part 2 - Implementing Scaling and Orientation
Here I implement simple workarounds to make my features scale- and orientation-invariant, or at least more so than before.
Scale Invariance
Many sources use complicated Gaussian pyramids to cast a very wide net over scale space. For time and practicality purposes, here I instead decided to use a very simple Gaussian ladder: a 1-octave, 4-level "pyramid" which should be able to accommodate the cases in this project.
I can simply wrap a for loop around the usual workflow for each scale, then rescale the x and y coordinates so that they point to locations in the original resolution.
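That loop can be sketched as below (the original is MATLAB). Both `resize_nn` and the injected `detect_fn` are hypothetical stand-ins for the project's `imresize` call and detector; the factor schedule spans 1 octave over 4 levels.

```python
import numpy as np

def resize_nn(img, factor):
    # Nearest-neighbour resize (assumed stand-in for imresize)
    h, w = img.shape
    rows = (np.arange(int(h * factor)) / factor).astype(int)
    cols = (np.arange(int(w * factor)) / factor).astype(int)
    return img[np.ix_(rows, cols)]

def multiscale_detect(img, detect_fn, n_levels=4):
    """Run a single-scale detector on a 1-octave, 4-level ladder and map
    keypoint coordinates back to the original resolution."""
    pts = []
    for level in range(n_levels):
        factor = 2.0 ** (-level / (n_levels - 1))          # 1.0 down to 0.5
        small = resize_nn(img, factor)
        for x, y in detect_fn(small):
            pts.append((x / factor, y / factor, factor))   # back to original frame
    return pts
```

Dividing by the resize factor is what maps a detection at a coarser level back to the original pixel grid.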
Results - Scale
Notice that while there is minimal improvement for the first 2 cases, there is a gigantic leap in accuracy for the third case, since the third case involves an image pair with a significant difference in scale.
Orientation Invariance
The book suggests that a stable method for this would be to bin the local neighbourhood of a keypoint during identification into an orientation histogram and find a dominant direction. I have opted to implement a much simpler version here as well: instead of the 36 bins in the source materials, I use 18 bins, and I find the dominant direction by simply taking the maximum histogram count.
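A sketch of this simplified dominant-orientation estimate (NumPy rather than the project's MATLAB, plain argmax with no peak interpolation):

```python
import numpy as np

def dominant_orientation(patch, n_bins=18):
    """Dominant direction from an 18-bin, magnitude-weighted orientation
    histogram of the keypoint neighbourhood (argmax only, no interpolation)."""
    gy, gx = np.gradient(patch.astype(float))
    ang = np.mod(np.arctan2(gy, gx), 2 * np.pi)
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, 2 * np.pi), weights=mag)
    k = hist.argmax()
    return (k + 0.5) * 2 * np.pi / n_bins      # centre of the winning bin
```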
The implementation of the feature descriptor is a bit trickier. To do this fast, I use the built-in imrotate function, and find the maximum size of the output when rotating a 16x16 matrix without cropping, which occurs at 45 degrees.
Now I can pad and rotate the spatial and orientation bins in the feature space, while keeping the magnitude signal in the original frame.
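The pad-and-rotate step could be sketched like this nearest-neighbour rotation (a hypothetical stand-in for MATLAB's imrotate with `'nearest'` and no cropping); -1 marks padding outside the original patch, and a 16x16 input pads to the 23x23 worst case at 45 degrees.

```python
import numpy as np

def rotate_nn(grid, theta):
    """Nearest-neighbour rotation of a 2-D array about its centre,
    padded so nothing is cropped."""
    h, w = grid.shape
    side = int(np.ceil(np.hypot(h, w)))   # holds any rotation (worst case 45 deg)
    out = np.full((side, side), -1)       # -1 marks "outside the patch"
    cy = cx = (side - 1) / 2
    icy, icx = (h - 1) / 2, (w - 1) / 2
    ys, xs = np.mgrid[0:side, 0:side]
    # Inverse mapping: rotate output coordinates back into the input frame
    c, s = np.cos(theta), np.sin(theta)
    ry = np.rint(icy + (ys - cy) * c - (xs - cx) * s).astype(int)
    rx = np.rint(icx + (ys - cy) * s + (xs - cx) * c).astype(int)
    ok = (ry >= 0) & (ry < h) & (rx >= 0) & (rx < w)
    out[ys[ok], xs[ok]] = grid[ry[ok], rx[ok]]
    return out
```

Applying this to the spatial and orientation bin-index maps, rather than to the image itself, is what keeps the magnitude signal in the original frame.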
Here is what the spatial binning looks like now for an example rotation:
I can then simply follow the same framework as before without any changes, as the histogram can accommodate any orientation fairly.
Results - Orientation
While there is improvement in the last 2 cases, the first case seems to have lost accuracy as a result of rotation invariance. This is not surprising, as the church has many repeated patterns at different angles. The fact that the photos were taken from a similar viewpoint was coincidentally helping alleviate the ambiguity of these patterns, but rotational invariance dissolved this effect.
Also note that there are some points that are incorrectly identified as non-matches. This is somewhat evident in the Notre-Dame image, but extremely evident in Mount Rushmore, where the misidentified point clearly corresponds to the same location in both pictures.
Published with MATLAB® R2016b