The goal of this assignment is to create a local feature matching algorithm using a simplified version of the famous SIFT pipeline. The pipeline consists of three parts: a Harris corner detector to find interest points, a simplified version of the SIFT descriptor to describe interest points, and feature matching. Additionally, I implemented Adaptive Non-Maximal Suppression (ANMS) to get a more uniform distribution of interest points when only a limited number of points are selected. I also followed the steps in Lowe's paper to build a scale-invariant interest point detector and feature descriptor, and assigned a dominant orientation to each interest point, but due to the time limit I did not build a rotation-invariant descriptor.
I follow Algorithm 4.1 in the textbook. First compute the x and y derivatives of a Gaussian, then convolve them with the image to get the image gradients. Next compute the products of these gradients and convolve them with a larger Gaussian. Finally, compute each point's corner strength with the Harris response formula.
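The steps above can be sketched as follows. The scales `sigma_d`, `sigma_i` and the trace weight `alpha` are illustrative defaults, not necessarily the values used in my implementation:

```python
import numpy as np
from scipy import ndimage

def harris_response(image, sigma_d=1.0, sigma_i=2.0, alpha=0.06):
    """Harris corner strength (sketch of Algorithm 4.1)."""
    # 1. Convolve with x/y derivatives of a Gaussian to get gradients.
    Ix = ndimage.gaussian_filter(image, sigma_d, order=(0, 1))
    Iy = ndimage.gaussian_filter(image, sigma_d, order=(1, 0))
    # 2. Products of the gradients, smoothed by a larger Gaussian.
    Sxx = ndimage.gaussian_filter(Ix * Ix, sigma_i)
    Syy = ndimage.gaussian_filter(Iy * Iy, sigma_i)
    Sxy = ndimage.gaussian_filter(Ix * Iy, sigma_i)
    # 3. Harris response: det(A) - alpha * trace(A)^2 at every pixel.
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - alpha * trace ** 2
```

On a synthetic white square, the response peaks at the square's four corners, as expected.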
Next I set a threshold to filter out weak corners. Then I find the maximum of each connected component and delete points near the border of the image, so that every remaining point has a full window around it.
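A minimal sketch of this selection step, assuming a relative threshold and a fixed border margin (both values here are illustrative, not the ones used in the report):

```python
import numpy as np
from scipy import ndimage

def pick_corners(response, threshold_ratio=0.01, border=8):
    """Keep the strongest point of each connected component above a
    threshold, discarding points too close to the image border."""
    mask = response > threshold_ratio * response.max()
    labels, n = ndimage.label(mask)
    # Location of the strongest response inside each component.
    peaks = ndimage.maximum_position(response, labels, range(1, n + 1))
    h, w = response.shape
    return [(y, x) for y, x in peaks
            if border <= y < h - border and border <= x < w - border]
```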
The purpose of ANMS is to achieve an even distribution of feature points across the image. The algorithm only keeps features that are both local maxima and whose response value is significantly (10%) greater than that of all of their neighbors within a radius r. It did not give a better matching result in my experiments, so I commented it out.
Left: 500 strongest interest points. Right: ANMS (top 500).
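The ANMS criterion can be sketched as follows: each point gets a suppression radius equal to the distance to the nearest point whose (down-weighted) response still beats its own, and the points with the largest radii are kept. The factor `robust = 0.9` encodes the "significantly (10%) greater" condition; this is an O(n^2) sketch, fine for a few thousand candidates:

```python
import numpy as np

def anms(points, responses, num_keep=500, robust=0.9):
    """Adaptive non-maximal suppression (sketch)."""
    pts = np.asarray(points, dtype=float)
    resp = np.asarray(responses, dtype=float)
    n = len(pts)
    radii = np.full(n, np.inf)  # global maximum keeps radius = inf
    for i in range(n):
        # Points whose scaled response exceeds ours can suppress us.
        stronger = resp * robust > resp[i]
        if stronger.any():
            d2 = np.sum((pts[stronger] - pts[i]) ** 2, axis=1)
            radii[i] = np.sqrt(d2.min())
    keep = np.argsort(-radii)[:num_keep]
    return [tuple(pts[k]) for k in keep]
```

Because ranking is by radius rather than raw response, an isolated medium-strength corner survives while a weaker corner crowded next to a strong one is suppressed.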
I computed the distance between each descriptor of the first image and each descriptor of the second image. For each interest point in the first image, I pick the smallest and second-smallest distances to interest points in the second image, then threshold the matches by the ratio of these two distances. Matches with a large ratio are discarded, since such a match is unlikely to be unique.
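This nearest-neighbor distance ratio test can be sketched as below. The threshold 0.8 follows Lowe's suggested value and may differ from the one used here:

```python
import numpy as np

def match_features(desc1, desc2, ratio_thresh=0.8):
    """Ratio-test matching: return (index1, index2) pairs whose
    nearest match is clearly better than the second-nearest."""
    desc2 = np.asarray(desc2, dtype=float)
    matches = []
    for i, d in enumerate(np.asarray(desc1, dtype=float)):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        nearest, second = dists[order[0]], dists[order[1]]
        # Keep only unambiguous matches.
        if nearest < ratio_thresh * second:
            matches.append((i, int(order[0])))
    return matches
```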
I got 82 good matches and 19 bad matches.
The algorithm above has poor performance if the images have different scales.
These two images show the same scene, but the second is scaled by a factor of 0.5. For better illustration I rescale the second image in the figures below, but they really are two images at different scales.
I follow the steps described in Lowe's paper (2004).
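The core of Lowe's approach is a difference-of-Gaussians (DoG) scale space, whose 3D extrema (over x, y, and scale) give scale-invariant candidate points. A minimal sketch, using the paper's suggested `sigma0 = 1.6` and scale step `k = sqrt(2)` (my actual parameter values may differ):

```python
import numpy as np
from scipy import ndimage

def dog_pyramid(image, num_scales=5, sigma0=1.6, k=2 ** 0.5):
    """Difference-of-Gaussians stack in the style of Lowe (2004).
    Candidate interest points are pixels that are larger (or smaller)
    than all 26 neighbors across x, y, and adjacent scales."""
    blurred = [ndimage.gaussian_filter(image, sigma0 * k ** i)
               for i in range(num_scales)]
    return np.stack([blurred[i + 1] - blurred[i]
                     for i in range(num_scales - 1)])
```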
Here is the result for two identical images at different scales, which is much better than that of the original algorithm.