Computer Vision Project

Project 2: Local Feature Matching

Harris Corner Detector

To find interest points in the images, a Harris corner detector is implemented using the Harris corner response formula.

Process Flow

Compute image gradient Ix and Iy with the Sobel operator.
Compute Ix², Iy² and IxIy.
Compute g(Ix²), g(Iy²) and g(IxIy) with a Gaussian filter.
Discard points whose score falls below a threshold.
Use the bwconncomp function in MATLAB to group connected points together.
Keep only the point with the maximum score in each group.
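
The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the MATLAB code used for the results. It assumes the commonly used response g(Ix²)g(Iy²) - g(IxIy)² - alpha(g(Ix²) + g(Iy²))², and a simple 8-connected flood fill stands in for bwconncomp. The names filter2d, gaussian_kernel, multiply and harris_points, and the parameters alpha and rel_threshold, are hypothetical choices made for this sketch.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def filter2d(img, kernel):
    """Same-size 2-D correlation with zero padding (slow, for illustration)."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for j in range(kh):
                for i in range(kw):
                    yy, xx = y + j - kh // 2, x + i - kw // 2
                    if 0 <= yy < h and 0 <= xx < w:
                        out[y][x] += img[yy][xx] * kernel[j][i]
    return out

def gaussian_kernel(size=5, sigma=1.0):
    """Normalized size x size Gaussian kernel."""
    c = size // 2
    k = [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]

def multiply(a, b):
    """Element-wise product of two equally sized images."""
    return [[p * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def harris_points(img, alpha=0.05, rel_threshold=0.1):
    """Return one (row, col) interest point per connected above-threshold group."""
    h, w = len(img), len(img[0])
    # Steps 1-3: Sobel gradients, their products, Gaussian smoothing.
    ix, iy = filter2d(img, SOBEL_X), filter2d(img, SOBEL_Y)
    g = gaussian_kernel()
    gxx = filter2d(multiply(ix, ix), g)
    gyy = filter2d(multiply(iy, iy), g)
    gxy = filter2d(multiply(ix, iy), g)
    # Assumed corner response: det(M) - alpha * trace(M)^2.
    score = [[gxx[y][x] * gyy[y][x] - gxy[y][x] ** 2
              - alpha * (gxx[y][x] + gyy[y][x]) ** 2
              for x in range(w)] for y in range(h)]
    # Step 4: threshold, here relative to the strongest response (an assumption).
    cutoff = rel_threshold * max(map(max, score))
    # Steps 5-6: group 8-connected pixels, keep the best pixel of each group.
    seen = [[False] * w for _ in range(h)]
    points = []
    for y in range(h):
        for x in range(w):
            if seen[y][x] or score[y][x] <= cutoff:
                continue
            seen[y][x] = True
            stack, best = [(y, x)], (y, x)
            while stack:
                cy, cx = stack.pop()
                if score[cy][cx] > score[best[0]][best[1]]:
                    best = (cy, cx)
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < h and 0 <= nx < w
                                and not seen[ny][nx] and score[ny][nx] > cutoff):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            points.append(best)
    return points
```

On a small synthetic image containing one bright square, this sketch should return points close to the four corners of the square, since edges give a negative response and flat areas give zero.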

Result - Interest Points

Interest points on Notre Dame images by Harris corner detector.

SIFT-like Local Feature

The figure below gives a schematic explanation of SIFT.

Process Flow

Create a 16x16 window around each keypoint.
Compute the gradient magnitude and direction for each window.
Weight the magnitudes according to their distance from the center pixel using a Gaussian filter.
Break the window up into 16 cells of size 4x4.
Form a histogram for each cell by dividing the gradients into 8 buckets, each covering 45 degrees.
Sum the weighted magnitudes of the gradients in each bucket.
Append each cell's histogram to form the 1x128 feature descriptor vector.
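
As a rough sketch of the steps above, the descriptor for a single keypoint could be built as follows in plain Python. The function name sift_like_descriptor, the central-difference gradient and the Gaussian width sigma are assumptions made for illustration; the report does not state which gradient operator or sigma was used. The keypoint is assumed to lie at least 9 pixels away from every image border.

```python
import math

def sift_like_descriptor(img, cy, cx, sigma=8.0):
    """Build a 128-value descriptor from the 16x16 window around (cy, cx)."""
    desc = [0.0] * 128
    for wy in range(16):
        for wx in range(16):
            y, x = cy - 8 + wy, cx - 8 + wx
            # Gradient by central differences (an assumed operator).
            dy = img[y + 1][x] - img[y - 1][x]
            dx = img[y][x + 1] - img[y][x - 1]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            # Gaussian weight based on distance from the window center.
            r2 = (wy - 7.5) ** 2 + (wx - 7.5) ** 2
            weight = math.exp(-r2 / (2 * sigma ** 2))
            # 4x4 grid of cells, each holding 8 buckets of 45 degrees.
            cell = (wy // 4) * 4 + (wx // 4)
            bucket = int(angle / (math.pi / 4)) % 8
            desc[cell * 8 + bucket] += weight * magnitude
    return desc
```

For an image whose intensity increases only from left to right, every gradient points at 0 degrees, so only the first bucket of each of the 16 cells is filled.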

Ratio Test Matching

Process Flow

Set a threshold (roughly 0.7 to 0.8).
Compute the ratio d1/d2, where d1 is the distance to the nearest neighbor and d2 is the distance to the second-nearest neighbor.
Accept the match if the ratio is less than the threshold.
Sort these matches based on the confidence.
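
A minimal sketch of this matching procedure in plain Python is shown below. The function name match_features is hypothetical, and the confidence is assumed here to be 1 - d1/d2, since the report does not define it. Euclidean distance between descriptors is also an assumption, and desc2 is assumed to hold at least two descriptors.

```python
import math

def match_features(desc1, desc2, threshold=0.8):
    """Return (index1, index2, confidence) tuples, most confident first."""
    matches = []
    for i, d in enumerate(desc1):
        # Distances from descriptor i to every descriptor in the second image.
        dists = sorted((math.dist(d, e), j) for j, e in enumerate(desc2))
        (d1, nearest), (d2, _) = dists[0], dists[1]
        ratio = d1 / d2 if d2 > 0 else 1.0
        if ratio < threshold:
            # Assumed confidence: a smaller ratio means a more distinctive match.
            matches.append((i, nearest, 1.0 - ratio))
    matches.sort(key=lambda m: m[2], reverse=True)
    return matches
```

A descriptor that lies equally far from its two nearest neighbors has a ratio of 1 and is rejected, which is how the test filters out ambiguous matches.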

Results

Notre Dame (top 100 points) : 93/100, 93% accuracy

Notre Dame (total 278 points) : 255/278, 92% accuracy

Mount Rushmore (top 100 points) : 96/100, 96% accuracy

Mount Rushmore (total 344 points) : 325/344, 94% accuracy

Episcopal Gaudi (total 69 points) : 59/69, 86% accuracy after scaling the 2nd image down by a factor of 0.6 (before scaling : 25-40% accuracy)

Statue of Liberty

Sleeping Beauty Castle Paris

House

Pantheon Paris (an example of the scale effect : only a small number of matching points)

Extra Credit : MSER (interest point detection)

Based on the paper by Matas et al. [1], this technique looks for regions that remain stable over a certain number of thresholds. If a region Q_{i+Δ} is not significantly larger than a region Q_{i-Δ}, region Q_i is taken as a maximally stable region. The interest point is taken as the center point of each maximally stable region over a certain threshold value, so that the points are chosen on the objects. The stable region is assumed to be elliptical in shape for ease of computation. Since I used the same local descriptor, the points near the image edges were suppressed. MSER takes approximately the center point of a stable region as an interest point. Therefore, the current feature descriptor, which uses adjacent pixels, may have a lower matching accuracy, as shown below.
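
The stability criterion can be illustrated with a short plain-Python sketch. It assumes the areas of one nested region sequence Q_0, Q_1, ... are already known, and uses the stability measure from [1], q(i) = |Q_{i+Δ} \ Q_{i-Δ}| / |Q_i|, which for nested regions reduces to a difference of areas. The function name stable_levels and the parameters delta and max_variation are hypothetical; this is not the code used for the results below.

```python
def stable_levels(areas, delta=2, max_variation=0.25):
    """Return the threshold indices i at which region Q_i is maximally stable."""
    # Relative growth of the region between thresholds i - delta and i + delta.
    q = {i: (areas[i + delta] - areas[i - delta]) / areas[i]
         for i in range(delta, len(areas) - delta)}
    stable = []
    for i, value in q.items():
        left = q.get(i - 1, float("inf"))
        right = q.get(i + 1, float("inf"))
        # Keep local minima of q that also grow slowly enough (assumed cutoff).
        if value <= max_variation and value < left and value <= right:
            stable.append(i)
    return stable
```

For example, with areas [10, 40, 90, 100, 102, 104, 106, 150, 300, 600] and delta=1, the region barely grows around index 5, so that level is the one reported as maximally stable.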

Results

Interest points with elliptical stable regions on Notre Dame images by MSER

Notre Dame via MSER: 19/26, 73% accuracy

[1] Matas, Jiri, et al. "Robust wide-baseline stereo from maximally stable extremal regions." Image and Vision Computing 22.10 (2004): 761-767.

Kihan Park (GTID:902788094)
