Results of Nortre Dame. (98% matching accuracy)
Local feature matching a an important topic in computer vision. Simply spearking, its conventional pipeline includes obtaining interest points, computing the local features of these points and then matching the nearest interest points. In this project, I implemented the fundamental baseline (with 98% accuracy on Nortre Dame) and achieved the following extra credits:
I implemented the Harris corner detector to extract interest points. I first used the non-maximal suppression for Harris detector. The 100 most confidient matched interest points are shown as follows:
My implementation for get_interest_points
function is shown as follows:
%Harris corner detector with non-maximal suppression
function [x, y, confidence, scale, orientation] = get_interest_points(image, feature_width)
alpha = 0.056;
threshold = 0.001;
H_x = fspecial('gaussian',[3 3],1.2);
H_y = fspecial('gaussian',[3 3],1.2);
H_f = fspecial('gaussian',[5 5],2);
DH_x = imfilter(H_x,[-1,0,1]);
DH_y = imfilter(H_x,[1;0;-1]);
I_x = abs(imfilter(image,DH_x));
I_y = abs(imfilter(image,DH_y));
g_Ix2 = imfilter(I_x.^2,H_f);
g_Iy2 = imfilter(I_y.^2,H_f);
g_Ixy = imfilter(I_x.*I_y,H_f);
har_m = g_Ix2.*g_Iy2-g_Ixy.^2-alpha*(g_Ix2+g_Iy2).^2;
har = har_m(feature_width/2+1:end-feature_width,feature_width/2+1:end-feature_width);
% non-maximal suppression
radius = 3;
mask_size = 2*radius+1;
mask = ordfilt2(har,mask_size^2,ones(mask_size));
ip_ind = (har==mask)&(abs(har./max(max(har))>threshold));
[y,x] = find(ip_ind);
y = y+feature_width/2;
x = x+feature_width/2;
end
One drawback of Harris corner detector is that its detected points easily concentrated on some local parts of the image. To make these points distribute more uniformly, I also implemented the adaptive non-maximal suppression, whose code is shown as follows:
%Harris corner detector with non-maximal suppression
function [x, y, confidence, scale, orientation] = get_interest_points(image, feature_width)
alpha = 0.056;
threshold = 0.001;
H_x = fspecial('gaussian',[3 3],1.2);
H_y = fspecial('gaussian',[3 3],1.2);
H_f = fspecial('gaussian',[5 5],2);
DH_x = imfilter(H_x,[-1,0,1]);
DH_y = imfilter(H_x,[1;0;-1]);
I_x = abs(imfilter(image,DH_x));
I_y = abs(imfilter(image,DH_y));
g_Ix2 = imfilter(I_x.^2,H_f);
g_Iy2 = imfilter(I_y.^2,H_f);
g_Ixy = imfilter(I_x.*I_y,H_f);
har_m = g_Ix2.*g_Iy2-g_Ixy.^2-alpha*(g_Ix2+g_Iy2).^2;
har = har_m(feature_width/2+1:end-feature_width,feature_width/2+1:end-feature_width);
% adaptive non-maximal suppression
threshold_anms = 3;
ip_ind = (abs(har./max(max(har))>threshold));
[y,x] = find(ip_ind);
for i = 1:size(y,1)
ipval(i) = har(y(i),x(i));
end
for i = 1:size(y,1)
idx = [];
[~,idx] = find(ipvalthreshold_anms);
x = x(idx2);
y = y(idx2);
y = y+feature_width/2;
x = x+feature_width/2;
end
Adaptive non-maximal suppression produces the points distribution as follows:
After obtaining the intial interes points, I compute the local features associated to each interest point. I follow the suggested implementation details. In general, I extract a square window of pixels whose center is the interest point, partition it to 4x4 cells, compute a gradient histogram in each cell, append these histograms together and finally normalized to unit length. In specific, I do a lot of modifications in order to boost the performance. E.g., I use a weighted distribution of histogram in order to reduce the boundary effect. I also use a Gaussian function to decrease the contribution of the gradients that are far from the center.
My implementation for get_features
is shown as follows:
%compute SIFT-like features
function [features] = get_features(image, x, y, feature_width)
%% simple normalization
% features = zeros(size(x,1), 17*17);
% for k=1:size(x,1);
% k1 = 1;
% for i =-8:8
% for j=-8:8
% features(k,k1) = image(round(y(k))+1+j,round(x(k))+1+i);
% k1 = k1 + 1;
% end
% end
% features(k,:) = features(k,:)/norm(features(k,:),2);
%% SIFT-like features
H_f = fspecial('gaussian',[16 16],1);
image = imfilter(image,H_f);
x_int = fix(x);
y_int = fix(y);
section = feature_width/4;
start = -feature_width/2+1;
sigma_w = 8;
for q = 1:size(x,1)
num3 = 1;
for k1 = 1:4
for k2 = 1:4
num = 1;
for i = (start+section*(k1-1)):(start+section*k1-1)
for j = (start+section*(k2-1)):(start+section*k2-1)
m(num) = sqrt((image(y_int(q)+i+1,x_int(q)+j)-...
image(y_int(q)+i-1,x_int(q)+j))^2+(image(y_int(q)+...
i,x_int(q)+j+1)-image(y_int(q)+i,x_int(q)+j-1))^2);
weight = exp(((-(y_int(q)+i-y(q))^2-(x_int(q)+j-x(q))^2)/(2*sigma_w^2))/(2*pi*sigma_w^2));
m(num) = m(num)*weight;
theta(num) = (atan2((image(y_int(q)+...
i+1,x_int(q)+j)-image(y_int(q)+i-1,x_int(q)+j)),(image(y_int(q)+i,x_int(q)+j+1)-...
image(y_int(q)+i,x_int(q)+j-1)))+pi)/(2*pi)*8+1;
num = num+1;
end
end
histo(num3,:) = zeros(1,9);
for num2 = 1:16
histo(num3,fix(theta(num2))) = histo(num3,fix(theta(num2))) + ...
(1-theta(num2)+fix(theta(num2)))*m(num2);
if theta(num2) == 9
histo(num3,9) = histo(num3,9) + m(num2);
else
histo(num3,fix(theta(num2))+1) = histo(num3,fix(theta(num2))+1) + ...
(theta(num2)-fix(theta(num2)))*m(num2);
end
end
histo(num3,1) = histo(num3,1) + histo(num3,9);
num3 = num3+1;
end
end
m_histo = histo(:,1:8);
final_histo(q,:) = m_histo(:)'/norm(m_histo(:)',2);
final_histo(q,find(final_histo(q,:)>0.17)) = 0.15;
final_histo(q,:) = final_histo(q,:)/norm(final_histo(q,:),2);
end
features = final_histo;
end
I use the standrad nearest neighbors for feature matching. The confidence is computed by ratio of the distance between the nearest point and the second nearest point. Furthermore, I use PCA for feature dimensionality reduction and KD-tree to speed up the matching. My code for match_features
is shown as follows:
% match local features
function [matches, confidences] = match_features(features1, features2)
num_features = min(size(features1, 1), size(features2,1));
matches = zeros(num_features, 2);
%% PCA
% low_dim = 70;
% [~,new_features] = pca([features1;features2]);
% new_features1 = new_features(1:size(features1,1),1:low_dim);
% new_features2 = new_features(size(features1,1)+1:end,1:low_dim);
% features1 = new_features1;
% features2 = new_features2;
%% nearest neighbor distance ratio
if size(features1,1) == num_features
[idx,D] = knnsearch(features2,features1,'K',2,'NSMethod','kdtree');
matches(:,1) = [1:num_features];
matches(:,2) = idx(:,1);
confidences = D(:,1)./D(:,2); % ratio test
else
[idx,D] = knnsearch(features1,features2,'K',2,'NSMethod','kdtree');
matches(:,2) = [1:num_features];
matches(:,1) = idx(:,1);
confidences = D(:,1)./D(:,2); % ratio test
end
%% naive approach
% if size(features1,1) == num_features
% for i = 1:size(features1,1)
% for j = 1:size(features2,1)
% dist(i,j) = sqrt(sum((features1(i,:)-features2(j,:)).^2));
% end
% end
% [dist_min,idx_min] = min(dist,[],2);
% [sortval,idx] = sort(dist_min,'ascend');
% matches(:,1) = idx
% matches(:,2) = idx_min(idx);
% confidences = 1./(sortval+1);
% end
% matches(:,1) = randperm(num_features);
% matches(:,2) = randperm(num_features);
[confidences, ind] = sort(confidences, 'ascend');
matches = matches(ind(1:100),:);
end
For Nortre Dame example, the matching is quite successful partially because this pair of samples are easy to match. Th visual results for the matching is shown as follows:
I also experimented my implementaion on Mount Rushmore example and obtained 98% matching accuracy in top 100 most confident interest points. The results are shown as follows:
However, since I have not implemented the scale-invariant part, so the accuracy on Episcopal Gaudi exmaple is still not very satisfactory.