Project 2: Local Feature Matching

Results on Notre Dame. (98% matching accuracy)

Summary

Local feature matching is an important topic in computer vision. Simply speaking, its conventional pipeline consists of detecting interest points, computing a local feature descriptor for each point, and then matching the nearest descriptors between images; a minimal sketch of this pipeline follows the list below. In this project, I implemented the fundamental baseline (with 98% accuracy on Notre Dame) and completed the following extra credit items:

  1. Implemented the adaptive non-maximum suppression. (5 pts)
  2. Used PCA to reduce the dimension of the local features to speed up the matching. (10 pts)
  3. Used kd-tree for matching. (5 pts)
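
The whole pipeline can be sketched as follows. This is a minimal sketch: the image file names are hypothetical, while the three function signatures are the ones implemented in the rest of this report.


% minimal pipeline sketch (image file names are hypothetical)
image1 = im2single(rgb2gray(imread('notre_dame_1.jpg')));
image2 = im2single(rgb2gray(imread('notre_dame_2.jpg')));
feature_width = 16; % width of the descriptor window in pixels

% 1) detect interest points in both images
[x1, y1] = get_interest_points(image1, feature_width);
[x2, y2] = get_interest_points(image2, feature_width);

% 2) compute a local feature descriptor around each interest point
features1 = get_features(image1, x1, y1, feature_width);
features2 = get_features(image2, x2, y2, feature_width);

% 3) match the two descriptor sets with the ratio test
[matches, confidences] = match_features(features1, features2);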

Obtaining interest points

I implemented the Harris corner detector to extract interest points. I first used standard non-maximum suppression to keep only local maxima of the corner response. The 100 most confident interest points are shown as follows:

My implementation of the get_interest_points function is shown as follows:


%Harris corner detector with non-maximum suppression
function [x, y, confidence, scale, orientation] = get_interest_points(image, feature_width)
    alpha = 0.056;      % weight of the trace term in the Harris response
    threshold = 0.001;  % relative threshold on the corner response
    H_x = fspecial('gaussian',[3 3],1.2);
    H_y = fspecial('gaussian',[3 3],1.2);
    H_f = fspecial('gaussian',[5 5],2);
    DH_x = imfilter(H_x,[-1,0,1]);  % derivative-of-Gaussian filters
    DH_y = imfilter(H_y,[1;0;-1]);
    I_x = abs(imfilter(image,DH_x));
    I_y = abs(imfilter(image,DH_y));
    g_Ix2 = imfilter(I_x.^2,H_f);   % smoothed second-moment matrix entries
    g_Iy2 = imfilter(I_y.^2,H_f);
    g_Ixy = imfilter(I_x.*I_y,H_f);
    har_m = g_Ix2.*g_Iy2-g_Ixy.^2-alpha*(g_Ix2+g_Iy2).^2;
    % discard a border so a full feature window fits around every point
    har = har_m(feature_width/2+1:end-feature_width,feature_width/2+1:end-feature_width);
    % non-maximum suppression: keep only local maxima above the threshold
    radius = 3;
    mask_size = 2*radius+1;
    mask = ordfilt2(har,mask_size^2,ones(mask_size)); % local maximum in each window
    ip_ind = (har==mask)&(abs(har./max(har(:))) > threshold);
    [y,x] = find(ip_ind);
    confidence = har(ip_ind);       % Harris response, ordered like find's output
    scale = []; orientation = [];   % not used by the baseline
    y = y+feature_width/2;
    x = x+feature_width/2;
end
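
To produce the visualization above, the detections can be ranked by the returned confidence (the Harris response). A minimal sketch, assuming an image is already loaded as in the pipeline sketch earlier:


% keep the 100 strongest detections for display
% (a sketch; uses the confidence output of the function above)
[x, y, confidence] = get_interest_points(image, 16);
[~, order] = sort(confidence, 'descend');
top = order(1:min(100, numel(order)));
figure; imshow(image); hold on;
plot(x(top), y(top), 'g+');  % overlay the strongest corners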

One drawback of the Harris corner detector is that the detected points tend to concentrate in a few local regions of the image. To distribute the points more uniformly, I also implemented adaptive non-maximum suppression, whose code is shown as follows:


%Harris corner detector with adaptive non-maximum suppression
function [x, y, confidence, scale, orientation] = get_interest_points(image, feature_width)
    alpha = 0.056;
    threshold = 0.001;
    H_x = fspecial('gaussian',[3 3],1.2);
    H_y = fspecial('gaussian',[3 3],1.2);
    H_f = fspecial('gaussian',[5 5],2);
    DH_x = imfilter(H_x,[-1,0,1]);
    DH_y = imfilter(H_y,[1;0;-1]);
    I_x = abs(imfilter(image,DH_x));
    I_y = abs(imfilter(image,DH_y));
    g_Ix2 = imfilter(I_x.^2,H_f);
    g_Iy2 = imfilter(I_y.^2,H_f);
    g_Ixy = imfilter(I_x.*I_y,H_f);
    har_m = g_Ix2.*g_Iy2-g_Ixy.^2-alpha*(g_Ix2+g_Iy2).^2;
    har = har_m(feature_width/2+1:end-feature_width,feature_width/2+1:end-feature_width);
    % adaptive non-maximum suppression
    threshold_anms = 3;   % minimum suppression radius (pixels)
    ip_ind = abs(har./max(har(:))) > threshold;
    [y,x] = find(ip_ind);
    ipval = zeros(1,size(y,1));
    for i = 1:size(y,1)
        ipval(i) = har(y(i),x(i));   % corner response of each candidate
    end
    radius = zeros(1,size(y,1));
    for i = 1:size(y,1)
        % suppression radius: distance to the nearest stronger candidate
        idx = find(ipval > ipval(i));
        if isempty(idx)
            radius(i) = inf;         % the globally strongest point survives
        else
            radius(i) = min(sqrt((y(idx)-y(i)).^2+(x(idx)-x(i)).^2));
        end
    end
    idx2 = find(radius > threshold_anms);
    confidence = ipval(idx2)';
    scale = []; orientation = [];
    x = x(idx2);
    y = y(idx2);
    y = y+feature_width/2;
    x = x+feature_width/2;
end

Adaptive non-maximum suppression produces the following point distribution:

Computing SIFT-like local features

After obtaining the initial interest points, I compute a local feature descriptor for each interest point, following the suggested implementation details. In general, I extract a square window of pixels centered on the interest point, partition it into 4x4 cells, compute a gradient orientation histogram in each cell, concatenate these histograms, and finally normalize the result to unit length. Specifically, I make several modifications to boost performance. For example, each gradient's magnitude is distributed between the two nearest orientation bins, which reduces quantization effects at bin boundaries. I also apply a Gaussian weight to decrease the contribution of gradients far from the window center.

My implementation of get_features is shown as follows:


%compute SIFT-like features
function [features] = get_features(image, x, y, feature_width)

    %% simple normalization
    % features = zeros(size(x,1), 17*17);
    % for k=1:size(x,1);
    % k1 = 1;
    % for i =-8:8
    %     for j=-8:8
    %         features(k,k1) = image(round(y(k))+1+j,round(x(k))+1+i);
    %         k1 = k1 + 1;
    %     end
    % end
    % features(k,:) = features(k,:)/norm(features(k,:),2);
    % end

    %% SIFT-like features
    H_f = fspecial('gaussian',[16 16],1);
    image = imfilter(image,H_f);
    x_int = fix(x);
    y_int = fix(y);
    section = feature_width/4;
    start = -feature_width/2+1;
    sigma_w = 8;
    for q = 1:size(x,1)
        num3 = 1;
        for k1 = 1:4
            for k2 = 1:4
                num = 1;
                for i = (start+section*(k1-1)):(start+section*k1-1)
                    for j = (start+section*(k2-1)):(start+section*k2-1)
                        % gradient magnitude from central differences
                        m(num) = sqrt((image(y_int(q)+i+1,x_int(q)+j)-...
                            image(y_int(q)+i-1,x_int(q)+j))^2+(image(y_int(q)+...
                            i,x_int(q)+j+1)-image(y_int(q)+i,x_int(q)+j-1))^2);
                        % Gaussian weight: down-weight gradients far from the
                        % window center (the 1/(2*pi*sigma^2) constant is
                        % omitted since the descriptor is normalized below)
                        weight = exp(-((y_int(q)+i-y(q))^2+(x_int(q)+j-x(q))^2)/(2*sigma_w^2));
                        m(num) = m(num)*weight;
                        % orientation mapped from [-pi,pi] to bin coordinate [1,9]
                        theta(num) = (atan2((image(y_int(q)+...
                            i+1,x_int(q)+j)-image(y_int(q)+i-1,x_int(q)+j)),(image(y_int(q)+i,x_int(q)+j+1)-...
                            image(y_int(q)+i,x_int(q)+j-1)))+pi)/(2*pi)*8+1;
                        num = num+1;
                    end
                end
                histo(num3,:) = zeros(1,9);
                for num2 = 1:16
                    % soft binning: split each magnitude between the two
                    % nearest orientation bins to reduce boundary effects
                    bin = fix(theta(num2));
                    frac = theta(num2) - bin;
                    histo(num3,bin) = histo(num3,bin) + (1-frac)*m(num2);
                    if bin < 9
                        histo(num3,bin+1) = histo(num3,bin+1) + frac*m(num2);
                    end
                end
                % bin 9 wraps around to bin 1 (orientation is periodic)
                histo(num3,1) = histo(num3,1) + histo(num3,9);
                num3 = num3+1;
            end
        end
        m_histo = histo(:,1:8);   % drop the wrap-around bin
        final_histo(q,:) = m_histo(:)'/norm(m_histo(:)',2);
        % clip large components for illumination robustness, then renormalize
        final_histo(q,final_histo(q,:)>0.17) = 0.15;
        final_histo(q,:) = final_histo(q,:)/norm(final_histo(q,:),2);
    end
    features = final_histo;
end
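
As a quick sanity check on the descriptor (the variable names here are hypothetical), each interest point should yield a 4x4x8 = 128-dimensional unit-length vector:


% sanity check: 4x4 cells x 8 orientations = 128-D, L2-normalized rows
% (variable names hypothetical; x, y come from the detector above)
features = get_features(image, x, y, 16);
assert(size(features,2) == 128);
norms = sqrt(sum(features.^2, 2));
disp(max(abs(norms - 1)));  % should be close to 0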

Local Feature Matching

I use standard nearest-neighbor search for feature matching. The confidence is computed as the ratio between the distance to the nearest neighbor and the distance to the second nearest neighbor (the ratio test; lower is better). Furthermore, I use PCA for feature dimensionality reduction and a kd-tree to speed up the matching. My code for match_features is shown as follows:


% match local features
function [matches, confidences] = match_features(features1, features2)
    num_features = min(size(features1, 1), size(features2,1));
    matches = zeros(num_features, 2);

    %% PCA
    % low_dim = 70;
    % [~,new_features] = pca([features1;features2]);
    % new_features1 = new_features(1:size(features1,1),1:low_dim);
    % new_features2 = new_features(size(features1,1)+1:end,1:low_dim);
    % features1 = new_features1;
    % features2 = new_features2;

    %% nearest neighbor distance ratio (kd-tree accelerated)
    % query from the smaller feature set into the larger one
    if size(features1,1) == num_features
        [idx,D] = knnsearch(features2,features1,'K',2,'NSMethod','kdtree');
        matches(:,1) = [1:num_features];
        matches(:,2) = idx(:,1);
        confidences = D(:,1)./D(:,2); % ratio test
    else
        [idx,D] = knnsearch(features1,features2,'K',2,'NSMethod','kdtree');
        matches(:,2) = [1:num_features];
        matches(:,1) = idx(:,1);
        confidences = D(:,1)./D(:,2); % ratio test
    end

    %% naive approach
    % if size(features1,1) == num_features
    %     for i = 1:size(features1,1)   
    %         for j = 1:size(features2,1)
    %             dist(i,j) = sqrt(sum((features1(i,:)-features2(j,:)).^2));
    %         end
    %     end
    %     [dist_min,idx_min] = min(dist,[],2);
    %     [sortval,idx] = sort(dist_min,'ascend');
    %     matches(:,1) = idx;
    %     matches(:,2) = idx_min(idx);
    %     confidences = 1./(sortval+1);
    % end

    % keep the 100 most confident matches (smallest distance ratios)
    [confidences, ind] = sort(confidences, 'ascend');
    top100 = ind(1:min(100,numel(ind)));
    matches = matches(top100,:);
    confidences = confidences(1:numel(top100));
end
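
The PCA step above is left commented out in the listing; when enabled, it projects both descriptor sets onto a shared low-dimensional basis before the kd-tree search. A minimal sketch of that step (low_dim = 70 is the tuning choice from the comments; requires the Statistics and Machine Learning Toolbox):


% project both feature sets onto a shared PCA basis before matching
% (a sketch of the commented-out step above; low_dim is a tuning choice)
low_dim = 70;
[~, score] = pca([features1; features2]);  % scores of the stacked data
features1_pca = score(1:size(features1,1), 1:low_dim);
features2_pca = score(size(features1,1)+1:end, 1:low_dim);
% the kd-tree search now runs on 70-D instead of 128-D vectors
[idx, D] = knnsearch(features2_pca, features1_pca, 'K', 2, 'NSMethod', 'kdtree');

Since the leading principal components capture most of the descriptor variance, the projection tends to preserve the nearest-neighbor structure while making each distance computation cheaper.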

For the Notre Dame example, the matching is quite successful, partly because this pair of images is easy to match. The visual results of the matching are shown as follows:

Some problems that I ran into

  1. Without careful parameter tuning, I got relatively low performance. At first I thought an implementation error caused the low accuracy, but after carefully examining my code I found no major mistakes. Once I carefully tuned the parameters (especially the threshold for interest point generation), I obtained satisfactory accuracy. This suggests the matching pipeline is not very robust to the choice of parameters.
  2. I found no obvious performance boost when using adaptive non-maximum suppression; sometimes the accuracy even drops. My guess is that when the interest points are distributed more uniformly, the local features become less discriminative for matching, so I may need to increase the discriminative power of the local features to get higher accuracy.

Extra experiments

I also tested my implementation on the Mount Rushmore example and obtained 98% matching accuracy among the top 100 most confident matches. The results are shown as follows:

However, since I have not implemented the scale-invariant part of the pipeline, the accuracy on the Episcopal Gaudi example is still not very satisfactory.