Project 2: Local Feature Matching

The aim here is to estimate the camera projection matrix, which maps 3D world coordinates to image coordinates, as well as the fundamental matrix, which relates points in one scene to epipolar lines in another. It involves the following 3 parts:

  1. Estimating Camera Projection Matrix
  2. Fundamental Matrix Estimation
  3. estimating the fundamental matrix with unreliable SIFT matches using RANSAC

Algorithm Description

Part 1: Camera Projection Matrix

Equation for moving from 3D world to 2D camera coordinates is:

We first set up the least squares regression to solve for M given the corresponding normalized points. The equations for elements of M and the corresponding matrix representations are as follows:

Computer Man Computer Man

I used Singular Value Decomposition to solve the above equation for M by reshaping the matrix V (representing the right singular vectors). To estimate the camera center in world coordinates, I simply used the following equation.

Computer Man

Part 2: Fundamental Matrix Estimation

Using similar procedure as part 1, we set up the equation to solve for F and solve it using Singular Value Decomposition.

Computer Man

Since the fundamental matrix is a rank 2 matrix, we must reduce rank of the F mtrix obtained. In order to do this we decompose F using singular value decomposition, set the smallest singular value to zero and recalculate F .

Extra Credit

Normalizing the coordinates before computing the fundamental matrix produces better results as seen in the Results section. The transform matrix T is the product of the scale and offset matrices. To compute a scale s I estimated the standard deviation after substracting the means and took its reciprocal, adjusted the fundamental matrix so that it can operate on the original pixel coordinates.

Part 3: Fundamental Matrix with RANSAC

We obtain point correspondences using SIFT and use RANSAC to find the "best" fundamental matrix. I iteratively chose 8 of point correspondences randomly, solved for the fundamental matrix using the function in part II, and then count the number of inliers. To count the number of inliers the metric I used is:

Using similar procedure as part 1, we set up the equation to solve for F and solve it using Singular Value Decomposition.

If the distance is less than threshold, the matches are kept as inliers. We count the number of inliers for each F matrix obtained from various random combinations of point correspondences and choose the best fundamental matrix as the one which has maximum number of inliers.

Example of code with highlighting


%example code- calculate_projection_matrix
A(2*i-1,:)=[Points_3D(i,:), 1 0 0 0 0, - Points_2D(i,1)*Points_3D(i,1),- Points_2D(i,1)*Points_3D(i,2),
	- Points_2D(i,1)*Points_3D(i,3),- Points_2D(i,1)];
A(2*i,:)=[0 0 0 0,Points_3D(i,:), 1, - Points_2D(i,2)*Points_3D(i,1),- Points_2D(i,2)*Points_3D(i,2),
	- Points_2D(i,2)*Points_3D(i,3),- Points_2D(i,2)];


%example code- estimate_fundamental_matrix (Reducing the rank)
[U, S, V] = svd(F_matrix);
EigenValues=[S(1,1),S(2,2),S(3,3)];
[Min, Index]=min(EigenValues);
S(Index,Index)=0;
F_matrix=U*S*V';


%example code- ransac_fundamental_matrix
Distance=b*F_matrix*a;
        if abs(Distance)<=threshold
            inlier_count=inlier_count+1;
            curr_inliers_a(inlier_count,:)=matches_a(i,:);
            curr_inliers_b(inlier_count,:)=matches_b(i,:);
	end

RESULTS

PART 1

The projection matrix generated is:

Computer Man

The total residual is: <0.0445>

The estimated location of camera is: <-1.5127, -2.3517, 0.2826>

Computer Man

PART 2

F matrix without normalizing the coordinates:

Computer Man Computer Man

Epipolar lines (before normalizing coordinates)

F matrix after normalizing the coordinates:

Computer Man

PART 3

Before Normalizing coordinates: Output for threshold = 0.01,0.001,0.0005

As observed from the following results, the output improves after normalizing the coordinates before computing the fundamental matrix.

Computer Man

Episcopal Gaudi: Threshold= 0.01 , Inliers= 314

Computer Man

Episcopal Gaudi: Threshold= 0.001 , Inliers= 54

Computer Man

Episcopal Gaudi: Threshold= 0.0005 , Inliers= 39

After Normalizing coordinates: Output for threshold = 0.01,0.001,0.0005

Computer Man

Episcopal Gaudi: Threshold= 0.01 , Inliers= 452

Computer Man

Episcopal Gaudi: Threshold= 0.001 , Inliers= 107

Computer Man

Episcopal Gaudi: Threshold= 0.0005 , Inliers= 83

Correspondence and Episcopal lines for Gaudi

Computer Man

Episcopal Gaudi: Threshold= 0.0015 , Inliers= 452

Computer Man

Episcopal Gaudi: Episcopal Lines for Threshold= 0.0015

Effect of normalizing coordinates on Notre Dame

Computer Man

Notre Dame: Threshold= 0.01 , Inliers= 252

Computer Man

Notre Dame: Episcopal Lines for Threshold= 0.01 , Inliers= 252

Computer Man

Notre Dame: Threshold= 0.001 , Inliers= 179

Computer Man

Notre Dame: Episcopal Lines for Threshold= 0.001 , Inliers= 252

Effect of varying threshold on Mount Rushmore

Computer Man

Mount Rushmore: Threshold VS Inliers

Computer Man

Mount Rushmore: Threshold= 0.0005

Computer Man

Mount Rushmore: Episcopal Lines for Threshold= 0.0005

Computer Man

Mount Rushmore: Threshold= 0.0001

Computer Man

Mount Rushmore: Episcopal Lines for Threshold= 0.0001