Project 3 / Camera Calibration and Fundamental Matrix Estimation with RANSAC

Overview

This project demonstrates how to

We will explore each part and evaluate the performance of the algorithms used.

1. Camera projection matrix

The camera projection matrix contains the camera parameters such as the focal point or the camera center. This matrix satisfies the equation:

To solve for this, we rewrite the equation as:

Once we have the matrix A set up, we can easily solve for x using SVD (singular value decomposition).

After we've got the camera projection matrix, solving for the camera center is child's play as it can be directly computed by: Center = -inv(M(:,1:3))*M(:,4)

Performance: Fast and accurate (the results closely match the provided values)

The code was written with no loops to take advantages of MATLAB's efficiency in matrix operations

2. Fundamental matrix estimation

Suppose we're given a set of correct correspondences between two images, how can we find the fundamental matrix that transforms one point to another?

To do this, we first write down the equation:

Next, we solve for F using SVD again.

Now, because the fundamental matrix is a rank 2 matrix. We need to reduce the rank of F. To do this, we decompose F into U(Sigma)V, clear out the smallest singular value in Sigma (setting it to zero) and multiply them back together for the rank-2 matrix F.

Doing so we will find the fundamental matrix.

Performance: Fast and accurate (all the epipolar lines cross (or almost cross) through the corresponding points.

There's not much to improve here. The code was very efficient as it is right now.

3. Find fundamental matrix with unreliable matches

In part 2, we were able to find the fundamental matrix with ease because we were given an accurate set of matching points. But we soon find out that this doesn't happen often in the real world. Usually we have to find the correspondences using some kind of algorithm like SIFT, but they're not 100% accurate. However, they are good enough that we can use geometric constraints to figure out the set of correct correspondences, or what we call inliers. Before determining what is inlier or outlier, however, we must first calculate our fundamental matrix. Because there may be outliers in the mix, we use a trial-and-error process called RANSAC to find out which fundmental matrix gives us the most number of inliers.

The steps of RANSAC is straightforward:

Note that the algorithm uses 3 important parameters to define (1) when we should stop and (2) what we considered to be inliers. These parameters are placed at the top of the code:

Performance: Near-perfect accuracy for the first two image pairs (Mount Rushmore and Notre Dame)

The algorithm doesn't perform so well against the remaining two pairs. The fundamental matrix can be calculated incorrectly due to the following reasons:

This problem can be somewhat mitigated by normalizing the coordinates. However, this was not implemented and thus I won't go into detail here.

Results

Mount Rushmore: Found 825 possibly matching features. Good.

Notre Dame: Found 851 possibly matching features. Good.

Episcopal Gaudi: Found 1062 possibly matching features. Bad.

Woodruff Dorm: Found 887 possibly matching features. Bad.