Project 3: Camera Calibration and Fundamental Matrix Estimation with RANSAC

Part I: Camera Projection Matrix

The first part of this project involved estimating the projection matrix M that maps 3d points (world coordinates) to 2d points (image coordinates) for the scene shown below.

Scene with markers for each point of interest.

2D image coordinates (normalized).

3D world coordinates (normalized).

I used the given 2d to 3d point correspondences to set up a system of linear equations and solve for the 12 unknown components of the projection matrix. Solving the system directly (by Gaussian Elimination) gives a trivial solution (x = 0), so I instead used SVD to solve the constrained optimization problem (as discussed in lecture).

The MATLAB code snippet below demonstrates solving the constrained optimization problem.
[U, S, V] = svd(A);
M = V(:, end);
M = reshape(M, [4 3])'';

Reshaping the solution vector as a 3 x 4 matrix, I got the following projection matrix M for the given normalized points:

For validation, I checked the "residual" (sum of squared differences) between the projected 2d locations of each 3d point and the ground truth 2d locations. This value was 0.0445, indicating an accurate estimation of the projection matrix.

2d Projection of 3d Points vs. Actual 2d Points

After finding an accurate estimation of the projection matrix M, I used M to estimate the camera center in world coordinates. The camera center is represented by the red cross in the plot below.

Estimated camera center is <-1.5127, -2.3517, 0.2826>.


Part II: Fundamental Matrix Estimation

The goal for this part of the project was to estimate the fundamental matrix F relating the two camera perspectives shown below.

I used the given point correspondences (between the two perspectives) to set up a system of linear equations and solve for the fundamental matrix F. This method requires a minimum of 8 point correspondences (hence is called the "8-point algorithm"), however all 20 of the provided correspondences are ground truth, so overconstraining the problem does not degrade the solution. The code snippet below demonstrates solving the regression to find F, and reducing the rank of the solution to 2.

% Solve for fundamental matrix.
[U, S, V] = svd(A);
f = V(:, end);
F_matrix = reshape(f, [3 3])'';

% Enforce singularity constraint.
[U, S, V] = svd(F_matrix);
S(3, 3) = 0;
F_matrix = U * S * V';

This algorithm yielded the following estimation of the fundamental matrix F for this image pair:

Next I checked the fundamental matrix estimation by plotting the epipolar lines for each image. Note that all of the points are intersected by epipolar lines (in blue), indicating an accurate estimation of the fundamental matrix.


Part III: Fundamental Matrix with RANSAC

The fundamental matrix is straighfoward to estimate when reliable point correspondences are given (as in part 2), however in practice, the point correspondences are unreliable and least squares regression alone is not sufficient. For the third part of this project, I used the RANSAC algorithm in conjunction with the 8-point algorithm to estimate the fundamental matrix from noisy data.

The number of RANSAC interations was determined using the formula below (from Szeliski 6.1.4), where S is the number of iterations, P is the total probability of successfully estimating the matrix (after S iterations), and pk is the likelihood in one iteration that all k random samples are inliers:

I used P = 0.99 and k = 8. The probability of a correspondence being an inlier (p) is dependent on the image pair, and estimating its value is outside of the scope of this project, so I used p = 0.25 as a lower bound. A smaller value of p requires more iterations, so I used S = 10,000 as an upper bound for the total number of RANSAC iterations.

RANSAC Results

Mount Rushmore

Input image pair: two perspectives of Mount Rushmore.


Point correspondences found using SIFT features and nearest neighbor matching (825 total).

Epipolar lines from estimated fundamental matrix.

Point correspondences that "agree" with the estimated fundamental matrix (175 total).
Note that the fundamental matrix has filtered out most of the incorrect correspondences (as well as many of the correct ones).

Notre Dame

Input image pair: two perspectives of Notre Dame.


Point correspondences found using SIFT features and nearest neighbor matching (851 total).

Epipolar lines from estimated fundamental matrix.

Correspondences that "agree" with the estimated fundamental matrix (199 total).
This image pair is difficult because the keypoint matches are mostly planar and thus the fundamental matrix is not well constrained.

Episcopal Gaudi

Input image pair: two perspectives of Episcopal Gaudi.


Point correspondences found using SIFT features and nearest neighbor matching (1062 total).

Epipolar lines from estimated fundamental matrix.

Correspondences that "agree" with the estimated fundamental matrix (234 total).

Woodruff Dorm

Input image pair: two perspectives of Woodruff Dorm.


Point correspondences found using SIFT features and nearest neighbor matching (185 total).

Epipolar lines from estimated fundamental matrix.

Correspondences that "agree" with the estimated fundamental matrix (44 total).