Project 3 / Camera Calibration and Fundamental Matrix Estimation with RANSAC

This project involved 3 major tasks:

  1. Estimating the projection matrix of a camera, which maps 3-dimensional world coordinates to 2-dimensional image coordinates, and finding the world coordinates of the camera's center.
  2. Estimating the fundamental matrix which facilitates finding corresponding points between two images in epipolar geometry.
  3. Estimating a fundamental matrix for the correspondences of two images using the RANSAC model-fitting algorithm.

Part 1

The first part of the project required finding the projection matrix of a camera given some 3-D world coordinates and their corresponding homogeneous image coordinates. The values of the matrix can be found by solving, in the least-squares sense, the linear equations that result from the matrix multiplication mapping each homogeneous world point to its image point.

In order to find the values in the matrix M, I had to set up a series of equations in MATLAB which took into account several image-world correspondences:

This required setting the value of m34 to 1 and solving the regression for the remaining eleven values. After multiplying the matrix by a scalar to match the scale expected for the assignment, I got the following matrix:

The total residual (error) from the difference between projected 2D locations and actual 2D locations in the image was small: 0.0445.
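The original MATLAB code is not reproduced in this report. As a rough illustration only, the least-squares setup with m34 fixed to 1 can be sketched in Python with NumPy. The function names, the argument layout (N x 2 image points and N x 3 world points), and the choice of summing Euclidean distances for the residual are assumptions made for this sketch and may differ from the original implementation.

```python
import numpy as np

def estimate_projection_matrix(points_2d, points_3d):
    """Least-squares estimate of the 3x4 matrix M with m34 fixed to 1 (sketch)."""
    points_2d = np.asarray(points_2d, dtype=float)
    points_3d = np.asarray(points_3d, dtype=float)
    n = points_3d.shape[0]
    A = np.zeros((2 * n, 11))
    b = np.zeros(2 * n)
    for i in range(n):
        u, v = points_2d[i]
        X, Y, Z = points_3d[i]
        # u * (m31 X + m32 Y + m33 Z + 1) = m11 X + m12 Y + m13 Z + m14
        A[2 * i] = [X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z]
        # v * (m31 X + m32 Y + m33 Z + 1) = m21 X + m22 Y + m23 Z + m24
        A[2 * i + 1] = [0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z]
        b[2 * i] = u
        b[2 * i + 1] = v
    m, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Append the fixed m34 = 1 and reshape row by row into 3x4.
    return np.append(m, 1.0).reshape(3, 4)

def total_residual(M, points_2d, points_3d):
    """Sum of distances between projected and observed image points (assumed definition)."""
    points_2d = np.asarray(points_2d, dtype=float)
    points_3d = np.asarray(points_3d, dtype=float)
    homog = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    proj = homog @ M.T
    proj = proj[:, :2] / proj[:, 2:3]  # divide by the homogeneous coordinate
    return float(np.sum(np.linalg.norm(proj - points_2d, axis=1)))
```

At least six correspondences in general position are needed, since each one contributes two equations toward the eleven unknowns.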

The camera center C in world coordinates was calculated by solving the equation C = -Q^(-1) * m4, where Q consists of the first three columns of the matrix M and m4 is its last column. The resulting camera center was:
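The camera-center computation described in this part can be sketched as follows. This is a minimal NumPy illustration, not the original MATLAB code, and the function name `camera_center` is hypothetical. It relies on the fact that the center C projects to the zero vector, so Q*C + m4 = 0.

```python
import numpy as np

def camera_center(M):
    """World coordinates of the camera center for a 3x4 projection matrix M (sketch)."""
    M = np.asarray(M, dtype=float)
    Q = M[:, :3]   # first three columns
    m4 = M[:, 3]   # last column
    # Solve Q C = -m4 rather than forming the inverse of Q explicitly.
    return -np.linalg.solve(Q, m4)
```

Because the result does not change when M is multiplied by a nonzero scalar, the rescaling applied to M for the assignment does not affect the center.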

Part 2

The next part of the project involved implementing the algorithm that estimates the fundamental matrix, which relates corresponding points between two images. The matrix was computed by solving the regression that results from the fundamental matrix definition, which requires x'*F*x^T = 0 for every pair of corresponding points x and x' written as homogeneous row vectors:

These equations were solved for every pair of corresponding points. After solving for the f values and making the matrix rank 2, the resulting fundamental matrix for a sample pair of images was:

The epipolar lines in each image computed from the fundamental matrix can be visualized as follows:
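The estimation step described in this part can be sketched in Python with NumPy. This is an illustration under stated assumptions rather than the original MATLAB code: the function names are hypothetical, points are N x 2 arrays, and the homogeneous system is solved by taking the smallest right singular vector, which may differ from the regression actually used. The code uses column vectors, so the constraint is written x_b^T F x_a = 0.

```python
import numpy as np

def estimate_fundamental_matrix(points_a, points_b):
    """Unnormalized eight-point estimate of F with x_b^T F x_a = 0 (sketch)."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    u, v = a[:, 0], a[:, 1]
    up, vp = b[:, 0], b[:, 1]
    # One row per correspondence; unknowns are f11, f12, ..., f33 in row order.
    A = np.column_stack([up * u, up * v, up,
                         vp * u, vp * v, vp,
                         u, v, np.ones(len(a))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt

def epipolar_line_in_b(F, point_a):
    """Coefficients (a, b, c) of the line a*u + b*v + c = 0 in image b."""
    return F @ np.array([point_a[0], point_a[1], 1.0])
```

At least eight correspondences are required. A point in image a maps to a line in image b, and the matching point should lie on or near that line.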


Part 3

The last part of the project combined the algorithm implemented in Part 2 with the RANSAC model-fitting algorithm to find a good fundamental matrix for an image pair, one that separates good keypoint matches from bad ones.

The algorithm used to find the best fundamental matrix was as follows:

  1. Eight pairs of points were selected randomly from a list of corresponding keypoint pairs from two images.
  2. The pairs were used as arguments for the algorithm devised in the previous part, and a fundamental matrix was generated for finding corresponding pairs.
  3. All of the pairs from the images were then tested against the candidate fundamental matrix. This was done by plugging each pair into the expression x'*F*x^T. If the absolute value of the result was within a particular threshold (for example, 0.02), that point pair was considered an inlier, meaning that the fundamental matrix had deemed it a "good" keypoint match.
  4. The steps above were repeated, and the fundamental matrix which generated the greatest number of inliers after 2000 iterations was considered the best matrix for finding inliers.
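The four steps above can be sketched in Python with NumPy. This is a hedged illustration, not the original MATLAB code: the function names, the `seed` parameter, and the SVD-based eight-point helper are assumptions. Because F is only defined up to scale, the value of x'*F*x^T depends on how F is scaled. The sketch keeps F at the scale produced by the SVD (Frobenius norm at most 1), so the 0.02 threshold used in this report may not carry over directly.

```python
import numpy as np

def eight_point(a, b):
    """Unnormalized eight-point estimate with x_b^T F x_a = 0 (helper for the sketch)."""
    u, v, up, vp = a[:, 0], a[:, 1], b[:, 0], b[:, 1]
    A = np.column_stack([up * u, up * v, up, vp * u, vp * v, vp,
                         u, v, np.ones(len(a))])
    _, _, Vt = np.linalg.svd(A)
    U, S, Vt2 = np.linalg.svd(Vt[-1].reshape(3, 3))
    S[2] = 0.0  # enforce rank 2
    return U @ np.diag(S) @ Vt2

def ransac_fundamental_matrix(matches_a, matches_b, threshold=0.02,
                              iterations=2000, seed=None):
    """Return the F with the most inliers and a boolean inlier mask (sketch)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(matches_a, dtype=float)
    b = np.asarray(matches_b, dtype=float)
    n = len(a)
    ha = np.hstack([a, np.ones((n, 1))])
    hb = np.hstack([b, np.ones((n, 1))])
    best_F, best_mask, best_count = None, np.zeros(n, dtype=bool), -1
    for _ in range(iterations):
        # Step 1: pick eight distinct correspondences at random.
        idx = rng.choice(n, size=8, replace=False)
        # Step 2: fit a candidate fundamental matrix to them.
        F = eight_point(a[idx], b[idx])
        # Step 3: evaluate |x_b^T F x_a| for every pair and threshold it.
        err = np.abs(np.sum((hb @ F) * ha, axis=1))
        mask = err < threshold
        # Step 4: keep the candidate with the most inliers so far.
        if mask.sum() > best_count:
            best_F, best_mask, best_count = F, mask, int(mask.sum())
    return best_F, best_mask
```

Passing a fixed `seed` makes a run repeatable, which is convenient when comparing thresholds across trials.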

Using a low threshold had the benefit of only yielding inliers with a very low error, but this also resulted in the elimination of many pairs which in fact were still rather good keypoint matches.

To find the best threshold, I used a Mount Rushmore image pair to perform ten trials for each of three thresholds: 0.002, 0.01, and 0.02. About 825 possible keypoint matches were found for this pair. I expected that a good threshold would be one for which the best matrix classified close to half of those correspondences as inliers.

Here are the results of the experimentation:

Threshold   0.002             0.01               0.02
Trial 1     66                213                353
Trial 2     67                250                415
Trial 3     62                221                405
Trial 4     68                277                339
Trial 5     80                223                370
Trial 6     53                241                428
Trial 7     76                210                433
Trial 8     66                197                426
Trial 9     75                244                451
Trial 10    124               247                473
Average     73.7 (about 74)   232.3 (about 232)  409.3 (about 409)

Example results for each threshold, with 30 of the inliers being displayed:

[Images: example inlier matches for thresholds 0.002, 0.01, and 0.02]

Another reason the fundamental matrix from a 0.002 threshold would not work is that the way its epipolar lines radiate from the epipoles suggests forward camera motion between the two views, which is clearly not the case.

From the results above, a threshold of 0.02 yields about 409 inliers on average, which is nearly half of the roughly 825 candidate matches for the Mount Rushmore pair. Therefore, this was the threshold I went with.

Here are the resulting epipolar lines and correspondences for the fundamental matrices of several image pairs. The epipolar lines were good for all pairs except the Gaudi pair, because the eight-point algorithm was not normalized.

[Images: epipolar lines and keypoint correspondences for each image pair]
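The implementation used for these results did not normalize the point coordinates. For reference, a common way to add that step is sketched below in Python with NumPy: each image's points are translated to their centroid and scaled so that the mean distance from the origin is sqrt(2), F is estimated in the normalized coordinates, and the result is mapped back. The function names are hypothetical, and this sketch was not run on the Gaudi pair, so no claim is made about how much it would improve those results.

```python
import numpy as np

def normalization_transform(points):
    """3x3 similarity T: centroid to origin, mean distance to sqrt(2)."""
    points = np.asarray(points, dtype=float)
    cx, cy = points.mean(axis=0)
    mean_dist = np.mean(np.linalg.norm(points - [cx, cy], axis=1))
    s = np.sqrt(2.0) / mean_dist
    return np.array([[s, 0.0, -s * cx],
                     [0.0, s, -s * cy],
                     [0.0, 0.0, 1.0]])

def normalized_eight_point(points_a, points_b):
    """Eight-point estimate of F computed in normalized coordinates (sketch)."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    n = len(a)
    Ta = normalization_transform(a)
    Tb = normalization_transform(b)
    # Apply the transforms to homogeneous points (stored as rows).
    ha = np.hstack([a, np.ones((n, 1))]) @ Ta.T
    hb = np.hstack([b, np.ones((n, 1))]) @ Tb.T
    A = np.column_stack([hb[:, 0] * ha[:, 0], hb[:, 0] * ha[:, 1], hb[:, 0],
                         hb[:, 1] * ha[:, 0], hb[:, 1] * ha[:, 1], hb[:, 1],
                         ha[:, 0], ha[:, 1], np.ones(n)])
    _, _, Vt = np.linalg.svd(A)
    U, S, Vt2 = np.linalg.svd(Vt[-1].reshape(3, 3))
    S[2] = 0.0  # enforce rank 2 in the normalized frame
    F_norm = U @ np.diag(S) @ Vt2
    # Undo the normalization: x_b^T (Tb^T F_norm Ta) x_a = 0.
    return Tb.T @ F_norm @ Ta
```

The returned matrix applies to the original pixel coordinates, so it can be used wherever the unnormalized estimate was used.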