Project 3 / Camera Calibration and Fundamental Matrix Estimation with RANSAC

Part 1: Camera Matrix Estimation

This section of the project was relatively straightforward. Setting up the least-squares problem as described in the project description and solving it was quick; with the correct preprocessing, it was nearly identical to the formulation provided in the project files. SVD was chosen to solve the homogeneous system because it is relatively fast and not prone to numerical precision problems.


% Projection Matrix
X = Points_3D(:,1);
Y = Points_3D(:,2);
Z = Points_3D(:,3);
u = Points_2D(:,1);
v = Points_2D(:,2);

A = zeros(2*size(Points_2D, 1), 12);
for i=1:size(Points_2D, 1)
    A(2*i-1,:) = [X(i), Y(i), Z(i), 1, 0,    0,    0,    0, -u(i)*X(i), -u(i)*Y(i), -u(i)*Z(i), -u(i)];
    A(2*i,:)   = [0,    0,    0,    0, X(i), Y(i), Z(i), 1, -v(i)*X(i), -v(i)*Y(i), -v(i)*Z(i), -v(i)];
end

% The solution is the right singular vector with the smallest singular value.
[~, ~, V] = svd(A);
M = V(:,end);
M = reshape(M, [], 3)';   % rearrange the 12-vector into the 3x4 matrix M

% Camera Center estimation
Q = M(1:3,1:3);
m4 = M(:,4);
Center = -Q\m4;

This code results in the correct normalized matrix M and camera center values as described in the project description.
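As a quick sanity check on the center computation, the recovered center should project to the zero vector under M. Here is a small NumPy sketch of the same calculation (the matrix values are invented for illustration, and Python stands in for the MATLAB above):

```python
import numpy as np

# Hypothetical 3x4 projection matrix (values invented for illustration).
M = np.array([[ 0.5, -0.1,  0.2, 1.0],
              [ 0.1,  0.4, -0.3, 2.0],
              [-0.2,  0.3,  0.6, 0.5]])

Q, m4 = M[:, :3], M[:, 3]
center = -np.linalg.solve(Q, m4)   # same as MATLAB's -Q\m4

# The camera center maps to the zero vector in homogeneous coordinates.
residual = M @ np.append(center, 1.0)
print(np.allclose(residual, 0))    # True
```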

Part 2: Fundamental Matrix Estimation

This part of the project was also a relatively straightforward linear least-squares solution. There is the added complexity of needing to zero the third singular value to enforce the rank-2 constraint, but that is counteracted by the simpler least-squares problem being solved. Once again SVD is used, twice this time: once to solve the system and once to enforce the rank constraint. This results in a relatively quick runtime, which is important since this code is run for roughly 15,000 iterations by Part 3.


% Fundamental Matrix Estimation
U = Points_a(:,1);
V = Points_a(:,2);
u = Points_b(:,1);
v = Points_b(:,2);

A = zeros(size(Points_a, 1), 9);
for i=1:size(Points_a, 1)
    A(i,:) = [U(i) * u(i), V(i) * u(i), u(i), U(i) * v(i), V(i) * v(i), v(i), U(i), V(i), 1];
end
% The least-squares solution is the right singular vector of A with the
% smallest singular value.
[~, ~, W] = svd(A);
F = W(:,end);
F = reshape(F, [], 3)';

% Enforce the rank-2 constraint by zeroing the third singular value.
[S1, D, S2] = svd(F);
D(3,3) = 0;
F_matrix = S1*D*S2';
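The rank-2 enforcement step is worth a closer look: zeroing the third singular value yields the closest rank-2 matrix in Frobenius norm, and rank 2 is what makes the epipoles well defined. A minimal NumPy sketch of that step (the input matrix is a random stand-in for the unconstrained least-squares solution):

```python
import numpy as np

rng = np.random.default_rng(0)
F_full = rng.standard_normal((3, 3))   # stand-in for the unconstrained LS solution

# Zero the smallest singular value to get the nearest rank-2 matrix.
U, s, Vt = np.linalg.svd(F_full)
s[2] = 0.0
F = U @ np.diag(s) @ Vt

print(np.linalg.matrix_rank(F))        # 2
print(abs(np.linalg.det(F)) < 1e-10)   # True: F is now singular
```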

Part 3: Fundamental Matrix with RANSAC

Here is the basic code for running RANSAC. It preprocesses the points so that they are only modified once (for efficiency), and then attempts to find a 'good' fundamental matrix with many inliers. The cutoff value of `.1` was found by examining descriptive statistics of the error on the Mt. Rushmore image pair; it is roughly half the standard deviation of the error.

After preprocessing, it runs for 15,000 iterations. This count was calculated assuming a 90% desired success rate and an inlier ratio of 33%.
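For reference, the standard RANSAC iteration formula is N = log(1 - p) / log(1 - w^s), where p is the desired success probability, w the inlier ratio, and s the sample size. Plugging in the assumptions above (taking the 33% as w = 1/3) recovers roughly the iteration count used:

```python
import math

p = 0.90     # desired probability of drawing at least one all-inlier sample
w = 1 / 3    # assumed inlier ratio (the "33%" above)
s = 8        # points per fundamental-matrix sample

n = math.log(1 - p) / math.log(1 - w**s)
print(math.ceil(n))   # about 15,100, in line with the 15,000 iterations used
```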

At each iteration, a random sample of the points is used to calculate a potential Fundamental Matrix and then the number of inliers is calculated for this matrix. The best matrix from all attempts is taken. The inliers from this matrix are then found and reported.


% Convert the matched points to homogeneous coordinates once, up front.
As = [matches_a, ones(size(matches_a, 1), 1)];
Bs = [matches_b, ones(size(matches_b, 1), 1)];
cutoff = .1;

% Fallback estimate in case no sample produces any inliers.
Best_Fmatrix = estimate_fundamental_matrix(As(1:10,:), Bs(1:10,:));

% RANSAC
best_inliers = 0;
for i = 1:15000
    samples = randsample(size(matches_a, 1), 8);
    fmatrix = estimate_fundamental_matrix(As(samples,:), Bs(samples,:));
    num_inliers = 0;
    for k = 1:size(matches_a, 1)
        if abs(Bs(k,:) * fmatrix * As(k,:)') < cutoff   % epipolar residual |x_b' F x_a|
            num_inliers = num_inliers + 1;
        end
    end
    if num_inliers > best_inliers
        best_inliers = num_inliers;
        Best_Fmatrix = fmatrix;
    end
end

% Collect the indices of the inliers under the best matrix.
inlier_indices = [];
for k = 1:size(matches_a, 1)
    if abs(Bs(k,:) * Best_Fmatrix * As(k,:)') < cutoff
        inlier_indices = [inlier_indices k];
    end
end

Here's the result of this algorithm running on the Mt. Rushmore image pair. Every pair of points is a real correspondence; there do not seem to be any erroneous matches. A success!

If we look at the points in one image and the epipolar lines corresponding to those points from the other image, we get very good results as well. However, a closer look reveals that some of the points are not perfect correspondences; we can do better. While it is difficult to see, not every point lies perfectly along its corresponding epipolar line, although it takes a few levels of zoom to make that clear. (A clarification: since these images were created in MATLAB's figure viewer, the original lines and points were accurate to double precision, so enough zoom would have revealed any lack of correspondence; all of these images were captured at zoom levels of 2 or 3.)

Some other examples, namely Episcopal Gaudi and Notre Dame make these errors clearer.

The Gaudi example has some obvious problems, and there are a few mismatched pairs in Notre Dame: see the green line on the ground, and the purple line that connects the left and right towers across the two images. Feel free to open the images at higher resolution to see the unmatched or mismatched points more clearly.

Extra Credit: RANSAC with Normalization

We can do better. Normalization can improve our results significantly by improving the conditioning of the linear system we hand to the SVD. Normalization looks a bit like this:

Note that this code creates the transformation matrix from *all* points instead of just the 8 points for a given hypothetical F. This is both for efficiency reasons and because it improves the calculation of the mean and standard deviation for the population.


means_a = mean(matches_a);
means_b = mean(matches_b);
std_a = std(matches_a);
std_b = std(matches_b);

% Scale factor: the reciprocal of the RMS of the per-axis standard deviations.
scale_a = 1 / sqrt(sum(std_a.^2));
scale_b = 1 / sqrt(sum(std_b.^2));

% Each transform translates the mean to the origin, then scales.
Ta = [scale_a, 0, 0; 0, scale_a, 0; 0, 0, 1] * [1, 0, -means_a(1); 0, 1, -means_a(2); 0, 0, 1];
Tb = [scale_b, 0, 0; 0, scale_b, 0; 0, 0, 1] * [1, 0, -means_b(1); 0, 1, -means_b(2); 0, 0, 1];

As = zeros(size(matches_a, 1), 3);
Bs = zeros(size(matches_a, 1), 3);
for x = 1:size(matches_a, 1)
    As(x,:) = Ta * [matches_a(x,:) 1]';
    Bs(x,:) = Tb * [matches_b(x,:) 1]';
end
cutoff = .001;

This code works by creating a transform that normalizes the coordinates so that their mean is (0, 0) and their overall standard deviation is 1. The result is a better-conditioned system for the SVD to solve. Note two things: first, `F` is now defined in the transformed coordinate space, and second, our inlier/outlier cutoff is much smaller. To solve the first problem, we transform the matrix back into real coordinates at the end.


Best_Fmatrix = Tb' * Best_Fmatrix * Ta;
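This denormalization follows from the identity x_b' (Tb' F_n Ta) x_a = (Tb x_b)' F_n (Ta x_a): residuals measured in original pixel coordinates agree with residuals of the normalized points under the normalized-space F. A small NumPy check (the transforms, points, and F here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
F_norm = rng.standard_normal((3, 3))  # stand-in for the F found in normalized space

# Example normalizing transforms of the scale-then-translate form used above.
Ta = np.array([[0.01, 0, 0], [0, 0.01, 0], [0, 0, 1]]) @ \
     np.array([[1, 0, -320.0], [0, 1, -240.0], [0, 0, 1]])
Tb = np.array([[0.02, 0, 0], [0, 0.02, 0], [0, 0, 1]]) @ \
     np.array([[1, 0, -250.0], [0, 1, -205.0], [0, 0, 1]])

xa = np.array([400.0, 300.0, 1.0])  # points in original pixel coordinates
xb = np.array([260.0, 210.0, 1.0])

F_orig = Tb.T @ F_norm @ Ta
# Same residual whether we work in pixel space or normalized space.
print(np.isclose(xb @ F_orig @ xa, (Tb @ xb) @ F_norm @ (Ta @ xa)))  # True
```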

The results at first seem unexciting.

There's very little discernible difference; perhaps the points are better aligned, but it is nearly impossible to tell at any reasonable zoom level.

However, looking at other cases, the value of normalization becomes more apparent.

Normalization reduces the number of errors on the Notre Dame pair from 'a few' to 'none visible', and more drastically, moves the Gaudi pair from 'basically random' to 'practically perfect'. Once again, look at the full resolution images to see this more clearly.

Further Exploration: Modified RANSAC

After raising a question on Piazza about why RANSAC uses an inlier/outlier count instead of a continuous error metric, I decided to implement the alternative to see whether a softer error measure offered any improvement.

This error measure was implemented as follows:


low_error = inf;
for i = 1:15000
    samples = randsample(size(matches_a, 1), 8);
    fmatrix = estimate_fundamental_matrix(As(samples,:), Bs(samples,:));
    errors = zeros(size(matches_a, 1), 1);
    for k = 1:size(matches_a, 1)
        errors(k) = abs(Bs(k,:) * fmatrix * As(k,:)');
    end
    % Fitness metric: mean error over the 15% of points with the lowest error.
    errors = sort(errors);
    tot_error = mean(errors(1:round(size(matches_a, 1) * .15)));
    if tot_error < low_error
        low_error = tot_error;
        Best_Fmatrix = fmatrix;
    end
end

Note that it calculates the mean error among the 15% of points with the lowest error for each potential F matrix, and uses this as the metric to judge a matrix's fitness, as opposed to the number of inliers and outliers.
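To illustrate why this metric is insensitive to outliers, here is the truncated mean as a standalone NumPy function (the function name and test values are my own, not from the project code):

```python
import numpy as np

def truncated_mean_error(errors, frac=0.15):
    """Fitness metric from above: mean of the lowest `frac` fraction of residuals."""
    errors = np.sort(np.asarray(errors, dtype=float))
    k = max(1, round(frac * len(errors)))
    return errors[:k].mean()

# The outliers do not affect the score at all: only the best 15% of residuals count.
errs = [1.0, 2.0, 3.0, 4.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
print(truncated_mean_error(errs))  # 1.5
```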

Here are some of the images that resulted from this error metric, first unnormalized and then normalized.

Adding normalization, unsurprisingly, significantly improves the results, much like it did with the original RANSAC algorithm.

The big takeaway is that this metric produces a bit more error, which is not a good thing: there are a few more outlier points, as can be seen in the image above. Given this, I lowered the cutoff from 15% to 10% to see if that provided any improvement.

First unnormalized.

And normalized.

This last image has one erroneous point; however, you'll notice in the epipolar line visualization that the point matches the epipolar line almost perfectly, so it actually fits the estimated fundamental matrix rather well. This was simply an unlucky match rather than an error in the algorithm.

Given that this alternate algorithm is slower (it requires sorting the errors at each iteration) and provides no clear improvement in accuracy or in the required iteration count (the same underlying RANSAC loop is used), it does not seem like a major improvement. However, I believe that in a more genetic-algorithm-like system it might result in fewer overall iterations and a faster system. Such an algorithm would work like RANSAC, but instead of drawing 8 new points at every iteration, it would swap each point out with some probability based on that point's error. It may also work better when there are very few or very many good points, but I was unable to test it in many such cases.
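To make the proposed swap step concrete, here is a rough NumPy sketch of what one such update might look like. This is purely hypothetical (the function, its name, and the swap-probability formula are my own invention, not part of the submitted code):

```python
import numpy as np

def resample_support(sample_idx, errors, rng, temperature=1.0):
    """Hypothetical refinement step: keep low-error points in the 8-point
    sample, and swap out high-error ones with probability rising in their
    residual relative to the median residual."""
    errors = np.asarray(errors)
    n = len(errors)
    new_sample = []
    for idx in sample_idx:
        # Swap probability grows with this point's residual; a zero-residual
        # point is always kept.
        p_swap = 1 - np.exp(-temperature * errors[idx] / (np.median(errors) + 1e-12))
        if rng.random() < p_swap:
            idx = rng.integers(n)  # replace with a randomly chosen candidate
        new_sample.append(int(idx))
    return np.array(new_sample)
```

With all residuals at zero the sample is kept intact, while points whose residuals are large relative to the median are replaced with probability approaching 1, which is the "keep better points between iterations" behavior hypothesized below.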

The reason for this is fairly clear: RANSAC, by design, does not care how good or bad a given match is, so long as it is good enough. Since it forgets everything between iterations, it has no reason to track how accurate the matches are beyond 'close enough'. However, an algorithm that carried some points over between iterations would want to keep the better points, and I hypothesize that a more continuous error metric, one more favorable to better matches, would work better in such a case, because a sample containing 4 inliers and 4 outliers would then score better than one containing 2 inliers and 6 outliers.