In this project, I estimate the the camera projection matrix (maps 3D world coordinates to image coordinates), the fundamental matrix (relates points in one scene to epipolar lines in another), and then use RANSAC to find the best fundamental matrix given a set of possible matches, thereby eliminating spurious matches.
A matched image of Gaudi, after normalization. The lines indicate matched features.
This project can hence be divided into four parts:
The first goal is to compute the prjection matrix that goes from 3D world coordinates to 2D coordinates. The equation to move from 3D to 2D coordinates is:
To compute the center of the camera, we can simply use the below equation:
Here, M is made up of a 3x3 we’ll call Q and a 4th column called m_4. The resulting C matrix is the 3D coordinates of the camera center.
Actual and projected coordinates in 2D for the first set of points. |
Actual and projected coordinates in 3D with camera center for the first set of points. |
This part is for estimating the mapping of points in one image to lines in another by means of the fundamental matrix. The fundamental matrix is defined by:
We can set up the regression equations in a similar way to part 1 and solve for matrix F. However, the least squares estimate of F will be full rank, where as the fundamental matrix is a rank 2 matrix. In order to reduce its rank, we can decompose F using SVD and then estimate a rank 2 matrix by setting the smallest singular value in Sigma to 0 and then recomputing F using the new sigma matrix.
First image epipolar lines |
Second image epipolar lines |
Now, we will compute the fundamental matrix with unreliable point correspondences computed with SIFT. We will try to estimate the best Fundamental matrix given unreliable corrspondences, using RANSAC.
We first determine the number of iterations we should run for, using the equation discussed for RANSAC. We will then sample points randomly, estimate a fundamental matrix for those points and repeat this iterations number of times. We will chose the fundamental matrix with the max number of inliers, i.e., max number of corresponding points falling within the set threshold.
Below is a code snippet that illustrates how this is done in matlab.
%as discussed in class
threshold_delta = 0.04;
p = 0.99;
outlier_ratio = 0.5;
num_points_to_sample = 8;
iterations = log(1 - p) / log(1-(1 - outlier_ratio) ^ num_points_to_sample);
for sample = 1:iterations
inliers = 0;
inliers_a_current = [];
inliers_b_current = [];
indices = randsample(1:numPoints,num_points_to_sample);
sampled_a = matches_a(indices,:);
sampled_b = matches_b(indices,:);
F_matrix = estimate_fundamental_matrix(sampled_a, sampled_b);
for pointNum = 1:numPoints
distance = abs([matches_b(pointNum, :), 1] * F_matrix * [matches_a(pointNum, :), 1]');
if distance < threshold_delta
inliers = inliers+1;
inliers_a_current = [inliers_a_current; matches_a(pointNum, 1), matches_a(pointNum, 2)];
inliers_b_current = [inliers_b_current; matches_b(pointNum, 1), matches_b(pointNum, 2)];
end
end
if inliers > max_inliers
Best_Fmatrix = F_matrix;
inliers_a = inliers_a_current;
inliers_b = inliers_b_current;
max_inliers = inliers;
end
end
First image epipolar lines |
Second image epipolar lines |
Matched points |
Estimate of the fundamental matrix can be improved by normalizing the coordinates before computing the fundamental matrix. I used the approach as described in the project page. The vital equation to keep in mind while normalizing is as follows:
Where s is the scale factor and c is the mean coordinates.
We also have to retrieve the orginal F matrix back while estimating it using:
Gaudi: Without normalizing |
Gaudi: With normalizing |
Notre Dame: Without normalizing |
Notre Dame: With normalizing |
As we can see from the above results, normalizing seems to have improved the accuracy considerably, espcially in the case of Gaudi.