The purpose of this project was to understand scene geometry and camera calibration. The project was divided into three parts: estimating the camera projection matrix, estimating the fundamental matrix, and using RANSAC to find the fundamental matrix with the most inliers.
The camera projection matrix is the matrix that maps 3D world coordinates to 2D image coordinates. I used the equations provided in the project brief, specifically the following:
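For reference, this is a reconstruction of the standard projection relation (the brief's exact notation may differ): homogeneous world coordinates map through M to homogeneous image coordinates, and dividing by the third component recovers the pixel coordinates.

```latex
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} \cong
M \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
\qquad
u = \frac{m_{11}X + m_{12}Y + m_{13}Z + m_{14}}{m_{31}X + m_{32}Y + m_{33}Z + m_{34}},
\qquad
v = \frac{m_{21}X + m_{22}Y + m_{23}Z + m_{24}}{m_{31}X + m_{32}Y + m_{33}Z + m_{34}}
```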
I was then able to solve for the matrix M using singular value decomposition (SVD). The matrix I found matched the matrix M provided in the brief.
The projection matrix is:
0.4583 -0.2947 -0.0140 0.0040
-0.0509 -0.0546 -0.5411 -0.0524
0.1090 0.1783 -0.0443 0.5968
The residual was around 0.445.
Using this matrix and the approach from lecture for finding the camera center C, I took the leading 3x3 submatrix Q of M and computed C = -Q^(-1) * m4, where m4 is the fourth column of M. This yielded the camera center correctly. The estimated location of the camera was: <-1.5127, -2.3517, 0.2826>
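As a sanity check, the camera-center recovery can be sketched in NumPy with a synthetic projection matrix; the values of C_true and Q below are invented for illustration:

```python
import numpy as np

# Build a synthetic projection matrix M = Q [I | -C] with a known center
# (Q is an arbitrary invertible 3x3; all values here are made up).
C_true = np.array([-1.5, -2.3, 0.3])
Q = np.array([[0.9, 0.1, 0.0],
              [0.0, 1.1, 0.2],
              [0.1, 0.0, 1.0]])
M = Q @ np.hstack([np.eye(3), -C_true[:, None]])

# Camera center: C = -Q^{-1} m4, where Q is the left 3x3 block of M
# and m4 is its fourth column.
Q_est = M[:, :3]
m4 = M[:, 3]
C = -np.linalg.inv(Q_est) @ m4
print(np.round(C, 4))
```

The sign matters: since M [C; 1] = 0, we have Q C + m4 = 0 and hence C = -Q^(-1) m4.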
E = zeros(2*numPoints, 12); % 12 unknowns: the entries of M, solved up to scale
for i = 1:numPoints
    point2d = Points_2D(i,:);
    point3d = Points_3D(i,:);
    u = point2d(1);
    v = point2d(2);
    x = point3d(1);
    y = point3d(2);
    z = point3d(3);
    % Each correspondence contributes two rows of the homogeneous system E*m = 0.
    E(2*i-1,:) = [point3d 1 0 0 0 0 -u*x -u*y -u*z -u];
    E(2*i,:)   = [0 0 0 0 point3d 1 -v*x -v*y -v*z -v];
end
% The least-squares solution is the right singular vector for the smallest
% singular value.
[U,S,V] = svd(E);
M = reshape(V(:,end), 4, 3)';
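The same homogeneous solve can be sketched in NumPy. Here the projection matrix reported above is reused as ground truth, and the synthetic 3D points are invented for illustration; on noise-free data the solve recovers M up to scale:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground truth: the projection matrix reported above; synthetic 3D points.
M_true = np.array([[0.4583, -0.2947, -0.0140,  0.0040],
                   [-0.0509, -0.0546, -0.5411, -0.0524],
                   [0.1090,  0.1783, -0.0443,  0.5968]])
X = rng.uniform(0.0, 1.0, (8, 3))
Xh = np.hstack([X, np.ones((8, 1))])
proj = (M_true @ Xh.T).T
uv = proj[:, :2] / proj[:, 2:3]   # perspective divide

# Build the 2n x 12 system: each correspondence contributes two rows.
rows = []
for (x, y, z), (u, v) in zip(X, uv):
    rows.append([x, y, z, 1, 0, 0, 0, 0, -u*x, -u*y, -u*z, -u])
    rows.append([0, 0, 0, 0, x, y, z, 1, -v*x, -v*y, -v*z, -v])
A = np.array(rows)

# Least-squares solution of A m = 0 with ||m|| = 1: the right singular
# vector for the smallest singular value.
_, _, Vt = np.linalg.svd(A)
M = Vt[-1].reshape(3, 4)

# M is recovered only up to scale; fix the scale before comparing.
M_scaled = M * (M_true[2, 3] / M[2, 3])
print(np.abs(M_scaled - M_true).max())
```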
The fundamental matrix is the matrix that maps points in one image to epipolar lines in the other. Using a setup similar to the linear system from part 1, we solve for F with SVD, then enforce the rank-2 constraint by taking the SVD of F, setting the smallest singular value to zero, and recomposing the matrix.
The epipolar lines drawn on the sample images:
Comparing the visualization against the provided reference lines suggests that our estimate of the fundamental matrix is correct.
Fundamental matrix estimated =
-0.0000 0.0000 -0.0019
0.0000 0.0000 0.0172
-0.0009 -0.0264 0.9995
E = zeros(numPoints, 9); % one row per correspondence: [ub vb 1] * F * [ua va 1]' = 0
for i = 1:numPoints
    a = Points_a(i,:);
    b = Points_b(i,:);
    ua = a(1);
    va = a(2);
    ub = b(1);
    vb = b(2);
    E(i,:) = [ua*ub va*ub ub ua*vb va*vb vb ua va 1];
end
% Least-squares solution of E*f = 0: the right singular vector for the
% smallest singular value (singular values in S are in descending order).
[U,S,V] = svd(E);
F_matrix = reshape(V(:,end), [3 3])';
% Enforce rank 2 by zeroing the smallest singular value of F.
[U,S,V] = svd(F_matrix);
S(3,3) = 0;
F_matrix = U * S * V';
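The same eight-point estimation can be sketched in NumPy on a synthetic, noise-free two-view setup; the rotation, translation, and 3D points below are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic two-view setup with identity intrinsics (values invented).
def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

R = rot_x(0.2)
t = np.array([1.0, 0.1, 0.0])
X = rng.uniform(-1, 1, (12, 3)) + np.array([0, 0, 5])  # points in front of both cameras

xa = X / X[:, 2:3]            # projections in view a: P1 = [I | 0]
Xb = (R @ X.T).T + t
xb = Xb / Xb[:, 2:3]          # projections in view b: P2 = [R | t]

# Eight-point system: each row encodes the constraint xb^T F xa = 0.
ua, va = xa[:, 0], xa[:, 1]
ub, vb = xb[:, 0], xb[:, 1]
A = np.column_stack([ua*ub, va*ub, ub, ua*vb, va*vb, vb, ua, va, np.ones(len(X))])

_, _, Vt = np.linalg.svd(A)
F = Vt[-1].reshape(3, 3)

# Enforce rank 2 by zeroing the smallest singular value.
U, S, Vt2 = np.linalg.svd(F)
S[2] = 0.0
F = U @ np.diag(S) @ Vt2

# Epipolar residuals |xb^T F xa| should be ~0 on noise-free data.
hb = np.column_stack([ub, vb, np.ones(len(X))])
ha = np.column_stack([ua, va, np.ones(len(X))])
res = np.abs(np.einsum('ij,jk,ik->i', hb, F, ha))
print(res.max())
```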
For this section, we use RANSAC together with the fundamental matrix estimation from part 2 in order to cope with the many outliers among the putative matches. The starter code lets us use vl_sift to perform SIFT matching for the given image pairs, and we feed these point correspondences to RANSAC to find the most appropriate fundamental matrix. The basic steps were as follows: randomly sample a small set of correspondences, estimate a candidate fundamental matrix from the sample, count the correspondences whose epipolar residual falls below a threshold, and after many iterations keep the matrix with the most inliers.
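The inlier-scoring step of the RANSAC loop can be sketched as follows; the threshold, the pure-translation F, and the point values are all illustrative, not taken from the project:

```python
import numpy as np

def count_inliers(F, pts_a, pts_b, threshold=0.05):
    """Count correspondences with small algebraic epipolar residual |xb^T F xa|."""
    n = len(pts_a)
    xa = np.column_stack([pts_a, np.ones(n)])
    xb = np.column_stack([pts_b, np.ones(n)])
    res = np.abs(np.einsum('ij,jk,ik->i', xb, F, xa))
    return int((res < threshold).sum())

# Illustrative check: F = [t]_x for a pure-translation pair, plus one outlier.
F = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)  # skew of t = (1, 0, 0)
pts_a = np.array([[0.1, 0.2], [0.3, -0.1], [0.5, 0.4]])
pts_b = pts_a + np.array([0.2, 0.0])   # pure x-shift keeps points on their epipolar lines
pts_b[2] += np.array([0.0, 0.3])       # perturb one point off its epipolar line
print(count_inliers(F, pts_a, pts_b))  # → 2
```

Inside the loop, this score is computed for each sampled candidate F, and the candidate with the highest count wins.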
After obtaining the above results, I implemented coordinate normalization prior to fundamental matrix estimation. For this, I computed the mean of each coordinate, scaled the centered coordinates based on their spread (see the following code snippet), and then de-normalized the resulting fundamental matrix using the equation given in the brief, F_orig = T_b' * F_normalized * T_a.
numPoints = size(Points_a, 1);
% Centroids of each point set.
cu_a = mean(Points_a(:,1));
cv_a = mean(Points_a(:,2));
cu_b = mean(Points_b(:,1));
cv_b = mean(Points_b(:,2));
% Center the points, then scale by the reciprocal of the RMS distance from
% the centroid so the normalized points have unit spread.
Us_a = Points_a(:,1) - cu_a;
Vs_a = Points_a(:,2) - cv_a;
scale_a = 1 / sqrt(sum(Us_a.^2 + Vs_a.^2) / numPoints);
Us_b = Points_b(:,1) - cu_b;
Vs_b = Points_b(:,2) - cv_b;
scale_b = 1 / sqrt(sum(Us_b.^2 + Vs_b.^2) / numPoints);
% Normalizing transforms: translate by the centroid, then scale.
T_a = [scale_a 0 0; 0 scale_a 0; 0 0 1] * [1 0 -cu_a; 0 1 -cv_a; 0 0 1];
T_b = [scale_b 0 0; 0 scale_b 0; 0 0 1] * [1 0 -cu_b; 0 1 -cv_b; 0 0 1];
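The normalization can be sketched in NumPy as follows, using the reciprocal of the RMS distance from the centroid as the scale (a stand-in for the standard-deviation-based scaling described above; the sample points are invented):

```python
import numpy as np

def normalize_transform(pts):
    """3x3 transform that centers the points and scales RMS distance to 1."""
    cu, cv = pts.mean(axis=0)
    d = pts - [cu, cv]
    s = 1.0 / np.sqrt((d ** 2).sum() / len(pts))   # reciprocal RMS distance
    scale = np.diag([s, s, 1.0])
    offset = np.array([[1, 0, -cu], [0, 1, -cv], [0, 0, 1.0]])
    return scale @ offset

# Invented sample points; verify the transform centers and rescales them.
pts_a = np.array([[10.0, 20.0], [14.0, 26.0], [8.0, 19.0], [12.0, 23.0]])
T_a = normalize_transform(pts_a)
norm = (T_a @ np.column_stack([pts_a, np.ones(4)]).T).T[:, :2]
print(np.round(norm.mean(axis=0), 6))               # centroid at the origin
print(np.round(np.sqrt((norm ** 2).sum() / 4), 6))  # unit RMS distance
```

After estimating F on the normalized coordinates, the de-normalization is the matrix product given in the brief, F_orig = T_b' * F_normalized * T_a.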
The results on the Episcopal Gaudi Image pair were as follows:
The matches for the other image pairs were already quite good, but compared with the un-normalized results, Episcopal Gaudi performed much better after normalization. There still appear to be one or two spurious matches, but the overall improvement is evident.