In this project, we look at the problem of camera calibration through the following three steps:
In this part of the project, I calculate the projection matrix by building the following homogeneous linear system:
In this equation, (X, Y, Z) are the 3D point coordinates, (u, v) are the corresponding 2D image coordinates, and M is the projection matrix. Solving the system with SVD gives the following projection matrix:
0.4583 -0.2947 -0.0140 0.0040
-0.0509 -0.0546 -0.5411 -0.0524
0.1090 0.1783 -0.0443 0.5968
for the normalized points, which is slightly different from the answer given. I could not pinpoint where the difference comes from; it may be down to how floating-point values are handled. Note that M is only defined up to a nonzero scale factor (including sign), so two matrices that differ by a constant multiple describe the same camera. My total residual is 0.0445, which is fairly low.
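As a rough illustration of the step above, here is a minimal NumPy sketch of solving the homogeneous system with SVD. The function names, the row layout of the system, and the residual definition (sum of Euclidean reprojection distances) are assumptions made for this example, not necessarily what my actual code does.

```python
import numpy as np

def estimate_projection_matrix(points_2d, points_3d):
    """Estimate a 3x4 projection matrix M (up to scale) from n >= 6 matches."""
    points_2d = np.asarray(points_2d, dtype=float)
    points_3d = np.asarray(points_3d, dtype=float)
    n = points_3d.shape[0]
    A = np.zeros((2 * n, 12))
    for i, ((u, v), (X, Y, Z)) in enumerate(zip(points_2d, points_3d)):
        # Each correspondence contributes two rows of the system A m = 0.
        A[2 * i] = [X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u]
        A[2 * i + 1] = [0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v]
    # The right singular vector for the smallest singular value minimizes
    # ||A m|| subject to ||m|| = 1.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)

def total_residual(M, points_2d, points_3d):
    """Sum of distances between observed and reprojected 2D points (assumed metric)."""
    points_2d = np.asarray(points_2d, dtype=float)
    points_3d = np.asarray(points_3d, dtype=float)
    homog = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    proj = homog @ M.T
    proj = proj[:, :2] / proj[:, 2:3]  # divide by the homogeneous coordinate
    return np.sum(np.linalg.norm(proj - points_2d, axis=1))
```

Because the solution is a unit-norm singular vector, its overall sign is arbitrary, which is worth keeping in mind when comparing against a reference matrix.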
I solve for the camera center using the equation from the project page, C = -Q^-1 * m4, where Q is the first three columns of M and m4 is the last column of M.
After solving for the camera center, I get:
<-1.5127, -2.3517, 0.2826>
for the normalized points, which again differs slightly from the answer given. The small discrepancies probably carry over from the slightly different values in the projection matrix.
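The camera center computation is short enough to sketch directly. The snippet below is an illustrative NumPy version (the function name is my own for this example); it uses a linear solve instead of forming the inverse explicitly, which gives the same result as -Q^-1 * m4.

```python
import numpy as np

def camera_center(M):
    """Return the camera center C = -Q^-1 m4 for a 3x4 projection matrix M."""
    M = np.asarray(M, dtype=float)
    Q, m4 = M[:, :3], M[:, 3]
    # Solving Q c = m4 avoids computing the inverse explicitly.
    return -np.linalg.solve(Q, m4)

# Projection matrix reported above (rounded to four decimals).
M = np.array([[0.4583, -0.2947, -0.0140, 0.0040],
              [-0.0509, -0.0546, -0.5411, -0.0524],
              [0.1090, 0.1783, -0.0443, 0.5968]])
print(camera_center(M))
```

With the rounded entries, the printed center should land close to the one reported above, though not necessarily identical in the last digit. Scaling M by any nonzero constant leaves the center unchanged.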
These results can be seen in the following images, where only about 5-6 projected points aren't squarely on the actual points.
Then, I calculate the fundamental matrix with the following equation:
where (x', y') and (x, y) are corresponding points in the two images. Again, I set up the system of equations accordingly and solve for the entries of the fundamental matrix using SVD. The output of SVD gives the following estimate of the fundamental matrix:
-6.60698417012863e-07 8.82396296136248e-06 -0.000907382302152503
7.91031620841439e-06 1.21382933020618e-06 -0.0264234649901806
-0.00188600197690852 0.0172332901072652 0.999500091906723
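A minimal sketch of this linear estimate is shown below. It assumes the constraint is written as [x' y' 1] F [x y 1]^T = 0, with F flattened row by row; the function name and that ordering are assumptions for this example.

```python
import numpy as np

def estimate_fundamental_linear(pts_a, pts_b):
    """Linear estimate of F from n >= 8 matches, assuming x_b^T F x_a = 0."""
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    x, y = pts_a[:, 0], pts_a[:, 1]
    xp, yp = pts_b[:, 0], pts_b[:, 1]
    # One row per correspondence, matching F flattened in row-major order.
    A = np.column_stack([xp * x, xp * y, xp,
                         yp * x, yp * y, yp,
                         x, y, np.ones_like(x)])
    # The singular vector for the smallest singular value minimizes ||A f||
    # subject to ||f|| = 1.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)
```

The result has unit Frobenius norm and an arbitrary sign; any nonzero multiple of it satisfies the same constraint.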
However, this least-squares estimate is full rank, whereas a fundamental matrix must have rank 2. So, I use SVD again to decompose it into U, S, V and set the smallest singular value in S to 0. I recalculate the fundamental matrix as the product of U, the new S, and the transpose of V, which gives the following final matrix:
-5.36264198382353e-07 8.83539184115726e-06 -0.000907382264407744
7.90364770858056e-06 1.21321685010730e-06 -0.0264234649922034
-0.00188600204023565 0.0172332901014488 0.999500091906703
This matrix is visually verified as being correct by plotting the epipolar lines, which you can see below:
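The rank-2 step can be sketched in a few lines of NumPy, applied here to the full-rank estimate listed above. The function name is illustrative. Note that `np.linalg.svd` returns the transpose of V directly, unlike MATLAB's `svd`, which returns V.

```python
import numpy as np

def enforce_rank2(F):
    """Zero the smallest singular value of F and rebuild the matrix."""
    U, S, Vt = np.linalg.svd(F)  # singular values come back in descending order
    S[-1] = 0.0
    return U @ np.diag(S) @ Vt

# Full-rank least-squares estimate reported above.
F_ls = np.array([
    [-6.60698417012863e-07, 8.82396296136248e-06, -0.000907382302152503],
    [7.91031620841439e-06, 1.21382933020618e-06, -0.0264234649901806],
    [-0.00188600197690852, 0.0172332901072652, 0.999500091906723],
])
F_rank2 = enforce_rank2(F_ls)
print(np.linalg.svd(F_rank2, compute_uv=False))  # last value should be ~0
```

Since only the smallest singular value is removed, the rank-2 matrix stays very close to the original estimate, which matches how little the entries change between the two matrices above.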
First, the VLFeat implementation of SIFT is used to produce descriptors and possible matches.
Image | SIFT Descriptors in Image A | SIFT Descriptors in Image B | Possible Feature Matches | Inlier Feature Matches |
---|---|---|---|---|
Rushmore | 5581 | 5246 | 825 | 87 |
Notre Dame | 3396 | 3025 | 851 | 96 |
Gaudi | 12683 | 8766 | 1062 | 124 |
In this portion of the assignment, I improve the above results with normalization. As mentioned in lecture, the coordinate values can easily blow up, leaving some regions of the matrix with values much larger than others. To correct for this, I rescale the coordinates relative to their mean. I tried two ways of choosing the scale factor. The first uses the maximum value of the points. This led to a difference between the original fundamental matrix and the normalized fundamental matrix; the entry-by-entry difference for the Rushmore image pair is:
-0.0000 0.0000 -0.0000
0.0000 0.0000 -0.0009
-0.0003 -0.0007 0.5667
Those differences are minute for the most part, except at entry (3,3). So, I tried scaling by the standard deviation instead, to see whether I could get the original and normalized fundamental matrices to agree:
-0.0000 0.0000 -0.0081
-0.0000 -0.0000 0.0150
0.0047 -0.0196 1.6878
That ended up doing even worse, so I stuck with the maximum value method.
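For reference, the two scaling options can be sketched as below. This is an illustrative version under stated assumptions: points are centered on their mean, "max" means the largest absolute centered coordinate, "std" means the standard deviation over all centered coordinates, and the constraint is x_b^T F x_a = 0. The function names are made up for this example and my actual code may differ in these details.

```python
import numpy as np

def normalization_transform(pts, method="max"):
    """Build a 3x3 matrix T that centers pts on their mean and rescales them."""
    pts = np.asarray(pts, dtype=float)
    mean = pts.mean(axis=0)
    centered = pts - mean
    if method == "max":
        scale = 1.0 / np.abs(centered).max()
    elif method == "std":
        scale = 1.0 / centered.std()
    else:
        raise ValueError("method must be 'max' or 'std'")
    # Scale applied after translating the mean to the origin.
    return np.array([[scale, 0.0, -scale * mean[0]],
                     [0.0, scale, -scale * mean[1]],
                     [0.0, 0.0, 1.0]])

def apply_transform(T, pts):
    """Apply T to 2D points; the last row of T is [0, 0, 1], so w stays 1."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (homog @ T.T)[:, :2]

def denormalize(F_norm, T_a, T_b):
    """Map F estimated on normalized points back to original coordinates."""
    return T_b.T @ F_norm @ T_a
```

In use, each image gets its own transform, the fundamental matrix is estimated from the transformed points, and `denormalize` converts it back so it can be applied to the original pixel coordinates.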
After applying this to the estimation of the fundamental matrix implemented in part 2, I get the following results: