Computer Vision Project 3

In this project, I worked to create a utility to calculate a camera's projection matrix from a 3D to 2D point correspondence, as well as to calculate a fundamental matrix relating two views based on an unreliable correspondence.

Projection Matrix

This step of the project was very straightforward - all I had to to was take the provided lists of corresponding points and produce a linear system of equations, two equations to each pair of points. After solving for the first 11 entries of the projection, I used the projection matrix to calculate the position of the camera center. Below is the set of points plotted in 3D along with the calculated camera center.


projection matrix:  0.7679   -0.4938   -0.0234    0.0067
   		   -0.0852   -0.0915   -0.9065   -0.0878
    		    0.1827    0.2988   -0.0742    1.0000
computed camera center: <-1.5126, -2.3517, 0.2827>

Fundamental Matrix Estimation

The second step of the project was intended to calculate the fundamental matrix relating points in one image to epipolar lines in the other using a set of corresponding image points. This also required the construction of a linear system of equations, this one with one equation constructed from each pair of points. Then, I added one last simple equation to the system to fix the scale of the solution and solved.

Now, the function visits each interest point in the image and sets up a feature_width-size box around it. For each pixel in that feature box, the function finds the direction of the image gradient at that pixel and finds the two directional bins (there being 8 of them in each spatial bin, from 0 to 315 at 45-degree intervals) the gradient "lies between." The script then assigns the coefficients to each of those bins, inversely proportional to the "distance" of the gradient direction to each bin's center, and multiplies the gradient magnitude at that point by both the directional bin coefficients and the spatial bin coefficients corresponding to its location within the feature block and adds it to its appropriate bin entries on the line of the feature vector matrix corresponding to the correct interest point.

In addition, I added data point normalization to estimate_fundamental_matrix. The script calculates the mean u and v values for all provided points from both image A and from image B. It then translates each point such that the points from both image A and image B are centered about the origin and finds the standard deviation of the translated points' magnitudes. The script then produces normalizing matrices for both images using these values and transforms each point by its image's scaling matrix. After fundamental matrix estimation completes as usual, these two matrices are used to convert the fundamental matrix for the normalized points back to the base space of the input points. The effect was almost immediately noticable - adding normalizaiton to the pipeline give me parallel epipolar lines on the Mount Rushmore pair, which was further improved by parameter tweaking.


estimated fundamental matrix (base image pair):   -703.5332e-009  -117.3541e-009   275.5338e-006
  						  -161.5641e-009   392.5460e-009   -52.2136e-006
   						   361.3123e-006    50.0006e-006  -106.1084e-003

RANSAC

This portion of the project involved taking a set of point correspondences produced by VLFeat using SIFT descriptors and using this imperfect matching to find the fundamental matrix relating the two camera views. For each iteration of RANSAC, I selected a small sample (10 seemed to work best) of point pairs from the provided matching and passed them into my estimate_fundamental_matrix script. Using the resulting fundamental matrix, I iterated through each pair of points, taking the point from one image and finding the minimum distance of its proposed match to its epipolar line. If the distance was small enough, it was added to the set of inlier matches. Then, if at least 30% of the input matches were found to be inliers, I recalculated the theoretical fundamental matrix using all of the detected inliers. I then iterated through all of the inliers used to calculate this new fundamental matrix and found the sum of the distances from each point to the epipolar line of its match. This I considered to be the fit error of this fundamental matrix.

After each iteration, if the error of the calculated fundamental matrix was the lowest so far, then the new fundamental matrix was stored as the best and its error stored as the new best error. The inliers for the best fundamental matrix were also recorded as the "unpruned" matches out of the input matches. According to my testing, the best results resulted from 10,000 iterations using an acceptable inlier distance of 15.

Stephen Camp

Project 3: Camera Calibration and Fundamental Matrix Estimation

Projection Matrix

Fundamental Matrix Estimation

RANSAC