Computer Vision Project

This project maps 3D points in the world to 2D points in the image plane. It also maps points in one image to epipolar lines in another by means of a fundamental matrix, and it demonstrates the use of the RANSAC algorithm to find the best fundamental matrix. Finally, it applies image co-ordinate normalization to improve the estimate of the fundamental matrix. The project therefore consists of the following four parts: projection matrix computation, fundamental matrix estimation, fundamental matrix estimation with RANSAC, and image co-ordinate normalization.

Projection matrix computation

The projection matrix is computed from the given set of 3D world points and their corresponding 2D image points. The system of equations formed from the 20 point correspondences is solved using SVD. The projection matrix obtained is:

The estimated location of the camera center is:

(-1.5127, -2.3517, 0.2826)
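The two steps above can be sketched as follows. This is a minimal illustration rather than the code used for the reported numbers: it assumes the system is the homogeneous direct-linear-transform form A m = 0, and that the camera center is recovered by writing the projection matrix as M = [Q | m4] and computing C = -Q^{-1} m4. The function names are hypothetical.

```python
import numpy as np

def estimate_projection_matrix(points_3d, points_2d):
    """Estimate the 3x4 projection matrix M (up to scale) from n >= 6 correspondences."""
    points_3d = np.asarray(points_3d, dtype=float)
    points_2d = np.asarray(points_2d, dtype=float)
    n = points_3d.shape[0]
    X = np.hstack([points_3d, np.ones((n, 1))])  # homogeneous world points, n x 4
    u = points_2d[:, 0:1]
    v = points_2d[:, 1:2]
    zeros = np.zeros((n, 4))
    # Each correspondence contributes two rows to the homogeneous system A m = 0.
    rows_u = np.hstack([X, zeros, -u * X])
    rows_v = np.hstack([zeros, X, -v * X])
    A = np.vstack([rows_u, rows_v])
    # The unit-norm least-squares solution is the right singular vector
    # associated with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)

def camera_center(M):
    """Recover the camera center C from M = [Q | m4] as C = -Q^{-1} m4."""
    M = np.asarray(M, dtype=float)
    Q, m4 = M[:, :3], M[:, 3]
    return -np.linalg.solve(Q, m4)
```

Because M is only defined up to scale, two estimates should be compared after dividing each by one of its entries. The camera center does not depend on that scale.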

The residual of the calculated 2D points with respect to the ground truth is 0.0445, which is reasonably small. The plot of the calculated 2D points against the ground truth is shown in the figure below.
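A residual of this kind can be computed as sketched below. The report does not say which definition was used, so this example assumes the residual is the sum of Euclidean distances between the projected and the ground-truth 2D points. The function names are hypothetical.

```python
import numpy as np

def project(M, points_3d):
    """Project n x 3 world points with a 3x4 matrix M and dehomogenize to n x 2."""
    points_3d = np.asarray(points_3d, dtype=float)
    X = np.hstack([points_3d, np.ones((len(points_3d), 1))])
    h = X @ np.asarray(M, dtype=float).T
    return h[:, :2] / h[:, 2:3]

def reprojection_residual(M, points_3d, points_2d):
    """Sum of Euclidean distances between projected and ground-truth points (assumed definition)."""
    diff = project(M, points_3d) - np.asarray(points_2d, dtype=float)
    return float(np.sum(np.hypot(diff[:, 0], diff[:, 1])))
```

A mean or root-mean-square distance would be an equally reasonable choice. The value is only comparable between runs that use the same definition.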

Fundamental Matrix Estimation

To estimate the fundamental matrix, we use corresponding point locations from the two images to form a system of equations. This system is solved using SVD, and the rank of the resulting matrix is then reduced to 2. The fundamental matrix obtained is as follows:

The epipolar lines obtained for the pair of images using the calculated fundamental matrix are shown in the two images below.
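The estimation described in this section can be sketched as follows. The example assumes the convention x_b^T F x_a = 0, where x_a and x_b are homogeneous points in the first and second image. The report does not state its convention, and the function names are hypothetical.

```python
import numpy as np

def estimate_fundamental_matrix(pts_a, pts_b):
    """Eight-point estimate of F (n >= 8 pairs) with [ub, vb, 1] F [ua, va, 1]^T = 0."""
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    ua, va = pts_a[:, 0], pts_a[:, 1]
    ub, vb = pts_b[:, 0], pts_b[:, 1]
    # One row of A f = 0 per correspondence; f holds the entries of F in row-major order.
    A = np.column_stack([ub * ua, ub * va, ub,
                         vb * ua, vb * va, vb,
                         ua, va, np.ones(len(ua))])
    _, _, vt = np.linalg.svd(A)
    F = vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value of F.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0
    return U @ np.diag(S) @ Vt

def epipolar_line_in_b(F, point_a):
    """Coefficients (a, b, c) of the line a*u + b*v + c = 0 in image B for a point in image A."""
    return F @ np.array([point_a[0], point_a[1], 1.0])
```

Under the same convention, the epipolar line in the first image for a point x_b in the second image is F^T x_b.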

Fundamental Matrix with RANSAC

In this part, we obtain many pairs of corresponding point locations from a pair of images using vlfeat's SIFT feature extractor, and use these pairs to calculate the fundamental matrix. However, many of the pairs are spurious. To find the fundamental matrix that agrees with the largest number of "inliers", we use the RANSAC algorithm. RANSAC is run for 2000 iterations, which is reasonably high, so that a large number of inliers agree with the selected fundamental matrix. Each iteration uses 8 pairs of point locations to calculate a candidate fundamental matrix. Once a candidate is obtained, we calculate the error of every pair of points with respect to it using the property Af = 0. Because some point pairs are spurious, the right-hand side is generally not zero, and the non-zero value is taken as the error for that pair. I chose a threshold of 0.025 to separate inliers from outliers.
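The loop described above can be sketched as follows. The defaults of 2000 iterations, 8 pairs per sample and a threshold of 0.025 come from the text. The convention x_b^T F x_a = 0, the use of the absolute algebraic residual as the error, the random seed and all function names are assumptions made for illustration.

```python
import numpy as np

def eight_point(pts_a, pts_b):
    """Rank-2 fundamental matrix from at least 8 correspondences (x_b^T F x_a = 0)."""
    ua, va = pts_a[:, 0], pts_a[:, 1]
    ub, vb = pts_b[:, 0], pts_b[:, 1]
    A = np.column_stack([ub * ua, ub * va, ub,
                         vb * ua, vb * va, vb,
                         ua, va, np.ones(len(ua))])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0  # enforce rank 2
    return U @ np.diag(S) @ Vt

def algebraic_error(F, pts_a, pts_b):
    """Absolute value of x_b^T F x_a for every pair; zero for a perfect match."""
    ha = np.column_stack([pts_a, np.ones(len(pts_a))])
    hb = np.column_stack([pts_b, np.ones(len(pts_b))])
    return np.abs(np.sum(hb * (ha @ F.T), axis=1))

def ransac_fundamental_matrix(pts_a, pts_b, iterations=2000, threshold=0.025, seed=0):
    """Return the candidate F with the most inliers and the boolean inlier mask."""
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    rng = np.random.default_rng(seed)
    best_F = None
    best_inliers = np.zeros(len(pts_a), dtype=bool)
    for _ in range(iterations):
        # Fit a candidate to 8 randomly chosen pairs.
        sample = rng.choice(len(pts_a), size=8, replace=False)
        F = eight_point(pts_a[sample], pts_b[sample])
        # Score the candidate against all pairs.
        inliers = algebraic_error(F, pts_a, pts_b) < threshold
        if inliers.sum() > best_inliers.sum():
            best_F, best_inliers = F, inliers
    return best_F, best_inliers
```

The algebraic error depends on the scale of F and of the image co-ordinates, so a fixed threshold such as 0.025 is only meaningful for a fixed scaling. In this sketch each candidate F has roughly unit Frobenius norm.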

Bells and Whistles: Image co-ordinates normalization

Since the corresponding points obtained from the pair of images are not normalized, they lead to poor numerical stability. Here I have translated the points from each image so that their mean location is (0,0), and scaled them so that their average distance from the origin is 2. This leads to a better estimate of the fundamental matrix. The following pairs of images show the results obtained after image co-ordinate normalization. For all of the following images, the iteration count was 2000 and the outlier threshold was 0.025. Comparing the image correspondences with and without normalization under these identical settings, we find that image co-ordinate normalization improves the results considerably.
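The normalization described above can be sketched as follows. The target average distance of 2 is taken from the text (a common alternative is the square root of 2). The convention x_b^T F x_a = 0 used for undoing the normalization and all function names are assumptions made for illustration.

```python
import numpy as np

def normalization_transform(points, target_mean_distance=2.0):
    """3x3 similarity T that moves the centroid to (0,0) and sets the mean distance from it."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    mean_distance = np.mean(np.linalg.norm(points - centroid, axis=1))
    s = target_mean_distance / mean_distance
    return np.array([[s, 0.0, -s * centroid[0]],
                     [0.0, s, -s * centroid[1]],
                     [0.0, 0.0, 1.0]])

def apply_transform(T, points):
    """Apply T to n x 2 points and return the transformed n x 2 points."""
    points = np.asarray(points, dtype=float)
    h = np.column_stack([points, np.ones(len(points))]) @ T.T
    return h[:, :2]

def denormalize_fundamental_matrix(F_norm, T_a, T_b):
    """Map F estimated from normalized points back to original co-ordinates.

    With x_hat = T x, the constraint x_hat_b^T F_norm x_hat_a = 0 becomes
    x_b^T (T_b^T F_norm T_a) x_a = 0.
    """
    return T_b.T @ F_norm @ T_a
```

In use, each image gets its own transform, the fundamental matrix is estimated from the transformed points, and the result is mapped back with `denormalize_fundamental_matrix` before epipolar lines are drawn in pixel co-ordinates.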

Nitin Kodialbail

Project 3: Camera Calibration and Fundamental Matrix Estimation with RANSAC
