Project 3: Camera Calibration and Fundamental Matrix Estimation with RANSAC

Introduction and Background

This project implements algorithms that apply projective geometry to computer vision. Specifically, fundamental relations arising from projective geometry are used to estimate the camera projection matrix, the camera pose, and the fundamental matrix. Topics are presented as follows: (1) calculation of the projection matrix and camera pose, (2) estimation of the fundamental matrix using singular value decomposition (SVD), and (3) estimation of the fundamental matrix using random sample consensus (RANSAC). In addition, the effect of coordinate normalization is studied, and an extension of RANSAC known as M-estimator sample consensus (MSAC) is applied.

Projective geometry is the mathematical study of geometric properties that are invariant under projective transformations. Intuitively, projections can be analyzed by observing how a change in perspective affects lines in an image. Parallel lines in a scene generally appear in the image as lines that converge toward a common vanishing point, which is the image of a point at infinity; the exception is lines parallel to the image plane, which remain parallel in the image. Collinearity is preserved under projection, and it can further be shown that the cross ratio of four collinear points is invariant as well. These properties underlie the estimation of the quantities listed above.
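The invariance of the cross ratio can be checked numerically. The following is a minimal sketch, assuming four collinear points described by their position t along the line and an arbitrary invertible 2x2 matrix H acting as a one-dimensional projective map; the point positions and H are illustrative values, not data from this project.

```matlab
% Cross ratio of four collinear points given by their positions t(1..4) on the line.
cross_ratio = @(t) ((t(3)-t(1)) * (t(4)-t(2))) / ((t(3)-t(2)) * (t(4)-t(1)));

t = [0 1 3 7];            % four distinct collinear points (illustrative values)
H = [2 1; 0.5 3];         % arbitrary invertible 1D projective map (assumption)

% Apply the map in homogeneous coordinates, then divide by the last coordinate.
th = H * [t; ones(1, 4)];
t_mapped = th(1, :) ./ th(2, :);

cr_before = cross_ratio(t);
cr_after  = cross_ratio(t_mapped);
fprintf('cross ratio before: %.6f, after: %.6f\n', cr_before, cr_after);
```

The individual positions and distances change under H, but the two printed values should agree up to rounding.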

Calculating Camera Projection Matrix and Camera Pose

Given a set of 2D image coordinates with corresponding 3D world coordinates, a projection matrix can be calculated that maps 3D world coordinates to 2D image coordinates through a linear transformation in homogeneous coordinates. Determining the projection matrix requires formulating a system of linear equations, with two equations per correspondence, which is solved in the least-squares sense using the linear solver (the '\' operator). Once the projection matrix has been calculated, it can be decomposed to evaluate the camera center. A sample result is shown below. The residual generated by the given model is low, as expected. Figure 1 presents the camera center with the corresponding target points.

The projection matrix is:
   0.7679  -0.4938  -0.0234   0.0067
  -0.0852  -0.0915  -0.9065  -0.0878
   0.1827   0.2988  -0.0742   1.0000
The total residual is: <0.0445>
The estimated location of camera is: <-1.5126, -2.3517, 0.2827>

Fig. 1: Camera center with corresponding target points
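The least-squares fit and the camera-center decomposition described in this section can be sketched as follows. This is a minimal sketch, not the project's actual code: it assumes the scale ambiguity is removed by fixing the last matrix element to 1 (consistent with the 1.0000 entry in the reported matrix), it assumes the total residual is the sum of reprojection distances, and it uses synthetic 3D points on a grid, with the reported matrix reused as ground truth.

```matlab
% Ground-truth projection matrix (values taken from the result reported above).
M_true = [ 0.7679 -0.4938 -0.0234  0.0067;
          -0.0852 -0.0915 -0.9065 -0.0878;
           0.1827  0.2988 -0.0742  1.0000];

% Synthetic, non-coplanar 3D points and their exact 2D projections.
[X, Y, Z] = ndgrid(0:0.5:1, 0:0.5:1, 0:0.5:1);
P3 = [X(:) Y(:) Z(:)];                       % n x 3 world points
n  = size(P3, 1);
proj = M_true * [P3 ones(n, 1)]';            % 3 x n homogeneous image points
uv = (proj(1:2, :) ./ proj([3 3], :))';      % n x 2 image points

% Two linear equations per correspondence in the 11 unknown entries of M
% (the twelfth entry, M(3,4), is fixed to 1).
A = zeros(2*n, 11);
b = zeros(2*n, 1);
for i = 1:n
    Xi = [P3(i, :) 1];
    u = uv(i, 1);
    v = uv(i, 2);
    A(2*i-1, :) = [Xi, zeros(1, 4), -u * P3(i, :)];
    A(2*i,   :) = [zeros(1, 4), Xi, -v * P3(i, :)];
    b(2*i-1) = u;
    b(2*i)   = v;
end

% Least-squares solution with the '\' operator, reshaped row by row into 3x4.
m = A \ b;
M_est = reshape([m; 1], 4, 3)';

% Total residual: sum of distances between observed and reprojected points.
reproj = M_est * [P3 ones(n, 1)]';
uv_est = (reproj(1:2, :) ./ reproj([3 3], :))';
residual = sum(sqrt(sum((uv - uv_est).^2, 2)));

% Camera center: with M = [Q | m4], the center is C = -inv(Q) * m4.
cam_center = -(M_est(:, 1:3) \ M_est(:, 4));
fprintf('residual = %.2e, center = <%.4f, %.4f, %.4f>\n', residual, cam_center);
```

Because the synthetic points are noise-free, the residual here is close to machine precision; with measured correspondences it would be small but nonzero, as in the reported result.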

Fundamental Matrix Estimation with SVD

Similar to the calculation of the projection matrix, a system of linear equations can be formulated for the fundamental matrix. In this implementation, however, the system is solved with SVD within an eight-point algorithm. SVD requires more computation but is numerically stable and does not amplify perturbations in the matrix. The idea behind SVD is to reduce the problem to a diagonal representation through orthogonal transformations of the domain and range, from which the least-squares solution can be computed efficiently. A further advantage of SVD is that it allows the rank constraint on the fundamental matrix to be enforced: the linear least-squares solution is computed first and is then projected onto the closest rank-2 matrix by setting its smallest singular value to zero. Figure 2 presents two images with epipolar lines, which pass through the matched points and converge to an epipole that lies outside the image.

Fig. 2: Epipolar lines of an image pair
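The eight-point estimate with SVD can be sketched as follows. This is a minimal sketch under stated assumptions rather than the project's code: the two cameras, the scene points, and the use of normalized image coordinates (unit focal length) are all illustrative choices, and the data are noise-free.

```matlab
% Synthetic two-view data in normalized image coordinates (illustrative values).
[X, Y, Z] = ndgrid(-1:1, -1:1, 4:6);          % 27 non-coplanar scene points
Xw = [X(:) Y(:) Z(:) ones(numel(X), 1)]';     % 4 x N homogeneous world points
ang = 0.1;                                    % small rotation about the y-axis
R  = [cos(ang) 0 sin(ang); 0 1 0; -sin(ang) 0 cos(ang)];
P1 = [eye(3) zeros(3, 1)];                    % first camera at the origin
P2 = [R [-1; 0.2; 0.1]];                      % second camera, rotated and translated
x1 = P1 * Xw;  x1 = x1 ./ x1([3 3 3], :);     % 3 x N, last row equals 1
x2 = P2 * Xw;  x2 = x2 ./ x2([3 3 3], :);

% Each correspondence gives one row of A with A * F(:) = 0,
% which follows from the epipolar constraint x2' * F * x1 = 0.
N = size(x1, 2);
A = zeros(N, 9);
for i = 1:N
    A(i, :) = kron(x1(:, i)', x2(:, i)');
end

% Least-squares solution: right singular vector of the smallest singular value.
[~, ~, V] = svd(A);
F = reshape(V(:, end), 3, 3);

% Enforce rank 2 by setting the smallest singular value of F to zero.
[U, S, W] = svd(F);
S(3, 3) = 0;
F = U * S * W';

% Algebraic epipolar residual per correspondence (near zero for exact data).
residuals = abs(sum(x2 .* (F * x1), 1));
s = svd(F);
fprintf('max residual = %.2e, s3/s1 = %.2e\n', max(residuals), s(3) / s(1));
```

The epipolar line in the second image for a point x1(:, i) is F * x1(:, i); drawing these lines for all matches gives a plot of the kind shown in Figure 2.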

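Normalizing the homogeneous coordinates with a linear transformation before estimation can improve the resulting estimate, because the linear system becomes better conditioned and is therefore less sensitive to rounding errors and noise. The routine translates the points so that their centroid lies at the origin and scales them so that their mean distance from the origin is sqrt(2). The normalization routine is shown below.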


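%example code
function [norm_points,T] = normalizeF(points)
     % points: 3xN homogeneous image coordinates [x; y; 1].
     % norm_points: Nx2 normalized coordinates.
     % T: 3x3 transformation such that normalized = T*points.

     % Translate the centroid of the points to the origin.
     c = mean(points(1:2,:),2);
     np = zeros(2,size(points,2));
     np(1,:) = points(1,:)-c(1);
     np(2,:) = points(2,:)-c(2);

     % Mean distance of the centered points from the origin.
     dist = sqrt(np(1,:).^2 + np(2,:).^2);
     mean_dist = mean(dist(:));

     % Scale so that the mean distance becomes sqrt(2).
     scale = sqrt(2)/mean_dist;

     % Combined translation and scaling in homogeneous form.
     T = [scale 0 -scale*c(1);0 scale -scale*c(2); 0 0 1];

     % Apply T and return the inhomogeneous coordinates as Nx2.
     norm_points = T*points;
     norm_points = norm_points(1:2,:)';
 end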
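A fundamental matrix estimated from normalized points has to be mapped back to the original coordinates. If the normalized points are Ta*xa and Tb*xb, the constraint (Tb*xb)' * F_norm * (Ta*xa) = 0 implies F = Tb' * F_norm * Ta. The following minimal sketch illustrates this step; the transforms are built inline in the same way as in normalizeF, and the synthetic cameras and the intrinsic matrix K are hypothetical values chosen only to produce pixel-scale coordinates.

```matlab
% Synthetic correspondences in pixel coordinates (illustrative values).
[X, Y, Z] = ndgrid(-1:1, -1:1, 4:6);
Xw = [X(:) Y(:) Z(:) ones(numel(X), 1)]';
K  = [800 0 320; 0 800 240; 0 0 1];           % hypothetical camera intrinsics
R  = [cos(0.1) 0 sin(0.1); 0 1 0; -sin(0.1) 0 cos(0.1)];
xa = K * [eye(3) zeros(3, 1)] * Xw;  xa = xa ./ xa([3 3 3], :);
xb = K * [R [-1; 0.2; 0.1]] * Xw;    xb = xb ./ xb([3 3 3], :);

% Normalizing transforms: centroid to the origin, mean distance scaled to sqrt(2).
ca = mean(xa(1:2, :), 2);
sa = sqrt(2) / mean(sqrt((xa(1, :) - ca(1)).^2 + (xa(2, :) - ca(2)).^2));
Ta = [sa 0 -sa*ca(1); 0 sa -sa*ca(2); 0 0 1];
cb = mean(xb(1:2, :), 2);
sb = sqrt(2) / mean(sqrt((xb(1, :) - cb(1)).^2 + (xb(2, :) - cb(2)).^2));
Tb = [sb 0 -sb*cb(1); 0 sb -sb*cb(2); 0 0 1];
na = Ta * xa;
nb = Tb * xb;

% Eight-point estimate with rank-2 enforcement on the normalized points.
A = zeros(size(na, 2), 9);
for i = 1:size(na, 2)
    A(i, :) = kron(na(:, i)', nb(:, i)');
end
[~, ~, V] = svd(A);
[U, S, W] = svd(reshape(V(:, end), 3, 3));
S(3, 3) = 0;
F_norm = U * S * W';

% Undo the normalization so that xb' * F * xa = 0 holds in pixel coordinates.
F = Tb' * F_norm * Ta;
F = F / norm(F, 'fro');
residuals = abs(sum(xb .* (F * xa), 1));
fprintf('max residual in pixel coordinates = %.2e\n', max(residuals));
```

With noisy matches, the same pipeline would typically be compared against the unnormalized estimate by counting inliers, as done for the Gaudi scene in the next section.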

Estimation of Fundamental Matrix Using RANSAC

Perfect point correspondences between two images of a scene are unlikely to occur; therefore, RANSAC is used as a robust regression technique. The basic RANSAC algorithm is outlined in the numbered list below. Each sample set has a cardinality of 8, constituting a minimal sample set (MSS) for the given regression problem. The SVD algorithm from the previous section is used to estimate the fundamental matrix (the model) from the MSS alone. The model is then evaluated by computing the Euclidean norm of the reconstruction error for each correspondence. A threshold on this error, which can be tuned, determines the number of inliers. 5000 sample sets are used for the examples shown below. Figure 3 presents matched features for two scenes, Mount Rushmore and Notre Dame. To observe the effect of normalization, the fundamental matrix is estimated for the Gaudi scene both without and with normalization. The results show that a greater number of inliers is obtained with normalization. Figure 4 shows feature matches and epipolar lines for the Gaudi scene. It can clearly be observed that the epipolar geometry is not consistent with the perspective in the scene.

When implementing RANSAC, each point is evaluated for model fit by thresholding the Euclidean distance of its error, so every inlier counts equally regardless of how well it fits. An extension of this approach is MSAC, which scores each inlier by its error and assigns each outlier a constant penalty. The sum of these terms is a score for the fitness of the model, and the model with the lowest score is selected. Figure 5 shows epipolar lines for the Gaudi scene obtained with MSAC, which are more consistent with the scene.
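The difference between the two scoring rules can be illustrated on a small example. This is a minimal sketch that follows the common formulation of MSAC, in which an inlier contributes its squared error and an outlier contributes the squared threshold; the error values and the threshold are hypothetical.

```matlab
% Hypothetical per-correspondence errors for one candidate model.
err = [0.2 0.5 0.1 3.0 0.4 10.0 0.3];
threshold = 1.0;                      % inlier threshold (tunable assumption)

% RANSAC score: number of inliers (higher is better).
ransac_score = sum(err < threshold);

% MSAC cost: inliers contribute their squared error, outliers contribute
% the constant threshold^2 (lower is better).
msac_cost = sum(min(err.^2, threshold^2));

fprintf('RANSAC inliers: %d, MSAC cost: %.2f\n', ransac_score, msac_cost);
```

Two models with the same inlier count are indistinguishable to RANSAC, whereas the MSAC cost prefers the one whose inliers fit more tightly.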

  1. Random sampling of an MSS
  2. Estimation of the fundamental matrix from the MSS (SVD)
  3. Evaluation of the norm of the error for every correspondence
  4. Evaluation of the number of inliers
  5. Steps 1 to 4 repeated for k random samples
  6. Matrix with the greatest number of inliers chosen
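The six steps above can be sketched as the following loop. This is a minimal sketch and not the project's implementation: the correspondences are synthetic (20 exact matches followed by 7 deliberately corrupted ones), the error metric is assumed to be the distance from a point to its epipolar line, and the threshold and the number of samples k are illustrative values.

```matlab
% Synthetic correspondences (illustrative): 20 exact matches, then 7 outliers.
[X, Y, Z] = ndgrid(-1:1, -1:1, 4:6);
Xw = [X(:) Y(:) Z(:) ones(numel(X), 1)]';
R  = [cos(0.1) 0 sin(0.1); 0 1 0; -sin(0.1) 0 cos(0.1)];
x1 = [eye(3) zeros(3, 1)] * Xw;   x1 = x1 ./ x1([3 3 3], :);
x2 = [R [-1; 0.2; 0.1]] * Xw;     x2 = x2 ./ x2([3 3 3], :);
x2(2, 21:27) = x2(2, 21:27) + linspace(0.2, 0.5, 7) .* (-1).^(1:7);

N = size(x1, 2);
k = 2000;                % number of random samples (the report uses 5000)
threshold = 1e-3;        % inlier threshold, to be tuned for real data
best_count = -1;
best_F = zeros(3);
best_inliers = false(1, N);

for iter = 1:k           % step 5: repeat for k random samples
    % Step 1: random minimal sample set of 8 correspondences.
    idx = randperm(N, 8);

    % Step 2: eight-point estimate with SVD and rank-2 enforcement.
    A = zeros(8, 9);
    for j = 1:8
        A(j, :) = kron(x1(:, idx(j))', x2(:, idx(j))');
    end
    [~, ~, V] = svd(A);
    [U, S, W] = svd(reshape(V(:, end), 3, 3));
    S(3, 3) = 0;
    F = U * S * W';

    % Step 3: distance from each x2 to its epipolar line F * x1 (assumed metric).
    lines = F * x1;
    err = abs(sum(x2 .* lines, 1)) ./ sqrt(lines(1, :).^2 + lines(2, :).^2);

    % Step 4: count inliers. Step 6: keep the model with the most inliers.
    inliers = err < threshold;
    if sum(inliers) > best_count
        best_count = sum(inliers);
        best_F = F;
        best_inliers = inliers;
    end
end
fprintf('%d of %d correspondences are inliers\n', best_count, N);
```

A common refinement, not shown here, is to re-estimate the fundamental matrix from all inliers of the best model once the loop has finished.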


Fig. 3: Matched features for the Mount Rushmore and Notre Dame scenes



Fig. 4: Matched features and epipolar lines for Gaudi



Fig. 5: Epipolar lines generated using MSAC