CS7635 Final Project Report
Robot Localization - Michael Kaess

Goal of the Project

Many approaches for robot localization are described in the literature. Often active sensors like laser range finders are used to provide accurate measurements. The goal of this project is to localize the robot based on data from an omnicam, a relatively cheap passive sensor.

Localization is often divided into two variants: local and global. Local localization tracks the robot from a known start position; global localization starts with no knowledge of the robot's position.

Many localization algorithms use probabilistic models. Kalman filters are often used, but their restriction to unimodal distributions makes them unsuited for global localization. Moreover, if the robot's track is lost, such algorithms usually cannot relocalize it.

Markov localization algorithms extend the Kalman filter to multimodal distributions by using either a topological or a grid-based model. In both cases, the probability of the robot being at each discrete location is maintained. For large environments, both models require updating a large number of such discrete locations.

Monte Carlo Localization concentrates computational power on areas where it is more likely to find the robot, while still allowing multimodal distributions. This is done by sampling the probability distribution and updating the samples based on a motion model and observations of the environment.

Related Work

Monte Carlo Localization is described in [1], where experimental results are provided for several different sensors. One of these sensors is a camera pointing upwards; only the average over a few pixels at the center of the image was used as input to the algorithm.

In this project, the input is extended to include more information from the image. Furthermore, an omnicam is used, whose images make the approach independent of the robot's orientation.

Because of the large amount of data in the input images, not all of the information can be used. Principal Component Analysis (PCA) is commonly used to compress images to their most significant information, so as to best distinguish between different images. For large amounts of data, the standard way of calculating the covariance matrix becomes too computationally expensive. [2] describes a way to calculate the eigenimages indirectly, without forming the full covariance matrix.
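The indirect computation from [2] (the "snapshot method") can be sketched as follows in Python (the project itself used Matlab); the image and pixel counts below are made-up stand-ins. For n images of d pixels with d much larger than n, one decomposes the small n-by-n matrix X X^T instead of the d-by-d covariance X^T X:

```python
import numpy as np

# Snapshot method from [2], sketched with made-up sizes:
# n images of d pixels each, with d >> n.
rng = np.random.default_rng(0)
n, d = 20, 1000
X = rng.standard_normal((n, d))   # rows are (stand-in) images
X = X - X.mean(axis=0)            # center the data

# Eigendecompose the small n x n matrix X X^T instead of the
# d x d covariance X^T X, which would be too expensive to form.
L = X @ X.T
vals, vecs = np.linalg.eigh(L)    # eigenvalues in ascending order

# If L v = lam v, then (X^T X)(X^T v) = lam (X^T v):
# X^T v is an eigenimage of the full covariance.
top = np.argsort(vals)[::-1][:5]  # keep the 5 most significant
eigvals = vals[top]
eigimgs = X.T @ vecs[:, top]                  # columns are eigenimages
eigimgs /= np.linalg.norm(eigimgs, axis=0)    # normalize columns
```

The eigendecomposition thus costs O(n^3) instead of O(d^3), plus the matrix products.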

Experimental Procedure

In a previous project, an omnicam mounted on top of an ATRV-Jr mobile robot platform was used to acquire sample and test images. 376 sample images with known location were taken in a nearly uniform pattern over all hallways of the 3rd floor of the MaRC building. Several sets of test images were taken by driving the robot randomly through the corridors, and extracting every 100th image.

Since the robot moved at different speeds during the test runs, the distance moved between two consecutive images varies. As a rough figure, the average distance between two images is about 0.5 meters. Of these test sequences, only one, consisting of 230 images, is used for the reported data. The robot started in the lower left corner of the map, moved through the middle corridor, and finally stopped at the upper right corner.

All sample images are used in their original orientation along the hallway they were taken in, and additionally rotated by 180 degrees, to generate the eigenspace. This means the eigenspace concentrates on images aligned with the corridor; it is not suited to differentiating between images taken at other orientations, such as diagonally across the hallway. The 24 most significant eigenimages are calculated and used as the eigenspace.

Then all sample images are rotated to every possible orientation at a resolution of 5 degrees and projected into the eigenspace. The resulting 376x72x24 values are stored as a "map", together with the location each image was taken at.
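This map generation can be sketched in Python under the assumption that rotating an unwrapped omnicam image amounts to a circular column shift; all sizes and data below are toy stand-ins for the actual 376 locations, 72 orientations, and 24 eigenimages:

```python
import numpy as np

# Toy sketch of the map generation. The real map has 376 locations,
# 72 orientation steps (5 degrees each), and 24 eigenspace coefficients.
rng = np.random.default_rng(1)
n_locations, width, n_eig = 8, 72, 4
images = rng.standard_normal((n_locations, width))  # 1-D stand-ins for images
eigenspace = rng.standard_normal((n_eig, width))    # stand-in eigenimages

n_orient = width  # one circular shift per 5-degree orientation step
map_coeffs = np.empty((n_locations, n_orient, n_eig))
for i, img in enumerate(images):
    for r in range(n_orient):
        rotated = np.roll(img, r)                # rotate the panoramic image
        map_coeffs[i, r] = eigenspace @ rotated  # project into eigenspace
```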

During runtime, each test image is projected into eigenspace, and the resulting 24 values are used together with the map to generate input for the Condensation algorithm.

The Condensation algorithm starts with a random sample of robot poses (x, y, a values for location and orientation). The presented results used N = 2000 samples for both local and global localization. The algorithm proceeds in a loop, performing the following two steps for each new input image:

Prediction phase: The motion model is applied to each sample. It moves the sample by a predicted distance (the 0.5-meter average reported above) in the direction given by the sample's orientation. Noise is then added to location and orientation to allow for inaccuracies.
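A minimal Python sketch of this prediction step; the noise scales here are assumptions not given in the report:

```python
import numpy as np

rng = np.random.default_rng(2)

def predict(samples, step=0.5, pos_noise=0.1, ang_noise=0.05):
    """Advance each (x, y, a) sample by `step` along its heading a,
    then perturb location and orientation with Gaussian noise.
    Noise scales are assumptions, not values from the report."""
    x, y, a = samples[:, 0], samples[:, 1], samples[:, 2]
    n = len(samples)
    x = x + step * np.cos(a) + rng.normal(0, pos_noise, n)
    y = y + step * np.sin(a) + rng.normal(0, pos_noise, n)
    a = a + rng.normal(0, ang_noise, n)
    return np.column_stack([x, y, a])

samples = np.zeros((2000, 3))  # N = 2000 samples at the origin, heading east
moved = predict(samples)
```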

Update phase: Each sample is weighted by its likelihood given the observation. This is done in the following way: For each sample, the k geometrically nearest map images are selected, based on the location each map image was taken at and the location (x, y) represented by the sample (k was chosen to be 10). The similarity of each of these with the input image is calculated as the sum of squared differences between their projections. These similarities are weighted by the inverse of the geometric distance, and their sum finally yields the weight of the sample.
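A sketch of this weighting for a single sample, with toy map data. The report does not specify how the sum-of-squared-differences score is converted into a likelihood-like similarity, so the exponential below is an assumption:

```python
import numpy as np

rng = np.random.default_rng(3)
n_map, n_eig, k = 50, 24, 10
map_xy = rng.uniform(0, 20, (n_map, 2))          # toy map image locations
map_proj = rng.standard_normal((n_map, n_eig))   # their eigenspace projections

def sample_weight(sample_xy, input_proj):
    # k geometrically nearest map images to the sample's (x, y)
    dists = np.linalg.norm(map_xy - sample_xy, axis=1)
    nearest = np.argsort(dists)[:k]
    # SSD between projections, turned into a similarity
    # (the exponential form is an assumption)
    ssd = np.sum((map_proj[nearest] - input_proj) ** 2, axis=1)
    sim = np.exp(-ssd / n_eig)
    # inverse-distance weighting of the similarities, summed
    return np.sum(sim / (dists[nearest] + 1e-6))

w = sample_weight(np.array([10.0, 10.0]), rng.standard_normal(n_eig))
```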

Finally, the weighted samples are resampled by drawing randomly from them, with the probability of selecting a sample proportional to its weight.
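This importance resampling can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(4)

def resample(samples, weights):
    """Draw N new samples with replacement, with probability
    proportional to each sample's weight."""
    p = weights / weights.sum()
    idx = rng.choice(len(samples), size=len(samples), p=p)
    return samples[idx]

samples = np.arange(10.0).reshape(-1, 1)              # toy 1-D samples
weights = np.array([0., 0, 0, 0, 0, 0, 0, 0, 0, 1])   # all mass on last sample
new = resample(samples, weights)
```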


The presented results (see the presentation page) show both local and global localization. In the first movie, the initial distribution is a Gaussian with its mean near the actual starting location of the robot and variances of several meters. The second movie shows global localization, starting with a uniform distribution over all hallways.

It should be mentioned that after fixing all bugs in the code, the parameters (N, k, the weighting of the samples) were not tweaked; performance can therefore probably be improved further. The output is also not optimal, since the movies show all samples without their weights. Further work will determine whether the results are worth publishing.

What I have learned

- PCA is useful for compressing data to its relevant information, but the initial creation of the eigenspace can be extremely computationally expensive
- Condensation is easy to implement
- Debugging Matlab code is not easy


  1. Dellaert, F., W. Burgard, D. Fox, and S. Thrun. Using the Condensation Algorithm for Robust, Vision-based Mobile Robot Localization. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999.
  2. Turk, M. and A. Pentland. Eigenfaces for Recognition. The Journal of Cognitive Neuroscience, 3(1):71-86, 1991.
