Localization is often divided into two variants: local and global localization. Local localization tracks the robot from a known start position; global localization starts with no knowledge of the robot's position.
Many localization algorithms use probabilistic models. Kalman filters are frequently used, but because they are restricted to unimodal distributions, they are not suited for global localization. Moreover, once the robot's track is lost, such algorithms usually cannot relocalize it.
Markov localization algorithms extend the Kalman filter to multimodal distributions by using either a topological or a grid-based model. In both cases, the probability of the robot being at each of a set of discrete locations is maintained. For large environments, both models require updating a large number of such discrete locations.
Monte Carlo Localization concentrates computational power on areas where the robot is more likely to be found, while still allowing multimodal distributions. This is done by sampling the probability distribution and updating the samples based on a motion model and observations of the environment.
In this project, the input is extended to include more information from the image. Furthermore an omnicam is used, providing rotationally independent images.
Because of the sheer amount of data in the input images, not all of it can be used. Principal Component Analysis is commonly used to compress images to their most significant information, so that different images can best be distinguished. For large data sets, the standard way of calculating the covariance matrix becomes too computationally expensive; instead, the eigenimages can be calculated indirectly, without forming the full covariance matrix.
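The indirect computation can be sketched as follows: instead of the large d x d covariance matrix, one eigendecomposes the much smaller n x n inner-product matrix of the n training images (the trick known from eigenface-style approaches) and maps its eigenvectors back into image space. This is a minimal sketch with NumPy; the function and variable names are my own, not from the original code.

```python
import numpy as np

def eigenimages(images, k):
    """Compute the k most significant eigenimages indirectly.

    images: (n, d) array, one flattened image per row, with n << d.
    Rather than the d x d covariance matrix, eigendecompose the
    small n x n inner-product matrix and back-project its eigenvectors.
    """
    mean = images.mean(axis=0)
    A = images - mean                    # centered data, shape (n, d)
    L = A @ A.T                          # small n x n inner-product matrix
    vals, vecs = np.linalg.eigh(L)       # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]   # indices of the top-k eigenvalues
    U = A.T @ vecs[:, order]             # back-project into image space
    U /= np.linalg.norm(U, axis=0)       # normalize each eigenimage
    return mean, U                       # U has shape (d, k)
```

Back-projection works because if v is an eigenvector of A A^T, then A^T v is an eigenvector of the covariance matrix A^T A with the same eigenvalue.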
Since the robot moved at different speeds during the test runs, the distance covered between two images varies; to give an impression, the average distance is roughly 0.5 meters. Of these test sequences, only one, consisting of 230 images, is used for the reported data. The robot started in the lower left corner of the map, moved through the middle corridor, and finally stopped in the upper right corner.
To generate the eigenspace, all sample images are used in their original orientation along the hallway in which they were taken, and additionally rotated by 180 degrees. The eigenspace therefore concentrates on images aligned with the corridor; it is not suited to differentiating between images taken at other orientations, such as diagonally across the hallway. The 24 most significant eigenimages are calculated and used as the eigenspace.
Then every sample image is rotated to every possible orientation, at a resolution of 5 degrees, and projected into the eigenspace. The resulting 376*72*24 values are stored as a "map", together with the location each image was taken at.
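The map-building step above can be sketched as follows. For an unwrapped omnicam panorama, a rotation of the robot corresponds approximately to a cyclic column shift of the image; that shift-based rotation model, and all names here, are my own assumptions for illustration, not the original implementation.

```python
import numpy as np

def build_map(panoramas, poses, U, mean, step_deg=5):
    """Project each sample panorama at every orientation into eigenspace.

    panoramas: (n, h, w) unwrapped omnicam images. A robot rotation is
    modeled as a cyclic column shift of the panorama (an assumption).
    U: (h*w, k) eigenimages; mean: (h*w,) mean image.
    Returns an (n, 360 // step_deg, k) array of projections plus poses.
    """
    n, h, w = panoramas.shape
    n_rot = 360 // step_deg
    proj = np.empty((n, n_rot, U.shape[1]))
    for i, img in enumerate(panoramas):
        for r in range(n_rot):
            shift = int(round(r * step_deg / 360 * w))
            rotated = np.roll(img, shift, axis=1)  # cyclic shift = rotation
            proj[i, r] = (rotated.ravel() - mean) @ U
    return proj, poses
```

With 376 sample images, 72 orientations, and 24 eigenimages, the `proj` array holds exactly the 376*72*24 values mentioned above.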
During runtime, each test image is projected into eigenspace, and the resulting 24 values are used together with the map to generate input for the Condensation algorithm.
The Condensation algorithm starts with a random set of robot pose samples (x, y, a values for the location and orientation of the robot). The presented results used N=2000 samples for both local and global localization. The algorithm proceeds in a loop, performing the following two steps for each new input image:
Prediction phase: Apply the motion model to each sample. The motion model moves the sample by a predicted distance (the 0.5-meter average reported above) in the direction given by the sample's orientation. Noise is then added to the location and orientation to allow for inaccuracies.
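The prediction phase can be sketched as follows; the noise magnitudes are hypothetical placeholders, as the original parameters are not reported.

```python
import numpy as np

def predict(samples, step=0.5, pos_noise=0.2, ang_noise=0.1, rng=None):
    """Motion model: advance each (x, y, a) sample by `step` meters
    along its heading a, then add Gaussian noise.

    pos_noise and ang_noise are assumed standard deviations, not the
    values used in the original project.
    """
    rng = np.random.default_rng() if rng is None else rng
    x, y, a = samples[:, 0], samples[:, 1], samples[:, 2]
    x = x + step * np.cos(a) + rng.normal(0.0, pos_noise, len(x))
    y = y + step * np.sin(a) + rng.normal(0.0, pos_noise, len(y))
    a = (a + rng.normal(0.0, ang_noise, len(a))) % (2 * np.pi)
    return np.column_stack([x, y, a])
```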
Update phase: Each sample is weighted by its likelihood given the observation, as follows: for each sample, the k geometrically nearest map images are selected, based on the actual location each map image was taken at and the location (x, y) represented by the sample (k was chosen to be 10). The similarity of each of these to the input image is calculated from the sum of squared differences between the projections. These similarities are weighted by the inverse of the geometric distance, and their sum finally yields the weight of the sample.
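A sketch of this weighting step follows. How the sum of squared differences is converted into a similarity is not specified above, so the `1 / (1 + ssd)` mapping is my assumption, as are the names and the small epsilon guarding against division by zero; orientation handling is omitted for brevity.

```python
import numpy as np

def weight_samples(samples, map_xy, map_proj, obs_proj, k=10):
    """Weight each sample by image similarity to its k nearest map images.

    map_xy: (m, 2) locations the map images were taken at.
    map_proj: (m, c) eigenspace projections of the map images.
    obs_proj: (c,) eigenspace projection of the current input image.
    Returns normalized weights, one per sample.
    """
    weights = np.empty(len(samples))
    for i, s in enumerate(samples):
        d = np.linalg.norm(map_xy - s[:2], axis=1)  # geometric distances
        idx = np.argsort(d)[:k]                     # k nearest map images
        ssd = np.sum((map_proj[idx] - obs_proj) ** 2, axis=1)
        sim = 1.0 / (1.0 + ssd)                     # assumed similarity measure
        weights[i] = np.sum(sim / (d[idx] + 1e-6))  # inverse-distance weighting
    return weights / weights.sum()
```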
Finally, the weighted samples are resampled by drawing from them at random, with the probability of selecting a sample proportional to its weight.
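This resampling step amounts to drawing with replacement in proportion to the weights (multinomial resampling), which a NumPy sketch makes concrete:

```python
import numpy as np

def resample(samples, weights, rng=None):
    """Draw N new samples with replacement, with probability
    proportional to weight (multinomial resampling)."""
    rng = np.random.default_rng() if rng is None else rng
    p = weights / weights.sum()
    idx = rng.choice(len(samples), size=len(samples), p=p)
    return samples[idx]
```

After resampling, high-weight poses appear multiple times while low-weight poses tend to disappear, which is what concentrates the samples where the robot is likely to be.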
It should be mentioned that after fixing all bugs in the code, the parameters (N, k, and the weighting of the samples) were not tweaked, so the performance can probably be improved further. The output is also not optimal, since the movie shows all samples without weighting. Further work will determine whether the results are worth publishing.