[Image: rays-map.png]

While working on the DARPA LAGR project, we found it exceedingly difficult to tune our reactive behaviors to work well in cluttered and patchy environments. Either obstacle avoidance was too sensitive and the robot would not drive through gaps, or it was too aggressive and the robot would collide with obstacles. At first we made the behaviors more and more complicated, introducing more parameters to tune, but we really wanted an efficient way to demonstrate to the robot how to drive.


Thus, we developed a system for interactive, on-line training of behaviors with a remote control. The user flips a switch to training mode, and drives the robot how they would like it to drive, then flips the switch back to autonomous to test the behavior. If the robot doesn’t perform well and needs additional training, the user just flips the switch again and provides an example of correctly performing whatever maneuver the robot goofed up on. When in training mode, the robot is recording training instances of its stereo perception and the direction to its goal location, along with the motor command that the user provides with the remote control.

Under the hood, the robot is recording training instances that associate its sensor state with the action the human demonstrates. At run-time, the robot finds the closest-matching example to its current sensor state in its database of recorded training instances, and executes the demonstrated motor command from that closest match.
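The record-then-look-up loop described above can be sketched in a few lines of Python. This is only an illustration, not our actual implementation: the class name, the method names, and the two-element command format (forward speed, turn rate) are assumptions made for the example.

```python
import numpy as np


class DemonstrationController:
    """Stores (sensor state, motor command) pairs and replays the nearest match."""

    def __init__(self):
        self.states = []    # recorded sensor-state vectors
        self.commands = []  # motor commands demonstrated for those states

    def record(self, state, command):
        """Training mode: store what the robot sensed and what the user did."""
        self.states.append(np.asarray(state, dtype=float))
        self.commands.append(np.asarray(command, dtype=float))

    def act(self, state):
        """Autonomous mode: return the command of the closest recorded state."""
        if not self.states:
            raise RuntimeError("no training instances recorded yet")
        database = np.vstack(self.states)
        query = np.asarray(state, dtype=float)
        # Euclidean distance from the query to every stored state.
        distances = np.linalg.norm(database - query, axis=1)
        return self.commands[int(np.argmin(distances))]


# Hypothetical usage: two demonstrations, then one autonomous query.
controller = DemonstrationController()
controller.record([1.0, 1.0, 0.0], [0.5, 0.0])   # open space: drive straight
controller.record([0.2, 1.0, 0.3], [0.2, -0.4])  # obstacle close by: slow, turn
command = controller.act([0.25, 0.9, 0.3])       # closest to the second example
```

A brute-force scan like this is linear in the number of training instances, which is simple to reason about when new examples are added on-line.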

The sensor state is a vector containing range measurements to the nearby obstacles, together with the robot’s heading to the goal. The obstacle measurements are regularly spaced around the robot, analogous to a laser range scan, but are in fact generated from a top-down-view obstacle map that the robot accumulates using stereo.
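One way such a scan-like vector could be produced is by casting regularly spaced rays through a grid map. The sketch below assumes a boolean occupancy grid, a fixed step size, and ranges measured in cells. The function names, the number of rays, and the maximum range are all illustrative choices rather than the values used on the robot.

```python
import math

import numpy as np


def cast_rays(grid, robot_rc, num_rays=16, max_range=20.0, step=0.25):
    """Return num_rays ranges (in cells) from robot_rc to the nearest obstacle.

    grid is a 2D boolean array that is True where a cell holds an obstacle.
    Rays that find no obstacle, or that leave the map, report max_range.
    """
    rows, cols = grid.shape
    r0, c0 = robot_rc
    ranges = np.full(num_rays, max_range)
    for i in range(num_rays):
        angle = 2.0 * math.pi * i / num_rays
        dr, dc = math.sin(angle), math.cos(angle)
        dist = step
        while dist < max_range:
            r = int(round(r0 + dist * dr))
            c = int(round(c0 + dist * dc))
            if not (0 <= r < rows and 0 <= c < cols):
                break  # left the map: keep max_range for this ray
            if grid[r, c]:
                ranges[i] = dist
                break
            dist += step
    return ranges


def sensor_state(grid, robot_rc, goal_heading, **kwargs):
    """Append the heading to the goal (radians) to the ray ranges."""
    return np.append(cast_rays(grid, robot_rc, **kwargs), goal_heading)


# Hypothetical usage: a wall five cells to one side of the robot.
grid = np.zeros((21, 21), dtype=bool)
grid[:, 15] = True
state = sensor_state(grid, (10, 10), goal_heading=0.3)
```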

[Image: info_flow_web.png]

We found that computing the nearest neighbor using the full-dimensional range vector did not yield very good perceptual matches. The reason, we believe, is that the distances between points in high-dimensional spaces approach the same value as the dimensionality increases, so the signal gets lost in the noise. Because squared Euclidean distance is the sum of the squared differences in each dimension, an important variation in a few dimensions becomes a small fraction of the total summed over many dimensions.
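The concentration effect is easy to reproduce on synthetic data. The following sketch uses uniformly random points, which is an assumption made purely for illustration and not a model of the robot's range vectors. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance.

```python
import numpy as np


def relative_contrast(dim, num_points=500, seed=0):
    """Return (farthest - nearest) / nearest distance from a random query."""
    rng = np.random.default_rng(seed)
    points = rng.random((num_points, dim))  # uniform in the unit hypercube
    query = rng.random(dim)
    distances = np.linalg.norm(points - query, axis=1)
    return (distances.max() - distances.min()) / distances.min()


# The contrast shrinks as the dimensionality grows, so the "nearest"
# neighbor becomes only marginally nearer than everything else.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 2))
```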

Because of this dimensionality problem, we perform PCA on the training database of perception states, then project new perceptual states at runtime onto the same principal axes. We found empirically that the first 6 principal axes capture the environment very well. In addition to reducing dimensionality, PCA also averages out noise in the range measurements caused by patchy perception or small false obstacles.
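A minimal sketch of this projection step, using NumPy's SVD to obtain the principal axes, might look like the following. The class and method names are hypothetical. The sketch also projects the entire state vector, whereas which components of the state go through PCA on the real robot is not spelled out here.

```python
import numpy as np


class PcaNearestNeighbor:
    """Nearest-neighbor lookup in the space of the leading principal axes."""

    def __init__(self, states, commands, num_axes=6):
        states = np.asarray(states, dtype=float)
        self.commands = np.asarray(commands, dtype=float)
        self.mean = states.mean(axis=0)
        # Rows of vt are the principal axes, ordered by decreasing variance.
        _, _, vt = np.linalg.svd(states - self.mean, full_matrices=False)
        self.axes = vt[:num_axes]
        # Project the training database once, up front.
        self.projected = (states - self.mean) @ self.axes.T

    def project(self, state):
        """Project a new state onto the axes learned from the training data."""
        return (np.asarray(state, dtype=float) - self.mean) @ self.axes.T

    def act(self, state):
        """Return the command of the closest training state after projection."""
        distances = np.linalg.norm(self.projected - self.project(state), axis=1)
        return self.commands[int(np.argmin(distances))]


# Hypothetical usage with random stand-in data: 50 states of 17 values each.
rng = np.random.default_rng(0)
train_states = rng.random((50, 17))
train_commands = rng.random((50, 2))
model = PcaNearestNeighbor(train_states, train_commands)
command = model.act(train_states[3])
```

Because the axes are computed from the training database, they would need to be recomputed (or updated) whenever new demonstrations are added.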


Learning Mobile Robot Behaviors by Example. R. Roberts, C. Pippin, and T. Balch. Journal of Field Robotics, Special Issue on Learning Applied to Ground Robots (to appear).
