Swimming Locomotion Control

Overview

In this project, we searched for good locomotion controllers for creatures simulated as articulated rigid bodies swimming in a virtual fluid environment. Specifically, we investigated how CMA-ES and reinforcement learning could be used to find those controllers.

CMA-ES

We used the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to find a set of parameters that actuates a creature to swim. For each degree of freedom (DOF) in the articulated creature, we formulate its time-dependent target position as a sinusoidal function: $$q_{targ}=A\sin(\omega t+\phi)$$ During simulation, the torque for each DOF is computed by a stable PD controller that tracks the desired target configuration. We then use CMA-ES to find the set of parameters $\{ A_0,\phi_0,A_1,\phi_1,\dots,A_n,\phi_n \}$ that maximizes the distance the creature travels in a given time period.
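As a rough illustration of this parameterization, the sketch below evaluates the sinusoidal target for each DOF and turns the tracking error into a torque. It is a minimal sketch under assumptions, not the project's code: the function names, the gains `kp` and `kd`, the shared frequency `omega`, and the example values are all hypothetical, and a plain PD law stands in for the stable PD controller mentioned above.

```python
import math


def target_positions(params, omega, t):
    """Evaluate q_targ = A * sin(omega * t + phi) for every DOF.

    params is laid out as [A_0, phi_0, A_1, phi_1, ...]; omega is assumed
    to be shared by all DOFs (an assumption made for this sketch).
    """
    amplitudes = params[0::2]
    phases = params[1::2]
    return [A * math.sin(omega * t + phi) for A, phi in zip(amplitudes, phases)]


def pd_torques(q, qdot, q_targ, kp, kd):
    """Plain PD tracking torque per DOF.

    This is the textbook PD law, used as a stand-in; it is not the
    stable PD formulation.
    """
    return [kp * (qt - qi) - kd * qdi for qi, qdi, qt in zip(q, qdot, q_targ)]


# Hypothetical two-DOF creature, at rest in its zero pose at t = 0.
params = [0.5, 0.0, 0.3, math.pi / 2]
q_targ = target_positions(params, omega=2.0, t=0.0)
tau = pd_torques([0.0, 0.0], [0.0, 0.0], q_targ, kp=10.0, kd=1.0)
```

An optimizer such as CMA-ES would then score each candidate `params` vector by the distance the simulated creature travels; the simulation itself is not shown here.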

Using this formulation, we obtained the following swimming controller for an articulated flatworm.

Flatworm

Deep Reinforcement Learning

To get a more robust controller that is state dependent as well as time dependent, we used deep reinforcement learning to find good controllers. Specifically, we used the Proximal Policy Optimization (PPO) algorithm for policy search. We build neural network policies that take all the DOFs of an articulated creature as input and output the torque to be generated for each DOF.
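To make the input/output shape of such a policy concrete, here is a minimal sketch of a small feed-forward network that maps DOF values to one torque per DOF. Everything in it is an assumption for illustration: the layer sizes, the tanh activations, the random initialization, and the names `init_layer`, `dense`, and `policy_forward` are hypothetical. It shows only the forward pass, not Proximal Policy Optimization training.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create (weights, biases) with small random weights (illustrative init)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply one fully connected layer, optionally followed by an activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def policy_forward(observation, layers):
    """Map an observation (one value per DOF) to one torque per DOF."""
    h = observation
    for layer in layers[:-1]:
        h = dense(h, layer, math.tanh)  # hidden layers use tanh
    return dense(h, layers[-1])  # linear output layer: raw torques


# Hypothetical creature with 4 DOFs and two hidden layers of 16 units.
rng = random.Random(0)
n_dofs = 4
layers = [
    init_layer(n_dofs, 16, rng),
    init_layer(16, 16, rng),
    init_layer(16, n_dofs, rng),
]
torques = policy_forward([0.1, -0.2, 0.0, 0.3], layers)
```

In a stochastic policy-gradient setting, an output like this would typically serve as the mean of an action distribution rather than being applied directly; that detail is omitted here.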

Here are some fun results.

Flatworm

Turtle With Symmetric Strokes

Turtle With Asymmetric Strokes

Humanoid