The Membrane tracking method is based on the ideas
of active contours, together with some potential field techniques from robotics. Active contour models, such as the classic Snakes algorithm, use intensity gradients from images to fit contours to the underlying features of objects. Each snake-like contour has an internal and an external energy which it tries to minimize. The external energy is the effect of gradients pulling the contour into a close fit of the features; minimizing the internal energy keeps the contour tight and smooth.
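As a sketch, the discrete snake energy described above might be written as follows. The weights `alpha`, `beta`, and `w_ext`, and the `grad_mag` array of precomputed gradient magnitudes, are illustrative names for this sketch, not part of the original system:

```python
def snake_energy(points, grad_mag, alpha=1.0, beta=1.0, w_ext=1.0):
    """Discrete snake energy for a closed contour: internal terms
    (tension + curvature) plus an external term that rewards lying
    on strong image gradients."""
    n = len(points)
    e_int = 0.0
    for i in range(n):
        x0, y0 = points[i - 1]          # previous point (wraps: closed contour)
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]    # next point
        # Tension: penalize stretching between neighbors (keeps contour tight).
        e_int += alpha * ((x1 - x0) ** 2 + (y1 - y0) ** 2)
        # Curvature: penalize bending via the second difference (keeps it smooth).
        e_int += beta * ((x0 - 2 * x1 + x2) ** 2 + (y0 - 2 * y1 + y2) ** 2)
    # External: strong gradients lower the energy, pulling the contour to edges.
    e_ext = -w_ext * sum(grad_mag[y][x] for x, y in points)
    return e_int + e_ext
```

Minimizing this quantity over the point positions is what fits the contour to the object.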
In our approach we hoped to track multiple subjects in real time, but computing image gradients and performing the full energy minimization were too expensive for that. Additionally, we wanted to maintain the integrity of logical objects rather than of individual features: regardless of changes in lighting or in a subject's appearance, its logical model should remain intact.
Our initial solution was to use a medium more conducive to distinguishing separate logical objects than intensity or color images. Since algorithms exist that compute disparity maps at frame rate, such as SRI's Small Vision System, we chose to use disparity in our codebase. Our snakes would then be placed around (or within) the objects to be tracked and would grow or shrink to fit the contour of the entire object in the disparity map.

For our contours we used a simple connected spline of points. Minimizing the contour's energy produces a tight fit around the initially selected logical object. To achieve this, we were able to localize the problem to each individual point: conceptually, each point is affected by a number of forces that attract and repel it, leading to the desired effect.
The net force acting on each point determines where the point will travel, and it is moved in that direction. The calculation is performed once for every point in the spline at each frame; the iterations needed to minimize the energy occur through repetition over sequential frames, which keeps tracking real-time.
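The per-point update above can be sketched as follows. The spring constant `k_int` and the `external_force` callback (which would come from the disparity image in the real system) are assumed names for illustration:

```python
def update_point(points, i, external_force, k_int=0.5, step=1.0):
    """One per-frame update for spline point i: sum the internal spring
    force (toward the midpoint of its neighbors, keeping the contour
    tight and smooth) and an external force, then step in that direction."""
    n = len(points)
    x, y = points[i]
    px, py = points[i - 1]              # previous neighbor (closed contour)
    nx, ny = points[(i + 1) % n]        # next neighbor
    # Internal force: pull the point toward its neighbors' midpoint.
    fx = k_int * ((px + nx) / 2.0 - x)
    fy = k_int * ((py + ny) / 2.0 - y)
    # External force: attraction toward image/disparity features
    # (hypothetical callback standing in for the real external term).
    ex, ey = external_force(x, y)
    return (x + step * (fx + ex), y + step * (fy + ey))

def update_contour(points, external_force):
    """One pass over the whole spline -- repeated on each new frame."""
    return [update_point(points, i, external_force) for i in range(len(points))]
```

Running `update_contour` once per frame amortizes the minimization over the video sequence instead of iterating to convergence on a single frame.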

One of the largest problems in tracking is occlusion, where the object being tracked moves in front of or behind another object. We hoped to resolve this problem for our algorithm and tried a number of methods. Once we discovered that our algorithm was fast enough to track a large number of objects simultaneously, we narrowed the problem to occlusion between tracked objects, since all large objects in view could be tracked.
One of the initial ideas was to find the mean of the movement of the unoccluded points and translate the occluded points in that direction. This proved ineffective, since the motion of an object and the motion of its individual points are not necessarily related. For example, a person may move left, but the points on top of his head actually fall to his shoulders (by moving along their line of freedom) rather than translating left.
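For comparison, the rejected mean-translation heuristic is simple to state; the names below are illustrative, assuming parallel lists of previous and current point positions plus an occlusion flag per point:

```python
def mean_translation(prev_pts, curr_pts, occluded):
    """Average displacement of the unoccluded points only."""
    dxs = [cx - px for (px, _), (cx, _), occ
           in zip(prev_pts, curr_pts, occluded) if not occ]
    dys = [cy - py for (_, py), (_, cy), occ
           in zip(prev_pts, curr_pts, occluded) if not occ]
    return (sum(dxs) / len(dxs), sum(dys) / len(dys))

def extrapolate_occluded(prev_pts, curr_pts, occluded):
    """Move each occluded point by the mean motion of the visible ones
    (the approach that proved ineffective)."""
    dx, dy = mean_translation(prev_pts, curr_pts, occluded)
    return [(px + dx, py + dy) if occ else cur
            for (px, py), cur, occ in zip(prev_pts, curr_pts, occluded)]
```

The failure mode is exactly the one described: points constrained to slide along their line of freedom do not share the object's mean displacement.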
Our final decision was to use a closest-point algorithm that matches the previous unoccluded contour to the current one. We needed the translation of the previous contour that best approximates the new contour. To find it, for each unoccluded point in the translated previous contour we compute the distance to its closest point on the new contour; summing these distances measures the quality of the approximation. Minimizing this sum gives an optimal translation of the previous contour onto the current one, which is then applied to the occluded points.
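The closest-point matching can be sketched as follows. The original text does not say how the minimization was performed, so a brute-force grid search over integer translations is assumed here purely for illustration:

```python
import math

def fit_cost(prev_pts, new_pts, dx, dy):
    """Sum, over the translated previous (unoccluded) points, of the
    distance to the nearest point on the new contour."""
    total = 0.0
    for px, py in prev_pts:
        tx, ty = px + dx, py + dy
        total += min(math.hypot(tx - nx, ty - ny) for nx, ny in new_pts)
    return total

def best_translation(prev_unoccluded, new_pts, search=5, step=1.0):
    """Find the translation minimizing the closest-point cost by grid
    search over a small window (an assumed minimization strategy)."""
    best, best_cost = (0.0, 0.0), fit_cost(prev_unoccluded, new_pts, 0.0, 0.0)
    for i in range(-search, search + 1):
        for j in range(-search, search + 1):
            dx, dy = i * step, j * step
            cost = fit_cost(prev_unoccluded, new_pts, dx, dy)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

The resulting translation would then be applied to the occluded points of the previous contour to estimate their current positions.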

