Learning video processing by example

Antonio Haro, Irfan A. Essa


Abstract: We present an algorithm that approximates the output of an arbitrary video processing algorithm based on a pair of input and output exemplars. Our algorithm relies on learning the mapping between the input and output exemplars to model the processing that has taken place. We approximate the processing by observing that pixel neighborhoods similar in appearance and motion to those in the exemplar input should result in neighborhoods similar to those in the exemplar output. Since the exemplars contain only a limited number of pixel neighborhoods, we use techniques from texture synthesis to generalize to neighborhoods not observed in the exemplars. The same algorithm is used to learn processing such as motion blur, color correction, and painting.
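The core idea can be illustrated with a brute-force sketch (a simplification, not the authors' implementation): describe each pixel by its surrounding neighborhood, find the most similar neighborhood in the exemplar input, and copy over the corresponding exemplar output pixel. This is essentially the "nearest neighbor output" baseline shown below; the full algorithm also matches on motion and uses texture synthesis to generalize. The patch radius and the grayscale (frames, height, width) layout are assumptions here.

    import numpy as np

    def neighborhoods(video, r=2):
        """Per pixel, collect its (2r+1)x(2r+1) spatial neighborhood."""
        T, H, W = video.shape
        pad = np.pad(video.astype(np.float64), ((0, 0), (r, r), (r, r)),
                     mode="edge")
        patches = [pad[:, dy:dy + H, dx:dx + W]
                   for dy in range(2 * r + 1) for dx in range(2 * r + 1)]
        return np.stack(patches, axis=-1)  # (T, H, W, (2r+1)^2)

    def nn_transfer(train_in, train_out, test, r=2):
        """Brute-force nearest-neighbor transfer; only practical for tiny clips."""
        src = neighborhoods(train_in, r).reshape(-1, (2 * r + 1) ** 2)
        out_px = train_out.reshape(-1)
        result = np.empty_like(test)
        for t, frame in enumerate(neighborhoods(test, r)):
            q = frame.reshape(-1, src.shape[1])
            # squared L2 distance from every test neighborhood to every
            # exemplar neighborhood, then copy the best match's output pixel
            d = ((q[:, None, :] - src[None, :, :]) ** 2).sum(axis=-1)
            result[t] = out_px[d.argmin(axis=1)].reshape(test.shape[1:])
        return result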

Please note: sequences have been compressed to reduce download time. As such, some contain compression artifacts.


Color correction

[Videos: training input, training output, test sequence, algorithm output]
This is a toy example showing the learning of color effects. It is simple because the same color transformation is applied to every pixel.
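Because this toy effect applies the same color transformation everywhere, it can be learned without neighborhood matching at all: quantize the exemplar input colors and record the output color each one maps to. The sketch below is a hypothetical illustration (the bin count and uint8 RGB input are assumptions); colors never seen in training stay unmapped, which is the kind of gap the texture-synthesis generalization in the full method is meant to address.

    import numpy as np

    def _quantize(pixels, bins):
        """Map uint8 RGB rows to indices in a bins^3 color cube."""
        q = (pixels.astype(np.float64) / 255 * (bins - 1)).astype(np.int64)
        return (q[:, 0] * bins + q[:, 1]) * bins + q[:, 2]

    def learn_color_map(train_in, train_out, bins=32):
        """Average output color observed for each quantized input color."""
        idx = _quantize(train_in.reshape(-1, 3), bins)
        lut = np.zeros((bins ** 3, 3))
        cnt = np.zeros(bins ** 3)
        np.add.at(lut, idx, train_out.reshape(-1, 3).astype(np.float64))
        np.add.at(cnt, idx, 1)
        seen = cnt > 0
        lut[seen] /= cnt[seen][:, None]
        return lut  # colors never seen in training stay mapped to black

    def apply_color_map(lut, frame, bins=32):
        idx = _quantize(frame.reshape(-1, 3), bins)
        return np.clip(lut[idx], 0, 255).reshape(frame.shape).astype(np.uint8)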


Noise removal

[Videos: training input, training output, test sequence, algorithm output]
In this example, our algorithm learns to remove noise from an input sequence in a temporally coherent manner. The noise used in training is Gaussian and varies from frame to frame.
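One plausible way to construct the training pair described above (an assumption about the setup, not a statement of the authors' data): start from a clean clip, add independent Gaussian noise to each frame to form the training input, and keep the clean clip as the training output.

    import numpy as np

    def make_denoising_pair(clean, sigma=15.0, seed=0):
        """Noisy copy of a clean clip -> training input; clean -> output."""
        rng = np.random.default_rng(seed)
        noisy = np.empty_like(clean, dtype=np.float64)
        for t in range(clean.shape[0]):
            # fresh noise per frame, matching "varies from frame to frame"
            noisy[t] = clean[t] + rng.normal(0.0, sigma, clean[t].shape)
        return np.clip(noisy, 0, 255).astype(np.uint8), clean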


Motion blur

[Videos: training input, training output, test sequence, algorithm output]
This example demonstrates that our algorithm can learn temporal processing, in addition to the spatial processing in the examples above. The input is a pair of sequences rendered in Maya, one with motion blur and one without. The motion blur generated by our system is not as good as Maya's, but this is because Maya uses 3D information that our algorithm does not have access to. We also lose some of the blur because our temporal kernel is small (for performance reasons).
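The "temporal kernel" mentioned above can be pictured as a spatio-temporal neighborhood: each pixel is described by patches drawn from a few adjacent frames, so matching can respond to motion as well as appearance. A sketch with assumed window sizes follows; widening the temporal window captures more of the blur, but the feature size and search cost grow with it, which is the performance trade-off noted above.

    import numpy as np

    def spatiotemporal_features(video, r=2, tr=1):
        """Concatenate (2r+1)x(2r+1) patches over frames t-tr..t+tr per pixel."""
        T, H, W = video.shape
        pad = np.pad(video.astype(np.float64),
                     ((tr, tr), (r, r), (r, r)), mode="edge")
        patches = [pad[dt:dt + T, dy:dy + H, dx:dx + W]
                   for dt in range(2 * tr + 1)
                   for dy in range(2 * r + 1)
                   for dx in range(2 * r + 1)]
        # feature length (2*tr+1)*(2*r+1)^2 grows linearly with the temporal
        # window, which is why a small temporal kernel is cheaper to match
        return np.stack(patches, axis=-1)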


Painting

This is an example of an alternative use for our algorithm. In this case, the processing is non-obvious; the goal is to make the video look like a painting. For training, we anisotropically blur a Van Gogh painting and present the blurred version as the training input and the original painting as the training output. Our algorithm then generates a non-photorealistically rendered version of our test input video.
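The training pair construction described above might look like the following sketch. The exact anisotropic blur used is not specified on this page; an axis-aligned anisotropic Gaussian (a different sigma per image axis) stands in for it here.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def make_painting_pair(painting, sigma_y=1.0, sigma_x=4.0):
        """Blurred painting -> training input; original -> training output."""
        blurred = np.stack(
            [gaussian_filter(painting[..., c].astype(np.float64),
                             sigma=(sigma_y, sigma_x))
             for c in range(painting.shape[-1])], axis=-1)
        return np.clip(blurred, 0, 255).astype(np.uint8), painting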


Pulp Fiction, copyright 1994, Miramax Film Corp.
[Videos: training input, training output, test sequence, algorithm output]

[Four additional examples, each with videos: training input, training output, test sequence, nearest neighbor output, our algorithm's output]



Publications

"Learning video processing by example",
A. Haro and I. Essa
Proceedings 16th International Conference on Pattern Recognition, Quebec, Canada, August 2002.

