Learning video processing by example

Antonio Haro, Irfan A. Essa


Abstract: We present an algorithm that approximates the output of an arbitrary video processing algorithm based on a pair of input and output exemplars. Our algorithm relies on learning the mapping between the input and output exemplars to model the processing that has taken place. We approximate the processing by observing that pixel neighborhoods similar in appearance and motion to those in the exemplar input should result in neighborhoods similar to those in the exemplar output. Since the exemplars contain only a limited number of pixel neighborhoods, we use techniques from texture synthesis to generalize to neighborhoods not observed in the exemplars. The same algorithm is used to learn processing such as motion blur, color correction, and painting.
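The core idea can be illustrated with a brute-force sketch (a simplification, not the authors' implementation): describe each pixel by its surrounding neighborhood, find the most similar neighborhood in the exemplar input, and copy over the corresponding exemplar output pixel. This is essentially the "nearest neighbor output" baseline shown below; the full algorithm also matches on motion and uses texture synthesis to generalize. The patch radius and the grayscale (frames, height, width) layout are assumptions here.

    import numpy as np

    def neighborhoods(video, r=2):
        """Per pixel, collect its (2r+1)x(2r+1) spatial neighborhood."""
        T, H, W = video.shape
        pad = np.pad(video.astype(np.float64), ((0, 0), (r, r), (r, r)),
                     mode="edge")
        patches = [pad[:, dy:dy + H, dx:dx + W]
                   for dy in range(2 * r + 1) for dx in range(2 * r + 1)]
        return np.stack(patches, axis=-1)  # (T, H, W, (2r+1)^2)

    def nn_transfer(train_in, train_out, test, r=2):
        """Brute-force nearest-neighbor transfer; only practical for tiny clips."""
        src = neighborhoods(train_in, r).reshape(-1, (2 * r + 1) ** 2)
        out_px = train_out.reshape(-1)
        result = np.empty_like(test)
        for t, frame in enumerate(neighborhoods(test, r)):
            q = frame.reshape(-1, src.shape[1])
            # squared L2 distance from every test neighborhood to every
            # exemplar neighborhood, then copy the best match's output pixel
            d = ((q[:, None, :] - src[None, :, :]) ** 2).sum(axis=-1)
            result[t] = out_px[d.argmin(axis=1)].reshape(test.shape[1:])
        return result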

Please note: sequences have been compressed to reduce download time. As such, some contain compression artifacts.


Color correction

[Videos: training input, training output, test sequence, algorithm output]
This is a toy example showing the learning of color effects. It is simple because the same color transformation is applied to every pixel.
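Because this toy effect applies the same color transformation everywhere, it can be learned without neighborhood matching at all: quantize the exemplar input colors and record the output color each one maps to. The sketch below is a hypothetical illustration (the bin count and uint8 RGB input are assumptions); colors never seen in training stay unmapped, which is the kind of gap the texture-synthesis generalization in the full method is meant to address.

    import numpy as np

    def _quantize(pixels, bins):
        """Map uint8 RGB rows to indices in a bins^3 color cube."""
        q = (pixels.astype(np.float64) / 255 * (bins - 1)).astype(np.int64)
        return (q[:, 0] * bins + q[:, 1]) * bins + q[:, 2]

    def learn_color_map(train_in, train_out, bins=32):
        """Average output color observed for each quantized input color."""
        idx = _quantize(train_in.reshape(-1, 3), bins)
        lut = np.zeros((bins ** 3, 3))
        cnt = np.zeros(bins ** 3)
        np.add.at(lut, idx, train_out.reshape(-1, 3).astype(np.float64))
        np.add.at(cnt, idx, 1)
        seen = cnt > 0
        lut[seen] /= cnt[seen][:, None]
        return lut  # colors never seen in training stay mapped to black

    def apply_color_map(lut, frame, bins=32):
        idx = _quantize(frame.reshape(-1, 3), bins)
        return np.clip(lut[idx], 0, 255).reshape(frame.shape).astype(np.uint8)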


Noise removal

[Videos: training input, training output, test sequence, algorithm output]
In this example, our algorithm learns to remove noise from an input sequence in a temporally coherent manner. The noise used in training is Gaussian and varies from frame to frame.
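One plausible way to construct the training pair described above (an assumption about the setup, not a statement of the authors' data): start from a clean clip, add independent Gaussian noise to each frame to form the training input, and keep the clean clip as the training output.

    import numpy as np

    def make_denoising_pair(clean, sigma=15.0, seed=0):
        """Noisy copy of a clean clip -> training input; clean -> output."""
        rng = np.random.default_rng(seed)
        noisy = np.empty_like(clean, dtype=np.float64)
        for t in range(clean.shape[0]):
            # fresh noise per frame, matching "varies from frame to frame"
            noisy[t] = clean[t] + rng.normal(0.0, sigma, clean[t].shape)
        return np.clip(noisy, 0, 255).astype(np.uint8), clean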


Motion blur

[Videos: training input, training output, test sequence, algorithm output]
This example demonstrates that our algorithm can learn temporal processing, in addition to the spatial processing in the examples above. The input is a pair of sequences rendered in Maya, one with motion blur and one without. The motion blur generated by our system is not as good as Maya's, but this is because Maya uses 3D information that our algorithm does not have access to. We also lose some of the blur because our temporal kernel is small (for performance reasons).
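The "temporal kernel" mentioned above can be pictured as a spatio-temporal neighborhood: each pixel is described by patches drawn from a few adjacent frames, so matching can respond to motion as well as appearance. A sketch with assumed window sizes follows; widening the temporal window captures more of the blur, but the feature size and search cost grow with it, which is the performance trade-off noted above.

    import numpy as np

    def spatiotemporal_features(video, r=2, tr=1):
        """Concatenate (2r+1)x(2r+1) patches over frames t-tr..t+tr per pixel."""
        T, H, W = video.shape
        pad = np.pad(video.astype(np.float64),
                     ((tr, tr), (r, r), (r, r)), mode="edge")
        patches = [pad[dt:dt + T, dy:dy + H, dx:dx + W]
                   for dt in range(2 * tr + 1)
                   for dy in range(2 * r + 1)
                   for dx in range(2 * r + 1)]
        # feature length (2*tr+1)*(2*r+1)^2 grows linearly with the temporal
        # window, which is why a small temporal kernel is cheaper to match
        return np.stack(patches, axis=-1)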


Painting

This is an example of an alternative use for our algorithm. In this case, the processing is non-obvious; the goal is to make the video look like a painting. For training, we anisotropically blur a Van Gogh painting and present the blurred version as the training input and the original painting as the training output. Our algorithm then generates a non-photorealistically rendered version of our test input video.
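The training pair construction described above might look like the following sketch. The exact anisotropic blur used is not specified on this page; an axis-aligned anisotropic Gaussian (a different sigma per image axis) stands in for it here.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def make_painting_pair(painting, sigma_y=1.0, sigma_x=4.0):
        """Blurred painting -> training input; original -> training output."""
        blurred = np.stack(
            [gaussian_filter(painting[..., c].astype(np.float64),
                             sigma=(sigma_y, sigma_x))
             for c in range(painting.shape[-1])], axis=-1)
        return np.clip(blurred, 0, 255).astype(np.uint8), painting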


Pulp Fiction, copyright 1994, Miramax Film Corp.
[Videos: training input, training output, test sequence, algorithm output]

[Four additional examples, each with videos: training input, training output, test sequence, nearest neighbor output, our algorithm's output]



Publications

"Learning video processing by example",
A. Haro and I. Essa
Proceedings 16th International Conference on Pattern Recognition, Quebec, Canada, August 2002.

