The technique of mosaicing involves piecing together smaller components to generate one large, seamless unit. In the case of image and video mosaicing, the image frames are the small components, and the result of piecing them together is a panoramic view. The motivation is that individual pictures, or frames in a video sequence, have a very limited field of view, whereas human vision covers a much larger one. Hence, it seems natural to paste together a series of limited-view images to create one image with a field of view more similar to our own. In practice, however, this is not easily done because each image undergoes a perspective transform due to the rendering or image capturing process.
Once mosaicing has been achieved, it is possible to experiment with dynamic objects in the input image
frames. In this project, we will attempt to remove dynamic objects in a video sequence while still maintaining
a visually correct mosaic.
Mosaicing can be performed on images or on a video sequence. This project will concentrate on video mosaicing. The primary difference between image and video mosaicing is that the frames in a video sequence retain a great deal of coherence while images may overlap only at the edges. Hence, a less rigorous and less computationally intensive approach may be used to align video frames. Such an approach would permit the possibility of real-time mosaicing of a video sequence.
As mentioned above, a second goal of this project is to remove dynamic objects from a video sequence
as the mosaic of the video stream is generated. The test data we will use is a video sequence taken of the
exterior of the College of Computing from one fixed view point while rotating the camera. Our initial video
will not include any moving/dynamic objects in the scene. Our final test data will be the same scene with a
single person or car moving in the scene.
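This project is based upon the techniques outlined in the paper, Panoramic Mosaics by Manifold
Projection [Peleg, Herman 97]. The process involves image alignment, cutting and pasting of image
strips, and blending of seams, each of which is described below.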
Image alignment is accomplished in two passes. First, translation-only alignment is obtained using
image correlation. Second, more accurate alignment involving both image translation and rotation is
obtained by solving for motion parameters. Alignment can be improved using multi-resolution techniques
in which coarse correlation is used to guide fine correlation of images. Alignment is also improved by
processing images on a per-column basis. The columns of an image may be placed independently of each
other, which allows finer accuracy in alignment. Using only the first pass may allow real-time mosaicing of
video sequences.
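The first, translation-only pass can be illustrated with a small sketch. It assumes grayscale frames of equal size held as NumPy arrays and computes the correlation with an FFT, which treats the images as wrapping around at the borders; the function name estimate_translation is hypothetical, and the paper's own correlation scheme may differ. Real frames would likely need windowing or padding to reduce the wrap-around effect.

```python
import numpy as np

def estimate_translation(ref, cur):
    """Estimate the integer shift (dy, dx) that maps ref onto cur.

    Assumption: the shift is a pure translation and the images wrap
    around at the borders (circular correlation computed via the FFT).
    """
    # Remove the mean so the correlation peak is not dominated by brightness.
    ref = ref.astype(float) - ref.mean()
    cur = cur.astype(float) - cur.mean()

    # Cross-correlation theorem: corr[k] = sum_n ref[n] * cur[n + k].
    corr = np.fft.ifft2(np.conj(np.fft.fft2(ref)) * np.fft.fft2(cur)).real

    # The location of the correlation peak is the shift, modulo the image size.
    peak_y, peak_x = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape

    # Map shifts larger than half the image size to negative shifts.
    dy = peak_y if peak_y <= h // 2 else peak_y - h
    dx = peak_x if peak_x <= w // 2 else peak_x - w
    return int(dy), int(dx)
```

A multi-resolution variant would run the same estimate on downsampled copies of the frames first and then refine the result at full resolution within a small search window.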
In order to piece together consecutive images in a sequence, each image is cut into column strips.
These strips may then be pasted independently of one another according to the alignment described above.
The pasting of image sequences is accomplished by overlapping aligned images such that the middle of any
image is given the greatest weight or influence, while the edges contribute less. This is done because
distortion is minimal at the center of images. There are two methods of pasting consecutive images. Either
overlapping pixels can be averaged with more weight given to those pixels closest to the image center, or the
pixel that is closest to its image's center is selected over all the overlapping pixels. In the latter case, blurring
and deterioration of image quality are minimized, but the seams in the resulting image may be more
apparent.
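The two pasting methods can be compared in a minimal sketch. It assumes grayscale frames of equal size that have already been aligned to integer horizontal offsets in the mosaic, and a weight that falls off linearly from the center column to the edges; the function name paste_frames and the linear weight are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def paste_frames(frames, offsets, mode="average"):
    """Paste aligned frames into a mosaic at integer column offsets.

    mode="average": weighted average of overlapping pixels.
    mode="nearest": keep the pixel closest to its own frame's center.
    """
    h, w = frames[0].shape
    width = max(offsets) + w

    # Per-column weight: largest at the frame center, smallest at the edges.
    col_weight = 1.0 - np.abs(np.arange(w) - (w - 1) / 2) / (w / 2)

    acc = np.zeros((h, width))   # weighted sum, or the selected pixels
    wsum = np.zeros(width)       # total weight per mosaic column
    best = np.zeros(width)       # best weight seen so far per column

    for frame, x0 in zip(frames, offsets):
        cols = slice(x0, x0 + w)
        if mode == "average":
            acc[:, cols] += frame * col_weight
            wsum[cols] += col_weight
        else:
            # Overwrite only the columns where this frame is closer to its center.
            win = col_weight > best[cols]
            region = acc[:, cols]            # view into the mosaic
            region[:, win] = frame[:, win]
            best[cols] = np.maximum(best[cols], col_weight)

    if mode == "average":
        # Columns covered by no frame keep the value 0.
        return acc / np.maximum(wsum, 1e-12)
    return acc
```

With two constant frames of values 1 and 3 that overlap by two columns, the averaging mode produces a gradual ramp across the overlap, while the nearest-center mode produces a hard step, which is where a visible seam would appear.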
Seams are often the result of differences in color and contrast between images. To reduce them, a multi-resolution spline technique can be used, which decomposes the images into band-pass pyramid levels and composites them level by level.
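The idea of compositing at each band-pass level can be sketched as follows. This is a simplified illustration only: it assumes two aligned grayscale images whose dimensions are divisible by 2 to the power of the number of levels, and it replaces the usual Gaussian filter with 2x2 block averaging and nearest-neighbor upsampling to keep the code short. All function names are hypothetical.

```python
import numpy as np

def down(img):
    """Halve the resolution by averaging 2x2 blocks (stand-in for a Gaussian filter)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(img):
    """Double the resolution by repeating each pixel."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def lowpass_pyramid(img, levels):
    """Successively coarser copies of img, finest first."""
    pyr = [img]
    for _ in range(levels):
        img = down(img)
        pyr.append(img)
    return pyr

def bandpass_pyramid(img, levels):
    """Band-pass levels (detail lost at each reduction) plus the coarsest residual."""
    pyr = []
    for _ in range(levels):
        small = down(img)
        pyr.append(img - up(small))
        img = small
    pyr.append(img)
    return pyr

def blend(a, b, mask, levels=3):
    """Blend a (where mask is 1) with b (where mask is 0), one band at a time."""
    la = bandpass_pyramid(a.astype(float), levels)
    lb = bandpass_pyramid(b.astype(float), levels)
    gm = lowpass_pyramid(mask.astype(float), levels)

    # Composite every band with the mask at the matching resolution.
    bands = [m * x + (1.0 - m) * y for x, y, m in zip(la, lb, gm)]

    # Collapse the pyramid from coarse to fine.
    out = bands[-1]
    for band in reversed(bands[:-1]):
        out = up(out) + band
    return out
```

Because the mask is smoothed more at coarser levels, low-frequency differences such as overall brightness are blended over a wide region, while fine detail is blended over a narrow one.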
As mentioned in the project domain, the final test data is a scene with one moving person or car. The
dynamic object in each frame will be removed prior to alignment and pasting. Correlation between
consecutive images should still be possible because all pixels associated with the dynamic object are replaced
by a constant color, such as black or white. Identification of dynamic pixels to be removed is accomplished
using flow field vector calculation (code supplied by CPL).
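The replacement step can be sketched as follows. The flow field is assumed to be given as an array of shape (height, width, 2), as produced by some external flow routine; the supplied flow code itself is not reproduced here. The rule used to flag dynamic pixels, namely comparing each flow vector with the median flow of the frame (taken to represent the camera motion), is an illustrative assumption, as are the function name and the threshold value.

```python
import numpy as np

def mask_dynamic_pixels(frame, flow, threshold=1.0, fill=0):
    """Replace pixels whose motion differs from the dominant motion by a constant.

    frame: (H, W) grayscale or (H, W, 3) color image.
    flow:  (H, W, 2) per-pixel flow vectors (assumed input).
    Returns the cleaned frame and the boolean mask of dynamic pixels.
    """
    # Assumption: most pixels belong to the static scene, so the median
    # flow vector approximates the motion caused by the camera alone.
    background = np.median(flow.reshape(-1, 2), axis=0)

    # Pixels whose flow deviates strongly from the background are dynamic.
    deviation = np.linalg.norm(flow - background, axis=2)
    dynamic = deviation > threshold

    cleaned = frame.copy()
    cleaned[dynamic] = fill   # constant color, e.g. black
    return cleaned, dynamic
```

The returned mask could also be used to exclude the filled pixels from the correlation, so that the constant region does not bias the alignment.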