The present invention relates to a system for constructing mosaic images from a sequence of frames encoded using global motion parameters.
When a video camera is moved angularly while recording a sequence of frames, each frame shows a slightly different angular “slice” of the complete scene. By aligning the image in each frame with the images in its neighboring frames, a panoramic mosaic image may be compiled that shows a greater angular view than any individual frame alone. This technique has also been widely used in still photography to compose a photographic mosaic image where the camera's angle of view was not wide enough to capture the entire scene in one photograph.
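The compilation step described above can be illustrated with a minimal sketch. It assumes that the alignment has already been found and that it reduces to a known integer horizontal offset per frame; the function name `build_mosaic` and the synthetic `scene` are hypothetical and are used only for illustration.

```python
def build_mosaic(frames, offsets):
    """Paste equally tall frames into one wide image.

    frames  -- list of 2D lists (rows of pixel values)
    offsets -- assumed-known column of each frame's left edge in the mosaic
    """
    height = len(frames[0])
    width = max(off + len(frame[0]) for frame, off in zip(frames, offsets))
    mosaic = [[None] * width for _ in range(height)]
    for frame, off in zip(frames, offsets):
        for y in range(height):
            for x, value in enumerate(frame[y]):
                # Keep the first value written; overlapping frames agree
                # in this idealized example, so the choice does not matter.
                if mosaic[y][off + x] is None:
                    mosaic[y][off + x] = value
    return mosaic


# Synthetic 4x10 scene, viewed through three overlapping 6-pixel-wide frames.
scene = [[10 * y + x for x in range(10)] for y in range(4)]
offsets = [0, 2, 4]
frames = [[row[off:off + 6] for row in scene] for off in offsets]

mosaic = build_mosaic(frames, offsets)
```

In this idealized case the mosaic reproduces the full scene, although no single frame covers it. Real frames would additionally require estimating the offsets and blending the overlapping regions.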
In an MPEG-2 system, frames of data are transmitted as a plurality of 16×16 pixel macroblocks. Some macroblocks have an associated motion vector. If the contents of a particular macroblock can be matched with a corresponding 16×16 pixel array in the previous or next video frame, then the contents of the macroblock are transmitted efficiently as a difference signal and one or more displacement vectors. The purpose of the displacement vector is to identify the location of the matching pixel array in the previous or next video frame. If more than one vector is used, then each displacement vector specifies the displacement of one of the 8×8 blocks in the macroblock. The purpose of the difference signal is to convey the sample value residuals between pixel values in the macroblock and pixel values in the corresponding 16×16 pixel array. Residual signals are typically small because displacement vectors align video frame content in time, thereby reducing the amount of data that must be transmitted to represent each frame.
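The relationship between a macroblock, its displacement vector, and its difference signal can be sketched as follows. This is an illustrative simplification, not an MPEG-2 implementation: it assumes a single integer-pixel displacement vector per macroblock and omits transform coding and quantization of the residual. All function names are hypothetical.

```python
MB = 16  # macroblock size in pixels


def predict_macroblock(reference, top, left, dy, dx, size=MB):
    """Fetch the size x size array of `reference` displaced by (dy, dx)."""
    return [
        [reference[top + dy + r][left + dx + c] for c in range(size)]
        for r in range(size)
    ]


def encode_macroblock(current, reference, top, left, dy, dx, size=MB):
    """Difference signal: current macroblock minus its displaced prediction."""
    pred = predict_macroblock(reference, top, left, dy, dx, size)
    return [
        [current[top + r][left + c] - pred[r][c] for c in range(size)]
        for r in range(size)
    ]


def decode_macroblock(residual, reference, top, left, dy, dx, size=MB):
    """Rebuild the macroblock from the prediction plus the residual."""
    pred = predict_macroblock(reference, top, left, dy, dx, size)
    return [[pred[r][c] + residual[r][c] for c in range(size)] for r in range(size)]


def make_frame(height, width, shift=0):
    """Synthetic test pattern; `shift` moves the content to the right."""
    return [[((x - shift) * 7 + y * 13) % 256 for x in range(width)]
            for y in range(height)]


reference = make_frame(32, 48)
current = make_frame(32, 48, shift=3)   # content moved 3 pixels to the right

# With the correct vector (dx = -3) the residual is all zeros here.
residual = encode_macroblock(current, reference, top=8, left=16, dy=0, dx=-3)
max_residual = max(abs(v) for row in residual for v in row)

# With a zero vector the residual is large for every sample.
unaligned = encode_macroblock(current, reference, top=8, left=16, dy=0, dx=0)
min_unaligned = min(abs(v) for row in unaligned for v in row)

decoded = decode_macroblock(residual, reference, top=8, left=16, dy=0, dx=-3)
```

The comparison between `max_residual` and `min_unaligned` shows why aligning content with a displacement vector reduces the data to be transmitted.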
Burt et al., U.S. Pat. No. 5,488,674, disclose a system for fusing images into a mosaic image based on hierarchical spatial decomposition of each image. The decomposition is used to identify salient features in each image. The composing mechanism uses the most salient features to build the mosaic. The technique described by Burt et al. does not address the situation where image fusion is performed in a digital video encoding/decoding environment and over a digital communication channel. In particular, the image matching technique does not make use of the motion vectors or global motion parameters which are transmitted by an MPEG-2 or MPEG-4 encoder, respectively.
Burt et al., U.S. Pat. No. 5,649,032, describe a system for building a mosaic within a video encoding and decoding system from a series of images which are automatically warped. The image merging operations for the mosaic are pixel-based and are performed at various scales (from low resolution to original resolution). Burt et al. also disclose several techniques for aligning, selecting, and combining images. The mosaic is used to provide a prediction signal such that only the difference between the current image content and the most recent mosaic is transmitted. A residual analysis is performed at the end of each merging process to identify candidate signals to transmit. The reconstructed mosaics are an integrated part of the encoding and decoding process. Hence, the mosaic reconstruction process impacts the computational and memory requirements of both the encoder and decoder.
In MPEG-4, frames of pixel data are divided into data objects. The different data objects may be encoded and transmitted separately to the decoder. The decoder receives each of the encoded data objects and reconstructs each frame of the video. One of the data objects may be the background, which is relatively stationary in relation to the other objects moving in the foreground. To reduce the bandwidth required for transmission of signals between an MPEG-4 video encoder and an MPEG-4 video decoder, a global motion compensated encoding mode may be triggered in the encoder. The purpose of global motion compensation is to describe the relative global transformation of an object or of frame content over time. When an MPEG-4 encoder enables the global motion compensated mode, it estimates global motion parameters between two consecutive video frames, video fields, or video objects. The global motion parameters are subsequently used to predict the content of macroblocks after they have been warped (transformed) according to the estimated global motion parameters. In addition, the set of global motion parameters is transmitted to the MPEG-4 video decoder. The benefits of global motion compensated coding are twofold. First, it alleviates the need to transmit displacement vectors for each macroblock. Second, it can produce smaller residuals, because global motion parameters describe the motion of video content more faithfully than local displacement vectors, especially for video objects that undergo motion due to relative camera motion or zoom.
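The prediction of macroblocks from a single set of global motion parameters can be sketched as follows. The sketch assumes, for illustration only, a six-parameter affine model with nearest-neighbor sampling; the parameter layout `(a, b, c, d, e, f)` and the function names are hypothetical and are not taken from the MPEG-4 syntax.

```python
def warp_point(params, x, y):
    """Map a current-frame pixel (x, y) into the reference frame.

    Assumed affine model: x' = a*x + b*y + c,  y' = d*x + e*y + f.
    """
    a, b, c, d, e, f = params
    return a * x + b * y + c, d * x + e * y + f


def predict_block_global(reference, params, top, left, size=16):
    """Predict one macroblock by warping its pixel positions globally."""
    height, width = len(reference), len(reference[0])
    block = []
    for r in range(size):
        row = []
        for col in range(size):
            xs, ys = warp_point(params, left + col, top + r)
            # Nearest-neighbor sampling, clamped to the frame borders.
            xi = min(max(int(round(xs)), 0), width - 1)
            yi = min(max(int(round(ys)), 0), height - 1)
            row.append(reference[yi][xi])
        block.append(row)
    return block


def make_frame(height, width, shift=0):
    """Synthetic test pattern; `shift` moves the content to the right."""
    return [[((x - shift) * 7 + y * 13) % 256 for x in range(width)]
            for y in range(height)]


reference = make_frame(32, 48)
current = make_frame(32, 48, shift=3)      # simulated camera pan

pan = (1, 0, -3, 0, 1, 0)                  # one parameter set for the frame

# The same parameters predict every macroblock; no per-block vectors needed.
worst = 0
for top in (0, 16):
    for left in (16, 32):
        pred = predict_block_global(reference, pan, top, left)
        for r in range(16):
            for col in range(16):
                worst = max(worst, abs(current[top + r][left + col] - pred[r][col]))
```

Here `worst` holds the largest residual magnitude over the four predicted macroblocks. A zoom would be expressed in the same model by scale terms, for example `(2, 0, 0, 0, 2, 0)`, which a single translational vector per macroblock cannot represent exactly.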
What is desired, therefore, is a mosaic construction system that does not significantly increase the memory and computational requirements of a video encoding/decoding system.