The present invention relates generally to encoding frames of motion compensated, differential video data, and particularly to a method of encoding so-called "residue frames" of MPEG video data by representing only the highest energy portions of the residue frames as the sum of a set of predefined two-dimensional waveforms.
Referring to FIG. 1, a preferred embodiment of the present invention operates in the context of a video encoding and decoding system. In a typical video system 100 a video camera or other video source 113 outputs a series of video frames that are initially processed and encoded by a video encoder 132. The video encoder may be an MPEG, MPEG2, MPEG4 or similar encoder, but could also be any other type of video encoder that generates motion compensated residue or delta frames representing the differences between certain frames and earlier frames. In the preferred embodiment, the output of the video encoder 132 includes so-called I frames and "residue frames." In simple terms, each I frame contains "primary data" representing an entire new picture, called a frame, while each residue frame contains differential data representing the differences between a previous video frame and a subsequent video frame.
The I frames and residue frames are compressed using various techniques to produce compressed data 124 (sometimes herein called encoded data). The present invention concerns only a data compressor 134 used to compress "residue frames" and other sparse sets of data that contain "islands" of non-zero values. The resulting encoded data 124 is stored, or transmitted to another computer, or both, using appropriate storage and transmission mechanisms 106, 112.
A video decoder 140 and data decompressor 135 convert the compressed video data 124 back into "reference pictures," which represent reconstructed frames of video data. The reconstructed video data frames are the same frames as those that a video decoder would generate while processing the compressed video data for viewing. The video encoder 132 compares a current video data frame with a motion compensated version of the most recent reference picture to generate a residue frame.
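The relationship between a reference picture, a current frame, and the resulting residue frame can be illustrated with a minimal sketch. It assumes a single global motion vector applied with a circular shift, which is far simpler than the block-based motion compensation of an MPEG encoder; the function name `residue_frame` and the test data are hypothetical and serve only to show why a residue frame is mostly zero with small "islands" of non-zero values.

```python
import numpy as np

def residue_frame(current, reference, motion_vector):
    """Difference between the current frame and a motion compensated reference.

    Assumption: motion is one global (rows, cols) shift applied with wrap-around.
    """
    predicted = np.roll(reference, shift=motion_vector, axis=(0, 1))
    # Use a signed type so that negative differences are preserved.
    return current.astype(np.int16) - predicted.astype(np.int16)

rng = np.random.default_rng(0)
reference = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)

# The current frame is the reference moved by (1, 2), plus one small changed region.
current = np.roll(reference, shift=(1, 2), axis=(0, 1)).copy()
current[10:14, 20:24] = 255 - current[10:14, 20:24]

residue = residue_frame(current, reference, (1, 2))
nonzero_count = int(np.count_nonzero(residue))
```

In this toy case only the 4 by 4 changed region survives in the residue; everything that motion compensation predicts correctly cancels to zero.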
To reconstruct video images from encoded data, a data decompressor 135 is used to reverse the encoding process performed by the data compressor 134. Once again, in this document we are concerned only with the part of the system or process dealing with residue frames or sparse data frames. The resulting decoded data is then processed by a video decoder (e.g., an MPEG or similar decoder) 140, which reconstructs a set of video frames suitable for viewing on a video monitor 115, or for storage in uncompressed form in a data storage device 106.
As can be seen from the above discussion, the data compressor and decompressor of the present invention supplement the operation of motion compensating video encoders and decoders, enabling further compression of the video data. This reduces the bandwidth needed to transmit video images and the storage required to store such video images.
FIG. 3 is a highly schematic representation of a residue frame. The residue frame is filled primarily with data having very small values. A relatively small portion of the data in the residue frame has significant energy. The high energy portions of the residue frame are represented in FIG. 3 by concentric circular and oval regions, each line representing data values of equal energy. The regions of the residue frame between the circular and oval regions represent low energy data values.
One goal of the present invention is to match the shapes of the peaks in the residue frame with predefined two dimensional waveforms so that each such peak can be represented as the sum (i.e., superposition) of a small number of predefined waveforms, each multiplied by a magnitude value to indicate the best scaling of the predefined waveforms to match the data in the residue frame. Generally, the process of finding the best matches between a set of two dimensional data and a set of predefined waveforms requires computing the inner product of the data with each of the predefined waveforms at all possible positions within the data. Computing such inner products is typically computationally intensive because it requires integrating the product of the data and each predefined waveform at every possible position of the waveform within the data.
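The cost of the general approach can be seen in a brute-force sketch that evaluates the inner product of a waveform with the data at every position. The function name `inner_products_brute_force`, the sampled waveform, and the test array are assumptions chosen for illustration; positions are reported as the top-left corner of the waveform's footprint.

```python
import numpy as np

def inner_products_brute_force(data, waveform):
    """Inner product of `waveform` with `data` at every position where it fits."""
    rows, cols = data.shape
    h, w = waveform.shape
    out = np.zeros((rows - h + 1, cols - w + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            # One multiply-accumulate per waveform sample, per position.
            out[r, c] = np.sum(data[r:r + h, c:c + w] * waveform)
    return out

# Hypothetical 3x3 waveform: a sampled two dimensional "hat" shape.
waveform = np.outer([0.5, 1.0, 0.5], [0.5, 1.0, 0.5])

# Sparse test data: one scaled copy of the waveform placed at row 3, column 4.
data = np.zeros((10, 12))
data[3:6, 4:7] = 2.0 * waveform

products = inner_products_brute_force(data, waveform)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(products), products.shape))
```

The work grows with the number of positions times the number of waveform samples, and it is repeated for every waveform in the set, which is the cost the FIR filtering approach aims to reduce.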
Another goal of the present invention is to provide a set of spatially truncated two dimensional waveforms that have the property that determining the best match between the waveforms and a set of data can be accomplished very efficiently through the successive application of a set of FIR (finite impulse response) filters. In particular, it is a goal of the present invention to reduce the computations required to find a best match between residue frame data and the predefined waveforms by using waveforms whose inner product with a set of video data can be generated through the application of FIR filters. Furthermore, it is a goal of the present invention to use waveforms that have been defined so that match values for a second waveform can be obtained by applying a predefined FIR filter to the match values for a first one of the waveforms.
In summary, the present invention is a system and method for encoding a two dimensional array of data. The data to be encoded may be a residue frame generated by a motion compensated video encoder.
The data encoding method utilizes a library having entries corresponding to a set of predefined two dimensional adaptive spline wavelet waveforms. Each predefined two dimensional adaptive spline wavelet waveform is formed by the superposition of one or more B-splines.
The data encoding method identifies a set of best matches between the array of data and the predefined two dimensional adaptive spline wavelet waveforms by generating the inner product of the array of data and each of the predefined two dimensional adaptive spline wavelet waveforms. Each inner product is generated by FIR filtering the data with a corresponding set of FIR filter coefficients; the method then determines which of the inner products have the largest values. Once a set of best matches has been found, the data encoding method generates data representing the identified set of best matches. The generated data indicates for each match: one of the library entries, a position within the array of data at which the match was found, and a magnitude of the match.
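The matching step described above can be sketched as follows, under several assumptions that are not taken from the specification: each library entry is a separable waveform given as a pair of one dimensional FIR kernels, candidates are ranked by the inner product normalized by the waveform's norm, and the magnitude is the least-squares scale factor. The names `separable_inner_products` and `best_match` are hypothetical.

```python
import numpy as np

def separable_inner_products(data, col_kernel, row_kernel):
    """Inner product of data with outer(col_kernel, row_kernel) at every valid position,
    computed as two passes of one dimensional FIR filtering."""
    along_rows = np.apply_along_axis(
        lambda v: np.correlate(v, row_kernel, mode="valid"), 1, data)
    return np.apply_along_axis(
        lambda v: np.correlate(v, col_kernel, mode="valid"), 0, along_rows)

def best_match(data, library):
    """Return (library index, top-left position, magnitude) of the strongest match."""
    best = None
    for index, (col_kernel, row_kernel) in enumerate(library):
        ip = separable_inner_products(data, col_kernel, row_kernel)
        r, c = np.unravel_index(np.argmax(np.abs(ip)), ip.shape)
        energy = np.sum(col_kernel ** 2) * np.sum(row_kernel ** 2)
        score = abs(ip[r, c]) / np.sqrt(energy)  # normalized so entries are comparable
        if best is None or score > best[0]:
            best = (score, index, (int(r), int(c)), float(ip[r, c] / energy))
    return best[1], best[2], best[3]

# Hypothetical two-entry library of sampled hat-shaped kernels.
hat3 = np.array([0.5, 1.0, 0.5])
hat5 = np.array([1.0, 2.0, 3.0, 2.0, 1.0]) / 3.0
library = [(hat3, hat3), (hat5, hat5)]

# Sparse test data: the second waveform, scaled by 3, with its corner at (4, 6).
data = np.zeros((16, 16))
data[4:9, 6:11] = 3.0 * np.outer(hat5, hat5)

entry, position, magnitude = best_match(data, library)
```

A complete encoder would typically subtract the matched waveform from the data and repeat the search to collect a set of matches; that loop is omitted here to keep the sketch short.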
The data encoding method is computationally efficient because inner products are computed by FIR filtering. Further, the inner product between the array of data and some of the predefined two dimensional adaptive spline wavelet waveforms is generated by FIR filtering another one of the inner products using FIR filter coefficients specified by the library.
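The reuse of one inner product to obtain another follows from the linearity of filtering: if a waveform is a weighted sum of shifted copies of a basic B-spline, its inner products with the data equal the basic B-spline's inner products filtered by those same weights. The one dimensional sketch below checks this numerically; the coefficient values and variable names are hypothetical and are not the library contents of the specification.

```python
import numpy as np

# Sampled linear B-spline (hat function) used as the basic building block.
b_spline = np.array([0.5, 1.0, 0.5])

# Hypothetical weights: one copy at shift 0 and a negated copy at shift 2.
weights = np.array([1.0, 0.0, -1.0])

# The derived waveform is the superposition of the weighted, shifted B-splines.
derived = np.convolve(weights, b_spline)

rng = np.random.default_rng(1)
signal = rng.standard_normal(50)

# Inner products with the basic B-spline at every valid position.
basic_ip = np.correlate(signal, b_spline, mode="valid")

# Inner products with the derived waveform, computed two ways.
derived_ip_direct = np.correlate(signal, derived, mode="valid")
derived_ip_reused = np.correlate(basic_ip, weights, mode="valid")
```

Here the reused path needs only a two-tap difference per output sample instead of a full five-sample inner product, and the saving grows when several derived waveforms share the same basic inner product.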
In addition, the inner product between the array of data and a first one of the predefined two dimensional adaptive spline wavelet waveforms having a low resolution level is generated by FIR filtering an earlier generated inner product of the array of data and a second one of the predefined two dimensional adaptive spline wavelet waveforms having a higher resolution level, using a predefined set of resolution modifying FIR filter coefficients. This feature of the invention takes advantage of the multiresolution properties of B-splines and enables the use of short FIR filters to efficiently compute the inner product between an array of data and low resolution waveforms.
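The multiresolution property can be illustrated in one dimension with the linear B-spline, whose two-scale relation is B(x/2) = (1/2)B(x+1) + B(x) + (1/2)B(x-1). The sketch below assumes sampled hat functions of half-width 2 (higher resolution) and 4 (lower resolution); the helper `hat` and the filter taps are illustrative choices, not the resolution modifying coefficients of the specification.

```python
import numpy as np

def hat(half_width):
    """Sampled linear B-spline with its nonzero samples at offsets -half_width+1 .. half_width-1."""
    n = np.arange(-half_width + 1, half_width)
    return 1.0 - np.abs(n) / half_width

fine = hat(2)      # three samples, higher resolution
coarse = hat(4)    # seven samples, lower resolution

# Two-scale relation on the sample grid: taps 1/2, 1, 1/2 spaced two samples apart.
two_scale = np.array([0.5, 0.0, 1.0, 0.0, 0.5])

rng = np.random.default_rng(0)
signal = rng.standard_normal(64)

# Inner products with the fine waveform, computed once.
fine_ip = np.correlate(signal, fine, mode="valid")

# Coarse inner products: direct seven-tap filtering versus filtering the fine result.
coarse_ip_direct = np.correlate(signal, coarse, mode="valid")
coarse_ip_filtered = np.correlate(fine_ip, two_scale, mode="valid")
```

The resolution-changing filter has only three nonzero taps regardless of how wide the low resolution waveform is, which is why the lower resolution inner products can be obtained with short filters.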