Not Applicable.
1. Field of the Invention
The invention relates to the field of motion video compression and decompression.
2. Background Information
Motion video data usually consists of a sequence of frames that, when displayed at a particular frame rate, will appear as “real-time” motion to a human eye. A frame of motion video comprises a number of frame elements referred to as pixels (e.g., a 640×480 frame comprises over 300,000 pixels). Each pixel is represented by a binary pattern that describes that pixel's characteristics (e.g., color, brightness, etc.). Given the number of pixels in a typical frame, storing and/or transmitting uncompressed motion video data requires a relatively large amount of computer storage space and/or bandwidth. Additionally, in several motion video applications, processing and displaying a sequence of frames must be performed fast enough to provide real-time motion (typically, between 15 and 30 frames per second).
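By way of illustration only, the storage and bandwidth figures implied above can be estimated with a short calculation. The sketch below assumes 24 bits per pixel and 30 frames per second; neither value is stated for any particular system described herein, and the function name `uncompressed_rate` is hypothetical.

```python
def uncompressed_rate(width, height, bits_per_pixel, frames_per_second):
    """Return (pixels per frame, bytes per frame, bytes per second)."""
    pixels = width * height
    bytes_per_frame = pixels * bits_per_pixel // 8
    return pixels, bytes_per_frame, bytes_per_frame * frames_per_second


# Assumed values: 24-bit pixels at 30 frames per second.
pixels, frame_bytes, rate = uncompressed_rate(640, 480, 24, 30)
print(pixels)       # 307200
print(frame_bytes)  # 921600
print(rate)         # 27648000
```

Under these assumptions a single uncompressed 640×480 frame occupies roughly 0.9 megabytes, and one second of video roughly 27.6 megabytes.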
Techniques have been developed to compress the amount of data required to represent motion video, making it possible for more computing systems to process motion video data. Typical compression techniques compress motion video data based on either: individual pixels (referred to as pixel compression); blocks or regions of pixels in a frame (referred to as block compression); individual frames; or some combination of these techniques.
Pixel compression techniques tend to be easier to implement and provide relatively high quality (see U.S. patent application Ser. No. 08/866,193, filed May 30, 1997). However, pixel compression techniques suffer from lower compression ratios (i.e., higher encoding bit rates) because they consider, encode, transmit, and/or store individual pixels.
In contrast to pixel compression, block compression systems operate by dividing each frame into blocks or regions of pixels. Block compression is typically based on a discrete Fourier transform (DFT) or a discrete cosine transform (DCT). In particular, each region of pixels in the first frame in a sequence of frames is DFT or DCT encoded. Once encoded, the first frame becomes the “base frame.” To achieve a higher degree of compression, block compression systems attempt to compress the next new frame (i.e., the second frame) and all subsequent frames in terms of previously DFT/DCT encoded regions where possible (referred to as interframe encoding). Thus, the primary aim of interframe compression is to eliminate the repetitive DFT/DCT encoding and decoding of substantially unchanged regions of pixels between successive frames in a sequence of motion video frames.
To perform interframe compression on a new frame, the pixels in each region of the new frame are compared to the corresponding pixels (i.e., at the same spatial location) in the base frame to determine the degree of similarity between the two regions. If the degree of similarity between the two regions is high enough, the region in the new frame is classified as “static.” A static region is encoded by storing the relatively small amount of data required to indicate that the region should be drawn based on the previously encoded corresponding region of the base frame. In addition to classifying regions as “static,” interframe compression techniques typically also perform motion estimation and compensation. The principle behind motion estimation and compensation is that the best match for a region in a new frame may not be at the same spatial location in the base frame, but may be slightly shifted due to movement of the image(s)/object(s) portrayed in the frames of the motion video sequence. If a region in a new frame is found to be substantially the same as a region at a different spatial location in the base frame, only the relatively small amount of data required to store an indication (referred to as a motion compensation vector) of the change of location of the region in the new frame relative to the base frame is stored (U.S. patent application, Ser. No. 08/719,834, filed Sep. 30, 1996). By way of example, MPEG (a standard for block compression) performs a combination of: 1) intraframe compression on the first frame and on selected subsequent frames (e.g., every four frames); and 2) interframe compression on the remaining frames.
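The static classification and motion estimation described above can be sketched as follows. This is a minimal illustration only, not the method of any cited reference or standard: the similarity measure (sum of absolute differences), the exhaustive search, and the names `sad`, `classify_region`, `threshold`, and `search` are all assumptions introduced for the example.

```python
def sad(base, new, bx, by, nx, ny, size):
    """Sum of absolute differences between the size x size region of `new`
    at (nx, ny) and the size x size region of `base` at (bx, by)."""
    return sum(
        abs(new[ny + r][nx + c] - base[by + r][bx + c])
        for r in range(size)
        for c in range(size)
    )


def classify_region(base, new, x, y, size, threshold, search):
    """Classify the region of `new` at (x, y) against the base frame.

    Returns ("static", (0, 0)) when the co-located base region is similar
    enough, ("moved", (dx, dy)) when a shifted base region within `search`
    pixels is similar enough, and ("changed", None) otherwise.
    """
    if sad(base, new, x, y, x, y, size) <= threshold:
        return "static", (0, 0)
    height, width = len(base), len(base[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            bx, by = x + dx, y + dy
            # Skip candidate regions that fall outside the base frame.
            if 0 <= bx <= width - size and 0 <= by <= height - size:
                cost = sad(base, new, bx, by, x, y, size)
                if best is None or cost < best[0]:
                    best = (cost, (dx, dy))
    if best is not None and best[0] <= threshold:
        return "moved", best[1]
    return "changed", None


# A bright 2x2 object shifts one pixel to the right between frames.
base = [[9, 9, 0, 0], [9, 9, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
new = [[0, 9, 9, 0], [0, 9, 9, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(classify_region(base, new, 1, 0, 2, 0, 1))  # ('moved', (-1, 0))
print(classify_region(base, new, 2, 2, 2, 0, 1))  # ('static', (0, 0))
```

In this sketch the returned vector is the offset from the region's position in the new frame to the matching region in the base frame; only that vector, rather than the pixel data, would need to be stored for a "moved" region.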
In contrast to both pixel compression and block compression, frame compression systems operate on one entire frame at a time. Typical frame compression systems are based on decomposing a frame into its different components using a digital filter, and then encoding each component using the coding technique best suited to that component's characteristics. To provide an example, subband coding is a technique by which each frame is decomposed into a number of frequency subbands, which are then encoded using the coding technique best suited to each subband's characteristics. As another example, various references describe different frame compression systems that are based on using wavelets to decompose a frame into its constituent components (e.g., U.S. Pat. Nos. 5,661,822; 5,600,373).
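As a simple illustration of decomposing data into frequency subbands, the sketch below splits one row of pixel values into a low-frequency subband (pairwise averages) and a high-frequency subband (pairwise differences). The Haar-style filter is chosen here only because it is the simplest such decomposition; the cited references may use different filters, and the names `haar_split` and `haar_merge` are hypothetical.

```python
def haar_split(samples):
    """Split an even-length sequence into low- and high-frequency subbands."""
    if len(samples) % 2:
        raise ValueError("expected an even number of samples")
    pairs = range(0, len(samples), 2)
    low = [(samples[i] + samples[i + 1]) / 2 for i in pairs]
    high = [(samples[i] - samples[i + 1]) / 2 for i in pairs]
    return low, high


def haar_merge(low, high):
    """Reconstruct the original sequence from its two subbands."""
    out = []
    for l, h in zip(low, high):
        out.extend((l + h, l - h))
    return out


low, high = haar_split([10, 10, 12, 8])
print(low)                    # [10.0, 10.0]
print(high)                   # [0.0, 2.0]
print(haar_merge(low, high))  # [10.0, 10.0, 12.0, 8.0]
```

Smooth areas of a frame yield high-frequency values at or near zero, which is why each subband can benefit from a coding technique matched to its own characteristics.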
When using a digital filter in frame compression systems, a problem arises due to the lack of input data along the boundaries of a frame. For example, when a digital filter begins processing at the left boundary of a frame, some of the inputs required by the filter do not exist (i.e., some of the filter inputs are beyond the left boundary of the frame). Several techniques have been developed in an attempt to solve the problem of the digital filter input requirements extending beyond the boundaries of a frame. As an example, one technique uses zero for the nonexistent filter inputs. Another technique, called circular convolution, joins the spatially opposite boundaries of the image together (e.g., the digital filter is applied as if the left boundary of the frame were connected to the right boundary of the frame). In another system, a different wavelet is used at the boundaries (see U.S. Pat. No. 5,661,822). In still another system, the image data at the boundaries is mirrored as illustrated below in Table 1.
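Three of the boundary techniques named above (zero inputs, circular convolution, and mirroring) can be sketched for a single row of samples as follows. The function names and the `mode` strings are hypothetical, and the mirroring shown repeats the edge sample, which is only one possible mirroring pattern and is an assumption rather than a statement of the pattern any particular system uses.

```python
def sample(row, i, mode):
    """Return the filter input at index i, which may lie outside the row."""
    n = len(row)
    if 0 <= i < n:
        return row[i]
    if mode == "zero":
        return 0
    if mode == "circular":
        # Left and right boundaries are treated as joined together.
        return row[i % n]
    if mode == "mirror":
        # Reflect about the boundary, repeating the edge sample (assumed).
        period = 2 * n
        j = i % period
        return row[j] if j < n else row[period - 1 - j]
    raise ValueError(f"unknown mode: {mode}")


def extend(row, pad, mode):
    """Return the row with `pad` synthesized inputs added on each side."""
    return [sample(row, i, mode) for i in range(-pad, len(row) + pad)]


row = [1, 2, 3, 4]
print(extend(row, 2, "zero"))      # [0, 0, 1, 2, 3, 4, 0, 0]
print(extend(row, 2, "circular"))  # [3, 4, 1, 2, 3, 4, 1, 2]
print(extend(row, 2, "mirror"))    # [2, 1, 1, 2, 3, 4, 4, 3]
```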
A method and apparatus for performing multiple frame image (or super frame) compression and decompression of motion video data is described. According to one aspect of the invention, a plurality of sequential frames in a motion video sequence are collected and digitally filtered as a single image. At least some of the results of the digital filtering are then encoded to generate compressed data. According to another aspect of the invention, the plurality of sequential frames are filtered as if the boundary of each frame is adjacent to a boundary in the same spatial location of another of the plurality of sequential frames.
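One possible reading of the boundary adjacency described above is sketched below for a single row of pixels: alternate frames are reversed when the frames are joined, so that the right boundary of one frame abuts the right boundary of the next frame, and the left boundary of that frame abuts the left boundary of the frame after it. This arrangement, the horizontal-only treatment, and the name `super_frame_row` are assumptions made for illustration and are not asserted to be the arrangement used by the invention.

```python
def super_frame_row(frames, y):
    """Join row y of each frame into one row of a single multi-frame image.

    Odd-numbered frames are reversed so that each frame boundary is placed
    next to the boundary at the same spatial location in a neighboring frame.
    """
    row = []
    for k, frame in enumerate(frames):
        line = frame[y]
        row.extend(line if k % 2 == 0 else line[::-1])
    return row


# Three one-row frames; each frame is a list of rows of pixel values.
frames = [[[1, 2, 3]], [[4, 5, 6]], [[7, 8, 9]]]
print(super_frame_row(frames, 0))  # [1, 2, 3, 6, 5, 4, 7, 8, 9]
```

In this sketch a digital filter running past the right edge of the first frame (pixel 3) reads real data from the right edge of the second frame (pixel 6) instead of zeros or mirrored values, since successive frames of motion video tend to be similar at the same spatial location.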
According to another aspect of the invention, a computer system is described including a processor and a memory. The memory provides a buffer for processing a plurality of frames of a motion video sequence as a single image. In addition, a plurality of instructions are provided, which when executed by the processor, cause the processor to generate compressed data representing the motion video sequence by decomposing the single image and compressing at least some of the resulting wavelet coefficients. According to another aspect of the invention, the decomposition is performed such that at least one boundary of a frame in the single image is processed as if it were adjacent to the same boundary of another frame in the single image.