This invention relates generally to the expansion of compressed video. More specifically, embodiments of this invention relate to an apparatus, system, and method for reducing the video frame memory required for a compressed video decoder.
DESCRIPTION OF THE RELATED ART
Embodiments of the present invention also relate generally to an apparatus, system, and method for processing Motion Picture Experts Group (“MPEG”) data in a video decoder. The Motion Picture Experts Group is a committee of experts that was formed under the auspices of the International Organization for Standardization, or ISO, in 1988. The MPEG is an engineering working group that generates the standards for the digital compression of video and audio signals. The MPEG group has fostered several standards, e.g., MPEG 1, MPEG 2, MPEG 4 and MPEG 7, that have become recognized as international standards for digital compression of video and audio signals. The MPEG standards have been implemented within the high definition TV (“HDTV”) standard, which has begun broadcasting in the United States.
However, the conventional and currently prevailing system of television broadcasts in the United States is the National Television Standards Committee (“NTSC”) system. The NTSC system is an analog system that has been used for over 50 years, and specifies the protocol of video signals that are broadcast over the air to television receivers. The NTSC standard encompasses and defines various aspects of an analog video signal, including the bandwidth and frequency restrictions, as well as the signal levels utilized in standard definition television (“SDTV”) receivers.
In the early days of television, television receivers were relatively expensive. The cost of providing programming was also relatively expensive. As electronics progressed, early vacuum tubes were replaced by more capable and more dependable vacuum tubes. Then, transistors gradually replaced the vacuum tubes. As electronics progressed even further, the single transistor was replaced by an integrated circuit in which many transistors could be contained on one circuit. As a consequence, the cost of producing television receivers declined in relative terms. Thus, more and more television receivers became available until the present day, where the average family has more than one television receiver per household.
Along with the reduction in the cost of television receivers, the cost of producing programming for television receivers has also been reduced. For example, video cameras can now be held in one hand, and they can be purchased relatively inexpensively. Also, there is a great deal of programming available. Consequently, various techniques to facilitate all of this additional programming have been sought.
One of these techniques involves utilizing digital television. Digital television systems replace the traditional NTSC analog signals with a signal comprising digital bits of data. Video and audio signals that were heretofore only available as analog signals can be now encoded digitally and broadcast. Digital broadcasting is utilized to improve quality, and to increase the capacity of existing channels.
However, the digital encoding of television signals, by itself, does not conserve television bandwidth. For example, if a conventional SDTV analog signal is instead encoded digitally, the amount of frequency spectrum required to broadcast the resulting digital signal will ordinarily be in excess of a 6 Megahertz (“MHz”) bandwidth. This excess bandwidth is a problem, because the conventional NTSC analog SDTV signal is limited to a 6 MHz bandwidth. However, an advantage of digital signals is that sophisticated compression techniques can be utilized that were not conventionally amenable to analog signals. Thus, a digital signal may be compressed so as to fit within the preferred 6 MHz bandwidth.
Further, video signals often contain a great deal of redundant information. By encoding, or compressing, the video information so that the redundant information is eliminated prior to transmission, a great deal of bandwidth can be saved. Also, various compression techniques can be used to compress a digital video signal to a fraction of its original size.
Specifically, each frame of a digital picture is composed of pixels. The pixels derive their name from the fact that they are picture elements. Repeated pixels are one source of redundancy that can be removed by digital compression techniques.
For example, in a video picture, a blue sky scene may occupy a large part of the frame. The blue sky may contain a single pixel value that is repeated hundreds or thousands of times. Instead of broadcasting each pixel, a run-length coding (“RLC”) technique can be employed. Run-length coding has many different implementations. However, each of these implementations basically replaces a string of “like” pixels with a single pixel, and then stores a counter that indicates how many times the same pixel is repeated.
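The run-length idea described above can be illustrated with a minimal sketch. The function names `rle_encode` and `rle_decode` and the (value, count) pair layout are assumptions chosen for illustration; they are not taken from any particular implementation.

```python
from typing import List, Tuple


def rle_encode(pixels: List[int]) -> List[Tuple[int, int]]:
    """Replace each string of like pixels with one (value, count) pair."""
    runs: List[Tuple[int, int]] = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            # Same value as the current run: increment its counter.
            runs[-1] = (p, runs[-1][1] + 1)
        else:
            # A new value starts a new run of length one.
            runs.append((p, 1))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, count) pairs back into the original pixel string."""
    out: List[int] = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical scan line: four "sky" pixels followed by other content.
line = [7, 7, 7, 7, 3, 3, 9]
runs = rle_encode(line)
```

In this sketch the seven pixels collapse into three pairs, and decoding the pairs restores the original line exactly.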
In addition to run-length coding, a variable-length coding (“VLC”) technique can be employed. For example, Huffman coding is one way of realizing a variable-length code. Variable-length coding encodes the pixels that are more prevalent within a scene with a smaller number of bits than the number of bits utilized for pixels that are less prevalent in the scene.
Once again, using the illustration of a video picture with a blue sky scene portion, conventionally, the blue sky is represented by a 16 bit sized bit pattern. However, by utilizing the variable-length coder technique, only an 8 bit sized bit pattern is required for the blue sky scene portion. Thus, a 50% coding length savings may be achieved on this portion of the scene.
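As a rough illustration of variable-length coding, the following sketch builds a Huffman code from symbol frequencies. The function name `huffman_code` and the sample pixel values are hypothetical, and the bit counts it produces are for this toy input only, not the 16-bit and 8-bit figures of the illustration above.

```python
import heapq
from collections import Counter
from typing import Dict, List


def huffman_code(symbols: List[int]) -> Dict[int, str]:
    """Return a prefix-free bit string for each distinct symbol."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # least frequent subtree
        w2, _, c2 = heapq.heappop(heap)  # next least frequent subtree
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical scene: value 200 (the "sky") dominates the other values.
pixels = [200] * 12 + [90] * 3 + [15] * 1
code = huffman_code(pixels)
total_bits = sum(len(code[p]) for p in pixels)
```

The most prevalent value receives the shortest code, so the total bit count falls below what a fixed-length code for the same three values would need.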
By utilizing these compression techniques, it is possible to fit more than one channel into a single 6 MHz bandwidth. But, as more and more programming becomes available, it becomes more and more difficult to fit this larger amount of programming into the existing 6 MHz bandwidths. In other words, it is difficult to increase the number of channels that are carried within a single available bandwidth without increasing the amount of bandwidth used.
Digital TV is being utilized to increase both the amount of programming available and the quality of the picture available without increasing the bandwidth requirements. However, there is a tradeoff between the computing resources that are required to generate digital pictures, and the computing resources that are required to compress and decompress such pictures. These tradeoffs are based, in part, upon the relative costs of implementation of the various conventional techniques.
Specifically, in many video compression techniques, the frame memories are used to save the previously coded frame(s) that are used in compressing the new frames. In the decompression stage at the receiver, the frame memories are also used for the decoding process. But as to the cost, while the price of frame memories continues to decrease, frame memories still constitute a large portion of the cost of both encoders and decoders. It is, therefore, beneficial to reduce the required amount of frame memory, especially for video decoders.
For example, for an HDTV video decoder, one frame memory has a size of about 3.1 MB. This corresponds to a frame size of 1920 pixels by 1080 lines of luminance and ½ of that for the two other color components. Also, each video decoder requires at least 2 frame memories for the decoding of video.
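The 3.1 MB figure can be checked with a short calculation. The sketch below assumes one byte (8 bits) per sample, which the text does not state explicitly but which is consistent with the quoted size; the function name `frame_bytes` is hypothetical.

```python
def frame_bytes(width: int = 1920, height: int = 1080,
                bytes_per_sample: int = 1) -> int:
    """Bytes for one frame: full-size luminance plus half as much chroma."""
    luma = width * height * bytes_per_sample
    chroma = luma // 2  # the two color components together take half of luma
    return luma + chroma


one_frame = frame_bytes()   # 3,110,400 bytes, about 3.1 MB
two_frames = 2 * one_frame  # minimum for a decoder needing 2 frame memories
```

Under these assumptions, the two frame memories required for decoding amount to roughly 6.2 MB before any memory compression is applied.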
Conventional solutions have attempted to reduce the frame memory requirements for video decoders by a factor of two or more. However, these memory compression solutions are merely an extension of compression algorithms that are used for the storage or transmission of data, with some additional constraints. The following three constraints are considered important.
The first constraint relates to the timing issues in performing the compression of the frame, storing the frame into the frame memory, and then decompressing a portion of the frame within the required time interval for decompressing one frame of video. The second constraint is providing the ability to access arbitrary portions of the compressed frame for decoding. The third constraint is to guarantee that the designated memory storage is sufficient for the compressed frame.
The conventional solutions to this frame memory problem incorporate a rate controller that is intended to make the compressed video fit into the desired memory area. However, this rate controller solution requires a tradeoff between the accessibility of the data and the compression factor. Moreover, the rate controller solution cannot guarantee that the compressed data will fit into the designated memory area. Furthermore, the rate controller increases the computational requirements.
Thus, what is needed is an improved apparatus, system, and method for reducing the video frame memory required for a compressed video decoder.
Embodiments of the present invention are best understood by examining the detailed description and the appended claims with reference to the drawings. However, a brief summary of embodiments of the present invention follows.
Briefly described, an embodiment of the present invention comprises a device and a method that reduce the video frame memory required by a compressed video decoder.
In one embodiment, this improvement is achieved by a removal of the rate controller. Also, both a block compression technique and a fixed storage allocation technique are utilized, in order to lower the overall system cost, and to lower the frame memory requirements.
In a preferred embodiment, this improvement is achieved by performing a hierarchical transform, e.g., a Haar transform, that operates on the previously decoded frames. Then, the coefficients obtained from this transformation are quantized and then run-length coded, utilizing variable-length codes. The hierarchical transform preferably operates on an N×N block size with L levels of hierarchical decomposition, where N and L can be selected in advance. For example, in one preferred embodiment, N may equal 8, and L may equal 3.
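A hierarchical Haar decomposition of an N×N block with L levels can be sketched as follows. This is a minimal illustration that assumes an averaging/differencing form of the Haar step and a rows-then-columns order at each level; the text does not specify the normalization or ordering, and the function names are hypothetical.

```python
from typing import List

Block = List[List[float]]


def haar_1d(v: List[float]) -> List[float]:
    """One Haar step: pairwise averages followed by pairwise differences."""
    half = len(v) // 2
    avg = [(v[2 * i] + v[2 * i + 1]) / 2 for i in range(half)]
    diff = [(v[2 * i] - v[2 * i + 1]) / 2 for i in range(half)]
    return avg + diff


def ihaar_1d(v: List[float]) -> List[float]:
    """Invert one Haar step."""
    half = len(v) // 2
    out: List[float] = []
    for i in range(half):
        out += [v[i] + v[half + i], v[i] - v[half + i]]
    return out


def haar_2d(block: Block, levels: int) -> Block:
    """Apply `levels` of decomposition, recursing on the low-pass corner."""
    out = [[float(x) for x in row] for row in block]
    size = len(block)
    for _ in range(levels):
        for r in range(size):  # transform rows of the active corner
            out[r][:size] = haar_1d(out[r][:size])
        for c in range(size):  # then transform its columns
            col = haar_1d([out[r][c] for r in range(size)])
            for r in range(size):
                out[r][c] = col[r]
        size //= 2
    return out


def ihaar_2d(coeffs: Block, levels: int) -> Block:
    """Undo haar_2d by reversing the steps from the coarsest level out."""
    out = [row[:] for row in coeffs]
    size = len(coeffs) >> (levels - 1)
    for _ in range(levels):
        for c in range(size):
            col = ihaar_1d([out[r][c] for r in range(size)])
            for r in range(size):
                out[r][c] = col[r]
        for r in range(size):
            out[r][:size] = ihaar_1d(out[r][:size])
        size *= 2
    return out


# N = 8 and L = 3, as in the example values given in the text.
block = [[r * 8 + c for c in range(8)] for r in range(8)]
coeffs = haar_2d(block, levels=3)
restored = ihaar_2d(coeffs, levels=3)
```

With N equal to 8 and L equal to 3, the top-left coefficient of this sketch is the mean of the whole block, which plays the role of the DC coefficient, and the inverse restores the block before any quantization is applied.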
The compression system then fits the N×N block into an allocated storage of (N×N)/cf bytes, where cf designates the compression factor. For example, a nominal value of cf that equals 2, 3, or 4 may be utilized. The quantization process comprises a simple scaling of the coefficients. However, the DC coefficient is not scaled. The variable-length encoder comprises a run-length encoder that fits as many coefficients as is possible into the available space of the (N×N)/cf bytes.
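The fixed-allocation idea can be sketched as a packer that stops once the (N×N)/cf byte budget is reached. This is only an illustration under stated assumptions: the coefficients arrive as a flat list with the DC coefficient first, quantization is a division by a hypothetical `scale` with truncation toward zero, and every stored symbol is charged a fixed hypothetical cost of `bytes_per_symbol` bytes as a stand-in for true variable-length codes.

```python
from typing import List, Tuple


def pack_block(coeffs: List[float], n: int = 8, cf: int = 2,
               scale: int = 4, bytes_per_symbol: int = 2
               ) -> Tuple[float, List[Tuple[int, int]], int]:
    """Pack as many (zero_run, value) symbols as fit in (n*n)//cf bytes."""
    budget = (n * n) // cf
    dc = coeffs[0]           # the DC coefficient is kept unscaled
    used = bytes_per_symbol  # assumed cost of storing the DC coefficient
    symbols: List[Tuple[int, int]] = []
    run = 0
    for c in coeffs[1:]:
        q = int(c / scale)   # simple scaling, truncated toward zero
        if q == 0:
            run += 1         # extend the current run of zeros
            continue
        if used + bytes_per_symbol > budget:
            break            # budget exhausted: drop remaining coefficients
        symbols.append((run, q))
        used += bytes_per_symbol
        run = 0
    return dc, symbols, used


# Hypothetical 8x8 block: a DC value, a few AC values, then zeros.
sparse = [120, 8, 0, 0, -12, 3, 0, 16] + [0] * 56
dc, symbols, used = pack_block(sparse, cf=2)

# A dense block cannot overflow the allocation; extra symbols are dropped.
dense_dc, dense_symbols, dense_used = pack_block([100] * 64, cf=4)
```

Because the loop checks the budget before storing each symbol, the output of this sketch never exceeds the allocation, which is the property that lets the designated storage be guaranteed without a rate controller.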