The present invention relates to rescaling data. More specifically, the present invention relates to filtering data to allow higher reduction ratios. Still more specifically, the present invention provides techniques for selecting a cut-off index to filter data based on transform coefficients.
Video data is one particularly relevant form of data that can benefit from improved techniques for rescaling. Video rescaling schemes allow digitized video frames to be represented digitally in an efficient manner. Rescaling digital video makes it practical to transmit the compressed signal over digital channels at a fraction of the bandwidth required to transmit the original signal without compression. Generally, compressing data or further compressing compressed data is referred to herein as rescaling data. International standards have been created for video compression schemes. The standards include MPEG-1, MPEG-2, MPEG-4, H.261, H.262, H.263, H.263+, etc. The standardized compression schemes mostly rely on several key algorithmic components: motion compensated transform coding (for example, DCT transforms or wavelet/sub-band transforms), quantization of the transform coefficients, and variable length coding (VLC).
The motion compensated encoding removes the temporally redundant information inherent in video sequences. The transform coding enables orthogonal spatial frequency representation of spatial domain video signals. Quantization of the transformed coefficients reduces the number of levels required to represent a given digitized video sample and reduces bit usage in the compression output stream. The other factor contributing to rescaling is variable length coding (VLC), which represents frequently used symbols using shorter code words. In general, the number of bits used to represent a given image determines the quality of the decoded picture. The more bits used to represent a given image, the better the image quality. The system that is used to compress a digitized video sequence using the above described schemes is called an encoder or encoding system.
More specifically, motion compensation performs differential encoding of frames. Certain frames, such as I-frames in MPEG-2, continue to store the entire image, and are independent of other frames. Differential frames, such as B-frames or P-frames in MPEG-2, store motion vectors associated with the difference and coordinates of particular objects in the frames. The pixel-wise difference between objects is called the error term. In MPEG-2, P-frames reference a single frame while B-frames reference two different frames. Although this allows fairly high reduction ratios, motion compensation is limited when significant changes occur between frames. When significant changes occur between frames in a video sequence, a large number of frames are encoded as reference frames. That is, entire images and not just motion vectors are maintained in a large number of frames. This precludes high reduction ratios. Furthermore, motion compensation can be computationally expensive.
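The error term described above can be illustrated with a minimal sketch. It assumes frames are plain two-dimensional lists of pixel values and that a motion vector is a (row offset, column offset) pair pointing into the reference frame. The function names predict_block and error_term are hypothetical and chosen only for this illustration; they are not taken from any standard.

```python
def predict_block(reference, top, left, mv, size):
    """Fetch the size x size block the motion vector points to in the reference frame."""
    dy, dx = mv
    return [[reference[top + dy + r][left + dx + c] for c in range(size)]
            for r in range(size)]


def error_term(current, reference, top, left, mv, size):
    """Pixel-wise difference between the current block and its motion-compensated prediction."""
    pred = predict_block(reference, top, left, mv, size)
    return [[current[top + r][left + c] - pred[r][c] for c in range(size)]
            for r in range(size)]


# Reference frame, and a current frame in which the content moved one pixel right.
reference = [[r * 4 + c for c in range(4)] for r in range(4)]
current = [[reference[r][c - 1] if c >= 1 else 0 for c in range(4)] for r in range(4)]

# With the correct motion vector the error term is all zeros.
good = error_term(current, reference, top=0, left=1, mv=(0, -1), size=2)
# With a zero motion vector a nonzero residual remains and must be coded.
bad = error_term(current, reference, top=0, left=1, mv=(0, 0), size=2)
```

When the motion vector matches the actual displacement, the residual is zero and only the vector needs to be stored. When no good match exists, the residual carries most of the block, which is the situation in which motion compensation yields little reduction.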
Each frame can be converted to luminance and chrominance components. As will be appreciated by one of skill in the art, the human eye is more sensitive to the luminance than to the chrominance of an image. In MPEG-2, luminance and chrominance frames are divided into 8×8 pixel blocks. The 8×8 pixel blocks are transformed using a discrete cosine transform (DCT) and scanned to create a DCT coefficient vector. Quantization involves dividing the DCT coefficients by a scaling factor. The divided coefficients can be rounded to the nearest integer. After quantization, some of the quantized elements become zero. The many levels represented by the transform coefficients are reduced to a smaller number of levels after quantization. With fewer levels represented, more sequences of numbers are similar. For example, the sequence 4.9 4.1 2.2 1.9 after division by two and rounding becomes 2 2 1 1. As will be described below, a sequence with more similar numbers can more easily be encoded using either arithmetic or Huffman coding. However, quantization is an irreversible process and hence introduces significant loss of information associated with the original frame or image.
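The transform, scan, and quantization steps described above can be sketched as follows. This is an illustrative sketch only: it assumes an orthonormal two-dimensional DCT-II, a conventional zigzag scan, and a single uniform scaling factor, whereas a real encoder would typically use per-coefficient quantization tables. The helper names dct_2d, zigzag_order, scan, and quantize are hypothetical.

```python
import math

N = 8  # block size, matching the 8x8 blocks described in the text


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of lists."""
    def norm(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = norm(u) * norm(v) * total
    return out


def zigzag_order(n=N):
    """(row, col) positions in zigzag order, lowest spatial frequencies first."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def scan(coeff_block):
    """Flatten a coefficient block into a vector using the zigzag order."""
    return [coeff_block[r][c] for r, c in zigzag_order(len(coeff_block))]


def quantize(coeffs, scale):
    """Divide each coefficient by the scaling factor and round to the nearest integer."""
    return [round(c / scale) for c in coeffs]


# A flat gray block has all of its energy in the first (DC) coefficient.
flat = [[128] * N for _ in range(N)]
vector = scan(dct_2d(flat))
quantized = quantize(vector, 16)
```

Calling quantize([4.9, 4.1, 2.2, 1.9], 2) reproduces the sequence 2 2 1 1 given in the text. Because the rounding discards the fractional part, the original values cannot be recovered from the quantized ones.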
Huffman or arithmetic coding takes the most common long sequences of numbers or bits and replaces them with a shorter sequence of numbers or bits. Again, the effectiveness of Huffman or arithmetic coding is limited by the availability of common sequences of numbers or bits. Sequences that contain many different numbers are more difficult to encode.
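The effect of repeated values on code length can be seen in a small sketch. It builds a textbook Huffman code over individual symbols and counts the bits needed for the payload, ignoring the cost of transmitting the code table. The function names huffman_code_lengths and encoded_bits are hypothetical, and the symbol-level model is a simplifying assumption rather than the run-length-based coding used by the standards named earlier.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from the symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries are (weight, tie_breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encoded_bits(symbols):
    """Total payload bits when each symbol is replaced by its Huffman code word."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)


before = encoded_bits([4.9, 4.1, 2.2, 1.9])  # four distinct values
after = encoded_bits([2, 2, 1, 1])           # two distinct values after quantization
```

In this sketch the unquantized sequence needs two bits per symbol because every value is distinct, while the quantized sequence needs one bit per symbol.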
Currently available compression techniques for rescaling data (e.g. video or audio) are limited in their ability to effectively compress data sequences for transmission across networks or storage on computer readable media. The available techniques also have significant limitations with respect to loss, computational expense, and delay. Various techniques for reducing the bit rate of compressed data sequences including audio and video streams are being developed. Some of the more promising approaches are described in U.S. Pat. No. 6,181,711 titled System And Method For Transporting A Compressed Video And Data Bitstream Over A Communication Channel. Other approaches are described in U.S. patent application Ser. No. 09/608,128 titled Methods And Apparatus For Bandwidth Scalable Transmission Of Compressed Video Data Through Resolution Conversion, U.S. patent application Ser. No. 09/766,020 titled Methods For Efficient Bandwidth Scaling Of Compressed Video Data, and U.S. patent application Ser. No. 08/985,377 titled System And Method For Spatial Temporal-Filtering For Improving Compressed Digital Video. Each of these references is assigned to the assignee of this invention and is incorporated herein by reference for all purposes. It is still desirable to provide additional techniques for rescaling data that improve upon the limitations of the prior art.
According to the present invention, methods and apparatus are provided for selecting a cut-off index to filter transform coefficients associated with an input bitstream to provide filtered transform coefficients associated with a rescaled output bitstream. An arrangement of transform coefficients associated with an input data sequence (e.g. an audio segment or a video image block) is filtered using a cut-off index to provide modified transform coefficients associated with a modified output data sequence. Information including information about the input data sequence and the modified output data sequence can be used to update the filter cut-off index for reduction of future data sequences.
One aspect of the invention provides a method for reducing the bit rate of a video bitstream to meet bandwidth constraints. The method can be characterized as follows: identifying transform coefficients representing video content in a frame or a portion of a frame of the video bitstream; identifying a cut-off index using rate control information; and filtering selected transform coefficients from the video bitstream by using the cut-off index to thereby reduce the bit rate of the video bitstream.
The transform coefficients can be discrete cosine transform coefficients. The transform coefficients selected for filtering can be selected based upon their impact on human vision. Frequency considerations can be used for determining a cut-off index.
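One possible reading of filtering with a cut-off index is sketched below. It assumes the coefficients are already arranged from low to high spatial frequency (for example by a zigzag scan) and that every coefficient at or beyond the cut-off index is set to zero. Both the zeroing behavior and the function name filter_coefficients are assumptions made for illustration.

```python
def filter_coefficients(coeffs, cutoff):
    """Keep the coefficients before the cut-off index and zero the rest."""
    cutoff = max(0, min(cutoff, len(coeffs)))  # clamp to a valid position
    return list(coeffs[:cutoff]) + [0] * (len(coeffs) - cutoff)


# Coefficients ordered from low to high frequency; the values are made up.
scanned = [12, -5, 3, 0, 2, 1, 0, 1]
filtered = filter_coefficients(scanned, 3)
```

Zeroing the tail of the vector removes the high-frequency terms, to which human vision is generally less sensitive, and leaves a long run of zeros that a variable length coder can represent compactly.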
Another aspect of the invention pertains to a method for determining a cut-off index for a plurality of transform coefficients. The method can be characterized as follows: identifying a plurality of transform coefficients; identifying a cut-off index using rate control information; filtering the plurality of transform coefficients using the cut-off index to provide modified transform coefficients; and adjusting the rate control information using information associated with the transform coefficients and the modified transform coefficients.
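The feedback between filtering and rate control can be sketched as a simple loop. The sketch makes several assumptions that do not come from the text: the count of nonzero coefficients stands in for the bit cost of a block, the cut-off index moves by one position per block, and filtering zeroes every coefficient at or beyond the cut-off. The names RateControl, target_nonzero, filter_coefficients, and rescale are hypothetical.

```python
from dataclasses import dataclass


def filter_coefficients(coeffs, cutoff):
    """Keep the coefficients before the cut-off index and zero the rest."""
    cutoff = max(0, min(cutoff, len(coeffs)))
    return list(coeffs[:cutoff]) + [0] * (len(coeffs) - cutoff)


@dataclass
class RateControl:
    target_nonzero: int   # assumed budget: nonzero coefficients allowed per block
    cutoff: int = 64      # current cut-off index (64 keeps all of an 8x8 block)
    min_cutoff: int = 1   # always keep at least the first coefficient
    max_cutoff: int = 64

    def update(self, original, filtered):
        """Adjust the cut-off index from the cost of the block just processed."""
        used = sum(1 for c in filtered if c != 0)
        available = sum(1 for c in original if c != 0)
        if used > self.target_nonzero:
            # Over budget: filter more aggressively on the next block.
            self.cutoff = max(self.min_cutoff, self.cutoff - 1)
        elif used < self.target_nonzero and used < available:
            # Under budget and something was removed: relax the filter.
            self.cutoff = min(self.max_cutoff, self.cutoff + 1)


def rescale(blocks, rate_control):
    """Filter each coefficient vector, feeding the result back into rate control."""
    output = []
    for coeffs in blocks:
        filtered = filter_coefficients(coeffs, rate_control.cutoff)
        rate_control.update(coeffs, filtered)
        output.append(filtered)
    return output


rc = RateControl(target_nonzero=4, cutoff=8, max_cutoff=8)
result = rescale([[5, 4, 3, 2, 1, 1, 1, 1]] * 6, rc)
```

In this run the cut-off index steps down from 8 to 4 over the first four blocks and then holds, since a cut-off of 4 leaves exactly the budgeted number of nonzero coefficients. A practical rate controller would more likely measure actual coded bits and adapt the step size.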
Another aspect of the invention is an apparatus for filtering transform coefficients to provide modified transform coefficients. The apparatus can be characterized as follows: a filter stage for receiving transform coefficients, the filter stage associated with a cut-off index, wherein the filter stage selectively filters transform coefficients using the cut-off index; and a rate control stage coupled to the output of the filter stage, the rate control stage providing feedback to the filter stage for adjusting the cut-off index using rate control information.
Another aspect of the invention pertains to computer program products including a machine readable medium on which are stored program instructions, tables or lists, and/or data structures for implementing a method as described above. Any of the methods, tables, or data structures of this invention may be represented as program instructions that can be provided on such computer readable media.