The invention relates to video and/or image compression and, more particularly, to methods and apparatus for context selection of block transform coefficients in a video and/or image compression system.
Lossless and near-lossless image and video compression techniques have generated considerable interest in the video processing community in recent years. Examples of such known techniques have been extensively described in the image and video compression literature, for example: Draft of MPEG-2: Test Model 5, ISO/IEC JTC1/SC29/WG11, April 1993; Draft of ITU-T Recommendation H.263, ITU-T SG XV, December 1995; “Lossless and near-lossless coding of continuous tone still images” (JPEG-LS), ISO/IEC JTC1/SC 29/WG 1, July 1997; B. Haskell, A. Puri, and A. N. Netravali, “Digital video: An introduction to MPEG-2,” Chapman and Hall, 1997; H. G. Musmann, P. Pirsch, and H. J. Grallert, “Advances in picture coding,” Proc. IEEE, vol. 73, no. 4, pp. 523-548, April 1985; N. D. Memon and K. Sayood, “Lossless compression of video sequences,” IEEE Trans. Communications, vol. 44, no. 10, pp. 1340-1345, October 1996; A. N. Netravali and B. G. Haskell, “Digital Pictures: Representation, Compression, and Standards,” 2nd Ed., Plenum Press, 1995; A. Said and W. A. Pearlman, “New, fast, and efficient image codec based on set partitioning in hierarchical trees,” IEEE Trans. Circuits and Systems for Video Technology, vol. 6, no. 3, pp. 243-249, June 1996; M. J. Weinberger, J. J. Rissanen, and R. B. Arps, “Applications of universal context modeling to lossless compression of gray-scale images,” IEEE Trans. Image Processing, vol. 5, no. 4, pp. 575-586, April 1996; X. Wu and N. Memon, “Context-based, adaptive, lossless image coding,” IEEE Trans. Communications, vol. 45, no. 4, pp. 437-444, April 1997; and Z. Xiong, K. Ramchandran, and M. T. Orchard, “Space frequency quantization for wavelet image coding,” IEEE Trans. Image Processing, vol. 6, 1997.
These conventional techniques have been used in an attempt to generate high quality, perceptually distortion-free compressed video and still images. One of the issues of interest in developing an image or video compression technique is the reduction of overhead data that must be sent by the encoder to the decoder for proper decoding of the coded bit stream. Approaches which have attempted to take this issue into consideration can be roughly classified into two categories: context-based predictive coding in the spatial domain and context-based coding in the wavelet domain. Examples of the spatial domain techniques are discussed in “Lossless and near-lossless coding of continuous tone still images” (JPEG-LS), ISO/IEC JTC1/SC 29/WG 1, July 1997; the Weinberger et al. article; and the Wu et al. article, as mentioned above. Examples of the wavelet domain techniques are discussed in the Memon et al. article; the Said et al. article; and the Xiong et al. article, as mentioned above.
While some of the aforementioned techniques do not require sending overhead information pertaining to coding parameters employed at an encoder to a corresponding decoder, the existing techniques for accomplishing this have exhibited a variety of shortcomings, e.g., high complexity and high cost to implement and operate. Thus, it would be highly advantageous to provide an improved compression technique for block transform coding which not only avoids the burden of transmitting coding parameter related information to a decoder but also eliminates, or at least substantially minimizes, the shortcomings of existing approaches.
The present invention provides methods and apparatus for context selection of an image or video sequence in the transform domain. The transform coefficients may be obtained using a particular block transform, e.g., a Hadamard transform. With proper context selection and pre-specified selection rules, an encoder according to the invention can change the coding parameters of each block and/or coefficient that is currently being encoded, depending solely on the context of the surrounding blocks or transform coefficients, without having to specifically send these coding parameters to a corresponding decoder.
In one aspect of the invention, a method for use in a block transform-based coding system of processing (e.g., encoding and/or decoding) one or more block transform coefficients associated with at least one block of visual data (e.g., an image and/or video sequence) comprises the following steps. First, one or more previously reconstructed block transform coefficients associated with the visual data are identified. Then, a context selection value is computed for use in processing a block transform coefficient associated with the at least one block, the context selection value being based on the one or more previously reconstructed block transform coefficients.
The context selection value may be computed as a function of one or more values respectively associated with one or more previously reconstructed block transform coefficients in near proximity, with respect to a scanning order, to the block transform coefficient to be processed. Further, the context selection value may be computed as a function of a spatial frequency associated with the block transform coefficient. In particular, previously reconstructed coefficients with the same spatial frequency (context) value may determine the coding parameters that will be used to encode the coefficient. Still further, the context selection value may be computed as a function of both the one or more values respectively associated with the one or more previously reconstructed block transform coefficients in near scanning order proximity and the spatial frequency associated with the block transform coefficient.
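The computation described above can be illustrated with a minimal sketch. It is not taken from the invention: the flat scanning-order layout, the window of two preceding coefficients, the magnitude thresholds, and the names `activity_class` and `context_value` are all assumptions chosen for illustration. The sketch combines a quantized measure of nearby reconstructed magnitudes with the coefficient's position inside its block, which stands in for its spatial frequency.

```python
from typing import Sequence


def activity_class(recon: Sequence[int], pos: int, window: int = 2,
                   thresholds: Sequence[int] = (1, 4, 16)) -> int:
    """Quantize the mean magnitude of the `window` reconstructed
    coefficients that precede `pos` in scanning order (hypothetical rule)."""
    neighbors = recon[max(0, pos - window):pos]
    if not neighbors:
        return 0  # nothing reconstructed yet: use the lowest class
    mean_mag = sum(abs(c) for c in neighbors) / len(neighbors)
    # The class is the number of thresholds the mean magnitude reaches.
    return sum(mean_mag >= t for t in thresholds)


def context_value(recon: Sequence[int], pos: int, coeffs_per_block: int = 16,
                  window: int = 2,
                  thresholds: Sequence[int] = (1, 4, 16)) -> int:
    """Combine spatial frequency and neighbor activity into one index.

    Assumes coefficients are stored block after block, so the position
    within a block (pos % coeffs_per_block) identifies the frequency.
    Only entries of `recon` before `pos` are read.
    """
    freq = pos % coeffs_per_block
    act = activity_class(recon, pos, window, thresholds)
    return freq * (len(thresholds) + 1) + act
```

Because the function reads only entries that precede `pos`, a decoder holding the same reconstructed coefficients would compute the same index before decoding the coefficient at `pos`.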
Since context selection for block transform coefficients at an encoder is accomplished according to the invention using only previously reconstructed samples, the encoder does not need to provide such coding parameter information to the corresponding decoder: the decoder can derive the same information from the same previously reconstructed samples used at the encoder. Advantageously, transmission bandwidth and/or storage capacity is saved.
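The following sketch illustrates why no coding parameters need to be transmitted. It assumes, purely for illustration, a lossless setting with Golomb-Rice codes whose parameter is chosen from the last two reconstructed values; the source does not name a particular entropy coder, and the rule in `rice_param` and all function names are hypothetical. The encoder and decoder each call the same rule on the same history, so they agree on the parameter for every coefficient.

```python
def rice_param(recon):
    """Hypothetical rule: smallest k with 2**k >= mean magnitude of the
    last two reconstructed values (k = 0 when there is no history)."""
    recent = recon[-2:]
    mean_mag = sum(abs(v) for v in recent) / len(recent) if recent else 0
    k = 0
    while (1 << k) < mean_mag:
        k += 1
    return k


def to_unsigned(v):
    """Map signed values to non-negative ones: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * v if v >= 0 else -2 * v - 1


def to_signed(u):
    """Inverse of to_unsigned."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def encode(values):
    """Encode to a bit string; the parameter k is never written to it."""
    bits, recon = [], []
    for v in values:
        k = rice_param(recon)            # derived from past values only
        u = to_unsigned(v)
        q, r = u >> k, u & ((1 << k) - 1)
        bits.append("1" * q + "0" + (format(r, f"0{k}b") if k else ""))
        recon.append(v)                  # lossless: reconstruction equals input
    return "".join(bits)


def decode(bits, count):
    """Decode `count` values, re-deriving k exactly as the encoder did."""
    recon, i = [], 0
    for _ in range(count):
        k = rice_param(recon)            # same rule, same history
        q = 0
        while bits[i] == "1":            # unary-coded quotient
            q += 1
            i += 1
        i += 1                           # skip the terminating "0"
        r = int(bits[i:i + k], 2) if k else 0
        i += k
        recon.append(to_signed((q << k) | r))
    return recon
```

In this sketch `decode(encode(values), len(values))` returns the original list even though the bit string carries no parameter fields, which is the bandwidth saving the paragraph above describes.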