1. Field of the Invention
The present invention relates to video compression and decompression algorithms.
2. Description of the Related Art
In a typical transform-based video compression algorithm, such as one conforming to the Moving Picture Experts Group (MPEG) family of algorithms, a block-based transform, such as a discrete cosine transform (DCT), is applied to blocks of image data corresponding either to pixel values or pixel differences generated, for example, based on a motion-compensated inter-frame differencing scheme. The resulting transform coefficients for each block are then typically quantized for subsequent encoding (e.g., run-length encoding followed by variable-length encoding) to generate an encoded video bitstream.
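By way of illustration only, the transform, quantization, and run-length stages described above can be sketched as follows. This is a minimal sketch, not any standard's normative procedure: the function names, the single uniform quantizer step, and the (run, level) pair format are assumptions made for the example, and variable-length encoding is omitted.

```python
import math

N = 8  # block size assumed for this sketch


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quantize(coeffs, step):
    """Uniform quantization with one step size (an assumption of this sketch)."""
    return [[int(round(value / step)) for value in row] for row in coeffs]


def zigzag_order(n=N):
    """Zigzag scan positions, ordered from low to high spatial frequency."""
    return sorted(
        ((i, j) for i in range(n) for j in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def run_length(levels):
    """Return (zero_run, level) pairs; trailing zeros are left implicit."""
    pairs, run = [], 0
    for i, j in zigzag_order():
        value = levels[i][j]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


flat_block = [[100] * N for _ in range(N)]
pairs = run_length(quantize(dct_2d(flat_block), 16))
```

For a block of constant pixel values, only the DC coefficient survives quantization, so the run-length output reduces to a single pair.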
Depending on the particular video compression algorithm, images may be designated as the following different types of frames for compression processing:
An intra (I) frame which is encoded using only intra-frame compression techniques,
A predicted (P) frame which is encoded using inter-frame compression techniques based on a reference frame corresponding to a previous I or P frame, and which can itself be used to generate a reference frame for encoding one or more other frames, and
A bi-directional (B) frame which is encoded using inter-frame compression techniques based on either (i) forward, (ii) reverse, or (iii) bi-directional prediction from either (i) a previous I or P frame, (ii) a subsequent I or P frame, or (iii) a combination of both, respectively, and which cannot itself be used to encode another frame. Note that, in P and B frames, one or more blocks of image data may be encoded using intra-frame compression techniques.
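The reference relationships among the three frame types can be illustrated with a short sketch. The function name `reference_frames` and the use of a display-order string such as "IBBP" are assumptions of this example; an actual encoder may select references differently and may reorder frames for transmission.

```python
def reference_frames(types):
    """Map each frame index to the indices of its candidate reference frames.

    `types` is a display-order string of 'I', 'P', and 'B' characters.
    Only I and P frames serve as anchors; B frames are never referenced.
    """
    anchors = [i for i, t in enumerate(types) if t in "IP"]
    refs = {}
    for i, t in enumerate(types):
        prev = max((a for a in anchors if a < i), default=None)
        nxt = min((a for a in anchors if a > i), default=None)
        if t == "I":
            refs[i] = ()            # intra-frame compression only
        elif t == "P":
            refs[i] = (prev,)       # previous I or P frame
        else:
            refs[i] = (prev, nxt)   # forward, reverse, or both
    return refs


refs = reference_frames("IBBP")
```

In this sketch, both B frames of "IBBP" may draw on frame 0 and frame 3, while frame 3 itself refers only to frame 0.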
In MPEG-2 encoding, an (8×8) quantization (Q) matrix can be defined (and updated) for each video frame, where each element in the Q matrix corresponds to a different DCT coefficient resulting from applying an (8×8) DCT to a block of pixel values or pixel differences. For a given frame, the elements in the defined Q matrix are scaled by a quantization parameter (mquant), which can vary from block to block within the frame, to generate quantizer values used to quantize the different blocks of DCT coefficients for that frame.
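The scaling of the Q matrix by mquant can be sketched as follows. This is a simplified illustration: the normalization constant of 16 and the round-to-nearest rule are assumptions of the sketch, and the special handling that a real MPEG-2 encoder applies (for example, to the intra DC coefficient) is omitted. The function names are hypothetical.

```python
def quantizer_values(q_matrix, mquant, norm=16):
    """Scale every Q matrix element by mquant to obtain per-coefficient steps.

    The division by `norm` is an assumption of this sketch.
    """
    return [[w * mquant / norm for w in row] for row in q_matrix]


def quantize_block(coeffs, q_matrix, mquant):
    """Quantize one block of DCT coefficients with its own mquant value."""
    steps = quantizer_values(q_matrix, mquant)
    return [
        [int(round(c / s)) for c, s in zip(c_row, s_row)]
        for c_row, s_row in zip(coeffs, steps)
    ]


flat_q = [[16] * 8 for _ in range(8)]      # hypothetical flat Q matrix
coeffs = [[12.0] * 8 for _ in range(8)]    # hypothetical DCT coefficients
fine = quantize_block(coeffs, flat_q, mquant=2)
coarse = quantize_block(coeffs, flat_q, mquant=4)
```

Because mquant can change from block to block, the same Q matrix yields coarser or finer quantization for different blocks of the same frame.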
The present invention is directed to a parameterized adaptation algorithm for updating the quantization matrix used during video compression, such as MPEG-2 compression. In a preferred embodiment, the parameterized Q matrix adaptation algorithm of the present invention is a real-time vision-optimized encoding (VOE) algorithm that does not require on-line computation of a visual discrimination model (VDM). The VOE element of this algorithm is that, in addition to other parameters of the algorithm, the functional relationship between the DCT statistics and the matrix parameterization is optimized based on the VDM, which can be any perceptual quality metric, using an exhaustive search.
According to embodiments of the present invention, the Q matrix is adapted based on the DCT statistics of the previously encoded frame of the same picture type. The DCT statistics are based on the slope of the main diagonal of the DCT map, which is averaged over a frame. The slope of the parameterized Q matrix is roughly inversely proportional to the slope of the main diagonal of the DCT map. The parameterization of the Q matrix may consist of three parameters: the slope of the matrix along the diagonal, the convexity of the matrix along the diagonal, and a specified constant offset. In one implementation of the present invention, the slope is updated for each frame type (i.e., I, P, and B frames), and the convexity is fixed to a constant. Another aspect of the algorithm is the mean adjustment of the matrix. Whenever the slope of a matrix changes (i.e., from frame to frame), the effective mean of the matrix should be changed, where the effective mean is preferably kept constant for a given frame.
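One possible reading of this parameterization is sketched below. The source does not give the functional form, so everything here is an assumption made for illustration: the quadratic form in the diagonal position r = (i + j) / 2, the least-squares estimate of the DCT-map diagonal slope, the `gain` constant in the inverse-proportional mapping, and the use of the arithmetic mean as the "effective mean".

```python
def diagonal_slope(dct_map):
    """Least-squares slope of the main-diagonal entries dct_map[k][k]."""
    n = len(dct_map)
    xs = list(range(n))
    ys = [dct_map[k][k] for k in xs]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def q_slope_from_dct(dct_slope, gain=20.0, fallback=2.0):
    """Hypothetical inverse-proportional mapping from DCT slope to Q slope."""
    return gain / abs(dct_slope) if dct_slope != 0 else fallback


def parameterized_q_matrix(slope, convexity, offset, n=8):
    """Q matrix from slope, convexity, and constant offset (assumed form)."""
    return [
        [offset + slope * ((i + j) / 2) + convexity * ((i + j) / 2) ** 2
         for j in range(n)]
        for i in range(n)
    ]


def adjust_mean(q_matrix, target_mean):
    """Shift all elements so the arithmetic mean equals target_mean."""
    n = len(q_matrix)
    shift = target_mean - sum(map(sum, q_matrix)) / (n * n)
    return [[value + shift for value in row] for row in q_matrix]


# Hypothetical DCT map whose diagonal falls by 10 per step.
dct_map = [[100.0 - 10.0 * max(i, j) for j in range(8)] for i in range(8)]
slope = q_slope_from_dct(diagonal_slope(dct_map))
q = adjust_mean(parameterized_q_matrix(slope, 0.0, 16.0), target_mean=20.0)
```

A steeper fall-off along the DCT-map diagonal gives a flatter Q matrix under this mapping, and the mean adjustment keeps the overall quantization level from drifting when the slope is updated.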
According to one embodiment, the present invention is a method for processing a current frame of video data, comprising the steps of (a) generating a transform map for a previously encoded frame of video data of the same type as the current frame; (b) generating one or more quantization (Q) matrix shape parameters using the transform map for the previously encoded frame; (c) generating a Q matrix for the current frame using a parameterized function based on the one or more Q matrix shape parameters; (d) quantizing transform coefficients corresponding to the current frame based on the Q matrix; and (e) generating part of an encoded video bitstream based on the quantized transform coefficients.
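Steps (a) through (d) can be sketched as a small stateful helper that stores one transform map per frame type. The class name `QMatrixAdapter`, its default constants, the mean-absolute-value transform map, and the end-to-end diagonal slope are all hypothetical choices for this example. Step (e), entropy coding of the quantized coefficients, is left out.

```python
class QMatrixAdapter:
    """Per-frame-type Q matrix adaptation (illustrative sketch only)."""

    def __init__(self, default_slope=2.0, gain=40.0, offset=16.0, n=8):
        self.default_slope = default_slope
        self.gain = gain
        self.offset = offset
        self.n = n
        self.prev_map = {}  # frame type -> map of last encoded frame of that type

    def transform_map(self, blocks):
        """Step (a): mean absolute coefficient at each position over a frame."""
        n = self.n
        return [
            [sum(abs(b[i][j]) for b in blocks) / len(blocks) for j in range(n)]
            for i in range(n)
        ]

    def shape_slope(self, frame_type):
        """Step (b): derive a shape parameter from the previous same-type map."""
        prev = self.prev_map.get(frame_type)
        if prev is None:
            return self.default_slope
        n = self.n
        diag = (prev[n - 1][n - 1] - prev[0][0]) / (n - 1)
        return self.gain / abs(diag) if diag != 0 else self.default_slope

    def q_matrix(self, slope):
        """Step (c): parameterized Q matrix (linear along the diagonal here)."""
        n = self.n
        return [[self.offset + slope * (i + j) / 2 for j in range(n)]
                for i in range(n)]

    def encode_frame(self, frame_type, blocks):
        """Step (d): quantize the frame, then store its map for the next one."""
        q = self.q_matrix(self.shape_slope(frame_type))
        n = self.n
        levels = [
            [[int(round(b[i][j] / q[i][j])) for j in range(n)] for i in range(n)]
            for b in blocks
        ]
        self.prev_map[frame_type] = self.transform_map(blocks)
        return levels


adapter = QMatrixAdapter()
block = [[80.0 - 10.0 * max(i, j) for j in range(8)] for i in range(8)]
levels = adapter.encode_frame("P", [block])
next_p_slope = adapter.shape_slope("P")
```

Keeping a separate map for each frame type means that a P frame is adapted only from the preceding P frame, and likewise for I and B frames, so the differing statistics of the three types do not mix.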