1. Field of the Invention
The present invention relates to image processing, and, in particular, to video compression.
2. Description of the Related Art
The goal of video compression processing is to encode image data to reduce the number of bits used to represent a sequence of video images while maintaining an acceptable level of quality in the decoded video sequence. This goal is particularly important in certain applications, such as real-time video conferencing, where transmission bandwidth limitations may require careful control over the bit rate, that is, the number of bits used to encode each image in the video sequence. In order to satisfy the transmission and other processing requirements of a video conferencing system, it is often desirable to have a relatively steady flow of bits in the encoded video bitstream.
Achieving a relatively uniform bit rate can be very difficult, especially for video compression algorithms that encode different images within a video sequence using different compression techniques. Depending on the video compression algorithm, images may be designated as the following different types of frames for compression processing:
An intra (I) frame which is encoded using only intra-frame compression techniques,
A predicted (P) frame which is encoded using inter-frame compression techniques based on a previous I or P frame, and which can itself be used as a reference frame to encode one or more other frames,
A bidirectional (B) frame which is encoded using bidirectional inter-frame compression techniques based on a previous I or P frame and a subsequent I or P frame, and which cannot be used to encode another frame, and
A PB frame which corresponds to two images--a P frame and a subsequent B frame--that are encoded as a single frame as in the H.263 video compression algorithm.
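By way of illustration only, the frame types listed above may be modeled as in the following Python sketch. The names FrameType, IMAGES_PER_FRAME, USABLE_AS_REFERENCE, and count_images are hypothetical and are not taken from any particular compression algorithm. The PB frame is deliberately omitted from the reference table, since the description above does not state whether it may serve as a reference.

```python
from enum import Enum


class FrameType(Enum):
    """Frame types named in the description (hypothetical model)."""
    I = "intra"
    P = "predicted"
    B = "bidirectional"
    PB = "pb"


# Number of source images carried by one encoded frame of each type.
# A PB frame carries two images: a P frame and a subsequent B frame.
IMAGES_PER_FRAME = {
    FrameType.I: 1,
    FrameType.P: 1,
    FrameType.B: 1,
    FrameType.PB: 2,
}

# Whether a frame of this type may be used to encode another frame.
# PB is left out on purpose: the text above does not say.
USABLE_AS_REFERENCE = {
    FrameType.I: True,
    FrameType.P: True,
    FrameType.B: False,
}


def count_images(frame_types):
    """Return how many source images a sequence of encoded frames represents."""
    return sum(IMAGES_PER_FRAME[t] for t in frame_types)
```

For example, a sequence consisting of an I frame, two PB frames, and a P frame represents six source images under this model.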
Depending on the actual image data to be encoded, these different types of frames typically require different numbers of bits to encode. For example, I frames generally require the greatest number of bits, while B frames generally require the fewest.
In a typical transform-based video compression algorithm, a block-based transform, such as a discrete cosine transform (DCT), is applied to blocks of image data corresponding to either pixel values or pixel differences generated, for example, based on a motion-compensated inter-frame differencing algorithm. The resulting transform coefficients for each block are then quantized for subsequent encoding (e.g., run-length encoding followed by variable-length encoding). The degree to which the transform coefficients are quantized (also referred to as the quantization level) directly affects both the number of bits used to represent the image data and the quality of the resulting decoded image. In general, higher quantization levels imply fewer bits and lower quality. As such, quantization level is often used as the primary variable for controlling the tradeoff between bit rate and image quality.
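The relationship between quantization level and the resulting bits and quality may be illustrated by the following minimal Python sketch. It is a simplification under stated assumptions: it uses a one-dimensional orthonormal DCT on a single row of eight samples rather than a two-dimensional block transform, a uniform quantizer that truncates toward zero, and the count of nonzero quantized coefficients as a rough stand-in for the number of bits. The function names and the quantizer step parameter are hypothetical.

```python
import math


def _scale(k, n):
    # Orthonormal DCT-II scaling factor for coefficient index k.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(samples):
    """Orthonormal 1-D DCT-II (a stand-in for a 2-D block transform)."""
    n = len(samples)
    return [
        _scale(k, n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(samples))
        for k in range(n)
    ]


def idct_1d(coeffs):
    """Inverse of dct_1d."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]


def quantize(coeffs, step):
    """Uniform quantization, truncating toward zero (an assumed quantizer)."""
    return [int(c / step) for c in coeffs]


def dequantize(levels, step):
    """Reconstruct approximate coefficients from quantized levels."""
    return [lv * step for lv in levels]


def summarize(samples, step):
    """Return (nonzero coefficient count, mean squared error) for a given step."""
    levels = quantize(dct_1d(samples), step)
    recon = idct_1d(dequantize(levels, step))
    nonzero = sum(1 for lv in levels if lv != 0)
    mse = sum((a - b) ** 2 for a, b in zip(samples, recon)) / len(samples)
    return nonzero, mse
```

Calling summarize on the same row of samples with a small step and then with a large step shows the tradeoff: the larger step leaves no more nonzero coefficients to encode than the smaller step, at the cost of a larger reconstruction error.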
At times, using quantization level alone may be insufficient to meet the bandwidth and quality requirements of a particular application. In such circumstances, it may become necessary to employ more drastic techniques, such as frame skipping, in which one or more frames are dropped from the video sequence. Such frame skipping may be used to sacrifice short-term temporal quality in the decoded video stream in order to maintain a longer-term spatial quality at an acceptable level.
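One possible way to decide when to skip a frame is sketched below in Python. This is not a method prescribed by any particular compression algorithm; it assumes a simple virtual buffer that fills with the bits of each encoded frame and drains at a fixed number of bits per frame period, and it skips a frame whenever the buffer fullness exceeds a threshold. The function name and the parameters bits_per_frame_period and skip_threshold are hypothetical.

```python
def simulate_frame_skipping(frame_bits, bits_per_frame_period, skip_threshold):
    """Simulate frame skipping driven by a virtual buffer (assumed model).

    frame_bits: bits each frame would need if it were encoded.
    bits_per_frame_period: bits the channel drains in one frame period.
    skip_threshold: buffer fullness above which the next frame is skipped.

    Returns a list of booleans, True where the frame was encoded.
    """
    buffer_fullness = 0
    decisions = []
    for bits in frame_bits:
        if buffer_fullness > skip_threshold:
            # Buffer too full: drop this frame so the channel can catch up.
            decisions.append(False)
        else:
            decisions.append(True)
            buffer_fullness += bits
        # The channel drains the buffer every frame period, encoded or not.
        buffer_fullness = max(0, buffer_fullness - bits_per_frame_period)
    return decisions
```

In this sketch, a single large frame (for example, an I frame) can push the buffer above the threshold, causing the following frame to be skipped until the channel has drained enough bits.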