1. Field of the Invention
The present invention relates to signal processing, and, in particular, to computer-implemented processes and apparatuses for encoding and decoding video signals.
2. Description of the Related Art
This invention relates to signal processing which is often used to compress video image signals representative of video pictures into an encoded bitstream. Each picture may be a still image, or may be part of a plurality of successive pictures of video signal data that represent a motion video. As used herein, "picture" and "video picture" may interchangeably refer to signals representative of an image as hereinabove described.
The portion of an encoded bitstream representing a compressed picture may be stored in a mass storage device such as a hard disk drive or compact disc read-only memory (CD-ROM) in its compressed format in order to conserve storage space. When the compressed picture is later retrieved it may be decompressed and, for example, displayed on a monitor. A higher amount of compression of the blocks constituting an image tends to lower the number of bits needed to represent the image, but also tends to diminish the quality of the image reconstructed by the decoder.
The encoded bitstream may also be transmitted to one or more remote signal processing systems such as video conferencing nodes which decode the encoded signals. These video conferencing nodes may be personal computer (PC)-based systems communicating with each other over a selected transmission medium. Possible transmission media include Integrated Services Digital Network (ISDN) and Public Switched Telephone Network (PSTN) telephone connections. Although ISDN connections provide a higher bandwidth than PSTN connections, ISDN connections are currently less readily available and more expensive than PSTN connections.
Because transmission media have finite bandwidths, in order to provide video conferencing of satisfactory quality, each PC system preferably compresses or encodes in real time the video signals corresponding to the local participant and transmits the resulting compressed signals or bitstreams to the PC systems of the remote participants. In such a video conferencing system, each PC system also preferably receives and decompresses compressed signals from the PC systems of the remote participants to play the decompressed video signals locally. The encoder may also, in some usages, encode video pictures offline to perform more computation-intensive and more efficient encoding.
Such encoding operations that compress video image signals typically operate on subsets of the image, such as (8×8) blocks of pixels, or on macroblocks comprising a number of such blocks. A macroblock comprises a (16×16) array of luminance pixels (also known as "luma pels") and two associated (8×8) blocks of chroma information. The (16×16) luma array is further divided into four (8×8) blocks, and all six blocks in a macroblock are typically transformed using the forward discrete cosine transform (DCT), quantized, and further encoded.
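The macroblock layout described above can be sketched as follows. This is a minimal illustration, not a method taken from any particular standard: the function names, the list-of-rows representation, and the raster ordering of the four luma blocks are assumptions made for the example.

```python
def split_luma_macroblock(luma):
    """Split a 16x16 luma array (16 rows of 16 values) into four 8x8 blocks.

    Assumed order: top-left, top-right, bottom-left, bottom-right.
    """
    if len(luma) != 16 or any(len(row) != 16 for row in luma):
        raise ValueError("expected a 16x16 luma array")
    blocks = []
    for by in (0, 8):          # top half, then bottom half
        for bx in (0, 8):      # left half, then right half
            blocks.append([row[bx:bx + 8] for row in luma[by:by + 8]])
    return blocks


def macroblock_blocks(luma, chroma_a, chroma_b):
    """Return the six 8x8 blocks of a macroblock: four luma, then two chroma."""
    return split_luma_macroblock(luma) + [chroma_a, chroma_b]


# Hypothetical test data: a luma ramp and two flat chroma blocks.
luma = [[16 * y + x for x in range(16)] for y in range(16)]
chroma = [[128] * 8 for _ in range(8)]
blocks = macroblock_blocks(luma, chroma, chroma)
```

Each of the six resulting blocks would then be passed through the transform and quantization stages independently.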
Thus, the (8×8) blocks of the image to be encoded are typically transformed by a forward DCT to generate a transformed signal comprising 64 DCT coefficients, which are also arranged in an (8×8) block. One technique for controlling the bit rate of the encoded bitstream is to select varying quantization levels at the encoding stage which are applied to the DCT coefficients to produce coefficient indexes. Varying quantization levels may be produced by using a basic quantization table which is multiplied by the quantization level (also referred to as the quantizer step size or quantization scale). Thus, when a basic quantization table is utilized in this manner, the quantization scale corresponds to the quantization level. For example, a quantization scale of 7 corresponds to a quantization level of 7, where each entry in the basic quantization table is multiplied by 7 to produce a scaled quantization table that corresponds to quantization level 7. A particular quantization level is typically selected from within an acceptable range of quantization levels which are expected to produce approximately the desired codesize.
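As a rough sketch of the two steps just described, the following computes a forward DCT of one block and scales a basic quantization table by a quantization level. The orthonormal DCT-II normalization, the direct (slow) summation, the function names, and the contents of `BASE_TABLE` are all assumptions chosen for illustration; actual encoders use the tables and fast transforms specified by their respective standards.

```python
import math

N = 8  # block dimension


def forward_dct_8x8(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (direct form, for clarity only)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = c(u) * c(v) * s
    return coeffs


def scale_quant_table(base_table, level):
    """Multiply every entry of the basic table by the quantization level."""
    return [[entry * level for entry in row] for row in base_table]


# Hypothetical basic table: coarser steps toward higher frequencies.
BASE_TABLE = [[1 + u + v for v in range(N)] for u in range(N)]

flat = [[100] * N for _ in range(N)]
coeffs = forward_dct_8x8(flat)              # 64 coefficients in an 8x8 block
scaled = scale_quant_table(BASE_TABLE, 7)   # table for quantization level 7
```

For a flat block, all of the energy lands in the single DC coefficient `coeffs[0][0]`, which is why smooth image regions tend to compress well after quantization.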
In quantization, each DCT coefficient is divided by the quantization factor in the corresponding (8×8) block position in order to reduce the number of bits needed to represent the coefficient. As is appreciated by those skilled in the art, use of a coarser quantization table, associated with a coarser quantization level, implies using fewer bits to encode an image but at the cost of image quality. Use of finer quantization tables results in encoded bitstreams with more bits but with higher quality images upon decompression or decoding. This type of bit rate control is often referred to as primary bit rate control. Secondary bit rate control involves the dropping of pictures or images from the video stream. The secondary bit rate control is a back-up mode in case the primary bit rate control is insufficient.
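The trade-off between quantization level and quality can be illustrated with a small sketch. The rounding rule (round to nearest), the flat basic table, the synthetic coefficient values, and all function names are assumptions for the example; the count of nonzero indexes is used only as a rough proxy for the number of bits an entropy coder would spend.

```python
def scale_table(base_table, level):
    """Scaled quantization table: basic table times the quantization level."""
    return [[q * level for q in row] for row in base_table]


def quantize(coeffs, table):
    """Divide each coefficient by the factor at the same position and round."""
    return [[int(round(c / q)) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]


def dequantize(indexes, table):
    """Decoder side: multiply each index by the same quantization factor."""
    return [[i * q for i, q in zip(i_row, q_row)]
            for i_row, q_row in zip(indexes, table)]


def count_nonzero(indexes):
    return sum(1 for row in indexes for i in row if i != 0)


def max_abs_error(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical inputs: a flat basic table and coefficients that decay
# toward higher frequencies, as is typical of natural images.
BASE_TABLE = [[1] * 8 for _ in range(8)]
coeffs = [[200.0 / (1 + u + v) for v in range(8)] for u in range(8)]

fine_table = scale_table(BASE_TABLE, 4)
coarse_table = scale_table(BASE_TABLE, 32)

fine_idx = quantize(coeffs, fine_table)
coarse_idx = quantize(coeffs, coarse_table)

fine_nonzero = count_nonzero(fine_idx)
coarse_nonzero = count_nonzero(coarse_idx)
fine_err = max_abs_error(coeffs, dequantize(fine_idx, fine_table))
coarse_err = max_abs_error(coeffs, dequantize(coarse_idx, coarse_table))
```

With these inputs the coarser level leaves fewer nonzero indexes to encode but produces a larger reconstruction error, which is the loss that later appears as visible artifacts.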
Existing techniques for encoding video signals include the H.261 (P×64) video compression method developed by the International Telecommunication Union (ITU), and standards developed by the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO), such as the ISO/IEC 11172 (MPEG-1) and ISO/IEC 13818 (MPEG-2) standards.
In existing techniques for encoding video signals, a reconstructed picture decoded from the encoded bitstream can suffer quality degradation due to, inter alia, inaccuracies caused by the quantization and dequantization process. In particular, encoding pictures on a block-by-block basis in which quantization is applied may lead to artifacts in the decoded images in the form of edges between the blocks. Such block edge artifacts are also referred to as blocking effects or blockiness.
Post-filtering is one technique which may be employed to smooth out those edges. Conventional methods of post-filtering use linear filters applied indiscriminately over the image or just along block boundaries. However, such techniques attempt to reduce artifacts which are effectively built into encoded video signals, rather than reducing block edge artifacts before pictures are encoded. Such post-filtering methods also tend to reduce real edges that happen to correspond to block boundaries.
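The limitation of conventional post-filtering can be seen in a one-dimensional sketch. The 3-tap [1, 2, 1]/4 filter, the function name, and the pixel values below are hypothetical choices for illustration and are not taken from any specific decoder.

```python
def smooth_block_boundaries(row, block_size=8):
    """Apply a [1, 2, 1]/4 linear filter to the two pixels at each block boundary.

    Filtered values are computed from the unmodified input row.
    """
    out = list(row)
    for b in range(block_size, len(row), block_size):
        for i in (b - 1, b):  # last pixel of one block, first pixel of the next
            left = row[max(i - 1, 0)]
            right = row[min(i + 1, len(row) - 1)]
            out[i] = (left + 2 * row[i] + right) / 4.0
    return out


# A small step caused by quantization (a blocking artifact).
artifact_row = [100] * 8 + [108] * 8
# A genuine image edge that happens to fall on the same block boundary.
real_edge_row = [50] * 8 + [200] * 8

artifact_out = smooth_block_boundaries(artifact_row)
real_edge_out = smooth_block_boundaries(real_edge_row)
```

In this sketch the step across the boundary is halved in both rows, from 8 to 4 for the artifact and from 150 to 75 for the genuine edge. The filter has no way to tell the two cases apart, so it softens real detail along with the blockiness.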
There is thus a need for methods and apparatuses for encoding video signals to reduce blocking effects during the encoding process itself.