The use of digital, as opposed to analog, signals for television broadcasts and the transmission of other types of video and audio signals has been proposed as a way of improving picture quality and of using spectral bandwidth more efficiently than is currently possible using analog NTSC television signals.
Because of the relatively large amount of digital data required to represent a video image, many algorithms for video compression use motion compensation techniques, e.g., motion vectors, and Discrete Cosine Transform (DCT) coding, to reduce the amount of video data required to represent a video image.
Motion vectors are used to avoid the need to retransmit the same video data for multiple frames. A motion vector refers to a previous or subsequent video frame and identifies video data that should be copied from the previous or subsequent frame and incorporated into the current video frame. A motion vector normally specifies vertical and horizontal indices identifying the block of data to be copied and the offset, if any, between the location of the identified video data in the previous or subsequent frame and the location in the current frame at which the specified video data is to be inserted. Some standards, such as the MPEG standard discussed below, allow the location offset information included in a motion vector to be specified to a resolution of half a pel, i.e., half pel resolution.
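The block copy described above can be illustrated with a minimal Python sketch. The function name `predict_block`, its parameters, and the rounded averaging used for half pel positions are assumptions made for illustration and are not taken from any particular decoder. The offset is given in half pel units and is assumed to keep the block inside the reference frame.

```python
def predict_block(ref, top, left, mv_y2, mv_x2, size=2):
    """Copy a size x size block from reference frame `ref` (a 2-D list).

    (top, left) is the block position in the current frame; (mv_y2, mv_x2)
    is the motion vector offset in half pel units (hypothetical convention).
    No bounds checking is done: the caller must keep the block in the frame.
    """
    y0 = top + (mv_y2 >> 1)    # integer pel part of the offset (floor)
    x0 = left + (mv_x2 >> 1)
    hy = mv_y2 & 1             # 1 if a vertical half pel remains
    hx = mv_x2 & 1             # 1 if a horizontal half pel remains
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            a = ref[y0 + r][x0 + c]
            b = ref[y0 + r][x0 + c + hx]
            d = ref[y0 + r + hy][x0 + c]
            e = ref[y0 + r + hy][x0 + c + hx]
            # Rounded mean of the neighbors; reduces to `a` at full pel.
            row.append((a + b + d + e + 2) // 4)
        out.append(row)
    return out
```

With a full pel offset the four samples coincide and the block is copied unchanged; with a half pel offset each output sample is the rounded mean of the two or four neighboring reference samples.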
The ISO MPEG (International Standards Organization--Moving Picture Experts Group) ("MPEG") standard is an example of one standard which uses motion vectors and DCT coding in order to reduce the amount of data required to represent a video image.
One version of the MPEG standard, MPEG-2, is described in the International Standards Organization--Moving Picture Experts Group, Drafts of Recommendation H.262, ISO/IEC 13818-1 and 13818-2 titled "Information Technology--Generic Coding Of Moving Pictures and Associated Audio" hereby expressly incorporated by reference.
A known full resolution video decoder 10 is illustrated in FIG. 1. As illustrated, the known video decoder 10 includes a channel buffer 12, a syntax parser/VLD and master state controller circuit 14, an inverse quantization circuit 16, an inverse DCT (IDCT) circuit 18, a multiplexer 20, a summer 22, an anchor frame memory 24, and a motion compensated prediction module 25 which are coupled together as illustrated in FIG. 1.
The channel buffer 12 receives and temporarily stores encoded video data received from a transport decoder before supplying the encoded video data to the syntax parser/VLD and master state controller circuit 14. The syntax parser/VLD portion of the circuit 14 is responsible for parsing and variable length decoding the encoded video data while the master state controller is responsible for generating various timing control signals used throughout the decoder 10. The inverse quantization circuit 16 receives the video data from the circuit 14 and performs an inverse quantization operation to generate a plurality of DCT coefficients and other data which are supplied to the IDCT circuit 18. In the full resolution decoder 10, the IDCT circuit 18 performs a full order IDCT operation on the DCT coefficients it receives. This means that if the video data was originally encoded using 8×8 DCT coefficient blocks it is decoded by performing an 8×8 IDCT operation.
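The inverse quantization and full order IDCT stages described above can be sketched in Python as follows. This is an illustrative sketch only: `dequantize` assumes a single uniform step size, which is far simpler than the quantization rules of a real standard, and both function names are hypothetical. `idct2` is a direct, unoptimized evaluation of the two dimensional inverse DCT with orthonormal scaling for an N×N block.

```python
import math

def dequantize(levels, step):
    """Simplified inverse quantization: scale each level by one step size.

    Assumption for illustration; real decoders use per-coefficient weights.
    """
    return [[lvl * step for lvl in row] for row in levels]

def idct2(coeffs):
    """Direct N x N inverse DCT (orthonormal scaling), N = len(coeffs)."""
    n = len(coeffs)

    def c(k):
        # Normalization factor for the zero-frequency basis function.
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for u in range(n):
                for v in range(n):
                    s += (c(u) * c(v) * coeffs[u][v]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * n)))
            out[i][j] = 2.0 / n * s
    return out
```

For example, an 8×8 block whose only nonzero level is a DC level of 5, dequantized with a step of 16, yields a DC coefficient of 80 and decodes to a flat block in which every pixel is approximately 10.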
The output of the IDCT circuit is coupled to a first input of the multiplexer 20 and to the first input of the summer 22. A second input of the MUX 20 is coupled to the output of the summer 22.
In the case of intra-coded video frames, the MUX 20 is controlled, as is known in the art, to output the video data generated by the IDCT circuit 18. This data is stored in the anchor frame memory 24 for use in subsequent predictions and is also output for display.
The motion compensated prediction (MCP) module 25 includes first and second motion compensated prediction circuits 28, 29, an average prediction circuit 30 and a MUX 31. The MCP module 25 is capable of performing one way, e.g., forward or backward, prediction as well as two way prediction. The first MCP circuit 28 is responsible for performing one way prediction or the first of the two ways of prediction if two way prediction is employed. The second MCP circuit 29 is used to perform the second prediction when two way prediction is employed.
The average prediction circuit 30 is responsible for averaging the results produced by the first and second MCP circuits 28, 29 when two way prediction is used. The MUX 31 is controlled, as is known in the art, to output the signal from the first MCP circuit 28 when one way prediction is being used and the output of the average prediction circuit 30 when two way prediction is being performed. The output of the motion compensated prediction module 25 is coupled to an input of the summer 22.
The summer 22 combines the output of the IDCT circuit 18 with the output of the MCP module 25 to produce data representing a fully decoded video image in the case of an inter-coded video image.
As is known in the art, the MUX 20 is controlled to select and supply to the anchor frame memory 24 the output of the IDCT circuit 18 in the case of intra-coded video images, and the output of the summer 22 in the case of inter-coded images.
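The selection and combination performed by the MUX 20, the summer 22, the average prediction circuit 30 and the MUX 31 can be summarized in a short Python sketch. The function names, the rounded averaging, and the clipping to an 8 bit sample range are assumptions made for illustration; the description above does not specify them.

```python
def clip8(value):
    """Clamp a sample to the 0..255 range (assumes 8 bit video samples)."""
    return max(0, min(255, value))

def reconstruct(idct_out, pred_first=None, pred_second=None):
    """Form a decoded block from the IDCT output and optional predictions.

    idct_out     -- 2-D list produced by the IDCT stage
    pred_first   -- output of the first MCP circuit, or None for intra-coded
    pred_second  -- output of the second MCP circuit (two way prediction)
    """
    if pred_first is None:
        # Intra-coded: the IDCT output is passed through unchanged.
        return [[clip8(v) for v in row] for row in idct_out]
    if pred_second is None:
        # One way prediction: use the first prediction directly.
        pred = pred_first
    else:
        # Two way prediction: rounded average of the two predictions.
        pred = [[(a + b + 1) // 2 for a, b in zip(ra, rb)]
                for ra, rb in zip(pred_first, pred_second)]
    # Inter-coded: the summer adds the IDCT output to the prediction.
    return [[clip8(r + p) for r, p in zip(rr, rp)]
            for rr, rp in zip(idct_out, pred)]
```

In this sketch the returned block is what would be written to the anchor frame memory for use in subsequent predictions.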
FIG. 2 is a simplified diagram of a portion 21 of the known full resolution video decoder 10 which follows the inverse quantization circuit 16 when configured for processing inter-coded video images. The illustrated portion 21 includes the IDCT circuit 18, the summer 22, the anchor frame memory 24 and the motion compensated prediction module 25. For purposes of simplicity, the MUX 20 is omitted from FIG. 2.
A relatively large amount of data may be required to represent a video image. This data must be stored, e.g., in an anchor frame memory, for decoding purposes. High definition video images, such as those used to provide HDTV, are an example of images where large amounts of data may be used to represent the video images.
In order to reduce the complexity and the cost of digital video decoders, various modifications to the portion 21 of the known full resolution decoder illustrated in FIG. 2 have been made. These techniques often include the use of downsampling to reduce the amount of data required to represent one or more video images thereby permitting a smaller anchor frame memory 24 to be used.
In some decoders, downsampling is achieved by extracting a subset, e.g., a 4×4 block of DCT coefficients, from each full block, e.g., 8×8 block, of DCT coefficients being processed. A reduced order IDCT, e.g., a 4×4 IDCT when processing images encoded using 8×8 blocks of DCT coefficients, is then performed on the extracted DCT coefficients. The DCT extraction operation may be performed by placing a DCT coefficient extraction circuit before the IDCT circuit 18 in the known decoder of FIG. 1. The reduced order IDCT may be accomplished by simply using a reduced order IDCT circuit, e.g., a 4×4 IDCT circuit, as the IDCT circuit 18.
By using a reduced order, e.g., 4×4, IDCT which matches the downsampled image size, IDCT circuitry requirements as well as memory requirements are reduced.
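The coefficient extraction and reduced order IDCT approach can be sketched in Python as shown below. The sketch assumes that the retained coefficients are the low frequency ones in the upper left corner of the block and that they are rescaled by the ratio of the block sizes so that average brightness is preserved under an orthonormal transform; both the function names and this scaling convention are illustrative assumptions.

```python
import math

def extract_low_freq(block, keep=4):
    """Keep the upper-left keep x keep coefficients of a full block.

    The coefficients are rescaled by keep / len(block) so that the mean
    pixel value is preserved (assumes orthonormal DCT scaling).
    """
    scale = keep / len(block)
    return [[block[u][v] * scale for v in range(keep)] for u in range(keep)]

def idct2(coeffs):
    """Direct N x N inverse DCT (orthonormal scaling), N = len(coeffs)."""
    n = len(coeffs)

    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for u in range(n):
                for v in range(n):
                    s += (c(u) * c(v) * coeffs[u][v]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * n)))
            out[i][j] = 2.0 / n * s
    return out

def decode_reduced(block8):
    """Decode an 8x8 coefficient block directly to a 4x4 pixel block."""
    return idct2(extract_low_freq(block8, keep=4))
```

Under these assumptions, an 8×8 block containing only a DC coefficient of 80 decodes to a 4×4 block of pixels of value approximately 10, the same value a full order 8×8 IDCT would produce for every pixel, while storing one quarter as many samples.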
In many cases, performing an IDCT where some DCT coefficients have been forced to or are treated as zero, in combination with downsampling, has the unfortunate side effect of introducing drift into images, e.g., inter-coded video images. Drift results when a motion vector which was intended to be applied to a full resolution image is instead applied to a downsampled image.
One known downsampling decoder which performs a reduced order, i.e., 4×4, inverse discrete cosine transform (IDCT) on a downsampled video image, i.e., an image originally represented by an 8×8 block of DCT coefficients, is described in H. G. Lim et al.'s article "A low complexity H.261-compatible software video decoder," Signal Processing: Image Communication 8, pp. 25-37, (1996) (hereinafter "the Lim et al. article").
The known approaches to performing drift reduction such as those described in the Lim et al. article are based on the use of a reduced order IDCT for downsampling, e.g., the use of a 4×4 IDCT to decode data coded using an 8×8 DCT. In such a case, each pixel represented by the DCT coefficients in the reduced order DCT block being decoded is a function of a single full order, e.g., 8×8, DCT block.
For various reasons, in a reduced cost decoder, in many cases it is desirable to perform a full order IDCT, e.g., with some of the DCT coefficients set to or treated as zeros, as opposed to performing a reduced order IDCT. After completion of the full order IDCT, downsampling may be performed to reduce memory requirements. This differs from the case where DCT coefficient extraction and a reduced order IDCT are performed to produce the downsampled image. Significantly, in video decoders which perform a full order IDCT operation followed by a downsampling operation, the pixels of the downsampled video image may be a function of several different full size DCT coefficient blocks. This complicates drift reduction processing.
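The dependence of a downsampled pixel on more than one full size block can be illustrated with a one dimensional Python sketch. The three tap (1, 2, 1)/4 filter, the decimation by two, and the function names are assumptions chosen for illustration; the text above does not specify a particular downsampling filter.

```python
def downsample_row(row):
    """Filter a row of decoded pixels with taps (1, 2, 1)/4, keep every 2nd.

    Edge samples are replicated. The filter is a hypothetical example.
    """
    n = len(row)
    out = []
    for k in range(n // 2):
        centre = 2 * k
        left = row[max(centre - 1, 0)]
        right = row[min(centre + 1, n - 1)]
        out.append((left + 2 * row[centre] + right + 2) // 4)
    return out

def contributing_blocks(k, n, block=8):
    """Indices of the full size blocks feeding downsampled pixel k."""
    centre = 2 * k
    taps = (max(centre - 1, 0), centre, min(centre + 1, n - 1))
    return {t // block for t in taps}
```

For a 16 pixel row made of two 8 pixel blocks, `contributing_blocks(2, 16)` returns `{0}` while `contributing_blocks(4, 16)` returns `{0, 1}`: the downsampled pixel at the block boundary draws on decoded data from both blocks, which is the situation that does not arise when each reduced order block is decoded from a single full order block.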
Unfortunately, because of the complexities associated with processing images which were generated using a full order IDCT followed by downsampling, the known drift reduction processing methods described in the Lim et al. article are not directly applicable to video decoders which use full order IDCTs followed by downsampling.
Accordingly, there is a need for methods and apparatus for reducing drift in video decoders which perform full order IDCTs followed by downsampling.
Another problem with known drift reduction techniques is that they do not support performing drift reduction on interlaced video where two fields may be combined into a single block for DCT processing, e.g., for performing an IDCT operation thereon.
Known decoders also suffer from the problem of inefficient allocation of drift reduction processing resources. For example, in the decoder described in the Lim et al. article, drift reduction techniques are applied uniformly to the generation of inter-coded video images without regard to the type of inter-coded video image being generated. In the case where computational resources are limited, e.g., in order to reduce costs, the uniform application of drift reduction to all inter-coded images being generated can be an inefficient allocation of processing resources.
Accordingly, there is a need for methods and apparatus for implementing drift reduction in downsampling decoders which utilize a full order IDCT followed by a downsampling operation. There is also a need for drift reduction methods and apparatus which are applicable to interlaced as well as non-interlaced video images regardless of whether a full or reduced order IDCT is performed.
In addition, there is a need for methods and apparatus which efficiently allocate drift reduction processing capability in order to maximize achieved drift reduction in systems with limited drift reduction processing capability, e.g., in low cost video decoders.