The present invention relates to decoding of images and, more specifically, to methods and apparatus for performing video decoding in a system including a programmable processor, e.g., a central processing unit (CPU).
The digital representation of images is becoming ever more common. Digital video, e.g., digital high definition television (HDTV), represents one application where digital video signals comprise, e.g., a sequence of digital images. With the increasing use of the Internet, the generation, storage and transmission of digital video images is likely to continue to increase.
In order to reduce the amount of data required to represent digital images, such images are often encoded, e.g., compressed. Coding of images in the form of frames may be performed on an intra-frame basis so that the generated encoded image data does not depend on data from other images. Intra-frame coding is often performed using discrete cosine transform (DCT) coding techniques. Data generated from intra-frame coding is often called intra-coded data.
When a sequence of images is being coded, coding of images may be performed on an inter-frame, as well as an intra-frame, basis. Inter-frame coding generally involves using motion compensated prediction techniques to produce motion vectors which include information on how a portion of a current image may be generated using a portion of a preceding or subsequent image as reference data. Motion vector information, once generated, may be coded, e.g., using differential coding techniques, to reduce the amount of data required to represent the motion vector information.
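By way of illustration only, the differential coding of motion vector information mentioned above may be sketched as follows. The sketch is written in Python. The function names `encode_differential` and `decode_differential`, the representation of a motion vector as a `(dx, dy)` pair, and the use of `(0, 0)` as the initial predictor are assumptions made for the example; they are not taken from any particular coding standard.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]  # (horizontal, vertical) displacement; assumed layout


def encode_differential(vectors: List[MotionVector]) -> List[MotionVector]:
    """Replace each motion vector with its difference from the previous one."""
    deltas = []
    prev = (0, 0)  # assumed initial predictor
    for dx, dy in vectors:
        deltas.append((dx - prev[0], dy - prev[1]))
        prev = (dx, dy)
    return deltas


def decode_differential(deltas: List[MotionVector]) -> List[MotionVector]:
    """Reconstruct motion vectors by accumulating the coded differences."""
    vectors = []
    prev = (0, 0)  # must match the predictor assumed by the encoder
    for ddx, ddy in deltas:
        prev = (prev[0] + ddx, prev[1] + ddy)
        vectors.append(prev)
    return vectors


vectors = [(4, -2), (5, -2), (5, -1), (3, 0)]
deltas = encode_differential(vectors)
print(deltas)  # [(4, -2), (1, 0), (0, 1), (-2, 1)]
```

When neighboring portions of an image move in a similar manner, the differences tend to be small values, which generally require fewer bits to represent than the motion vectors themselves.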
When there is little motion between frames, motion vectors provide an efficient method of representing portions of an image. However, intra-frame coding can be more efficient when there is a substantial change from one frame to another. In an attempt to maximize coding efficiency, often some portions of an inter-coded frame are encoded using intra-frame coding techniques while other portions of the same image are coded using inter-frame coding, e.g., motion vectors. Those portions of inter-coded frames that use inter-frame coding may also include coded information regarding the residual, or correction, image. During decoding of such portions of frames, the coded residual data is decoded. The decoded residual image data is combined with image data generated by performing motion compensated predictions using reference frame data and motion vectors. Those portions of inter-coded frames that use intra-frame coding are decoded without the use of reference frame data or motion vectors. In this manner, a complete inter-coded frame can be generated from the intra- and inter-coded data used to represent the frame.
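The combination of decoded residual image data with a motion compensated prediction, as described above, may be sketched as follows. This is a simplified Python illustration under stated assumptions: motion vectors have whole-pixel precision, pixel values are 8-bit and are clipped to the range 0 to 255, and frames are lists of rows. The names `predict_block` and `reconstruct_block` are hypothetical.

```python
def predict_block(reference, top, left, mv, size):
    """Copy a size x size block from the reference frame, displaced by mv.

    mv is an assumed (dx, dy) pair of whole-pixel displacements.
    """
    dx, dy = mv
    return [
        [reference[top + dy + r][left + dx + c] for c in range(size)]
        for r in range(size)
    ]


def reconstruct_block(reference, top, left, mv, residual):
    """Add the decoded residual to the prediction and clip to 8-bit range."""
    size = len(residual)
    prediction = predict_block(reference, top, left, mv, size)
    return [
        [max(0, min(255, prediction[r][c] + residual[r][c])) for c in range(size)]
        for r in range(size)
    ]


# A small 4x4 reference frame used only for the example.
reference = [[10 * (r * 4 + c) for c in range(4)] for r in range(4)]
residual = [[1, -1], [0, 200]]
print(reconstruct_block(reference, 0, 0, (2, 1), residual))
# [[61, 69], [100, 255]]
```

Each reconstructed pixel requires a read from the reference frame, which is why this step is dominated by memory accesses rather than by arithmetic.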
MPEG-2 is a well known video standard which includes support for the use of motion compensated prediction techniques in addition to transform coding, e.g., the use of DCT transforms. The international video standard MPEG-2 is described in the International Organization for Standardization (ISO) document ISO/IEC 13818-2. MPEG-2 has been used as the basis for several commercial applications including digital video disks (DVD) and digital broadcast television.
In MPEG-2, motion vector information is differentially encoded for transmission purposes. In addition, quantization operations and run length coding operations are performed on video data to further reduce the amount of data required to represent the images being encoded. A scan conversion operation is also normally performed as part of the MPEG-2 coding process in order to convert a set of two dimensional coefficient data into a one dimensional data sequence which can be processed, stored and/or transmitted. Variable length coding is applied to many of the data elements in order to further reduce the number of bits needed to represent an image.
In order to view images represented using encoded video data, the image data has to be decoded prior to display. MPEG-2 decoding generally involves performing operations which are the inverse of those used to originally encode the image data, e.g., decoding usually involves variable length decoding, inverse scan conversion, inverse quantization, inverse discrete cosine transform (IDCT), motion vector reconstruction and motion compensated prediction operations.
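The scan conversion and run length coding operations described above may be illustrated with the following Python sketch. It is a simplification under stated assumptions: a zigzag scan pattern is used, a small 4x4 block stands in for the 8x8 blocks used in practice, all coefficients are treated alike, and trailing zeros are assumed to be signaled by an end-of-block marker that is not modeled here. The function names are hypothetical.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Positions are grouped by anti-diagonal; the traversal direction
        # alternates from one anti-diagonal to the next.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def scan(block):
    """Convert two dimensional coefficient data into a one dimensional sequence."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def run_length_encode(sequence):
    """Represent the sequence as (zero_run, value) pairs for nonzero values."""
    pairs = []
    run = 0
    for value in sequence:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs  # trailing zeros are implied by an assumed end-of-block marker


block = [
    [12, 5, 0, 0],
    [-3, 0, 0, 0],
    [0, 2, 0, 0],
    [0, 0, 0, 0],
]
print(scan(block))                     # [12, 5, -3, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0]
print(run_length_encode(scan(block)))  # [(0, 12), (0, 5), (0, -3), (5, 2)]
```

Because quantization tends to leave nonzero coefficients clustered near the top left corner of a block, the scan groups zeros into long runs, which the run length representation then expresses compactly.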
Given the large amount of data used to represent images, a considerable amount of processing is normally required to decode an encoded image. In the case of motion vectors, image decoding is complicated by the need to access reference frame data, e.g., previously decoded frames, in order to generate a current decoded frame. The need to access reference frame data to perform motion compensated predictions using motion vector information results in the motion compensated prediction operations being highly memory intensive. The large number of memory access operations associated with performing motion compensated predictions can often have a significant impact on the amount of resources required to decode an image.
Users of computers are beginning to expect that they will be able to decode and display video images, e.g., encoded motion pictures, in real time. In order to achieve the real time display of encoded video, the encoded video images, e.g., frames, on average, need to be decoded in the same or a smaller amount of time than is used to display the images. As discussed above, the amount of processing required to decode an image can be considerable. Placing real time constraints on decoder circuitry further complicates matters, since the decoding must be accomplished within a fixed amount of time.
Generally, known video decoders belong to one of three types: 1) those that use dedicated special purpose integrated decoder circuits coupled with video memory to fully decode encoded video data; 2) those that use software and a general purpose programmable processor, e.g., CPU such as a Pentium processor, to fully decode encoded video data; and 3) those that partition the video decoding between a general purpose programmable processing unit such as a Pentium processor and a graphics processor chip which is used to manage and/or perform memory intensive operations such as motion compensated prediction. A high speed video or graphics memory is often used in conjunction with the graphics processor to further accelerate image decoding operations.
In many high definition television sets and other video display devices, the first type of known decoder is used. That is, dedicated hardware decoder circuits are used to decode received encoded video data. FIG. 1 illustrates a known decoder circuit 100 capable of decoding MPEG-2 video signals.
As illustrated, the video decoder 100 comprises a variable length decoder (VLD) circuit 102, an inverse scan circuit 104, an inverse quantization circuit 106, an inverse discrete cosine transform (IDCT) circuit 108, a motion compensated prediction circuit 110, and a frame store memory 112. Motion vector reconstruction circuitry, which is used to reconstruct motion vectors from encoded motion vector information, is present but not explicitly shown in the FIG. 1 illustration.
In the known system, the VLD circuit 102 receives MPEG-2 encoded video data and performs a variable length decoding operation on variable length encoded data included therein. The inverse scan circuit 104 is responsible for re-sequencing elements of the video data output by the VLD circuit 102 to reverse the effect of the scan conversion operation originally used to convert the two dimensional coefficient data into a one dimensional ordering. The inverse quantization (IQUANT) circuit 106 performs an inverse quantization operation on the quantized values included in the video data output by the inverse scan circuit 104. The IQUANT circuit 106 generates a video data stream including DCT coefficients. The DCT coefficients are processed by the IDCT circuit 108 to generate decoded image data from received intra-coded image data. The output of the IDCT circuit, in the case of intra-coded data, is decoded video, e.g., pixel values which are used to control the light output of individual pixels.
As discussed above, inter-coded images may include both intra and inter coded image portions. The motion compensated prediction circuit 110 is responsible for using previously decoded image data, which has been stored in the frame store memory 112, as reference frame data. The motion compensated prediction circuit 110 outputs received decoded image data corresponding to intra-coded frames as well as decoded image data corresponding to inter-coded frames which it generates from received motion vectors and/or by combining decoded inter-coded image data with decoded image data.
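The inverse scan and inverse quantization operations performed by circuits of the type described above may be sketched in Python as follows. The sketch rests on several assumptions that should be noted: a zigzag scan pattern is used, a 4x4 block is used for brevity, and inverse quantization is reduced to multiplication by a single step size, which is considerably simpler than the arithmetic defined by MPEG-2. The function names are hypothetical and do not correspond to any element of FIG. 1.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def inverse_scan(sequence, n):
    """Place a one dimensional coefficient sequence back into an n x n block."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(sequence, zigzag_order(n)):
        block[r][c] = value
    return block


def inverse_quantize(block, step):
    """Scale quantized values by an assumed uniform step size."""
    return [[value * step for value in row] for row in block]


quantized_sequence = [3, 1, -1, 0, 0, 0, 0, 0, 2] + [0] * 7
block = inverse_scan(quantized_sequence, 4)
print(block)
# [[3, 1, 0, 0], [-1, 0, 0, 0], [0, 2, 0, 0], [0, 0, 0, 0]]
print(inverse_quantize(block, 8))
# [[24, 8, 0, 0], [-8, 0, 0, 0], [0, 16, 0, 0], [0, 0, 0, 0]]
```

Both operations work on one block of coefficients at a time and need no access to previously decoded frames.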
In the case of personal computers, the computer already includes a general purpose processor which is capable of being used for video decoding operations. Accordingly, existing personal computers usually use the second or third of the known video decoding techniques discussed above.
Unfortunately, software decoder implementations tend to be far slower than hardware implementations operating at comparable clock speeds. This is because the CPU circuitry is not optimized for video decoding as is the case with a dedicated hardware circuit. In addition, the relatively large amount of data which must be transferred from memory when performing motion compensated prediction operations tends to create a bottleneck with regard to decoding speed.
While the use of a graphics processor and a dedicated video memory can help improve video decoding performance, the use of a general purpose CPU to perform non-memory intensive decoder operations still normally results in a slower decoder implementation than would be achieved using dedicated decoder hardware operating at comparable clock speeds.
Given the processing capability of CPUs currently used in common personal computers, and the increasing tendency to run multiple applications at the same time, real time software based video decoding of HDTV and other high resolution video images is often impractical on modern personal computers.
In view of the above, it becomes apparent that there is a need for improved methods and apparatus for decoding video images in systems, e.g., personal computers, which include programmable CPUs. It is desirable that at least some of the new methods and/or apparatus be capable of being implemented at a lower cost than providing a complete dedicated video decoder circuit capable of decoding intra-coded and inter-coded image data. It is also desirable that at least some of the new methods and apparatus be capable of being used with systems which include a graphics processor for performing and/or overseeing memory intensive decoding functions such as motion compensated prediction operations.
The present invention is directed to methods and apparatus for decoding images. In accordance with the present invention, video decoder circuitry is combined with a programmable processor, e.g., a general purpose CPU, to perform video decoding operations. A graphics processor may be used in combination with the CPU and video decoder circuitry of the present invention to perform and/or oversee memory intensive video decoder operations.
The video decoder circuitry of the present invention performs one or more non-memory intensive video decoding operations such as syntax parsing, variable length decoding, DC DCT coefficient reconstruction, motion vector reconstruction, inverse scan operations, inverse quantization and/or mismatch control. Memory intensive video decoding operations such as performing prediction operations using motion vectors and reference frames are performed by the programmable processor and/or graphics processor operating in conjunction with the video decoding hardware of the present invention. Because the functions performed by the video decoder circuitry of the present invention are normally performed at the beginning portions of the inter-coded decoding process, e.g., prior to performing motion compensated predictions, the video decoder circuitry of the present invention may be characterized as a video decoder front end processor. In various embodiments, the video decoder front end processor includes a complete intra-coded decoder. In addition it may, and in various embodiments does, include a motion vector reconstruction circuit.
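The division of decoding operations between the video decoder front end processor and the programmable processor and/or graphics processor, as described above, may be summarized by the following Python sketch. It is offered only as an illustration of the partitioning. The operation names, the `partition` function, and the `front_end_has_idct` flag (which models the embodiments in which the front end includes a complete intra-coded decoder) are assumptions made for the example.

```python
# Non-memory intensive operations attributed to the front end in the text above.
FRONT_END_OPERATIONS = frozenset({
    "syntax_parsing",
    "variable_length_decoding",
    "dc_dct_coefficient_reconstruction",
    "motion_vector_reconstruction",
    "inverse_scan",
    "inverse_quantization",
    "mismatch_control",
})


def partition(pipeline, front_end_has_idct=False):
    """Split an ordered list of decoding operations into two groups.

    Returns (front_end, processor), where `processor` stands for the
    programmable processor and/or graphics processor.
    """
    front_end, processor = [], []
    for operation in pipeline:
        on_front_end = operation in FRONT_END_OPERATIONS or (
            operation == "idct" and front_end_has_idct
        )
        (front_end if on_front_end else processor).append(operation)
    return front_end, processor


pipeline = [
    "variable_length_decoding",
    "inverse_scan",
    "inverse_quantization",
    "idct",
    "motion_compensated_prediction",
]
print(partition(pipeline))
print(partition(pipeline, front_end_has_idct=True))
```

In both cases the motion compensated prediction operation remains outside the front end, consistent with the front end not performing memory intensive operations.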
The decoder circuitry of the present invention does not perform memory intensive motion compensated predictions. Accordingly, it can be implemented far more cheaply than video decoder circuits which perform such operations.
By combining the relatively low cost video decoder circuitry of the present invention with a general purpose processor, superior decoded video quality can be achieved and/or more images decoded in a given amount of time than when a similar general purpose processor is used to decode images without the benefit of the decoder circuitry of the present invention. Thus, the methods and apparatus of the present invention provide a way of enhancing a computer system's ability to decode video images at a fraction of the cost of adding a hardware decoder capable of fully decoding inter-coded images.
The video decoder circuitry of the present invention can be implemented on a separate chip from the computer system's CPU. Such a chip can be incorporated into a computer card which can be inserted into a computer to enhance the system's ability to perform video decoding. Alternatively, the video decoder circuitry of the present invention can be incorporated directly into a processor thereby providing the video decoder circuitry of the present invention and a programmable processor on a single semiconductor chip.
Various additional features and advantages of the present invention will be apparent from the detailed description which follows.