I. Field of the Invention
The invention is useful in MPEG-2 applications, especially PC HDTV receiver applications, in which a large number of bits must be parsed and decoded fast enough to receive an HDTV program on a PC in real time. The method of the invention provides an efficient way to parse and decode an MPEG-2 video stream that is continuously received on a PC.
II. Description of the Background Art
Digital TV is becoming increasingly important and popular because of its advantages, such as high picture quality, robustness to channel noise, multi-channel capability, interactivity, editing capabilities, and lower transmitting power for equal quality over the whole service area. Digital TV programs have been broadcast by experimental stations in some countries, such as the UK, Germany, the U.S., Japan and Singapore. In the near future, commercial and public digital TV stations will go on air in most countries. Households receive digital TV programs using an SDTV set, an HDTV set or a set-top box, all of which are very expensive for many families. A PC-DTV receiver is a cheaper way to watch DTV programs, and it offers better picture quality than a TV screen. Adding the ability to receive DTV programs on PCs also helps to stimulate the PC market.
The MPEG-2 video and system standards are the main components of DTV broadcasting and are used in all of the different DTV standards. The MPEG-2 transport stream standard is adopted for multiplexing, MPEG-2 video MP@ML is adopted for SDTV programs, and MPEG-2 video MP@HL is adopted for HDTV programs. AC-3, MPEG layer 2 audio, and AAC are the audio standards used in DTV broadcasting in the USA, Europe, and Japan, respectively. Real time video and audio decoders, as well as the transport stream demultiplexer, are the main development items. Among these items, real time video decoding is the most difficult part, especially for HDTV video.
The video decoding process involves the main steps of bit parsing, variable length decoding, inverse discrete cosine transform (IDCT), and motion compensation (MC). To achieve real time video decoding, every step should be considered for further optimization. Upgrading CPU power can speed up the whole process; however, it increases the price of the PC, and upgrading CPU power alone sometimes cannot achieve real time video decoding. Some graphics card manufacturers provide display cards that implement motion compensation to accelerate the decoding process, and some display cards already support both MC and IDCT. Hence, bit parsing and variable length decoding become very important for speeding up the video decoding process, especially for HDTV video because of its large number of bits.
The existing methods of bit parsing and variable length decoding are not fast enough to achieve real time video decoding with a given amount of CPU power, and there is considerable room for improvement. The faster the bit parsing and variable length decoding, the less CPU power is required, so that PC-DTV receiver functionality can be achieved at a lower price. Even in the future, as hardware prices go down, the improved method will take less CPU power so that more applications can run concurrently.
There are many ways to implement MPEG-2 variable length decoding, from bit parsing to Huffman variable length code decoding. Some decoders parse bits byte by byte, some parse bits 32 bits at a time using two 32-bit integers, and some use a field structure to get the value of certain bits from the decoding buffer. Some decode the macroblock address increment by checking for the macroblock escape code first; some decode the motion code by first decoding its absolute value and then the sign bit; some decode the DCT coefficients of macroblocks by first decoding the absolute value and then the sign bit. It should be noted that those methods are slow and not well suited to DTV receiver applications.
In conclusion, an efficient method is needed to improve the video decoding process for PC-DTV receiver applications, especially for HDTV applications.
On a PC configured with an Intel Pentium® 3 processor, high performance decoding requires, in addition to good algorithms, heavy use of 32-bit operations and MMX instructions. In the MPEG-2 standard, a certain number of bits represents a meaningful value, and the number of bits ranges from 1 to 32. For example, the start code prefix is a string of twenty-three bits with the value zero followed by a single bit with the value one. The start code prefix is thus the bit string ‘0000 0000 0000 0000 0000 0001’. The sequence start code is a string of 32 bits with the hexadecimal value 0x000001B3. The string ‘0000 1011’ in the variable length code for the motion code stands for the value −5.
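The start code examples above can be illustrated with a minimal C sketch. It assumes the stream is available as a byte buffer; the names `load_be32`, `find_start_code` and `SEQUENCE_HEADER_CODE` are hypothetical and chosen only for this illustration, and the sketch is not the method of the invention.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit sequence start code: 24-bit prefix 0x000001 followed by 0xB3. */
#define SEQUENCE_HEADER_CODE 0x000001B3u

/* Assemble four bytes into one 32-bit word, most significant byte first,
 * which is the bit order used by MPEG-2 streams. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Return the byte offset of the first start code (24-bit prefix plus one
 * code byte) at or after pos, or -1 if the buffer holds none. */
static long find_start_code(const uint8_t *buf, size_t len, size_t pos)
{
    assert(buf != NULL);
    for (; pos + 4 <= len; pos++) {
        /* Drop the code byte and compare the upper 24 bits with the prefix. */
        if ((load_be32(buf + pos) >> 8) == 0x000001u)
            return (long)pos;
    }
    return -1;
}
```

Once an offset is found, comparing `load_be32(buf + offset)` with `SEQUENCE_HEADER_CODE` identifies a sequence header with a single 32-bit comparison.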
To optimize the performance of MPEG-2 video variable length decoding, it is necessary to design methods that make heavy use of 32-bit operations and MMX instructions and that follow the optimization rules of the Intel Architecture. A thorough understanding of both the MPEG-2 video standard and the Intel Architecture is needed to design such high performance methods.
A decoder that parses bits byte by byte does not utilize 32-bit operations; a decoder that parses bits 32 bits at a time using two 32-bit integers does not utilize MMX instructions; a decoder that parses bits using a field structure to get the value of certain bits from the decoding buffer involves a lot of computation at the assembly level. Hence all three are slow.
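For illustration only, the following C sketch shows one way a bit reader can serve requests of 1 to 32 bits from a 64-bit cache that is refilled 32 bits at a time. It is not the method of the invention: the portable `uint64_t` cache merely stands in for a 64-bit MMX register, and the names `BitReader`, `br_init`, `br_peek` and `br_get` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint8_t *p;    /* next unread byte */
    const uint8_t *end;  /* one past the last byte */
    uint64_t cache;      /* unread bits, left-aligned, MSB first */
    unsigned count;      /* number of valid bits in cache */
} BitReader;

static void br_init(BitReader *br, const uint8_t *buf, size_t len)
{
    br->p = buf;
    br->end = buf + len;
    br->cache = 0;
    br->count = 0;
}

/* Fetch the next 32 bits of the stream, zero padded past the end.
 * The word is assembled from bytes here for portability; an optimized
 * build would typically use one 32-bit load plus a byte swap. */
static uint32_t next_word(BitReader *br)
{
    uint32_t w = 0;
    for (int i = 0; i < 4; i++) {
        w <<= 8;
        if (br->p < br->end)
            w |= *br->p++;
    }
    return w;
}

/* Return the next n bits (1..32) without consuming them. */
static uint32_t br_peek(BitReader *br, unsigned n)
{
    assert(n >= 1 && n <= 32);
    if (br->count < n) {
        /* count < n <= 32, so at least 32 bits of the cache are free. */
        br->cache |= (uint64_t)next_word(br) << (32 - br->count);
        br->count += 32;
    }
    return (uint32_t)(br->cache >> (64 - n));
}

/* Return the next n bits (1..32) and consume them. */
static uint32_t br_get(BitReader *br, unsigned n)
{
    uint32_t v = br_peek(br, n);
    br->cache <<= n;
    br->count -= n;
    return v;
}
```

With this layout, any field of up to 32 bits is extracted with one shift of the cache, and the buffer is touched only once per 32 consumed bits.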
Further, when decoding I-pictures, a decoder that decodes the macroblock address increment by first checking for the macroblock escape code is slower than one that first checks for the increment value one, since all macroblock address increments have the value one in an I-picture.
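The ordering of the checks can be sketched in C as follows. The sketch assumes the caller supplies the next 11 bits of the stream, MSB first; it covers only the code words for increments 1 to 7 and the escape code, as recalled from the macroblock address increment table of the MPEG-2 video standard, and they should be verified against the standard before reuse. The names `decode_mba_increment`, `MBA_ESCAPE` and `MBA_UNKNOWN` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBA_ESCAPE  (-1)  /* escape code: add 33 and decode another code */
#define MBA_UNKNOWN   0   /* code word not covered by this sketch */

/* Decode a macroblock address increment from the next 11 stream bits.
 * Stores the number of bits consumed in *len. */
static int decode_mba_increment(uint32_t next11, unsigned *len)
{
    assert(len != NULL && next11 < 0x800u);

    /* Fast path first: code word '1' means increment one. */
    if (next11 & 0x400u) { *len = 1; return 1; }

    /* Escape code '0000 0001 000' is checked only after the fast path. */
    if (next11 == 0x008u) { *len = 11; return MBA_ESCAPE; }

    switch (next11 >> 8) {            /* 3-bit codes '011', '010' */
    case 3: *len = 3; return 2;
    case 2: *len = 3; return 3;
    }
    switch (next11 >> 7) {            /* 4-bit codes '0011', '0010' */
    case 3: *len = 4; return 4;
    case 2: *len = 4; return 5;
    }
    switch (next11 >> 6) {            /* 5-bit codes '0001 1', '0001 0' */
    case 3: *len = 5; return 6;
    case 2: *len = 5; return 7;
    }
    *len = 0;                         /* longer codes are omitted here */
    return MBA_UNKNOWN;
}
```

Because the first test succeeds for every macroblock whose increment is one, the escape comparison and the remaining branches are never reached in that case.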
A decoder that decodes the motion code by first decoding its absolute value and then the sign bit is slower than one that decodes the signed value directly using properly designed variable length motion code tables. Likewise, a decoder that decodes the DCT coefficients in macroblocks by first decoding the absolute value and then the sign bit is slower than one that decodes the signed value directly using properly designed DCT coefficient variable length code tables.
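A signed lookup table of this kind can be sketched in C as follows. The sketch is only an illustration under stated assumptions, not the tables of the invention: it indexes a 256-entry table with the next 8 bits of the stream and covers only motion codes from −7 to 7, whose code words are at most 8 bits long and are recalled from the motion code table of the MPEG-2 video standard. The names `VlcEntry`, `motion_tab`, `add_code`, `init_motion_tab` and `decode_motion_code` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int8_t  value;  /* signed motion code */
    uint8_t len;    /* code length in bits; 0 = not covered by this sketch */
} VlcEntry;

static VlcEntry motion_tab[256];

/* Register one code word: every 8-bit pattern that starts with the
 * code word maps to the same signed value and length. */
static void add_code(unsigned code, unsigned len, int value)
{
    assert(len >= 1 && len <= 8);
    unsigned first = code << (8 - len);
    unsigned span = 1u << (8 - len);
    for (unsigned i = 0; i < span; i++) {
        motion_tab[first + i].value = (int8_t)value;
        motion_tab[first + i].len = (uint8_t)len;
    }
}

static void init_motion_tab(void)
{
    add_code(0x01, 1, 0);                           /* '1'         */
    add_code(0x02, 3, 1);  add_code(0x03, 3, -1);   /* '01s'       */
    add_code(0x02, 4, 2);  add_code(0x03, 4, -2);   /* '001s'      */
    add_code(0x02, 5, 3);  add_code(0x03, 5, -3);   /* '0001 s'    */
    add_code(0x06, 7, 4);  add_code(0x07, 7, -4);   /* '0000 11s'  */
    add_code(0x0A, 8, 5);  add_code(0x0B, 8, -5);   /* '0000 101s' */
    add_code(0x08, 8, 6);  add_code(0x09, 8, -6);   /* '0000 100s' */
    add_code(0x06, 8, 7);  add_code(0x07, 8, -7);   /* '0000 011s' */
}

/* One table access yields the signed value and the bits to consume. */
static VlcEntry decode_motion_code(uint32_t next8)
{
    return motion_tab[next8 & 0xFFu];
}
```

After `init_motion_tab()` has run, looking up the bit string ‘0000 1011’ returns the signed value −5 and a length of 8 in a single step, with no separate sign-bit test.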