Digital signal compression using a coder/decoder (codec) allows streaming media, such as audio or video signals, to be transmitted over the Internet or stored on compact discs. A number of different codecs have been developed that follow various compression standards. MPEG-4 AVC (Advanced Video Coding), also known as H.264, is a video compression standard that offers significantly greater compression than its predecessors. The H.264 standard is expected to offer up to twice the compression of the earlier MPEG-2 standard. The H.264 standard is also expected to offer improvements in perceptual quality. As a result, more and more video content is being delivered in the form of AVC(H.264)-coded streams. Two rival DVD formats, the HD-DVD format and the Blu-ray Disc format, support H.264/AVC High Profile decoding as a mandatory player feature. AVC(H.264) coding is described in detail in “Draft of Version 4 of H.264/AVC (ITU-T Recommendation H.264 and ISO/IEC 14496-10 (MPEG-4 part 10) Advanced Video Coding)” by Gary Sullivan, Thomas Wiegand and Ajay Luthra, Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG (ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6), 14th Meeting: Hong Kong, CH 18-21 January, 2005, the entire contents of which are incorporated herein by reference for all purposes.
AVC(H.264), like many other codecs, uses a layer of encoding referred to as entropy encoding. Entropy encoding is a coding scheme that assigns codes to signals so as to match code lengths with the probabilities of the signals. Typically, entropy encoders are used to compress data by replacing symbols represented by equal-length codes with symbols represented by codes whose lengths are proportional to the negative logarithm of the probability, so that more probable symbols receive shorter codes. AVC(H.264) supports two entropy encoding schemes, Context Adaptive Variable Length Coding (CAVLC) and Context Adaptive Binary Arithmetic Coding (CABAC). Since CABAC tends to offer about 10% more compression than CAVLC, CABAC is favored by many video encoders in generating AVC(H.264) bitstreams. Decoding the entropy layer of AVC(H.264)-coded data streams can be computationally intensive and may present challenges for devices that decode AVC(H.264)-coded bitstreams using general purpose microprocessors. To decode high bit-rate streams targeted by the Blu-ray or the HD-DVD standards, the hardware needs to be very fast and complex, and the overall system cost could be very high. One common solution to this problem is to design special hardware for CABAC decoding. However, such special hardware can increase the cost of devices such as DVD players, game consoles, and the like that need to decode AVC(H.264)-encoded bitstreams.
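The relationship between symbol probability and code length described above can be illustrated with a minimal sketch. The four-symbol source and its probabilities are hypothetical, chosen only so that the ideal lengths come out as whole bits; the function names are likewise illustrative and are not part of any codec.

```python
import math


def ideal_code_lengths(probabilities):
    """Return the ideal code length in bits, -log2(p), for each symbol."""
    return {symbol: -math.log2(p) for symbol, p in probabilities.items()}


def expected_length(probabilities, lengths):
    """Return the average number of bits per symbol under the given lengths."""
    return sum(p * lengths[symbol] for symbol, p in probabilities.items())


# Hypothetical source: probabilities are assumptions made for illustration.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

lengths = ideal_code_lengths(probs)              # a: 1 bit, b: 2 bits, c and d: 3 bits
average_bits = expected_length(probs, lengths)   # 1.75 bits per symbol
fixed_bits = math.ceil(math.log2(len(probs)))    # 2 bits per symbol with equal-length codes
```

For this source, the variable-length assignment averages 1.75 bits per symbol against 2 bits for equal-length codes. Practical schemes such as CAVLC and CABAC additionally adapt their probability estimates to context, which this sketch does not attempt to show.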
The Cell is a general purpose microprocessor and media processor jointly developed by Sony, Toshiba and IBM. The basic configuration of a current generation of the Cell is composed of one “Power Processor Element” (“PPE”) and eight “Synergistic Processing Elements” (“SPE”). An SPE is a Reduced Instruction Set Computing (RISC) processor with 128-bit Single Instruction Multiple Data (SIMD) organization for single and double precision instructions. At 3.2 GHz, each SPE gives a theoretical 25.6 billion floating point operations per second (GFLOPS) of performance, which far exceeds the abilities of the SIMD units in typical desktop CPUs like the Pentium 4 and the Athlon 64. This computing power makes a Cell processor potentially capable of decoding AVC(H.264) high definition streams in real time on its own, without help from other hardware.
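The 25.6 GFLOPS figure can be reproduced with a short calculation. The sketch below assumes, beyond what is stated above, that the figure refers to single precision and that each SPE can issue one fused multiply-add per 32-bit lane per cycle, counted as two floating point operations.

```python
CLOCK_GHZ = 3.2              # SPE clock rate given in the text
SIMD_WIDTH_BITS = 128        # SIMD register width given in the text
FLOAT_BITS = 32              # single-precision operand width
OPS_PER_LANE_PER_CYCLE = 2   # assumption: one fused multiply-add = 2 operations
SPE_COUNT = 8                # SPEs in the basic configuration

lanes = SIMD_WIDTH_BITS // FLOAT_BITS                          # 4 lanes
peak_gflops_per_spe = CLOCK_GHZ * lanes * OPS_PER_LANE_PER_CYCLE
peak_gflops_all_spes = peak_gflops_per_spe * SPE_COUNT
```

Under these assumptions, one SPE peaks at 25.6 GFLOPS and eight SPEs together at 204.8 GFLOPS. These are theoretical limits that assume every lane does useful work on every cycle.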
The Cell's enormous computing power may be attributed to the SIMD structure in SPEs. However, the SIMD structure becomes effective only when the algorithm that utilizes the SPEs is parallelizable. Since the process of CABAC decoding is inherently sequential, with each decoded binary symbol depending on decoder state updated by the previous one, the speedup offered by SIMD has not heretofore been utilized to its fullest potential. While traditional performance bottlenecks like inverse discrete cosine transformation (IDCT) may be eliminated by the SIMD structure in SPEs, CABAC decoding presents a potential new bottleneck holding back the overall computational performance of AVC decoding using the Cell. If the task of CABAC decoding is not efficiently carried out, one Cell processor alone would not be able to decode high definition CABAC streams in real time.
It is within this context that embodiments of the present invention arise.