The H.264 specification, also known as the Advanced Video Coding (AVC) standard, is a high compression digital video codec standard produced by the Joint Video Team (JVT), and is identical to ISO MPEG-4 part 10. The H.264 standard is herein incorporated by reference in its entirety.
H.264 CODECs can encode video with approximately one third of the bits required by comparable MPEG-2 encoders. This significant increase in coding efficiency (e.g., good video quality at bit rates below 2 Mbps) means that more high-quality video data can be sent over the available channel bandwidth. In addition, video services can now be offered in environments where they previously were not possible. H.264 CODECs would be particularly useful, for instance, in high definition television (HDTV) applications, bandwidth limited networks (e.g., streaming mobile television), personal video recorder (PVR) and storage applications for home use, and other such video delivery applications (e.g., digital terrestrial TV, cable TV, satellite TV, video over xDSL, DVD, and digital and wireless cinema).
In general, all standard video processing (e.g., MPEG-2 or H.264) encodes video as a series of pictures. For video in the interlaced format, the two fields of a frame can be encoded together as a frame picture, or encoded separately as two field pictures. Note that both types of encoding can be used in a single interlaced sequence. The output of the decoding process for an interlaced sequence is a series of reconstructed fields. For video in the progressive format, all encoded pictures are frame pictures. Here, the output of the decoding process is a series of reconstructed frames.
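The relationship between a frame and its two fields can be illustrated with a minimal sketch. It assumes a frame is held as a list of pixel rows, with the top field taken from the even-numbered rows and the bottom field from the odd-numbered rows; the function names are hypothetical and not part of any standard.

```python
def split_fields(frame):
    """Split a frame (a list of pixel rows) into its top and bottom fields."""
    top = frame[0::2]     # rows 0, 2, 4, ... (assumed to form the top field)
    bottom = frame[1::2]  # rows 1, 3, 5, ... (assumed to form the bottom field)
    return top, bottom


def weave_fields(top, bottom):
    """Interleave two fields back into a single frame."""
    frame = []
    for top_row, bottom_row in zip(top, bottom):
        frame.append(top_row)
        frame.append(bottom_row)
    return frame


# A tiny 4x4 "frame" whose pixel values encode their row and column.
frame = [[row * 10 + col for col in range(4)] for row in range(4)]
top, bottom = split_fields(frame)
```

Encoding `frame` directly corresponds to a frame picture, while encoding `top` and `bottom` separately corresponds to two field pictures.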
Encoded pictures are classified into three types: I, P, and B. I-type pictures represent intra coded pictures, and are used as a prediction starting point (e.g., after error recovery or a channel change). Here, all macro blocks are coded without prediction. P-type pictures represent predicted pictures. Here, macro blocks can be coded with forward prediction with reference to previous I-type and P-type pictures, or they can be intra coded (no prediction). B-type pictures represent bi-directionally predicted pictures. Here, macro blocks can be coded with forward prediction (with reference to previous I-type and P-type pictures), or with backward prediction (with reference to next I-type and P-type pictures), or with interpolated prediction (with reference to previous and next I-type and P-type pictures), or intra coded (no prediction). Note that in P-type and B-type pictures, macro blocks may be skipped and not sent at all. In such cases, the decoder uses the anchor reference pictures for prediction with no error.
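The coding options described above can be summarized in a short sketch. The table and function names are hypothetical, a block is simplified to a flat list of pixel values, and the rounded average used for interpolated prediction is an illustrative assumption rather than a statement of the standard's exact arithmetic.

```python
# Macro block coding options available in each picture type (per the text above).
ALLOWED_MODES = {
    "I": {"intra"},
    "P": {"intra", "forward", "skip"},
    "B": {"intra", "forward", "backward", "interpolated", "skip"},
}


def predict(picture_type, mode, prev_block=None, next_block=None):
    """Form a prediction for one block from its reference blocks."""
    if mode not in ALLOWED_MODES[picture_type]:
        raise ValueError(f"mode {mode!r} not allowed in {picture_type}-type pictures")
    if mode == "forward":
        return list(prev_block)
    if mode == "backward":
        return list(next_block)
    if mode == "interpolated":
        # Assumed here: rounded average of the previous and next references.
        return [(a + b + 1) // 2 for a, b in zip(prev_block, next_block)]
    raise ValueError(f"mode {mode!r} uses no inter-picture prediction in this sketch")


prediction = predict("B", "interpolated", prev_block=[10, 20], next_block=[20, 31])
```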
The advanced coding techniques of the H.264 specification operate within a scheme similar to that used by previous MPEG standards. The higher coding efficiency and video quality are enabled by a number of features, including improved motion estimation and inter prediction, spatial intra prediction and transform, and context-adaptive binary arithmetic coding (CABAC) and context-adaptive variable length coding (CAVLC) algorithms.
As is known, motion estimation is used to support inter picture prediction for eliminating temporal redundancies. Spatial correlation of data is used to provide intra picture prediction (prior to the transform). Residuals are constructed as the difference between predicted images and the source images. Discrete spatial transform and filtering are used to eliminate spatial redundancies in the residuals. H.264 also supports entropy coding of the transformed residual coefficients and of the supporting data such as motion vectors.
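The residual construction described above can be sketched as follows. This is a minimal illustration that omits the transform, quantization, and filtering stages; blocks are assumed to be lists of pixel rows, and the function names are hypothetical.

```python
def residual(source, predicted):
    """Residual block: element-wise difference between source and prediction."""
    return [
        [s - p for s, p in zip(source_row, predicted_row)]
        for source_row, predicted_row in zip(source, predicted)
    ]


def reconstruct(predicted, res):
    """Decoder side: add the residual back onto the same prediction."""
    return [
        [p + r for p, r in zip(predicted_row, res_row)]
        for predicted_row, res_row in zip(predicted, res)
    ]


source = [[52, 55], [61, 59]]
predicted = [[50, 54], [60, 60]]
res = residual(source, predicted)
```

When the prediction is good, the residual values cluster near zero, which is what makes the subsequent transform and entropy coding stages effective.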
Entropy is a measure of the average information content per source output unit, and is typically expressed in bits/pixel. Entropy is maximized when all possible values of the source output unit are equally likely (e.g., an image of 8-bit pixels with an average information content of 8 bits/pixel). Coding the source output unit with fewer bits than the entropy, on average, generally results in information loss. Note, however, that the entropy can be reduced (e.g., when some values become much more likely than others) so that the image can be coded with fewer than 8 bits/pixel on average without information loss.
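As an illustration of the entropy measure described above, the following sketch computes the first-order (Shannon) entropy of a list of pixel values from their relative frequencies. The function name and the sample data are hypothetical.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(symbols):
    """First-order entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Every 8-bit value occurs equally often: entropy reaches its maximum.
uniform_pixels = list(range(256))

# One value dominates: entropy drops well below 8 bits/pixel.
skewed_pixels = [0] * 7 + [255]

uniform_entropy = entropy_bits_per_symbol(uniform_pixels)
skewed_entropy = entropy_bits_per_symbol(skewed_pixels)
```

The uniform case evaluates to 8 bits/pixel, while the skewed case evaluates to less than 1 bit/pixel, which is why a source with unevenly distributed values can be coded losslessly with fewer bits on average.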
The H.264 specification provides two alternative processes of entropy coding: CABAC and CAVLC. CABAC provides a highly efficient encoding scheme when it is known that certain symbols are much more likely than others. Such dominant symbols may be encoded with extremely small bit/symbol ratios. CABAC continually updates frequency statistics of the incoming data, and adaptively adjusts the coding algorithm in real-time. CAVLC uses multiple variable length codeword tables to encode transform coefficients. The best codeword table is selected adaptively based on a priori statistics of already processed data. A single table is used for non-coefficient data.
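The benefit of adaptively updated frequency statistics can be illustrated with a simplified sketch. It is not the CABAC algorithm itself: it assumes a plain frequency-count model with initial counts of one per symbol, and reports the ideal code length that an arithmetic coder driven by that model would approach. The function name is hypothetical.

```python
import math


def adaptive_cost_bits(bits):
    """Ideal code length, in bits, of a binary sequence under an adaptive model."""
    counts = [1, 1]  # assumed starting counts for symbols 0 and 1
    total_bits = 0.0
    for b in bits:
        probability = counts[b] / (counts[0] + counts[1])
        total_bits += -math.log2(probability)  # cost of coding this symbol
        counts[b] += 1                         # update statistics after coding
    return total_bits


# A sequence dominated by one symbol, as in the scenario described above.
skewed_sequence = [0] * 95 + [1] * 5
cost = adaptive_cost_bits(skewed_sequence)
```

For the skewed sequence, the total cost comes out well under one bit per symbol, because the model quickly learns that the dominant symbol is highly probable.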
The H.264 specification provides for seven profiles, each targeted to particular applications: a Baseline Profile, a Main Profile, an Extended Profile, and four High Profiles. The Baseline Profile supports progressive video, uses I and P slices and CAVLC for entropy coding, and is targeted towards real-time encoding and decoding for consumer electronics (CE) devices. The Main Profile supports both interlaced and progressive video with macro block or picture level field/frame mode selection, and uses I, P, and B slices, weighted prediction, as well as both CABAC and CAVLC for entropy coding. The Extended Profile supports both interlaced and progressive video, uses CAVLC, and uses I, P, B, SP, and SI slices.
The High Profile extends functionality of the Main Profile for effective coding. The High Profile uses adaptive 8×8 or 4×4 transform, and enables perceptual quantization matrices. The High 10 Profile is an extension of the High Profile for 10-bit component resolution. The High 4:2:2 Profile supports 4:2:2 chroma format and up to 10-bit component resolution (e.g., for video production and editing). The High 4:4:4 Profile supports 4:4:4 chroma format and up to 12-bit component resolution. It also enables lossless mode of operation and direct coding of the RGB signal (e.g., for professional production and graphics).
Given that the H.264 standard is relatively new, there is currently a limited selection of available H.264 coding architectures. What is needed, therefore, are coding architectures that are H.264 enabled.