For the same video quality, the H.264 video coding standard can compress video files to half the size achieved by previous video coding standards such as MPEG4. The degree of compression depends on how close the predicted data is to the original data to be coded. However, predictive coding carries an overhead, because the reference data needed later to generate the predicted data must be included.
Both the Inter-frame and Intra-frame compression techniques used in H.264 are based on predictive coding. H.264 is therefore more complex than video coding standards such as MPEG2, MPEG4, and VC-1. Obtaining even low-resolution video frames or video thumbnails from H.264 video files can be very complex, at least when using conventional methods. Nevertheless, being able to decode lower resolution frames from high-resolution compressed video frames is desirable for many reasons.
Conventional ways to decode lower resolution frames from high-resolution compressed video frames include full frame decoding and downscaling, partial frame decoding, and decoding from a hierarchically coded bitstream. In full frame decoding and downscaling, a full-resolution image is decoded and the decoded image is then scaled down to the desired lower resolution. Such scaling usually includes anti-aliasing filtering or averaging, followed by downsampling.
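The averaging and downsampling step can be sketched as follows. This is only a minimal illustration, assuming a box (mean) filter as the anti-aliasing filter and a frame held as a list of rows of integer pixel values; the function name downscale_box is hypothetical and does not come from any codec library.

```python
def downscale_box(frame, factor):
    """Average each factor x factor tile of an already decoded frame.

    Assumption: a plain box filter stands in for the anti-aliasing filter.
    Rows and columns that do not fill a whole tile are dropped.
    """
    height, width = len(frame), len(frame[0])
    area = factor * factor
    out = []
    for y in range(0, height - height % factor, factor):
        row = []
        for x in range(0, width - width % factor, factor):
            tile_sum = sum(frame[y + dy][x + dx]
                           for dy in range(factor)
                           for dx in range(factor))
            # Rounded integer mean of the tile.
            row.append((tile_sum + area // 2) // area)
        out.append(row)
    return out


frame = [[10, 20, 30, 40],
         [10, 20, 30, 40],
         [50, 60, 70, 80],
         [50, 60, 70, 80]]
thumbnail = downscale_box(frame, 2)
print(thumbnail)  # [[15, 35], [55, 75]]
```

The cost of this approach is that every pixel of the full-resolution frame must be decoded before any of it is discarded.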
Partial frame decoding exploits the fact that the data in many bitstreams is available in the transform domain, e.g., in JPEG and in the Intra frames of video coding standards such as WMV7, WMV8, WMV9, MPEG1, MPEG2, MPEG4, H.261, and H.263. It is therefore possible to decode low-resolution frames by decoding only a few low-frequency coefficients. MPEG4 uses AC and DC prediction in the transform domain, so the AC and DC prediction is done prior to the decoding of a low-resolution frame. Instead of taking an 8×8 inverse transform, a 1×1, 2×2, or 4×4 inverse transform is taken of the 1×1, 2×2, or 4×4 sub-block of low-frequency coefficients located in a larger block, such as an 8×8 block.
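The reduced-size inverse transform can be illustrated with a small sketch. It assumes a floating-point orthonormal DCT-II, which is not the exact integer transform or dequantization of any particular standard, and the helper names (dct_matrix, forward_dct, reduced_inverse_dct) are hypothetical. Only the top-left k×k coefficients of an 8×8 block are kept, rescaled by k/8, and inverse transformed with a k×k transform.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_dct(block):
    """2-D DCT of a square block: C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def reduced_inverse_dct(coeffs, k):
    """Inverse transform only the top-left k x k low-frequency coefficients."""
    n = len(coeffs)
    # Rescale by k/n so the output keeps the brightness of the full block.
    low = [[coeffs[u][v] * k / n for v in range(k)] for u in range(k)]
    c = dct_matrix(k)
    return matmul(matmul(transpose(c), low), c)


# An 8x8 block with a smooth gradient; its mean value is 31.5.
block = [[float(8 * y + x) for x in range(8)] for y in range(8)]
coeffs = forward_dct(block)
dc_only = reduced_inverse_dct(coeffs, 1)     # one pixel per 8x8 block
two_by_two = reduced_inverse_dct(coeffs, 2)  # four pixels per 8x8 block
```

With k = 1 the result is simply the block mean, which is why a DC-only decode yields a thumbnail at one eighth of the resolution in each dimension.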
In hierarchically coded bitstreams, the video is encoded as a low-resolution bitstream together with a corresponding enhancement layer bitstream. Only the low-resolution bitstream needs to be decoded to get low-resolution images or video. Getting the high-resolution image or video frames requires decoding both the low-resolution bitstream and the enhancement layer bitstream.
H.264 encodes Intra information differently than do previous video coding standards such as MPEG1, MPEG2, MPEG4, H.263, WMV7, and WMV8. A prediction for a current block is generated from reference pixels that lie above and to the left of the current block. These reference pixels have already been encoded and decoded, and are available for generating the prediction for the current block. The prediction generated is then subtracted from the current block to obtain a residual error, e.g., Residual Block = Current Block − Prediction Block. The residual block is transformed and quantized, and the run length symbols generated are entropy coded. The coded residual block and the coded prediction mode are then formatted into a video bitstream.
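The relation Residual Block = Current Block − Prediction Block, and its inverse at the decoder, can be sketched as below. The sketch deliberately omits the transform, quantization, and entropy coding stages, so the reconstruction here is exact; the function names are hypothetical.

```python
def residual_block(current, prediction):
    """Encoder side: Residual Block = Current Block - Prediction Block."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]


def reconstruct_block(prediction, residual):
    """Decoder side: Current Block = Prediction Block + Residual Block.

    Assumption: the residual is lossless. With quantization in the loop the
    result would only approximate the original block.
    """
    return [[p + r for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(prediction, residual)]


current = [[52, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]
residual = residual_block(current, prediction)
print(residual)  # [[2, 5], [1, -1]]
```

The closer the prediction is to the current block, the smaller the residual values, and the fewer bits they need after transform and entropy coding.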
H.264 uses various block sizes and various prediction modes for coding. H.264 currently uses 16×16, 8×8 and 4×4 block sizes to code the data according to the Intra compression method.
In the coding of the luminance 16×16 Intra prediction mode according to H.264, the data for the current Intra Luminance 16×16 Block is predicted in four ways:
Intra 16×16 luminance Mode 0—Prediction in the Vertical direction;
Intra 16×16 luminance Mode 1—Prediction in the Horizontal direction;
Intra 16×16 luminance Mode 2—DC Prediction; and
Intra 16×16 luminance Mode 3—Plane Prediction.
Reference Pixels at the top and left side are used to code a 16×16 block.
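Three of the four modes listed above can be sketched in a few lines. The sketch assumes that both the top and the left reference pixels are available, and it leaves out Plane Prediction, which needs a longer gradient calculation. The function names are hypothetical; the DC rounding follows the usual (sum + 16) >> 5 form for a 16×16 block.

```python
def predict_vertical(top, left):
    """Mode 0: every row repeats the reference row above the block."""
    return [list(top) for _ in range(len(top))]


def predict_horizontal(top, left):
    """Mode 1: every row is filled with the reference pixel to its left."""
    n = len(left)
    return [[left[y]] * n for y in range(n)]


def predict_dc(top, left):
    """Mode 2: the whole block is filled with the rounded mean of the references.

    Assumption: both neighbours exist. For n = 16 this equals (sum + 16) >> 5.
    """
    n = len(top)
    dc = (sum(top) + sum(left) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]


top = list(range(16))   # 16 reference pixels above the block
left = [100] * 16       # 16 reference pixels to the left of the block
vertical = predict_vertical(top, left)
horizontal = predict_horizontal(top, left)
dc = predict_dc(top, left)
print(dc[0][0])  # 54
```

Because each prediction depends on reconstructed neighbouring pixels, a block cannot be predicted until the blocks above it and to its left have been decoded.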
In the coding of Chrominance 8×8 Intra Prediction Mode according to H.264, the data for the current Intra Chrominance 8×8 Block is predicted in four ways:
Intra 8×8 chrominance Mode 0—DC Prediction Mode;
Intra 8×8 chrominance Mode 1—Horizontal Prediction Mode;
Intra 8×8 chrominance Mode 2—Vertical Prediction Mode; and
Intra 8×8 chrominance Mode 3—Plane Prediction.
For the encoding of Luminance Intra 4×4 blocks, the prediction is generated from the pixels (I to L, M, and A to H) that lie immediately to the left of and above the current block.
TABLE I
M  A  B  C  D  E  F  G  H
I  a  b  c  d
J  e  f  g  h
K  i  j  k  l
L  m  n  o  p
In Table-I, the sixteen pixels labeled “a” to “p” represent a current 4×4 block to be coded. Pixels I-L, M, and A-H are neighboring reference pixels immediately to the left and above that are used in nine different ways to generate a prediction for the current block along the vertical direction, the horizontal direction, DC, the diagonal down left direction, diagonal down right direction, the vertical right direction, the horizontal down direction, the vertical left direction, and the horizontal up direction.
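As one example of the nine modes, the diagonal down left prediction can be sketched as follows. It uses only the eight reference pixels A to H above the block, which shows why the pixels E to H to the upper right are needed. The function name is hypothetical, and the sketch assumes all eight reference pixels are available.

```python
def predict_4x4_diag_down_left(above):
    """Diagonal down left prediction for a 4x4 block.

    above = [A, B, C, D, E, F, G, H], the reference row above the block.
    Each predicted pixel is a rounded 1-2-1 weighted average of three
    neighbouring reference pixels along the diagonal.
    """
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            i = x + y
            if i == 6:
                # Bottom-right pixel "p": no pixel exists to the right of H.
                pred[y][x] = (above[6] + 3 * above[7] + 2) >> 2
            else:
                pred[y][x] = (above[i] + 2 * above[i + 1]
                              + above[i + 2] + 2) >> 2
    return pred


above = [10, 20, 30, 40, 50, 60, 70, 80]  # A..H
pred = predict_4x4_diag_down_left(above)
print(pred[0])  # [20, 30, 40, 50]
```

Pixels on the same anti-diagonal (constant x + y) share one predicted value, so the pattern above the block is propagated toward the lower left.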
H.264 uses predictive coding to code the Intra prediction mode of the current Intra block. It uses a flag to indicate whether the predicted mode is to be used or not. If a predicted mode is not used, it sends three extra bits to specify the current prediction mode.
TABLE II
In an example represented in Table-II, a block C is a current block to be coded given neighboring blocks A and B. A prediction, “predIntraCxCPredMode”, for the Intra prediction mode of the current Intra block is generated in the following way:
predIntraCxCPredMode = min(intraMxMPredModeA, intraMxMPredModeB),
where A and B can be of the same block size as C, or A and B can have a block size larger than C. For example, A can be of size 4x4 and B can be of size 8x8.
If (predIntraCxCPredMode == Intra Prediction Mode of current Block)
    Use_Pred_Mode_Flag = 1
Else
    Use_Pred_Mode_Flag = 0
When Use_Pred_Mode_Flag is zero, three bits follow it to specify one of eight remaining prediction modes:
If (CurrentIntraMode < predIntraCxCPredMode)
    RemIntraMode = CurrentIntraMode
Else
    RemIntraMode = CurrentIntraMode − 1
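The mode signalling rule above, together with the matching decoder step, can be sketched as follows. The function names are hypothetical, and the sketch assumes that both neighboring blocks A and B are available and Intra coded, with modes numbered 0 to 8.

```python
def encode_intra_mode(current_mode, mode_a, mode_b):
    """Return (Use_Pred_Mode_Flag, RemIntraMode) for the current block.

    RemIntraMode is None when the flag is 1, because no extra bits are sent.
    """
    pred_mode = min(mode_a, mode_b)
    if current_mode == pred_mode:
        return 1, None
    # The predicted mode is skipped, so the remaining eight modes fit in 3 bits.
    if current_mode < pred_mode:
        return 0, current_mode
    return 0, current_mode - 1


def decode_intra_mode(use_pred_mode_flag, rem_intra_mode, mode_a, mode_b):
    """Invert encode_intra_mode using the same neighbouring modes."""
    pred_mode = min(mode_a, mode_b)
    if use_pred_mode_flag:
        return pred_mode
    if rem_intra_mode < pred_mode:
        return rem_intra_mode
    return rem_intra_mode + 1


flag, rem = encode_intra_mode(current_mode=5, mode_a=2, mode_b=4)
print(flag, rem)                             # 0 4
print(decode_intra_mode(flag, rem, 2, 4))    # 5
```

When the current mode matches the prediction, a single flag bit is enough; otherwise the cost is the flag plus three bits, so good mode prediction directly reduces the side information in the bitstream.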
A typical Set Top Box is an application in which multiple channels are available for decode and display. What is needed is a quick, low-power method that allows a user to see snapshots of multiple video bitstreams so that the user can choose which video to play. However, conventional decoding and display of multiple H.264 video streams would ordinarily be a time-consuming and power-consuming task.