H.264 is an advanced video coding (AVC) standard for video compression, and is currently one of the most commonly used formats for the recording, compression, and distribution of high definition video. Transcoding is the direct digital-to-digital data conversion of one encoding to another, such as for movie data files or audio files. This is typically performed in cases where a target device (or workflow) does not support the format or has limited storage capacity that mandates a reduced file size, or to convert incompatible or obsolete data to a better-supported or modern format. Transcoding may be performed at various times such as, for example, while files are being searched or when files are prepared for presentation.
H.264-based video transcoding with spatial resolution conversion may be performed by decoding a compressed bitstream, downsampling the decoded video sequence, and then encoding the down-scaled video data with an H.264 encoder. However, the encoding process associated with H.264-based video transcoding is highly complex. For example, a motion compensated prediction used during the encoding process may be time consuming and involve complex motion search and mode decision algorithms.
A coded video sequence in H-264/AVC consists of a sequence of coded pictures. Each coded picture may represent an entire frame or a single field. The coded picture is partitioned into fixed-size macroblocks that cover a rectangular picture area of samples. A macroblock typically consists of 16×16 or 8×8 samples, and is further subdivided into transform blocks, and may be further subdivided into prediction blocks.
In a typical encoding process, a rate distortion optimization technique may be used to determine encoding mode decision information, whereby for each macroblock, all supported encoding modes, (i.e., skip, inter 16×16, inter 16×8, inter 8×16, inter 8×8, inter 8×4, inter 4×8, inter 4×4, intra 16×16, intra 8×8 and intra 4×4), are considered to calculate the rate distortion cost and select the best mode which has the minimum rate distortion cost. However, this encoding process is significantly time consuming.