A significant amount of video content is currently available in the MPEG-2 format. Furthermore, a large number of cable and satellite set-top boxes that support only the MPEG-2 format are currently deployed. Therefore, compatibility with the MPEG-2 standard will remain important for years to come.
The H.264/MPEG-4 AVC digital video standard (H.264 for short) is an emerging format for consumer video, particularly in new broadcast and High-Definition (HD) Digital Versatile Disc (DVD) applications. As H.264-based content and products become available, transcoding in both directions between the H.264 standard and the MPEG-2 standard will become a widely used capability. Anticipated consumer applications include reception of MPEG-2 broadcasts by a personal video recorder (PVR) and transcoding to H.264 to save disk storage space. Professional applications are also widely anticipated. One such application is MPEG-2 to H.264 transcoding at a headend facility, where content received in the MPEG-2 format is converted into the H.264 format for distribution at a lower bandwidth. In another example, MPEG-2 to H.264 transcoding could be used to save bandwidth on expensive transmission media such as satellite links. Furthermore, the consumer market is a large market with strict complexity and cost constraints that will benefit substantially from an efficient and effective transcoding technology.
Conventional transcoding solutions use some or all of the following techniques. Basic transcoding is achieved by decoding in one format and then re-encoding in another. Information from the bitstream being decoded is reused to seed the encoding in the other format. Picture-type decisions are reused so that the Group of Pictures (GOP) structure of the transcoded bitstream is the same as that of the original bitstream. A look-ahead in the compressed original bitstream is used for rate control of the bitstream being encoded. An MPEG-2 bitstream is decoded in its native macroblock order (i.e., simple raster scan) and encoded into an H.264 bitstream in either a simple raster scan order or a macroblock-pair raster scan order. Mode decisions of individual macroblocks are reused in determining the mode of the corresponding macroblocks in the transcoded bitstream. Furthermore, motion compensation partitioning decisions are reused so that only the subset of partition sizes that is available in MPEG-2 is used in the H.264 bitstream.
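The reuse of picture-type, mode, and partition decisions described above can be illustrated with a minimal sketch. The names Mpeg2Macroblock, reuse_picture_types, and reuse_macroblock_decision are hypothetical and chosen only for illustration; they do not correspond to any particular decoder or encoder interface, and the dictionary returned for each macroblock is an assumed stand-in for whatever structure an encoder would actually consume.

```python
from dataclasses import dataclass

# Partition sizes that MPEG-2 can signal, per the description above.
# H.264 offers more, but a decision-reusing transcoder never selects them.
MPEG2_PARTITIONS = {"16x16", "16x8"}


@dataclass
class Mpeg2Macroblock:
    """Hypothetical summary of one decoded MPEG-2 macroblock."""
    intra: bool
    partition: str  # one of MPEG2_PARTITIONS


def reuse_picture_types(mpeg2_types):
    """Copy the I/P/B pattern so the output GOP structure matches the input."""
    return list(mpeg2_types)


def reuse_macroblock_decision(mb):
    """Carry the decoded mode and partition over as the encoder's choice."""
    if mb.partition not in MPEG2_PARTITIONS:
        raise ValueError(f"not an MPEG-2 partition: {mb.partition}")
    return {
        "mode": "intra" if mb.intra else "inter",
        # Intra macroblocks carry no motion compensation partition.
        "partition": None if mb.intra else mb.partition,
    }
```

Because the encoder only ever receives partitions from MPEG2_PARTITIONS, the remaining H.264 partition sizes are never evaluated in this scheme.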
The conventional solutions are inefficient for transcoding between video standards in cases where one format consistently uses either field coding or frame coding within an independently decodable sequence of pictures (e.g., an MPEG-2 GOP) and the other format may switch on a picture basis between field coding, frame coding, and macroblock-adaptive frame/field (MBAFF) coding (e.g., H.264). Inefficiencies are also experienced where one format has multiple ways to partition a macroblock for motion compensation (e.g., the H.264 16×16, 8×16, 16×8, 8×8, etc. partitions) and the other format has only a few options (e.g., the MPEG-2 16×16 or 16×8 field partitions). Cases where one format supports quarter-pixel accurate motion compensation and the other format supports only half-pixel accurate motion compensation can likewise result in inefficient transcoding. In addition, efficiency suffers where one format makes a field/frame decision independently for each macroblock (e.g., MPEG-2) and the other format makes a field/frame decision for each macroblock pair (e.g., H.264).
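The mismatch between per-macroblock and per-pair field/frame decisions can be made concrete with a short sketch. It assumes the decoded field/frame flags are available as a list of booleans in raster order (True meaning field coded) and that the picture has an even number of macroblock rows. The function names are hypothetical, and resolve_pair shows just one arbitrary policy for collapsing two decisions into one, not a recommended or standard rule.

```python
def group_into_pairs(field_flags, width_in_mbs):
    """Group per-macroblock field flags (raster order) into vertical pairs."""
    rows, remainder = divmod(len(field_flags), width_in_mbs)
    if remainder or rows % 2:
        raise ValueError("expected a whole, even number of macroblock rows")
    pairs = []
    for top_row in range(0, rows, 2):
        for col in range(width_in_mbs):
            top = field_flags[top_row * width_in_mbs + col]
            bottom = field_flags[(top_row + 1) * width_in_mbs + col]
            pairs.append((top, bottom))
    return pairs


def count_conflicts(pairs):
    """Count pairs whose two macroblocks disagree on field versus frame."""
    return sum(1 for top, bottom in pairs if top != bottom)


def resolve_pair(top, bottom):
    """Hypothetical policy: code the pair as field if either macroblock was."""
    return top or bottom
```

For a picture two macroblocks wide and two high with flags [True, False, True, True], the right-hand pair is (False, True), so one of the two reused decisions cannot be honored as decoded.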
Further, conventional solutions are not optimal for implementation on hardware architectures that contain independent dedicated hardware units for parallel motion estimation, MPEG-2 decoding, and H.264 encoding. The conventional solutions result in non-optimal coding performance because they implement complexity reduction methods that are unnecessary for such hardware architectures. Such methods include maintaining the predictions as in the decoded bitstream and re-coding only the transform-domain residual. In addition, the half-pixel accurate MPEG-2 motion vectors are reused without being refined to the quarter-pixel accuracy available for H.264 motion vectors. Furthermore, the motion compensation partition of the decoded bitstream is reused instead of performing motion estimation refinement for all possible motion compensation partitions available in H.264. As such, the conventional methods result in sub-optimal coding efficiency in the H.264 output stream because the more powerful prediction capabilities of the H.264 standard over the MPEG-2 standard are not being utilized.
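The difference between reusing a half-pixel motion vector and refining it to quarter-pixel accuracy can be sketched as follows. The sketch assumes motion vectors are (x, y) integer tuples, with MPEG-2 vectors in half-pixel units and H.264 vectors in quarter-pixel units. The cost argument is a hypothetical caller-supplied function (for example, a sum of absolute differences on interpolated samples) and is not part of either standard; the small exhaustive search is one simple refinement strategy among many.

```python
def to_quarter_pel(mv_half_pel):
    """Express a half-pixel vector in quarter-pixel units (same displacement)."""
    return (mv_half_pel[0] * 2, mv_half_pel[1] * 2)


def refine_quarter_pel(seed_qpel, cost, radius=1):
    """Search quarter-pixel positions around the seed and keep the cheapest.

    cost is an assumed callable mapping a quarter-pixel vector to a number;
    lower is better. The seed itself is always a candidate, so the result
    is never worse than plain reuse under that cost.
    """
    best, best_cost = seed_qpel, cost(seed_qpel)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            candidate = (seed_qpel[0] + dx, seed_qpel[1] + dy)
            candidate_cost = cost(candidate)
            if candidate_cost < best_cost:
                best, best_cost = candidate, candidate_cost
    return best, best_cost
```

A conventional transcoder in the sense described above would stop after to_quarter_pel, so every output vector would land on an even (half-pixel) position and the odd quarter-pixel positions would never be tested.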