The present invention relates to compression of multimedia data and, in particular, to a video transcoder that allows a generic MPEG-4 decoder to decode MPEG-2 bitstreams. Temporal and spatial size conversion (downscaling) are also provided.
The following acronyms and terms are used:
CBP—Coded Block Pattern
DCT—Discrete Cosine Transform
DTV—Digital Television
DVD—Digital Video Disc
HDTV—High Definition Television
FLC—Fixed Length Coding
IP—Internet Protocol
MB—Macroblock
ME—Motion Estimation
ML—Main Level
MP—Main Profile
MPS—MPEG-2 Program Stream
MTS—MPEG-2 Transport Stream
MV—Motion Vector
QP—Quantization Parameter
PMV—Prediction Motion Vector
RTP—Real-Time Transport Protocol (RFC 1889)
SDTV—Standard Definition Television
SIF—Standard Intermediate Format
SVCD—Super Video Compact Disc
VLC—Variable Length Coding
VLD—Variable Length Decoding
VOP—Video Object Plane
MPEG-4, the multimedia coding standard, provides rich functionality to support various applications, including Internet applications such as streaming media, advertising, interactive gaming, virtual traveling, etc. Streaming video over the Internet (multicast), which is expected to be among the most popular applications for the Internet, is also well-suited for use with the MPEG-4 visual standard (ISO/IEC 14496-2 Final Draft of International Standard (MPEG-4), “Information Technology—Generic coding of audio-visual objects, Part 2: visual,” December 1998).
MPEG-4 visual handles both synthetic and natural video, and accommodates several visual object types, such as video, face, and mesh objects. MPEG-4 visual also allows coding of an arbitrarily shaped object so that multiple objects can be shown or manipulated in a scene as desired by a user. Moreover, MPEG-4 visual is very flexible in terms of coding and display configurations, including enhanced features such as multiple auxiliary (alpha) planes, variable frame rate, and geometrical transformations (sprites).
However, the majority of the video material (e.g., movies, sporting events, concerts, and the like) which is expected to be the target of streaming video is already compressed by the MPEG-2 system and stored on storage media such as DVDs, computer memories (e.g., server hard disks), and the like. The MPEG-2 System specification (ISO/IEC 13818-2 International Standard (MPEG-2), “Information Technology—Generic coding of Moving Pictures and Associated Audio: Part 2—Video,” 1995) defines two system stream formats: the MPEG-2 Transport Stream (MTS) and the MPEG-2 Program Stream (MPS). The MTS is tailored for communicating or storing one or more programs of MPEG-2 compressed data and also other data in relatively error-prone environments. One typical application of MTS is DTV. The MPS is tailored for relatively error-free environments. Popular applications of MPS include DVD and SVCD.
Attempts to address this issue have been unsatisfactory to date. For example, the MPEG-4 studio profile (O. Sunohara and Y. Yagasaki, “The draft of MPEG-4 Studio Profile Amendment Working Draft 2.0,” ISO/IEC JTC1/SC29/WG11 MPEG99/5135, October 1999) has proposed an MPEG-2 to MPEG-4 transcoder, but that process is not applicable to the other MPEG-4 version 1 profiles, which include the Natural Visual profiles (Simple, Simple Scaleable, Core, Main, N-Bit), Synthetic Visual profiles (Scaleable Texture, Simple Face Animation), and Synthetic/Natural Hybrid Visual (Hybrid, Basic Animated Texture). The studio profile is not applicable to the Main Profile of MPEG-4 version 1 since it modifies the syntax, and the decoder process is incompatible with the rest of the MPEG-4 version 1 profiles.
The MPEG standards designate several sets of constrained parameters using a two-dimensional ranking order. One of the dimensions, called the “profile” series, specifies the coding features supported. The other dimension, called “level”, specifies the picture resolutions, bit rates, and so forth, that can be accommodated.
For MPEG-2, the Main Profile at Main Level, or MP@ML, supports a 4:2:0 color subsampling ratio, and I, P and B pictures. The Simple Profile is similar to the Main Profile but has no B-pictures. The Main Level is defined for ITU-R 601 video, while the Simple Level is defined for SIF video.
Similarly, for MPEG-4, the Simple Profile contains SIF progressive video (and has no B-VOPs or interlaced video). The Main Profile allows B-VOPs and interlaced video.
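The profile constraints described above can be pictured as a small capability table. The following is a minimal sketch, not part of any MPEG specification: the names CodingProfile and features_to_remove are hypothetical, and only the two features discussed here (B-pictures or B-VOPs, and interlaced video) are modeled. It lists which coding features a transcoder would have to eliminate when moving from a source profile to a target profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodingProfile:
    """Hypothetical record of the two features discussed in the text."""
    standard: str
    name: str
    b_pictures: bool   # B-pictures (MPEG-2) or B-VOPs (MPEG-4)
    interlaced: bool   # interlaced video allowed


# Values taken from the description above; everything else is omitted.
MPEG2_MAIN = CodingProfile("MPEG-2", "Main", b_pictures=True, interlaced=True)
MPEG4_SIMPLE = CodingProfile("MPEG-4", "Simple", b_pictures=False, interlaced=False)
MPEG4_MAIN = CodingProfile("MPEG-4", "Main", b_pictures=True, interlaced=True)


def features_to_remove(source: CodingProfile, target: CodingProfile) -> list:
    """Return the source features that the target profile cannot carry."""
    removed = []
    if source.b_pictures and not target.b_pictures:
        removed.append("B-pictures")
    if source.interlaced and not target.interlaced:
        removed.append("interlaced coding")
    return removed


if __name__ == "__main__":
    print(features_to_remove(MPEG2_MAIN, MPEG4_SIMPLE))
    print(features_to_remove(MPEG2_MAIN, MPEG4_MAIN))
```

Under these assumptions, a conversion to the MPEG-4 Simple Profile must drop both features, while a conversion to the MPEG-4 Main Profile can in principle retain them.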
Accordingly, it would be desirable to achieve interoperability among different types of end-systems by the use of MPEG-2 video to MPEG-4 video transcoding and/or MPEG-4 video to MPEG-2 video transcoding. The different types of end-systems that should be accommodated include:
Transmitting Interworking Unit (TIU): Receives MPEG-2 video from a native MTS (or MPS) system, transcodes it to MPEG-4 video, and distributes it over packet networks using a native RTP-based system layer (such as an IP-based internetwork). Examples include a real-time encoder, an MTS satellite link to the Internet, and a video server with MPS-encoded source material.
Receiving Interworking Unit (RIU): Receives MPEG-4 video in real time from an RTP-based network, transcodes it to MPEG-2 video (if possible), and forwards it to a native MTS (or MPS) environment. An example is an Internet-based video server feeding an MTS-based cable distribution plant.
Transmitting Internet End-System (TIES): Transmits MPEG-2 or MPEG-4 video generated or stored within the Internet end-system itself, or received from Internet-based computer networks. Examples include a video server.
Receiving Internet End-System (RIES): Receives MPEG-2 or MPEG-4 video over an RTP-based internet for consumption at the Internet end-system or forwarding to a traditional computer network. Examples include a desktop PC or workstation viewing a training video.
It would be desirable to determine similarities and differences between MPEG-2 and MPEG-4 systems, and provide transcoder architectures which yield a low complexity and small error.
The transcoder architectures should be provided for systems where B-frames are enabled (e.g., main profile), as well as a simplified architecture for when B-frames are not used (simple profile).
Format (MPEG-2 to MPEG-4) and/or size transcoding should be provided.
It would also be desirable to provide an efficient mapping from the MPEG-2 to MPEG-4 syntax, including a mapping of headers.
The system should include size transcoding, including spatial and temporal transcoding.
The system should allow size conversion at the input bitstream or output bitstream of a transcoder.
The size transcoder should convert a bitstream of ITU-R 601 interlaced video coded with MPEG-2 MP@ML into a simple profile MPEG-4 bitstream which contains SIF progressive video suitable, e.g., for a streaming video application.
The system should provide an output bitstream that can fit in the practical bandwidth for a streaming video application (e.g., less than 1 Mbps).
The present invention provides a system having the above and other advantages.
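To give a rough sense of why spatial and temporal downscaling help meet a streaming bandwidth target, the following sketch compares macroblock throughput before and after size conversion. The picture dimensions (720x480 for ITU-R 601 and 352x240 for SIF), the frame rates, and the 1 Mbps budget used as an upper bound are illustrative assumptions, not values fixed by the text above, and the function name is hypothetical.

```python
MB_SIZE = 16  # a macroblock covers 16x16 luminance samples


def macroblocks_per_second(width: int, height: int, frame_rate: int) -> int:
    """Count whole macroblocks per frame and scale by the frame rate."""
    mbs_per_frame = (width // MB_SIZE) * (height // MB_SIZE)
    return mbs_per_frame * frame_rate


# Assumed formats (not stated in the text): 30 frames/s input.
itu601 = macroblocks_per_second(720, 480, 30)    # input, ITU-R 601
sif = macroblocks_per_second(352, 240, 30)       # spatial downscaling only
sif_half = macroblocks_per_second(352, 240, 15)  # plus temporal downscaling

BUDGET_BPS = 1_000_000  # illustrative 1 Mbps streaming budget

if __name__ == "__main__":
    for label, rate in (("ITU-R 601", itu601), ("SIF", sif), ("SIF, half rate", sif_half)):
        print(f"{label}: {rate} MB/s, {BUDGET_BPS / rate:.1f} bits per macroblock")
```

Under these assumptions, spatial downscaling alone cuts the macroblock rate to roughly a quarter of the input, and halving the frame rate doubles the bits available per macroblock again for the same bit budget.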