Wireless (as well as wired) networks are able to provide increasingly rich media content to client devices. However, a limitation of some client devices, particularly mobile client devices, is that these devices may not have the resources (e.g., the display real estate) to render the rich content that is traditionally created for more capable devices such as desktop computers or DVD (digital versatile disc) players. Moreover, the client devices that are available have widely varying capabilities and attributes; that is, a network typically must serve a heterogeneous mix of devices. Furthermore, a wireless network typically has less bandwidth than a conventional wired network. Therefore, there is a need to adapt the original media content to the attributes of a receiving client device as well as to the capabilities of the network. This function is performed by network nodes commonly referred to as transcoders.
A transcoder takes a compressed, high resolution, high bit rate media stream as input, then processes it to produce another compressed media stream, at a reduced resolution and bit rate, as output. The original content may have been coded at, for example, a resolution of 720×480 pixels and a bit rate of two to eight Mbps for DVDs, or at a resolution of 320×240 pixels and a bit rate of 1.5 Mbps for desktop clients connected to the Internet through a T1 line. However, due to the characteristics of mobile communication (e.g., lower bandwidth channels and limited display capabilities), lower bit rates and reduced resolution are desired.
A straightforward method for transcoding media content is to decode (decompress) the original (input) stream, downsample the decoded frames to a smaller size, and re-encode (recompress) the downsampled frames to a lower bit rate. However, this method can consume an extensive amount of the computational resources available on a transcoder. Because a transcoder is expected to conduct sessions with many different types of client devices, and to concurrently conduct as many sessions as possible, the straightforward method is impractical. Thus, it is desirable to develop fast and efficient transcoding methods to reduce the load on computational resources. The present invention provides a novel solution to this need.
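For illustration only, the straightforward decode, downsample, and re-encode pipeline described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular transcoder: a frame is assumed to be a list of rows of integer luma samples, the downsampling factor is assumed to be two in each dimension using a 2×2 average, and `decode` and `encode` are hypothetical caller-supplied functions standing in for a real codec.

```python
def downsample_2x(frame):
    """Halve width and height by averaging each 2x2 block of samples.

    `frame` is assumed to be a list of equal-length rows of integers.
    """
    height, width = len(frame), len(frame[0])
    if height % 2 or width % 2:
        raise ValueError("frame dimensions must be even")
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (frame[y][x] + frame[y][x + 1]
                     + frame[y + 1][x] + frame[y + 1][x + 1])
            row.append((total + 2) // 4)  # rounded integer mean of 4 samples
        out.append(row)
    return out


def transcode_naive(compressed_frames, decode, encode):
    """Fully decode, downsample, and re-encode every frame.

    `decode` and `encode` are hypothetical callables supplied by the caller;
    in a real system these two steps dominate the computational cost.
    """
    return [encode(downsample_2x(decode(f))) for f in compressed_frames]
```

Every frame passes through a full decode and a full re-encode (including motion estimation in a real encoder), which is why this approach scales poorly when a transcoder must serve many concurrent sessions.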
Another method for transcoding re-uses the motion vectors in the original video, thereby avoiding a costly motion estimation process. Since the derived motion vectors may not be perfectly aligned with the original ones, a drift compensation loop consisting of inverse transform and motion compensation (or transform domain motion compensation) modules is required. However, the motion compensation is always performed on full macroblocks, which makes drift compensation on full-resolution frames unavoidable. The drift compensation loop thus becomes the next computational bottleneck.
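As a rough illustration of motion vector re-use, the sketch below derives one motion vector for a macroblock of the reduced-resolution frame from the vectors of the original macroblocks that cover the same area. The averaging-and-scaling rule, the function name `derive_motion_vector`, and the `scale` parameter are assumptions made for this example; the text above does not specify how the derived vectors are computed.

```python
def derive_motion_vector(original_mvs, scale=2):
    """Derive one (dx, dy) vector for a downsampled macroblock.

    `original_mvs` is a list of (dx, dy) vectors from the original
    macroblocks that map onto the output macroblock (four of them when
    the resolution is halved in each dimension). The rule used here,
    mean followed by division by `scale`, is an assumed heuristic.
    """
    if not original_mvs:
        raise ValueError("at least one original motion vector is required")
    n = len(original_mvs)
    mean_dx = sum(dx for dx, _ in original_mvs) / n
    mean_dy = sum(dy for _, dy in original_mvs) / n
    return (round(mean_dx / scale), round(mean_dy / scale))


# Two of the original vectors are (4, 2) and two are (8, -2).
derived = derive_motion_vector([(4, 2), (4, 2), (8, -2), (8, -2)])
print(derived)  # (3, 0)
```

In this example the derived vector (3, 0) matches neither original vector after scaling, which would be (2, 1) and (4, -1). A mismatch of this kind is the source of the drift that the compensation loop must correct.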