The invention relates generally to video hosting systems, and more particularly to a video coding system for transcoding of videos with optimized visual quality under encoding time and bitrate constraints.
Video hosting services, such as YOUTUBE, allow users to post videos. Most video hosting services transcode an original source video from its native encoded format (e.g., MOV) into one or more output formats (e.g., ADOBE FLASH or Windows Media Video (WMV)). Transcoding comprises decoding the source video from the native format into an unencoded representation using a video codec for the native format and then encoding the unencoded representation with video codecs for the output formats. Transcoding can be used to reduce storage requirements, and also to reduce the bandwidth requirements for serving the video to clients.
One challenge in designing a video coding system for video hosting services with millions of videos is to transcode and store the videos with acceptable visual quality and at a reasonable computing cost. A particular problem is the efficient allocation of coding bits and computations to achieve optimized rate-distortion (R-D) performance and computing time for a source video. Generally, given a target resolution and frame rate, a video's visual quality is determined by its encoding bitrate, which is computed using a rate control algorithm. Conventional video encoding systems use a variety of encoding strategies to obtain optimized rate-distortion performance for a source video, including one-pass and multi-pass Average Bitrate Encoding (ABR), Constant Bitrate Encoding (CBR), Constant Quantizer Encoding (CQP) and Constant Rate Factor Encoding (CRF).
Conventional encoding strategies fail to provide encoded videos with constant visual quality while meeting the bitrate constraint associated with the videos, and they do not optimize bitrate, distortion and complexity jointly. For example, an ABR encoding strategy uses scaling factors and long-term and short-term compensation to achieve a target bitrate and to meet a network bandwidth constraint. However, the visual quality of ABR encoding may fluctuate when video scenes change. A CBR encoding strategy is designed for real-time streaming at a constant bitrate, which is controlled by a storage buffer of fixed size. CBR provides the highest encoding speed but the lowest R-D performance among the above-mentioned conventional encoding strategies. A CQP encoding strategy maintains a constant quantizer and compresses every frame using the same quantization parameter (QP). CQP may cause temporal perceptual fluctuation in encoded videos, especially when it uses large quantizers on videos with intensive scene changes. A CRF encoding strategy aims to achieve constant visual quality with a constant rate factor. CRF encodes a video with a nominal quantizer, but increases the QP when a scene has a lot of action and motion and decreases it when a scene is relatively static. The disadvantage of CRF encoding is that the output video file size is unpredictable because the scenes in the video content vary. Thus, it is hard to choose an appropriate constant rate factor value to meet a required bitrate constraint of a network or storage system.
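As an illustration only, the four conventional strategies can be distinguished by the parameter that each one holds fixed. The following minimal sketch assumes the FFmpeg command-line tool with the libx264 encoder, neither of which is named in this description. The function names, the default values and the buffer size of twice the bitrate are hypothetical choices made for the example.

```python
def rate_control_args(strategy, bitrate_kbps=1000, quality=23):
    """Return the encoder options that characterize one rate-control strategy.

    Assumption: FFmpeg with libx264. The defaults are illustrative only.
    """
    rate = f"{bitrate_kbps}k"
    if strategy == "abr":
        # Average bitrate: only the long-term target bitrate is fixed.
        return ["-b:v", rate]
    if strategy == "cbr":
        # Constant bitrate: minimum, maximum and target rates coincide, and a
        # fixed-size buffer bounds short-term deviation. The 2x buffer size
        # is an arbitrary choice for this sketch.
        return ["-b:v", rate, "-minrate", rate, "-maxrate", rate,
                "-bufsize", f"{2 * bitrate_kbps}k"]
    if strategy == "cqp":
        # Constant quantizer: the same QP is requested for every frame.
        return ["-qp", str(quality)]
    if strategy == "crf":
        # Constant rate factor: quality target is fixed, bitrate is not.
        return ["-crf", str(quality)]
    raise ValueError(f"unknown strategy: {strategy}")


def build_command(src, dst, strategy, **kwargs):
    """Assemble a full transcoding command line as a list of arguments."""
    return ["ffmpeg", "-y", "-i", src, "-c:v", "libx264",
            *rate_control_args(strategy, **kwargs), dst]


if __name__ == "__main__":
    for name in ("abr", "cbr", "cqp", "crf"):
        print(" ".join(build_command("source.mov", "output.mp4", name)))
```

In this sketch the ABR and CBR commands specify a bitrate but no quality level, while the CQP and CRF commands specify a quality level but no bitrate. This mirrors the trade-off described above: no single conventional strategy fixes visual quality and output size together.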