With the increasing popularity of playing streaming audio and video over networks such as the internet, there is a need for optimizing the data transferred from a server to a client such that the client's experience is maximized even if network conditions during playback are inconsistent. Optimizing the client's experience involves choosing a quality level for encoding the audio and video portions of the video playback such that the video can be transferred and reconstructed uninterrupted while preserving the quality of the video content.
The quality level is generally dictated by the bit rate specified for the encoded audio or video portions of the input stream. A higher bit rate generally indicates that a larger amount of information about the original audio or video is encoded and retained, and therefore a more accurate reproduction of the original input audio or video will be presented during video playback. Conversely, a lower bit rate indicates that less information about the original input audio or video is encoded and retained, and thus a less accurate reproduction of the original audio or video will be presented during video playback.
Generally, the bit rate for encoding each of the audio and video is specified based on several factors. The first factor is the network condition between the server and the client. A network connection that can transfer a large amount of data indicates that a higher bit rate can be specified for the input video that is subsequently transferred over the network connection. The second factor is the desired start-up latency. Start-up latency is the delay that a video playback tool experiences when first starting up due to the large amount of data that has to be received, processed, and buffered. The third factor is the tolerance to glitching. Glitching occurs when video playback has to stop because data is missing. In most cases any amount of start-up latency or glitching is intolerable, and it is therefore desirable to optimize the bit rate specified such that the start-up latency and the glitching are minimized or eliminated.
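By way of illustration only, the relationship between the specified bit rate, the network throughput, and the start-up latency can be sketched as follows. The function name, the parameters, and the simplifying assumption that start-up latency equals the time needed to download an initial playback buffer are all hypothetical and are not taken from any particular system.

```python
def startup_latency_seconds(bit_rate_kbps, buffer_seconds, throughput_kbps):
    """Estimate the time needed to fill an initial playback buffer.

    Assumption: playback starts once `buffer_seconds` of content, encoded
    at `bit_rate_kbps`, has been received over a link that sustains
    `throughput_kbps`. Processing and decoding time are ignored.
    """
    if throughput_kbps <= 0:
        raise ValueError("throughput must be positive")
    buffered_kbits = bit_rate_kbps * buffer_seconds
    return buffered_kbits / throughput_kbps


def will_eventually_glitch(bit_rate_kbps, throughput_kbps):
    """Under the same assumption, a stream whose bit rate exceeds the
    sustained throughput drains its buffer and must eventually stall."""
    return bit_rate_kbps > throughput_kbps
```

Under these assumptions, buffering 5 seconds of a 1500 kbit/s stream over a 3000 kbit/s connection takes 2.5 seconds, and lowering the bit rate shortens that delay at the cost of reproduction accuracy.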
Currently available commercial streaming media systems rely on multi bit rate (MBR) coding to perform coding rate control. In MBR coding, source video content is encoded into alternative bit streams at different coding rates and typically stored in the same media file at the server. This then allows the content to be streamed in segments or chunks at varying levels of quality corresponding to different coding rates according to the changing network conditions, typically using bit stream switching between segments.
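The bit stream switching described above can be sketched, under stated assumptions, as a per-segment selection among the alternative streams. The coding rates listed, the `safety` margin, and the function names are hypothetical values chosen for illustration; actual systems may use different selection rules.

```python
# Hypothetical coding rates (kbit/s) of the alternative bit streams,
# ordered from lowest to highest quality.
STREAM_RATES_KBPS = [300, 700, 1500, 3000]


def pick_stream(throughput_kbps, rates=STREAM_RATES_KBPS, safety=0.8):
    """Return the index of the highest-rate stream that fits within a
    safety fraction of the measured throughput.

    Falls back to the lowest-rate stream (index 0) when none fits.
    """
    budget = throughput_kbps * safety
    choice = 0
    for index, rate in enumerate(rates):
        if rate <= budget:
            choice = index
    return choice


def plan_segments(throughputs_kbps):
    """Choose a stream for each segment from one throughput measurement
    per segment, switching streams only at segment boundaries."""
    return [pick_stream(t) for t in throughputs_kbps]
```

For example, throughput measurements of 4000, 1000, 100, and 2000 kbit/s over four consecutive segments would select streams 3, 1, 0, and 2 respectively, so the quality level follows the changing network conditions.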
The currently available multi bit rate video streaming systems use a constant bit rate approach to encoding each alternative video stream. However, a typical video will generally include scenes having a wide variety of visual complexity, and the constant bit rate approach cannot efficiently encode video segments of differing complexity. The constant bit rate approach spends more bits than necessary on low complexity video segments, and conversely allocates too few bits to high complexity scenes. Consequently, the constant bit rate approach to encoding the alternative streams results in video quality for internet streaming that is undesirable and inconsistent.
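The inefficiency of the constant bit rate approach can be illustrated with a minimal sketch. The per-segment complexity scores are hypothetical, and the complexity-proportional allocation is shown only as a point of comparison; it is not presented as the method of any existing system.

```python
def constant_allocation(complexities, total_bits):
    """Constant bit rate: every segment receives the same number of
    bits regardless of its complexity."""
    share = total_bits / len(complexities)
    return [share for _ in complexities]


def proportional_allocation(complexities, total_bits):
    """Comparison scheme (an assumption for illustration): bits are
    divided in proportion to each segment's complexity score."""
    total_complexity = sum(complexities)
    return [total_bits * c / total_complexity for c in complexities]


def bits_per_complexity_unit(allocation, complexities):
    """Rough proxy for quality: bits available per unit of complexity."""
    return [bits / c for bits, c in zip(allocation, complexities)]
```

With two equal-length segments of complexity 1 and 3 and a total budget of 400 bits, the constant allocation gives each segment 200 bits, so the simple segment receives 200 bits per complexity unit while the complex segment receives about 67. The proportional allocation gives 100 and 300 bits, or 100 bits per complexity unit for both segments.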
The currently available multi bit rate video streaming systems also require the final display resolution to be fixed. By maintaining a fixed display resolution, the video streams at the multiple bit rates can all be decoded and scaled to this same final display resolution in order to achieve a glitch free video presentation. With the fixed display resolution, the various alternative video streams can have a wide range of bit rates, from a few megabits per second down to a few kilobits per second. One problem is matching an appropriate video resolution to each video stream bit rate. The currently available multi bit rate video streaming systems use a pre-defined encoding resolution, which again may not be well suited to the varying complexity of the video scenes. For low complexity video, the pre-defined resolution may be too small. For complex video, the pre-defined resolution may be too large.
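The difficulty of matching a resolution to a bit rate can be illustrated by computing the average number of bits available per pixel. The function name and the example resolution, frame rate, and bit rates are assumptions chosen for illustration only.

```python
def bits_per_pixel(bit_rate_bps, width, height, frames_per_second):
    """Average number of encoded bits available for each pixel of each
    frame at the given bit rate and encoding resolution."""
    pixels_per_second = width * height * frames_per_second
    return bit_rate_bps / pixels_per_second


# Hypothetical example: one pre-defined encoding resolution (1280x720 at
# 30 frames per second) used for streams at two very different bit rates.
high_rate = bits_per_pixel(3_000_000, 1280, 720, 30)
low_rate = bits_per_pixel(300_000, 1280, 720, 30)
```

At the same pre-defined resolution, the lower rate stream in this example has one tenth as many bits per pixel as the higher rate stream, which suggests why a single resolution may suit one bit rate or scene complexity and not another.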