Compression (also called coding or encoding) decreases the cost of storing and transmitting media by converting the media into a lower bitrate form. Decompression (also called decoding) reconstructs a version of the original media from the compressed form.
When converting media to a lower bitrate form, a media encoder can decrease the quality of the compressed media to reduce bitrate. By selectively removing detail in the media, the encoder makes the media simpler and easier to compress, but the compressed media is less faithful to the original media. Aside from this basic quality/bitrate tradeoff, the bitrate of the media depends on the content (e.g., complexity) of the media and the format of the media.
Media information is organized according to different formats for different devices and applications. Many attributes of format relate to resolution. For video, for example, spatial resolution gives the width and height of a picture in samples or pixels (e.g., 320×240, 640×480, 1280×720, 1920×1080). Temporal resolution is usually expressed in terms of number of pictures per second (e.g., 30, 29.97, 25, 24 or 23.976 frames per second for progressive video, or 60, 59.94 or 50 fields per second for interlaced video). Typically, quality and bitrate vary directly with resolution, with higher resolution resulting in higher quality and higher bitrate.
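As a rough illustration of how resolution drives bitrate, the uncompressed bitrate of progressive video can be estimated as width × height × bits per pixel × frames per second. The following sketch assumes 12 bits per pixel (8-bit samples with 4:2:0 chroma subsampling), which is an illustrative assumption and not a value taken from the description above; the function name raw_bitrate_bps is likewise hypothetical.

```python
def raw_bitrate_bps(width, height, frames_per_second, bits_per_pixel=12):
    """Estimate the uncompressed bitrate of progressive video, in bits per second.

    bits_per_pixel=12 assumes 8-bit samples with 4:2:0 chroma subsampling
    (an illustrative assumption, not a value stated in the text).
    """
    return width * height * bits_per_pixel * frames_per_second


# Compare the example spatial resolutions at a fixed temporal resolution.
for width, height in [(320, 240), (640, 480), (1280, 720), (1920, 1080)]:
    mbps = raw_bitrate_bps(width, height, 30) / 1_000_000
    print(f"{width}x{height} at 30 frames per second: about {mbps:.1f} Mbit/s uncompressed")
```

Under these assumptions, raising either the spatial resolution or the temporal resolution raises the uncompressed bitrate proportionally, which is one reason the compressed bitrate also tends to rise with resolution.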
Delivering media content over the Internet and other computer networks has become more popular. Generally, a media server distributes media content to one or more media clients for playback. Media delivery over the Internet and some other types of networks is characterized by bandwidth that varies over time. If the bitrate of media content is too high, the media content may be dropped by the network, causing playback by the media client to stall. The media client can buffer a large portion of the media content before playback begins, but this results in a long delay before playback starts. On the other hand, if the bitrate of the media content is much lower than the network could deliver, the quality of the media content played back will be lower than it could be. By adjusting bitrate of media content so that bitrate more closely matches available network bandwidth, a media server can improve the media client's playback experience.
Scalable media encoding facilitates delivery of media when network bandwidth varies over time or when media clients have different capabilities. Multiple bitrate (MBR) video encoding is one type of scalable video encoding. An MBR video encoder encodes a video segment to produce multiple video streams (also called layers) that have different bitrates and quality levels, where each of the streams is independently decodable. A media server (or servers) can store the multiple streams for delivery to one or more media clients. A given media client receives one of the multiple streams for playback, where the stream is selected by the media client and/or media server considering available network bandwidth and/or media client capabilities. If the network bandwidth changes during playback, the media client can switch to a lower bitrate stream or higher bitrate stream. Ideally, switching between streams is seamless and playback is not interrupted, although quality will of course change.
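One simple way a media client or media server might select among the multiple streams is to pick the highest bitrate stream that fits within the currently available bandwidth, falling back to the lowest bitrate stream when none fits. The following is a minimal sketch under that assumption; the function name select_stream, the safety_factor parameter and the example bitrates are hypothetical and are not part of the description above.

```python
def select_stream(stream_bitrates_bps, available_bandwidth_bps, safety_factor=0.8):
    """Pick the bitrate of the stream to deliver.

    Returns the highest stream bitrate that fits within a fraction
    (safety_factor, a hypothetical headroom margin) of the available
    bandwidth, or the lowest stream bitrate if no stream fits.
    """
    if not stream_bitrates_bps:
        raise ValueError("at least one stream is required")
    budget = available_bandwidth_bps * safety_factor
    fitting = [bitrate for bitrate in stream_bitrates_bps if bitrate <= budget]
    return max(fitting) if fitting else min(stream_bitrates_bps)


# Hypothetical stream bitrates, in bits per second.
streams = [300_000, 1_000_000, 3_000_000, 6_000_000]

# As the measured bandwidth changes during playback, the selection changes.
for bandwidth in [10_000_000, 4_000_000, 200_000]:
    print(bandwidth, "->", select_stream(streams, bandwidth))
```

Because each stream is independently decodable, the selection can be re-evaluated whenever the bandwidth estimate changes, and the client can switch streams without needing data from the stream it was previously receiving.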
For example, an MBR video encoder receives a segment of high-resolution video such as video with a resolution of 1080p24 (height of 1080 pixels per progressive frame, 24 frames per second, or, in some cases, 23.976 frames per second) and encodes the high-resolution video for output as layers with 12 different bitrates. Of the 12 layers, a high bitrate layer might have the original 1080p24 resolution with little loss in quality, a next lower bitrate layer might have the original 1080p24 resolution with more loss in quality, and so on, down to a lowest bitrate layer with 640×360 spatial resolution and the most loss in quality. In MBR video encoding, adjustments to spatial resolution and the level of encoding quality for “lossy” compression are most common, but temporal resolution can also vary between the multiple layers output by the MBR video encoder.
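The set of layers in such an example can be represented as an ordered list in which each layer records its spatial resolution, temporal resolution and target bitrate. The following sketch shows one possible representation together with a consistency check. The Layer type, the is_valid_ladder function and all bitrate values are hypothetical placeholders; only the 1920×1080 and 640×360 resolutions and the 24 frames per second rate come from the example above, and only three of the 12 layers are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    width: int
    height: int
    frames_per_second: float
    target_bitrate_bps: int


def is_valid_ladder(layers):
    """Check that layers run from highest to lowest bitrate and that spatial
    resolution never increases as bitrate decreases."""
    for higher, lower in zip(layers, layers[1:]):
        if lower.target_bitrate_bps >= higher.target_bitrate_bps:
            return False
        if lower.width * lower.height > higher.width * higher.height:
            return False
    return True


# Hypothetical three-layer excerpt; the bitrates are placeholders only.
example_ladder = [
    Layer(1920, 1080, 24, 6_000_000),  # original resolution, little quality loss
    Layer(1920, 1080, 24, 4_500_000),  # original resolution, more quality loss
    Layer(640, 360, 24, 400_000),      # lowest bitrate, reduced spatial resolution
]

print(is_valid_ladder(example_ladder))
```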
An MBR video encoder typically produces the multiple output streams by separately encoding the input video for each stream. In addition, the MBR video encoder may perform encoding multiple times for a given output stream so that the stream has the target bitrate set for the stream. Video encoding for a single stream can be computationally intensive. Time constraints on media encoding and delivery (e.g., for live sporting events) may require that even more resources be dedicated to encoding. Because it involves encoding and re-encoding for multiple output streams, MBR video encoding can consume a significant amount of computational resources, especially for high resolutions of video. While existing ways of performing MBR video encoding provide adequate performance in many scenarios, they do not have the benefits and advantages of techniques and tools described below.
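To make the cost of encoding and re-encoding concrete, the following sketch counts encoding passes when each output stream is encoded repeatedly until it meets its target bitrate. It is an illustration under stated assumptions, not a description of any particular encoder: the encode callable, the quantization parameter range of 1 to 51, the binary search strategy and the toy_encode bitrate model are all hypothetical, and the sketch assumes that bitrate does not increase as quantization becomes coarser.

```python
def encode_to_target(encode, target_bitrate_bps, qp_min=1, qp_max=51):
    """Search for the finest quantization whose bitrate meets the target.

    `encode` is a hypothetical callable mapping a quantization parameter to
    the resulting bitrate; each call stands in for one full encoding pass.
    Returns (quantization parameter, bitrate, number of passes).
    """
    passes = 0
    best = None
    low, high = qp_min, qp_max
    while low <= high:
        qp = (low + high) // 2
        bitrate = encode(qp)
        passes += 1
        if bitrate <= target_bitrate_bps:
            best = (qp, bitrate)
            high = qp - 1  # target met; try finer quantization for higher quality
        else:
            low = qp + 1   # over budget; quantize more coarsely
    if best is None:
        raise ValueError("target bitrate not reachable at coarsest quantization")
    return best[0], best[1], passes


def toy_encode(qp):
    # Toy stand-in for a real encoder: bitrate falls as quantization coarsens.
    return 24_000_000 // qp


total_passes = 0
for target in [6_000_000, 3_000_000, 1_000_000]:  # hypothetical target bitrates
    qp, bitrate, passes = encode_to_target(toy_encode, target)
    total_passes += passes
    print(f"target {target}: qp={qp}, bitrate={bitrate}, passes={passes}")
print("total encoding passes:", total_passes)
```

Even in this toy model, every output stream needs several passes, so the total work grows with both the number of streams and the number of passes per stream.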