This invention relates in general to streaming media and more specifically to implementing dynamic bit rate adaptation while streaming media on demand.
Available bandwidth on the internet can vary widely. For mobile networks, limited bandwidth and limited coverage, as well as wireless interference, can cause large fluctuations in available bandwidth, which exacerbate the naturally bursty nature of the internet. When congestion occurs, bandwidth can degrade quickly. For streaming media, which requires long-lived connections, being able to adapt to the changing bandwidth can be advantageous. This is especially so for streaming that requires large amounts of consistent bandwidth.
In general, interruptions in network availability, where the usable bandwidth falls below a certain level for any extended period of time, can result in very noticeable display artifacts or playback stoppages. Adapting to network conditions is especially important in these cases. The issue with video is that it is typically compressed using predictive differential encoding, where interdependencies between frames complicate bit rate changes. Video file formats also typically contain header information which describes frame encodings and indices; dynamically changing bit rates may cause conflicts with the existing header information.
A number of solutions have been proposed for dealing with these problems. One set of solutions uses multiple independently encoded files; however, switching between files typically requires interrupting playback, which is undesirable. These solutions also typically require starting again from the beginning of the file, which is very disruptive. Solutions based on the RTSP/RTP transport delivery protocols have the advantage of being frame-based, which eases switching between streams, but they require that multiple streams be running simultaneously, which is inefficient in both bandwidth and server resources. Other solutions propose alternate file encoding schemes with layered encodings. Multiple files are used, but each file can be added to the previous files to provide higher quality. Rate adaptation is performed by sending fewer layers of the encoding during congestion. These schemes require much more complex preprocessing of files, and the codecs are not typically supported natively by most devices. For mobile devices with limited resources, this can be a large barrier to entry.
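By way of illustration only, the layered approach to rate adaptation described above may be sketched as follows. The function name, the layer bit rates, and the rule that the base layer is always sent are assumptions made for this example and are not taken from any particular layered codec.

```python
def layers_to_send(layer_rates_kbps, available_kbps):
    """Return how many layers, base layer first, fit in the available bandwidth.

    layer_rates_kbps lists the incremental bit rate of each layer; each layer
    only adds quality on top of the layers before it.
    """
    total = 0
    count = 0
    for rate in layer_rates_kbps:
        if total + rate > available_kbps:
            break  # this layer and all higher layers are dropped
        total += rate
        count += 1
    # Assumption: the base layer is always sent, because nothing can be
    # decoded without it, even if it exceeds the available bandwidth.
    return max(count, 1)


if __name__ == "__main__":
    layers = [100, 150, 250]  # hypothetical base layer plus two enhancements
    for bandwidth in (50, 300, 1000):
        print(bandwidth, "kbps ->", layers_to_send(layers, bandwidth), "layer(s)")
```

In this sketch, congestion simply lowers the number of layers sent; the preprocessing and decoder support that such schemes require are outside its scope.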
More recently, schemes have been proposed which use multiple files, each encoded at a different bit rate, with each file divided into segments. Each segment is an independently playable file. The segments provide fixed boundaries from which to switch and restart playback. This solves the problem of having to restart from the beginning, and limits the playback disruption. The switching granularity is not nearly as fine as with RTSP, which may be as low as 1/30th of a second, but is rather on the order of seconds to tens of seconds. With finer granularity, disruption to users is minimized; however, segment overhead is maximized. In cases where the round trip latency between the client and the server is higher than the segment duration, undue overhead is introduced, as the rate cannot be adapted that quickly. If caching is employed, cache distribution and synchronization latency may compound these issues. Coarser granularity, however, limits the utility of the switching scheme. If the available network bandwidth varies with a period shorter than the segment duration, the inability to adapt in a timely manner negates the value of segmentation.
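The granularity trade-off described above may be illustrated with a minimal sketch, given here by way of example only. It assumes a hypothetical client that measures bandwidth once per second and may change bit rate only at a segment boundary; the function names, the encoding ladder, and the bandwidth trace are all invented for the illustration.

```python
def pick_bitrate(encodings_kbps, available_kbps):
    """Pick the highest encoding that fits; fall back to the lowest one."""
    candidates = [r for r in sorted(encodings_kbps) if r <= available_kbps]
    return candidates[-1] if candidates else min(encodings_kbps)


def plan_segments(bandwidth_per_second_kbps, segment_seconds, encodings_kbps):
    """Choose one bit rate per segment from the bandwidth seen at its start.

    The rate cannot change again until the next segment boundary, so any
    fluctuation inside a segment goes unnoticed.
    """
    plan = []
    for start in range(0, len(bandwidth_per_second_kbps), segment_seconds):
        plan.append(pick_bitrate(encodings_kbps, bandwidth_per_second_kbps[start]))
    return plan


if __name__ == "__main__":
    encodings = [200, 500, 800]  # hypothetical bit rate ladder, in kbps
    # Hypothetical trace: bandwidth alternates every two seconds.
    trace = [900, 900, 300, 300, 900, 900, 300, 300]
    print("2 s segments:", plan_segments(trace, 2, encodings))
    print("4 s segments:", plan_segments(trace, 4, encodings))
```

With two-second segments, the plan follows every change in the trace. With four-second segments, every boundary happens to fall on a high-bandwidth sample, so the highest bit rate is requested throughout and each dip in bandwidth is missed.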
Content providers produce content and monetize it through a variety of means (advertising sponsorship, product placement, direct sales, etc.). One of the primary methods for monetizing video content is the insertion of video advertisements, either periodically, as with television and some internet-based long-form video content delivery, or strictly as pre-roll and/or post-roll advertisements, as with movies and some short-form video content delivery.
For desktop delivery of media, switching between content and ads is fairly seamless given the high bandwidth provided by broadband connections and the high CPU power of modern desktop PCs. For mobile delivery of media, however, high-latency and low-bandwidth cellular networks, coupled with low CPU power in most handsets, can cause long playback disruptions when retrieving separate content and advertisement video files. On-demand transcoding and stitching of advertisements to content is a CPU-intensive task which requires dedicated servers. It incurs the cost of maintaining those servers and prevents the use of tried and true content delivery networks (CDNs). To limit these costs, pre-stitching of advertisements to content is often used. However, advertisements are typically rotated periodically with changing ad campaigns. For long-form content, changing the ads may require re-stitching extremely large amounts of content and then re-uploading all of that content to a CDN. Network bandwidth is typically a bottleneck and uploading can take a long time; uploading can also be costly if network access is paid for by the amount of bandwidth used. With long-form content, the ads are typically very small relative to the size of the feature content. Re-uploading the entire file, including both ad and feature content, needlessly incurs the cost of re-uploading the feature content.
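The cost of re-uploading pre-stitched content may be made concrete with a short calculation, offered as a sketch only. The file sizes are hypothetical, and the alternative in which only the advertisement is uploaded is assumed purely for comparison.

```python
def upload_bytes(feature_bytes, ad_bytes, restitch_whole_file):
    """Bytes uploaded when the ad attached to one piece of content changes."""
    if restitch_whole_file:
        # Pre-stitched case: the feature content is uploaded again with the ad.
        return feature_bytes + ad_bytes
    # Assumed alternative for comparison: only the new ad is uploaded.
    return ad_bytes


def wasted_fraction(feature_bytes, ad_bytes):
    """Share of a pre-stitched re-upload that is unchanged feature content."""
    return feature_bytes / (feature_bytes + ad_bytes)


if __name__ == "__main__":
    MB = 1_000_000
    feature, ad = 700 * MB, 10 * MB  # hypothetical long-form feature and ad
    print("pre-stitched re-upload:", upload_bytes(feature, ad, True) // MB, "MB")
    print("ad-only upload:", upload_bytes(feature, ad, False) // MB, "MB")
    print("wasted share: {:.1%}".format(wasted_fraction(feature, ad)))
```

Under these assumed sizes, nearly all of the bytes in a pre-stitched re-upload are feature content that has not changed.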