Increasing amounts of data are being transmitted from servers to clients via communication infrastructures such as packet-based Internet Protocol (IP) networks. One particular application that is increasing in popularity is multimedia streaming. However, improvements must be made in providing reliable data streams before widespread adoption of such services. For example, as data transmission link rates between the IP network and a client device of a user tend to fluctuate, any disturbances in data delivery to the user may result in severe degradation of the playout to the end user, i.e. a degradation in the quality of the media observed by the user. In particular, it is important that there be a sufficient supply of packets of data at the client device to be fetched by a multimedia application as playout (i.e. the display of the multimedia file by the multimedia application or player) progresses.
In many cases, the packet transmission rate cannot be changed, as this rate depends upon the bandwidth of the communication link (or it is at least impractical to change the packet transmission rate). However, the rate at which data is fed to the output device of the user often must be changed. Typically, for streaming applications, such adjustments are achieved using “stream switching”. With stream switching, the same media content, e.g. a particular video sequence, is pre-encoded at different bit rates and stored at the server. Hence, different versions of the same stream are available. During transmission, the server selects the particular version that has a data bit rate most appropriate based upon the current available bandwidth in the network and based upon the status of the client buffer. Switching logic employed by the server decides if and when to switch to another version of the stream. In the case of a so-called “down-switch”, the stream is switched to a version with a lower encoded bit rate. In the case of an “up-switch”, the switch is made to a version with a higher encoded bit rate. In many implementations, the criteria for switching employ predefined thresholds defined with respect to client buffer status. In one example, thresholds are based upon a buffer fill level, which represents the amount of data within the client buffer in bytes. In another example, the thresholds are based upon a playout length (PT) of stored media in the client buffer, which represents the amount of time in seconds it will take for the data already within the client buffer to be played out to the user. Herein, examples involving playout length are described, though buffer fill level or other appropriate parameters can instead be used.
Some conventional techniques for determining the status of the client buffer utilize information within Real Time Transport Control Protocol (RTCP) receiver reports (RRs). Information pertaining to the next sequence number (NSN) or oldest buffered sequence number (OBSN) within the client buffer and the highest received sequence number (HRSN) within the client buffer is contained within the RR and is used to determine the consumed buffer space, as the size of each packet within the range from the HRSN to the NSN/OBSN is known. If the free space within the client buffer is below a preferred client buffer fill level, then a different version of the stream is selected. For example, if buffer playout length (PT) falls below a predetermined minimum threshold (PTDOWN), then a risk of buffer draining occurs, i.e. the client buffer becomes empty such that there is no data to stream to the user. This results in a playout freeze, wherein the last image displayed to the user is typically frozen until a sufficient amount of additional data can be added to the client buffer to restart the stream to the output device employed by the user, i.e. a “rebuffering” of the client buffer is required. Rebuffering can be extremely annoying from the standpoint of the user.
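The buffer-status calculation described above can be sketched as follows. This is only an illustration under stated assumptions: the server is assumed to keep a record of every packet it has sent (sequence number, size, and the media duration it carries), no packets in the reported range are assumed lost, and sequence-number wraparound is ignored. The names SentPacket and buffer_status are hypothetical and do not come from any RTCP library.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class SentPacket:
    """Hypothetical server-side record of one transmitted packet."""
    seq: int           # sequence number of the packet
    size_bytes: int    # payload size in bytes
    duration_s: float  # seconds of media carried by the packet


def buffer_status(sent: Iterable[SentPacket], obsn: int, hrsn: int) -> Tuple[int, float]:
    """Estimate (fill level in bytes, playout length PT in seconds).

    Assumes every packet with obsn <= seq <= hrsn is currently held in
    the client buffer, i.e. no loss and no sequence-number wraparound.
    """
    buffered = [p for p in sent if obsn <= p.seq <= hrsn]
    fill_bytes = sum(p.size_bytes for p in buffered)
    playout_s = sum(p.duration_s for p in buffered)
    return fill_bytes, playout_s


if __name__ == "__main__":
    log = [SentPacket(seq, 1000, 0.5) for seq in range(100, 110)]
    print(buffer_status(log, obsn=103, hrsn=106))
```

Either returned value can then be compared against thresholds: the byte count corresponds to the buffer fill level, and the summed duration corresponds to the playout length (PT).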
To avoid possible rebuffering due to client buffer draining, the server detects when the playout length (PT) within the client buffer drops below threshold PTDOWN, then adjusts the bit rate (i.e. selects a version of the stream having a different bit rate) in an attempt to prevent the client buffer from becoming completely drained. More specifically, the server performs a down-switch, i.e. a switch to a lower bit rate stream. The reason that a down-switch is performed, rather than an up-switch, is that the most likely reason that the client buffer is being drained is that the link rate between the server and the client buffer is less than anticipated, i.e. the effective bandwidth is less than needed for the bit rate currently being used. As a result, data is not being received by the client buffer at the same rate at which the client buffer is feeding data to the output device of the user. Hence, the client buffer, which should remain fairly well populated with data, becomes drained. By switching to the lower bit rate, the client buffer feeds data to the display unit at a lower rate, thereby allowing more time for data to be received from the server, and thereby preventing the client buffer from becoming completely drained. From the standpoint of the user, the quality of the media stream is downgraded because of the down-switch, e.g. the size of the displayed image of the video stream becomes smaller, the resolution of the image becomes lower, or higher distortions are observed in the image. Yet, this is preferable to the aforementioned playout freeze that occurs during rebuffering.
On the other hand, if buffer playout length (PT) exceeds a predetermined maximum threshold (PTUP), then a risk of buffer overflow occurs, i.e. the client buffer becomes full such that there is no room for additional packets. Any packets received by the client buffer but not stored therein are typically not re-sent by the server and hence the data of those packets are simply not forwarded to the output device of the user. Once the client buffer is again capable of storing packets, the data stream resumes with the new packets. Thus, from the standpoint of the user, there is a sudden loss of content as the stream simply jumps ahead. In the case of a film or movie, dialogue can be lost, thus interfering with the ability of the user to follow the story. In the case of music, the song simply jumps ahead. As will be appreciated, this can be quite annoying from the standpoint of the user as well.
To avoid a disruption of the stream due to client buffer overflow, the server detects when the playout length (PT) within the client buffer exceeds threshold PTUP and then performs an up-switch, i.e. a switch to a higher bit rate stream. The reason that an up-switch is performed, rather than a down-switch, is that the most likely reason that the client buffer is becoming too full is that the link rate between the server and the client buffer is greater than anticipated, i.e. the effective bandwidth is greater than needed for the bit rate currently being used. As a result, data is being received by the client buffer at a rate higher than the rate at which the client buffer feeds the data to the output device of the user. Hence, the client buffer overflows. By switching to the higher bit rate, the client buffer feeds data to the output device at the higher rate, thereby preventing the client buffer from overflowing. From the standpoint of the user, the quality of the media stream is improved due to the up-switch, e.g. the size of the displayed image of the video stream becomes larger or the resolution of the image becomes greater. Hence, the up-switch helps prevent interruption of the stream and improves media quality, both of which benefit the user.
Simple logic for performing up-switches and down-switches may be represented as follows:
    If PT>PTUP then
        Perform up-switch
    else if PT<PTDOWN
        Perform down-switch
    end if
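As a runnable illustration, the same two-threshold logic might be written in Python as shown here. The function names, the "up"/"down"/"none" return values, and the clamping of the version index to the available range are assumptions made for this sketch; the text above does not specify them.

```python
def switch_decision(pt: float, pt_up: float, pt_down: float) -> str:
    """Return "up", "down" or "none" for the current playout length PT."""
    if pt > pt_up:
        return "up"
    elif pt < pt_down:
        return "down"
    return "none"


def next_version(index: int, decision: str, num_versions: int) -> int:
    """Apply a decision to a version index (0 = lowest bit rate).

    Clamping at both ends is an assumption of this sketch: no switch is
    possible beyond the lowest or highest pre-encoded version.
    """
    if decision == "up":
        return min(index + 1, num_versions - 1)
    if decision == "down":
        return max(index - 1, 0)
    return index


if __name__ == "__main__":
    # Example thresholds (hypothetical values): PTUP = 10 s, PTDOWN = 4 s.
    for pt in (12.0, 7.0, 3.0):
        print(pt, switch_decision(pt, pt_up=10.0, pt_down=4.0))
```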
Appropriate selection of these thresholds is critical to the overall media impression of the user. In the case of a down-switch that is performed too late, a rebuffering event will happen. In the case of an up-switch that is performed too late, the user receives a lower quality media than is otherwise necessary and, as noted, a break in the data stream may occur as the result of a buffer overflow. Likewise, if a down-switch is performed earlier than necessary, the user receives a lower quality media than is otherwise necessary. If an up-switch is performed earlier than necessary, a down-switch may then soon be required, resulting in annoying fluctuations in the quality of the media. To avoid these problems, multiple down-switch thresholds and multiple up-switch thresholds can potentially be used. As playout length decreases towards buffer drainage, a series of the down-switch thresholds are crossed, each triggering a down-switch. Conversely, as playout length increases towards buffer overflow, a series of up-switch thresholds are crossed, each triggering an up-switch.
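One possible reading of the multiple-threshold scheme is sketched below: between two successive buffer reports, every down-switch threshold crossed from above counts as one down-switch, and every up-switch threshold crossed from below counts as one up-switch. The function name, the comparison of the previous and current playout length, and the threshold values in the example are all assumptions made for illustration.

```python
from typing import Sequence


def net_switches(prev_pt: float, pt: float,
                 down_thresholds: Sequence[float],
                 up_thresholds: Sequence[float]) -> int:
    """Return the net number of switches between two PT observations.

    Positive values mean up-switches, negative values mean down-switches.
    A down-switch threshold t is crossed when PT falls from at or above t
    to below t; an up-switch threshold t is crossed when PT rises from at
    or below t to above t.
    """
    downs = sum(1 for t in down_thresholds if pt < t <= prev_pt)
    ups = sum(1 for t in up_thresholds if prev_pt <= t < pt)
    return ups - downs


if __name__ == "__main__":
    down = [6.0, 4.0, 2.0]    # hypothetical down-switch thresholds (seconds)
    up = [12.0, 14.0, 16.0]   # hypothetical up-switch thresholds (seconds)
    print(net_switches(7.0, 3.5, down, up))
    print(net_switches(11.0, 15.0, down, up))
```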
However, after a switch has occurred and a stream with the new bit rate has been transmitted, it takes some time before the switch has any effect on the playout length of the client buffer. First, there is a transmission delay until a first packet containing data encoded at the new rate reaches the client buffer. During this time period, the playout length of the stored media in the client buffer is unaffected by the new rate. Hence, if the playout length was increasing toward a possible buffer overflow, it will likely continue to increase. Conversely, if the playout length was decreasing toward possible buffer drainage, it will likely continue to decrease. Also, even after the arrival of the first packet at the new bit rate, the playout length may change only slowly at first. For example, there may still be some packets sent with data at the previous bit rate that have not yet been received by the client buffer. Therefore, the switching conditions are often still valid and several switches then follow a first switch, which are often unnecessary. In the case of a first down-switch, several further down-switches may be performed, resulting in a stream bit rate that is much lower than necessary. Often, the down-switches do not stop until the lowest stream bit rate has been selected. This behavior results in an unnecessarily low media stream quality for the user. In the case of an up-switch, several further up-switches can happen, resulting in a stream bit rate that is too high, often the highest rate possible. This results in a stream bit rate that is much too high compared with the current available network bandwidth, triggering a series of down-switches.
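The cascading effect can be reproduced with a deliberately simplified toy model, given here only as an illustration. The assumptions are: time advances in one-second ticks, the link rate is constant, the client plays out one second of media per tick, the server sees the buffer status once per tick, and data at a newly selected rate reaches the client only after a fixed number of ticks. Only down-switches are modeled, and all names and numbers are hypothetical.

```python
from collections import deque
from typing import Sequence


def count_down_switches(rates_kbps: Sequence[float], link_kbps: float,
                        start_index: int, pt0: float, pt_down: float,
                        delay_ticks: int, ticks: int) -> int:
    """Count the down-switches issued during `ticks` one-second ticks."""
    index = start_index
    pt = pt0
    # Bit rates of data already in flight: the client keeps receiving the
    # old rate for delay_ticks ticks after each switch decision.
    in_flight = deque([rates_kbps[index]] * delay_ticks)
    down_switches = 0
    for _ in range(ticks):
        in_flight.append(rates_kbps[index])
        arriving_kbps = in_flight.popleft()
        # Media seconds received this tick, minus the one second played out.
        pt = max(0.0, pt + link_kbps / arriving_kbps - 1.0)
        if pt < pt_down and index > 0:
            index -= 1
            down_switches += 1
    return down_switches


if __name__ == "__main__":
    rates = [64.0, 128.0, 256.0]  # hypothetical pre-encoded versions
    for delay in (0, 3):
        n = count_down_switches(rates, link_kbps=200.0, start_index=2,
                                pt0=5.0, pt_down=4.0,
                                delay_ticks=delay, ticks=8)
        print(f"delay={delay} ticks: {n} down-switch(es)")
```

In this model the 128 kbps version is sustainable on a 200 kbps link, so a single down-switch is enough when the new rate takes effect immediately. With a three-tick delay the playout length keeps falling after the first switch, the condition PT<PTDOWN still holds at the next report, and the server drops to the lowest version.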
As a result, frequent and annoying variations in stream quality are observed by the user. Moreover, if a bit rate that is much too high has been selected, subsequent down-switches often cannot be executed fast enough, resulting in annoying rebuffering events and playout freeze. Likewise, if a bit rate that is much too low has been selected, subsequent up-switches often cannot be executed fast enough, resulting in annoying buffer overflows and associated loss of data. Even with only a single up-switch threshold and a single down-switch threshold, these sorts of problems can arise, particularly if the thresholds are set too close together.
Even more problems can arise when transmitting media content that has a variable bit rate. Conventionally, each pre-encoded version of the multimedia stream has a single bit rate, and hence the bit rate of a stream only changes if the server switches to a different stream having a faster or slower rate, as already described. However, in some cases, it is appropriate to provide streams with a varying bit rate, particularly to accommodate storage and transmission of large media files. In other words, each version of a stream may have portions at one bit rate and other portions at another. Preferably, the bit rate for individual sections of a particular version of a stream is chosen based on the content of the individual section. For example, one portion of a stream may be fairly static, permitting a low bit rate to adequately capture the content. Thereafter, a higher bit rate may be needed to adequately capture more dynamic content. By setting the bit rate of each portion of a multimedia stream based on the dynamic content of that portion of the stream, overall file size can be reduced while still adequately conveying the content.
When applying conventional stream switching techniques to variable bit rate streams, various problems can arise. In particular, the changing bit rates of the stream can compound the aforementioned problems, resulting in even more frequent and unnecessary switches, causing further annoyance to the user and, often, wasting bandwidth.
Accordingly, there is a need for an improved technique for controlling stream switching of variable bit rate data so as to provide more stable and reliable content to the user, and it is to this end that the invention is principally directed.