1. Technical Field
The invention is related to a codec-independent system for efficiently delivering media content, such as, for example, scalable coded audio and/or video content, over a network, such as the Internet or a wireless network, and in particular, to a system and method for automatically and dynamically delivering streaming media content which is optimally scaled in real time to match current network bandwidth and packet loss ratio.
2. Related Art
Reliable delivery of streaming audio or video media content or some combination thereof over an inherently unreliable packet-based network such as the Internet is a challenging task. During any given connection between a server and one or more clients, the available bandwidth between the server and any given client can vary greatly, and individual data packets representing encoded portions of the streaming media can be lost or delayed. Consequently, it is difficult to guarantee a smooth and consistent playback quality for streaming media.
For example, one common problem frequently observed with a network such as the Internet is that because such networks have very little guarantee of quality of service (QoS), data packets are often lost or delayed during transmission. Consequently, data packets comprising portions of media data files may arrive at a client either late, out of sequence, or may not arrive at all. Further, where data packets representing a media type of data file are lost or overly delayed beyond a predetermined minimum time constraint, the result is typically a degraded or irreparably damaged media file. Such loss or delay tends to produce noticeable artifacts in the media as the encoded packets are decoded and combined for playback on the client.
Another common problem is that the available bandwidth of a network such as the Internet typically fluctuates considerably over time for a variety of reasons, including network traffic, number of users, etc. Consequently, the available bandwidth between any given server and client, or any given source and destination, will typically fluctuate during any given connection session. Such variance in available bandwidth is not typically of great concern with non-media data files. With streaming media, however, the fluctuations can result in drastic changes in the quality of the media playback over time, along with noticeable artifacts in the playback as the playback quality changes.
In view of the aforementioned problems, a number of conventional media delivery schemes have been created in an attempt to deliver streaming media over a network such as the Internet. For example, one of the most basic schemes for streaming audio or video files simply compresses the file into a single bitstream. The packets representing this bitstream are then sent sequentially over the Internet from a server to a client where they are decoded, reassembled, and presented for playback. However, because the bitstream cannot typically be altered after it has been compressed, it is difficult to adapt to fluctuating network bandwidth conditions.
Several conventional schemes for streaming media files have expanded on the aforementioned media delivery scheme by using a multi-rate scheme to generate several compressed media files at different bit rates for each media file. The server then determines the available bandwidth between the server and the client, and sends the compressed media file having the highest bit rate that can be successfully transmitted using the given bandwidth. The server will then automatically change to either a higher or lower bit rate version of the media file, as appropriate, where the bandwidth between the server and client changes during transmission. One of the problems with switching to a file having a different bit rate is that there tend to be noticeable artifacts in file playback where the file bit rate is changed during playback. Another problem is that more storage space is required on the server because multiple versions of each media file, compressed at different bit rates, must be stored to account for the available bandwidth.
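The multi-rate selection described above can be illustrated with a minimal sketch. The function name `pick_bitrate`, the units, and the fallback to the lowest stored rate when nothing fits are assumptions made for illustration and are not taken from any particular conventional scheme.

```python
def pick_bitrate(available_kbps, encoded_kbps):
    """Return the highest stored bit rate that fits the available bandwidth.

    encoded_kbps lists the bit rates of the pre-compressed versions of one
    media file. If no version fits, fall back to the lowest stored rate
    (an assumed policy, chosen only for this illustration).
    """
    fitting = [rate for rate in encoded_kbps if rate <= available_kbps]
    return max(fitting) if fitting else min(encoded_kbps)
```

Calling the function again whenever the bandwidth estimate changes corresponds to the switch between versions, which is where the playback artifacts noted above arise.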
The playback provided by the aforementioned schemes has been greatly improved by the simple addition of the concept of buffering. With buffering, playback of the media file is delayed on the client for a period of time, typically measured in a number of seconds. Such buffering tends to smooth out bandwidth fluctuations, thereby reducing, but not entirely eliminating the need to sometimes switch between different media file bit rates. As with the previous schemes, data packets are sometimes lost during transmission. However, where a packet is lost during transmission, the use of a buffer typically provides a window of time during which any lost packets can be retransmitted. If the retransmitted packets are received in time, the playback of the media file is not interrupted. However, if any of the retransmitted packets are not received in time, the playback of the media file will have noticeable artifacts corresponding to the lost packets.
Because lost packets can seriously degrade media playback, several schemes have been developed to address occasional packet loss. For example, several conventional schemes use an Automatic Retransmission Request (ARQ) which retransmits lost packets after the server receives a negative acknowledgement (NACK) from the client for any given packet. Such schemes begin to degrade rapidly as the packet loss ratio increases.
Other conventional schemes address the packet loss problem by using Forward Error Correction (FEC). FEC involves the transmission of parity packets along with the data packets of the media file. These parity packets can often be used to recover or regenerate lost data packets by using the received data packets along with the parity packets to recreate lost packets. Such schemes provide for a fairly reliable delivery of streaming media where the packet loss ratio is low. However, as the packet loss ratio increases, the ability of FEC schemes to recover lost packets quickly degrades, thereby also causing the playback of the media file to degrade.
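As one illustration of the parity idea, the sketch below uses a single XOR parity packet over a group of equal-length data packets. This is only one simple form of FEC, chosen here as an assumption; conventional schemes may use stronger codes. The function names are hypothetical.

```python
def xor_parity(packets):
    """Build one parity packet as the byte-wise XOR of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(received, parity):
    """Recreate the single lost packet in a group, if exactly one is lost.

    received holds the group's packets in order, with None marking a loss.
    Returns None when zero or more than one packet is missing, since one
    XOR parity packet can repair only a single loss.
    """
    missing = [i for i, packet in enumerate(received) if packet is None]
    if len(missing) != 1:
        return None
    present = [packet for packet in received if packet is not None]
    return xor_parity(present + [parity])
```

The second return path shows why recovery degrades as the packet loss ratio rises: once two packets of the same group are lost, this parity packet can no longer recreate either of them.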
Related schemes for addressing the packet loss problem go a step further by using interleaving and buffer management to disperse burst errors caused by a lost packet into random errors in the bitstream which is then further corrected by using an FEC scheme. As with the aforementioned FEC schemes, these schemes ensure a fairly reliable delivery of streaming media where the packet loss ratio is low. However, as with the previous schemes, as the packet loss ratio increases or fluctuates widely, the ability of these schemes to correct for lost packets quickly degrades, thereby again causing the playback of the media file to be degraded.
Still other schemes have achieved even better results for streaming media files over a network such as the Internet by using the concept of scalable audio or video coding. With scalable coding of audio or video, the compressed bitstream is comprised of a number of layers of decreasing importance level. As the bandwidth between the server and the client increases, packets representing more layers are transmitted. Conversely, as the bandwidth decreases, fewer packets representing layers are transmitted. Decoding of the media file can be achieved using only a subset of the layers. However, only switching among layers does not achieve an optimum transmission performance for the scalable coded media. Since there is no special processing of the lost packets, the quality of the decoded media will decrease rapidly as the packet loss ratio increases.
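The layer switching described above can be sketched as follows, assuming that each layer has a known bit rate and that layers are listed in decreasing order of importance. The function name and units are hypothetical.

```python
def layers_to_send(layer_kbps, available_kbps):
    """Count how many layers, taken in order of importance, fit the bandwidth.

    layer_kbps[0] is the most important layer. Layers are added in order
    until the next one would exceed the available bandwidth.
    """
    count = 0
    used = 0
    for rate in layer_kbps:
        if used + rate > available_kbps:
            break
        used += rate
        count += 1
    return count
```

Note that this selection looks only at bandwidth. It takes no account of which packets were actually lost, which is the shortcoming identified above.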
Therefore, what is needed is a system and method for reliably delivering streaming audio or video media content or some combination thereof over a network such as the Internet. Such a system should automatically account for fluctuations in available bandwidth between the server and client while maximizing the quality of streamed media files during client playback. Further, such a system should automatically minimize any degradation of streamed media files caused by packet loss during network transmission of the streamed media files.
The present invention involves a new system and method which solves the aforementioned problems, as well as other problems that will become apparent from an understanding of the following description, by providing a network aware rate-distortion optimization solution for streaming media files over a network such as, for example, the Internet or other wired or wireless network. The problems addressed include bandwidth fluctuations between a server and one or more clients, and packet loss during streaming of media files between the server and the clients.
A network aware rate-distortion optimization system and method according to the present invention uses any conventional scalable coding scheme to first generate at least one encoded bitstream consisting of a number of Data Units (DUs) for at least one media file. As is well known to those skilled in the art, the bitstream of a scalably encoded media file can be truncated at any point while still allowing decoding of the received portion of the bitstream. In other words, a set or subset of the DUs comprising any bitstream of a scalably encoded media file can be used to reconstruct the encoded media file at various levels of resolution or quality.
The contribution of each of the DUs to the overall quality of the decoded media file is first calculated, with higher scores being assigned to those DUs having a greater influence on the quality of the decoded media file. In particular, those DUs providing the greatest decrease in distortion of the decoded media file will receive higher scores. Further, in one embodiment, the size of a particular DU is also used to determine the score for that DU. In particular, scores are reduced in proportion to the size of a particular DU, as it is more expensive, in terms of bandwidth, to send a single large DU than it is to send a number of smaller DUs.
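One possible reading of this size-adjusted scoring is a ratio of distortion decrease to size. The sketch below assumes that reading; the `DataUnit` fields and the `base_score` function are hypothetical names, and the exact scoring formula is not specified here.

```python
from dataclasses import dataclass


@dataclass
class DataUnit:
    name: str
    distortion_decrease: float  # quality gained if this DU is decoded
    size_bytes: int             # bandwidth cost of sending this DU


def base_score(du):
    """Score a DU as distortion decrease per byte (an assumed formula)."""
    return du.distortion_decrease / du.size_bytes
```

Under this assumption, two DUs that offer the same distortion decrease are ranked by size, with the smaller one scoring higher.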
In alternate embodiments of the present invention, additional elements are also considered in scoring DUs. For example, an element which is used in one embodiment of the present invention for scoring DUs is called a "reliance factor." This reliance factor accounts for the fact that while a truncated bitstream of a scalably encoded media file can be decoded, any portion of a bitstream that relies on a missing DU cannot be decoded. In other words, the reliance factor accounts for the fact that one or more current DUs may rely on the receipt of one or more prior DUs before they can be decoded.
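The dependency that the reliance factor captures can be illustrated with a simple decodability check. This sketch does not compute the reliance factor itself; it only shows the underlying rule that a DU is usable when it and everything it relies on have been received. The `depends_on` mapping and the function name are hypothetical, and dependencies are assumed to contain no cycles.

```python
def is_decodable(name, depends_on, received):
    """Return True if the DU and every DU it relies on have been received.

    depends_on maps a DU name to the names of the DUs it relies on.
    received is the set of DU names that have arrived at the client.
    """
    if name not in received:
        return False
    return all(
        is_decodable(prior, depends_on, received)
        for prior in depends_on.get(name, ())
    )
```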
In another embodiment, a "sent status" is included in scoring DUs. The sent status is simply an indication of whether a DU has been sent, or whether its receipt has been negatively acknowledged, i.e., a NACK. This NACK is simply part of a conventional ACK/NACK network protocol for determining whether network packets have been received by a client after being sent from a server. The sent status helps to reduce potentially wasted use of the available bandwidth by eliminating duplicate sends of DUs that have already been sent without receiving a NACK.
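A minimal sketch of how the sent status might filter candidates follows. The three-state enumeration and the function name are assumptions made for illustration.

```python
from enum import Enum


class SentStatus(Enum):
    UNSENT = "unsent"   # never transmitted
    SENT = "sent"       # transmitted, no NACK received
    NACKED = "nacked"   # transmitted, client reported it missing


def is_send_candidate(status):
    """A DU is worth sending unless it was already sent without a NACK."""
    return status is not SentStatus.SENT
```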
Still another embodiment of the present invention includes the use of a probability of on-time delivery for particular DUs in computing the score for those DUs. For example, where a DU is delivered too late to be decoded for playback of a streamed media file, the transmission of that DU is simply a waste of bandwidth, as it is not usable when it is late. Such bandwidth could have been better used to transmit other usable packets. As the probability of on-time delivery decreases, the score for the particular DU will also decrease.
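How the probability of on-time delivery is estimated is not specified above. Purely as an illustration, the sketch below assumes a fixed minimum network delay plus an exponentially distributed extra delay, and treats loss as independent of delay. The function name, every parameter, and the delay model itself are assumptions.

```python
import math


def on_time_probability(time_left_s, min_delay_s, mean_extra_delay_s, loss_ratio):
    """Estimate the chance that a DU sent now arrives before it is needed.

    time_left_s: seconds until the DU must be available for decoding.
    min_delay_s: assumed fixed one-way delay.
    mean_extra_delay_s: assumed mean of an exponential extra delay.
    loss_ratio: current packet loss ratio, between 0 and 1.
    """
    if time_left_s <= min_delay_s:
        return 0.0  # cannot arrive in time even with no extra delay
    slack = time_left_s - min_delay_s
    arrives_in_time = 1.0 - math.exp(-slack / mean_extra_delay_s)
    return (1.0 - loss_ratio) * arrives_in_time
```

Multiplying a DU's score by this probability would lower the score as the deadline approaches, which matches the behavior described above.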
Finally, in yet another embodiment, a "balance factor" is used to address the importance of near future time slots. In particular, those DUs that are required more immediately if they are to be useful for improving the rate-distortion of a streamed media file are considered to be more important than those DUs having far future time slots. In other words, it is more critical to deliver DUs which are to be used sooner rather than delivering those DUs which are to be used later. Thus, in this embodiment, the scoring of individual DUs is adjusted to reflect the urgency of sending the DU if it is to be used. This element serves to balance between the quality of the streamed media file and error robustness.
It should be noted that the scoring of DUs is dynamic in the sense that the scores of particular DUs may change over time, as the scores for DUs are computed repeatedly during the transmission over the network. The elements and factors described above may change over time as the network conditions, receipt status of the DU and the time to play the DU change, thereby potentially changing the scores of particular DUs.
The system and method according to the present invention then automatically adapts to network bandwidth fluctuations by transmitting either more or fewer data packets representing DUs of the encoded media file depending on the available bandwidth. Data units representing the media file are streamed as packets from the server to the clients based on the score calculated for each DU. In particular, those DUs having a higher score, and thus a greater influence on the quality of the decoded media file are transmitted first, with less important DUs being transmitted as allowed by the available bandwidth. In other words, those DUs that offer the greatest distortion decrease per coding rate, and thus have a higher calculated score, are sent prior to those DUs that have a lower calculated score and thus offer a lesser rate-distortion decrease to the decoded media file. Again, as noted above, additional factors may also be used in determining scores for each DU.
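The score-ordered transmission described above can be sketched as a greedy selection for one time slot. The tuple layout, the byte budget, and the choice to skip a DU that does not fit and continue with smaller ones are assumptions made for illustration.

```python
def select_for_slot(candidates, budget_bytes):
    """Pick DUs for one time slot, highest score first, within a byte budget.

    candidates is a list of (name, score, size_bytes) tuples.
    Returns the names of the chosen DUs in transmission order.
    """
    chosen = []
    remaining = budget_bytes
    for name, _score, size in sorted(candidates, key=lambda c: c[1], reverse=True):
        if size <= remaining:
            chosen.append(name)
            remaining -= size
    return chosen
```

With a larger budget the same call returns more DUs, and with a smaller budget fewer, which is how this sketch adapts to bandwidth fluctuations.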
Further, in order to deal with the potentially severe packet loss that is commonly observed during network transmission, only the more important lost DUs are retransmitted. Thus, the system and method of the present invention also determines which DUs, and thus which lost packets, if retransmitted, would be most beneficial to the reconstruction of the media file by providing the greatest decrease in distortion of the decoded media file. This determination of importance is based on which DUs have already been received by the client, which DUs have been lost, and any relationship between DUs, using the scoring criteria described above.
For example, assume that three sets of DUs (DUa, DUb, and DUc) are represented by three corresponding data packets that are transmitted from a server to a client. Now, assume that the packets representing both DUa and DUc are lost during transmission, while the client successfully receives the data packet representing DUb. Finally, assume that decoding of DUa is independent of the other DUs, while the decoding of DUb is dependent on having received DUc. If the decoding of DUb and DUc together provides a better representation of the media file than the decoding of DUa by itself, then in this case DUc is more important than DUa, and thus should have a higher score than DUa, even though without the receipt of DUb, DUc by itself may have a lower score than DUa. This is true because DUc allows the decoding of DUb, and the combination of DUb and DUc offers higher quality than DUa alone. Consequently, DUc will be retransmitted prior to DUa. Of course, given both sufficient transmission time and bandwidth, DUa will also be retransmitted.
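The example can be worked through numerically. The sketch below assigns hypothetical quality gains to DUa, DUb, and DUc, assumes equal packet sizes, and ranks lost DUs by how much decodable quality each would add given what the client already holds. The function names and the gain values are illustrative assumptions only.

```python
def decodable(received, depends_on):
    """Return the received DUs whose dependencies are all decodable too."""
    ok = set()
    changed = True
    while changed:
        changed = False
        for du in received:
            if du not in ok and all(d in ok for d in depends_on.get(du, ())):
                ok.add(du)
                changed = True
    return ok


def quality(received, gain, depends_on):
    """Total gain of the DUs that the client can actually decode."""
    return sum(gain[du] for du in decodable(received, depends_on))


def retransmit_order(lost, received, gain, depends_on):
    """Order lost DUs by the quality each would add if retransmitted alone."""
    base = quality(received, gain, depends_on)
    benefit = {
        du: quality(received | {du}, gain, depends_on) - base for du in lost
    }
    return sorted(lost, key=lambda du: benefit[du], reverse=True)


gain = {"DUa": 5.0, "DUb": 4.0, "DUc": 3.0}   # hypothetical values
depends_on = {"DUb": ["DUc"]}                  # DUb cannot decode without DUc
order = retransmit_order(["DUa", "DUc"], {"DUb"}, gain, depends_on)
```

Here DUc alone is worth less than DUa (3.0 versus 5.0), but retransmitting DUc also makes the already received DUb decodable, for a combined benefit of 7.0, so `order` places DUc ahead of DUa.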
Consequently, a system and method according to the present invention uses a system of rate-distortion based packet selection to maximize the quality of a streamed media file. Again, with respect to bandwidth, those packets representing DUs having a higher score and thus having a greater contribution to file quality are transmitted prior to those packets having a lower score. Further, with respect to lost packets, those packets that will provide the maximum distortion decrease per rate transmitted based on information of the already received packets are considered to be more important, and will be retransmitted prior to other packets which, if transmitted in the same time slot, would provide a lesser rate-distortion tradeoff. In this manner, a system and method according to the present invention efficiently and reliably delivers streaming media content over the network while automatically accounting for both fluctuating network bandwidth and packet loss.
In addition to the just described benefits, other advantages of the present invention will become apparent from the detailed description which follows hereinafter when taken in conjunction with the accompanying drawing figures.