Streaming media delivery is becoming increasingly important as high-quality audio and video are more commonly delivered over packet-based networks, such as the Internet, cellular and wireless networks, powerline networks, and many other networks. The quality of the delivered streaming media depends on a number of factors, including the quality of the original content, the encoding quality of the original content, the capabilities of the receiving devices to decode and display the video, and the timeliness and quality of the signal received at the receivers. To create a perceived good streaming media experience, transport and timeliness of the signal received at receivers are especially important. Good transport provides fidelity of the stream received at the receiver compared to what is sent from the sender, while timeliness represents how quickly a receiver can start playing out the content after the initial request for that content.
Recently, it has become common practice to consider using Forward Error Correction (FEC) codes for protection of streaming media during transmission. When sent over a packet network, examples of which include the Internet and wireless networks such as those standardized by groups such as 3GPP, 3GPP2 and DVB, the source stream is placed into packets as it is generated or made available, and thus the packets are used to carry the source or content stream in the order it is generated or made available to receivers.
In a typical application of FEC codes to these types of scenarios, an encoder uses an FEC code in the creation of repair packets, which are then sent in addition to the original source packets containing the source stream. The repair packets have the property that, when source packet loss occurs, received repair packets can be used to recover the data contained in the lost source packets. Repair packets can be used to recover the content of source packets that are lost entirely, but might also be used to recover from partial packet loss, using either entirely received repair packets or even partially received repair packets. Thus, wholly or partially received repair packets can be used to recover wholly or partially lost source packets.
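The recovery property of repair packets can be illustrated with a minimal sketch. It assumes the simplest possible erasure code, a single XOR parity packet over equal-length source packets, which can recover exactly one lost source packet. This is an illustration only, not the Reed-Solomon or chain reaction codes discussed in this document, and the function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_repair_packet(source_packets: list[bytes]) -> bytes:
    """Build one repair packet as the XOR of all source packets."""
    return reduce(xor_bytes, source_packets)


def recover_lost_packet(received: dict[int, bytes], repair: bytes) -> bytes:
    """Recover the single missing source packet.

    XORing the repair packet with every source packet that did arrive
    cancels those packets out and leaves the missing one.
    """
    return reduce(xor_bytes, received.values(), repair)


# Three source packets are sent along with one repair packet.
source = [b"AAAA", b"BBBB", b"CCCC"]
repair = make_repair_packet(source)

# Source packet 1 is lost in transit; packets 0 and 2 arrive.
received = {0: source[0], 2: source[2]}
recovered = recover_lost_packet(received, repair)
```

Here `recovered` equals the lost packet `b"BBBB"`. A single parity packet cannot repair two losses; stronger codes generate more repair packets so that more losses can be tolerated.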
In yet other examples, other types of corruption can occur to the sent data, e.g., values of bits may be flipped, and thus repair packets may be used to correct such corruption and provide as accurate as possible recovery of the source packets. In other examples, the source stream is not necessarily sent in discrete packets, but instead may be sent for example as a continuous bit-stream.
There are many examples of FEC codes that can be used to provide protection of a source stream. Reed-Solomon codes are well-known codes for error and erasure correction in communication systems. For erasure correction over, for example, packet data networks, a well-known efficient implementation of Reed-Solomon codes uses Cauchy or Vandermonde matrices as described in L. Rizzo, “Effective Erasure Codes for Reliable Computer Communication Protocols”, Computer Communication Review, 27(2):24-36 (April 1997) (hereinafter “Rizzo”) and Bloemer et al., “An XOR-Based Erasure-Resilient Coding Scheme”, Technical Report TR-95-48, International Computer Science Institute, Berkeley, Calif. (1995) (hereinafter “XOR-Reed-Solomon”) or elsewhere.
Other examples of FEC codes include LDPC codes, chain reaction codes such as those described in Luby I, and multi-stage chain reaction codes such as those described in Shokrollahi I.
Examples of the FEC decoding process for variants of Reed-Solomon codes are described in Rizzo and XOR-Reed-Solomon. In those examples, decoding is applied after sufficient source and repair data packets have been received. The decoding process may be computationally intensive and, depending on the CPU resources available, this may take considerable time to complete, relative to the length of time spanned by the media in the block. The receiver must take into account this length of time required for decoding when calculating the delay required between the start of reception of the media stream and play-out of the media. This delay due to decoding is perceived by the user as a delay between their request for a particular media stream and the start of playback. It is thus desirable to minimize this delay.
In many applications, packets are further subdivided into symbols on which the FEC process is applied. A packet can contain one or more symbols (or less than one symbol, although usually symbols are not split across packets). A symbol can have any size, but often the size of a symbol is at most equal to the size of the packet. Source symbols are those symbols that encode the data that is to be transmitted. Repair symbols are symbols generated, directly or indirectly, from source symbols and are in addition to the source symbols (i.e., the data to be transmitted can be entirely recovered if all of the source symbols are available and none of the repair symbols are available).
Some FEC codes are block-based, in that encoding operations depend on the symbol(s) that are in a block and can be independent of the symbols not in that block. With block-based encoding, an FEC encoder can generate repair symbols for a block from the source symbols in that block, then move on to the next block and not need to refer to source symbols other than those for the current block being encoded. In a transmission, a source block comprising source symbols can be represented by an encoded block comprising encoded symbols (which might be some source symbols, some repair symbols, or both). With the presence of repair symbols, not all of the source symbols are required in every encoded block.
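The relationship between data, source symbols, source blocks, and encoded blocks can be sketched as follows. The sketch assumes fixed-size symbols, zero padding of the final symbol, and a single XOR repair symbol per block; all of these are illustrative choices, and the function names are hypothetical. The point it shows is that each block is encoded using only its own source symbols.

```python
def split_into_symbols(data: bytes, symbol_size: int) -> list[bytes]:
    """Cut data into fixed-size source symbols, zero-padding the last one."""
    return [
        data[i:i + symbol_size].ljust(symbol_size, b"\x00")
        for i in range(0, len(data), symbol_size)
    ]


def split_into_blocks(symbols: list[bytes], symbols_per_block: int) -> list[list[bytes]]:
    """Group consecutive source symbols into source blocks."""
    return [
        symbols[i:i + symbols_per_block]
        for i in range(0, len(symbols), symbols_per_block)
    ]


def encode_block(source_symbols: list[bytes]) -> list[bytes]:
    """Return an encoded block: the source symbols plus one XOR repair symbol.

    Only the symbols of this block are read, so blocks can be encoded
    one after another without referring back to earlier blocks.
    """
    repair = bytes(len(source_symbols[0]))  # all-zero symbol to start
    for symbol in source_symbols:
        repair = bytes(x ^ y for x, y in zip(repair, symbol))
    return source_symbols + [repair]


data = bytes(range(20))                                   # 20 bytes of source data
symbols = split_into_symbols(data, symbol_size=4)         # 5 source symbols
blocks = split_into_blocks(symbols, symbols_per_block=2)  # blocks of 2, 2, and 1 symbols
encoded_blocks = [encode_block(block) for block in blocks]
```

Concatenating the source symbols alone reproduces the original data, which matches the definition of repair symbols as being in addition to the source symbols.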
For some FEC codes, notably Reed-Solomon codes, the encoding and decoding time becomes impractical as the number of encoded symbols per source block grows. Thus, in practice, there is often an upper bound (255 is an approximate practical limit for some applications) on the total number of encoded symbols that can be generated per source block, especially in a typical case where the Reed-Solomon encoding or decoding process is performed by custom hardware. For example, the MPE-FEC processes that use Reed-Solomon codes included as part of the DVB-H standard for protecting streams against packet loss are implemented in specialized hardware within a cell phone that is limited to 255 total Reed-Solomon encoded symbols per source block. Since symbols are often required to be placed into separate packet payloads, this places a practical upper bound on the maximum length of the source block being encoded. For example, if a packet payload is limited to 1024 bytes or less and each packet carries one encoded symbol, then an encoded source block can be at most 255 KB (kilobytes, i.e., 255 × 1024 = 261,120 bytes), and this is also, of course, an upper bound on the size of the source block itself.
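The size bound in the preceding example can be worked through numerically. The sketch below uses the 255-symbol and 1024-byte figures from the text; the number of repair symbols passed to the helper is a hypothetical value chosen only to show how the source block bound tightens once some of the 255 symbols are used for repair.

```python
MAX_ENCODED_SYMBOLS = 255   # total encoded symbols per source block (from the text)
MAX_PAYLOAD_BYTES = 1024    # one encoded symbol per packet payload (from the text)

# Upper bound on the encoded block, and therefore on the source block.
max_encoded_block_bytes = MAX_ENCODED_SYMBOLS * MAX_PAYLOAD_BYTES
max_encoded_block_kb = max_encoded_block_bytes / 1024


def max_source_block_bytes(repair_symbols: int) -> int:
    """Source block bound when some of the encoded symbols are repair symbols."""
    return (MAX_ENCODED_SYMBOLS - repair_symbols) * MAX_PAYLOAD_BYTES


# Hypothetical split: 64 of the 255 encoded symbols carry repair data.
source_bound = max_source_block_bytes(repair_symbols=64)
```

With these numbers, `max_encoded_block_bytes` is 261,120 bytes (255 KB), and reserving 64 symbols for repair leaves at most 195,584 bytes of source data per block.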
Other concerns include being able to decode the source blocks fast enough to keep up with the source streaming rate, minimizing the decoding latency introduced by FEC decoding, and using only a small fraction of the available CPU on the receiving device at any point in time during FEC decoding.
Other concerns include the ability to start playing out a stream, for example decoding and rendering received audio and video streams using a personal computer and displaying the video on a computer screen and playing the audio through built in speakers, or as another example decoding and rendering received audio and video streams using a set top box and displaying the video on a television display device and playing the audio through a stereo system. A primary concern is to minimize the delay between when a user decides to watch a new content delivered as a stream and when the content starts playing, hereinafter called the “content zapping time”. An example of content zapping is when a user is watching a first content delivered via a first stream and then the user decides to watch a second content delivered via a second stream and initiates an action to start watching the second content. The second stream may be sent from the same set or a different set of servers as the first stream. Another example of content zapping is when a user is visiting a website and decides to start watching a first content delivered via a first stream by clicking on a link within the browser window. Another example of content zapping is when a user wants to seek to and start viewing at a new position, either forwards or backwards, within the same content stream. Minimizing content zapping time is important for video watching to allow users a high quality fast content surfing experience when searching and sampling a wide range of available contents. A high quality quick content surfing experience is often positively correlated with the amount of content that users consume.
It is often the case that a primary contributor to the content zapping time is the underlying FEC structure. Another concern is minimizing the time gap between the end of the playing of one piece of content and the start of the play of another piece of content, which is preferably consecutive with little or no pause. For example, where the one piece of content is a broadcast show and the next piece of content is an advertisement, or vice versa, a long gap (herein referred to as the “content transition time”) between their play-out is undesirable.
Clearly, minimizing the content transition time is desirable, while at the same time minimizing the rate of the streaming to the receiver during the periods of time surrounding the transition.
Another concern is to maximize the quality of the delivered stream when using a best effort delivery network such as the Internet which can drop packets and which can cause wide variations in the amount of time it takes for packets to be delivered, while at the same time minimizing usage of network resources such as bandwidth.
Another concern is to provide a robust and scalable streaming delivery solution that allows components of the system to fail without adversely affecting the quality of the streams delivered to receivers.
Interleaving can be used to provide superior protection against channel imperfections such as intermittent packet loss. For example, packet loss is often somewhat bursty, and thus spreading out a source block over longer periods of time can be advantageous. For some FEC codes the native use of large source blocks is practical, but for other FEC codes such as Reed-Solomon codes there are often practical limitations on the size of a source block that can be used. Thus, to spread out the transmission of packets associated with a source block over a longer interval of time, it can be advantageous to interleave the sending of packets containing encoded symbols for different source blocks.
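The benefit of interleaving against bursty loss can be seen in a small sketch. It assumes a simple round-robin interleaver over equally sized encoded blocks and a single contiguous loss burst; both are illustrative assumptions, not a specific scheme from the references, and the function names are hypothetical. Packets are modeled as `(block_id, index)` tuples.

```python
from itertools import zip_longest


def interleave(blocks):
    """Round-robin order: packet 0 of every block, then packet 1 of every block, ..."""
    order = []
    for column in zip_longest(*blocks):
        order.extend(p for p in column if p is not None)
    return order


def losses_per_block(send_order, burst_start, burst_len):
    """Count how many packets each block loses to one contiguous burst."""
    counts = {}
    for block_id, _ in send_order[burst_start:burst_start + burst_len]:
        counts[block_id] = counts.get(block_id, 0) + 1
    return counts


# Three encoded blocks of six packets each.
blocks = [[(b, i) for i in range(6)] for b in range(3)]

sequential = [p for block in blocks for p in block]  # block after block
interleaved = interleave(blocks)                     # blocks mixed together

# The same 3-packet burst hits both sending orders at the same position.
seq_loss = losses_per_block(sequential, burst_start=0, burst_len=3)
int_loss = losses_per_block(interleaved, burst_start=0, burst_len=3)
```

Sent sequentially, the burst removes three packets from block 0; interleaved, it removes one packet from each of the three blocks, which a small amount of repair data per block is more likely to cover. The cost is that each block takes three times as long to arrive in full.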
Previously, methods have been introduced which address some of the concerns expressed above. For example, some novel FEC source block formation and interleaving methods are described in Luby II. Some interleaving methods are static, in the sense that the amount of interleaving is fixed for the entire stream. Thus, there is sometimes a trade-off between the amount of interleaving, which affects the quality of the protection offered by such methods, and the content zapping time, i.e., greater amounts of interleaving provide better stream protection but longer content zapping times, and this trade-off is determined in a fixed way for the entire duration of the streaming to a receiver.
There are some methods that provide short content zapping times and greater amounts of interleaving during the majority of the stream sending process, for example some methods described in Watson. Some of the methods described in Watson dynamically transition from short initial source blocks to increasingly longer source blocks, and during the transition period send at a slightly faster rate than the content streaming rate. Such methods provide a short content zapping time while at the same time allowing for the build-up of the quality of protection provided as the stream progresses. For example, one way of applying some of the methods described in Watson is to determine the source block structure and perform the FEC encoding while the stream is being sent, i.e., the short-to-long source block structure is determined, and the source blocks are FEC encoded, as they are being sent to individual receivers, starting at whatever point each receiver accesses the stream. As a result, the source block structure formation and FEC encoding are performed uniquely for each receiver, and the stream sent to each receiver is unique. However, it is sometimes desirable to have the source block structure of the content stream determined independently of the delivery of the stream, e.g., independently of the receivers, independently of when the content is viewed and where in the content stream the viewing commences, and independently of the order in which the data within the stream is delivered. This is especially important if a content stream is to be delivered from multiple servers to a single receiver.
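The idea of growing from short to long source blocks while sending slightly faster than the content rate can be sketched with a simplified timing model. This is not the method of Watson; it is a hypothetical illustration under stated assumptions: the sender transmits at `speedup` times the content rate, the receiver starts playing as soon as the first block has arrived, each later block must be fully received before its playout time, and decoding time and repair overhead are ignored. Under these assumptions, a block can be no longer than the first block's duration plus `(speedup - 1)` times the media duration already sent.

```python
def growing_block_durations(first_block_s, speedup, max_block_s, count):
    """Media duration, in seconds, of each successive source block.

    Hypothetical model: each block is made as long as the playout
    deadline allows, capped at max_block_s.
    """
    durations = []
    sent_so_far = 0.0  # seconds of media in all earlier blocks
    for _ in range(count):
        # Longest block that still arrives before it must start playing.
        limit = first_block_s + (speedup - 1.0) * sent_so_far
        duration = min(limit, max_block_s)
        durations.append(duration)
        sent_so_far += duration
    return durations


# Start with 1-second blocks, send 10% faster than the content rate,
# and stop growing once blocks reach 10 seconds.
durations = growing_block_durations(
    first_block_s=1.0, speedup=1.1, max_block_s=10.0, count=4
)
```

In this model the content zapping time is governed by the short first block, while later blocks grow by roughly the speedup factor each step until they reach the cap. Because the schedule depends on where a receiver joins, it shows why such a structure tends to be unique per receiver.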
Thus, it is desirable to have improved processes and apparatus.