The use of communication networks to distribute entertainment, collectively referred to as ‘multimedia entertainment content’ or ‘content’, continues to gain popularity, fuelled by the decreasing cost of equipment and of bandwidth to the home, and by the emergence of interactive personalized services. These services include TV programming, pay-per-view (PPV), video-on-demand (VoD), and games, as well as Internet access.
Because multimedia files tend to be large, the content is currently packaged in information streams, which are transmitted to the user via a broadband communication network.
Sequences of images in a video stream often contain pixels (picture elements) that are very similar or identical, such as the images of a green lawn, a blue sky, etc. Compression and motion compensation protocols, of which MPEG is the most widespread today, are typically used to reduce this redundancy between adjacent images and so make better use of transmission bandwidth. The video and audio specifications for compression/decompression (encoding/decoding) protocols give the syntax and semantics of encoded streams necessary for communicating compressed digital content as well as for storing and playing such video on media in a standard format.
It is to be noted that this invention is applicable to any multimedia stream format that incorporates milestones within the stream, that is, milestones that can be identified by the decoder and used to synchronize stream startup upon channel change. The ensuing description refers to MPEG (Moving Picture Experts Group, described in ISO/IEC 11172) and/or to MPEG2 (described in ISO/IEC 13818) transport streams to describe and illustrate the invention by way of example only, given that MPEG protocols are popular today.
To compress (encode) a stream carrying multimedia entertainment content, discrete samples in the stream are transformed into a bit-stream of tokens, which uses less bandwidth than the corresponding initial stream, since essentially only data that has changed from image to image is captured in the compressed stream, instead of capturing all information from each image. The signal is broken into conveniently sized data blocks (frames, or packets), and header information is added to each data block; the header identifies the start of the packets and must include time-stamps because packetizing disrupts the time axis.
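The idea of capturing only the data that has changed from image to image can be illustrated with a toy sketch. This is not the MPEG algorithm; the token names `key` and `delta`, the functions `encode` and `decode`, and the representation of an image as a flat list of pixel values are all assumptions made for illustration, and all images are assumed to have the same length.

```python
def encode(frames):
    """Yield one full 'key' token for the first image, then 'delta' tokens
    holding only the pixel positions whose value changed."""
    previous = None
    for frame in frames:
        if previous is None:
            yield ("key", list(frame))
        else:
            changes = {i: new for i, (old, new) in enumerate(zip(previous, frame))
                       if old != new}
            yield ("delta", changes)
        previous = frame


def decode(tokens):
    """Rebuild every image; a delta is meaningless without a prior key image."""
    current = None
    for kind, payload in tokens:
        if kind == "key":
            current = list(payload)
        else:
            if current is None:
                raise ValueError("delta received before any key image")
            for index, value in payload.items():
                current[index] = value
        yield list(current)


frames = [[1, 1, 1], [1, 2, 1], [1, 2, 3]]
tokens = list(encode(frames))
restored = list(decode(tokens))
```

A decoder that starts from a `delta` token has nothing to apply the changes to, which is the toy equivalent of the need to wait for key data after a channel change.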
The multimedia encoding/decoding format tells the decoder (receiver) how to inverse-represent the compacted stream back into data resembling the original stream of un-transformed data, so that the data may be heard and viewed in its normal form. However, if the decoder is not reset on channel change, it will display noise when channels are switched. Hence, the receiver needs to delay processing video packets from the new channel until a certain pointer (also referred to as key data or a milestone) showing the start of a data block is received.
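The delayed start described above can be sketched as a filter that discards everything received before the first milestone. The function name `packets_after_milestone`, the predicate `is_milestone`, and the string labels used as packets are hypothetical; a real receiver would inspect packet headers instead.

```python
from itertools import dropwhile


def packets_after_milestone(packets, is_milestone):
    """Skip packets from the new channel until the first milestone, then pass
    the milestone and everything after it to the decoder."""
    return dropwhile(lambda packet: not is_milestone(packet), packets)


# The terminal joins the new channel at a random point in the stream.
joined_stream = ["delta", "delta", "key", "delta", "delta"]
decodable = list(packets_after_milestone(joined_stream, lambda p: p == "key"))
```

In this sketch the two packets received before the milestone are dropped, and the time spent receiving them is pure channel change delay.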
An MPEG transport stream includes one or more video and audio packetized elementary streams (PES), each PES including an independent timebase for clock recovery and audio/video synchronization information. The transport stream also includes program specific information (PSI), which describes the elementary streams that need to be combined to build programs. Conditional access information in each stream enables selective access to the programs and to data services which may be associated with the programs. The PSI includes a Program Association Table (PAT), Program Map Tables (PMT) and Conditional Access Tables (CAT). The PAT includes data that the decoder uses to determine which programs (also referred to as channels) exist in the respective transport stream. The PAT points to a number of PMTs (one per program), each of which, in turn, points to the video, audio, and data content of a respective program carried by the stream. A CAT is used for a scrambled stream.
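The two-step lookup from PAT to PMT to elementary streams can be sketched with plain dictionaries. The table contents, program numbers, PID values, and the function name `elementary_pids` below are hypothetical; real tables are carried as binary sections that must be parsed first.

```python
# Hypothetical, already-parsed tables.
# PAT: program number -> PID of the packets carrying that program's PMT.
pat = {1: 0x0100, 2: 0x0200}

# PMTs, keyed by the PID they are carried on: content type -> elementary PID.
pmts = {
    0x0100: {"video": 0x0101, "audio": 0x0102},
    0x0200: {"video": 0x0201, "audio": 0x0202, "data": 0x0203},
}


def elementary_pids(pat, pmts, program_number):
    """Follow PAT -> PMT -> elementary streams for one program (channel)."""
    pmt_pid = pat[program_number]   # step 1: the PAT names the program's PMT
    return pmts[pmt_pid]            # step 2: the PMT names the streams


channel_2 = elementary_pids(pat, pmts, 2)
```

Both steps must complete before the decoder knows which packets to keep, so each table contributes its own waiting time to a channel change.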
Each MPEG transport packet has a fixed size and carries a packet identifier (PID); packets in the same elementary stream all have the same PID, so that the decoder can select the elementary streams it wants and reject the remainder. A PID of ‘0’ indicates that the packet contains a PAT. Currently, the elementary video, audio and data streams for the same channel use different PIDs.
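PID-based selection can be sketched as follows. The 188-byte packet length, the 0x47 sync byte, and the 13-bit PID position are taken from the MPEG-2 systems layer (ISO/IEC 13818-1) rather than from the text above; the helper names `packet_pid`, `filter_by_pid`, and `make_packet` are hypothetical, and `make_packet` builds zero-filled test packets only.

```python
TS_PACKET_SIZE = 188   # fixed transport packet length in bytes
SYNC_BYTE = 0x47       # first byte of every transport packet


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID: low 5 bits of byte 1 followed by all of byte 2."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid transport packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def filter_by_pid(packets, wanted_pids):
    """Keep only the packets whose PID the decoder has selected."""
    return [p for p in packets if packet_pid(p) in wanted_pids]


def make_packet(pid: int) -> bytes:
    """Build a zero-payload packet with the given PID (for illustration only)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))


stream = [make_packet(pid) for pid in (0x0000, 0x0101, 0x0102, 0x0201, 0x0101)]
selected = filter_by_pid(stream, {0x0101, 0x0102})
selected_pids = [packet_pid(p) for p in selected]
```

Because the PID sits in the fixed header, unwanted packets can be rejected without touching their payload.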
In general, a client (receiver, decoder, set-top box, or player) has the option to select for viewing one of a plurality of channels, which are broadcast from a head-end or streamed from a server with pre-stored content files. A channel change is performed in response to a request from the client to the server; in response, the server provides the client with the new address from which to receive the new channel. The receiver leaves the currently viewed channel and joins the new channel. Channel change speed is adversely impacted by a plurality of factors, such as PAT/PMT latency, key press propagation (from the channel selector to the server), IGMP leave/join operations latency, packet buffering and propagation, I-frame latency and frame decode and presentation times, to name a few. PAT/PMT latency refers to the time necessary for the decoder to identify and access a PAT in the transport stream, to identify in that table the PMT for the respective program, and then to access the PMT to identify the PIDs of the elementary streams. Once the PIDs for the video, audio, and data content of the respective program carried by the stream are known, the decoder starts decoding only the packets that have these PIDs.
Currently, a subscriber terminal joins a channel at a random point in the data stream and has to wait for key data structures (milestones) in the new stream before it can display fully synchronized audio and video. For an MPEG2 stream, the I-frame is one of these key data structures; the PAT and PMT are others. Since these milestone data structures are not sent very frequently, the channel change time ranges from several hundred milliseconds to a couple of seconds. As such, channel change times of less than one second are difficult to achieve with current technology. Attempts to reduce the server side delay are currently emerging.
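The effect of joining at a random point can be quantified with a minimal sketch, under the assumptions that each milestone repeats at a fixed period and that the join instant is uniformly distributed over that period. The function names and the period values in the example are hypothetical, not measurements.

```python
def single_milestone_wait(period_s):
    """Return (expected, worst-case) wait in seconds for a milestone that
    repeats every period_s seconds, given a uniformly random join instant."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return period_s / 2.0, period_s


def chained_worst_case(periods_s):
    """Upper bound on the total wait when milestones must be acquired strictly
    one after another (e.g. PAT, then PMT, then I-frame): each stage waits at
    most one full period of its own milestone."""
    return sum(periods_s)


# Hypothetical repetition periods in seconds: PAT, PMT, I-frame.
expected_i_frame, worst_i_frame = single_milestone_wait(1.0)
worst_total = chained_worst_case([0.25, 0.25, 0.5])
```

Under these assumptions the milestone wait alone can approach the sum of the repetition periods, before any network or decode latency is added.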
For example, Microsoft has proposed to connect a server (D-server concept) at the edge of a broadband network with a view to providing clients in a certain geographical area with broadcast multimedia streams. The server includes, for each stream of multimedia content, a buffer that manages and buffers multicast packets in the received stream. When a client device changes channels, it contacts the server, which in turn bursts unicast video down to the client device for approximately 20 seconds. The client device immediately begins decoding, and then it joins the appropriate multicast address for that channel and continues decoding the multimedia stream. The problem with this approach is that it requires additional servers at the edge of the network, which increases the cost of the overall solution. The other disadvantage of this solution is that it requires additional bandwidth to burst the data to the client device to obtain sub-second channel changing; this again increases the cost of the overall solution, in addition to requiring careful planning to enable the network to handle the data bursts when a terminal performs a channel change. This can be a serious problem particularly for HDTV (high definition TV) content, and especially with more than one terminal in the same house.
Another disadvantage of this approach is that the client must be aware of the server, and is not able to change channels if the server is not accessible. Also, messaging is used by the client to request and receive packets that are missing, so that the client does not have any autonomy if the connection with the server is lost for whatever reason.
There is a need for a solution that significantly reduces channel change delays (channel zapping time) without introducing server-side complexity at the edge of the network.