1. Field of the Invention
This invention generally relates to multicast or unicast media communications and, more particularly, to a system and method for creating simultaneous media playout at multiple heterogeneous clients.
2. Description of the Related Art
In a streaming media system, a server typically streams media data to multiple clients. Each client has its own capabilities, which may differ from those of other clients. For example, each client may have a different amount of buffering space available. The server streams the same media stream to all the clients in a multicast session. The goal is for all these clients, despite their different buffering capabilities, to achieve simultaneous playback of the media stream. One application of a multicast session is a home-network environment where a single audio server streams a song to multiple client devices, perhaps in different rooms of a home, so that the clients achieve a simultaneous playout of the song.
Because each client has different buffering capabilities, the server needs to adapt its streaming whenever a client joins or leaves the session. When a media packet is transmitted from the server to the clients, the packet may experience a variable delay before reaching the clients. This variation in the end-to-end delay is called jitter. As noted by Schulzrinne et al., “RTP: A transport protocol for real-time applications”, IETF, Jan. 2000, the Sender Report (SR) and Receiver Report (RR) packets of the Real-time Transport Control Protocol (RTCP), which is part of the Real-time Transport Protocol (RTP), carry an interarrival jitter field that captures the mean deviation of the difference in packet spacing at the client compared to the server for a pair of packets. To compensate for the jitter, a streaming media client often buffers a certain number of packets, and then plays them out at scheduled playout times. A typical streaming media system often transports packets using RTP on top of User Datagram Protocol (UDP). UDP provides an unreliable service where transmitted packets can get lost, arrive out of order, or get duplicated. Client-side buffering of data in a streaming media system helps to alleviate out-of-order packet delivery, because the client can rearrange the buffered packets. The client can also discard duplicate packets. The buffering is also helpful if a disruption in the streaming during the session interrupts stream reception for a short time.
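By way of illustration only, the jitter measurement and the client-side reordering described above can be sketched as follows. The running estimate uses the smoothing rule given in the RTP specification, J = J + (|D| - J)/16, where D is the change in transit time between consecutive packets. The class and function names are hypothetical, arrival times are assumed to be expressed in the same units as the RTP timestamps, and sequence-number wraparound is ignored for brevity.

```python
class InterarrivalJitter:
    """Running interarrival jitter estimate, in RTP timestamp units."""

    def __init__(self):
        self.jitter = 0.0
        self._prev_transit = None  # transit time of the previous packet

    def update(self, rtp_timestamp, arrival_time):
        """Fold one received packet into the estimate and return it."""
        transit = arrival_time - rtp_timestamp
        if self._prev_transit is not None:
            # D: difference in packet spacing at the receiver vs. the sender.
            d = abs(transit - self._prev_transit)
            # Smoothing gain of 1/16, as in the RTP specification.
            self.jitter += (d - self.jitter) / 16.0
        self._prev_transit = transit
        return self.jitter


def reorder_and_dedupe(packets):
    """Return (seq, payload) pairs sorted by sequence number.

    Duplicates are dropped, keeping the first copy received.
    Assumption: sequence numbers do not wrap within the buffered window.
    """
    unique = {}
    for seq, payload in packets:
        unique.setdefault(seq, payload)
    return sorted(unique.items())
```

A client of this kind would call `update` once per received packet and apply `reorder_and_dedupe` to the buffered packets before scheduling them for playout.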
The amount of buffering done at the client side in a streaming media system is based mainly on two factors: the available client buffer size, and the delay that the user can tolerate between the time of the request and the time the media actually starts playing. The type of media encoding used may also impose a certain amount of delay before a client can actually start to decode the received packets. This is typically the case for video encoded using any of the popular video encoding standards (e.g. MPEG1, MPEG2, MPEG4, H.263(+)), where the frames are encoded independently (Intra=I frames) or by referring to other frames (Inter=P,B frames). The delay may also depend upon whether the media being played out is a live stream or an on-demand archived stream.
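As a rough sketch of how the two factors interact, the amount of media a client pre-buffers before starting playout may be bounded both by what fits in its buffer and by the start-up delay the user accepts. The function name, the parameters, and the constant-bitrate assumption below are illustrative and are not taken from any particular system.

```python
def prebuffer_seconds(buffer_bytes, bitrate_bps, max_startup_delay_s):
    """Seconds of media to buffer before playout starts.

    Assumptions (hypothetical): the stream has a constant bitrate, and
    the client fills its buffer at roughly the media rate.
    """
    if bitrate_bps <= 0:
        raise ValueError("bitrate_bps must be positive")
    # Factor 1: how many seconds of media the buffer can hold.
    buffer_limit_s = buffer_bytes * 8 / bitrate_bps
    # Factor 2: the delay the user tolerates after the request.
    return min(buffer_limit_s, max_startup_delay_s)
```

For example, under these assumptions a client with a 64,000-byte buffer receiving a 128 kbit/s stream is limited by its buffer, whereas a client with a much larger buffer is limited by the user's tolerance.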
With respect to the first factor, the available client buffer size, each client can have a different buffer size. In the case where each client is playing the same media stream in a session, it is assumed that packets are application data units and are independently decodable. This is true for the majority of audio encoding standards. The goal is to achieve a simultaneous playout of the media stream at each client. A further problem arises when clients join or leave the session midway. Client heterogeneity and dynamic session membership are major problems to be addressed in multicasting.
Adaptive playout delay adjustment and synchronization of streams have been proposed to address the above-mentioned problems. Assuming a media stream consisting of talkspurts interspersed with silence, adaptive playout algorithms have been proposed by Ramjee et al., “Adaptive playout mechanisms for packetized audio applications in wide-area networks”, Proceedings of IEEE INFOCOM, pp. 680-688, 1994. These algorithms estimate the mean and variation in end-to-end delay to adjust the starting time for playout of each talkspurt. However, this work does not address the issue of multiple heterogeneous clients having different buffering capabilities. It is also targeted more towards interactive conferencing systems. Such a solution does not address an on-demand archived media (especially audio) distribution system, which has a continuous media stream without separate talkspurts. Nor does this method address server-side adaptation based on different client buffering capabilities.
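For illustration, a playout estimator of the general kind described above may be sketched with exponentially weighted averages of the end-to-end delay and of its variation. The class name, the smoothing factor, and the safety multiplier are assumptions chosen for the example and should not be read as the exact parameters of the cited work.

```python
class PlayoutEstimator:
    """Tracks mean end-to-end delay and its variation per packet."""

    def __init__(self, alpha=0.998, safety=4.0):
        self.alpha = alpha        # smoothing factor (assumed value)
        self.safety = safety      # multiplier on the variation (assumed)
        self.mean_delay = None
        self.variation = 0.0

    def observe(self, send_time, recv_time):
        """Update the estimates with one packet's measured delay."""
        delay = recv_time - send_time
        if self.mean_delay is None:
            self.mean_delay = delay  # first sample seeds the mean
            return
        a = self.alpha
        self.mean_delay = a * self.mean_delay + (1 - a) * delay
        self.variation = a * self.variation + (1 - a) * abs(self.mean_delay - delay)

    def talkspurt_playout_time(self, first_send_time):
        """Playout time for the first packet of a new talkspurt."""
        if self.mean_delay is None:
            raise ValueError("no delay samples observed yet")
        return first_send_time + self.mean_delay + self.safety * self.variation
```

The playout point is recomputed only at the start of each talkspurt, which is why such a scheme does not carry over directly to a continuous stream without silence periods.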
The other class of related work addresses the problem of synchronization between different media streams in a presentation session, for example, video and audio streams that need to be synchronously presented in a multimedia session. A start-up protocol to initiate synchronized playback of multiple media streams is proposed by Biersack et al., “Synchronized delivery and playout of distributed multimedia streams”, Multimedia Systems, Vol. 7, No. 1, pp. 70-90, Jan. 1999. A scheme that allows audio and video stream synchronization using a local conference bus is also proposed by Kouvelas et al., “Lip Synchronization for use over the Internet”, Proceedings of IEEE Globecom, Nov. 1996. These systems address a set of multiple streams transmitted from one or more servers to a single client, with the focus on achieving a synchronized playback of these multiple streams at that client.
Yuang et al., “Intelligent video smoother for multimedia communications”, IEEE Journal on Selected Areas in Communications, Vol. 15, No. 2, pp. 136-146, Feb. 1997, describe an intelligent neural-network-based video smoother that compensates for jitter and smooths the playout of video frames. However, these solutions do not address the issue of achieving simultaneous playback of the same media stream at multiple heterogeneous clients having different buffering capabilities.
Further, although prior art systems describe client-side buffering, none of the known solutions appears to handle simultaneous media playout at multiple clients, where each client has a different buffer capacity. Methods designed to achieve isochronous streams appearing simultaneously at output ports cannot handle simultaneous playout when the buffer capacity of each client is different.
It would be advantageous if media could be played out at several clients simultaneously, even if the clients had different buffering capacities.
It would be advantageous if simultaneous playout could be maintained despite disruptions in the media stream.
It would be advantageous if simultaneous playout could be maintained despite changes in client membership during a session.