1. Field of the Invention
The present invention relates to a media playout method on the receiver side of a network media streaming system, and more particularly, to an adaptive media playout method using buffer-based adaptive intra-media synchronization on the receiver side of a network media streaming system.
2. Description of the Related Art
In a media streaming service over a best-effort IP network, it is important that the end user on the receiver side be provided with satisfactory playout quality in terms of both space and time. However, the playout quality may be significantly degraded by, for example, packet loss, delay, and jitter under network congestion conditions. For example, a partial packet loss in one video frame may lower the PSNR (Peak Signal-to-Noise Ratio), which is a measure of spatial quality. A reduction in temporal playout quality, such as playout pause and skip, caused by packet delay and jitter degrades the overall playout quality.
The deterioration of the spatial playout quality is handled by error control techniques, such as an automatic repeat request (ARQ) and forward error correction, while the deterioration of the temporal playout quality is handled by a media synchronization technique. In general, the media synchronization is divided into intra-media synchronization, inter-media synchronization, and inter-client synchronization.
Intra-media synchronization is a technique for preserving the temporal relationship between media units (MUs) in a single stream. Inter-media synchronization is a technique for preserving the temporal relationship between streams, such as lip synchronization between audio and video. Inter-client synchronization is a technique for matching the playout times of clients in multicast media streaming applications, such as a real-time sports broadcast or a video conference.
A general streaming system includes basic functions for intra-media synchronization. As a first basic function, a transmitter generates a time stamp and inserts it in a media stream such that the receiver side can restore the playout times of the media units. For example, the MPEG (Moving Picture Experts Group) standard provides time stamps, such as SCR (System Clock Reference), PTS (Presentation Time Stamp), and DTS (Decoding Time Stamp), and defines a synchronization model in a system layer. As a second basic function, a playout buffer for absorbing packet delay and jitter is provided on the receiver side. The size of the playout buffer involves a trade-off: a larger buffer absorbs more packet delay and jitter but increases the playout delay. Therefore, the size of the playout buffer should be determined carefully in consideration of the type of streaming application.
However, such basic functions cannot ensure intra-media synchronization under heavy network congestion. For example, buffer underflow or overflow is likely to occur on the receiver side as the network conditions vary over time. As a result, playout discontinuity, such as playout pause or playout skip, occurs.
An adaptive media playout (hereinafter referred to as AMP) technique adjusts the playout times of media units to improve intra-media synchronization quality. That is, the AMP technique enables the receiver side to schedule the playout times of media units according to network conditions. The basic operation of the AMP technique relies on the observation that viewers perceive a short, well-controlled playout discontinuity as better quality than a long, unpredictable one. Informal experiments indicate that viewers hardly perceive playout rate changes of up to 25% under AMP, and that playout rate changes of up to 50% can be acceptable depending on the characteristics of the media content.
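The effect of such playout rate control can be illustrated with a small worked example. The following is only a sketch: the function name and the `rate_scale` parameter are hypothetical and are not taken from any of the cited techniques. It shows how slowing playout by 25% stretches the time covered by the media units already buffered, which gives late packets more time to arrive.

```python
def playout_duration(buffered_frames: int, nominal_fps: float, rate_scale: float) -> float:
    """Seconds needed to play out the buffered frames at a scaled playout rate.

    rate_scale is a hypothetical parameter: 1.0 is the nominal rate,
    0.75 slows playout by 25%, and 1.25 speeds it up by 25%.
    """
    if nominal_fps <= 0 or rate_scale <= 0:
        raise ValueError("nominal_fps and rate_scale must be positive")
    return buffered_frames / (nominal_fps * rate_scale)


# 30 buffered frames at a nominal 30 frames per second.
nominal = playout_duration(30, 30.0, 1.0)    # playout at the nominal rate
slowed = playout_duration(30, 30.0, 0.75)    # playout slowed by 25%
extra_time = slowed - nominal                # additional time before underflow
```

Under these assumed numbers, one second of buffered video lasts about 1.33 seconds when playout is slowed by 25%, so the receiver gains roughly a third of a second before the buffer would underflow.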
The AMP techniques proposed to date are classified into time-based models and buffer-based models. The techniques using the time-based model explicitly measure network delay and jitter using time stamps and time information obtained from the transmitter side and the receiver side. In contrast, the techniques using the buffer-based model implicitly measure network delay and jitter using the number of packets stored in the playout buffer on the receiver side.
Both the time-based model and the buffer-based model adjust the playout times of the media units, on the basis of measured parameters (explicit network delay and jitter, or the amount of buffered data), in order to avoid playout discontinuity. However, the performance of the techniques using the time-based model is limited in that it depends on the availability of a synchronized time between the transmitter and the receiver. This is because the accuracy of the measured network delay and jitter may be lowered by the time error between the transmitter and the receiver. In order to solve this problem, an approximate time measuring technique and a synchronizing technique that does not use time have been proposed. The buffer-based techniques have the advantage that a synchronized time between the transmitter side and the receiver side is not required.
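The difference between the two kinds of measurement can be sketched as follows. This is a simplified illustration under stated assumptions, not an implementation of any cited technique: the function names, the millisecond values, and the 500 ms clock offset are all hypothetical, and the occupancy calculation ignores playout pauses. The explicit delay is computed from time stamps of both sides and is therefore biased by the clock offset, whereas the buffer occupancy is computed entirely on the receiver's clock.

```python
def explicit_delays(send_ts, recv_ts):
    """Time-based model: delay = receiver arrival time - transmitter time stamp."""
    return [r - s for s, r in zip(send_ts, recv_ts)]


def buffer_occupancy(recv_ts, playout_start, frame_interval, now):
    """Buffer-based model: media units received minus media units played out.

    All arguments are on the receiver's clock (integer milliseconds).
    Simplification: playout is assumed never to pause once started.
    """
    received = sum(1 for r in recv_ts if r <= now)
    if now < playout_start:
        played = 0
    else:
        played = min(received, (now - playout_start) // frame_interval + 1)
    return received - played


# Hypothetical values in milliseconds.
send_ts = [0, 40, 80, 120]            # transmitter time stamps
true_delays = [50, 60, 55, 90]        # actual network delays
clock_offset = 500                    # receiver clock runs 500 ms ahead
recv_ts = [s + d + clock_offset for s, d in zip(send_ts, true_delays)]

measured = explicit_delays(send_ts, recv_ts)   # each delay is inflated by the offset
occupancy = buffer_occupancy(recv_ts, playout_start=650, frame_interval=40, now=700)
```

In this sketch every measured delay is 500 ms larger than the true delay, while the occupancy value does not depend on the transmitter's clock at all.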
Initially, the AMP techniques were used for packet audio applications, such as VoIP (Voice over Internet Protocol). An AMP algorithm that adjusts the playout delay of audio packets in response to variable network delay has been proposed in R. Ramjee, J. Kurose, D. Towsley and H. Schulzrinne, “Adaptive playout mechanisms for packetized audio applications in wide-area networks,” in Proc. IEEE INFOCOM '94, vol. 2, pp. 680-688, June 1994. The proposed algorithms use the time-based model and assume that a globally synchronized time is available. A media synchronization algorithm introducing the concept of time has been proposed in Y. Ishibashi and S. Tasaka, “A synchronization mechanism for continuous media in multimedia communications,” in Proc. IEEE INFOCOM '95, pp. 1010-1019, April 1995. In that paper, it is assumed that the delay time of a network is limited to a predetermined value. A method of computing an optimum average playout delay for packet audio applications in the time-based model has been proposed in S. B. Moon, J. Kurose and D. Towsley, “Packet audio playout delay adjustment: Performance bounds and algorithms,” ACM/Springer Multimedia Systems, vol. 5, no. 1, pp. 17-28, January 1998. In that paper, the performances of the packet audio applications are compared.
An AMP protocol called ASP (Adaptive Synchronization Protocol) has been proposed in K. Rothermel and T. Helbig, “An adaptive stream synchronization protocol,” in Proc. NOSSDAV '95, vol. LNCS 1018, pp. 189-202, April 1995. This paper proposes a technique for controlling a receiver-side playout buffer for network video applications. It defines buffer threshold values, such as a low water mark (LWM) and a high water mark (HWM), and adaptively controls the playout rate when the current buffer level is lower than the LWM or higher than the HWM. Another AMP technique, called a buffer-based slide control protocol, has been proposed in M. Kato, N. Usui and S. Tasaka, “Stored media synchronization based on buffer occupancy in PHS,” in Proc. IEEE PIMRC '97, vol. 3, pp. 1049-1053, September 1997. This paper discloses a method of appropriately determining a buffer threshold value, which is an important guideline for AMP control.
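The threshold-based control described above can be sketched as follows. This is a minimal illustration only and does not reproduce the actual control rule of ASP or of the slide control protocol: the function name, the linear scaling, and the `max_adjust` limit of 25% are assumptions made for this example. The playout rate stays at the nominal value while the buffer level lies between the LWM and the HWM, is reduced below the LWM, and is increased above the HWM.

```python
def select_playout_rate(buffer_level, lwm, hwm, capacity,
                        nominal_fps=30.0, max_adjust=0.25):
    """Return a playout rate (frames per second) from the buffer level.

    Hypothetical linear rule: the rate is scaled down by up to max_adjust
    as the buffer empties below LWM, and scaled up by up to max_adjust as
    the buffer fills above HWM.
    """
    if not (0 < lwm < hwm < capacity):
        raise ValueError("require 0 < lwm < hwm < capacity")
    level = max(0, min(buffer_level, capacity))  # clamp to the valid range
    if level < lwm:
        deficit = (lwm - level) / lwm            # 0..1, 1 means buffer empty
        return nominal_fps * (1.0 - max_adjust * deficit)
    if level > hwm:
        surplus = (level - hwm) / (capacity - hwm)  # 0..1, 1 means buffer full
        return nominal_fps * (1.0 + max_adjust * surplus)
    return nominal_fps


# Assumed thresholds: LWM = 10, HWM = 30, capacity = 40 media units.
rate_normal = select_playout_rate(20, lwm=10, hwm=30, capacity=40)
rate_low = select_playout_rate(5, lwm=10, hwm=30, capacity=40)
rate_empty = select_playout_rate(0, lwm=10, hwm=30, capacity=40)
rate_full = select_playout_rate(40, lwm=10, hwm=30, capacity=40)
```

With these assumed thresholds, the rate ranges from 22.5 frames per second for an empty buffer to 37.5 frames per second for a full buffer, and remains at 30 frames per second between the two water marks.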
A single-buffer-based AMP technique has been proposed in M. C. Yuang, P. L. Tien and S. T. Liang, “Intelligent video smoother for multimedia communications,” IEEE Journal on Selected Areas in Communications, vol. 15, no. 2, pp. 136-146, February 1997. This paper discloses a method of determining a proper buffer threshold value using a neural network traffic predictor, and defines VoD (Variance of Discontinuity) as a measure of intra-media synchronization quality. However, applying the technique disclosed in the paper requires a complicated traffic predictor, which makes the technique difficult to implement. VDoP (Variance of Distortion of Playout) is defined as a measure of intra-media synchronization quality in N. Laoutaris and I. Stavrakakis, “Adaptive playout strategies for packet video receivers with finite buffer capacity,” in Proc. IEEE ICC '01, vol. 3, pp. 1660-1672, September 1999. VDoP is obtained by adding to VoD a factor that measures the effect of frame loss due to buffer underflow. A quality-based adaptive media synchronization technique that adaptively determines a playout rate on the basis of the RMSE (Root Mean Square Error) of the previous playout discontinuity has been proposed in H. Liu and M. E. Zarki, “A synchronization control scheme for real-time streaming multimedia applications,” in Proc. Packet Video Workshop 2003, April 2003.
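Measures of this kind can be illustrated with a short sketch. The definitions below are assumptions made for illustration and are not the exact definitions used in the cited papers: here the playout discontinuity of a media unit is taken to be the difference between its actual and its scheduled playout time, and the function names and sample values are hypothetical.

```python
import math


def playout_discontinuities(scheduled, actual):
    """Assumed definition: actual playout time minus scheduled playout time."""
    return [a - s for s, a in zip(scheduled, actual)]


def rmse(values):
    """Root mean square of the discontinuity values."""
    if not values:
        raise ValueError("values must not be empty")
    return math.sqrt(sum(v * v for v in values) / len(values))


def variance(values):
    """Population variance of the discontinuity values."""
    if not values:
        raise ValueError("values must not be empty")
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical playout times in milliseconds for four media units.
scheduled = [0, 40, 80, 120]
actual = [0, 40, 90, 150]

d = playout_discontinuities(scheduled, actual)
d_rmse = rmse(d)
d_var = variance(d)
```

Under these assumed definitions, a lower RMSE or variance indicates smoother playout, which is the sense in which such values serve as quality measures for intra-media synchronization.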
The AMP techniques based on the known playout discontinuity models determine the playout rate using, as input parameters, the current buffer level, predetermined buffer threshold values, and the RMSE value of the previous playout discontinuity, or using a complicated traffic analyzer, thereby performing intra-media synchronization. Consequently, these techniques are limited in the extent to which they can improve intra-media synchronization quality.