Real-time video transmission usually adopts the connectionless User Datagram Protocol (UDP) because of its large data volume and real-time requirement. However, UDP has no congestion control mechanism. Moreover, the Internet is currently a best-effort network that cannot guarantee Quality of Service (QoS), which leads to problems such as bandwidth fluctuation, packet loss and transmission delay. In addition, the network status changes dynamically and cannot be accurately estimated by conventional models, which is a major problem for real-time video transmission. Therefore, to ensure the QoS of real-time video transmission, an adaptive video transmission control mechanism has to be adopted, and the video transmission control has to meet the following three requirements:
(1) Transmission Control Protocol (TCP)-friendliness: A flow is TCP-friendly if, and only if, in a steady state it uses, in the long term, no more bandwidth than a conforming TCP flow would use under comparable conditions. That is, a TCP-friendly flow and a TCP flow sharing the same channel can share the bandwidth evenly over a long time, and neither occupies the bandwidth aggressively.
(2) Real-time: To reduce the influence on video quality of bandwidth fluctuations, which may be caused by abrupt changes in background traffic, the transmission control has to detect the network bandwidth fluctuations and adjust the sending rate accordingly in real time, so as to reduce the packet loss ratio during the video transmission.
(3) Smoothness: The target bit rate of the encoder is determined according to the bandwidth output by the real-time video transmission control, and the target bit rate in turn determines the perceptual quality of the video stream. The perceptual quality is better when the image fidelity is slightly degraded but the subjective quality remains smooth than when the fidelity is high but the subjective quality fluctuates severely. Therefore, smoothness is a special requirement of the video transmission control.
Real-time video transmission needs to meet different requirements on different time-scales, namely, a long time-scale requirement and a short time-scale requirement. Both the network bandwidth fluctuation and the perceptual quality of the video stream exhibit an obvious two-time-scale characteristic.
The network bandwidth can be regarded as a time sequence and can be divided into two parts, namely, a trend item and a disturbance item, as shown in FIG. 1. The trend item is the general tendency of the network bandwidth and shows the direction in which the network bandwidth is developing; it is usually relatively smooth. Therefore, the trend of the network bandwidth can be appropriately predicted through a suitable prediction model. The trend item is the long time-scale feature of the network bandwidth, and the TCP-friendliness requirement of the video transmission control reflects this long time-scale requirement (according to the aforementioned definition of TCP-friendliness). Due to changes in the network background traffic and the instability of the video traffic (caused by fluctuations of the video quality), the actual network bandwidth fluctuates around the bandwidth trend within a short time. This short-term variation of the network bandwidth is regarded as the disturbance item and is mainly caused by abrupt changes in the network background traffic. The disturbance item is the short time-scale feature of the network bandwidth, and the real-time requirement of the video transmission control reflects this short time-scale requirement.
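The decomposition into a trend item and a disturbance item can be illustrated with a minimal sketch. It assumes, purely for illustration, that the trend is extracted with an exponentially weighted moving average; the text above does not prescribe any particular prediction model, and the function name `split_trend_disturbance` and the parameter `alpha` are hypothetical.

```python
def split_trend_disturbance(samples, alpha=0.2):
    """Split a bandwidth time sequence into a trend item and a disturbance item.

    Assumption: the trend is an exponentially weighted moving average (EWMA);
    this is an illustrative choice, not a model named in the source text.
    """
    if not samples:
        return [], []
    # The trend starts at the first sample and follows the series slowly.
    trend = [float(samples[0])]
    for x in samples[1:]:
        trend.append(alpha * x + (1.0 - alpha) * trend[-1])
    # The disturbance is whatever the smooth trend does not explain.
    disturbance = [x - t for x, t in zip(samples, trend)]
    return trend, disturbance
```

A smaller `alpha` yields a smoother trend (long time-scale) and pushes more of the variation into the disturbance (short time-scale); by construction, the two items always sum back to the measured bandwidth.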
Real-time video transmission requires that the network bandwidth be shared in a friendly manner on the long time-scale and that fluctuations of the network bandwidth be responded to quickly on the short time-scale. Therefore, to ensure the quality of real-time video transmission, a multiple time-scale video stream transmission control mechanism is required. On the long time-scale, the trend of the network bandwidth should be accurately extracted while the TCP-friendliness of the transmission is ensured. On the short time-scale, the fluctuations of the network bandwidth should be responded to in time while the smoothness of the video quality is ensured. According to the above analysis, real-time video transmission control has an obvious multiple time-scale requirement. Therefore, in order to meet the requirements of TCP-friendliness, real-time performance and smoothness, the video transmission control scheme should be designed for the different time-scales.
Conventional transmission control methods can be divided into two categories, namely, the model-based method and the additive increase/multiplicative decrease (AIMD) method. In the model-based method, a TCP throughput model is employed to calculate the network bandwidth from network feedback information, such as the packet loss ratio and the transmission delay. This method achieves a smooth bandwidth estimation and can efficiently extract the trend of the network bandwidth. However, the TCP throughput model depends heavily on the network feedback information, which is subject to delay, for example, in collecting and transmitting the feedback information at the decoder (video receiver) and in processing the feedback information at the encoder (video sender). As a result, there is a mismatch between the estimated bandwidth and the current actual bandwidth, and the model-based method cannot adapt to bandwidth fluctuations in time. In the AIMD method, the previously estimated bandwidth is additively increased or multiplicatively decreased, according to the network status, to estimate the current actual bandwidth. The AIMD method can therefore adapt quickly to network bandwidth fluctuations. However, because the output bandwidth of the AIMD method takes on a sawtooth shape within a short time, similar to that of TCP, the AIMD method cannot achieve a smooth bandwidth estimation. Moreover, both of the above methods are single time-scale control methods, which cannot meet the multiple time-scale requirement of video transmission control.
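The two conventional categories can be contrasted with a minimal sketch. It assumes the widely used simplified square-root form of the TCP throughput model, since the text does not state which throughput model is meant, and the AIMD step sizes are arbitrary illustrative values. The names `tcp_model_bandwidth` and `aimd_step` and all parameters are hypothetical.

```python
import math


def tcp_model_bandwidth(mss_bytes, rtt_s, loss_ratio):
    """Model-based estimate in bytes per second.

    Assumption: simplified square-root TCP throughput model,
    bandwidth = MSS / (RTT * sqrt(2 * p / 3)), with p the packet loss ratio.
    """
    if rtt_s <= 0 or loss_ratio <= 0:
        raise ValueError("rtt_s and loss_ratio must be positive")
    return mss_bytes / (rtt_s * math.sqrt(2.0 * loss_ratio / 3.0))


def aimd_step(prev_bandwidth, congested, increase=50_000.0, decrease=0.5):
    """One AIMD update of the previously estimated bandwidth.

    Assumption: 'congested' is a boolean derived from network feedback;
    'increase' (bytes/s) and 'decrease' (factor) are illustrative defaults.
    """
    if congested:
        return prev_bandwidth * decrease  # multiplicative decrease
    return prev_bandwidth + increase      # additive increase
```

Calling `aimd_step` repeatedly with occasional congestion events produces the sawtooth shape described above, whereas `tcp_model_bandwidth` changes only as fast as the (delayed) loss and delay feedback changes.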