The network context in which the present invention is situated shall now be briefly presented.
The improvement of computer performance, together with the bit rates offered by new generations of networks, is opening the way for novel services based on multimedia streams. In fact, the quantity of audiovisual information transmitted on networks (for example of the IP (“Internet Protocol”) type) is constantly on the increase, and the compression algorithms (for example of the MPEG (“Moving Picture Experts Group”) type) are improving, offering better quality at lower bit rates. However, today, the level of quality is not always acceptable. While each link in the chain has the intrinsic capacity to provide this quality, the end-to-end chaining of these links and the sharing of the IP network resources by numerous users sometimes give mediocre results.
In general, the transmission of information in an IP network relies on the transport layer for quality control between the source and the receivers. This layer, located between the routing and the applications, is traditionally implemented by the TCP (“Transmission Control Protocol”). From the applications viewpoint, the TCP is responsible for retransmitting lost or poorly received information through a check at the session level. From a network viewpoint, certain protocol parameters enable the detection of possible congestion, and the matching of the bit rate of the source to the constraints of the network. The goal then is to limit the bit rate if the network cannot let everything through, and to avoid sending packets that will be lost. Many studies today are seeking to apply equivalent mechanisms to video streams, with the real-time constraint of dynamically matching the encoders to the available bit rate.
However, owing to the substantial reaction time between one or more customers and the video source, the present real-time audiovisual protocols perform few processing operations: they are limited chiefly to marking the sending time and packaging the application packets in order to route them in the IP layer (for example over RTP/UDP), and it is left to the applications to deal with the received data.
The development of networks is offering the possibility of managing “quality of service” (QoS) in routers. Now, it is relevant to note that this is the place in which the greatest losses occur in IP networks, and mechanisms are implemented at this level in order to selectively process the different streams so as to achieve quality objectives with the utmost efficiency. The means used to improve the quality of the streams transmitted are the subject of research by the “IntServ” (“integrated services”) and “DiffServ” (“differentiated services”) groups at the IETF (Internet Engineering Task Force):
“IntServ” defines means to reserve a resource in terms of guaranteed bit rate between two nodes of a network;
“DiffServ” defines means to dynamically control the bit rate of the stream aggregates as a function of the load of the network.
As compared with end-to-end solutions (analogous to the TCP), localized solutions in routers have several advantages:
they remove the need for any session constraint;
they are also adapted to real time because their action in the routers is immediate without necessitating any return of information from the users;
they are also naturally suited to selective broadcasting (“multicast” broadcasting) because they are independent of the number of users supplied by the stream and independent of the reports coming from a variable number of users.
The processing operations in the routers rely on a distinction between the packets arriving in the routers that support “quality of service” (QoS).
This distinction exists naturally in MPEG streams because the compression of MPEG video streams leads to a sequence of data of different and non-independent natures. Three types of images can be distinguished: I, P and B. The I (Intra) type images are each encoded without making reference to another image, the compression being limited to the reduction of spatial redundancy within each image. To optimise the quantity of information transmitted, the MPEG encoders exploit the fact that two consecutive images generally differ little from each other. Motion detection considers only the part that has changed relative to the previous image, so as to obtain a piece of information of reduced size, encoded in a P (predicted) type of image. Other means are used to obtain even more limited information by interpolating the motions between two I or P type images; these images are then of a B (bidirectional) type.
The size of the P images is generally far smaller than that of the I type images, and an encoding with few I type images gives far higher decoding quality for an equivalent bit rate. Thus, the consequences of losing an image are not the same in every case: they depend on the nature of the information that the image contains. This structure of the information must lead to considering the importance or weight of each piece of information in its processing by the network.
Two reasons warrant the preservation of I type images:
the periodic re-synchronisation of the stream in the event of losses;
any changes of scene, because it is no longer possible to rely on the previous images for motion encoding.
Another way of considering this weight of information is to subdivide the MPEG4 stream into several hierarchical levels to obtain quality that is variable as a function of the overall content received by the user. A hierarchical level N must rely on the presence of the N−1 lower levels to provide a quality complement. An elementary case lies in considering a video constituted by a basic stream (containing I and P images) and an enhancing stream (containing P and B images). For this elementary case, the basic stream in its totality is considered to have greater priority than the enhancing stream.
The natural distinction between images or streams in the MPEG traffic has to be exploited in the “DiffServ” routers for the selective processing of the different pieces of information in a video stream:
either by a marking of the TOS (“type of service”) or DSCP (“Differentiated Services Code Point”) fields of the packets by the video server,
or by a classification made by the router, which also leads to a marking of the IP packets.
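By way of purely illustrative example, a marking performed by the video server could be sketched as follows. The code points AF41, AF42 and AF43 are standard “DiffServ” values, but their assignment to the I, P and B image types, as well as the names `DSCP_BY_IMAGE_TYPE` and `tos_byte`, are assumptions made for this sketch and are not taken from the present document.

```python
# Hypothetical mapping from MPEG image type to a DiffServ code point.
# AF41/AF42/AF43 are standard code points; assigning them to I/P/B
# images is an assumption made for illustration only.
DSCP_BY_IMAGE_TYPE = {
    "I": 34,  # AF41: lowest drop precedence in its class
    "P": 36,  # AF42: medium drop precedence
    "B": 38,  # AF43: highest drop precedence
}


def tos_byte(image_type: str) -> int:
    """Return the value of the IPv4 TOS byte for a given image type."""
    dscp = DSCP_BY_IMAGE_TYPE[image_type]
    # The DSCP occupies the six high-order bits of the TOS byte.
    return dscp << 2


if __name__ == "__main__":
    for kind in ("I", "P", "B"):
        print(kind, hex(tos_byte(kind)))
```

On platforms that expose the `IP_TOS` socket option, the resulting byte could be applied to the sending socket of the server; a classification made by the router would lead to the same field being written in the IP header.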
In the present document, the marking of the packets is considered to be possible in all cases. The means used to carry out this marking are considered to be well known to those skilled in the art. This point shall therefore not be discussed in detail.
When a situation of congestion appears in a router, the packets received are eliminated depending on the load and on their priority level.
A major characteristic of the data contained in these packets is their great variation as a function of the content of the scene. Now, the most vital information for restitution to the user is contained in the biggest bursts, and the main problem of “best effort” type IP networks is their difficulty in letting through bursts in the event of congestion.
As a consequence, the simple marking of the elementary streams of an MPEG stream is not sufficient for their optimum processing. Indeed, the usual “DiffServ” mechanisms are designed to accept the bursts that are habitually found in IP networks, with the applications that are predominant today: the transfer of URLs (“Uniform Resource Locators”) for Web applications and the transfer of files by FTP (“File Transfer Protocol”). All these applications exploit the TCP, whose mechanisms have been the subject of many studies designed to obtain a gradual rise in the load and an adaptation of the bit rate to the load of the network.
Now, the video streams bring this operation into question because their behaviour is qualified as being excessive, inasmuch as they are generally unaware of the state and load of the network. Furthermore, most of the mechanisms introduced into the IP networks have the goal of smoothing the streams in order to foster an efficient flow of traffic. Paradoxically, the encoders provide higher quality when they produce variable bit-rate streams, which are the streams most ill-treated in IP networks.
The introduction of video streams into the IP network therefore comes up against three contradictions:
increasing the quality of the encoding leads to the production, for a defined mean bit rate, of bursts of packets when the encoded sequence requires it;
the access networks offer a maximum bandwidth that is limited (including in ADSL (“Asymmetric Digital Subscriber Line”) conditions) because the applications are tempted to exploit a mean bit rate close to the maximum in order to provide better quality for users;
the bursts constitute the most important information because they correspond to a change of scene or at least to a major change in the content of the image. Very often, these images are of the I (Intra) type, whose loss is very critical because it becomes impossible to reproduce the following images, even if they are properly received.
We shall now present the prior techniques of traffic conditioning designed to reduce congestion in the networks.
It shall first of all be noted that studies on the application of shaping to MPEG streams in IP networks on the basis of UDP and RTP as transport protocols are very rare. Most of the studies are on ATM (“Asynchronous Transfer Mode”) networks and the use of TCP as a transport protocol.
The traffic shaping and conditioning mechanisms are used in IP networks with “quality of service” (QoS). Reference may be made especially to M. F. Alam, M. Atiquzzaman, M. A. Karim, “Traffic Shaping for MPEG video transmission over next generation internet”. In this document, traffic smoothing or shaping (TS) is used to ensure compliance of the MPEG stream with the TSPEC (“Traffic Specifier”) necessary for the reservation of resources in the “IntServ” networks. For the “DiffServ” networks, the processing is done by stream aggregates and therefore without excessive concern over applications. The traffic shaping (TS) algorithms are, on the contrary, widely used in encoding with the aim of controlling the bit rate of the encoder. This remains insufficient to control the streams at the network level.
The use of traffic shaping by smoothing (TS) or traffic policing (TP) to reduce congestion in the network may significantly improve the “quality of service” (QoS) level that the network is capable of delivering to the applications.
Smoothing (TS) smoothes the bursts by buffering, in the boundary equipment of the network, the packets that exceed the permitted burst size. It can reduce congestion to acceptable levels, all the more so as “scheduler” algorithms such as CBQ (“class-based queuing”) or PQ (“priority queuing”) are not capable of doing so: when used alone, these mechanisms propagate the bursts in the network.
As in the case of smoothing (TS), traffic policing (TP) limits the bit rate of the traffic to the configured bit rate. However, instead of buffering the packets as in traffic shaping (TS), the non-compliant packets are either rejected or re-marked to reduce their priority level. Traffic policing (TP) therefore does not shape the traffic, but it does not introduce any buffering delay either.
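The difference between smoothing (TS) and traffic policing (TP) can be illustrated by the following minimal sketch, which is given as an assumption-laden example and not as a description of any particular equipment. It assumes a token bucket refilled at each time slot, an unbounded buffer for the shaper, and a policer that rejects (rather than re-marks) the non-compliant packets; the function names `police` and `shape` are hypothetical.

```python
def police(arrivals, rate, bucket_size):
    """Token-bucket policer: non-compliant packets are rejected at once.

    arrivals: number of packets arriving in each time slot.
    Returns (packets sent per slot, total packets rejected).
    """
    tokens, sent, dropped = bucket_size, [], 0
    for burst in arrivals:
        tokens = min(bucket_size, tokens + rate)  # refill, capped
        passed = min(burst, tokens)
        tokens -= passed
        dropped += burst - passed                 # excess is lost
        sent.append(passed)
    return sent, dropped


def shape(arrivals, rate, bucket_size):
    """Token-bucket shaper: excess packets wait in a buffer instead.

    Returns (packets sent per slot, largest backlog observed).
    The buffer is assumed unbounded for simplicity.
    """
    tokens, backlog, sent, max_backlog = bucket_size, 0, [], 0
    for burst in arrivals:
        tokens = min(bucket_size, tokens + rate)
        backlog += burst
        passed = min(backlog, tokens)
        tokens -= passed
        backlog -= passed                         # excess is delayed
        max_backlog = max(max_backlog, backlog)
        sent.append(passed)
    return sent, max_backlog


if __name__ == "__main__":
    burst = [0, 8, 0, 0, 0]  # one 8-packet burst, then silence
    print(police(burst, rate=2, bucket_size=2))
    print(shape(burst, rate=2, bucket_size=2))
```

With the same burst, the policer forwards two packets and rejects the other six, whereas the shaper forwards all eight at the configured rate, at the cost of a backlog of up to six packets and the corresponding waiting time.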
In the majority of the architectures that take “quality of service” (QoS) into account, there is a service contract between the network service provider (NSP) and the application service provider (ASP). In ATM networks, this contract is called a “traffic contract”. In “DiffServ” networks, the contractual aspects are dealt with in the service level agreement (SLA) and more specifically in the service level specification (SLS).
The following document is also known: RFC 2475, “An Architecture for Differentiated Services”, December 1998. It states that the traffic sources may carry out the tasks of traffic classification and conditioning. Indeed, the traffic may be marked before it leaves the source domain. This is called “Initial Marking” or “Pre-marking”. The advantage of initial marking is that the preferences and the needs of the applications may be better taken into account to decide which packets must receive better processing in the network.
Prevention against the overload of a given service class is not an easy task in “DiffServ” networks. Furthermore, in the event of overload in a service class, it must be noted that all the streams of this service class suffer from deterioration in “quality of service” (QoS).
Furthermore, several mechanisms used in the implementation of “DiffServ” work less efficiently in the presence of bursts. The RED (“Random Early Detection”) mechanism for example is more efficient when it is applied to smoothed traffic. Otherwise, it is the small streams that are penalised while the streams in bursts do not undergo any significant improvement.
The MPEG streams are characterized by the fact that they have a bursty nature and by their sensitivity to packet losses. These losses cause a deterioration of subjective quality, but it is very difficult to foresee the level of this deterioration caused by losses. This deterioration is closely related to the nature of the information conveyed by the lost packets. The error-correction mechanisms are used during the decoding to overcome the losses.
The management of the MPEG streams in the network or even in the edge routers (ER) is a complicated task. The network service provider (NSP) is not obliged to process the MPEG streams differently. Moreover, the traffic aggregate resulting from several audiovisual streams is generally difficult to describe:
the packet arrival process is self-similar;
there is great variation in the data conveyed by the packets;
the dynamic range of the protocols.
One solution consists in marking the packets and assigning them the appropriate “quality of service” (QoS) level before they leave the domain of the Internet service provider (ISP). The media access gateway (MAG) is for example responsible for this task. This MAG manages the traffic according to the specified SLS. This approach facilitates the negotiation of SLA/SLS for services in streaming mode and dictates a particular profile on the client.
Among present-day techniques, the most widespread one used for traffic control is WRED (“Weighted Random Early Detection”), which consists in rejecting packets at random, with a probability that differs as a function of the marking of the packets. This mechanism is based on an average filling rate of the sending queue on a link of a network. However, this technique introduces a random character for the packet rejection, and the queue filling rate is not optimised. Depending on the size and the frequency of the bursts, these cases of rejection may occur for a low queue filling rate or for a very high queue filling rate. This leads, firstly, to under-utilisation of the queue and, secondly, to the reservation of substantial memory size for the making of this queue. This problem exists for any type of application, and it is even more real for audiovisual streams because of their big bursts.
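The principle of a rejection that is random and weighted by the marking can be sketched as follows. This is a simplified illustration only: the linear ramp between two thresholds is the usual textbook form of the mechanism, while the profile names, the threshold values and the function names `drop_probability` and `should_drop` are assumptions chosen for the example.

```python
import random

# Hypothetical per-marking profiles:
# (minimum threshold, maximum threshold, maximum drop probability),
# thresholds being expressed in packets of average queue filling.
PROFILES = {
    "high": (30, 40, 0.05),  # e.g. packets carrying I images
    "low": (10, 40, 0.20),   # e.g. packets carrying B images
}


def drop_probability(avg_queue: float, marking: str) -> float:
    """Rejection probability for a packet, given the average filling."""
    min_th, max_th, max_p = PROFILES[marking]
    if avg_queue < min_th:
        return 0.0               # below the ramp: never rejected
    if avg_queue >= max_th:
        return 1.0               # above the ramp: always rejected
    # Linear ramp between the two thresholds.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def should_drop(avg_queue: float, marking: str, rng=random.random) -> bool:
    """Random decision taken for each arriving packet."""
    return rng() < drop_probability(avg_queue, marking)


if __name__ == "__main__":
    for avg in (5, 25, 35, 40):
        print(avg, drop_probability(avg, "high"), drop_probability(avg, "low"))
```

The decision depends only on the average filling and on a random draw, not on the space actually available in the queue at the instant the packet arrives, which is the source of the drawbacks noted above.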
To state this point in detail, it is very important to note, first of all, that video stream bursts are unpredictable in size as well as in duration, while at the same time maintaining a mean bit rate over a period of about one second. This time slot for the computation of the mean bit rate is far too great to obtain reasonable sizes for the associated queues. In fact, the routers react by computing the filling averages over about 10 packets received, whereas certain bursts may substantially exceed 20 packets.
Thus, a succession of bursts may lead to cases of rejection by saturation of the capacity of the queue, inhibiting the normal working of the mechanism. On the contrary, when low traffic occurs after a sequence of bursts, the mean value may temporarily remain abnormally high and packets are rejected at a time when the queue is practically empty.
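The two situations described above can be reproduced with a short numerical sketch. It assumes that the mean filling value is an exponentially weighted moving average with a weight of 0.1 (roughly an average over the last ten samples) and a queue of 40 packets; these values, like the name `ewma_trace`, are assumptions made for illustration.

```python
def ewma_trace(queue_lengths, weight=0.1):
    """Return the running mean filling value after each sample.

    weight=0.1 roughly corresponds to averaging over ten samples;
    this value is an assumption made for the example.
    """
    avg, trace = 0.0, []
    for q in queue_lengths:
        avg = (1 - weight) * avg + weight * q
        trace.append(avg)
    return trace


# A burst keeps a 40-packet queue full for 30 samples,
# then the traffic stops and the queue is empty for 5 samples.
samples = [40] * 30 + [0] * 5
trace = ewma_trace(samples)

if __name__ == "__main__":
    print("start of burst: queue", samples[0], "mean", round(trace[0], 1))
    print("after burst:    queue", samples[-1], "mean", round(trace[-1], 1))
```

At the start of the burst the queue is already full while the mean value is still low, so the packets are lost by saturation before the mechanism reacts. Five samples after the end of the burst the queue is empty, yet the mean value remains above 20 packets, so a mechanism driven by this mean would still reject packets.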