There are many traffic scheduling algorithms that attempt to enhance the bandwidth utilization and the quality of service on a network. In the context of communication networks, the works initiated by Cruz [“A Calculus for Network Delay”, Part I: Network Elements in Isolation and Part II: Network Analysis, R. L. Cruz, IEEE Transactions on Information Theory, vol. 37, no. 1, January 1991] and by Stiliadis [“Latency-Rate Servers: A General Model for Analysis of Traffic Scheduling Algorithms”, Dimitrios Stiliadis et al., IEEE/ACM Transactions on Networking, vol. 6, no. 5, October 1998] have built a theory that relates the notions of service rate, worst-case latency of a shared communication channel, and utilization rate of storage resources on the network elements.
This theory has served as a basis for different traffic management systems. The most common method used at the router level is the weighted fair queuing method described in “Computer Networks (4th Edition)” by Andrew Tanenbaum, page 441 of the French version. An alternative better suited for networks-on-chip is to inject the traffic using the leaky bucket mechanism, described in the same book, from page 434 of the French version.
In every case, this amounts to assigning an average rate ρi to a “session” Si on a network link.
A buffer or queue is allocated to each data transmission session Si (i=1, 2, . . . n), for instance a channel, a connection, or a flow. The contents of these queues are transferred sequentially on a network link L at the nominal link speed r.
A flow regulator operates on each queue in order to limit the average rate of the corresponding session Si to a value ρi≦r. The rates ρi are usually chosen so that their sum is less than or equal to r.
To understand the operation globally, it may be imagined that the contents of the queues are emptied in parallel into the network at respective rates ρi. In reality, the queues are polled sequentially, and the flow regulation is performed by polling the queues associated with lower bit rates less frequently, seeking an averaging effect over several polling cycles.
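The averaging effect described above may be illustrated by the following minimal sketch. It is not the regulator of any cited work; it assumes a hypothetical credit-based scheme in which every queue is always backlogged, time is divided into slots of one word each on the link, and the rates are integers expressed in the same unit as r. The function name `simulate_polling` and its parameters are chosen for illustration only.

```python
def simulate_polling(rates, link_rate, num_slots):
    """Count the words sent per session over num_slots link slots.

    Assumptions (illustrative only): queues never run empty, the link
    carries one word per slot, and rates are integers in the unit of
    link_rate.
    """
    if any(rho > link_rate for rho in rates) or sum(rates) > link_rate:
        raise ValueError("need rho_i <= r for every i and sum(rho_i) <= r")

    n = len(rates)
    credits = [0] * n   # credit accumulated by each session
    sent = [0] * n      # words transmitted by each session
    start = 0           # queue at which the next polling pass begins

    for _ in range(num_slots):
        # Each session earns credit in proportion to its assigned rate.
        for i in range(n):
            credits[i] += rates[i]

        # Poll the queues sequentially; serve the first eligible one.
        for offset in range(n):
            i = (start + offset) % n
            if credits[i] >= link_rate:
                credits[i] -= link_rate   # one word costs r units of credit
                sent[i] += 1
                start = (i + 1) % n       # resume polling after this queue
                break

    return sent


if __name__ == "__main__":
    # Three sessions with rates 1, 2 and 5 on a link of rate 10.
    print(simulate_polling([1, 2, 5], 10, 1000))
```

In this sketch a queue with a lower rate becomes eligible less often, so over many slots each session receives a share of the link close to ρi/r, even though only one queue is served in any given slot.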
Under these conditions, Stiliadis et al. demonstrate that the latency between the time of reading a first word of a packet in a queue and sending the last word of the packet on the link L is bounded for certain types of scheduling algorithms. In the case of weighted fair queuing (WFQ), this latency is bounded by Spi/ρi+Spmax/r, where Spi is the maximum packet size of session i, and Spmax the maximum packet size among the ongoing sessions.
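By way of illustration, the bound Spi/ρi+Spmax/r may be evaluated as in the following sketch. The function name `wfq_latency_bound` and its parameter names are hypothetical; sizes and rates are assumed to be expressed in consistent units (for example words and words per cycle), so that the result is in the corresponding time unit.

```python
def wfq_latency_bound(sp_i, rho_i, sp_max, r):
    """Evaluate the WFQ latency bound Sp_i / rho_i + Sp_max / r.

    sp_i   : maximum packet size of session i
    rho_i  : average rate assigned to session i
    sp_max : maximum packet size among the ongoing sessions
    r      : nominal link rate
    """
    if not 0 < rho_i <= r:
        raise ValueError("rho_i must satisfy 0 < rho_i <= r")
    if sp_i > sp_max:
        raise ValueError("sp_i cannot exceed sp_max")
    return sp_i / rho_i + sp_max / r


if __name__ == "__main__":
    # Packets of at most 8 words at rate 2, largest packet 16 words, link rate 8.
    print(wfq_latency_bound(sp_i=8, rho_i=2, sp_max=16, r=8))  # 8/2 + 16/8 = 6.0
```

The first term dominates when ρi is small compared with r, which shows that a session assigned a low rate sees a bound governed mainly by its own packet size and rate rather than by the link speed.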
This latency component is independent of the size of the queues. It is known, however, that in systems using multiple queues for channeling multiple flows on a shared link, the size of the queues introduces another latency component, between the writing of data in a queue and the reading of the same data for transmission on the network.