The invention generally relates to the art of scheduling systems wherein messages associated with plural processes are stored in a number of queues for subsequent processing by a single resource having limited processing capability. The invention has particular application to the field of digital communications systems and in this aspect relates to a scheduler and related method for efficiently allocating the bandwidth of a communications link amongst multiple queues which may be associated with a variety of service classes.
In various types of communication systems, including Asynchronous Transfer Mode (ATM) systems, situations often arise where a number of connections vie for the bandwidth of a communications link in a communication device, such as at a network node. When such a situation arises, it is necessary to queue or buffer data packets or cells from the contending connections, and the queues must be serviced in some "fair" way in order to ensure that all of the connections are adequately serviced.
A similar situation arises in the more general case where plural processes contend for a single resource. For instance, a distributed processing system may comprise a number of local controllers, responsible for various facets of the system, which are connected to a central controller, responsible for the overall management of the system. The local controllers communicate with the central controller by sending it messages, which the central controller must process, i.e., act upon. In this sense, the local controllers present "jobs" to the central controller. At any instant of time, some of the local controllers will not be busy, having no messages which must be processed by the central controller. Concurrently, some of the local controllers will be busy, presenting multiple messages, and hence potential jobs, to the central controller. Since the central controller may be busy with other jobs, it stores the messages in various queues, e.g., according to the type or class of local controller from which the message originated, until such time as the central controller can process the message and carry out the associated job. These messages must also be serviced in some fair way to ensure that all of the local controllers are adequately handled. It will be seen from the foregoing that the messages or jobs correspond to data packets of the digital communication system, and the fixed processing power or speed of the central controller corresponds to the bandwidth of the communications link.
A common "fair" scheduling scheme is proportional weighted fair queuing (hereinafter "proportional WFQ") wherein each queue, corresponding to each connection, is assigned a weight proportional to its allocated service rate. The proportional WFQ scheduler uses this weight to determine the amount of service given to the queue such that the scheduler is able to provide the allocated service rate for a given connection over a reasonably long busy period (i.e., when its queue is continuously non-empty), provided that the scheduler is not over-booked. The notion of an allocated service rate suits ATM systems in particular because almost all of the five currently defined ATM service classes rely on rate as a basis for defining quality of service (QoS). For instance, constant bit rate (CBR) connections are guaranteed a cell loss ratio (CLR) and delay for cells that conform to the peak cell rate (PCR). Variable bit rate (VBR) connections, real-time and non-real-time, are also guaranteed a CLR and delay for cells that conform to the sustained cell rate (SCR) and PCR. An available bit rate (ABR) connection is given a variable service rate that is between a minimum cell rate (MCR) and PCR. Unspecified bit rate (UBR) connections are associated with PCRs, and are soon anticipated to also be associated with MCRs.
In addition to the allocated service rate, because a proportional WFQ scheduler is work conserving, each non-empty queue will also receive a certain amount of instantaneous idle bandwidth. This is the extra service bandwidth that a queue receives due to (1) any unallocated bandwidth of a communications link, and (2) any allocated but currently unused bandwidth arising from the idle, non-busy periods of the other queues at the contention point.
To explain this in greater detail, suppose that queue n is given a weight $\phi_n$ which is proportional to the allocated service rate queue n should receive. The proportional WFQ scheduler thus distributes the total allocated bandwidth of the communication link amongst all the queues in proportion to their allocated service rates. Consequently, the idle bandwidth of the link is also distributed in proportion to the allocated service rates of all the non-empty queues. An example of this is shown in FIG. 1(a) where four queues 14, corresponding to four connections A, B, C and D, are serviced by a proportional WFQ multiplexer 8 in order to produce an output cell stream or link 16. Connections A, B and C have allocated service rates equal to 30% of the total bandwidth associated with the link 16 and are thus equally weighted. The allocated service rate of connection D is equal to 10% of the total bandwidth of link 16. FIG. 1(b) is a bandwidth occupancy chart illustrating how the link bandwidth is allocated to the connections. From time t=0 to 8, each of the connections has cells requiring servicing and thus the instantaneous bandwidth received by each connection is equal to its allocated service rate (30% of the total bandwidth for each of connections A, B and C, and 10% for connection D). At time t=8, however, only connections B and D are non-empty, having cells to be serviced, and thus the instantaneous idle bandwidth (now being 60% of the total bandwidth) is allocated to connections B and D in proportion to their allocated service rates. Thus, at time t=8, connection B receives 75% of the instantaneous total bandwidth and connection D receives 25% of the instantaneous total bandwidth. In general, the theoretical instantaneous service that queue n receives at time t when it is non-empty, expressed as a fraction of the total bandwidth, is $\phi_n / \sum_{i \in A(t)} \phi_i$, where A(t) is the index set of non-empty queues at time t.
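The proportional sharing rule above can be checked numerically. The following is a minimal Python sketch, assuming the weights are simply the allocated percentages of FIG. 1(a); the function name wfq_shares is hypothetical and is used only for illustration.

```python
from fractions import Fraction

def wfq_shares(weights, busy):
    """Fraction of the link each non-empty queue receives under proportional WFQ.

    weights: queue name -> weight (proportional to the allocated service rate)
    busy:    the index set A(t) of non-empty queues
    """
    total = sum(weights[q] for q in busy)
    return {q: Fraction(weights[q], total) for q in busy}

# Weights taken from the example: A, B, C at 30% each and D at 10%.
weights = {"A": 30, "B": 30, "C": 30, "D": 10}

all_busy = wfq_shares(weights, {"A", "B", "C", "D"})  # every queue non-empty
b_and_d = wfq_shares(weights, {"B", "D"})             # only B and D non-empty

for name in sorted(b_and_d):
    print(name, float(b_and_d[name]))
```

With only B and D non-empty, the shares come out as 3/4 and 1/4, matching the 75% and 25% figures in the text.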
Suppose then that a proportional WFQ scheduler is used in an ATM communications device, such as a network node. A CBR connection should have an allocated service rate equal to its PCR. A VBR connection should have an allocated service rate, VBW (virtual bandwidth), which is at least equal to its SCR and less than its PCR. (VBW is typically statistically calculated at set up by the connection and admission control (CAC) function of a network.) An ABR connection should have an allocated service rate equal to its MCR, and a UBR connection should have an allocated service rate equal to zero. So, in such a scenario, the amount of idle bandwidth that a CBR connection sees is proportional to its PCR, and that which an ABR connection sees is proportional to its MCR. This may result in very undesirable service. For example, suppose that a switch is carrying four connections (only): one is CBR with PCR=980 kbps, two connections are ABR with MCR=10 kbps, and one is UBR. Consequently, the idle bandwidth distribution is 98% for the CBR connection and 1% for each of the ABR connections, assuming a period when all the connections are busy. Such a distribution is certainly not desirable, since CBR connections should generally not receive service bandwidth beyond their PCRs. ABR connections would get extra bandwidth in proportion to their MCRs, a phenomenon commonly termed MCR proportional service. MCR proportional service is one way of fairly distributing idle bandwidth, but the literature describes other methods, such as MCR plus fair share, which proportional WFQ cannot support. Finally, the UBR connection only gets service if all the other queues are empty. Such distributions of the idle bandwidth are not optimal, and hence it is desired to achieve a more efficient distribution of the idle bandwidth.
Generally speaking, the invention provides a method for servicing a plurality of queues holding messages, such as data packets, destined for processing by a resource having a finite processing bandwidth, such as a communications link having a finite transmission bandwidth. The method comprises the steps of: (a) provisioning each queue with a minimum guaranteed service rate; (b) provisioning each queue with an idle bandwidth proportion; (c) servicing each queue by forwarding messages thereof to the resource at time intervals corresponding to the minimum guaranteed service rate of the queue, provided the queue is non-empty; and (d) servicing the queues in accordance with the proportion of idle bandwidth allocated to each queue during time intervals when none of the queues have packets being forwarded to the resource in conformance with step (c). In this manner, the amount of instantaneous idle bandwidth that a queue receives is decoupled from the allocated service rate granted to the queue.
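The decoupling achieved by steps (a) to (d) can be illustrated at the level of rates rather than individual packets. The following Python sketch is a fluid approximation only, assuming every busy queue always has enough backlog to use whatever it is offered. The 2000 kbps link rate and the idle bandwidth proportions are hypothetical values chosen for illustration, as are the names decoupled_rates, guaranteed and idle_share; the guaranteed rates reuse the CBR, ABR and UBR example given earlier.

```python
from fractions import Fraction

def decoupled_rates(link_rate, guaranteed, idle_share, busy):
    """Fluid approximation of steps (a)-(d) for the set of non-empty queues."""
    # Bandwidth left over once every busy queue takes its guaranteed rate.
    idle = link_rate - sum(guaranteed[q] for q in busy)
    if idle < 0:
        raise ValueError("guaranteed rates exceed the link rate (over-booked)")
    total_share = sum(idle_share[q] for q in busy)
    rates = {}
    for q in busy:
        # The idle bandwidth is split by idle_share alone, not by guaranteed rate.
        extra = Fraction(idle * idle_share[q], total_share) if total_share else 0
        rates[q] = guaranteed[q] + extra
    return rates

LINK_KBPS = 2000  # assumed link rate, not stated in the text
guaranteed = {"cbr": 980, "abr1": 10, "abr2": 10, "ubr": 0}   # kbps
idle_share = {"cbr": 0, "abr1": 1, "abr2": 1, "ubr": 1}       # assumed proportions

rates = decoupled_rates(LINK_KBPS, guaranteed, idle_share, set(guaranteed))
for name in sorted(rates):
    print(name, float(rates[name]))
```

Under these assumed proportions the CBR connection stays at its 980 kbps guarantee, while the 1000 kbps of idle bandwidth is split equally among the two ABR connections and the UBR connection, regardless of their guaranteed rates.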
In the preferred embodiment, the above method is carried out by a hierarchical scheduler which comprises (a) an exhaustive scheduler servicing a plurality of lower level schedulers in accordance with non-equal priority levels assigned thereto; (b) a non-work conserving shaper scheduler feeding the exhaustive scheduler; and (c) a work conserving idle bandwidth scheduler feeding the exhaustive scheduler. The exhaustive scheduler is configured so that the shaper scheduler is given exhaustive priority over the idle bandwidth scheduler. The hierarchical scheduler is coupled to the queues such that each queue concurrently contends for service from the shaper scheduler and from the idle bandwidth scheduler.
The non-work conserving shaper scheduler, such as a virtual clock shaper described below, generates a stream of data packets at a constant average bit rate. Since the shaper scheduler servicing a particular queue (which may correspond to one connection) has a higher priority than the work conserving idle bandwidth scheduler, such as a WFQ scheduler, the queue is guaranteed its allocated service rate during its busy period. However, the shaper scheduler does not always submit messages (or in the preferred embodiment, the identity of queues) to the exhaustive scheduler because not all queues are busy at all times, and even if a queue is busy, it may not be eligible to be serviced due to the non-work conserving nature of shaping. These periods constitute the idle bandwidth of the resource. During this "idle" time, the lower priority work conserving idle bandwidth scheduler servicing the queue is able to feed the exhaustive scheduler. The idle bandwidth scheduler distributes this idle bandwidth in a manner which is preferably non-dependent upon the guaranteed service rates allocated to the queues. In the preferred embodiment, the idle bandwidth scheduler partitions the instantaneous idle bandwidth in a fixed manner or ratio between QoS classes, and equally between all connections associated with a particular QoS class.
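The interaction of the three schedulers described above can be sketched at the slot level. The following Python sketch is a simplified model under stated assumptions, not the implementation of the preferred embodiment: time advances in fixed cell slots, the shaper keeps one next-eligible time per queue (a simple stand-in for a virtual clock shaper), and the idle bandwidth scheduler is a start-tag weighted fair pick standing in for WFQ. The class name, method names, rates and idle weights are all hypothetical.

```python
from collections import Counter
from fractions import Fraction

class HierarchicalScheduler:
    def __init__(self, rates, idle_weights):
        self.rates = rates                # guaranteed rate, in cells per slot
        self.idle_weights = idle_weights  # share of idle bandwidth (0 = none)
        self.backlog = {q: 0 for q in rates}
        self.eligible_at = {q: Fraction(0) for q in rates}  # shaper state
        self.start_tag = {q: Fraction(0) for q in rates}    # idle scheduler state
        self.vtime = Fraction(0)
        self.now = 0

    def enqueue(self, q, cells=1):
        if self.backlog[q] == 0:
            # A queue returning from idle may not claim credit for its idle time.
            self.eligible_at[q] = max(self.eligible_at[q], self.now)
        self.backlog[q] += cells

    def _shaper_pick(self):
        # Non-work conserving: a busy queue is offered only once it is eligible.
        ready = [q for q in self.backlog if self.backlog[q] > 0
                 and self.rates[q] > 0 and self.eligible_at[q] <= self.now]
        if not ready:
            return None
        q = min(ready, key=lambda k: (self.eligible_at[k], k))
        self.eligible_at[q] += Fraction(1) / self.rates[q]
        return q

    def _idle_pick(self):
        # Work conserving: always offers a queue if any weighted queue is busy.
        busy = [q for q in self.backlog
                if self.backlog[q] > 0 and self.idle_weights[q] > 0]
        if not busy:
            return None
        tag = {q: max(self.start_tag[q], self.vtime) for q in busy}
        q = min(busy, key=lambda k: (tag[k], k))
        self.vtime = tag[q]
        self.start_tag[q] = tag[q] + Fraction(1) / self.idle_weights[q]
        return q

    def tick(self):
        """Exhaustive scheduler: the shaper always wins over the idle scheduler."""
        q = self._shaper_pick()
        if q is None:
            q = self._idle_pick()
        if q is not None:
            self.backlog[q] -= 1
        self.now += 1
        return q

# Assumed example: CBR guaranteed half the link, ABR one tenth, UBR nothing;
# idle bandwidth shared equally by ABR and UBR only.
sched = HierarchicalScheduler(
    rates={"cbr": Fraction(1, 2), "abr": Fraction(1, 10), "ubr": Fraction(0)},
    idle_weights={"cbr": 0, "abr": 1, "ubr": 1},
)
for name in ("cbr", "abr", "ubr"):
    sched.enqueue(name, 1000)

counts = Counter(sched.tick() for _ in range(100))
print(dict(counts))
```

In this run of 100 slots with all three queues continuously busy, the CBR queue receives only its guaranteed 50 slots, the ABR queue receives its guaranteed 10 slots plus half of the 40 idle slots, and the UBR queue receives the other 20 idle slots. In this sketch, service obtained through the idle bandwidth scheduler does not advance a queue's shaper clock; that independence is a modelling assumption.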
In certain preferred embodiments, the shaper scheduler and the idle bandwidth scheduler are also each preferably composed of a plurality of sub-schedulers in order to more flexibly accommodate the distribution of idle bandwidth in an ATM application environment, as explained in greater detail below.
According to another broad aspect of the invention, there is provided a hierarchical scheduler for servicing a plurality of queues holding messages. This scheduler comprises an exhaustive sub-scheduler servicing a plurality of lower level sub-schedulers in accordance with non-equal priority levels assigned thereto; M non-work conserving shaper sub-schedulers feeding the exhaustive sub-scheduler; and N work conserving idle bandwidth sub-schedulers feeding the exhaustive sub-scheduler. A given queue concurrently contends for service from one of the shaper sub-schedulers and from one of the idle bandwidth sub-schedulers, and the shaper sub-scheduler servicing the given queue has a higher priority level with respect to the exhaustive sub-scheduler than the idle bandwidth sub-scheduler servicing the given queue.
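The priority constraint in the preceding paragraph lends itself to a simple configuration check. The following Python sketch assumes that priority levels are integers with a lower number meaning a higher priority; the sub-scheduler names, queue names and the function misordered_queues are hypothetical, with M=2 and N=2 chosen for illustration.

```python
def misordered_queues(shaper_of, idle_of, priority):
    """Return the queues whose shaper does not outrank their idle sub-scheduler.

    shaper_of / idle_of: queue name -> name of the sub-scheduler serving it
    priority:            sub-scheduler name -> level (lower = higher priority)
    """
    return sorted(q for q in shaper_of
                  if priority[shaper_of[q]] >= priority[idle_of[q]])

# Two shaper sub-schedulers (M=2) and two idle bandwidth sub-schedulers (N=2).
priority = {"shaper_rt": 0, "shaper_nrt": 1, "idle_abr": 2, "idle_ubr": 3}
shaper_of = {"q1": "shaper_rt", "q2": "shaper_nrt", "q3": "shaper_nrt"}
idle_of = {"q1": "idle_abr", "q2": "idle_abr", "q3": "idle_ubr"}

valid = misordered_queues(shaper_of, idle_of, priority)

# A deliberately broken assignment: q3's idle sub-scheduler outranks its shaper.
bad_priority = dict(priority, idle_ubr=0, shaper_rt=4)
invalid = misordered_queues(shaper_of, idle_of, bad_priority)
print(valid, invalid)
```

An empty result means every queue's shaper sub-scheduler is served ahead of its idle bandwidth sub-scheduler by the exhaustive sub-scheduler, as the paragraph requires.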