The present invention relates generally to communication networks, and more specifically, to a method and system for rate-based scheduling.
High-speed networks are designed to carry services with a wide range of traffic characteristics and quality-of-service (QoS) requirements. A common task for rate-based, QoS-enabled scheduling is ensuring that each queue is guaranteed a minimum rate, that excess bandwidth is shared in a fair manner, that no queue exceeds its specified maximum rate, and that the link is maximally utilized within the maximum rate constraints. While many implementations target this problem, current implementations either scale linearly with the number of queues or result in substantial underutilization or unfairness.
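The four goals above can be illustrated with a small worked computation. The following is a minimal sketch only, not a scheduler from the source: it assumes every queue is continuously backlogged, that the minimum rates together fit within the link rate, and that "fair" means equal sharing of the excess among queues still below their maximum. The function name `ideal_allocation` and its parameters are hypothetical.

```python
def ideal_allocation(link_rate, min_rates, max_rates):
    """Return the target rate per queue under the stated assumptions.

    Each queue first receives its minimum rate. Remaining bandwidth is then
    split equally among queues that are still below their maximum rate,
    repeating until the link is full or every queue is capped.
    """
    eps = 1e-9
    alloc = list(min_rates)
    remaining = link_rate - sum(alloc)
    # Queues that can still accept more bandwidth.
    active = [i for i in range(len(alloc)) if alloc[i] < max_rates[i] - eps]

    while remaining > eps and active:
        share = remaining / len(active)
        still_active = []
        for i in active:
            # Never push a queue above its maximum rate.
            grant = min(share, max_rates[i] - alloc[i])
            alloc[i] += grant
            remaining -= grant
            if alloc[i] < max_rates[i] - eps:
                still_active.append(i)
        if len(still_active) == len(active):
            # Nobody hit a cap, so the whole remainder was handed out.
            break
        active = still_active
    return alloc


if __name__ == "__main__":
    # Three queues on a 100-unit link: the first queue is capped at 20,
    # and the other two split what is left equally.
    print(ideal_allocation(100, [10, 10, 10], [20, 50, 100]))
```

In this example the first queue stops at its maximum, and the bandwidth it cannot use is redistributed to the other two, so the link stays fully utilized.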
As noted above, in environments with large numbers of queues it is important that the scheduling system operate in a scalable manner. Incoming traffic streams flow through the queues (q0−qn), and a scheduler serves each queue so as to support its maximum rate (m0−mn). Conventional schedulers use O(n) (order (n)) algorithms to perform the scheduling and shaping of the queues. However, at some level of scalability, these algorithms become either infeasible (e.g., there is no available CPU that can process at the required rate) or cost-ineffective (e.g., the amount of hardware required to perform the O(n) operation at the required performance level is too costly).
A typical approach taken by conventional scheduler implementations to ensure rate-limiting to maximum rates is to use a set of token-bucket shapers (one per queue). This typically results in O(n) complexity, as previously described. It is possible to reduce the complexity, in cases where the total of the maximum rates of all queues does not exceed the link rate, by scheduling the queue eligibility times on a real-time calendar queue. However, this approach does not work in the typical case in which the maximum rates are overbooked, and it also causes substantial underutilization.
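To make the O(n) cost of per-queue token-bucket shaping concrete, the following is a minimal sketch under stated assumptions, not an implementation from the source. The names `ShapedQueue` and `pick_next`, the burst size parameter, and the representation of packets as byte counts are all hypothetical. Each scheduling decision may have to refill and test the bucket of every queue before finding an eligible one.

```python
from collections import deque


class ShapedQueue:
    """One queue with its own token-bucket shaper (hypothetical structure)."""

    def __init__(self, max_rate_bps, burst_bytes):
        self.rate = max_rate_bps / 8.0   # token fill rate in bytes per second
        self.burst = burst_bytes         # bucket depth in bytes
        self.tokens = float(burst_bytes) # bucket starts full
        self.last = 0.0                  # time of the last refill, in seconds
        self.packets = deque()           # pending packet sizes in bytes

    def refill(self, now):
        # Add tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now


def pick_next(queues, now, start=0):
    """Scan up to n queues from `start` and send from the first eligible one.

    Returns (queue_index, packet_size), or None if no queue may send.
    The worst case touches every queue, which is the O(n) cost.
    """
    n = len(queues)
    for offset in range(n):
        idx = (start + offset) % n
        q = queues[idx]
        q.refill(now)
        if q.packets and q.tokens >= q.packets[0]:
            size = q.packets.popleft()
            q.tokens -= size
            return idx, size
    return None


if __name__ == "__main__":
    # Two queues, each limited to 8000 bit/s (1000 bytes/s), burst 1000 bytes.
    queues = [ShapedQueue(8000, 1000), ShapedQueue(8000, 1000)]
    queues[0].packets.extend([1000, 1000])
    queues[1].packets.append(500)
    for t in (0.0, 0.0, 0.5, 1.0):
        print(t, pick_next(queues, t))
```

At time 0.5 in this usage example a packet is waiting in the first queue, but its bucket holds too few tokens, so the scan visits every queue and returns nothing. With thousands of queues that full scan is repeated for each decision.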