1. Technical Field
The present invention relates in general to telecommunication networks, and in particular to a method and system for efficiently allocating bandwidth among multiple pipes that share a common queue. Still more particularly, the present invention relates to dynamically adjusting bandwidth allocation among multiple pipes that share a common queue in accordance with a maximum output threshold level required for preventing downstream congestion.
2. Description of the Related Art
Switches are often used in telecommunications to couple portions of a network together or to couple networks. For example, FIG. 1 depicts a high-level block diagram of a switch 10 that can be implemented within a telecommunication network. Switch 10 includes a switch fabric 24 coupled with blades 7, 8 and 9. Each of blades 7, 8 and 9 is generally a circuit board and includes at least a network processor 2 coupled with ports 4. Each of ports 4 is in turn coupled with hosts (not shown). Blades 7, 8 and 9 can provide traffic to switch fabric 24 and accept traffic from switch fabric 24. Thus, any host connected with one of blades 7, 8 or 9 can communicate with another host connected to another of blades 7, 8 or 9.
FIG. 2A depicts a simplified block diagram of switch 10, wherein the functionality of the network processors is illustrated. Switch 10 couples hosts (not shown) connected with ports A 12 with those hosts (not shown) connected with ports B 36. Switch 10 performs various functions including classification of data packets delivered to switch 10, transmission of data packets across switch 10 and reassembly of packets. These functions are performed by a classifier 18, switch fabric 24, and a reassembler 30, respectively. Classifier 18 classifies packets that are provided to it and divides each packet into convenient-sized portions, which will be termed cells. Switch fabric 24 is a matrix of connections through which the cells are transmitted on their way through switch 10. Reassembler 30 reassembles the cells into the appropriate packets. The packets can then be provided to the appropriate port or ports 36, and output to the destination hosts. Classifier 18 may be part of one network processor 1, while reassembler 30 may be part of another network processor 5. The depicted portions of network processor 1 and network processor 5 process traffic traveling to and from ports A 12 and ports B 36. Thus, each network processor 1 and 5 can perform classification and reassembly functions. Furthermore, each network processor 1 and 5 can be a network processor 2 shown in FIG. 1.
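The division of packets into cells by classifier 18 and their later recombination by reassembler 30 can be illustrated with a minimal sketch. The 64-byte cell size, the tuple layout of a cell, and the function names `segment` and `reassemble` are assumptions made for illustration only; the description above does not specify a cell format.

```python
from typing import List, Tuple

CELL_SIZE = 64  # assumed cell payload size in bytes; not specified in the text

# Hypothetical cell layout: (packet_id, sequence number, is_last, payload)
Cell = Tuple[int, int, bool, bytes]


def segment(packet_id: int, packet: bytes, cell_size: int = CELL_SIZE) -> List[Cell]:
    """Divide one packet into cells of at most cell_size bytes."""
    chunks = [packet[i:i + cell_size] for i in range(0, len(packet), cell_size)] or [b""]
    last = len(chunks) - 1
    return [(packet_id, seq, seq == last, chunk) for seq, chunk in enumerate(chunks)]


def reassemble(cells: List[Cell]) -> bytes:
    """Rebuild a packet from its cells, tolerating out-of-order arrival."""
    ordered = sorted(cells, key=lambda cell: cell[1])
    if not ordered or not ordered[-1][2]:
        raise ValueError("final cell missing")
    if [cell[1] for cell in ordered] != list(range(len(ordered))):
        raise ValueError("cell missing or duplicated")
    return b"".join(cell[3] for cell in ordered)
```

In this sketch the sequence number and final-cell flag stand in for whatever framing a real switch fabric would carry with each cell.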
Continuing with FIG. 2A, due to traffic bottlenecks that arise across switch 10, data packets may be required to wait prior to execution of the classification, transmission and reassembly functions. Several queues 16, 22, 28 and 34 address this situation by providing convenient storage locations for delayed packets. Enqueuing mechanisms 14, 20, 26 and 32 are coupled to queues 16, 22, 28 and 34, respectively. Enqueuing mechanisms 14, 20, 26 and 32 enqueue the packets or cells into the corresponding queues 16, 22, 28 and 34 and can furthermore provide a suitable notification to the host from which an enqueued packet originated.
Although queues 16, 22, 28 and 34 are depicted separately, one of ordinary skill in the art will readily realize that some or all of the queues 16, 22, 28 and 34 may be part of the same physical memory resource. FIG. 2B depicts one such switch 10′. Many of the components of switch 10′ are analogous to components of switch 10. Such components are, therefore, labeled similarly. For example, ports A 12′ in switch 10′ correspond to ports A 12 in switch 10. As depicted in FIG. 2B, in switch 10′, a queue 16′ and a queue 22′ share a single memory resource 19. Similarly, a queue 28′ and a queue 34′ are part of another single memory resource 31. Thus, in switch 10′, queues 16′, 22′, 28′ and 34′ are logical queues partitioned from memory resources 19 and 31.
Conventional switches process traffic flows from switch queues uniformly. There is, however, a trend toward providing customers with different services based, for example, on the price paid by a customer for service. A customer may wish to pay more to ensure a faster response or to ensure that the traffic for the customer will be transmitted even when traffic for other customers is dropped due to congestion. Thus, the concept of differentiated services has been developed. Differentiated services (DiffServ) can provide different levels of service, or flows of traffic through the network, for different customers.
DiffServ is an emerging Internet Engineering Task Force (IETF) standard for providing differentiated services (see IETF RFC 2475 and related RFCs). DiffServ is based on behavior aggregate flows. A behavior aggregate flow can be viewed as a pipeline from one edge of the network to another edge of the network. Within each behavior aggregate flow, there could be hundreds of sessions between individual hosts. However, DiffServ is unconcerned with sessions within a behavior aggregate flow. Instead, DiffServ is concerned with allocation of bandwidth between the behavior aggregate flows. According to DiffServ, excess bandwidth is to be allocated fairly between behavior aggregate flows. Furthermore, one interpretation of DiffServ provides criteria, discussed below, for measuring the level of service provided to each behavior aggregate flow.
A conventional mechanism for providing different levels of service utilizes a combination of weights and a queue level. FIG. 3 depicts such a conventional method 50. The queue level thresholds and weights are set, via step 52. Typically, the queue level thresholds are set in step 52 by a network administrator turning knobs. The weights can be set for different pipes, or flows, through a particular queue, switch 10 or network processor 1 or 5. Thus, the weights are typically set for different behavior aggregate flows. The queue levels are observed, typically at the end of a period of time known as an epoch, via step 54. The flows for the pipes are then changed based on how the queue level compares to the queue level threshold and on the weights, via step 56. Flows for pipes having a higher weight undergo a greater change in step 56. The flow for a pipe determines what fraction of traffic offered to a queue, such as the queue 16, by the pipe will be transmitted to the queue 16 by the corresponding enqueuing mechanism, such as the enqueuing mechanism 14. Traffic is thus transmitted to the queue or dropped based on the flows, via step 58. A network administrator then determines whether the desired levels of service are being met, via step 60. If so, the network administrator has completed his or her task. However, if the desired level of service is not achieved, then the queue level thresholds and, possibly, the weights are reset, via step 52, and the method 50 repeats.
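Steps 54 through 58 of the conventional method 50 can be sketched as follows. The description above does not give the adjustment formula, so this sketch assumes a fixed per-epoch step scaled by each pipe's weight, an increase when the queue level is below the threshold, and a decrease otherwise. The names `adjust_flows`, `enqueue_or_drop` and `STEP` are hypothetical.

```python
import random
from typing import Dict

STEP = 0.05  # assumed base adjustment per epoch; the text gives no value


def adjust_flows(flows: Dict[str, float], weights: Dict[str, float],
                 queue_level: float, threshold: float,
                 step: float = STEP) -> Dict[str, float]:
    """Steps 54-56: compare the observed queue level with its threshold and
    change each pipe's transmit fraction in proportion to its weight."""
    # Assumption: a queue below its threshold lets flows rise; otherwise they fall.
    direction = 1.0 if queue_level < threshold else -1.0
    return {
        pipe: min(1.0, max(0.0, flow + direction * step * weights[pipe]))
        for pipe, flow in flows.items()
    }


def enqueue_or_drop(transmit_fraction: float, rng: random.Random) -> bool:
    """Step 58: transmit an offered packet with probability equal to the
    pipe's flow (transmit fraction); return False to drop it."""
    return rng.random() < transmit_fraction
```

Because the only tunable inputs here are the threshold and the weights, the sketch also shows why an administrator must infer per-pipe service levels indirectly from queue behavior.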
Although the method 50 functions, one of ordinary skill in the art will readily realize that it is difficult to determine what effect changing the queue level thresholds will have on particular pipes through the network. A network administrator using the method 50 may have to engage in a great deal of experimentation before reaching the desired flow rate for different customers, or pipes (behavior aggregate flows), in a computer network. This problem is particularly acute with regard to determining the effect that queue thresholds will have on downstream traffic. Conventional bandwidth allocation techniques require feedback from downstream devices as a parameter for adjusting shared queue pipe flow rates and are thus subject to increased cycle adjustment latency.
Furthermore, the method 50 only has indirect and imprecise effects on parameters that are typically used to measure the quality of service. Queue levels are not a direct measure of the criteria typically used to characterize a particular service. Typically, for example in DiffServ (see IETF RFC 2475 and related RFCs), levels of service are measured by four parameters: drop rate, bandwidth, latency and jitter. The drop rate is the percentage of traffic that is dropped as it flows across a switch. The bandwidth of a behavior aggregate flow is a measure of the amount of traffic for the behavior aggregate flow which crosses the switch and reaches its destination. Latency is the delay incurred in sending traffic across the network. Jitter is the variation of latency with time. Because the method 50 does not directly address any of these criteria for quality of service, it is more difficult for a network administrator to utilize the method 50 for providing different levels of service.
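As an illustration of the four parameters defined above, the following sketch computes them from per-packet records for one behavior aggregate flow. The record layout and the function name `service_metrics` are assumptions. Jitter is taken here as the mean absolute difference between latencies of consecutively delivered packets, which is only one of several possible measures of "variation of latency with time".

```python
from statistics import mean
from typing import Dict, List, Optional


def service_metrics(latencies_s: List[Optional[float]],
                    sizes_bytes: List[int],
                    interval_s: float) -> Dict[str, float]:
    """latencies_s[i] is the transit delay of packet i in seconds, or None if
    the packet was dropped; sizes_bytes[i] is its size; interval_s is the
    length of the observation window in seconds."""
    if not latencies_s or len(latencies_s) != len(sizes_bytes) or interval_s <= 0:
        raise ValueError("need matching, non-empty records and a positive interval")
    delivered = [(lat, size) for lat, size in zip(latencies_s, sizes_bytes)
                 if lat is not None]
    delays = [lat for lat, _ in delivered]
    # Assumed jitter measure: change in latency between consecutive deliveries.
    deltas = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return {
        "drop_rate_pct": 100.0 * (len(latencies_s) - len(delivered)) / len(latencies_s),
        "bandwidth_bps": 8.0 * sum(size for _, size in delivered) / interval_s,
        "latency_s": mean(delays) if delays else float("nan"),
        "jitter_s": mean(deltas) if deltas else 0.0,
    }
```

None of these quantities can be read directly from a queue level, which is the limitation of the method 50 noted above.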
Another conventional method for controlling traffic utilizes flows, minimum flow rates, weights, priorities, thresholds and a signal indicating that excess bandwidth, or ability to transmit traffic, exists in order to control flows. However, it is not clear that this conventional method is a stable mechanism for controlling the output rate from a port in a switch. Consequently, this conventional method may not adequately control traffic through switch 10.
It can therefore be appreciated that a need exists within a shared queue environment for allocating bandwidth among a plurality of pipes such that differentiated services may be provided while observing an output flow limit from the shared queue. The present invention addresses such a need.
A method for dynamically adjusting the flow rate of a plurality of logical pipes that share a common output queue is disclosed herein. In accordance with the method of the present invention, a minimum flow rate and a maximum flow rate are set for each of the pipes. Next, a determination is made of whether or not excess queue bandwidth exists in accordance with the output flow rate of the shared queue. This determination comprises comparing the output flow rate of the shared queue with a pre-determined threshold queue output value. An instantaneous excess bandwidth signal has a value of 1 if there is excess bandwidth and a value of 0 otherwise. In an alternate embodiment, the instantaneous excess bandwidth signal for a particular pipe is logically ANDed with one or more additional excess bandwidth signals to form a composite instantaneous excess bandwidth signal. In response to the existence of excess queue bandwidth, a flow rate of a pipe is linearly increased, while in response to a lack of excess queue bandwidth, the flow rate of the pipe is exponentially decreased.
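The adjustment summarized above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the claimed implementation: excess bandwidth is assumed to exist when the queue output rate is below the threshold, the increment `increase` and factor `decay` are hypothetical constants, and the minimum and maximum flow rates are assumed to act as simple clamps on the result.

```python
def excess_bandwidth_signal(queue_output_rate: float, threshold: float) -> int:
    """Instantaneous signal: 1 if the shared queue's output rate is below the
    pre-determined threshold (assumed meaning of 'excess'), else 0."""
    return 1 if queue_output_rate < threshold else 0


def composite_signal(*signals: int) -> int:
    """Alternate embodiment: logical AND of several excess bandwidth signals."""
    return int(all(signals))


def update_flow(flow: float, f_min: float, f_max: float, excess: int,
                increase: float = 0.01, decay: float = 0.9) -> float:
    """Linear increase when excess bandwidth exists, exponential decrease
    otherwise; increase and decay are assumed values, not from the text."""
    new_flow = flow + increase if excess else flow * decay
    # Assumption: the per-pipe minimum and maximum bound the adjusted flow.
    return min(f_max, max(f_min, new_flow))
```

Repeated calls with `excess` equal to 0 shrink the flow geometrically, while calls with `excess` equal to 1 raise it by a constant amount each time, so a pipe backs off quickly under congestion and recovers gradually.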
All objects, features, and advantages of the present invention will become apparent in the following detailed written description.