The present invention relates to computer networks, and more particularly to a method and system for controlling discarding and, therefore, transmission of data packets in a computer network.
Driven by increasing usage of a variety of network applications, such as those involving the Internet, computer networks are of increasing interest. In order to couple portions of a network together or to couple networks, switches are often used. For example, FIG. 1 depicts a simplified block diagram of a switch 10 which may be used in a computer network. The switch 10 couples hosts (not shown) connected with ports A 12 with those hosts (not shown) connected with ports B 36. The switch 10 performs various functions including classification of data packets provided to the switch 10, transmission of data packets across the switch 10 and reassembly of packets. These functions are provided by the classifier 18, the switch fabric 24 and the reassembler 30, respectively. The classifier 18 classifies packets which are provided to it and breaks each packet up into convenient-sized portions, which will be termed cells. The switch fabric 24 is a matrix of connections through which the cells are transmitted on their way through the switch 10. The reassembler 30 reassembles the cells into the appropriate packets. The packets can then be provided to the appropriate port of the ports B 36, and output to the destination hosts.
Due to bottlenecks in transferring traffic across the switch 10, data packets may be required to wait prior to execution of the classification, transmission and reassembly functions. As a result, queues 16, 22, 28 and 34 may be provided. Coupled to the queues 16, 22, 28 and 34 are enqueuing mechanisms 14, 20, 26 and 32. The enqueuing mechanisms 14, 20, 26 and 32 place the packets or cells into the corresponding queues 16, 22, 28 and 34 and can provide a notification which is sent back to the host from which the packet originated.
Although the queues 16, 22, 28 and 34 are depicted separately, one of ordinary skill in the art will readily realize that some or all of the queues 16, 22, 28 and 34 may be part of the same physical memory resource. FIG. 1B depicts one such switch 10′. Many of the components of the switch 10′ are analogous to components of the switch 10. Such components are, therefore, labeled similarly. For example, the ports A 12′ in the switch 10′ correspond to the ports A 12 in the switch 10. In the switch 10′, the queue A 16 and the queue B 22 share a single memory resource 19. Similarly, the queue C 28 and the queue D 34 are part of another single memory resource 31. Thus, in the switch 10′, the queues 16, 22, 28 and 34 are logical queues partitioned from the memory resources 19 and 31.
Conventional methods have been developed in order to control traffic flowing through the switch 10 or 10′, thereby improving performance of the network in which the switch 10 or 10′ is used. In particular, a conventional method known as RED (random early discard or detection) is used. FIG. 2 depicts the conventional method 40 used in RED. The conventional method 40 is typically used by one of the enqueuing mechanisms 14, 20, 26, 32, 14′, 20′, 26′ and 32′ to control the traffic through the corresponding queue 16, 22, 28, 34, 16′, 22′, 28′ and 34′, respectively. For the purposes of clarity, the method 40 will be explained with reference to the enqueuing mechanism 14 and the queue 16.
At the end of a short period of time, known as an epoch, a queue level of the queue 16 for the epoch is determined by the enqueuing mechanism 14, via step 41. Note that the queue level determined could be an average queue level for the epoch. In addition, the queue level determined could be the total level for the memory resource of which the queue 16 is a part. It is then determined if the queue level is above a minimum threshold, via step 42. If the queue level is not above the minimum threshold, then a conventional transmission fraction is set to one, via step 43. Step 43, therefore, also sets the conventional discard fraction to be zero. The transmission fraction determines the fraction of packets that will be transmitted in the next epoch. The conventional discard fraction determines the fraction of packets that will be dropped. The conventional discard fraction is, therefore, equal to one minus the conventional transmission fraction. A transmission fraction of one thus indicates that all packets should be transmitted and none should be dropped.
If it is determined in step 42 that the queue level is above the minimum threshold, then it is determined whether the queue level for the epoch is above a maximum threshold, via step 44. If the queue level is above the maximum threshold, then the conventional transmission fraction is set to zero and the conventional discard fraction set to one, via step 45. If the queue level is not above the maximum threshold, then the conventional discard fraction is set to be proportional to the queue level of the previous epoch divided by a maximum possible queue level or, alternatively, to some other linear function of the queue level, via step 46. Thus, the conventional discard fraction is proportional to the fraction of the queue 16 that is occupied or some other linear function of the queue level. In step 46, therefore, the conventional transmission fraction is also set, to one minus the conventional discard fraction. The conventional transmission fraction and the conventional discard fraction set in step 43, 45 or 46 are then utilized for the next epoch to randomly discard packets, via step 47. Thus, when the queue level is below the minimum threshold, all packets will be transmitted by the enqueuing mechanism 14 to the queue 16 during the next epoch. When the queue level is above the maximum threshold, all packets will be discarded by the enqueuing mechanism 14 during the next epoch or enqueued to a discard queue. When the queue level is between the minimum threshold and the maximum threshold, the fraction of packets discarded by the enqueuing mechanism 14 is proportional to the fraction of the queue 16 that is occupied or some other linear function of the queue level. Thus, the higher the queue level, the higher the fraction of packets discarded. In addition, a notification may be provided to the sender of discarded packets, which causes the sender to suspend sending additional packets for a period of time.
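The threshold logic of steps 42 through 46 can be sketched as follows. This is a minimal illustration only, not the implementation used in any particular switch. The function name, the parameter names, and the choice of the queue level divided by the maximum possible queue level as the linear function (with a proportionality constant of one) are assumptions made for the example.

```python
def red_fractions(queue_level, min_threshold, max_threshold, max_queue_level):
    """Return (transmission_fraction, discard_fraction) for the next epoch.

    Hypothetical helper illustrating steps 42-46 of the conventional method.
    """
    if queue_level <= min_threshold:
        # Step 43: queue level is not above the minimum threshold,
        # so every packet is transmitted.
        discard = 0.0
    elif queue_level > max_threshold:
        # Step 45: queue level is above the maximum threshold,
        # so every packet is discarded.
        discard = 1.0
    else:
        # Step 46 (assumed linear function): discard in proportion to
        # the fraction of the queue that is occupied.
        discard = queue_level / max_queue_level
    # The transmission fraction is always one minus the discard fraction.
    return 1.0 - discard, discard
```

Under these assumptions, with a minimum threshold of 20, a maximum threshold of 80 and a maximum possible queue level of 100, a queue level of 50 would yield a discard fraction of 0.5 and a transmission fraction of 0.5.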
The individual packets which are selected for discarding may also be randomly selected. For example, for each packet, the enqueuing mechanism 14 may generate a random number between zero and one. The random number is compared to the conventional discard fraction. If the random number is less than or equal to the conventional discard fraction, then the packet is dropped. Otherwise, the packet is transmitted to the queue 16. This process of discarding packets based on the transmission fraction is continued until it is determined that the epoch has ended, via step 48. When the epoch ends, the method 40 commences again in step 41 to determine the conventional transmission fraction for the next epoch and drop packets in accordance with the conventional transmission fraction during the next epoch.
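The per-packet random selection described above can be sketched as follows. The function name, the list-based queue, and the injectable random source `rng` are assumptions introduced for illustration; an actual enqueuing mechanism would typically be implemented in hardware or firmware.

```python
import random


def enqueue_epoch(packets, discard_fraction, queue, rng=random):
    """Randomly drop or enqueue each packet arriving during one epoch.

    Hypothetical helper: `queue` is any list-like object, and `rng` is any
    object with a random() method returning a float in [0, 1).
    Returns the list of dropped packets.
    """
    dropped = []
    for packet in packets:
        # Compare a fresh random number with the discard fraction.
        if rng.random() <= discard_fraction:
            dropped.append(packet)   # packet is dropped
        else:
            queue.append(packet)     # packet is transmitted to the queue
    return dropped
```

With a discard fraction of one, every packet is dropped, because the random number is always below one. With a discard fraction between zero and one, the share of dropped packets approaches the discard fraction over a long epoch.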
Because packets can be discarded based on the queue level, the method 40 allows some control over the traffic through the switch 10 or 10′. As a result, fewer packets may be dropped due to droptail than in a switch which does not have any mechanism for discarding packets before the queue 16 becomes full. Droptail occurs when packets must be dropped because a queue is full. As a result, there is no opportunity to account for the packet's priority in determining whether to drop the packet. Furthermore, in some situations, the method 40 can reduce the synchronization of hosts sending packets to the switch 10 or 10′. This occurs because packets may be dropped randomly, based on the conventional transmission fraction, rather than dropping all packets when the queue level is at or near the maximum queue level. Performance of the switch 10 and 10′ is thus improved over a switch that does not utilize RED, that is, a switch that simply drops next arriving packets when its buffer resources are depleted.
Although the method 40 improves the operation of the switches 10 and 10′, one of ordinary skill in the art will readily realize that in many situations, the method 40 fails to adequately control traffic through the switch 10 or 10′. Despite the fact that packets, or cells, may be dropped before the queue becomes full, the hosts tend to become synchronized in some situations. This is particularly true for moderate or higher levels of congestion of traffic in the switch 10 or 10′. The conventional transmission fraction is based on the queue level. However, the queue level may not be indicative of the state of the switch. For example, a queue level below the minimum threshold could be due to a low level of traffic in the switch 10 or 10′ (a low number of packets passing through the switch 10 or 10′). However, a low queue level could also be due to a large number of discards in the previous epoch because of high traffic through the switch 10. If the low queue level is due to a low traffic level, increasing the conventional transmission fraction is appropriate. If the low queue level is due to a high discard fraction, increasing the conventional transmission fraction may be undesirable. The conventional method 40 does not distinguish between these situations. As a result, the conventional transmission fraction may be increased when it should not be. When this occurs, the queue may become rapidly filled. The transmission fraction will then be dropped, and the queue level will decrease. When the queue level decreases, the transmission fraction will increase, and the queue may become filled again. The switch 10 or 10′ thus begins to oscillate between having queues full and queues empty. As a result, the average usage of the switch 10 or 10′ becomes quite low and the performance of the network using the switch 10 or 10′ suffers.
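The oscillation described above can be reproduced with a toy, deterministic model of a single queue. Everything in this sketch is an assumption made for illustration: the function name, the fluid approximation (the transmission fraction is applied to the offered load rather than to individual packets), the constant offered and service amounts per epoch, and the default thresholds. It is not a model of any particular switch.

```python
def simulate_red(offered, service, epochs, min_threshold=20.0,
                 max_threshold=80.0, max_queue_level=100.0):
    """Return the queue level at the end of each epoch (toy model).

    offered: amount of traffic arriving per epoch (assumed constant)
    service: amount of traffic drained from the queue per epoch (assumed)
    """
    level = 0.0
    trace = []
    for _ in range(epochs):
        # The transmission fraction is chosen from the queue level alone.
        if level <= min_threshold:
            transmission = 1.0
        elif level > max_threshold:
            transmission = 0.0
        else:
            transmission = 1.0 - level / max_queue_level
        # Admit the transmitted share of the offered load, drain the
        # serviced amount, and clamp to the physical limits of the queue.
        level = level + transmission * offered - service
        level = max(0.0, min(max_queue_level, level))
        trace.append(level)
    return trace


print(simulate_red(300, 100, 6))  # [100.0, 0.0, 100.0, 0.0, 100.0, 0.0]
```

In this toy run the offered load is three times the service amount. An empty queue caused by discarding everything is indistinguishable from an empty queue caused by light traffic, so the transmission fraction swings between one and zero and the queue alternates between full and empty.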
Accordingly, what is needed is a system and method for better controlling traffic through the switch. The present invention addresses such a need.
The present invention provides a method and system for controlling a flow of a plurality of packets in a computer network. The network includes a queue having a maximum queue level that is possible. The method and system comprise determining a queue level for the queue and determining an offered rate of the plurality of packets to the queue. The method and system also comprise determining a virtual maximum queue level based on the queue level and the maximum queue level and controlling a transmission fraction of the plurality of packets to the queue, based on the queue level, the offered rate and the virtual maximum queue level.
According to the system and method disclosed herein, the present invention provides a mechanism for improving the transmission fraction even at higher traffic rates so that the computer network is stable.