1. Field of the Invention
The present invention relates to an apparatus for controlling packet output including a plurality of queues in which packets are stored, and more particularly to an apparatus for controlling packet output that efficiently selects, from the packets stored at the top positions of the plural queues, the packet that is to be transmitted in the shortest length of time, and outputs the selected packet.
2. Description of the Related Art
The recent spread of ADSL, CATV, FTTH and the like provides homes and offices that have no dedicated communication lines with high-speed access to the Internet, enabling distribution of high-volume content, such as movies and news, over the Internet in addition to conventional e-mail and files.
In the future, various applications may be integrated on the IP (Internet Protocol) network to serve users. Therefore, the spread of broadband communication via the Internet or a LAN (Local Area Network) may expand the demand for routers (L3 switches), which play an important role in IP networks. Further, such routers require improved throughput to accommodate communication of higher-volume information.
An IP network provides various services having different traffic characteristics (e.g., tolerable loss rate, delay), exemplified by smooth sending of e-mail, transferring files via FTP (File Transfer Protocol) and distributing moving pictures and audio data. To prevent each of these services from deteriorating, a router needs to execute QoS (Quality of Service) control, which controls priorities and bandwidths in accordance with the traffic characteristics of each traffic flow (a plurality of successive packets that a user desires to transfer).
For example, for traffic data such as FTP data and WWW (World Wide Web) data, it is most important to transfer the data of each flow fairly and at a high speed utilizing the maximum resources (mainly bandwidth). For this purpose, a router has to allocate and secure the bandwidths reserved for the individual flows (or flow sets), and has to perform control fast enough for a high-speed output port to distribute the unused resources of the output port (output link) equally among the flows.
On the other hand, when audio traffic data exemplified by VoIP (Voice over IP) is transferred, it is more important to ensure a fixed transfer rate that minimizes delay jitter than to share the maximum bandwidth resources equally.
Here, a QoS scheme is shown in accompanying drawing FIG. 45. A conventional QoS control carries out the following steps (1) through (4):
(1) Identification (classification) of a flow of each input IP packet, which flow is defined in terms of individual application, with reference to an IP address, MAC (Media Access Control) address and ToS (Type of Service) field of the input packet;
(2) Distribution of each input packet to an appropriate queue corresponding to the identified flow number (queueing);
(3) Computing (scheduling) the order and the time of dequeuing the packets at the top positions of the individual queues, based on the lengths of the packets and the times at which the packets were queued, in accordance with the priority and the bandwidth set for each individual queue; and
(4) Reading the top packet of each queue in accordance with the scheduling result (dequeuing).
The succession of processes (1) through (4) realizes QoS control over packets, each distributed to the corresponding queue, in accordance with a flow of each packet.
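Steps (1) through (4) can be illustrated with a minimal sketch. The flow table, the priority table, the packet dictionary layout and the strict-priority rule used in `schedule` are all hypothetical stand-ins chosen for illustration; the scheduling step in particular is deliberately simplified and is not the WFQ computation described below.

```python
from collections import deque

# Hypothetical flow table: (source IP address, ToS value) -> flow number.
FLOW_TABLE = {("10.0.0.1", 0xB8): 0, ("10.0.0.2", 0x00): 1}
DEFAULT_FLOW = 1
# Hypothetical per-queue priority (a lower value is served first).
PRIORITY = {0: 0, 1: 1}

queues = {flow: deque() for flow in PRIORITY}

def classify(packet):
    """Step (1): identify the flow from header fields of the packet."""
    return FLOW_TABLE.get((packet["src"], packet["tos"]), DEFAULT_FLOW)

def enqueue(packet):
    """Step (2): store the packet in the queue of its flow."""
    queues[classify(packet)].append(packet)

def schedule():
    """Step (3), simplified: pick the non-empty queue with the best priority."""
    active = [flow for flow, q in queues.items() if q]
    return min(active, key=PRIORITY.get) if active else None

def dequeue():
    """Step (4): read the top packet of the queue chosen by the scheduler."""
    flow = schedule()
    return queues[flow].popleft() if flow is not None else None

enqueue({"src": "10.0.0.2", "tos": 0x00, "payload": b"file chunk"})
enqueue({"src": "10.0.0.1", "tos": 0xB8, "payload": b"voice frame"})
first = dequeue()  # the voice packet, although it was queued second
```

A real router would replace `schedule` with a bandwidth-aware algorithm such as WFQ or WRR.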
Among various algorithms for packet scheduling, the WFQ (Weighted Fair Queuing) and WRR (Weighted Round Robin) techniques are commonly used in the art. Above all, those skilled in the art regard the WFQ technique as the most suitable for fair distribution of bandwidths to variable-length packets. The algorithm of WFQ will now be described.
Accompanying drawing FIG. 46 shows the principles of controlling the bandwidth of a queue for which a fixed bandwidth is reserved. Supposing that the bandwidth set (reserved) for a queue that is the object of scheduling and the length of the packet at the top position (hereinafter called “the top packet”) of the object queue are respectively φ and L, outputting the top packet takes a time period of L/φ to complete. In the technique of controlling a bandwidth, the output completion due time (a scheduling evaluation factor) F of the packet is derived from L/φ. When the current time Tr indicated by a timer exceeds the output completion due time F, the subsequent packet is allowed to be output through the bandwidth. The packet output completion due time F is derived by the following formula (1).
$$F_i = \max\{F_{i-1},\, T_i\} + \left(\frac{L_i}{\phi}\right) \tag{1}$$
In formula (1), Ti denotes the arrival time of the i-th packet of the flow number k; Li, the length of the i-th packet of the flow number k; and max{Fi−1, Ti}, the greater of the previous-packet output completion due time Fi−1 and the packet arrival time Ti.
In the case of Ti<Fi−1, the packet output completion due time Fi is derived from the previous-packet output completion due time Fi−1 using formula (1), as shown in section (1) of FIG. 46; conversely, in the case of Ti≧Fi−1, the packet output completion due time Fi is derived from the packet arrival time Ti using formula (1), as shown in section (2) of FIG. 46.
Further, in WFQ technology, when a packet arrives, computation using formula (2) is performed on the i-th packet of every flow number k, these packets being the top packets of the individual queues, in order to obtain scheduling evaluation factors.
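The two cases of formula (1) can be checked with a short sketch. The function name, the units (seconds and bits) and the sample numbers are assumptions made for illustration only.

```python
def finish_time(prev_finish, arrival, length_bits, bandwidth_bps):
    """Formula (1): F_i = max{F_(i-1), T_i} + L_i / phi."""
    return max(prev_finish, arrival) + length_bits / bandwidth_bps

PHI = 1_000_000  # assumed reserved bandwidth: 1 Mbps
L = 8_000        # assumed packet length: 8000 bits, i.e. L / phi = 8 ms

f1 = finish_time(0.0, 0.0, L, PHI)
# Case T_i < F_(i-1): the packet arrives while the previous one is still
# being served, so its due time is counted from F_(i-1).
f2 = finish_time(f1, 0.002, L, PHI)
# Case T_i >= F_(i-1): the queue had drained, so the due time is counted
# from the arrival time T_i.
f3 = finish_time(f2, 0.050, L, PHI)
```

With these values `f2` is roughly 16 ms (8 ms after `f1`), while `f3` is roughly 58 ms (8 ms after the arrival at 50 ms).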
$$F_i^k = \max\{F_{i-1}^k,\, T_i^k\} + \left(\frac{L_i^k}{\phi_k}\right) \times \left(\frac{\phi_b}{R}\right) \tag{2}$$
In formula (2), φk denotes the bandwidth reserved for the flow k; φb, the total sum (Σ φk) of φk over the WFQ queues in the active state; and R, the total sum of bandwidths of the WFQ sets. For convenience, formula (2) is represented by the following formula (3).
$$F_i^k = \alpha_i^k + \beta_i^k \times \gamma_i^k \tag{3}$$
In formula (3), the α values of the top packets in the active state are constantly compared with the current time Tr indicated by a timer, and the top packet having the minimum scheduling evaluation factor (hereinafter simply called an “evaluation factor”) F is selected, from among the top packets with α values equal to or less than Tr (Tr≧α), to be output. A queue whose immediately previous packet took a long time to be output and whose current top packet is expected to take a long time to be output is given low priority for outputting (dequeuing) its current top packet.
As mentioned above, packet scheduling of WFQ determines the priority order of outputting the top packets from the queues in accordance with the packet output completion due time Fi of each top packet, thereby realizing fair output of packets among the queues. At the same time, WFQ equally allocates the unused bandwidth of each queue in the non-active state to the queues in the active state, whereupon service to high-speed network communication is improved because bandwidth resources are used efficiently to output packets from the queues in the active state at a high speed.
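The selection rule can be sketched as follows, reading formula (3) term by term against formula (2): α = max{Fi−1, Ti}, β = L/φk and γ = φb/R. The class name, the function names and the sample values are hypothetical, and a single γ shared by all queues is assumed, as formula (2) suggests.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TopPacket:
    queue_id: int
    alpha: float  # max{F_(i-1), T_i}: earliest time the packet may start
    beta: float   # L / phi_k: service time at the reserved bandwidth

def evaluation_factor(p: TopPacket, gamma: float) -> float:
    """Formula (3): F = alpha + beta * gamma."""
    return p.alpha + p.beta * gamma

def select_next(tops: List[TopPacket], gamma: float, tr: float) -> Optional[TopPacket]:
    """Among top packets with alpha <= Tr, return the one with minimum F."""
    eligible = [p for p in tops if p.alpha <= tr]
    if not eligible:
        return None
    return min(eligible, key=lambda p: evaluation_factor(p, gamma))

tops = [
    TopPacket(queue_id=0, alpha=0.0, beta=0.010),
    TopPacket(queue_id=1, alpha=0.0, beta=0.004),
    TopPacket(queue_id=2, alpha=5.0, beta=0.001),  # not yet eligible at Tr = 1.0
]
chosen = select_next(tops, gamma=0.6, tr=1.0)  # the top packet of queue 1
```

Queue 2 has the smallest β but is skipped because its α exceeds Tr. Because γ multiplies every β, any change in γ changes every evaluation factor.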
The active state represents the presence of at least one packet at an individual queue, and such a queue with one or more packets is called an “active queue”. A γ value in formula (3) changes in accordance with changes in states of the individual queues. Consequently, evaluation factors F for top packets should be re-computed each time states of the individual queues change.
As shown in FIG. 47, when there are five queues for each of which a 100 Mbps bandwidth is reserved and three of the five queues have one or more packets (i.e., only three queues are in the active state), the 200 Mbps of unused bandwidth reserved for the two non-active queues is equally distributed to the three active queues, and the bandwidth for each active queue therefore becomes about 167 Mbps. After the distribution, packet scheduling is carried out again based on the new bandwidths to output (dequeue) each packet to a corresponding port.
Conventionally, an arithmetic operation to obtain a scheduling evaluation factor in WFQ is mainly performed by software on a processor. Software, however, executes only sequential processing and takes an excessive length of time to reflect real-time γ values in the evaluation factors obtained by the arithmetic operations. Therefore, such a conventional arithmetic operation cannot achieve a high throughput in processing for the same output port or for successive packets in the same queue.
In contrast, when hardware performs such an arithmetic operation in an attempt to boost the throughput, the hardware needs a multiplier and a divider because the arithmetic operations in formula (2) include multiplication and division. With a multiplier and a divider, the hardware becomes much larger in scale, so that it is difficult to incorporate the hardware into an LSI or an FPGA (Field Programmable Gate Array). For this reason, WFQ arithmetic operations performed in hardware have restricted the number of queues for each output port to only a few.
With a plurality of output ports, each output port may be assigned a dedicated scheduler that performs the arithmetic operations for that output port. However, such a configuration realized in hardware may also result in a large circuit scale, making it substantially impossible to incorporate in an LSI or an FPGA. As a conventional solution, the amount of arithmetic operations may be reduced by a derivation algorithm for WFQ, such as a virtual clock, that does not re-compute γ values.
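The 167 Mbps figure follows from formula (2): scaling the service time L/φk by γ = φb/R is equivalent to serving the queue at φk/γ. The sketch below reproduces the arithmetic; the function name and the dictionary layout are assumptions made for illustration.

```python
def effective_bandwidths(reserved, active):
    """Return the bandwidth each active queue receives, phi_k / gamma."""
    if not active:
        return {}
    total = sum(reserved.values())              # R: all reserved bandwidth
    phi_b = sum(reserved[k] for k in active)    # reserved bandwidth of active queues
    gamma = phi_b / total
    return {k: reserved[k] / gamma for k in active}

reserved = {k: 100.0 for k in range(5)}          # five queues, 100 Mbps each
bw = effective_bandwidths(reserved, active=[0, 1, 2])
# gamma = 300 / 500 = 0.6, so each active queue gets 100 / 0.6 Mbps,
# which is approximately 167 Mbps.
```

Under this reading the unused bandwidth is shared in proportion to the reserved bandwidths, which reduces to an equal share when all reservations are equal, as in the example.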
However, a packet scheduler having a derivation algorithm, such as a virtual clock, causes the time scale used to compute the packet output completion due time F to differ from actual time, as disclosed in Japanese Patent Application Laid-Open (Kokai) No. 2000-101637 (hereinafter called “prior publication”). If a plurality of schedulers that control the order of outputting packets from the plural output ports are installed for service (QoS control), it is difficult to select the packet that is to be output most preferentially, or to perform other operations, across such plural schedulers.
The technique proposed in the prior publication reduces the amount of arithmetic operations needed to compute γ values without introducing a derivation algorithm (such as a virtual clock). However, even with this technique it is difficult to install a WFQ (scheduler) serving a number of output ports in an LSI or an FPGA without increasing the scale of the WFQ.
The prior publication focuses on two observations: for two packets whose β values are not much different, the order of their packet output completion due times F is unlikely to reverse even if their γ values change, and, even if the order does reverse, the time difference between the two output completion due times F is short. For this reason, the prior publication groups packets whose evaluation factors F have similar β values into clusters, and reflects changes in the γ values only in the evaluation factor F of the top pointer of each cluster. Thereby, it is possible to reduce the number of arithmetic operations, because the re-computation caused by a change in γ values is performed only as many times as the number (M) of clusters.
However, the technique of the prior publication reduces the number of arithmetic operations only when the differences among the β values are relatively small and the number M of clusters is much less than the number N of queues (M<<N). If the bandwidths φ of the individual services vary over a range from several dozen or several hundred kbps up to several Gbps, the range of β values must be restricted or the number M of clusters needs to be changed in accordance with that range.
Further, since changes in γ values are disregarded within each cluster, the number of clusters must be chosen so that the resulting error in the γ values remains tolerable for QoS. Latency in managing the pointers in each cluster creates bottlenecks in scheduling when the shortest packets are successively input to the apparatus. Even if the grouping into clusters reduces the number of arithmetic operations to some extent, more schedulers need to be installed as the number of output ports increases, whereupon it is still difficult to install the schedulers in an LSI or an FPGA.