The present invention relates to a node apparatus, such as a router or switch, which stores a received packet or cell in a given queue on the basis of the attributes of the packet or cell and controls the transmission of packets on the basis of transmission conditions set for the respective queues, and to a packet transmission control method and, more particularly, to a node apparatus which transmits packets from a large number of queues at high speed and to a packet transmission control method therefor.
Packet communication is performed by using variable-length packets, whereas ATM communication is performed by using fixed-length cells. These packets and cells are given attributes by means of information contained in them. A router and a switch process packets and cells in accordance with their attributes.
For the sake of simplicity, in this specification, a node apparatus such as a router or switch will be simply referred to as a node apparatus, and a packet or cell will be simply referred to as a packet hereinafter.
In general, a node apparatus has a plurality of queues corresponding to the attributes of packets, and transmission conditions are set for the respective queues. The node apparatus includes a packet transmission controller, stores a received packet in a given queue in accordance with the attributes of the packet, and transmits packets in accordance with the transmission conditions set for the respective queues.
As a conventional packet transmission control method, a weighted round robin scheme (M. Katevenis et al., “Weighted round-robin cell multiplexing in a general-purpose ATM switch chip”, IEEE Journal on Selected Areas in Communications, Vol. 9, No. 8, 1991; to be referred to as a WRR scheme hereinafter) is available. In the WRR scheme, a weight is set for each queue, and packets are transmitted from the respective queues in accordance with the ratio of the set weights.
In the WRR scheme, the respective queues transmit packets while sharing the line bandwidth in accordance with the weights allocated to them. A WRR node apparatus has a counter for each queue. Each counter records the number of packets that can be transmitted from the corresponding queue. The initial value of each counter is the value of the corresponding weight. A circulation sequence is set for the queues.
In this scheme, in accordance with the circulation sequence, the node apparatus searches for a queue in which one or more packets are stored and the counter value is 1 or more. Upon finding a queue in which one or more packets are stored and the counter value is 1 or more, the node apparatus transmits one packet from the queue and decrements the counter by one. The node apparatus begins to search for a queue in which one or more packets are stored and the counter value is 1 or more, from the queue next to the queue from which the packet has been transmitted, in accordance with the circulation sequence.
If there is no queue in which one or more packets are stored and the counter value is 1 or more, i.e., there is no queue that can transmit a packet, even after making one round of the queues, the node apparatus returns the counters of all the queues to the initial values, i.e., the values of the weights, and then starts processing again. Operation of returning the counters to the initial values will be referred to as reset operation hereinafter.
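The WRR operation described above can be illustrated by the following minimal sketch. It is an illustration only, not the implementation of the cited reference. The function name `wrr_schedule` and its parameters are hypothetical, every weight is assumed to be 1 or more, no packets are assumed to arrive while the loop runs, and the search is assumed to resume from the same position after reset operation, which the description above does not specify.

```python
from collections import deque

def wrr_schedule(queues, weights, max_packets):
    """Return (queue_index, packet) pairs in transmission order.

    Hypothetical helper: queues is a list of deques, weights a list of ints >= 1.
    """
    counters = list(weights)           # each counter starts at its weight
    sent = []
    pos = 0                            # where the circular search starts
    n = len(queues)
    while len(sent) < max_packets and any(queues):
        for step in range(n):          # at most one full round of checks
            i = (pos + step) % n
            if queues[i] and counters[i] >= 1:
                sent.append((i, queues[i].popleft()))
                counters[i] -= 1
                pos = (i + 1) % n      # resume from the next queue
                break
        else:
            counters = list(weights)   # reset operation: no queue was eligible
    return sent

a = deque(["a1", "a2", "a3", "a4"])
b = deque(["b1", "b2"])
order = wrr_schedule([a, b], [2, 1], 6)
```

In this sketch, the inner `for` loop may visit every queue before one packet is found, which is the per-packet cost proportional to the number of queues.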
In the WRR scheme, therefore, even when only one queue can transmit a packet, the node apparatus checks all the queues to see whether packets can be transmitted each time a packet is transmitted from that queue. In the worst case, the node apparatus needs to check all the queues to transmit one packet, so the processing amount is proportional to the number of queues. As the number of queues increases, therefore, it becomes difficult to increase the packet processing speed of the node apparatus.
Japanese Patent Laid-Open No. 2001-053798 discloses the first conventional method of speeding up packet processing in the WRR scheme. According to the method disclosed in Japanese Patent Laid-Open No. 2001-053798, a content addressable memory (CAM) is mounted in a node apparatus and used to select a queue that can transmit a packet. The state of each queue is recorded on the content addressable memory to allow a quick search for a queue that matches a search word. The node apparatus finds a queue that can transmit a packet next by searching the content addressable memory with a search word indicating a state wherein a packet can be transmitted.
A node apparatus based on this method can always make a quick search for a queue that can transmit a packet in a predetermined period of time regardless of the number of queues, and hence is effective for high-speed processing in a case wherein there are many queues.
Japanese Patent Laid-Open No. 11-041316 discloses the second conventional method of speeding up packet processing in the WRR scheme. According to the method disclosed in Japanese Patent Laid-Open No. 11-041316, a node apparatus manages first and second lists. The first list manages the numbers of queues that can transmit packets, i.e., queues in each of which the counter value is 1 or more and one or more packets are stored. The second list manages the numbers of queues each of which stores one or more packets but cannot transmit them because the counter value is 0.
The node apparatus extracts the queue number at the start of the first list and transmits a packet from the queue having the number. If the queue is still in a ready state after having transmitted the packet, the node apparatus adds the number of this queue to the end of the first list. When a packet is left and the counter value is 0 in the queue which has transmitted the packet, the node apparatus adds the queue to the second list. When no packet is left in the queue which has transmitted the packet, the node apparatus does not add the queue to any list.
When there is no queue that can transmit a packet, i.e., the first list becomes empty, the node apparatus performs reset operation with respect to the counters of all the queues. The node apparatus then adds all the queue numbers in the second list to the first list, and starts packet processing again.
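The two-list operation described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the cited publication. The name `two_list_wrr` is hypothetical, every weight is assumed to be 1 or more, no packets are assumed to arrive during the run, and Python deques stand in for the lists.

```python
from collections import deque

def two_list_wrr(queues, weights, max_packets):
    """Return (queue_index, packet) pairs in transmission order (hypothetical helper)."""
    counters = list(weights)
    # First list: queues that hold a packet and whose counter is 1 or more.
    first = deque(i for i, q in enumerate(queues) if q)
    # Second list: queues that hold a packet but whose counter is 0.
    second = deque()
    sent = []
    while len(sent) < max_packets:
        if not first:
            if not second:
                break                        # nothing left to transmit
            counters = list(weights)         # reset operation on all counters
            first, second = second, deque()  # move all waiting queues across
            continue
        i = first.popleft()                  # queue number at the start of the list
        sent.append((i, queues[i].popleft()))
        counters[i] -= 1
        if queues[i] and counters[i] >= 1:
            first.append(i)                  # still ready: end of the first list
        elif queues[i]:
            second.append(i)                 # packets left, counter is 0
        # an empty queue is added to neither list
    return sent

a = deque(["a1", "a2", "a3", "a4"])
b = deque(["b1", "b2"])
order = two_list_wrr([a, b], [2, 1], 6)
```

Each transmission touches only the head of the first list, so the per-packet work in this sketch does not grow with the number of queues.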
As the third conventional method, a deficit round robin scheme (M. Shreedhar and G. Varghese, “Efficient Fair Queuing using Deficit Round Robin”, IEEE/ACM Transactions on Networking, to be referred to as a DRR scheme hereinafter), which is another conventional packet transmission control method, is available. The DRR scheme is the same as the WRR scheme in that a node apparatus has weights and counters.
Unlike the WRR scheme, in which packets are cyclically transmitted one by one from the respective queues, the DRR scheme continuously transmits the packets in one queue which can be transmitted. The DRR scheme also differs from the WRR scheme in that the counter value represents the data amount (e.g., in bytes) of packets that can be transmitted. Accordingly, weights are also defined in units of data amount. Furthermore, when reset operation is performed for a counter, the weight is added to the counter.
For this reason, the node apparatus manages the first list described above. The node apparatus extracts a queue number from the start of the first list, and continuously transmits packets until no packet is left in the queue having the number or the counter value becomes smaller than the packet length of the first packet. Upon transmission of one packet, the node apparatus subtracts the packet length from the counter.
Upon transmission of all packets that can be transmitted from one queue, the node apparatus performs reset operation with respect to the counter of the queue. If one or more packets are stored in the queue, the node apparatus adds the queue to the end of the first list.
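The DRR operation described above can be sketched as follows. This is a minimal sketch rather than the algorithm exactly as published. The name `drr_schedule` is hypothetical, packets are represented only by their lengths in bytes, each counter is assumed to start at its weight, every weight is assumed to be positive, and no packets are assumed to arrive during the run.

```python
from collections import deque

def drr_schedule(queues, weights, max_packets):
    """Return (queue_index, packet_length) pairs in transmission order.

    Hypothetical helper: queues[i] holds packet lengths in bytes,
    weights[i] is the data amount in bytes added at each reset operation.
    """
    counters = list(weights)                 # assumed initial value: the weight
    first = deque(i for i, q in enumerate(queues) if q)  # the first list
    sent = []
    while first and len(sent) < max_packets:
        i = first.popleft()
        q = queues[i]
        # Transmit continuously while the head packet fits in the counter.
        while q and counters[i] >= q[0] and len(sent) < max_packets:
            length = q.popleft()
            counters[i] -= length            # subtract the packet length
            sent.append((i, length))
        counters[i] += weights[i]            # reset operation: add the weight
        if q:
            first.append(i)                  # packets remain: end of the first list
    return sent

order = drr_schedule([deque([300, 300, 300]), deque([500, 500])], [600, 500], 10)
```

Because the counter is measured in bytes, a queue with short packets sends several of them per visit, whereas a queue with long packets may send only one.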
According to still another conventional packet transmission control method, unlike the WRR scheme and DRR scheme, no counter is used, and transmission times are used instead. As examples of this method, a weighted fair queuing scheme (A. K. Parekh and R. G. Gallager, “A Generalized processor sharing approach to flow control in integrated services networks: The single node case”, IEEE INFOCOM92, Vol. 12, 1992; to be referred to as a WFQ scheme hereinafter) and various schemes originating from the WFQ scheme are available.
In the WFQ scheme, a node apparatus records a virtual finish time of transmission of the last packet transmitted from each queue, and sequentially transmits packets from the queue with the earliest finish time of transmission. In the WFQ scheme, therefore, when a packet is to be transmitted, the node apparatus needs to search all the queues for the queue which has the earliest finish time of transmission. The calculation amount of this search is generally proportional to the number of queues. As the number of queues increases, therefore, it becomes difficult to increase the packet processing speed of the node apparatus.
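The search cost described above can be illustrated by the following simplified sketch. It is not a complete WFQ implementation: the system virtual time of the published scheme is omitted, all queues are assumed to be backlogged from time 0, and the finish time of a head packet is assumed to be the recorded finish time of the queue plus the packet length divided by the weight. The name `wfq_schedule` is hypothetical.

```python
from collections import deque

def wfq_schedule(queues, weights, max_packets):
    """Return (queue_index, packet_length) pairs in transmission order.

    Hypothetical, simplified helper: queues[i] holds packet lengths,
    weights[i] is a positive number.
    """
    last_finish = [0.0] * len(queues)     # virtual finish time per queue
    sent = []
    while len(sent) < max_packets:
        best, best_time = None, None
        for i, q in enumerate(queues):    # linear scan over every queue
            if not q:
                continue
            t = last_finish[i] + q[0] / weights[i]
            if best is None or t < best_time:
                best, best_time = i, t    # ties go to the lower queue index
        if best is None:
            break                         # all queues are empty
        length = queues[best].popleft()
        last_finish[best] = best_time     # record the new finish time
        sent.append((best, length))
    return sent

order = wfq_schedule([deque([100] * 4), deque([100] * 2)], [2, 1], 10)
```

The `for` loop runs over all queues for every transmitted packet, which is the cost proportional to the number of queues.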
In the WFQ scheme, a conventional method of speeding up packet processing uses a calendar indicating a transmission schedule of packets. Note that this calendar is also called a time wheel or the like. The calendar includes a plurality of entries. The respective entries are table entries indicating times and are arranged in chronological order. On each entry, the number of a queue scheduled to transmit a packet at the corresponding time is recorded.
The node apparatus generates a calendar on the basis of the finish times of transmission, and uses the calendar when transmitting a packet. On an entry corresponding to the transmission time of each queue, the number of the queue is recorded. When a packet is to be transmitted, the node apparatus obtains a queue for transmitting the packet by reading the entries of the calendar in chronological order. With this operation, when a packet is to be transmitted, the node apparatus need not perform a large amount of processing proportional to the number of queues in finding a queue, of all the queues, which has the earliest finish time of transmission.
Various schemes similar to the WFQ scheme have been proposed. For example, the fourth conventional method is disclosed in Japanese Patent Laid-Open No. 2000-183959. According to the method disclosed in Japanese Patent Laid-Open No. 2000-183959, the node apparatus records, on each entry of a calendar, a queue which transmits a packet at the time corresponding to the entry. In this case, the node apparatus can record a plurality of queues on each entry in a list form.
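A calendar whose entries each hold a list of queue numbers can be sketched as follows. This is a minimal sketch, not the structure of the cited publication. The class name `Calendar` and its methods are hypothetical, times are assumed to be integers, and every scheduled time is assumed to lie less than one calendar length ahead of the entry currently being read.

```python
from collections import deque

class Calendar:
    """Hypothetical time wheel: one list of queue numbers per time entry."""

    def __init__(self, num_slots):
        self.slots = [deque() for _ in range(num_slots)]
        self.now = 0                      # index of the entry being read

    def schedule(self, queue_id, time):
        # Record the queue number on the entry for its transmission time.
        self.slots[time % len(self.slots)].append(queue_id)

    def next_queue(self):
        # Read entries in chronological order until a queue number is found.
        for _ in range(len(self.slots)):
            slot = self.slots[self.now]
            if slot:
                return slot.popleft()
            self.now = (self.now + 1) % len(self.slots)
        return None                       # the calendar is empty

cal = Calendar(8)
cal.schedule(3, 5)
cal.schedule(1, 2)
cal.schedule(7, 2)
picked = [cal.next_queue() for _ in range(4)]
```

In this sketch, finding the next queue costs at most one pass over the entries, independent of how many queues exist.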
In the first conventional scheme, the node apparatus needs to have a content addressable memory. Since the content addressable memory is much more expensive than a general memory, the cost of the node apparatus based on the first conventional scheme is very high. The node apparatus based on the first conventional scheme needs to have a general memory for counters in addition to a content addressable memory. This further increases the cost of the apparatus.
In the second conventional scheme, as shown in FIG. 14, the first or second list generally has a structure in which the respective data are concatenated by pointers. In this structure, each pointer must be read before the next pointer can be read. For this reason, the node apparatus cannot make the transmission intervals of packets shorter than the read intervals of pointers. According to the second conventional scheme, therefore, a node apparatus applicable to an ultrafast line cannot be realized.
In the third conventional scheme, since a node apparatus manages queue numbers in a list, the apparatus cannot be applied to an ultrafast line as in the second conventional scheme.
In the node apparatus based on the second conventional scheme, when a list on which queue numbers are recorded is managed in the form of a table in which queue numbers are arranged in order as shown in FIG. 15, the next queue number can be read out without waiting for the preceding queue number to be read out. This makes it possible to perform high-speed operation. However, as many tables as there are lists are required, and hence a large memory is required to manage a plurality of lists, as in the node apparatus based on the second conventional scheme.
In the fourth conventional scheme, the node apparatus manages queue numbers in a list for each transmission time. For this reason, when queue numbers are managed by using a list having a general structure in which the respective data are concatenated by pointers, the packet transmission intervals cannot be shortened more than the time required to read out queue numbers from the memory. The node apparatus cannot therefore cope with an ultrafast line.
According to the second to fourth conventional schemes, the node apparatus extracts a queue number from a list, reads out a state variable associated with the extracted number, and then transmits a packet. Upon transmitting the packet, the node apparatus updates the state variable, and adds the queue number to a proper list in accordance with the result. It takes more time to read out data from the memory than it does to perform other processing. Therefore, reading out state variables from the memory has hindered an increase in the processing speed of the node apparatus.