The present invention relates to a packet transfer method and apparatus which fairly transfer packets between a plurality of connections when the packets are to be transferred by using TCP (Transmission Control Protocol) used in the Internet and the like.
Conventionally, in packet communication, the fairness between connections has been improved by devising a packet scheduling scheme in packet transfer apparatuses. As a conventional scheduling scheme, a technique called DRR (Deficit Round Robin) is available (see, e.g., M. Shreedhar, G. Varghese, “Efficient fair queuing using deficit round robin”, Proc. ACM SIGCOMM 1995).
The above DRR will be described below with reference to FIGS. 14 and 15. The packet transfer apparatus shown in FIG. 14 includes an input terminal 001, a connection information storage section 002, a packet classifying section 003, a queue manager 004, a queue set 005 including queues 006 to 008, a default queue 011, a scheduler 009, and an output terminal 010.
If the packet transfer apparatus shown in FIG. 14 is installed between a plurality of communication apparatuses which perform packet communication using TCP/IP, packets input through the input terminal 001 are classified according to connections by the packet classifying section 003. The packet classifying section 003 performs header analysis and identifies the connection to which each packet belongs on the basis of the set of protocol type, source address, source port number, destination address, and destination port number.
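The classification key described above can be sketched as follows. This is only an illustrative sketch: the names ConnectionKey and classify, and the dictionary layout of the parsed header, are assumptions made for the example and do not appear in the apparatus of FIG. 14.

```python
from typing import NamedTuple


class ConnectionKey(NamedTuple):
    """Five fields that together identify one connection."""
    protocol: int   # IP protocol number (6 = TCP, 17 = UDP)
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


def classify(header: dict) -> ConnectionKey:
    """Build the key from already-parsed header fields (hypothetical layout)."""
    return ConnectionKey(
        header["protocol"],
        header["src_addr"],
        header["src_port"],
        header["dst_addr"],
        header["dst_port"],
    )
```

Because the key is an immutable tuple, it can be used directly as a lookup key when the connection information is later searched.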
If it is determined as a result of the header analysis that TCP is used as the transport layer protocol and a connection establishment packet in which the SYN flag is set in the TCP header portion has been received, the queue manager 004 generates a new queue, and new connection information containing the set of source and destination addresses and port numbers and the identifier information of the generated queue is registered in the connection information storage section 002. The packet is then stored in the generated queue. If it is determined as a result of the header analysis that TCP is used as the transport layer protocol and a data packet has been received, the queue manager 004 queries the connection information storage section 002 by using the set of source and destination addresses and port numbers as a key, thereby obtaining the identifier of the queue in which the packet should be stored, and stores the packet in that queue.
If it is determined as a result of the header analysis that TCP is used as the transport layer protocol and a connection release packet in which the FIN flag in the TCP header portion is set has been received, the queue manager 004 obtains the identifier of the queue in which the packet should be stored from the connection information storage section 002 by using the set of source and destination addresses and port numbers as a key, and stores the packet in the corresponding queue. Thereafter, the queue manager 004 requests the connection information storage section 002 to erase the corresponding registered information. Upon reception of the request to erase the registered information, the connection information storage section 002 erases the connection information after a lapse of a predetermined period of time. Furthermore, if it is determined as a result of the header analysis that a packet using a transport layer protocol other than TCP, such as UDP (User Datagram Protocol), has been received, the packet is stored in the default queue 011 set in advance, and this queue is processed with bandwidth secured independently of the TCP connections.
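The handling of connection establishment, data, connection release, and non-TCP packets described above might be sketched as follows. The class QueueManager, its attribute names, and the packet dictionary layout are hypothetical. The treatment of a TCP packet for which no connection is registered is not described above; storing it in the default queue is an assumption of this sketch. The delayed erasure itself is omitted, and only the erase request is recorded.

```python
from collections import deque

TCP, UDP = 6, 17  # IP protocol numbers


class QueueManager:
    def __init__(self):
        self.connections = {}       # connection key -> queue identifier
        self.queues = {}            # queue identifier -> queue of packets
        self.default_queue = deque()
        self.pending_erase = set()  # keys whose erasure has been requested
        self._next_id = 0

    def enqueue(self, pkt):
        """Store pkt in the proper queue and return that queue's identifier."""
        if pkt["protocol"] != TCP:
            self.default_queue.append(pkt)
            return "default"
        key = (pkt["src_addr"], pkt["src_port"],
               pkt["dst_addr"], pkt["dst_port"])
        if pkt.get("syn") and key not in self.connections:
            # Connection establishment: generate and register a new queue.
            self.connections[key] = self._next_id
            self.queues[self._next_id] = deque()
            self._next_id += 1
        qid = self.connections.get(key)
        if qid is None:
            # Assumption: unregistered TCP traffic goes to the default queue.
            self.default_queue.append(pkt)
            return "default"
        self.queues[qid].append(pkt)
        if pkt.get("fin"):
            # Connection release: request erasure of the registered entry.
            self.pending_erase.add(key)
        return qid
```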
The connection information storage section 002 holds timers for the respective connections for the case of abnormal ends of TCP connections, and deletes connection information if no packet is input in a predetermined period of time. When a packet is to be added to a queue, the queue manager 004 may perform active queue management, e.g., RED (Random Early Detection; see S. Floyd, V. Jacobson, “Random early detection gateways for congestion avoidance”, IEEE/ACM Trans. Networking, 1993), in which a packet is dropped in accordance with a certain condition. The input packet is added to one of the queues 006 to 008 for each connection through such processing.
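The per-connection timers used to clean up after abnormally ended connections can be illustrated with the following minimal sketch. The class ConnectionTable, its method names, and the use of caller-supplied timestamps are assumptions made for the example.

```python
class ConnectionTable:
    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout  # the predetermined period, in seconds
        self.last_seen = {}               # connection key -> time of last packet

    def touch(self, key, now):
        """Restart the timer of a connection whenever one of its packets arrives."""
        self.last_seen[key] = now

    def expire(self, now):
        """Delete and return every connection that has been idle too long."""
        stale = [k for k, t in self.last_seen.items()
                 if now - t >= self.idle_timeout]
        for k in stale:
            del self.last_seen[k]
        return stale
```

Passing the current time as an argument keeps the sketch deterministic; an actual apparatus would read a clock instead.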
The scheduler 009 selects one of the queues 006 to 008 in the queue set 005, extracts a packet from the head of the queue, and outputs the packet to the network via the output terminal 010. When the output terminal 010 finishes transmitting the packet, the scheduler 009 selects a packet to be transmitted next.
Queue selection, i.e., packet transfer control, in the scheduler 009 follows the processing shown in FIG. 15. Assume that in this case, the default queue and the queues for the respective connections are processed at the timings respectively assigned to the queues. The DRR scheduler 009 sequentially selects queues by the round robin scheme, and determines for each queue in the following manner whether to output a packet. For this purpose, a variable called a deficit counter is prepared for the queue of each connection, and the deficit counter is reset to 0 upon connection establishment.
First of all, when packet transfer at the output terminal 010 is completed and a packet to be transferred next is to be selected, the scheduler 009 determines whether the current timing is for the default queue 011 to be processed (step S1). If the current timing is for the default queue to be processed, the scheduler 009 checks whether the default queue is empty (step A10). If the default queue is empty, the flow immediately returns to step S1. If the default queue is not empty, the leading packet in the default queue is transmitted (step A11), and the flow returns to step S1 again.
If it is determined in step S1 that the current timing is not for the default queue to be processed, the scheduler 009 checks whether the queue to be processed is empty (step S2). If there is no packet to be transmitted in this queue, the deficit counter is reset (step S7), and the processing of this queue is terminated.
If it is determined in step S2 that the queue is not empty, a constant called quantum is added to the deficit counter (step S3), and the size of the packet at the head of the queue is compared with the deficit counter (step S4). If the deficit counter is larger, the packet is output (step S5). A value corresponding to the packet size is then subtracted from the deficit counter (step S6), and the flow returns to step S4 again. If the deficit counter is smaller than the size of the leading packet, the value of the deficit counter is stored (step S8), and the processing of this queue is terminated.
After the queue processing is terminated, the scheduler 009 checks whether all the queues are empty (step S9). If any queue is not empty, the flow shifts to step S1 to process the queue selected next. If all the queues are empty, the series of packet transfer control operations is terminated. As a result, the transfer rates for the respective connections are almost averaged to maintain the fairness between the connections to an extent corresponding to the time required to transmit data having a data length determined by quantum.
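The per-connection part of this control (steps S2 to S9) can be sketched as follows. This is a simplified illustration, not the apparatus itself: the function name drr_schedule is hypothetical, packets are represented only by their sizes, and the default queue is omitted. Two details are assumptions taken from the DRR paper cited above rather than from FIG. 15: a packet whose size equals the deficit counter is also output, and the deficit counter is reset when a queue becomes empty while being served.

```python
from collections import deque


def drr_schedule(queues, quantum):
    """Serve queues by deficit round robin.

    queues: dict mapping a connection name to a deque of packet sizes.
    Returns the list of (name, size) pairs in transmission order.
    """
    deficit = {name: 0 for name in queues}
    sent = []
    while any(queues.values()):              # S9: stop when all are empty
        for name, q in queues.items():
            if not q:                        # S2: nothing to transmit
                deficit[name] = 0            # S7: reset the deficit counter
                continue
            deficit[name] += quantum         # S3: add quantum
            while q and q[0] <= deficit[name]:   # S4: compare head packet
                size = q.popleft()           # S5: output the packet
                deficit[name] -= size        # S6: subtract its size
                sent.append((name, size))
            if not q:
                deficit[name] = 0            # assumption: reset once drained
            # otherwise the remaining deficit is kept for the next round (S8)
    return sent
```

For example, with a quantum of 500 and two queues holding packets of sizes 300, 300 and 500, 100 respectively, the first queue sends one packet in the first round and carries a deficit of 200 into the second round, where it can send its second packet.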
According to such a conventional packet transfer control method, however, even with the use of a fair scheduler such as a DRR scheduler, when data are to be transferred by using TCP that is generally used in the Internet, the smaller the data to be transferred in a connection, the lower the throughput. That is, fairness in throughput between connections cannot be ensured. When, in particular, TCP communication is performed by using HTTP (HyperText Transfer Protocol), which is widely used for Web browsing on the Internet, most data transfers are for small files and large files are seldom transferred, resulting in high unfairness.
The following is the reason for this. TCP includes slow start operation of exponentially increasing the transmission rate from a low rate immediately after connection establishment, and congestion avoidance operation of linearly increasing the transmission rate a given period of time after connection establishment. If the size of data to be transferred is small, a connection is terminated during slow start operation, and the throughput tends to decrease.
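The effect can be illustrated with a rough, idealized calculation. The function rtts_to_send and its parameters are hypothetical, and the model rests on several assumptions: the window starts at one segment, doubles every round-trip time until it reaches a threshold, grows by one segment per round-trip time thereafter, and no packet loss, handshake delay, or receiver window limit is considered.

```python
def rtts_to_send(segments, init_cwnd=1, ssthresh=64):
    """Count the round trips needed to send the given number of segments."""
    cwnd, sent, rtts = init_cwnd, 0, 0
    while sent < segments:
        sent += cwnd          # one window of segments per round trip
        rtts += 1
        if cwnd < ssthresh:
            cwnd *= 2         # slow start: exponential growth
        else:
            cwnd += 1         # congestion avoidance: linear growth
    return rtts


def avg_throughput(segments, **kwargs):
    """Average number of segments delivered per round-trip time."""
    return segments / rtts_to_send(segments, **kwargs)
```

Under these assumptions, a 10-segment transfer needs 4 round trips and therefore averages 2.5 segments per round-trip time, whereas a 1000-segment transfer spends most of its lifetime with a large window and achieves a much higher average.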