1. Field of the Invention
The present invention relates to management of a server of a communication network. In particular, it relates to a method of managing a congestion state in a server, as well as a related computer program product. The present invention further relates to a server for implementing such a method.
2. Description of the Related Art
As is known, in a packet-switched communication network, information to be transmitted is divided into packets. Each packet is transmitted through the network independently of the other packets. At the receiver side, information is recovered by reconstructing the correct sequence of packets. In a packet-switched communication network, servers are provided for receiving and serving incoming service requests from a plurality of users. Different types of servers are known, each type of server being adapted to provide a set of services.
For instance, an FTP server is adapted to serve requests of users wishing to forward a file to a receiving user through a proper protocol, termed File Transfer Protocol. Similarly, a SIP proxy server is adapted to serve requests of users wishing to set up a vocal session with a called user through a proper protocol, termed Session Initiation Protocol. For a detailed description of the SIP protocol, reference can be made to IETF RFC 3261 “SIP: Session Initiation Protocol” by J. Rosenberg et al., June 2002.
Each service request contains service information allowing the server to provide the required service. For instance, a request of sending a file may contain the sending user address and the receiving user address. Besides, a request of setting up a vocal session may contain the caller user identifier and the called user identifier.
Each server has a buffer, i.e. a memory device where requests to be served are stored. In particular, when a server accepts an incoming request, it allocates a respective memory portion of said buffer. Said memory portion is adapted to store request service information and request status information. Once a request is served, the server deletes the request from the buffer, i.e. it de-allocates the corresponding memory portion, which becomes available for storing further requests.
A server can perform allocation and de-allocation of memory portions in a substantially continuous manner. Alternatively, servers are known which perform allocation and de-allocation in different time frames. In the following description, a server performing continuous allocation and de-allocation will be referred to as “single-phase server”. Besides, a server performing allocation and de-allocation in different time frames will be referred to as “two-phase server”.
A single-phase server continuously checks for new incoming requests and, at the same time, checks the buffer for already served requests. Hence, at any check instant, a single-phase server allocates memory portions for new incoming requests, and it de-allocates memory portions associated with already served requests. Thus, in a single-phase server, at each check instant, a filling level of the buffer (i.e. the number of requests stored in the buffer) may be either increasing or decreasing.
On the other hand, in a two-phase server, allocation and de-allocation steps are performed in two separated time frames, which temporally alternate in a cyclic way. During a first time frame (also referred to as “allocation time frame”), the server only checks for new incoming requests, and allocates respective memory portions. During a second time frame (also referred to as “de-allocation time frame”), which is generally shorter than the first time frame, the server only checks whether the buffer contains already served requests, and, in the affirmative, the server de-allocates the respective memory portions. Thus, in a two-phase server, the filling level of the buffer is non-decreasing during the allocation time frame, while it is non-increasing during the de-allocation time frame.
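The alternation of the two time frames can be illustrated with a minimal sketch. The class and method names below (`TwoPhaseServer`, `allocation_frame`, `mark_served`, `deallocation_frame`) are hypothetical and chosen only for illustration; the buffer is modelled as a dictionary, and capacity limits are deliberately left out.

```python
class TwoPhaseServer:
    """Toy model of a two-phase server buffer (all names are illustrative)."""

    def __init__(self):
        # request id -> service information and status information
        self.buffer = {}

    def allocation_frame(self, incoming):
        """Allocation time frame: only allocate portions for new requests."""
        for req_id, info in incoming:
            self.buffer[req_id] = {"info": info, "served": False}
        return len(self.buffer)  # filling level, non-decreasing in this frame

    def mark_served(self, req_id):
        """Record that a request has been served; the portion is not freed yet."""
        self.buffer[req_id]["served"] = True

    def deallocation_frame(self):
        """De-allocation time frame: only free portions of served requests."""
        served = [r for r, entry in self.buffer.items() if entry["served"]]
        for req_id in served:
            del self.buffer[req_id]
        return len(self.buffer)  # filling level, non-increasing in this frame
```

In this sketch a request served during an allocation time frame keeps occupying its memory portion until the next de-allocation time frame, which is the behaviour that distinguishes a two-phase server from a single-phase one.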
Typically, FTP servers continuously manage the service requests and thus are single-phase servers. The continuous management of the requests is deemed to be advantageous since it is generally possible to estimate in advance a request service period, i.e. the time for serving a request of sending a file through the File Transfer Protocol. Thus, it is possible to provide a check instant for each service period; in this way, memory portions are de-allocated as soon as possible, thus resulting in a very efficient request management.
Typically, SIP proxy servers are two-phase servers. In a SIP proxy server, indeed, it is not possible to estimate in advance a service period, since it depends on the time required by the called user to answer the call. This period is almost unpredictable. Consequently, a single-phase management would require continuously checking the buffer for already served requests. However, this would require a large amount of computation resources, thus reducing the computation resources available for serving the requests.
Generally speaking, storage capacity of a server buffer is limited. Thus, the maximum number of requests which can be simultaneously served by a server is limited by the storage capacity of its buffer. When the overall number of incoming requests exceeds the buffer storage capacity, the server experiences a congestion state.
When a server is in a congestion state, it must determine whether to accept or to refuse an incoming request. In the following description, the term “management of a congestion”, or similar expressions, will refer to the rules upon which a server in a congestion state decides whether to accept or refuse an incoming request.
Two methods of managing a congestion of a server are known in the art.
A first method, which is known as “tail drop”, consists in accepting all the incoming requests until the buffer is saturated, i.e. completely filled. Once the buffer is saturated, any further incoming request is refused. The server accepts a new incoming request only when the buffer filling level has decreased and at least one memory portion has become available. For a detailed description of the “tail drop” technique, reference can be made to IETF RFC 2309 “Recommendations on Queue Management and Congestion Avoidance in the Internet”, April 1998.
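By way of illustration only, the tail drop rule can be sketched as follows. The names `TailDropBuffer`, `offer` and `release` are hypothetical, and the capacity is expressed as a number of requests rather than as an amount of memory.

```python
class TailDropBuffer:
    """Sketch of tail drop: accept until saturated, then refuse."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed: maximum number of stored requests
        self.requests = {}        # request id -> service information

    def offer(self, req_id, info):
        """Return True if the request is accepted, False if it is refused."""
        if len(self.requests) >= self.capacity:
            return False          # buffer saturated: refuse the request
        self.requests[req_id] = info
        return True

    def release(self, req_id):
        """De-allocate the portion of a served request."""
        self.requests.pop(req_id, None)
```

With a capacity of two, a third request is refused until one of the stored requests is released, after which the next incoming request is accepted again.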
Advantageously, this method allows a server to always use the whole buffer. Moreover, advantageously, the tail drop method may be applied both to single-phase servers and to two-phase servers. However, saturation of the buffer results in some disadvantages. When a new incoming request arrives at the server while the buffer is saturated, said request will be refused until a memory portion is de-allocated. This means that, when the server is in a congestion state, the delay in serving requests is not equally shared between all the users connected to the server, but it affects only the users trying to send a request when the buffer is saturated. Such a behaviour leads to synchronization between users trying to send a request, thus increasing the severity of the congestion state.
To avoid the above drawbacks, a second method for managing a congestion has been proposed in the art by S. Floyd and V. Jacobson in their article “Random Early Detection Gateways for Congestion Avoidance”, IEEE/ACM Transactions on Networking, August 1993. This method is known as “random early detection” or RED. According to the RED method, the incoming requests are organized in a queue. The server detects incipient congestion by computing an average queue size. The average queue size is compared with two preset thresholds, a minimum threshold and a maximum threshold. When the average queue size is lower than the minimum threshold, no request is dropped (i.e. refused). When the average queue size exceeds the preset minimum threshold, the server drops any incoming request with a certain probability, where the probability is a function of the average queue size. This ensures that the average queue size does not significantly exceed the maximum threshold. Estimating the average queue size and the probability requires a set of parameters, such as:
- minimum threshold of the queue;
- maximum threshold of the queue;
- queue weight;
- maximum value for dropping probability; and
- number of requests that could have been served by the server during an idle period, where an idle period is a period wherein the queue is empty.
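The role of the five parameters listed above can be illustrated with a simplified sketch. It assumes an exponentially weighted moving average for the average queue size and a drop probability that grows linearly between the two thresholds; the count-based correction of the probability described in the cited article is omitted. All identifiers (`RedQueue`, `offer`, `idle_served`, `rng`) are hypothetical, and `rng` is injectable only so that the behaviour can be checked deterministically.

```python
import random


class RedQueue:
    """Simplified RED sketch (not the full algorithm of the cited article)."""

    def __init__(self, min_th, max_th, weight, max_p, rng=random.random):
        self.min_th, self.max_th = min_th, max_th  # queue thresholds
        self.weight = weight                       # queue weight
        self.max_p = max_p                         # maximum dropping probability
        self.rng = rng                             # returns a float in [0, 1)
        self.avg = 0.0                             # average queue size
        self.queue = []

    def update_average(self, idle_served=0):
        if self.queue:
            # Weighted moving average of the instantaneous queue size.
            self.avg = (1 - self.weight) * self.avg + self.weight * len(self.queue)
        else:
            # Queue empty: decay the average as if idle_served requests
            # had been served during the idle period.
            self.avg *= (1 - self.weight) ** idle_served

    def drop_probability(self):
        if self.avg < self.min_th:
            return 0.0
        if self.avg >= self.max_th:
            return 1.0
        return self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)

    def offer(self, request, idle_served=0):
        """Return True if the request is queued, False if it is dropped."""
        self.update_average(idle_served)
        if self.rng() < self.drop_probability():
            return False
        self.queue.append(request)
        return True
```

Even in this reduced form, the behaviour depends on all five values, which gives a sense of the tuning effort the method entails.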
Therefore, the RED method allows the delay in serving requests to be shared equally between all the users connected to the server. Moreover, since the RED method aims to keep the queue length below a maximum threshold, synchronization effects are avoided, and bursts of requests can be managed.
However, the RED method exhibits some disadvantages. Firstly, estimating the probability of refusing a request requires five parameters, which must be manually adjusted by a server manager. Thus, the server manager must periodically check the status of the server and adjust said parameters, if needed. Moreover, the RED method cannot be applied to a two-phase server, for the following reasons. First of all, the requests are organised in a queue; queues are managed according to a FIFO logic (First-In-First-Out), so they are incompatible with a two-phase management. Moreover, the average queue size is a variable parameter which must be estimated substantially continuously; such estimation could be performed in a single-phase server but it is rather ineffective for a two-phase server, such as a SIP proxy server.