This invention relates to the control of multi-processing servers, and more particularly, to fair assignment of processing resources to queued requests in multi-processing servers.
There exist multi-processing server systems which are capable of serving many requests in parallel. Requests may also be called tasks, jobs, loads, messages or consumers. A typical existing system uses multi-processing servers, all of which are capable of serving any type of request that is submitted to the system. Requests are processed by available servers as they are received by the system. When all servers are busy serving other requests, new requests received by the system cannot be served immediately, and the system needs to handle those outstanding requests. It is desirable to assign multi-processing servers and other processing resources in the system to those outstanding requests in a fair manner.
Some existing systems attempt to solve this problem by rejecting new requests when all servers are busy. Rejecting new requests is unfair because requests submitted later can be processed before ones submitted earlier and rejected.
Some existing systems attempt to provide fair assignment by queuing outstanding requests in the order of receipt while they are waiting to be served. A typical existing system provides a single queue for all outstanding requests, regardless of how many servers are available. In this system, when a server becomes available, a request at the head of the queue is simply dispatched to that server.
Queuing outstanding requests is fairer than rejecting them. However, when there are high priority requests and low priority requests, these conventional systems often allow high priority requests to completely block low priority requests, or even the reverse. This common phenomenon is called "starvation".
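The starvation described above can be illustrated with a minimal sketch. It assumes a single server instance fed from one queue ordered strictly by priority, with one new high priority request arriving at every step. The function name `run_strict_priority`, the request names and the arrival pattern are hypothetical and chosen only for illustration; they are not taken from any existing system.

```python
import heapq
from itertools import count


def run_strict_priority(ticks):
    """Simulate one server instance fed from a single priority-ordered queue.

    Lower number means higher priority. Returns the names of the requests
    served, in the order they were served.
    """
    order = count()  # tie-breaker that preserves arrival order within a priority
    queue = []
    # A low priority request arrives before anything else (assumed workload).
    heapq.heappush(queue, (1, next(order), "low-0"))
    served = []
    for tick in range(ticks):
        # One new high priority request arrives at every step (assumed workload).
        heapq.heappush(queue, (0, next(order), f"high-{tick}"))
        # The single server instance takes whatever is at the head of the queue.
        _, _, name = heapq.heappop(queue)
        served.append(name)
    return served


served = run_strict_priority(5)
print(served)
```

In this sketch the request `low-0` is never served, however many steps are simulated, because a higher priority request is always at the head of the queue when the server instance becomes available.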
Some systems avoid the starvation problems by designing the system to handle requests in a fixed way, appropriate for a specific application and hardware configuration. This technique cannot be applied to other situations without a re-design.
Some systems work around the starvation problems by giving the administrator a high degree of instantaneous control over assignment of processing resources to requests. Such systems have a very high administrative cost to keep running well.
It is therefore desirable to provide a system which is capable of automatically assigning processing resources effectively and fairly to requests that exceed the system's capacity for concurrent processing.
In computers, requests are served by running process instances of server programs. Each such instance may serve more than one request concurrently, if the server program is multi-threaded. For the purpose of this invention, each such process of a single-threaded program or thread of a multi-threaded program is called a server instance. Each request has request parameters that determine the cost of preparing a server instance to serve the request (e.g., starting a particular program, opening files, connecting to particular external resources). In the present invention, those request parameters are identified and used collectively to define a service type.
The present invention uses one queue for each service type, and reserves for each queue a minimum number of server instances. In one embodiment, idle server instances may be configured on demand to serve requests of a different type.
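One way the on-demand configuration of idle server instances might interact with the reserved minimum is sketched below. This is an illustrative assumption rather than the embodiment itself: the names `ServerInstance` and `pick_instance`, the service types "report" and "query", and the rule that an instance is only taken from a type holding more than its minimum are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServerInstance:
    service_type: str  # the service type this instance is currently prepared for
    busy: bool = False


def pick_instance(instances, minimums, wanted):
    """Choose an idle server instance for a request of service type `wanted`.

    An idle instance already prepared for `wanted` is preferred. Otherwise an
    idle instance of another type is reconfigured, but only if that other type
    would still hold its reserved minimum afterwards. Returns None when no
    instance can be used, in which case the request stays queued.
    """
    for inst in instances:
        if not inst.busy and inst.service_type == wanted:
            return inst
    for inst in instances:
        if inst.busy or inst.service_type == wanted:
            continue
        donor = inst.service_type
        donor_total = sum(1 for i in instances if i.service_type == donor)
        if donor_total > minimums.get(donor, 0):
            # Stands in for the costly preparation (starting a program,
            # opening files, connecting to external resources).
            inst.service_type = wanted
            return inst
    return None


# Hypothetical configuration: three instances, one reserved per service type.
instances = [
    ServerInstance("report"),
    ServerInstance("report"),
    ServerInstance("query", busy=True),
]
minimums = {"report": 1, "query": 1}

first = pick_instance(instances, minimums, "query")   # reconfigures a spare "report" instance
first.busy = True
second = pick_instance(instances, minimums, "query")  # would breach the "report" minimum
print(first, second)
```

In this sketch the second request receives no instance, because reconfiguring the last idle "report" instance would leave the "report" queue below its reserved minimum.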
In accordance with an aspect of the present invention, there is provided a method for dispatching requests to a predetermined number of server instances, in order to process multiple requests in parallel. Each request has a service type. The method comprises the steps of utilizing one or more queues, each queue being associated with a service type for queuing requests having that service type; setting a minimum number of server instances for each queue; allocating to each queue at least the minimum number of server instances; preparing each server instance to provide a service type corresponding to that of the queue to which the server instance is allocated; and dispatching each request in each queue to its corresponding server instance when a server instance allocated to the queue is available.
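The steps of this method can be sketched as follows. The sketch is a simplified illustration under stated assumptions, not the claimed implementation: the class name `Dispatcher`, its methods, and the service types "report" and "query" are hypothetical, server instances are represented only by a count of idle instances per queue, and instances beyond the reserved minimums are left unallocated.

```python
from collections import deque


class Dispatcher:
    """One queue per service type, each with a reserved minimum of instances."""

    def __init__(self, total_instances, minimums):
        # Setting a minimum for each queue; the minimums must fit within the
        # predetermined number of server instances.
        if sum(minimums.values()) > total_instances:
            raise ValueError("reserved minimums exceed the number of server instances")
        # Utilizing one queue per service type.
        self.queues = {service_type: deque() for service_type in minimums}
        # Allocating the minimum to each queue; each count stands for idle
        # instances already prepared for that queue's service type.
        self.idle = dict(minimums)

    def submit(self, service_type, request):
        """Queue a request in the queue associated with its service type."""
        self.queues[service_type].append(request)

    def dispatch(self):
        """Dispatch queued requests while their queue has an idle instance."""
        started = []
        for service_type, queue in self.queues.items():
            while queue and self.idle[service_type] > 0:
                self.idle[service_type] -= 1
                started.append((service_type, queue.popleft()))
        return started

    def finish(self, service_type):
        """Mark one instance of the given service type as available again."""
        self.idle[service_type] += 1


dispatcher = Dispatcher(total_instances=3, minimums={"report": 1, "query": 2})
dispatcher.submit("report", "r1")
dispatcher.submit("report", "r2")
dispatcher.submit("query", "q1")
first_round = dispatcher.dispatch()   # r2 waits: the only "report" instance is busy
dispatcher.finish("report")
second_round = dispatcher.dispatch()  # r2 is served once that instance is free
print(first_round, second_round)
```

Because each queue keeps its own reserved instances in this sketch, a backlog of "report" requests cannot prevent the request `q1` from being dispatched.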
In accordance with another aspect of the invention, there is provided a request dispatching system for dispatching requests to a predetermined number of server instances, in order to process multiple requests in parallel. Each request has its service type and is queued in a queue which is associated with its service type. The request dispatching system comprises a server instance controller and a dispatching controller. The server instance controller is provided for controlling allocation of server instances to each queue such that each queue maintains at least a minimum number of server instances to serve requests of the service type of the queue. The dispatching controller is provided for controlling dispatching of each request in each queue to its corresponding server instance when the server instance reserved for the queue is available.
Other aspects and features of the present invention will be readily apparent to those skilled in the art from a review of the following detailed description of preferred embodiments in conjunction with the accompanying drawings.