In some cases, service consumers place a heavy load on a server that provides a service, such as content or applications, over a network. One known method for reducing this service request load uses a server load distribution apparatus. In this method, the addresses of the servers on which the services are stored are registered in the server load distribution apparatus. Service requests of various types received by the server load distribution apparatus are then transferred to a registered server or redirected to another registered server, so that the request load is distributed among the servers.
Techniques for distributing the requests received by the server load distribution apparatus among different servers during request transfer or redirection include a round-robin technique, in which requests are distributed uniformly among the registered servers, and a weighted round-robin technique, in which the weight used for distributing the requests is changed in accordance with a load measurement result for each server. The term “weight” as used herein means the rate at which requests are distributed to a server. Using the weighted round-robin technique allows the request load to be distributed evenly even if performance varies to some extent among the servers. Further, the load applied to each server can be changed by changing the address list registered in the load distribution apparatus. That is, adding an address to the list decreases the load applied to each server, and deleting an address from the list increases it.
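The weighted round-robin technique described above can be illustrated with a minimal sketch. The representation of the address list as a mapping from address to integer weight, the function name `weighted_round_robin`, and the example addresses are all assumptions made for illustration; the text defines a weight only as a distribution rate. Plain round-robin corresponds to giving every address a weight of 1.

```python
import itertools
from collections import Counter

def weighted_round_robin(weights):
    """Return an endless iterator over server addresses.

    `weights` maps a server address to a positive integer weight
    (a hypothetical representation of the registered address list).
    Each address appears in one cycle as many times as its weight.
    """
    schedule = [addr for addr, w in weights.items() for _ in range(w)]
    return itertools.cycle(schedule)

# Hypothetical address list: the first server receives three times
# as many requests as the second.
servers = {"10.0.0.1": 3, "10.0.0.2": 1}
dispatcher = weighted_round_robin(servers)

# Dispatch eight requests and count how many each server receives.
assigned = Counter(next(dispatcher) for _ in range(8))
```

Adding an address to `servers` lengthens the cycle and so lowers the share of requests that each existing server receives, which matches the effect of adding an address to the list described above.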
An example of a server-resource management system is disclosed in U.S. Patent Application No. 2004/0181794 and in E. Lassettre et al., “Dynamic Surge Protection: An Approach to Handling Unexpected Workload Surges with Resource Actions that Have Lead Times”, (DSOM 2003, LNCS 2867), Springer, October 2003, pages 82 to 92. FIG. 18 is a block diagram showing an example of the configuration of a conventional server-resource management system.
As shown in FIG. 18, the conventional server-resource management system includes a management-targeted system 110 and a management system 100 for managing the system 110. The management-targeted system 110 includes an application server 111 for providing a specific service and pool servers 112 that can be used by a plurality of services. The management system 100 includes a monitoring means 101 for monitoring the load on the application server, such as service response time and throughput, a load predicting means 102 for predicting a future load based on past load data, a resource-capacity planning means 103 for calculating the amount of resources required for achieving a target service-level value 104 specified by a service administrator in consideration of the future load, a server determining means 105 for selecting a server, and a provisioning means 106 for changing the configuration of the selected server and that of a related network.
The conventional server-resource management system having the above configuration operates as follows. When the number of clients that issue requests to the application server 111 increases, the monitoring means 101, which constantly monitors the load on the application server, transmits information related to the load to the load predicting means 102. The load predicting means 102 predicts the load at a future time, namely the time at which the predetermined time required for the provisioning means 106 to change a pool server 112 into an application server 111 will have elapsed from the present time, and transmits the predicted value to the resource-capacity planning means 103. The resource-capacity planning means 103 determines whether or not the target service-level value 104 set by the service administrator can be achieved when the future load is applied to the application server 111. If the target service-level value 104 can be achieved, the resource-capacity planning means 103 does not transmit data to the server determining means 105. If it is determined that the target service-level value 104 cannot be achieved, the resource-capacity planning means 103 calculates the number of servers required to achieve the target service-level value 104 and transmits the calculation result to the server determining means 105. The server determining means 105 selects the specified number of servers from the pool servers 112 and transmits information related to the selected servers to the provisioning means 106. The provisioning means 106 changes the settings of the selected servers among the pool servers 112 so that they operate as application servers 111. With the above operation, even if the number of requests increases unexpectedly, the service level of the application server 111 can be kept at the target service-level value 104.
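The sequence of prediction, capacity planning, and server selection described above can be sketched as follows. This is only an illustration under stated assumptions and is not the method of the cited references: the linear extrapolation in `predict_load`, the modelling of the target service-level value as a fixed per-server capacity, and all function and server names are hypothetical.

```python
import math

def predict_load(history, lead_steps):
    """Predict the load `lead_steps` ahead by linear extrapolation
    from the last two samples (an assumed predictor)."""
    if len(history) < 2:
        return history[-1]
    slope = history[-1] - history[-2]
    return max(0.0, history[-1] + slope * lead_steps)

def servers_to_add(predicted_load, active_count, per_server_capacity):
    """Number of additional servers needed. The service-level target is
    modelled (as an assumption) as a load each server can carry."""
    required = math.ceil(predicted_load / per_server_capacity)
    return max(0, required - active_count)

def plan(history, lead_steps, active, pool, per_server_capacity):
    """Return the pool servers to provision, or an empty list when the
    target can already be met (no data is passed on in that case)."""
    load = predict_load(history, lead_steps)
    shortfall = servers_to_add(load, len(active), per_server_capacity)
    return pool[:shortfall]

# Load rose from 100 to 140 requests per unit time; provisioning
# takes two sampling intervals, so the load is predicted two steps ahead.
selected = plan(
    history=[100.0, 140.0],
    lead_steps=2,
    active=["app-1"],
    pool=["pool-1", "pool-2", "pool-3"],
    per_server_capacity=100.0,
)
```

In this example the predicted load is 220, three servers are required, and two pool servers are selected for provisioning. Note that the planner outputs only a server count, which is the property examined in the problem statement that follows.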
In the following description, both the number of requests that a service receives per unit time and the number of requests that a service processes per unit time are referred to as “throughput”.
The conventional server-resource management system has the problem that when the number of accesses is unexpectedly increased under an environment where there is some range of variation in the performance between servers including the management-targeted application servers and pool servers, the service level is degraded or service availability is lowered.
The reason is that only the number of servers is calculated as the resource amount required for maintaining the service level, and differences in computing performance among the servers are not taken into consideration. If the server performance assumed when calculating the required number of servers differs from the performance of the servers actually available in the server-resource management system, a server having unnecessarily high performance or, conversely, a server having insufficient performance may be assigned to the service.
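A small numerical sketch can illustrate this mismatch. All figures, the pool contents, and the helper `capacity_of_first_n` are hypothetical; the planner is assumed to believe that every server handles 100 requests per unit time.

```python
def capacity_of_first_n(pool_capacities, n):
    """Total capacity actually supplied by the first `n` pool servers."""
    return sum(pool_capacities[:n])

assumed_capacity = 100  # per-server capacity assumed by the planner
shortfall = 200         # additional capacity the service needs

# Count-only planning: ceiling division gives the number of servers.
count = -(-shortfall // assumed_capacity)

# Two hypothetical pools whose real performance differs from the assumption.
weak_pool = [40, 60, 100]
strong_pool = [300, 300, 300]

weak_supplied = capacity_of_first_n(weak_pool, count)      # below the need
strong_supplied = capacity_of_first_n(strong_pool, count)  # far above it
```

With the same count of two servers, the weak pool supplies only 100 of the 200 required, while the strong pool supplies 600, so the service is either under-provisioned or assigned far more capacity than it needs.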
Further, in the conventional server-resource management system, if the number of accesses increases unexpectedly, the accuracy with which the service request load is estimated may deteriorate, so that a server having an adequate resource amount cannot be assigned to the service. This may result in deterioration of the service level or reduction of the service availability.
The reason is as follows. Since the amount of resources is controlled by the number of servers in the conventional technique, a plurality of servers often need to be controlled when the server configuration is to be changed. Although a plurality of servers can be controlled in parallel, processing needs to be executed sequentially when a shared resource such as a load distribution apparatus is controlled. Accordingly, the time consumed for the server control depends on the number of servers to be controlled. In the conventional technique, the number of requests is predicted for a future time that is ahead of the present by the control time so determined. However, because the number of servers to be controlled itself depends on the prediction result, the time span over which the prediction must be made changes with that result, which degrades the prediction accuracy.
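The dependency between control time and prediction result can be sketched with hypothetical figures. The sketch assumes a parallel setup cost of 60 seconds, a further 30 seconds of sequential work on the shared load distribution apparatus per server, and a load that grows linearly with time. None of these values or function names come from the cited references.

```python
import math

def control_time(num_servers, parallel_s=60.0, serial_s=30.0):
    """Control time: a parallel part plus a sequential part that grows
    with the number of servers (assumed cost model)."""
    if num_servers == 0:
        return 0.0
    return parallel_s + num_servers * serial_s

def servers_needed(horizon_s, base=100.0, growth=2.0,
                   capacity=100.0, active=1):
    """Servers to add for the load predicted `horizon_s` seconds ahead,
    assuming linear load growth and a fixed per-server capacity."""
    load = base + growth * horizon_s
    return max(0, math.ceil(load / capacity) - active)

# Start by assuming one server must be controlled, then recompute the
# prediction horizon from each new prediction result.
horizons = []
h = control_time(1)
for _ in range(4):
    horizons.append(h)
    h = control_time(servers_needed(h))
```

Here the horizon moves from 90 to 120 to 150 seconds before settling, because each prediction changes the number of servers and therefore the control time. The time span to be predicted is thus not fixed in advance, which is the source of the accuracy problem described above.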