An information processing system including a plurality of information processing apparatuses may employ a load balancer to distribute workload across the information processing apparatuses. The load distribution allows each of the information processing apparatuses to execute processes efficiently, resulting in an improvement in the processing performance of the information processing system as a whole.
When distributed processing is implemented across a plurality of information processing apparatuses, employing too many information processing apparatuses wastes computer resources and thus decreases the resource use efficiency. On the other hand, if there are too few information processing apparatuses, too much load is put on each of the information processing apparatuses, resulting in prolonged response times to processing requests. In view of this, there has been proposed a technology for automatically increasing and decreasing the number of information processing apparatuses according to the amount of processing. This technology is referred to as auto scaling. As used herein, the terms ‘scale-out’ and ‘scale-down’ are taken to mean an increase and a decrease, respectively, in the number of information processing apparatuses. The implementation of auto scaling allows the information processing system as a whole to maintain sufficient performance while saving computer resources.
Whether to implement auto scaling is determined, for example, based on central processing unit (CPU) utilization or on whether a response time exceeds a threshold. There has also been proposed a technology that monitors variations in the amount of requests received for each network service, predicts the amount of requests after a lapse of a predetermined time, and then controls the amounts of the network service assigned to individual information processing apparatuses according to the predicted amount of requests.
International Publication Pamphlet No. WO 2004-092971
In the case where scale-out implementation is determined based on a response time, a scale-out action is performed, for example, when a process whose response time exceeds a predetermined threshold has occurred. Increasing, by the scale-out action, the number of information processing apparatuses that execute processes is expected to reduce the processing load of each information processing apparatus and decrease the response times of the processes.
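The response-time-based determination described above may be sketched, purely for illustration, as follows. The names `ScalingPolicy` and `decide_server_count`, the threshold of 1.0 second, and the upper limit on the number of apparatuses are hypothetical and are not taken from any particular prior-art system.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ScalingPolicy:
    # Hypothetical values chosen only for this sketch.
    response_time_threshold_s: float = 1.0
    max_servers: int = 10


def decide_server_count(response_times_s: Iterable[float],
                        current_servers: int,
                        policy: ScalingPolicy) -> int:
    """Return the number of apparatuses after applying the naive trigger.

    One apparatus is added when at least one observed process has a
    response time above the threshold and the upper limit is not reached.
    """
    exceeded = any(t > policy.response_time_threshold_s
                   for t in response_times_s)
    if exceeded and current_servers < policy.max_servers:
        return current_servers + 1
    return current_servers
```

In this sketch, a single process exceeding the threshold is sufficient to trigger a scale-out action, regardless of the cause of the delay.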
However, in practice, a scale-out action does not always produce a reduction in response times. For example, in the case where response times are prolonged due to a failure in a system or communication congestion between information processing apparatuses in a multi-tier system, scaling out the information processing apparatuses has no effect on decreasing the response times. Nevertheless, the prior art is not able to appropriately determine whether a scale-out action would be effective when a response time has exceeded a predetermined threshold. As a result, when a response time has exceeded the threshold, a scale-out action is performed even in cases where it cannot prevent response times of processes from exceeding the threshold, thus leading to a waste of resources.
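The point that a scale-out action may be ineffective can be illustrated with a simplified model. The model below is an assumption made only for this illustration: the response time is taken to be the sum of a portion that shrinks as apparatuses are added and a delay, such as communication congestion with another tier, that the scale-out action does not affect. The function names and the threshold of 1.0 second are hypothetical.

```python
from typing import Optional

THRESHOLD_S = 1.0  # hypothetical response-time threshold


def response_time_s(servers: int, work_s: float, other_delay_s: float) -> float:
    """Toy model: only the work portion is divided among the apparatuses."""
    return work_s / servers + other_delay_s


def min_servers_to_meet_threshold(work_s: float,
                                  other_delay_s: float,
                                  max_servers: int = 100) -> Optional[int]:
    """Return the smallest apparatus count meeting the threshold, or None
    when no count up to max_servers brings the response time under it."""
    for n in range(1, max_servers + 1):
        if response_time_s(n, work_s, other_delay_s) <= THRESHOLD_S:
            return n
    return None
```

Under this model, when the delay unaffected by scaling already exceeds the threshold, the function returns `None` for any number of apparatuses, which corresponds to the case where a scale-out action consumes resources without bringing response times below the threshold.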