Conventionally, server/client systems are used in which an application program on a client requests, through a network, that a server execute a server program. In such a system, in response to a program execution request from a client, the server that provides the service resulting from execution of the program starts up a process, which is a processing unit of the program, to provide the service. When a process of a web content distribution program, business operation software, game software, etc., is started on the server, a service in the form of, for example, text data or image data is provided to the client that made the request for the service.
When processing in such a server/client system is executed by plural servers, a technique that distributes the load of the processing requested by the clients is used to prevent the load from concentrating on some of the servers. For example, in the case of web servers that distribute web content, the servers to be assigned to execute the processing are selected using schemes such as a round robin scheme, which allocates accesses from clients to the prepared servers in sequence, and a least connection scheme, which selects from among the servers the server that has the smallest number of sessions.
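The two conventional schemes named above can be sketched as follows. This is a minimal illustration only; the function names, server names, and session counts are hypothetical and do not come from any particular load distribution device.

```python
from itertools import cycle


def round_robin(servers):
    """Return an iterator that hands out servers in fixed rotation.

    The rotation ignores how loaded each server currently is.
    """
    return cycle(servers)


def least_connection(sessions):
    """Return the server with the fewest active sessions.

    `sessions` maps a (hypothetical) server name to its session count.
    """
    return min(sessions, key=sessions.get)


# Hypothetical server pool used only for illustration.
servers = ["s1", "s2", "s3"]
rr = round_robin(servers)
first_four = [next(rr) for _ in range(4)]  # wraps around after the third server

# Hypothetical session counts used only for illustration.
sessions = {"s1": 5, "s2": 2, "s3": 7}
chosen = least_connection(sessions)
```

Neither function takes the CPU or memory of the servers as an input, which is the limitation discussed in the paragraphs that follow.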
When the load is distributed by the round robin scheme or the least connection scheme, the servers are assigned regardless of the amount of resources (resources retained by a personal computer, such as a CPU, a memory, etc.) to be consumed by execution of the process and regardless of the amount of resources retained by the servers.
FIG. 19 is an explanatory view showing an example of a relation between the amount of resources retained by a server and the amount of resources consumed by a process. In FIG. 19, the X-coordinate represents the amount of CPU of the server and the Y-coordinate represents the amount of memory of the server that are consumed to execute a server program. “Xmax” on the X-coordinate shown in FIG. 19 represents “100%”, which is the maximum value of the consumption of the CPU, and “Ymax” on the Y-coordinate represents “100%”, which is the maximum value of the consumption of the memory. The ideal position, at which the CPU and the memory are each consumed at 100%, is represented by coordinates 1902. When a process 1901 that consumes the CPU and the memory is executed on this server, the consumption of the memory reaches 100% even though the consumption of the CPU is less than 100%.
FIG. 20 is an explanatory view showing another example of the relation between the amount of resources retained by a server and the amount of resources consumed by a process. In FIG. 20, similarly to FIG. 19, the X-coordinate represents the amount of CPU of the server and the Y-coordinate represents the amount of memory of the server that are consumed to execute a server program. “Xmax” on the X-coordinate represents “100%”, which is the maximum value of the consumption of the CPU, and “Ymax” on the Y-coordinate represents “100%”, which is the maximum value of the consumption of the memory. The ideal position, at which the CPU and the memory are each consumed at 100%, is represented by coordinates 2002. When a process 2001 that consumes the CPU and the memory is executed on this server, contrary to the case shown in FIG. 19, the consumption of the CPU reaches 100% even though the consumption of the memory is less than 100%.
A system also exists in which a load distribution device keeps track of the number of sessions on each server and selects, from among the servers, a suitable server to which a session is assigned, based on the numbers of sessions and on weights determined for each server according to its machine performance (for example, Japanese Patent Application Laid-Open Publication No. 2002-269061).
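A selection based on session counts and performance weights might look like the sketch below. The scoring rule (sessions divided by weight), the function name, and all values are assumptions made for illustration; the cited publication may combine the two quantities differently.

```python
def pick_weighted(sessions, weights):
    """Pick the server with the lowest sessions-per-weight ratio.

    Assumption: a larger weight means a more capable machine, and the
    ratio sessions / weight is used as the score. This rule is a
    hypothetical example, not the method of the cited publication.
    """
    return min(sessions, key=lambda name: sessions[name] / weights[name])


# Hypothetical session counts and performance weights.
sessions = {"small": 4, "large": 6}
weights = {"small": 1.0, "large": 3.0}

# "small" scores 4 / 1.0 and "large" scores 6 / 3.0, so the more
# capable machine is chosen although it holds more sessions.
chosen = pick_weighted(sessions, weights)
```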
However, in load distribution according to the round robin scheme or the least connection scheme described above, the load is distributed based on the number of sessions with clients, regardless of the resources retained by the servers and the resources to be consumed by the process. Thus, whether the resources are sufficient is judged when the requested process is started on the server. As a result, the response to the request is delayed, and the client experiences a waiting time.
In addition, because the amount of resources retained by each server differs according to the performance of the device used as the server, the amounts of resources in use after the load is distributed are not considered when a load distribution scheme based simply on the numbers of sessions, such as the round robin scheme or the least connection scheme, is used. Therefore, a process can be assigned to a server that has only a small amount of resources left or to a server whose resource consumption is unbalanced.
Moreover, when a server in which consumption is concentrated on either the CPU or the memory is caused to execute a process, a new process cannot be executed even though the less-consumed resource still has free capacity, because the more-consumed resource alone is insufficient. Therefore, a server with unbalanced resource consumption can execute fewer processes than a server in which the resources are consumed evenly.
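The effect of unbalanced consumption can be illustrated with a small calculation. The function name and the percentages are hypothetical, and the sketch assumes identical processes that each need a fixed share of CPU and memory.

```python
def max_processes(cpu_free, mem_free, cpu_per_proc, mem_per_proc):
    """Return how many identical processes still fit on a server.

    All arguments are percentages of the server's total CPU or memory.
    The scarcer resource decides the result.
    """
    return min(cpu_free // cpu_per_proc, mem_free // mem_per_proc)


# Both hypothetical servers have the same combined free capacity,
# and each process is assumed to need 10% CPU and 10% memory.
balanced = max_processes(cpu_free=50, mem_free=50, cpu_per_proc=10, mem_per_proc=10)
unbalanced = max_processes(cpu_free=90, mem_free=10, cpu_per_proc=10, mem_per_proc=10)
```

Under these assumed numbers the balanced server can still accept five processes, whereas the unbalanced server can accept only one, because its memory is exhausted while most of its CPU remains unused.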
It is an object of the present invention to solve the problems in the conventional techniques described above, and to provide a server/client system, a load distribution device, a load distribution method, and a load distribution program that can select an optimal server from among plural servers installed at one or more locations by numerically evaluating the resources and operational states of the servers to which a process is to be assigned, and can cause each of the servers to execute the process efficiently.