With the rapid spread of the Internet and intranets in recent years, efficient utilization and stable operation of network service servers have come to be required. Optimum sharing of the services among the servers is indispensable for efficient utilization of the servers and a stable supply of services, and attaining this optimum sharing requires that the load on each server be recognized accurately.
The following methods of recognizing the server load are known in the prior art.
(1) Agent Method
This is a method in which a program (an agent) for measuring the utilization of resources such as the CPU and memory is installed on the server. The agent itself, however, increases the server load, and when it communicates with the outside it consumes bandwidth, so the agent interferes with the accuracy of the load measurement. Further, the agent program must be installed on the server, and hence the method lacks generality and incurs a high cost of introduction.
(2) Load Measurement Communication Method
This is a method of issuing a ping command to the server and performing a pseudo service communication with it, and obtaining the server load from the response time. The communication for the measurement, however, consumes bandwidth on the route, and the server is also burdened with the load of responding, which interferes with the load measurement. Further, the server is required to support the protocol used for the measurement, so the method still lacks generality.
(3) Counting Method of VC count, Connection Time, Connection Frequency, Connection Error Rate and Response Time
This is a method of obtaining the server load from the VC count, connection time, connection frequency, connection error rate and response time with respect to the server, which are counted during the routing process in a router that routes packets from the client to the server. This method, however, relies on the behavior of the server at connection establishment, and its error is therefore large. A large number of connections is needed to improve the accuracy, so the method is not suited to services in which a large amount of communication is performed over a small number of connections. Further, routing is indispensable, and hence the throughput of the server is restricted by the throughput of the counting process.
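The counting described above can be illustrated with a minimal sketch. It assumes a router-side hook that is called once per connection attempt; the class name `ConnectionStats` and its fields are hypothetical and do not come from the prior art itself.

```python
from dataclasses import dataclass


@dataclass
class ConnectionStats:
    """Per-server counters kept by a (hypothetical) router while routing packets."""
    attempts: int = 0
    errors: int = 0
    total_connect_time: float = 0.0

    def record(self, connect_time=None):
        """Record one connection attempt; connect_time is None if it failed."""
        self.attempts += 1
        if connect_time is None:
            self.errors += 1
        else:
            self.total_connect_time += connect_time

    @property
    def error_rate(self):
        # Fraction of attempts that failed (0.0 when nothing was observed).
        return self.errors / self.attempts if self.attempts else 0.0

    @property
    def mean_connect_time(self):
        # Average establishment time over the successful attempts only.
        succeeded = self.attempts - self.errors
        return self.total_connect_time / succeeded if succeeded else 0.0


stats = ConnectionStats()
for observed in (0.25, 0.75, None, None):
    stats.record(observed)
print(stats.error_rate, stats.mean_connect_time)
```

Because every figure is derived from connection establishment alone, a service that carries heavy traffic over a handful of long-lived connections gives such counters very few samples, which is the accuracy problem noted above.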
(4) Hit Count/Hit Rate Calculation Method
This is a method in which an access count (hit count) and an access frequency (hit rate) are counted for each content item, such as an access target file, by inspecting the packets sent to a WWW server, and the server load is obtained from the result of this counting. This method needs a packet analyzing process for each protocol in order to identify the access target file, and it cannot accommodate a new service. Moreover, the performance of the server must be known in advance, and there is no alternative but to obtain it from catalog values or empirically. The server performance is, however, strongly influenced by the system architecture and the operation mode. Therefore, a catalog performance value based on a standard architecture and mode is not precise, and obtaining the value empirically inevitably takes time and effort.
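As a rough illustration of the hit count and hit rate calculation, the sketch below assumes that the packets have already been parsed into (timestamp, target) pairs. The function names, the per-hit cost table and the capacity figure are all hypothetical; the capacity stands in for the server performance that this method requires to be known beforehand.

```python
from collections import Counter


def hit_statistics(requests, window_seconds):
    """Return (hit_count, hit_rate) per access target over one counting window."""
    hit_count = Counter(target for _, target in requests)
    hit_rate = {target: n / window_seconds for target, n in hit_count.items()}
    return hit_count, hit_rate


def estimated_load(hit_rate, cost_per_hit, capacity):
    """Estimate load as demanded work per second divided by an assumed capacity."""
    demand = sum(rate * cost_per_hit.get(target, 1.0)
                 for target, rate in hit_rate.items())
    return demand / capacity


requests = [(0.0, "/a"), (1.0, "/a"), (2.0, "/b"), (3.0, "/a")]
counts, rates = hit_statistics(requests, window_seconds=4)
print(estimated_load(rates, {"/a": 2.0, "/b": 4.0}, capacity=10.0))
```

The estimate is only as good as `cost_per_hit` and `capacity`, which is why an imprecise catalog value or a laborious empirical measurement undermines the method.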
As explained above, none of these methods is capable of detecting the server load efficiently and at high speed without putting a burden on the server.
Further, because the server load cannot be recognized accurately, it is difficult to allocate the services provided by the servers appropriately.
The following methods have been proposed solely from the standpoint of sharing the services.
(5) Round Robin DNS Method
This is a method wherein, in the DNS (Domain Name System) service, a mapping of one domain name to the IP addresses of a plurality of servers is set in an entry table. In response to a client's inquiry about the server IP address, the respective servers are allocated cyclically (Round Robin) according to the entry table, and the IP address of the allocated server is returned to the client, thus sharing the services among the plurality of servers.
According to this Round Robin DNS method, however, the services can be shared only at an equal or simple sharing rate, and each server must provide the service at the allocated sharing rate irrespective of its capability and dynamic load state. Therefore, the load state differs between the servers, and the method is inefficient on the whole. Further, DNS inquiry results are normally cached on the client side, and hence a problem is that, even if the sharing rate is changed, the change cannot be reflected immediately.
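The cyclic allocation can be sketched as follows. This is a minimal illustration, not a DNS server: the class `RoundRobinDNS` is hypothetical, and the domain name and addresses are documentation placeholders.

```python
from itertools import cycle


class RoundRobinDNS:
    """Answer each inquiry with the next address from the entry table."""

    def __init__(self, entry_table):
        # entry_table maps one domain name to the IP addresses of several servers.
        self._cursors = {name: cycle(list(addresses))
                         for name, addresses in entry_table.items()}

    def resolve(self, name):
        # Servers are handed out in a fixed cycle, regardless of their load.
        return next(self._cursors[name])


dns = RoundRobinDNS({"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
print([dns.resolve("www.example.com") for _ in range(4)])
```

Nothing in `resolve` looks at server capability or current load, and a client that caches the answer never calls it again until the cache expires, which matches the two drawbacks described above.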
(6) Sharing Method Using Hash Table
This is a method of allocating the entries of a Hash table for managing the connections to the servers, whereby the services are shared among the servers at a rate corresponding to the number of entries allocated to each server.
In this method, when the client makes a request for a service, an entry is first determined from the client address and the service. The request is sent to the server to which that entry is allocated. The services are thereby shared among the servers in proportion to the number of entries allocated to each. Thus, efficient utilization of the servers is achieved by allocating many entries to a high-performance server, or by re-allocating entries from a server with a high load to servers having a comparatively low load.
According to this sharing method using the Hash table, however, a Hash function that generates Hash values with no bias is necessary for the ratio of the numbers of Hash entries to be properly reflected in the service sharing rate. In general, however, it is impossible to find a Hash function that generates unbiased Hash values for every possible distribution of Hash keys (client addresses, port numbers, etc.). Further, the accuracy of the sharing rate is proportional to the number of Hash entries, and hence a large number of entries is needed to improve the accuracy, which consumes more of the storage resources (buffers) usable for managing the connections. There arises a problem that a large quantity of accesses cannot be handled.
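A minimal sketch of the entry allocation follows. The use of CRC-32 as the Hash function, the key format, and the function names `build_table`, `select_server` and `reallocate` are assumptions made for illustration; the prior-art method does not prescribe them.

```python
import zlib


def build_table(entries_per_server):
    """Build the entry table; each server owns as many entries as it is given."""
    table = []
    for server, count in entries_per_server.items():
        table.extend([server] * count)
    return table


def select_server(table, client_address, port):
    """Hash the client address and service port to an entry, return its owner."""
    key = f"{client_address}:{port}".encode()
    return table[zlib.crc32(key) % len(table)]


def reallocate(table, from_server, to_server, n):
    """Move up to n entries from a heavily loaded server to a lighter one."""
    moved = 0
    for i, owner in enumerate(table):
        if owner == from_server and moved < n:
            table[i] = to_server
            moved += 1
    return moved


table = build_table({"A": 3, "B": 1})      # intended sharing rate 3:1
print(select_server(table, "198.51.100.7", 80))
reallocate(table, "A", "B", 2)             # shift load from A to B
print(table.count("A"), table.count("B"))
```

With only four entries the sharing rate can be adjusted in steps of 25 percent at best, and the realized rate matches the entry ratio only if the Hash values are spread evenly over the keys actually seen, which illustrates both drawbacks described above.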
(7) Sharing Method Based on State and Performance of Server
This is a method of sharing the services in quantities corresponding to the load and the performance ratio of the servers. The magnitude of the server load, or the performance ratio between the servers, is predicted by measuring the response time to a ping command issued to the server, and by counting the connection time and the connection error rate during the routing process in which packets from the client are routed.
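One simple way to turn such measurements into a sharing rate is sketched below. It assumes that the rate is made inversely proportional to the measured response time; this rule and the function name `sharing_rates` are illustrative assumptions, since the prior-art method does not state how the prediction is converted into a rate.

```python
def sharing_rates(response_times):
    """Map measured response times (seconds) to sharing rates that sum to 1.

    Assumption: a server that responds twice as fast receives twice the share.
    """
    inverse = {server: 1.0 / t for server, t in response_times.items()}
    total = sum(inverse.values())
    return {server: value / total for server, value in inverse.items()}


print(sharing_rates({"A": 0.25, "B": 0.5, "C": 0.5}))
```

The rates depend only on the state of the servers; nothing about the requesting client enters the calculation.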
According to this method, however, the services for every client are shared equally among the servers regardless of the throughput of the client, the length of the route to the client, and the bandwidth of that route, so that the utilization efficiency of the servers cannot be maximized.
A difference in the performance (especially the speed) or the load of the servers does not appear in the QoS (Quality of Service) delivered to a client whose route is a bottleneck because of its narrow bandwidth or long distance, or to a client having a low throughput.
Conversely, the difference in the performance and the load of the servers exerts a great influence on the QoS delivered to a client connected via a near, high-speed line or to a client exhibiting a high throughput. Consequently, when all the services are shared equally regardless of the client, either more server resources than the client needs are allocated, or the server resources allocated are insufficient.
As described above, problems are inherent in both the server load recognizing methods and the service sharing methods of the prior art.
The present invention, which was devised to obviate the above problems, aims at recognizing the server load efficiently and at high speed without putting any burden on the server; sharing the services in accordance with the dynamic load state of the servers; accurately reflecting, in the actual service sharing, the service sharing rate obtained by setting and adjustment; and maximizing the utilization efficiency of the servers by sharing the services on the basis of an estimate of the server resources necessary for each client.