1. Field of the Invention
The present invention relates to a dynamic load-distributed computer system including tightly-coupled computers, an arrangement which is called a cluster structure.
2. Description of the Related Art
In a load-distributed computer system including tightly-coupled computers, each computer is formed by a plurality of symmetrical type multi-processors (SMPs), i.e., central processing units (CPUs). When a large number of messages requesting transaction processes arrive at the computer system, the messages are optimally distributed to the computers to enhance the performance of the computer system. Generally, since each transaction process is a small job, after one of the computers is selected to process the transaction process, the transaction process is carried out entirely by the selected computer. Also, since transaction processes are generally processed interactively, the target of the load distribution is to minimize the average and deviation values of the response time. When one of the computers is selected, the computer having the minimum load is selected in accordance with a load index such as the CPU utilization rate, the number of executed transactions and the past response time, used individually or in combination.
In a first prior art load-distributed computer system (see: JP-A-10-312365), a load such as a CPU utilization rate for each of the computers is measured and stored at predetermined time periods. Then, when a message requesting a transaction process arrives from a terminal unit, the computer having the minimum load is selected, so that this message is allocated to the selected computer. On the other hand, the terminal unit always determines whether or not a response time is larger than a threshold value. Only when the response time is larger than the threshold value does the terminal unit request the computer system to change the selection of the computers.
In the above-described first prior art load-distributed computer system, however, since the CPU utilization rate as the first load index is calculated as a past average value which does not accurately reflect the current load, its reliability is not high. Particularly, under a dynamic control, when the same load index data is used until the next measuring timing, messages are concentrated on the one computer whose load has been believed to be the minimum, which would cause a seesaw phenomenon in the load. Also, the CPU utilization rate is not an appropriate load index in a computer system including SMP computers. On the other hand, the response time as the second load index is used for switching the selected computer, which would be useful if the overhead is decreased. However, when a message requesting a transaction process arrives, the response time thereof has not yet been obtained, so that the response time is not an appropriate load index.
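The seesaw phenomenon can be illustrated by a minimal sketch under simplifying assumptions that are not taken from the cited reference: the load is re-measured only once per period, every message arriving within a period is sent to the computer whose stored load is the minimum, and the load measured for the next period equals the number of messages a computer received in the preceding period. The function name and its parameters are hypothetical.

```python
def simulate_stale_min_load(num_computers=3, arrivals_per_period=6, periods=4):
    """Return, per period, how many messages each computer received.

    Assumption: the stored load is refreshed only at period boundaries, so
    every arrival within one period sees the same stale snapshot.
    """
    measured = [0] * num_computers  # stored load snapshot, initially all idle
    history = []
    for _ in range(periods):
        assigned = [0] * num_computers
        # Computer believed to have the minimum load (first one on ties).
        target = measured.index(min(measured))
        for _ in range(arrivals_per_period):
            assigned[target] += 1  # all arrivals pile onto the same computer
        history.append(assigned)
        # Assumption: the next snapshot equals the work received in this period.
        measured = assigned
    return history


if __name__ == "__main__":
    for period, assigned in enumerate(simulate_stale_min_load()):
        print(period, assigned)
```

With the default arguments, all six messages of each period go to a single computer, and the target alternates between the first and second computers while the third stays idle, which is the load oscillation described above.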
The first prior art load-distributed computer system using the CPU utilization rate as a load index is also disclosed in JP-A-2001-34591. In JP-A-2001-34591, the distribution of transactions is basically controlled centrally by using a round-robin discipline. In this case, when a selected computer is overloaded, another computer is selected. Also, when all the computers are overloaded, the computer having the minimum load is selected. Whether or not one computer is overloaded is determined by whether or not the load of the computer such as the CPU utilization rate thereof is higher than an upper limit. In this case, the overload amount is represented by the number of transactions over the upper limit of the CPU utilization rate. For example, if the CPU utilization rate is 90%, the upper limit is 60% and the number of executing transactions is 10, the overload amount is
10·(0.9/0.6−1)=5
That is, the determination of overload is basically carried out by the CPU utilization rate. However, as explained above, the reliability for a dynamic control is not high, and this determination is not appropriate in an SMP computer. Additionally, the number of transactions is assumed to be linear with the CPU utilization rate, which contradicts queueing theory, in which the number of clients has no linear relationship to the utilization rate, particularly under a high load. Therefore, when all the computers are overloaded, the load distribution may not be appropriate. Further, this technology is not intended to minimize the response time for arriving transactions or for the entire system.
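The round-robin selection with overload fallback and the overload amount described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the argument layout and the tie-breaking are hypothetical and are not taken from JP-A-2001-34591. For comparison, the last function gives the mean number of jobs in a single-server M/M/1 queue, a standard queueing result chosen here as an assumed model, to show that the number of jobs is not proportional to the utilization rate.

```python
def is_overloaded(utilization, upper_limit):
    """A computer is overloaded when its utilization exceeds the upper limit."""
    return utilization > upper_limit


def overload_amount(utilization, upper_limit, executing_transactions):
    """Number of transactions over the limit, assuming linearity with utilization."""
    if not is_overloaded(utilization, upper_limit):
        return 0.0
    return executing_transactions * (utilization / upper_limit - 1.0)


def select_computer(utilizations, upper_limit, start):
    """Round-robin from `start`; skip overloaded computers.

    If every computer is overloaded, fall back to the minimum utilization.
    """
    count = len(utilizations)
    for offset in range(count):
        candidate = (start + offset) % count
        if not is_overloaded(utilizations[candidate], upper_limit):
            return candidate
    return min(range(count), key=lambda i: utilizations[i])


def mm1_mean_jobs(utilization):
    """Mean number of jobs in an M/M/1 queue (assumed model): rho / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1.0 - utilization)


if __name__ == "__main__":
    print(overload_amount(0.9, 0.6, 10))         # the worked example: about 5
    print(select_computer([0.9, 0.5, 0.7], 0.6, 0))
    print(mm1_mean_jobs(0.5), mm1_mean_jobs(0.9))
```

Under the M/M/1 assumption, raising the utilization rate from 0.5 to 0.9 raises the mean number of jobs from about 1 to about 9, which is far from the proportional growth that the overload amount presumes.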
In a second prior art load-distributed computer system (see: JP-A-10-27168), each computer stores a response time of the latest transaction and multiplies this response time by the number of executed transactions therein, to obtain a load index. That is, when a message requesting a transaction process arrives, the load indexes of all the computers are calculated, and this message is sent to the computer having the minimum load index.
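The load index of the second prior art can be sketched as below. The function name, the list-based inputs and the sample values are hypothetical, and ties are resolved in favor of the lowest computer number, which the cited reference does not specify.

```python
def select_by_latest_response(latest_response_times, transaction_counts):
    """Return (selected computer number, list of load indexes).

    Load index of each computer = response time of its latest transaction
    multiplied by its number of transactions.
    """
    if len(latest_response_times) != len(transaction_counts):
        raise ValueError("one response time and one count per computer")
    indexes = [r * n for r, n in zip(latest_response_times, transaction_counts)]
    selected = min(range(len(indexes)), key=lambda i: indexes[i])
    return selected, indexes


if __name__ == "__main__":
    # Illustrative values only: response times in seconds, transaction counts.
    selected, indexes = select_by_latest_response([0.2, 0.5, 0.1], [4, 1, 10])
    print(selected, indexes)
```

In this illustrative input, the second computer is selected even though its latest response time is the largest, because a single slow transaction is weighted only by a count of 1.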
In the above-described second prior art load-distributed computer system, however, it is questionable whether the load index obtained based upon the response time of the latest transaction represents a typical response time of the computer. This response time reflects the congestion of the computer and the job characteristics of the latest transaction, such as a pure process time and a ratio of CPU time to input/output time. If the job characteristics are the same for all messages, the above-mentioned load index is appropriate. However, since transactions actually have different job characteristics, the above-mentioned load index is not appropriate. Also, no consideration is given to nonhomogeneous SMP computers.
In a third prior art load-distributed computer system (see: JP-A-7-302242), a load is periodically detected and a load tendency Tr is calculated by
Tr=(W2−W1)/(T2−T1)
where W1 is a load detected at time T1; and
W2 is a load detected at time T2.
When a message requesting a transaction process is accepted, it is determined, after a definite time period Ti has passed, whether or not Tr·Ti≦Wt is satisfied, where Wt is a predicted load. If Tr·Ti≦Wt, the subject computer carries out this transaction process. Otherwise, the requesting message is sent to another computer which has a lower load.
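The load tendency and the acceptance test of the third prior art can be sketched as follows, using the symbols defined above. The function names and the sample values are hypothetical, and Wt is simply passed in as a given value, since the text does not state how it is obtained.

```python
def load_tendency(w1, t1, w2, t2):
    """Tr = (W2 - W1) / (T2 - T1): slope of the load between two detections."""
    if t2 == t1:
        raise ValueError("T1 and T2 must differ")
    return (w2 - w1) / (t2 - t1)


def processes_locally(w1, t1, w2, t2, ti, wt):
    """True if Tr * Ti <= Wt, i.e. the subject computer keeps the transaction."""
    return load_tendency(w1, t1, w2, t2) * ti <= wt


if __name__ == "__main__":
    # Illustrative values: the load rises from 10 to 16 over 3 time units.
    print(load_tendency(10, 0, 16, 3))
    print(processes_locally(10, 0, 16, 3, ti=4, wt=10))
    print(processes_locally(10, 0, 16, 3, ti=4, wt=5))
```

The sketch makes the linear extrapolation explicit: the decision depends only on two past load samples and not on the current state of the computer.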
In the above-described third prior art load-distributed computer system, however, it is unclear how the predicted load is defined. Also, the predicted load Wt, which is calculated by the linear extrapolation method, is not a good predicted value. On the other hand, a predicted load is actually determined microscopically, not macroscopically. That is, a predicted load is dependent upon the current state of the computer, an end timing of an executing transaction, a timing for receiving a new transaction and the like. Therefore, since the predicted load of the third prior art load-distributed computer system is dependent upon the past load, the predicted load is not appropriate.
The above-described first, second and third prior art load-distributed computer systems are intended to minimize the response time of a message requesting a transaction process which has just arrived, not to minimize the response time of the entire system.
In a fourth prior art load-distributed computer system (see: Hisao Kameda et al., “Optimal Load Balancing in Distributed Computer Systems”, Springer-Verlag, pp. 230–232, 1997), if the number of CPUs in each computer sharing a disk apparatus is 1, a response time is defined by the utilization time of the CPU and the input/output time of the file apparatus. When a plurality of transactions are carried out, a queue of transactions may be generated before each computer. In the fourth prior art load-distributed computer system, the following two load indexes are defined:
fi=si·(ni+1)²
Fi=si·(ni+1)
where i is a computer number;
si is an average service time of the CPU for a transaction in the computer i; and
ni is the number of transactions in the CPU of the computer i.
The load indexes are derived by using an equilibrium state of an open-type queue network model where the number of CPUs is 1, to minimize the average response time. Actually, the formula Fi=si·(ni+1) represents an average time of a transaction in the CPU, and therefore, an average response time is represented by this average time plus an average input/output time. The two load indexes have been proved to be optimum in view of a static load distribution.
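The two load indexes of the fourth prior art can be evaluated with a short sketch. The function names and the sample values are hypothetical. Selecting the computer with the minimum Fi is shown only as one possible use of the indexes, not as the procedure of the cited reference.

```python
def load_indexes(s, n):
    """Return (f, F) for one computer.

    s: average CPU service time of a transaction on this computer
    n: number of transactions currently in its CPU
    f = s * (n + 1) ** 2 and F = s * (n + 1)
    """
    if s < 0 or n < 0:
        raise ValueError("s and n must be non-negative")
    big_f = s * (n + 1)
    small_f = big_f * (n + 1)
    return small_f, big_f


def select_min_big_f(service_times, counts):
    """Return the number of the computer whose F index is the minimum (assumed use)."""
    values = [load_indexes(s, n)[1] for s, n in zip(service_times, counts)]
    return min(range(len(values)), key=lambda i: values[i])


if __name__ == "__main__":
    # Illustrative values: s = 0.02 s per transaction, 4 transactions in the CPU.
    print(load_indexes(0.02, 4))
    print(select_min_big_f([0.02, 0.05], [4, 0]))
```

For s = 0.02 and n = 4 the indexes evaluate to approximately f = 0.5 and F = 0.1, so F can be read as the average CPU time a newly arriving transaction would spend behind the four transactions already present.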
In the above-described fourth prior art load-distributed computer system, however, since a dynamic control requires control at every moment, current values, not average values, should be used for si and ni. Note that it is possible to measure a current value of ni, but it is impossible to directly measure a current value of si, since si reflects the characteristics of a job mix executed in the computer i. Thus, an equilibrium average value is used for si. An equilibrium average value may be used if the job characteristics are the same and have a small deviation. However, since there are actually different job characteristics, the performance of a dynamic control using the equilibrium average value may deteriorate. Additionally, the above-mentioned load indexes fi and Fi are applied to computers each having one CPU, but are not applied to SMP computers.