1. Field of the Invention
The present invention relates to a system for executing transaction processing by sharing load among a plurality of computers, nodes, or processors that are connected to each other to form a relatively close group such as a cluster, and more particularly to a system and method for balancing transaction processing loads among the computers.
2. Description of the Related Art
In a system including a plurality of processing devices (computers), a load balancing method is designed to provide the maximum performance of the system by appropriately distributing a large number of successive arrivals of messages each requiring small-scale processing over these multiple computers. Since the processing required for each message is small in scale, a computer to execute the processing of a message is generally decided on arrival and the selected computer performs and finishes the processing without migrating it to another computer in progress.
Moreover, since the processing is interactive, the ultimate objective of load balancing is to minimize the mean and variation of response times. In general, when a message arrives, the computer with the lightest load is selected to execute the processing of the message.
Here, the choice of “load index” is very important. Conventionally, the CPU utilization of a computer, the number of jobs being processed at a computer, records of response times in the recent past, and the like have been used as load indexes, either individually or in combination.
A conventional example has been disclosed in Japanese Patent Application Unexamined Publication No. 10-312365 (hereafter called Prior Art 1). Here, the load states (in this case, CPU utilizations) of servers (computers) are measured and stored at regular time intervals. Based on the stored load states, the server with the lightest load is selected as the executing computer that is put in charge of processing a new message. Besides, a terminal monitors the response time of processing at the server and, if the response time exceeds a predetermined value, the terminal sends a path change request to change the executing computer to another computer.
In this case, CPU utilization is used as a first load index. CPU utilization is certainly a good index, but the measured value is a mean over a certain past time interval and does not reflect the subsequent start or termination of processing, so it is not very reliable as an index representing the “actual state”. Particularly in the case of dynamic control, since the data measured at one time instant is used until the next measurement time, processing requests arriving in the meantime are sent intensively to the one server judged to be least loaded, resulting in seesawing load balancing.
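The concentration effect described above can be illustrated with a minimal sketch. It assumes that the same CPU-utilization snapshot is consulted for every message arriving within one measurement interval; the function name, server names, and utilization values are hypothetical and are not taken from Prior Art 1.

```python
def dispatch_with_stale_snapshot(snapshot, arrivals):
    """Assign every arrival in one measurement interval using one snapshot.

    snapshot: dict mapping server name -> CPU utilization measured at the
              start of the interval (hypothetical values).
    arrivals: number of messages arriving before the next measurement.
    Returns a dict mapping server name -> number of messages assigned.
    """
    assigned = {name: 0 for name in snapshot}
    for _ in range(arrivals):
        # The snapshot is never refreshed inside the interval, so the same
        # server is judged least loaded for every arrival.
        target = min(snapshot, key=snapshot.get)
        assigned[target] += 1
    return assigned


counts = dispatch_with_stale_snapshot({"A": 0.40, "B": 0.35, "C": 0.50}, arrivals=10)
print(counts)  # {'A': 0, 'B': 10, 'C': 0}
```

All ten messages go to server B. At the next measurement B is likely to appear heavily loaded and a different server would receive the whole following batch, which is the seesaw behavior noted above.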
As a second load index, the record of the response time of processing in progress is used to determine whether the currently executing computer should be changed to another computer. The record of response time would be an effective index if the executing computer could be changed while processing is in progress with only a small overhead. However, this index cannot be used to select a computer when a message arrives, because no record of that processing exists yet at the time of arrival.
Another conventional example has been disclosed in Japanese Patent Application Unexamined Publication No. 10-27168 (hereafter, called Prior Art 2). According to this conventional example, the processing time of the latest message is stored at each computer and is multiplied by the number of in-process messages on that computer to produce a load index value. When receiving a message, the load index values are calculated at all computers, and a computer with the smallest load index value is selected to process the message.
In this case, the problem is whether the processing time of the last processed message is representative of the processing times of messages on that computer. Such a processing time may reflect both the congestion state of that computer and the job characteristics (net processing time, and the ratio of CPU use time to I/O use time) of the last message processed. If the job characteristics are identical for all messages, the processing time may be considered a yardstick for the load. However, in general situations where messages with various job characteristics are mixed, there is a strong likelihood of misjudgment if each processing time is taken, as it is, to reflect the load state.
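The index described for Prior Art 2, and its sensitivity to the characteristics of the last message, can be sketched as follows. This is only an illustration under stated assumptions: the function names, computer names, and numeric values are hypothetical, and times are taken to be in seconds.

```python
def last_time_index(last_processing_time, in_process_count):
    """Load index as described: latest processing time x in-process messages."""
    return last_processing_time * in_process_count


def select_computer(states):
    """Pick the computer with the smallest index.

    states: dict mapping computer name -> (last_processing_time, in_process_count)
    """
    return min(states, key=lambda name: last_time_index(*states[name]))


# Hypothetical states: (latest processing time in seconds, in-process messages).
states = {"c1": (0.20, 3), "c2": (0.05, 8), "c3": (0.30, 1)}
first_choice = select_computer(states)

# Same congestion on c3, but its most recent message happened to be a long,
# I/O-heavy job. Only the job characteristics changed, not the load.
skewed = dict(states, c3=(2.0, 1))
second_choice = select_computer(skewed)

print(first_choice, second_choice)  # c3 c2
```

The selection changes from c3 to c2 although the number of in-process messages on every computer is unchanged, which is the kind of misjudgment the paragraph above warns about when job characteristics are mixed.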
Still another conventional example has been disclosed in Japanese Patent Application Unexamined Publication No. 7-302242 (hereafter called Prior Art 3). Among its many claims, a technique related to the present invention is described in claim 9 and in paragraph Nos. 152 to 161 of the specification. Here, the load of a transaction processing section is detected periodically and the load history is stored together with the detection time. Then, the load trend Tr is calculated by the following expression:
$Tr = (W_2 - W_1)/(T_2 - T_1)$.
When a server has received a transaction processing request, if it is judged that the predicted processing load will not exceed a threshold value Wt after an elapse of time Ti, that is, $Tr \cdot Ti \le Wt$, then the server accepts the transaction processing request; if not ($Tr \cdot Ti > Wt$), the server refuses the request or asks another server with a lighter load to process it. In this example, the load is detected periodically and the detected load is used for the decision, but what is concretely defined as “load” is not specified clearly anywhere in the Publication. Although providing a load index is the first important step for load balancing, this step is not taken. Besides, the load after an elapse of Ti is predicted by linearly extrapolating the past (the term $Tr \cdot Ti$ alone seems insufficient for this purpose, but that is a separate matter), and this does not seem to be a good prediction. This kind of macro prediction might be effective for the load of the whole system. However, the future load state of each server is decided by micro-level movements such as its actual state, the termination timing of in-process transactions, and whether or not it accepts new processing, so it is undesirable to simply extend the past trend and take the result for granted.
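The acceptance test described for Prior Art 3 can be sketched as below. The sketch reproduces the test exactly as stated in the text (comparing the product of Tr and Ti with Wt); the function names and the sample values are hypothetical, and the unit of "load" is left abstract because the Publication does not define it.

```python
def load_trend(w1, t1, w2, t2):
    """Load trend Tr = (W2 - W1) / (T2 - T1) from two stored load samples."""
    if t2 == t1:
        raise ValueError("the two samples must be taken at different times")
    return (w2 - w1) / (t2 - t1)


def accepts_request(w1, t1, w2, t2, horizon, threshold):
    """Return True when Tr * Ti <= Wt, the test as stated in the text.

    horizon:   Ti, the elapse of time over which the trend is extrapolated.
    threshold: Wt, the threshold value for the predicted load.
    """
    return load_trend(w1, t1, w2, t2) * horizon <= threshold


# Hypothetical samples: load 10 at time 0, load 14 at time 2.
tr = load_trend(10, 0, 14, 2)
print(tr)                                       # 2.0
print(accepts_request(10, 0, 14, 2, 3, 8))      # True  (2.0 * 3 = 6.0 <= 8)
print(accepts_request(10, 0, 14, 2, 3, 5))      # False (2.0 * 3 = 6.0 > 5)
```

The decision depends only on the slope between two past samples. Nothing in it reflects transactions that terminate or are accepted after the second sample, which is the weakness discussed above.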
Each of the above-described Prior Arts 1, 2 and 3 decides the computer in charge of processing a new message with the aim of minimizing the processing time of the arrived message itself. However, there is no guarantee that such individual optimization leads directly to the optimization of the whole system.
A still further conventional example has been described in pp. 225 to 232 of “Optimal Load Balancing in Distributed Computer Systems” (H. Kameda et al.) published by Springer, 1977 (hereafter called Prior Art 4). Looking at one CPU, the premised node model may be shown as in FIG. 2. A job arriving at a node repeatedly uses the CPU (computer i in FIG. 2) and the disks, and leaves upon the termination of processing. The time required for this processing is the response time of the job. Since a plurality of jobs are processed in parallel, a queue is generated at the CPU. The other CPUs are configured in the same way as the illustrated computer i, and all CPUs take the same access time to access any of the disks. Here, the following two expressions are shown as load indexes:
$f_i = s_i (n_i + 1)^2$  (1); and
$F_i = s_i (n_i + 1)$  (2),
where f and F are load indexes, i is a computer number, s is the mean net CPU service time for jobs, and n is the number of jobs in the CPU system. These expressions are derived as indexes for which smaller values are better, so as to minimize the mean response time in a certain sense, based on a relation that holds among mean values in the equilibrium state of a CPU system in an open queuing network model from queuing theory. In fact, expression (2) represents the mean time spent in the CPU system, and the mean response time is obtained by adding the mean time in the input/output system to that in the CPU system. It has been shown that these are optimal in a certain sense as indexes for static load balancing.
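The two indexes of expressions (1) and (2) can be evaluated with a short sketch. The function names and the sample values of s and n are hypothetical and chosen only so that the arithmetic is easy to follow; s is taken to be in seconds.

```python
def index_f(s, n):
    """Expression (1): f_i = s_i * (n_i + 1) ** 2."""
    return s * (n + 1) ** 2


def index_F(s, n):
    """Expression (2): F_i = s_i * (n_i + 1), the mean time in the CPU system."""
    return s * (n + 1)


# Hypothetical computers: number -> (mean net CPU service time s, jobs n).
computers = {1: (0.25, 3), 2: (0.75, 1)}

f_values = {i: index_f(s, n) for i, (s, n) in computers.items()}
F_values = {i: index_F(s, n) for i, (s, n) in computers.items()}

print(f_values)  # {1: 4.0, 2: 3.0}
print(F_values)  # {1: 1.0, 2: 1.5}
print(min(f_values, key=f_values.get))  # 2
print(min(F_values, key=F_values.get))  # 1
```

With these values the two indexes prefer different computers, because expression (1) weights the number of jobs quadratically while expression (2) weights it linearly.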
However, the advantage of dynamic control resides in the capability to control according to the situation at that time, and the indexes are meaningless unless actual values, rather than mean values in the equilibrium state, are used for $s_i$ and $n_i$. The actual value of $n_i$ can be measured; however, $s_i$, which reflects the characteristics of the job mix being executed on computer i, cannot be measured directly. In the evaluation described in this reference, the mean value over the whole system is used as $s_i$. The overall mean value may well be used if the processing to be executed is of a single type from the viewpoint of job characteristics and, moreover, if the variation of job characteristics is narrow. However, the merit of dynamic control will mostly be lost by the use of the overall mean value if the processing is of a single type but the variation of job characteristics is wide, or if multiple types of processing with different characteristics are mixed, which is much closer to reality.
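The loss described above can be made concrete with a small sketch that evaluates expression (2) twice, once with a per-computer service time and once with the overall mean. The per-computer values are purely hypothetical, since the text notes that they cannot be measured directly; the function names are likewise assumptions.

```python
def index_F(s, n):
    """Expression (2): F_i = s_i * (n_i + 1)."""
    return s * (n + 1)


def choose(computers, use_overall_mean):
    """Select the computer with the smallest F_i.

    computers: dict mapping number -> (hypothetical actual s_i, measured n_i).
    use_overall_mean: if True, replace every s_i by the mean over all computers.
    """
    mean_s = sum(s for s, _ in computers.values()) / len(computers)
    scores = {
        i: index_F(mean_s if use_overall_mean else s, n)
        for i, (s, n) in computers.items()
    }
    return min(scores, key=scores.get)


# Computer 1 runs many CPU-light jobs; computer 2 runs one CPU-heavy job.
computers = {1: (0.25, 3), 2: (0.75, 1)}

print(choose(computers, use_overall_mean=False))  # 1
print(choose(computers, use_overall_mean=True))   # 2
```

With the overall mean substituted for every computer, the index ranks computers by the number of jobs alone, so the information about the job mix on each computer no longer influences the choice.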
A first problem of the prior arts is that the load state of each computer, which is used to make load balancing decisions, cannot be grasped sufficiently. Conventionally, the number of in-process transactions, CPU utilization, and the like have been used individually or in combination as load indexes. However, it cannot be said that they sufficiently reflect the system congestion state when the job characteristics of the group of in-process transactions at that time, including the ratio of CPU use to input/output use, are taken into consideration.
In a system where a very large number of small-scale processing requests arrive, as in the case of transaction processing, it is necessary to grasp the load state accurately at short time intervals. However, the trend has been to reduce the collection frequency because the overhead of data collection increases in a distributed computer system. In addition to devising a way to use good data obtainable with low overhead, it is necessary to make good use of the capability of clustered computers to collect data at high speed and with low overhead.
A second problem of the prior arts is that system-wide optimization, that is, minimizing both the mean and the variation of response times, cannot always be obtained. Conventionally, the computer that is expected to process the arrived transaction in the shortest time at that moment is selected as the computer in charge of the processing. However, such an individual optimization scheme does not necessarily ensure system-wide optimization.