1. Field of the Invention
The present invention relates to a transaction processing system using multiple processors, and more particularly, to schemes for transaction routing and data management in such a multiple processor transaction processing system.
2. Description of the Background Art
A transaction processing system, which executes some kind of processing for transactions received from transaction sources such as a plurality of terminals, computers, and automatic teller machines coupled to the system through communication paths, is widely utilized today. Such a transaction processing system has conventionally been constituted by a single general purpose computer, but where higher processing power is required, such as when a large number of transactions must be handled simultaneously, it has become necessary to constitute the transaction processing system from a plurality of processors so as to share the load of processing the transactions among a plurality of transaction processors.
This multiple processor transaction processing system can take either a data non-sharing type configuration as shown in FIG. 1 or a data sharing type configuration as shown in FIG. 2. In either one of these configurations, a plurality of transaction processors 6'-1 to 6'-m having application programs 7'-1 to 7'-m and data management units 8'-1 to 8'-m, respectively, are coupled together by a coupling device 5' connected with a front-end processor 3' having a transaction routing unit 4'. This transaction routing unit 4' is connected with a number of transaction sources 1'-1 to 1'-n through a communication path 2' such that the transactions received from the transaction sources 1'-1 to 1'-n through the communication path 2' are routed through the coupling device 5' and distributed among the transaction processors 6'-1 to 6'-m.
In a case of the data non-sharing type configuration of FIG. 1, the transaction processors 6'-1 to 6'-m are associated with local data memory units 9'-1 to 9'-m, respectively, in which the data for each transaction processor 6' are separately stored in each local data memory unit 9', so that the data management unit 8' of each transaction processor 6' can make accesses only to the data in the corresponding data memory unit 9' connected with this transaction processor 6'. On the other hand, in a case of the data sharing type configuration of FIG. 2, all the transaction processors 6'-1 to 6'-m are associated with a common data memory unit 9A, so that the data management unit 8' of each transaction processor 6' can make accesses to any data in this common data memory unit 9A.
In the multiple processor transaction processing system of the data non-sharing type, the data in files or databases to be processed can be managed by being distributed among a plurality of processors. For this reason, a received transaction can be processed in the shortest time with the lowest load by processing this transaction at a processor on which the data required in processing this transaction are present. Conversely, when this transaction is received at a processor which does not have the data required in processing this transaction, it is necessary either to transfer the processing to the other processor which manages the relevant data or to transfer the data from the other processor which manages the relevant data, so that the required time and the load for processing this transaction are increased.
Consequently, in order to improve the processing power of the multiple processor transaction processing system as a whole, it is necessary to provide the routing of each transaction to the optimum processor which can make access to the data required in processing each transaction at the lowest cost, which is furnished by the transaction routing unit 4' in the front-end processor 3' in the configuration of FIG. 1.
Here, however, the conventionally known transaction routing schemes include a scheme for routing the transactions according to a prescribed order by using a prescribed correspondence table which indicates which transaction should be routed to which transaction processor, and a scheme for providing additional means for checking the loading state of each transaction processor and routing each transaction, in a random order, to the transaction processor which is judged by this additional means as being less loaded. Both of these conventional schemes operate totally irrespective of the content of each transaction, so that it has been difficult to take full advantage of the available processing power of the system.
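The two conventional routing schemes described above can be illustrated by the following minimal sketch. It is not taken from any particular prior system; the table contents, the function names `route_by_table` and `route_by_load`, and the representation of the loading state as a list of numbers are all assumptions made for illustration. Note that neither function inspects the content of the transaction, such as which data it will access.

```python
# Illustrative sketch only; the table contents and all names are hypothetical.

# Prescribed correspondence table: transaction type -> processor index.
ROUTING_TABLE = {"deposit": 0, "withdrawal": 1, "transfer": 2}


def route_by_table(txn_type, table=ROUTING_TABLE):
    """Table-based routing: the destination depends only on the transaction type."""
    return table[txn_type]


def route_by_load(loads):
    """Load-based routing: pick the processor with the smallest reported load.

    Ties go to the lowest processor index.
    """
    return min(range(len(loads)), key=lambda i: loads[i])


if __name__ == "__main__":
    # Two transfers on different accounts are sent to the same processor,
    # regardless of where the account data actually reside.
    print(route_by_table("transfer"), route_by_table("transfer"))
    # The least loaded processor is chosen without looking at the transaction.
    print(route_by_load([5, 2, 7]))
```

In both cases, a transaction may arrive at a processor that does not hold the data it requires, which is the shortcoming noted above.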
In addition, in a case of the data non-sharing type configuration of FIG. 1 using a plurality of data memory devices 9', the transaction processing system can have a plurality of data storage regions and data management units for storing and managing the data in a distributed manner in this plurality of data storage regions, and a plurality of accesses to the data in a plurality of data storage regions can be made in processing each transaction. In such a case, the time required in processing each transaction can be shortened if all the data required in processing each transaction are present in the same data storage region. Conversely, in a case where a plurality of data required in processing each transaction are not present in the same data storage region, even if they are stored in different data storage regions within the same data memory device, extra time is required for the data accesses for reading and writing of the data compared with a case in which all these data are present in the same data storage region.
Also, when a plurality of data are present in the different data storage regions on different data memory devices, the access time is further increased as it becomes necessary to request the data management unit managing the relevant data memory device to read out or transfer the data, and this in turn further increases the processing time for each transaction.
On the other hand, in a case where it is possible to make accesses to a plurality of data storage regions simultaneously, the transaction processing power of the system as a whole can be improved by equalizing the access frequency with respect to each data storage region as much as possible, i.e., by loading the data memory devices as uniformly as possible. Consequently, in order to shorten the processing time for each transaction while taking full advantage of the transaction processing power of the system as a whole, it is preferable to arrange the data such that the data necessary for each transaction are available within the same data storage region as much as possible and the access frequency with respect to each data storage region is equalized as much as possible.
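The two goals stated above, namely keeping the data for each transaction within one data storage region and equalizing the access frequency across regions, can be expressed as two simple measures. The following is a minimal sketch under stated assumptions: a transaction is modeled as a list of data item names, a data arrangement as a mapping from item name to region index, and the function names are hypothetical.

```python
from collections import Counter


def colocated_fraction(transactions, placement):
    """Fraction of transactions whose data all lie in a single storage region."""
    single = sum(
        1 for items in transactions
        if len({placement[d] for d in items}) == 1
    )
    return single / len(transactions)


def region_access_counts(transactions, placement, num_regions):
    """Number of data accesses falling on each storage region."""
    counts = Counter(placement[d] for items in transactions for d in items)
    return [counts[r] for r in range(num_regions)]


def imbalance(counts):
    """Ratio of the busiest region's accesses to the mean (1.0 is perfectly even).

    Assumes at least one access has been counted.
    """
    return max(counts) / (sum(counts) / len(counts))


if __name__ == "__main__":
    txns = [["a", "b"], ["a", "b"], ["c", "d"], ["c", "d"]]
    together = {"a": 0, "b": 0, "c": 1, "d": 1}   # each transaction in one region
    scattered = {"a": 0, "b": 1, "c": 0, "d": 1}  # each transaction spans two regions
    for placement in (together, scattered):
        counts = region_access_counts(txns, placement, 2)
        print(colocated_fraction(txns, placement), imbalance(counts))
```

In this small example both arrangements load the two regions equally, yet only the first keeps the data for each transaction together, which shows that equalizing the loads alone does not satisfy both goals.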
Conventionally, the data arrangement in the data storage regions has been specified explicitly by a system manager. Namely, the system manager has been required to analyze the data accessed by each transaction, the frequency of occurrences of each transaction, and the loading state of each data memory device, to determine whether the available system power is currently utilized sufficiently or not and, if not, the cause in terms of the data arrangement which prevents the sufficient utilization of the available system power, and then to decide, according to the result of this analysis, which data should be arranged in which data storage regions in order to take better advantage of the system power. Then, the system manager has been required to explicitly specify to the data management units the correspondence between each data and the data storage region for storing each data, in order to realize the decided new data arrangement. As for those data for which the correspondence between the data and the data storage region for storing the data is not given, the data are arranged in arbitrary data memory devices by the system itself.
However, in order to analyze what kinds of data accesses are going to be made by a transaction, there is a need to analyze the source code, and it has been quite difficult to analyze the source code for all the transactions to be processed in the transaction processing system. Moreover, even if the source code is analyzed somehow, some data access targets would not be apparent in cases where branching occurs or where the data access target is determined according to factors dynamically determined at a time of the transaction processing. Therefore, even when a new data arrangement is obtained to satisfy the above described requirement according to the result of analysis of the source code, such a data arrangement is unlikely to take sufficient advantage of the available system power.
Furthermore, whenever a new transaction is added to the system, or the frequency of occurrences of the transactions to be processed on the system changes, it has been necessary to carry out the analysis again. Thus, this manner of changing the data arrangement requires an enormous amount of effort. In addition, in the simple-minded scheme in which the data are arbitrarily arranged once without the analysis and then, by measuring the individual data and the loads of the data memory devices, the data are rearranged simply to make the difference in the loads of the data memory devices as small as possible, no consideration is given to which sets of data are required by the same transaction processed on the system, and consequently such a scheme is unlikely to take sufficient advantage of the available system power in this regard as well.
On the other hand, instead of the scheme for rearranging the data after the arbitrary arrangement of the data to make the difference in the loads as small as possible, there are several proposals for a scheme for arranging the data according to prescribed rules so as not to make any difference in the loads from the beginning, without requiring the analysis. Examples of this type of data arrangement scheme include a scheme for arranging the data randomly with respect to the data storage regions using a hash function, a scheme for dividing the range of values taken by the data into as many groups as a number of data storage regions and allocating each group to each data storage region, and a scheme for dividing the range of values taken by the data into a prescribed number of groups and allocating the groups to the data storage regions in the manner of round robin.
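The three rule-based arrangement schemes listed above can be sketched as follows. This is an illustration only, under several assumptions not found in the description: the data are identified by integer key values in a half-open range from `lo` to `hi`, the CRC-32 checksum stands in for an unspecified hash function, and the function names are hypothetical.

```python
import zlib


def place_by_hash(key, num_regions):
    """Hash scheme: scatter keys over the regions pseudo-randomly.

    CRC-32 is used here only as a stand-in for an unspecified hash function.
    """
    return zlib.crc32(str(key).encode()) % num_regions


def place_by_range(value, lo, hi, num_regions):
    """Range scheme: split [lo, hi) into as many groups as there are regions."""
    return (value - lo) * num_regions // (hi - lo)


def place_by_round_robin(value, lo, hi, num_groups, num_regions):
    """Round-robin scheme: split [lo, hi) into num_groups groups and deal
    the groups out to the regions in turn."""
    group = (value - lo) * num_groups // (hi - lo)
    return group % num_regions


if __name__ == "__main__":
    for key in (0, 25, 50, 99):
        print(
            key,
            place_by_hash(key, 4),
            place_by_range(key, 0, 100, 4),
            place_by_round_robin(key, 0, 100, 10, 4),
        )
```

None of these rules takes into account which data are accessed together by the same transaction, so the data for one transaction may be spread over several regions.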
These schemes are all of the type which attempts to avoid the concentration of the loads on a particular data memory device probabilistically, so that there are cases in which they can be successful, but they do not guarantee the avoidance of the concentration of the loads for any system configuration and any frequency of occurrences of the transactions. Moreover, in this type of scheme, the data required for each transaction to be processed on the system are going to be arranged randomly, so that the processing time for each transaction can be longer not just when the concentration of the loads occurs but also when the concentration of the loads is avoided, and consequently these schemes are unlikely to take sufficient advantage of the available system power in this regard as well.
Thus, in the conventional data management scheme, the analysis necessary for taking sufficient advantage of the available system power has been quite difficult, and furthermore, the incompleteness of the analysis has made it unlikely that sufficient advantage of the available system power can be taken.
As described, in the multiple processor transaction processing system, in order to carry out the processing of the transaction efficiently by using a plurality of transaction processors, it is necessary to route each transaction to the transaction processor for which the cost for processing this transaction is low, and in addition when a plurality of data memory devices are used, it is also necessary to arrange the data distributedly among a plurality of data memory devices.
Conventionally, it has been a role of the system manager to analyze which data are going to be accessed by the application program executed in the system and determine the schemes for distributed data arrangement and the transaction routing appropriately, such that the transactions can be processed in parallel as much as possible, the data necessary for the processing of each transaction are present at the transaction processor for processing this transaction as much as possible, and the cost required for processing each transaction becomes as small as possible in view of the data arrangement and the frequency of occurrences of each transaction. However, in determining either one of the distributed data arrangement scheme and the transaction routing scheme, it is still necessary to analyze the dynamic characteristic of the system as a whole in order to take the full advantage of the available system power, but this analysis has been quite difficult.
Even when the transaction routing is dynamically determined somehow, if the data arrangement is fixed, it is impossible to route each transaction such that all the data required for this transaction are always present in the routed transaction processor, so that it is unlikely to take the full advantage of the available system power. To this end, in the conventional transaction processing system, it has been necessary for the system manager to make the rearrangement of the data in view of the newly determined transaction routing scheme, but this system management operation can be quite tedious.
Also, even when the data arrangement is changed according to the dynamic characteristic of the data accesses somehow, if the transaction routing is predetermined, all the data required for each transaction are not necessarily always present in the routed transaction processor as the data arrangement is no longer the same as that used in the transaction routing, so that it is still unlikely to take the full advantage of the available system power. To this end, in the conventional transaction processing system, it has also been necessary for the system manager to make the routing of the transactions in view of the newly determined data arrangement scheme and the frequency of occurrences of the transactions, but this system management operation can be quite tedious.