There have been various approaches to tackling the problem of handling and processing very large numbers of processing instructions in real time. All of these approaches have been successful in handling vast numbers of such instructions, and much of this success has been down to such systems employing faster and more powerful computing. However, the architecture which underlies such approaches has been limited in one way or another, and this has capped the maximum possible throughput of such systems.
There are several areas of application of such technology, such as mass data communications systems and data processing systems. One such data processing system, which represents a non-limiting but illustrative example of how such technology can be used, is a transaction settlement system where data instruction messages representing agreements between different parties can be executed to effect the actual settlement or execution of that agreement. In such an arrangement, an example of which is described in detail below, instructions (or commands) to a trusted settlement entity computer can only be acted on if the conditions determined by the settlement entity computer (rules) allow for execution of those instructions. Therefore, as a first part of the process, checks need to be made on the current status of the electronic data account files for the resources which are the subject of the settlement agreement. These checks determine whether the effects of carrying out the instruction on the electronic data account files are acceptable.
In a second part of the process, the instruction is either: approved for execution, referred to as a ‘commit’; or rejected for execution, referred to as a ‘rollback’. More specifically, in this non-limiting example, the positions of both parties' resource account files are assessed by the settlement entity computer to determine if the instruction can be executed without resulting in unacceptable positions on the electronic data account files. If the conditions are suitable, the instruction (command) contained within the data record is executed and subsequently the data files storing the current positions of the resource accounts are updated (commit). If the conditions are not suitable, the electronic data account files are left unchanged without execution of the instruction at this moment in time. It is also possible to update all accounts as a default and reverse the update (rollback) if the first stage indicates that the resultant position on the electronic data account files would be unacceptable. This is the preferred option for high-throughput systems. Rolled back data instructions may be recycled and the associated commands tried again at a later stage, once the conditions have changed for failed instructions. All of this needs also to be reported to the computers of the parties to the agreement, often in real time.
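The two strategies described above (check first and then commit, or update by default and roll back) can be illustrated with a minimal sketch. The account names, the dictionary representation of the account files and the single rule (no balance below a floor of zero) are assumptions made purely for illustration.

```python
from typing import Dict

# Hypothetical rule for illustration: no account may fall below this floor.
MIN_BALANCE = 0

def check_then_commit(accounts: Dict[str, int], debit: str, credit: str, qty: int) -> bool:
    """Check the resulting position first; update only if it is acceptable."""
    if accounts[debit] - qty < MIN_BALANCE:
        return False  # rejected: account files are left unchanged
    accounts[debit] -= qty
    accounts[credit] += qty
    return True  # commit

def update_then_rollback(accounts: Dict[str, int], debit: str, credit: str, qty: int) -> bool:
    """Update by default; reverse the update if the position is unacceptable."""
    accounts[debit] -= qty
    accounts[credit] += qty
    if accounts[debit] < MIN_BALANCE:
        accounts[debit] += qty   # rollback of the debit
        accounts[credit] -= qty  # rollback of the credit
        return False
    return True

accounts = {"A_cash": 100, "B_cash": 50}
first = check_then_commit(accounts, "A_cash", "B_cash", 80)      # committed
second = update_then_rollback(accounts, "A_cash", "B_cash", 80)  # rolled back
```

In both cases a rejected instruction leaves the account files exactly as they were, so it can be recycled and retried later.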
One of the characteristics of execution of such data processing, which has in the past limited the scalability of the solutions proposed, is that quite often the number of different resources which need to be accessed to execute these instructions does not increase with an increasing number of instructions. This is because there is often a concentration of instruction execution on a certain number of resources with increasing numbers of instructions. This concentration is a problem because each instruction which has an effect on the value of a resource account file needs to update that value before the next instruction which specifies that account can be processed. Typically, this is achieved by a particular instruction taking temporary exclusivity (locking) of a given account file whilst it is in the process of checking and then updating the values of that resource account file. This required updating before the next instruction can be processed has a direct effect on the maximum speed at which these increased numbers of instructions can be processed. This is a severe problem for an entity's resources that are common to many different types of instructions. The problem is particularly severe where 50% of instructions are updating only 5% of the resource accounts, which is considered to be a high-concentration situation.
A non-limiting example of this can be seen when a data file storing a cash account for an entity needs to be checked to see if the funds are available for instructions to be executed. The checking of this resource account for each instruction needs to be carried out after the previous instruction has updated the cash account position as a consequence of execution of its instruction. So these processes need to be sequential or else the action of execution of one instruction, which could change the balance of that account, may invalidate any subsequent instruction that also requires access to that account. The maximum processing speeds of prior art architectures (described below) have all been limited by this issue.
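The need for sequential checking can be seen in a small worked example. The balance and the two debit amounts below are invented for illustration; the point is only that a check made against a stale balance approves an instruction that a sequential check correctly rejects.

```python
# Hypothetical cash account balance and two debit instructions against it.
balance = 100
debits = [70, 60]

# Incorrect: both checks are made against the same, not yet updated, balance.
stale_checks = [qty <= balance for qty in debits]

# Correct: each check is made only after the previous instruction has updated
# the balance, so the second instruction sees the reduced position.
sequential_checks = []
current = balance
for qty in debits:
    ok = qty <= current
    if ok:
        current -= qty  # update before the next instruction is checked
    sequential_checks.append(ok)
```

Here the stale check would approve both debits (130 in total against a balance of 100), whereas the sequential check approves only the first.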
Another problem, which has arisen in prior art systems where concurrent processing architecture has been used, is that of ‘deadlock’. This problematic condition arises where two concurrent instructions access the same two accounts resulting in one instruction locking one resource account and the other instruction locking the second resource account. This prevents, for each instruction, access to the other account for checking to effect the instruction. Both instructions are caught in a wait state preventing them from executing a commit (execute) action to process the instruction. It is to be appreciated that whilst rollback of one instruction will release the processing of the other one, and is always possible, this solution dramatically decreases the throughput of the instruction processing.
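The deadlock condition described above can be reproduced deterministically with a small lock table, without real threads. This is only a sketch: the lock table, the wait-for map and the function names are hypothetical and are not taken from any particular prior art system.

```python
from typing import Dict

lock_owner: Dict[str, str] = {}  # account -> instruction currently holding its lock
waits_for: Dict[str, str] = {}   # instruction -> instruction it is waiting on

def request_lock(instr: str, account: str) -> bool:
    """Grant the lock if the account is free, otherwise record a wait state."""
    owner = lock_owner.get(account)
    if owner is None or owner == instr:
        lock_owner[account] = instr
        waits_for.pop(instr, None)
        return True
    waits_for[instr] = owner
    return False

def is_deadlocked(instr: str) -> bool:
    """Follow the wait-for chain; a path back to instr means deadlock."""
    seen = set()
    current = instr
    while current in waits_for:
        current = waits_for[current]
        if current == instr:
            return True
        if current in seen:
            return False
        seen.add(current)
    return False

def rollback(instr: str) -> None:
    """Release every lock held by instr so that other instructions can proceed."""
    for account in [a for a, o in lock_owner.items() if o == instr]:
        del lock_owner[account]
    waits_for.pop(instr, None)

request_lock("I1", "X")                  # I1 locks account X
request_lock("I2", "Y")                  # I2 locks account Y
got_y = request_lock("I1", "Y")          # I1 must wait for I2
got_x = request_lock("I2", "X")          # I2 must wait for I1
deadlock = is_deadlocked("I1")           # both are caught in a wait state
rollback("I2")                           # rolling back one releases the other
retry = request_lock("I1", "Y")
```

The rollback resolves the wait, but the rolled back instruction has to be processed again, which is the throughput cost noted above.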
Whilst the basic processes of checking and updating electronic account files appear to be relatively trivial for a single instruction, the processing of millions of such instructions each day makes the solution non-trivial. The objective of any system employed by the settlement entity is to achieve the highest throughput possible in real time. Scalability of any solution is extremely important, as is any small saving in the time of any processing step, as such savings are multiplied many times for mass instruction handling.
Many prior art systems have been developed to implement such settlement procedures and a typical example of such a system is now described with reference to FIGS. 1 to 5.
Referring to FIG. 1 there is shown a general prior art central resource management system 10 which is arranged to handle millions of instruction data messages each day. The system 10 is coupled to various communications channels 12 which may be dedicated leased communication lines or may be other types of secure communications channels. Via these channels the system 10 is connected to many different parties' servers. In this example and for ease of explanation, the system 10 is connected to a Party A's server 14 and to a Party B's server 16. Party A's server 14 and Party B's server 16 each have access to a respective database 18, 20 of records, where each record describes an agreement with another party.
The central resource management system 10 typically comprises an instruction execution server 22 which has access to a master resource database 24 which contains data records representing all parties' (namely A and B in this example) resource accounts.
Referring to FIG. 2 a schematic representation of the master resource database's structure for each party is provided. The database 24 comprises a plurality of different types of resource account files 30 for each party and an indicator 32 of an aggregated resource value for that party. In this particular example, the instruction data messages are each directed to effecting a command or instruction on these account files 30, typically transferring resources from one party's account to another party's account and updating the aggregated resource value indicators 32 accordingly. The accounts are data files 30 which represent the actual resources of the party 14, 16. The resources can be any resource of a party. In this example, they represent anything which the party owns and wishes to trade.
Referring now to FIG. 3, a general structure or format of an instruction data message 40 is shown. The instruction data message 40 essentially has six basic fields that are required to effect the instruction. The instruction data message 40 has a Party ID field 42 for identifying the parties to the instruction: in this embodiment, the instruction data message 40 could identify Party A and Party B. A Date of Execution field 44 is provided to define a date on which this instruction message 40 is to be executed. The remaining four fields identify the resource details, which are the subject of the instruction. A first resource type field 46 and a corresponding first resource quantity field 48 are provided for identifying the resource of the first party (e.g. Party A) and the amount of that resource which is to be involved in the instruction. A second resource type field 50 and a corresponding second resource quantity field 52 are provided for identifying the resource of the second party (e.g. Party B) and the amount of that resource which is to be involved in the instruction.
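The six-field message format of FIG. 3 could be represented as a simple record type, as in the following sketch. The Python field names, the types chosen and the example values are assumptions for illustration; only the six fields themselves and their reference numerals come from the description above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple

@dataclass(frozen=True)
class InstructionDataMessage:
    party_ids: Tuple[str, str]     # Party ID field 42 (the parties to the instruction)
    execution_date: date           # Date of Execution field 44
    first_resource_type: str       # first resource type field 46
    first_resource_quantity: int   # first resource quantity field 48
    second_resource_type: str      # second resource type field 50
    second_resource_quantity: int  # second resource quantity field 52

# Hypothetical example: Party A delivers 100 units of a security to Party B
# in exchange for 5000 units of cash.
msg = InstructionDataMessage(
    party_ids=("A", "B"),
    execution_date=date(2024, 1, 15),
    first_resource_type="security_X",
    first_resource_quantity=100,
    second_resource_type="cash",
    second_resource_quantity=5000,
)
```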
For each agreement between two parties, there will be two instruction data messages 40 in existence (one for each party).
FIG. 4 shows the components of the prior art instruction execution server 22 of FIG. 1. The components include an instruction validator 60, which is arranged to check the validity of received instruction data messages 40, and an instruction matcher module 62 which matches together two different instruction data messages 40 which relate to the same agreement. The instruction matcher also creates a settlement instruction for matched instruction data messages 40. A timing module 64 is also provided for comparing the current time with the time associated with a newly created settlement instruction. The timing module 64 can also determine whether a timing window for access to the resource account files 30 of each party is currently open or closed. An instruction database 66 is also provided for storing settlement instructions for future execution. The instruction execution server 22 further comprises a reporting module 68 for communicating information messages to the relevant parties. Finally, at the heart of the instruction execution server, an instruction checking, execution and updating engine 70 is provided. The reporting module 68 is coupled to the instruction validator 60, the instruction matcher 62 and the instruction checking, execution and updating engine 70. The way in which the instructions are to be processed is determined by the specific data processing architecture of the instruction checking, execution and updating engine 70, and this varies between different prior art devices (as is described below).
The way in which the instruction execution server 22 operates is now described with reference to the flow chart of FIG. 5. The steps of receiving, validating, and matching the instruction data messages 40 through executing the instructions (settlement instructions) at the positioning engine 70 and updating and reporting the outcomes of the updating are now described in greater detail. More specifically, the general operation 78 of the prior art instruction execution server 22 commences at Step 80 with the server 22 being connected to the communication channels 12 and receptive to receipt of instruction data messages 40. Party A then sends at Step 82 a data instruction message 40 to the server 22 which describes an agreement with Party B. Similarly, Party B sends at Step 84 a data instruction message 40 to the server 22 which describes an agreement with Party A. At the server 22 itself, the messages 40 are received and the instruction validator 60 attempts at Step 86 to validate each of the received instructions 40. If the validity check at Step 88 fails, then this is communicated at Step 90 to the reporting module 68 and a reporting message (not shown) is sent at Step 90 to the source of the non-validated data instruction message 40.
Otherwise, for a validated instruction data message 40, the reporting module 68 is instructed at Step 92 to send a positive message back to the source of the validated instruction message 40 indicating receipt and validation of the instruction data message 40.
Validated instruction messages 40 are passed to the instruction matcher 62 and an attempt at Step 94 is made to match corresponding instruction data messages 40 that describe the same agreement.
The instruction matcher 62 attempts, at Step 96, to match different messages 40 together. If the matching check at Step 96 fails, then this is communicated at Step 98 to the reporting module 68 and a reporting message (not shown) is sent at Step 98 to the source of the non-matched data instruction message 40 and the process ends at Step 100. This failure is shown quite simply here in order to simplify the explanation of the prior art system. However, in practice the failure to match may be a conclusion that is reached only after many attempts and perhaps after the expiry of a set matching time period, which may be several days.
Matched instruction messages 40, determined at Step 96, are notified to the reporting module 68 which in turn reports at Step 102 the existence of a matched pair of instruction data messages 40 to the sources of the matched data instruction messages 40 (Party A and Party B in this case). Furthermore, the instruction matcher 62 then creates at Step 102 an execution instruction (settlement instruction) with an execution date. This execution date is obtained at Step 104 from the date of execution field of either one of the matched instruction data messages 40 (because they are the same). The date of execution of the settlement instruction is then compared at Step 104 to the current date and availability of an execution time window (determined by the timing module 64).
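The matching step can be sketched as follows. The matching criterion used here (both messages must carry identical parties, execution date and resource details, and must come from different senders) is an assumption for illustration, as are the dictionary keys and the class name; the description above does not specify the exact criterion.

```python
from typing import Dict, Optional, Tuple

def agreement_key(msg: dict) -> Tuple:
    """Hypothetical key: two messages describing the same agreement share it."""
    return (frozenset(msg["parties"]), msg["execution_date"],
            msg["res1_type"], msg["res1_qty"], msg["res2_type"], msg["res2_qty"])

class InstructionMatcher:
    def __init__(self) -> None:
        self.unmatched: Dict[Tuple, dict] = {}  # messages awaiting a counterpart

    def submit(self, msg: dict) -> Optional[dict]:
        """Return a settlement instruction on a match, otherwise None."""
        key = agreement_key(msg)
        waiting = self.unmatched.get(key)
        if waiting is not None and waiting["sender"] != msg["sender"]:
            del self.unmatched[key]
            # The execution date is the same in both messages, so either can be used.
            return {k: v for k, v in msg.items() if k != "sender"}
        self.unmatched[key] = msg
        return None

agreement = {"parties": ("A", "B"), "execution_date": "2024-01-15",
             "res1_type": "bond", "res1_qty": 50, "res2_type": "cash", "res2_qty": 600}
matcher = InstructionMatcher()
first = matcher.submit(dict(agreement, sender="A"))       # no counterpart yet
settlement = matcher.submit(dict(agreement, sender="B"))  # matched pair
```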
If the result of the comparison, as determined at Step 106, is that the settlement instruction is not executable now, then the settlement instruction is stored at Step 108 in the instruction database 66. The database 66 is checked at regular intervals and the process 78 waits at Step 110 until the execution date is achieved and the execution window is open. Typically, an execution window may be open for several hours each day.
Alternatively if the result of the comparison determined at Step 106 is that the settlement instruction is executable now, then the settlement instruction is not stored.
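The comparison made at Steps 104 and 106 can be sketched as a single predicate. The window opening and closing times are hypothetical values chosen for illustration, and treating any execution date on or before the current date as reached is likewise an assumption.

```python
from datetime import date, datetime, time

# Hypothetical execution window, open for several hours each day.
WINDOW_OPEN = time(7, 0)
WINDOW_CLOSE = time(18, 0)

def executable_now(execution_date: date, now: datetime) -> bool:
    """True if the execution date has been reached and the window is open."""
    date_reached = execution_date <= now.date()
    window_open = WINDOW_OPEN <= now.time() < WINDOW_CLOSE
    return date_reached and window_open

in_window = executable_now(date(2024, 1, 15), datetime(2024, 1, 15, 9, 0))
too_early = executable_now(date(2024, 1, 15), datetime(2024, 1, 14, 9, 0))
window_closed = executable_now(date(2024, 1, 15), datetime(2024, 1, 15, 22, 0))
```

A settlement instruction for which the predicate is false would be stored in the instruction database 66 and checked again later.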
The next stage in the progression of the general operation 78 of the instruction execution server 22 is to send, at Step 112, the settlement instruction to the instruction checking, execution and updating engine 70 (also referred to as a positioning engine). The positioning engine 70 has associated with it a set of execution rules 72. These rules 72 determine whether the settlement instruction can be executed, namely it determines whether the result of the settlement instruction on the resource account files 30 and the aggregated resource value 32 will be acceptable. An example of an unacceptable condition is if a particular resource account file 30 or an aggregated resource value 32 will have, as a result of executing the command, a value below a predetermined amount. In the non-limiting transaction settlement system example mentioned above, the resource accounts could be cash and security accounts and the aggregated resource value 32 could be a credit line where the current value of the resources provides a certain amount of credit as the resources act as a guarantee against the credit provided.
The positioning engine 70 checks, at Step 114, if the execution rules 72 will still be satisfied if the command is executed, namely if the resultant effects on the resource account files 30 and aggregated resource values 32 of the two parties will be acceptable.
If the execution rules are not satisfied as determined at Step 116, a prompt is sent to the reporting module at Step 118 and the reporting module generates and sends at Step 118 a message reporting the unsuccessful results to both parties to the failed settlement instruction, e.g. Parties A and B in this example. Subsequently, the failed settlement instruction remains at Step 120 in the positioning engine 70 and is retried (repeating Steps 114 to 126) for settlement at a later time/date.
If, alternatively, the execution rules 72 are satisfied as determined at Step 116, the settlement instruction is executed at Step 122. The positioning engine 70 then updates at Step 124 the current positions in the resource account files 30 with the results of the executed settlement instruction, namely the resource account files 30 and the aggregated resource values 32 are updated with the correct balances after the transfer of resources has been effected. Finally, the reporting module 68 is instructed at Step 126 to generate and send at Step 126 a message reporting the successful results of the settlement to both parties to the successful settlement instruction, e.g. Parties A and B in this example.
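Steps 114 to 124 can be sketched as a check on a trial copy of the positions followed by an update only if the rules hold. The unit values used to compute an aggregated resource value, the two thresholds and the dictionary layout are all assumptions for illustration; real execution rules 72 would be considerably richer.

```python
from typing import Dict, Tuple

Accounts = Dict[Tuple[str, str], int]  # (party, resource type) -> quantity

PRICES = {"cash": 1, "bond": 10}  # hypothetical unit values for aggregation
MIN_ACCOUNT = 0                   # hypothetical floor for each account file
MIN_AGGREGATE = 500               # hypothetical floor for the aggregated value

def aggregate_value(accounts: Accounts, party: str) -> int:
    """Aggregated resource value of one party under the assumed unit values."""
    return sum(q * PRICES[r] for (p, r), q in accounts.items() if p == party)

def try_settle(accounts: Accounts, instr: dict) -> bool:
    """Check the rules on a trial copy; update the positions only if they hold."""
    a, b = instr["parties"]
    trial = dict(accounts)

    def move(sender: str, receiver: str, resource: str, qty: int) -> None:
        trial[(sender, resource)] = trial.get((sender, resource), 0) - qty
        trial[(receiver, resource)] = trial.get((receiver, resource), 0) + qty

    move(a, b, instr["res1_type"], instr["res1_qty"])  # first party delivers
    move(b, a, instr["res2_type"], instr["res2_qty"])  # second party delivers
    accounts_ok = all(q >= MIN_ACCOUNT for q in trial.values())
    aggregates_ok = all(aggregate_value(trial, p) >= MIN_AGGREGATE for p in (a, b))
    if accounts_ok and aggregates_ok:
        accounts.update(trial)  # execute and update the current positions
        return True
    return False                # rules not satisfied: positions unchanged

accounts: Accounts = {("A", "bond"): 100, ("A", "cash"): 0,
                      ("B", "bond"): 0, ("B", "cash"): 2000}
ok = try_settle(accounts, {"parties": ("A", "B"), "res1_type": "bond",
                           "res1_qty": 50, "res2_type": "cash", "res2_qty": 600})
failed = try_settle(accounts, {"parties": ("A", "B"), "res1_type": "bond",
                               "res1_qty": 80, "res2_type": "cash", "res2_qty": 100})
```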
A successful execution of a settlement instruction brings the general operation 78 of the prior art instruction execution server 22 to a close for that single instruction. However, as millions of instructions are being processed each day, the process 78 continues for other instruction data messages 40 which are continually being received from many different parties' servers.
As has been mentioned previously, the way in which the settlement instructions are to be processed is determined by the specific data processing architecture of the instruction checking execution and updating engine 70, and this varies between different prior art systems. There are essentially two different types of approaches: a batch process and a parallel input matching process which are now described with reference to FIGS. 6 and 7 respectively.
A batch process is a standard sequential update approach in which execution instructions are stored for sequential processing and are executed consecutively in an automatic manner. This process 130 is illustrated schematically in FIG. 6 where a new instructions file 132 containing a batch (sequential set) of new instructions (settlement instructions) is provided together with a master file 134 which stores the current positions of all of the resource account files 30 and any aggregated positions 32.
Each settlement instruction identifies the two parties to whom the agreement relates, the resource account files 30, the quantities of resources which are the subject of the agreement between the parties and the time/date of execution as previously described in FIG. 3. A key feature of this type of processing architecture is that these settlement instructions are required to be listed in order of the resource account 30 to which they relate. Typically, a sequence key is provided with each instruction which assists with cross referencing.
The master file 134 lists the resource data accounts 30 in order also using the abovementioned sequence key. This order correspondence between the master file and the input data file is very important for batch processing.
A sequential update program 136 is provided to determine whether each agreement can be implemented by settlement of the corresponding instructions. The sequential update program 136, implemented on a processing engine (not shown), uses a standard algorithm called a matching algorithm. As stated above, the matching algorithm requires that both input files (existing master positions file 134 and new instructions file 132) are stored in the same order of sequence keys. The keys used in the instruction file 132 are called ‘transaction’ keys and the keys stored in the existing master file 134 are called ‘master’ keys.
The sequential update program 136 defines the logic to read both files 132, 134 in sequence till the end of both files. The results of the sequential update program are stored in a new master file 138, which holds the updated positions of all of the party's resource account files 30 and aggregated positions 32.
The sequential update program 136 starts by reading the first instruction or record of each file 132, 134. All of the instructions relating to a given resource account file 30 are executed sequentially, with each change in the value of the resource account file 30 being updated in memory of the processing engine running the matching algorithm. Once the updates for a particular resource account file 30 have been completed (sensed by a change in the transaction key for the next instruction), the resultant value for that resource account file 30 is then written to a new master file 138 together with any updated aggregated positions 32. This process is repeated for each different resource account file 30 until the end of the transaction file 132 is reached.
Where multiple resource accounts need to be updated to execute an instruction, a more sophisticated approach is used. In order to handle the updating of multiple accounts, the updating is broken down into stages. The solution is to execute only the debits of resource account values in a first run, and to report the instructions for which the debit was successful, sorted in order of the resource accounts to be credited, before applying the credits to the corresponding resource accounts. Whilst some instructions will fail because the crediting of resource accounts was delayed, this can be solved by performing multiple runs.
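The staged approach can be sketched as a debit run followed by a credit run, repeated until nothing remains or no further progress is made. The function names, the dictionary representation and the simple rule (a debit succeeds if the balance covers it) are assumptions for illustration.

```python
from typing import Dict, List, Tuple

def debit_run(balances: Dict[str, int], instructions: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Apply debits only, in order of the account debited."""
    succeeded, failed = [], []
    for ins in sorted(instructions, key=lambda i: i["debit"]):
        if balances[ins["debit"]] >= ins["qty"]:
            balances[ins["debit"]] -= ins["qty"]
            succeeded.append(ins)
        else:
            failed.append(ins)
    return succeeded, failed

def credit_run(balances: Dict[str, int], debited: List[dict]) -> None:
    """Apply the delayed credits, in order of the account credited."""
    for ins in sorted(debited, key=lambda i: i["credit"]):
        balances[ins["credit"]] += ins["qty"]

def settle_in_runs(balances: Dict[str, int], instructions: List[dict],
                   max_runs: int = 5) -> Tuple[List[dict], int]:
    """Repeat debit and credit runs; return unsettled instructions and run count."""
    pending = list(instructions)
    runs = 0
    while pending and runs < max_runs:
        debited, failed = debit_run(balances, pending)
        credit_run(balances, debited)
        runs += 1
        if not debited:
            break  # no progress was made, so a further run would not help
        pending = failed
    return pending, runs

balances = {"X": 100, "Y": 0, "Z": 0}
instructions = [{"debit": "Y", "credit": "Z", "qty": 50},
                {"debit": "X", "credit": "Y", "qty": 80}]
unsettled, runs = settle_in_runs(balances, instructions)
```

In this example the debit of account Y fails in the first run because the credit to Y has not yet been applied, and succeeds in the second run.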
The sequential update program 136 typically defines the logic to handle the following cases:
Transaction key = Master key
    => Apply the current instruction to the current master data record of a resource account file
    => Store the new positions of the current resource account as an updated master record in memory
    => Read the next instruction
Transaction key > Master key
    => Write updated master data record to the new master file 138
    => Restore master data record (if available) or read master file for next master data record
Transaction key < Master key
    => Store current master record
    => Create default master record
    => Apply the instruction to the master record
    => Read the next instruction from transaction file
    Or
    => Reject the instruction because corresponding master file information does not exist, namely no resource account file 30 found in master file 134
    => Read the next instruction
When this is done, the next instruction record is read from the transaction file 132 and the same process is reapplied, until the transaction key becomes greater than the current master key.
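The three cases handled by the sequential update program can be sketched as a single pass over two sorted inputs. This sketch uses the "reject" variant for a transaction key with no corresponding master record, represents each file as a list of (key, value) pairs, and applies no execution rules; all of these are simplifying assumptions for illustration.

```python
from typing import List, Tuple

Record = Tuple[str, int]  # (sequence key, position or change in position)

def sequential_update(master: List[Record],
                      transactions: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Both inputs must be sorted by key. Returns (new master file, rejected)."""
    new_master: List[Record] = []
    rejected: List[Record] = []
    t = 0
    for key, position in master:
        # Transaction key < master key: no such account exists, so reject.
        while t < len(transactions) and transactions[t][0] < key:
            rejected.append(transactions[t])
            t += 1
        # Transaction key = master key: apply the instruction in memory.
        while t < len(transactions) and transactions[t][0] == key:
            position += transactions[t][1]
            t += 1
        # Transaction key > master key (or end of file): write the netted record.
        new_master.append((key, position))
    rejected.extend(transactions[t:])  # keys beyond the last master key
    return new_master, rejected

master_file = [("A1", 100), ("A2", 50), ("A4", 0)]
transaction_file = [("A1", -30), ("A1", 10), ("A3", 5), ("A4", 20)]
new_master_file, rejected = sequential_update(master_file, transaction_file)
```

Each master record is written once however many instructions update it, which is the netting effect that makes the batch approach fast.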
In this algorithm, a single process in memory nets multiple updates to the new master file 138, which clearly provides faster processing of the instructions. The limitation is that all the instructions need to be grouped and sorted before running the process (batch). Also all instructions need to be processed before being able to return a first reply message to the parties confirming execution of the instruction. Furthermore in a batch process, the data is not available for other processes while running the matching algorithm. To provide access to the data in real time, database updates are required. If done directly, these database updates kill overall throughput of the process. If implemented in an extra step after the execution (for example a DB2 load utility), it disadvantageously blocks all the accesses to the data during that time. The result of this is that batch processing is very efficient when being executed but it cannot be executed in real-time because of the requirement for grouping and sorting prior to execution.
Another alternative approach is shown in FIG. 7, namely the previously mentioned parallel input matching approach. Under this approach the settlement instructions that are generated by the instruction matcher 62 are handled by a plurality of individual instruction handling computers or processes 140, 142. Also, a sequence file handling process 144 can handle a batch of instructions, stored in the instruction database 66. Each process 140, 142, 144 has its own version of a direct update program 146 which can read the value of a current resource account 30 and an aggregated resource value 32 and create a direct update instruction for the party's resource account files 30 in the master database 24. A single instruction execution manager 148 is provided to make the final decision on the updating of the database 24 by the received instructions from the processes 140, 142, 144. The instruction execution manager 148 uses the set of execution rules 72 (see FIG. 4) to either commit an update instruction for execution or roll it back for later execution.
One of the execution rules 72 which the instruction execution manager 148 has to implement deals with the issue of lockout. As has been mentioned previously, lockout is where access to a resource account file 30 is prevented as it is in the process of being used for another update instruction. In practice what this means is that contention for updating a particular resource account 30 is handled by insertion of wait states until the resource account file 30 has been released from a previous update instruction, namely the previous update has been completed. In this way, the instruction execution manager 148 prevents two different update instructions from modifying the same resource account file 30 in parallel.
Each process 140, 142, 144 runs in parallel and provides greater bandwidth than a single process and this should lead to a greater potential throughput. More specifically, when the updates are distributed over a large number of different resource account files 30, a parallel system is scalable. Under these circumstances it is possible to increase the throughput of the system by running many updating processes 140, 142, 144 in parallel. However, when the updates are not well distributed, the implementation of the lockout process to ensure data integrity quickly caps the maximum throughput the system can reach. When this limit is reached, running one more update process does not increase the global throughput because it also increases the ‘wait state’ of the other update instructions for unavailable (locked) resource account files 30.
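The capping effect can be shown with a deliberately simplified model, which is an assumption and not a description of any actual system: every instruction updates exactly one account and holds its lock for one time unit. Under that model, the time needed for a set of instructions can never be less than either the work divided among the processes or the number of updates queued on the busiest account.

```python
from collections import Counter
from math import ceil
from typing import List

def min_time_units(account_per_instruction: List[str], workers: int) -> int:
    """Lower bound on elapsed time units for a non-empty list of instructions."""
    busiest = max(Counter(account_per_instruction).values())
    return max(ceil(len(account_per_instruction) / workers), busiest)

# Hypothetical workload: 500 updates on one heavily used account and one
# update on each of 500 other accounts.
workload = ["hot"] * 500 + [f"acct{i}" for i in range(500)]
bounds = {w: min_time_units(workload, w) for w in (1, 2, 4, 8)}
```

Beyond two processes the bound stays at 500 time units, because updates to the heavily used account must still be applied one after another.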
Whilst this approach is excellent for real-time updating, it suffers from poor throughput under most conditions. This is because updates are not usually distributed over a large number of different resource account files 30. Rather, it is common in many applications for certain resource accounts to be heavily used by many different instructions. For example, in the field of transaction settlement, it is common for 50% of the instructions to be concentrated on 5% of the available resource account files 30. Under these circumstances the real-time processing technique of FIG. 7 has poor performance.
Another issue which is important to consider is that of failed instruction recycling. Here any instruction, which cannot at a particular moment in time be executed, because the resource accounts do not have the required values to meet the execution rules, is simply stored for another attempted execution at a later time. Each temporary failure can be reported to the instructor, indicating that the instruction may be executed shortly once the resource account conditions have changed. Multiple failed attempts or the expiry of a time out period may cause the instruction to be reported as finally failed.
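Failed instruction recycling can be sketched as a queue of pending instructions with a limit on attempts. The attempt limit stands in for the time out period, and the instruction identifiers, balances and the simple balance rule are assumptions for illustration.

```python
from collections import deque
from typing import Dict, List, Tuple

def process_with_recycling(balances: Dict[str, int], instructions: List[dict],
                           max_attempts: int = 3) -> Tuple[List[int], List[int]]:
    """Retry failed instructions later; report them as finally failed at the limit."""
    queue = deque((ins, 1) for ins in instructions)
    settled: List[int] = []
    finally_failed: List[int] = []
    while queue:
        ins, attempt = queue.popleft()
        if balances[ins["debit"]] >= ins["qty"]:
            balances[ins["debit"]] -= ins["qty"]
            balances[ins["credit"]] += ins["qty"]
            settled.append(ins["id"])
        elif attempt < max_attempts:
            queue.append((ins, attempt + 1))  # keep as pending and retry later
        else:
            finally_failed.append(ins["id"])
    return settled, finally_failed

balances = {"X": 0, "Y": 100, "Z": 0}
instructions = [{"id": 1, "debit": "X", "credit": "Z", "qty": 60},
                {"id": 2, "debit": "Y", "credit": "X", "qty": 80},
                {"id": 3, "debit": "Z", "credit": "Y", "qty": 500}]
settled, finally_failed = process_with_recycling(balances, instructions)
```

Instruction 1 fails at first and settles once instruction 2 has funded account X, whereas the large movement in instruction 3 never meets its conditions and is reported as finally failed.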
This prior art recycling process is useful in reducing the number of new instruction submissions to the execution engine. By retaining the instruction as ‘pending’ there is a greater chance it will be processed when the conditions change. Given the volume of instructions being processed, there is a strong likelihood that most recycled instructions will be executed without having to be reported as finally failed.
This recycling however, leads to a problem in the throughput of the execution engine in that it is slowed down by the recycling process. In particular, where there are parallel inputs the instructions describing large resource account movements are often failed as the conditions for their execution are not reached within the recycling period (before time out occurs). Where there are sequential inputs, the failure can lead to a greater number of runs being required to handle the failed instructions.
There have been many efforts to attempt to overcome these problems. Also the amount of money and resources dedicated to finding a solution to these issues is sizable. Despite these efforts, the problem of throughput versus real-time processing in instruction processing architecture remains.
Various prior art techniques have been described in the following papers which serve to illustrate the problem and the length of time it has been known without a viable solution being found.
1) KAI LI ET AL: ‘Multiprocessor Main Memory Transaction Processing’, 5 Dec. 1988, pages 177-187, XP010280434.
2) GARCIA-MOLINA H ET AL: ‘MAIN MEMORY DATABASE SYSTEMS: AN OVERVIEW’, IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, IEEE SERVICE CENTRE, LOS ALAMITOS, Calif., US, vol. 4, no. 6, 1 Dec. 1992, pages 509-516, XP001167057, ISSN: 1041-4347.
3) GRAY J ET AL: ‘Transaction processing: concepts and techniques, PASSAGE’, 1 Jan. 1993, TRANSACTION PROCESSING: CONCEPTS AND TECHNIQUES, PAGE(S) 249-267, 301, XP002323530.
4) GRAY J ET AL: ‘Transaction processing: concepts and techniques’, 1 Jan. 1993, TRANSACTION PROCESSING: CONCEPTS AND TECHNIQUES, PAGE(S) 333-347, 496, XP002207245.
The present invention seeks to overcome at least some of the above mentioned problems and to provide an improved system for processing and handling very high numbers of processing instructions in real time.
Before considering the further, more detailed objectives of the present invention, it is important to understand some important characteristics of any instruction processing architecture, and these are set out below.
Each new system will have a specific throughput requirement. The throughput represents the number of instructions a system should execute in a predefined period of time to comply with the system's objective. The throughput can be represented by a number of processed instructions per day, per hour, per minute or per second. The throughput is qualified as ‘high throughput’, when it is greater than 100 instructions per second.
In systems implementing a processing architecture with a single instance of the instruction processor, this processor should be able to achieve the required throughput. However, where the single processor cannot achieve this, having multiple instances of processors processing instructions in parallel should allow the objective to be achieved. In this latter case, the global throughput is the sum of the throughput reached by each of the instances. If the instruction execution process is composed of multiple subsequent steps, the throughput of the overall instruction execution process is determined by the throughput of the weakest (slowest) step, namely the bottleneck.
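The two throughput relationships stated above, together with the 100 instructions per second threshold for ‘high throughput’, can be written out directly. The function names and the example rates are illustrative assumptions.

```python
from typing import List

HIGH_THROUGHPUT_PER_SECOND = 100  # threshold given in the definition above

def parallel_throughput(instance_rates: List[int]) -> int:
    """Global throughput of parallel instances is the sum of their rates."""
    return sum(instance_rates)

def pipeline_throughput(step_rates: List[int]) -> int:
    """Throughput of subsequent steps is set by the slowest step."""
    return min(step_rates)

def is_high_throughput(rate_per_second: int) -> bool:
    return rate_per_second > HIGH_THROUGHPUT_PER_SECOND

# Hypothetical rates in instructions per second.
three_instances = parallel_throughput([40, 40, 40])
three_steps = pipeline_throughput([500, 120, 90])
```

Three parallel instances at 40 instructions per second reach 120 and qualify as high throughput, whereas a three-step process is held to the 90 instructions per second of its slowest step.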
The response time of a new system represents the elapsed time between the receipt of an incoming instruction execution request from a third party server and the sending of a related reply back to that server. An instruction processing system having an average response time below five seconds can be qualified as a ‘real time’ system. When running a single instance of the instruction processor, the response time can be measured as the time to process the request (read the request and execute the instruction) and the time to send the reply (generate the reply and send it). If requests arrive at a rate above the throughput of the system, queuing of requests occurs. In this case, the time a request spends in a queue has to be considered as part of the overall system response time to that request. When the instruction processor is composed of multiple processing steps, the overall response time of the system is calculated as the sum of the response times of each of the multiple steps.
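The response time definition can be illustrated as the time spent queuing plus the sum of the response times of each step, with the five second average used as the ‘real time’ threshold. The function names and the timings in seconds are invented for illustration.

```python
from typing import List

REAL_TIME_LIMIT_SECONDS = 5.0  # average response time threshold given above

def response_time(queue_wait: float, step_times: List[float]) -> float:
    """Overall response time: time in the queue plus the time of each step."""
    return queue_wait + sum(step_times)

def is_real_time(response_times: List[float]) -> bool:
    """A system qualifies if its average response time is below the limit."""
    return sum(response_times) / len(response_times) < REAL_TIME_LIMIT_SECONDS

# Hypothetical requests: (seconds spent queuing, seconds spent in each step).
quiet = response_time(0.0, [0.5, 0.25, 0.25])
busy = response_time(1.0, [0.5, 0.25, 0.25])
overloaded = response_time(12.0, [0.5, 0.25, 0.25])
```

The processing steps are identical in all three cases; only the queuing time, which grows when requests arrive faster than the system's throughput, changes the result.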
Typically, each instruction processing system operates with hundreds of parties' servers and has corresponding resource accounts for each party stored locally. As each party can have many resource accounts (tens and hundreds are not uncommon), it is possible that the resource accounts to which instructions relate are uniformly spread across these many resource account files. However, in some specific applications of the instruction processing system, the requirements are such that a small set of resource account files are frequently involved in instructions such that they are updated with a high frequency. The concentration of an instruction processing system determines the degree to which a small set of resource account files are frequently involved in processed instructions. An instruction processing system having 50% of the instructions updating 5% of the resource account files is defined as having a ‘high concentration’.
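One possible way to measure concentration against the definition above is to compute the share of updates that fall on the most frequently updated 5% of resource account files. The function names, the integer rounding of the 5% figure and the sample workload are assumptions for illustration.

```python
from collections import Counter
from typing import List

def concentration(updated_accounts: List[str], total_accounts: int,
                  top_percent: int = 5) -> float:
    """Share of updates that hit the busiest top_percent of all account files."""
    top_n = max(1, (total_accounts * top_percent + 99) // 100)  # rounded up
    counts = Counter(updated_accounts)
    top_updates = sum(count for _, count in counts.most_common(top_n))
    return top_updates / len(updated_accounts)

def is_high_concentration(updated_accounts: List[str], total_accounts: int) -> bool:
    """High concentration: at least 50% of updates on 5% of the account files."""
    return concentration(updated_accounts, total_accounts) >= 0.5

# Hypothetical workload over 100 account files: 60 updates on 5 accounts
# (12 each) and 40 updates spread one each over 40 other accounts.
updates = [f"acct{i}" for i in range(5)] * 12 + [f"acct{i}" for i in range(5, 45)]
share = concentration(updates, total_accounts=100)
```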
Given the above described characteristics, another more specific objective of the present invention is to provide an instruction processing system which operates in real time (as defined above), has a high throughput (as defined above) and which can handle a high concentration (as defined above).