The present invention relates, in general, to processing a batched unit of work, and, in particular, to processing a batched unit of work associated with a plurality of messages for use with a data store.
In message-driven transactional applications, a server performs a disk force for a message transaction when a commit decision has been computed. Disk forces are expensive and cause delay. With message-driven transactions, it is possible to improve efficiency, by batching, such that a single transaction is associated with a predefined number of messages. As such, instead of performing a disk force for a commit decision for each message, a disk force is performed for a number of commit decisions associated with the predefined number of messages. However, batching causes problems: for example, if an error causes one message to fail, the work associated with all of the messages in the (batched) transaction will be backed out.
With reference to FIG. 1, there is shown a system 100 comprising an environment 120 having underlying messaging support comprising a queue manager 130, an input queue 135, an output queue 140, a failure queue 145, and a computer program, e.g., a database manager 125 with database tables 127, of a data processing system.
The system 100 also comprises a message broker 105 hosting an execution group 110 that in turn hosts a message flow 115. A message flow is a visual representation of the sequence of operations performed by the processing logic of a message broker as a directed graph (a message flow diagram) between an input and a target (for example, an input queue and a target queue). The message flow diagram comprises message processing nodes, which are representations of processing components, and message flow connectors between the nodes.
In the example herein, the message flow 115 represents the processing logic that operates on the input queue 135, the output queue 140, the failure queue 145, and the database manager 125.
In a first example, typically, the processing logic gets a message from the input queue 135 (this starts a transaction), updates a database table 127, and puts a message to the output queue 140. This work is normally executed in the transaction, and the work is committed before the next message is obtained from the input queue 135. If there is an error, the work is backed out and the message is “put back” on the input queue 135; in this case, the message's Backout Count is incremented by 1.
Typically, there is also defined a Backout Threshold for queues, which represents the number of times a message is allowed to be backed out before it is put to the failure queue 145. In other words, if the Backout Count is greater than the Backout Threshold, the message is put to the failure queue 145.
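The per-message processing of this first example may be sketched as follows. This is a minimal, illustrative Python sketch, not the actual interfaces of the queue manager 130 or the database manager 125: queues are in-memory lists, the commit is implicit at the end of each successful iteration, and a raised exception before any state change stands in for the backing-out of the transaction's work.

```python
BACKOUT_THRESHOLD = 3  # hypothetical value; real systems configure this per queue

def process_messages(input_queue, output_queue, failure_queue, database, process):
    """Drain input_queue, committing each message's work in its own transaction."""
    while input_queue:
        msg = input_queue.pop(0)                # get a message: starts a transaction
        if msg["backout_count"] > BACKOUT_THRESHOLD:
            failure_queue.append(msg)           # too many backouts: failure queue
            continue
        try:
            database.append(process(msg))       # update a database table
            output_queue.append(msg)            # put a message to the output queue
            # commit point: in the real system, the transaction coordinator
            # forces a log write here for every single message
        except Exception:
            # back out: the work above is undone (nothing was appended, since
            # process() raised first), and the message is "put back" with its
            # Backout Count incremented by 1
            msg["backout_count"] += 1
            input_queue.append(msg)
```

A poisoned message thus cycles through the input queue until its Backout Count exceeds the threshold, after which it is diverted to the failure queue while well-formed messages commit normally.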
The problem with this example is performance. Committing work causes an associated transaction coordinator and the resource owners (in this example, the queue manager 130 and the database manager 125) to force writes to a log, which causes delay.
A solution to the above example is provided by a second example. It is possible to improve efficiency, by batching, such that a single transaction is associated with a predefined number of messages. In this way, instead of committing work for each message, the transaction is committed after the predefined number of messages has been processed. In one implementation, a message flow attribute termed Commit Count is used, in which, prior to processing, a Message Count is initially set to zero.
Next, the processing logic gets a message from the input queue 135 (this starts a transaction). The Message Count is incremented by 1. A check is made to determine whether the Backout Count is greater than the Backout Threshold, and, if so, the message is put to the failure queue 145. If the Backout Count is not greater than the Backout Threshold, the database table 127 is updated and the message is put to the output queue 140.
A further check is made to determine whether the Message Count is greater than or equal to the Commit Count. If the Message Count is not greater than or equal to the Commit Count, the processing logic gets another message from the input queue 135, and the above steps are repeated. If the Message Count is greater than or equal to the Commit Count, the transaction, which covers the whole batch of messages, is committed (a single log write is forced for the transaction), and the Message Count is set to zero. There are a number of problems associated with this example.
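The batched steps above may be sketched as follows. Again this is an illustrative, in-memory Python sketch under assumed values for the Commit Count and Backout Threshold; the returned counter simply tallies forced log writes so that the saving over per-message commits is visible.

```python
COMMIT_COUNT = 3       # hypothetical message flow attribute
BACKOUT_THRESHOLD = 3  # hypothetical per-queue setting

def process_batched(input_queue, output_queue, failure_queue, database):
    """Process all messages, forcing one log write per batch of COMMIT_COUNT."""
    log_forces = 0
    message_count = 0                       # Message Count initially set to zero
    while input_queue:
        msg = input_queue.pop(0)            # get a message
        message_count += 1                  # increment the Message Count by 1
        if msg["backout_count"] > BACKOUT_THRESHOLD:
            failure_queue.append(msg)       # route to the failure queue
        else:
            database.append(msg["id"])      # update the database table
            output_queue.append(msg)        # put to the output queue
        if message_count >= COMMIT_COUNT:   # Commit Count reached:
            log_forces += 1                 # one log force covers the batch
            message_count = 0               # reset the Message Count
    if message_count:                       # commit any final partial batch
        log_forces += 1
    return log_forces
```

For example, seven messages with a Commit Count of three would force only three log writes rather than seven.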
First, although Commit Count, as implemented, speeds up the processing considerably, it has the drawback that if there is an error that causes the transaction to be backed out, the whole batch of messages is affected. For example, if Commit Count is set at “300” and the 200th message causes an exception, all 200 messages in the current batch will be backed out and, eventually, put to the failure queue. Second, database managers may resort, under heavy load, to “lock escalation,” which can result in commits failing, even when there are no application errors. In particular, commits may fail when a large batch of messages is processed, even when smaller batches of messages containing the same information would succeed.
A solution to the above problem is provided by a third example. In an implementation, there is provided a second, “cloned” message flow that runs with Commit Count=1. When the first message flow reads a message with Backout Count>Backout Threshold, instead of putting the message to the failure queue, it puts the message to the second, “cloned” message flow's input queue (the message is put to that input queue with Backout Count=0). When the second, “cloned” message flow reads a message with Backout Count=0, it processes the message in the same way as the first message flow. When the second, “cloned” message flow reads a message with Backout Count>Backout Threshold, it puts the message to the failure queue. Referring back to the earlier example, if the 200th message of a batch fails, the second, “cloned” message flow will successfully process messages 1 to 199, and the 200th message will fail again and be put to the failure queue.
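The routing rules of this third example may be sketched as follows. The flow functions and queue representations here are simplified Python assumptions, not an actual broker API: the first (batched) flow diverts a repeatedly failing message to the clone's input queue with its Backout Count reset, and the clone, running with Commit Count=1, either commits the message alone or, on a second round of failures, puts it to the failure queue.

```python
BACKOUT_THRESHOLD = 2  # hypothetical per-queue setting

def first_flow_route(msg, clone_input_queue):
    """First (batched) flow: a message that has exceeded the Backout
    Threshold goes to the clone's input queue with Backout Count=0,
    instead of to the failure queue."""
    if msg["backout_count"] > BACKOUT_THRESHOLD:
        clone_input_queue.append(dict(msg, backout_count=0))
        return True        # rerouted to the cloned flow
    return False           # process normally within the batch

def clone_flow_step(msg, output_queue, failure_queue, process):
    """Cloned flow, Commit Count=1: each message commits alone, so one
    bad message cannot back out its neighbours in a batch."""
    if msg["backout_count"] > BACKOUT_THRESHOLD:
        failure_queue.append(msg)              # failed again: failure queue
        return
    try:
        output_queue.append(process(msg))      # process and commit per message
    except Exception:
        msg["backout_count"] += 1              # backed out; will be retried
```

Because the clone commits per message, the 199 good messages of the earlier example succeed individually, while the poisoned 200th message exhausts its Backout Threshold a second time and lands in the failure queue.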
The third example introduces more problems. First, an extra administration overhead occurs in that there are two copies of the message flow to deploy and operate. Second, message sequence is lost: for example, if a message fails in the middle of a large batch, each of the messages in the batch will be processed by the second (clone) flow in parallel with the first (original) flow. This may not be acceptable for applications that have to process messages in order.