Asynchronous transfer of messages between application programs running on different data processing systems within a network is well known in the art, and is implemented by a number of commercially available messaging systems. A sender application program issues a command to send (put) a message to a target queue, and a queue manager program handles the complexities of transferring the message from the sender to the target queue, which may be remotely located across a heterogeneous computer network. The target queue is a local input queue for another application program, which retrieves (gets) the message from this input queue asynchronously from the send operation. The receiver application program then performs its processing on the message, and may generate further messages.
Thus the receiver application program services requests that are instigated by the messages it retrieves and consumes. Generally such applications are transacted, i.e. they consume each request message inside a transaction, and on successful completion of the request the transaction is committed. When the transaction commits, the message is removed from the queue. Such an application will, however, occasionally be unable to process a request/message successfully. If the consuming application fails to process the request, the transaction may be rolled back. Rolling back a transaction makes the message available again on the queue, generally at the head of the queue (if the queue works in a FIFO way), so that the consuming application is given the same message when it asks for the next message on the queue. If the application is still unable to process the request, another rollback occurs and the whole process repeats.
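The redelivery loop described above can be illustrated with a minimal sketch. It is a toy in-memory model, not the API of any particular messaging system: the class TransactedQueue, the function consume, and the max_attempts cap (present only so that the demonstration terminates) are all hypothetical names introduced for illustration.

```python
from collections import deque


class TransactedQueue:
    """Toy in-memory FIFO queue with one open transaction at a time."""

    def __init__(self, messages):
        self._messages = deque(messages)
        self._in_flight = None  # message held by the open transaction

    def get(self):
        """Begin a transaction and hand out the message at the head."""
        self._in_flight = self._messages.popleft()
        return self._in_flight

    def commit(self):
        """Success: the message is removed from the queue for good."""
        self._in_flight = None

    def rollback(self):
        """Failure: the message becomes available again at the head."""
        self._messages.appendleft(self._in_flight)
        self._in_flight = None

    def __len__(self):
        return len(self._messages)


def consume(queue, handler, max_attempts):
    """Get/commit/rollback loop; returns the messages in delivery order."""
    delivered = []
    while len(queue) and len(delivered) < max_attempts:
        message = queue.get()
        delivered.append(message)
        try:
            handler(message)
            queue.commit()
        except Exception:
            queue.rollback()
    return delivered


def handler(message):
    # Hypothetical application logic that can never process "bad".
    if message == "bad":
        raise ValueError("cannot process")


queue = TransactedQueue(["ok-1", "bad", "ok-2"])
deliveries = consume(queue, handler, max_attempts=5)
print(deliveries)  # ['ok-1', 'bad', 'bad', 'bad', 'bad']
```

In this sketch the message "ok-2" is never delivered, because the rolled-back message is returned to the head of the queue on every attempt.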
Messaging systems provide the ability to break out of this potentially endless loop in one of two ways:
a) Provision of a ‘dead letter queue’ (DLQ) or ‘exception destination’ and the detection by the messaging system of a message being re-delivered repeatedly. Once the consumption of a message has been rolled back a certain number of times (past a defined threshold) the messaging system will automatically move the message to the dead letter queue or exception destination so that it is no longer seen by the consuming application. The consuming application will now be able to process the next message in the queue. Messages on the DLQ can be the subject of administrator attention.
b) Rather than moving a problem message to another queue in the event of that message being rolled back past a certain threshold, the consuming application is stopped. The consuming application may be managed by an application server, in which case the application server is able to stop the consuming application. At this point the administrator must step in to restart the application once the problem has been resolved.
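The two approaches above can be sketched side by side. This is an illustrative model under stated assumptions, not the behaviour of any specific product: ROLLBACK_THRESHOLD, run_consumer, and the policy names "dlq" and "stop" are hypothetical, and messages are assumed to be unique strings so that rollbacks can be counted per message.

```python
from collections import deque

ROLLBACK_THRESHOLD = 3  # hypothetical value; real systems make this configurable


def run_consumer(messages, handler, policy):
    """Consume messages under policy "dlq" (approach a) or "stop" (approach b).

    Returns (processed, dead_letters, remaining, stopped).
    """
    queue = deque(messages)
    dead_letters = []
    processed = []
    rollbacks = {}  # rollback count per message
    while queue:
        message = queue[0]  # head of the FIFO queue; stays there until commit
        try:
            handler(message)
        except Exception:
            # Rollback: the message remains at the head of the queue.
            rollbacks[message] = rollbacks.get(message, 0) + 1
            if rollbacks[message] >= ROLLBACK_THRESHOLD:
                if policy == "dlq":
                    # Approach (a): move the message aside and carry on.
                    dead_letters.append(queue.popleft())
                else:
                    # Approach (b): stop the consuming application.
                    return processed, dead_letters, list(queue), True
            continue
        queue.popleft()  # commit: the message leaves the queue
        processed.append(message)
    return processed, dead_letters, list(queue), False


def handler(message):
    # Hypothetical application logic that rejects one malformed message.
    if message == "poison":
        raise ValueError("badly formed message")


messages = ["m1", "poison", "m2"]
result_dlq = run_consumer(messages, handler, "dlq")
result_stop = run_consumer(messages, handler, "stop")
print(result_dlq)   # (['m1', 'm2'], ['poison'], [], False)
print(result_stop)  # (['m1'], [], ['poison', 'm2'], True)
```

Under the "dlq" policy the consumer continues past the failing message, whereas under the "stop" policy the failing message and everything behind it remain on the queue until the application is restarted.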
These two solutions address two different situations.
Situation 1
A so-called ‘poison’ message is introduced, for instance a badly formed message that the consuming application will never be able to process successfully. Solution (a) handles this by automatically moving such a message aside once the rollback threshold is passed, so that subsequent messages in the queue can be processed. Solution (b), however, stops the application on this poison message. This is not ideal: the problem does not lie with the application, yet stopping it prevents timely processing of further, correctly formed messages. The administrator is forced to intervene to remove the offending message and restart the application to process any subsequent messages.
Situation 2
The consuming application experiences a transitory problem that prevents it from processing any messages for an unknown period, for instance its backend database connection goes down for ten minutes. In this situation solution (a) can cause the entire queue of messages to be transferred to the dead letter queue as fast as they arrive. This continues until the administrator notices and stops the application by hand, fixes the problem, and moves all the messages back from the dead letter queue onto the original queue to be consumed. Solution (b), however, stops the application on the first message and waits for the administrator to intervene, ideally after the database connection has been re-established, so that no messages need to be moved from one queue to another and back again.
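The effect of a transitory outage on each solution can be shown with a small simulation. It is only a sketch: THRESHOLD, OUTAGE_ATTEMPTS, and simulate are hypothetical names, and the outage is modelled as a fixed number of failed processing attempts rather than as elapsed time.

```python
from collections import deque

THRESHOLD = 3        # hypothetical rollback threshold
OUTAGE_ATTEMPTS = 7  # hypothetical: the backend fails for the first 7 attempts


def simulate(messages, policy):
    """Run a consumer through a transient outage under "dlq" or "stop".

    Returns (processed, dead_letters, remaining).
    """
    queue = deque(messages)
    dead_letters = []
    processed = []
    attempts = 0
    rollbacks = 0  # rollbacks of the message currently at the head
    while queue:
        attempts += 1
        if attempts > OUTAGE_ATTEMPTS:
            # Backend reachable again: process and commit.
            processed.append(queue.popleft())
            rollbacks = 0
            continue
        rollbacks += 1  # backend down: the transaction is rolled back
        if rollbacks >= THRESHOLD:
            if policy == "stop":
                break  # solution (b): halt the consumer, move nothing
            dead_letters.append(queue.popleft())  # solution (a)
            rollbacks = 0
    return processed, dead_letters, list(queue)


messages = ["m1", "m2", "m3", "m4"]
outage_dlq = simulate(messages, "dlq")
outage_stop = simulate(messages, "stop")
print(outage_dlq)   # (['m3', 'm4'], ['m1', 'm2'], [])
print(outage_stop)  # ([], [], ['m1', 'm2', 'm3', 'm4'])
```

In this model solution (a) sends two perfectly valid messages to the dead letter queue before the backend recovers, while solution (b) leaves every message in place at the cost of requiring a manual restart.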
Unfortunately, neither of these solutions satisfactorily protects a system against both of these potential problem situations.