Various aspects of the present invention relate generally to preserving message order in an asynchronous messaging system.
With reference to FIG. 1, in an asynchronous messaging system (100), a sending application (termed “producer” (105) herein) sends a message to a queue (115) of a receiving application (termed “consumer” (110) herein) for processing.
When using an asynchronous messaging system, there is often a need for messages from a producer to be processed in the order in which they were sent. For example, if a set of messages represents inserts, updates, deletes, etc. for a database, the results can be disastrous if the messages are processed out of order. In particular, if an operation to insert a row is followed by an operation to delete that row, and the two are processed in the wrong order (i.e., the delete operation is processed before the insert operation), a row that should have been deleted remains in the database.
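The insert/delete hazard described above can be illustrated with a minimal sketch. The in-memory set standing in for a database table, the function name `apply_operations`, and the row key `row1` are all hypothetical and chosen only for illustration; deleting an absent row is assumed to be a harmless no-op.

```python
def apply_operations(operations):
    """Apply (kind, key) operations to an in-memory table and return it.

    The set is only a stand-in for a database table (an assumption made
    for illustration, not part of any described system).
    """
    table = set()
    for kind, key in operations:
        if kind == "insert":
            table.add(key)
        elif kind == "delete":
            # Assumed behavior: deleting a row that is not present does nothing.
            table.discard(key)
    return table


# Messages in the order the producer sent them.
sent = [("insert", "row1"), ("delete", "row1")]

in_order = apply_operations(sent)                    # insert, then delete
out_of_order = apply_operations(list(reversed(sent)))  # delete, then insert
```

With in-order processing the table ends up empty, whereas with the reversed order `row1` is left behind.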
A producer naturally orders messages as they are produced. To ensure that the order is maintained, the most common solution is to configure a producer to send all its messages to a single queue associated with a single consumer. The single consumer processes all messages from the queue in the order in which the messages were placed into the queue.
Unfortunately, the consumer of such messages becomes a single point of failure. Furthermore, since only a single consumer is used, the system is not scalable. This solution restricts the ability to build dynamic, flexible messaging architectures that can support different numbers of consumers. By contrast, multiple consumers can aid with load balancing requirements and can be utilized to route around a failed consumer.
Thus, to provide scalability in some asynchronous messaging systems, multiple consumers are employed to process messages stored in a shared queue. There is no guarantee that different consumers will process messages taken from a shared queue at the same rate. With reference to an asynchronous messaging system (200) in FIG. 2, if two consumers (210, 220) listen on a single queue (215), one consumer may process messages faster than the other consumer.
For example, if Consumer 1 (210) executes operations at half the speed of Consumer 2 (220), Consumer 1 (210) can take twice as long as Consumer 2 (220) to process some messages. Furthermore, Consumer 1 typically has no knowledge of the existence of Consumer 2. Thus, a message sequence m1, m2, m3 may actually be processed in the order m1, m3, m2 if Consumer 2 (220) retrieves and processes message m1, Consumer 1 (210) retrieves m2, and Consumer 2 (220) then retrieves and processes message m3 before Consumer 1 (210) completes its processing of m2.
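The reordering described above can be reproduced with a small deterministic simulation. This is a sketch under stated assumptions: the function `simulate`, the consumer names, the start times and the per-message processing times are all hypothetical, and the slower consumer is assumed to begin retrieving slightly later than the faster one so that the interleaving matches the example.

```python
import heapq


def simulate(messages, consumers):
    """Return message ids in the order their processing completes.

    consumers: list of (name, start_time, time_per_message) tuples.
    All values are illustrative assumptions, not measured figures.
    """
    queue = list(messages)  # shared FIFO queue
    # Heap of (time_free, name, cost): when each consumer can next retrieve.
    free = [(start, name, cost) for name, start, cost in consumers]
    heapq.heapify(free)
    completions = []
    while queue:
        time_free, name, cost = heapq.heappop(free)
        msg = queue.pop(0)            # next message is retrieved in FIFO order
        done = time_free + cost       # but completion depends on consumer speed
        completions.append((done, msg))
        heapq.heappush(free, (done, name, cost))
    completions.sort()
    return [msg for _, msg in completions]


messages = ["m1", "m2", "m3"]

# Consumer 2 needs 1 time unit per message; Consumer 1 needs 2 (half the speed).
shared_queue_order = simulate(
    messages, [("consumer2", 0.0, 1.0), ("consumer1", 0.5, 2.0)]
)

# A single consumer on the same queue, for comparison.
single_consumer_order = simulate(messages, [("consumer", 0.0, 1.0)])
```

Under these assumptions the shared queue yields the completion order m1, m3, m2, while the single consumer preserves m1, m2, m3, even though retrieval from the queue is first-in, first-out in both cases.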
In one prior art solution, a system is disclosed that can be used when multiple consumers process messages from a shared queue. In the prior art, messages of a particular kind are marked (e.g., by a producer, a filtering application, etc.) with a globally unique sequence number (GUS). The multiple consumers must have access to a first relational database that stores data (e.g., GUS, message payload) associated with a last message that was processed and a second relational database that stores data (e.g., GUS) associated with an out of sequence message. Thus, if produced messages m1, m2 and m3 are received in the order m1, m3, m2, then m1 is processed first and is inserted in the first relational database. When m3 is received, a query against the first relational database determines that m3 is out of order. Message m3 is inserted in the second relational database and held there until m2 is received. When m2 is received, a query is executed against the first relational database to determine that m1 has been processed. Thus, m2 and m3 can then be processed. The first relational database is then updated to reflect that m2 and m3 have been processed.
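The resequencing flow described above can be sketched as follows. This is not the prior art implementation: the class name `Resequencer`, the use of plain in-memory Python structures in place of the two relational databases, the choice to hold the payload alongside the GUS of an out of sequence message, and the assumption that sequence numbers are consecutive integers starting at 1 are all hypothetical simplifications made for illustration.

```python
class Resequencer:
    """Re-establish message order using sequence numbers (illustrative only)."""

    def __init__(self):
        # Stand-in for the first relational database: last processed GUS.
        self.last_processed = 0
        # Stand-in for the second relational database: out of sequence messages.
        self.held = {}
        # Record of the order in which payloads were actually processed.
        self.processed = []

    def receive(self, gus, payload):
        # "Query" the first store: is this the next expected message?
        if gus != self.last_processed + 1:
            self.held[gus] = payload  # out of order, so hold it for later
            return
        self._process(gus, payload)
        # Release any held messages that have now become processable.
        while self.last_processed + 1 in self.held:
            next_gus = self.last_processed + 1
            self._process(next_gus, self.held.pop(next_gus))

    def _process(self, gus, payload):
        self.processed.append(payload)
        self.last_processed = gus  # update the first store


resequencer = Resequencer()
for gus, payload in [(1, "m1"), (3, "m3"), (2, "m2")]:  # arrival order
    resequencer.receive(gus, payload)

processed_order = resequencer.processed
still_held = resequencer.held
```

With the arrival order m1, m3, m2, message m3 is held until m2 arrives, after which both are processed, giving the original order m1, m2, m3. Every consumer would need to consult the same two stores, which is the shared access the prior art requires.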
The prior art solution allows messages to become out of order in transit and then allows for the order to be re-established by utilizing GUS and relational databases. However, there is a requirement for message payload to be stored, which creates resource-consuming overhead. There is also a requirement for consumers to share access to the relational databases. This can cause overhead and reduce scalability, since the number of consumers that can share access to the relational databases is limited.
In some asynchronous messaging systems, multiple consumers are “clustered”, wherein each consumer has an associated queue. To preserve message ordering, such systems typically provide a feature wherein a producer can select a single instance of a consumer to which to send messages. However, if the message is to be sent via a chain of different, clustered consumers, either the message ordering requirement must be sacrificed (i.e., later messages are allowed to overtake earlier messages so that each consumer in the chain can be selected dynamically for each message) or the ability to build dynamic, flexible messaging architectures is sacrificed (i.e., each component explicitly specifies the next component to which the message must be sent).