Information sharing is becoming more important as businesses grow and become more global. Not only must companies share information within the company, but also with customers and partners. Information is bound to become more widely dispersed and shared as companies migrate to a grid computing model. Grids are dynamic in nature, which requires that information can be easily and quickly moved from a source to a destination system that will perform the computing.
In the context of an information sharing system, a queue is a structure that holds messages and provides access and ordering functionality. Because the primary usage of such a queue is for messaging, the queue is at times referred to herein as a message queue. A “queue table” is the table where data for a queue is stored. “Dequeue sort order” is a property of a queue table that specifies the ordering of messages.
When Web-based business and other applications communicate with each other, producer applications enqueue messages and consumer applications dequeue messages. At the most basic level of queuing, one producer enqueues one or more messages into one queue, where each message is dequeued and processed once by one of the consumers. A message stays in the queue until a consumer dequeues it or the message expires. A producer may stipulate a delay before the message is available to be consumed, and a time after which the message expires. Likewise, a consumer may wait when trying to dequeue a message if no message is available. An agent program or application may act as both a producer and a consumer. Producers can enqueue messages in any sequence. Messages are not necessarily dequeued in the order in which they are enqueued. Messages can be enqueued without being dequeued.
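The basic enqueue and dequeue behavior described above, including a producer-specified delay and expiration, can be illustrated with a minimal in-memory sketch. The class name `SketchQueue`, its method signatures, and the choice to measure expiration from the moment a message becomes available are all assumptions made for illustration; they are not taken from any particular messaging product.

```python
import heapq
import itertools


class SketchQueue:
    """Toy in-memory queue; the class and its methods are hypothetical."""

    def __init__(self):
        # Heap entries: (visible_at, seq, expires_at, payload).
        self._heap = []
        # Unique sequence number breaks ties in enqueue sequence.
        self._seq = itertools.count()

    def enqueue(self, payload, now, delay=0.0, expiration=None):
        visible_at = now + delay
        # Assumption: expiration is counted from when the message becomes available.
        expires_at = None if expiration is None else visible_at + expiration
        heapq.heappush(self._heap, (visible_at, next(self._seq), expires_at, payload))

    def dequeue(self, now):
        """Return the next available payload, or None if nothing can be consumed yet."""
        while self._heap and self._heap[0][0] <= now:
            _visible_at, _seq, expires_at, payload = heapq.heappop(self._heap)
            if expires_at is not None and now >= expires_at:
                continue  # the message expired before any consumer dequeued it
            return payload
        return None


q = SketchQueue()
q.enqueue("a", now=0, delay=5)        # not available until time 5
q.enqueue("b", now=1)                 # available immediately
q.enqueue("c", now=2, expiration=1)   # expires at time 3 if not consumed

results = [q.dequeue(now=1), q.dequeue(now=4), q.dequeue(now=5)]
print(results)  # ['b', None, 'a']
```

In this sketch, message "a" is enqueued first but dequeued last because of its delay, and message "c" is enqueued but never dequeued because it expires, which mirrors the observations that messages are not necessarily dequeued in enqueue order and can be enqueued without being dequeued.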
At a slightly higher level of complexity, many producers enqueue messages into a queue, all of which are processed by one consumer. Or many producers enqueue messages, each message being processed by a different consumer depending on type and correlation identifier.
A system, such as a database system, can form messages by mining transaction logs or data in a database. A system may also form messages when events are fired. Heterogeneous systems may share information by enqueuing messages into a queue using APIs or a messaging gateway. Users of a message queue can dequeue messages manually for processing by a client application.
For example, a point-of-sale system sharing information with a reporting database in real time is a scenario in which concurrent message capture may occur. A store may have multiple point-of-sale terminals acting as clients of the store inventory database and sending order processing information to the store database. Corporate headquarters may have a system configured for handling near real-time reports using data streaming from the stores. This system may even be a grid if the corporation desires a dynamic system that resizes based on load. The store database can provide near real-time data by forming messages based on triggers fired when the terminals process sales. Thus, message queues can be integral components of a database information sharing system.
A database system that adheres to the ACID (Atomicity, Consistency, Isolation, Durability) model provides functionality for grouping operations in transactions that are atomic. Changes to data via a transaction are not visible to other transactions until the system has atomically committed the changes. If a transaction reads or modifies data written by a previously committed transaction then the transaction is said to have a data dependency on the prior transaction. Furthermore, transactional-level locks may also introduce transactional dependencies. Messages are often used to implement database transactions and, therefore, such messages may have transactional dependencies. For example, message X has a dependency on message Y if part of message X's data is derived directly or indirectly from message Y's data.
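The notion of a data dependency can be made concrete with a small sketch that derives direct dependencies from the items each transaction reads and writes. The function name `data_dependencies`, the tuple layout, and the item keys are hypothetical, and the sketch records only a dependency on the most recent committed writer of each item; it ignores lock-induced and indirect dependencies.

```python
def data_dependencies(committed):
    """Map each transaction to the prior transactions whose writes it reads or modifies.

    committed: list of (txn_id, reads, writes) tuples in commit order (assumed layout).
    """
    deps = {}
    last_writer = {}  # item -> transaction that most recently wrote it
    for txn_id, reads, writes in committed:
        touched = set(reads) | set(writes)
        deps[txn_id] = {last_writer[item] for item in touched if item in last_writer}
        for item in writes:
            last_writer[item] = txn_id
    return deps


# Illustrative input: TB modifies a row that TA wrote earlier.
history = [
    ("TA", set(), {"row:1"}),
    ("TB", {"row:2"}, {"row:1"}),
    ("TC", {"row:3"}, set()),
]
dependencies = data_dependencies(history)
print(dependencies)  # {'TA': set(), 'TB': {'TA'}, 'TC': set()}
```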
One approach to a message queue provides read ordering based at least in part on the enqueue-time of messages. Read order is the order in which messages are presented for browse and dequeue operations. This approach does not support data-dependency ordering because there is no way to read messages from the queue based on transactional dependencies. Users would have to build custom enqueue code that tags messages with extra metadata, which is then used by custom logic in the dequeuing application to enforce transactional ordering. Depending on the requirements of the application and the potential workload, this custom code can be non-trivial or infeasible to implement.
FIG. 1 is a block diagram that illustrates a data dependency violation in the context of a message queue, where messages are dequeued independent of transactional dependency ordering. This example shows that data dependency ordering may be violated with enqueue-time ordering. The arrow from transaction T2 to transaction T1 shows that T1 has a dependency on T2, because T1 is dependent on the value written to table TAB2 by T2.
FIG. 1 shows messages with data dependencies being shared between a source and destination database. At the source database, two sessions are enqueuing messages in the following sequence:
Session 1 enqueues message M1 as part of transaction T1. For example, message M1 contains an insert of a row into an ‘hr.departments’ table;
Session 2 enqueues message M2 as part of transaction T2. For example, message M2 contains an insert of a row into the ‘hr.employees’ table for the employee with an employee_id of 207; and
Session 1 enqueues message M3 as part of transaction T1. For example, message M3 contains an update to a row in the ‘hr.employees’ table for the employee with an employee_id of 207.
Session 3 dequeues messages from the source database to the destination database. The messages are dequeued in the following order:
Message M1 is dequeued, and the change is applied successfully.
Message M3 is dequeued, and an error results because no data is found for an employee with an employee_id of 207.
Message M2 is dequeued, and the change is applied. The result is that incorrect information is in the ‘hr.employees’ table for the employee with an employee_id of 207.
The correct dequeue order that obeys data dependencies and transaction grouping is (M2, M1, M3). Instead, enqueue-time ordering results in dequeue order (M1, M3, M2) because T1 was the first transaction to enqueue a message. An apply error results when message M3 is applied since the update depends on data that is populated by M2. Thus, after all messages have been applied, the state of TAB2 at the destination is not consistent with the state of TAB2 at the source.
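The FIG. 1 scenario can be replayed with a short simulation that applies the three messages to a toy destination under both orderings. This is only a sketch: the department key, the `salary` column and its values, and all function names are invented for illustration, and the dependency-aware ordering simply runs a topological sort over a hand-supplied "T1 depends on T2" relation rather than discovering the dependency.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (message, transaction, operation, table, key, values) in enqueue sequence.
# Keys and column values are hypothetical placeholders.
ENQUEUED = [
    ("M1", "T1", "insert", "hr.departments", 300, {"name": "example"}),
    ("M2", "T2", "insert", "hr.employees", 207, {"salary": 5000}),
    ("M3", "T1", "update", "hr.employees", 207, {"salary": 5500}),
]


def by_first_enqueue(messages):
    """Group messages by transaction, ordered by each transaction's first enqueue."""
    first_seen = {}
    for position, msg in enumerate(messages):
        first_seen.setdefault(msg[1], position)
    return sorted(messages, key=lambda m: first_seen[m[1]])  # stable sort


def by_dependency(messages, depends_on):
    """Group messages by transaction, with prerequisite transactions first."""
    txn_order = TopologicalSorter(depends_on).static_order()
    rank = {txn: i for i, txn in enumerate(txn_order)}
    return sorted(messages, key=lambda m: rank[m[1]])


def apply_all(messages):
    """Apply messages to an empty destination; collect messages that fail."""
    tables = {"hr.departments": {}, "hr.employees": {}}
    errors = []
    for name, _txn, op, table, key, values in messages:
        rows = tables[table]
        if op == "insert":
            rows[key] = dict(values)
        elif key in rows:
            rows[key].update(values)
        else:
            errors.append(name)  # update found no row to change
    return tables, errors


naive = by_first_enqueue(ENQUEUED)
ordered = by_dependency(ENQUEUED, {"T1": {"T2"}})
naive_tables, naive_errors = apply_all(naive)
ordered_tables, ordered_errors = apply_all(ordered)

print([m[0] for m in naive], naive_errors)      # ['M1', 'M3', 'M2'] ['M3']
print([m[0] for m in ordered], ordered_errors)  # ['M2', 'M1', 'M3'] []
```

Under the enqueue-time ordering, the destination row for employee 207 keeps the inserted values and never receives the update, whereas the dependency-aware ordering ends with the updated row.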
Additionally, the message ordering approach based on enqueue-time does not provide repeatable reads. A queue supports repeatable reads if messages are always seen in the same order for any set of reads.
FIG. 2 is a block diagram that illustrates a non-repeatable read in the context of a message queue, where a set of read operations (i.e., “browse” operations) performed twice without any intervening dequeues may result in two different sets of messages. This example shows that a client performing multiple browse operations is not guaranteed a well-defined read order. If the client operation is dependent on a deterministic ordering, then the client operation may fail.
FIG. 2 shows messages being enqueued and browsed within a database. Two sessions are enqueuing messages in the following sequence:
Session 1 enqueues message m1 as part of transaction T1;
Session 2 enqueues message m2 as part of transaction T2;
Session 1 enqueues message m3 as part of transaction T1;
Session 2 commits transaction T2; and
Session 1 commits transaction T1.
Session 3 browses messages in the queue at two different times. The first time session 3 browses messages, session 2 has committed, but session 1 has not yet committed. For this browse, the browse set shows messages in the order (m2, m1, m3). The second time session 3 browses messages, both session 1 and session 2 have committed. For this browse, the browse set shows messages in the order (m1, m3, m2). This could be a problem, for example, if the client application performs a set of browse operations to set up program state and then performs a set of dequeues that yields a different result set.
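The two browse results can be reproduced with a small simulation. The sketch rests on one possible reading of the scenario, which is an assumption: a browse set is a series of individual browse operations, each returning the first committed message not yet seen, and transaction T1 commits partway through the first series. The function names and the visibility rule (only committed messages are visible, ordered by each transaction's first enqueue) are likewise illustrative.

```python
ENQUEUE_SEQUENCE = [("m1", "T1"), ("m2", "T2"), ("m3", "T1")]


def read_order(committed):
    """Committed messages, ordered by when their transaction first enqueued."""
    first_seen = {}
    for position, (_msg, txn) in enumerate(ENQUEUE_SEQUENCE):
        first_seen.setdefault(txn, position)
    visible = [(msg, txn) for msg, txn in ENQUEUE_SEQUENCE if txn in committed]
    ordered = sorted(visible, key=lambda pair: first_seen[pair[1]])  # stable sort
    return [msg for msg, _txn in ordered]


def browse_next(committed, already_seen):
    """Return the first visible message this browse set has not yet returned."""
    for msg in read_order(committed):
        if msg not in already_seen:
            return msg
    return None


# First browse set: starts when only T2 has committed.
committed = {"T2"}
first_browse = [browse_next(committed, [])]
committed.add("T1")  # T1 commits while the browse set is still in progress
while (msg := browse_next(committed, first_browse)) is not None:
    first_browse.append(msg)

# Second browse set: both transactions have already committed.
second_browse = []
while (msg := browse_next(committed, second_browse)) is not None:
    second_browse.append(msg)

print(first_browse)   # ['m2', 'm1', 'm3']
print(second_browse)  # ['m1', 'm3', 'm2']
```

No message is dequeued between the two browse sets, yet the orders differ, which is the non-repeatable read described for FIG. 2.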
Based on the foregoing, this approach is not ideal for systems with concurrent enqueuing of dependent messages.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.