The present invention relates to information processing systems, and in particular to techniques for the delivery of a stream of events from an information producer to an information consumer.
Messaging middleware, by supporting general messaging and message transformation between network nodes, facilitates the integration of distributed components of a processing application. Middleware collects messages from information producers (xe2x80x9cpublishersxe2x80x9d), filters and transforms the messages as necessary, and disseminates applicable messages to interested information consumers (xe2x80x9csubscribersxe2x80x9d). This type of system is known as a publish/subscribe system. Publish/subscribe middleware systems are therefore important to the inter-connection of heterogeneous and distributed components of large information networks, which rely on message communication, in domains such as finance, process automation, and transportation. In some embodiments of such systems, xe2x80x9ceventsxe2x80x9d are reduced into messages, to be provided to the subscribers as event message streams.
An exemplary publish/subscribe system may be represented by an Information Flow Graph (xe2x80x9cIFGxe2x80x9d) as a set of information publishing nodes which emit event messages, a set of information subscribing nodes which read event messages, a set of intermediate nodes where messages coming from streams along different input arcs are interleaved into an outgoing stream of messages, i.e., a replica of the stream is sent along each outgoing arc. The nodes may be connected by arcs of at least two kinds:
Selection arcs, which xe2x80x9cfilterxe2x80x9d messages flowing from one node to another by only passing along messages which satisfy a predicate; and
Transformation arcs, which xe2x80x9ctransformxe2x80x9d messages flowing from one node to another by adding, dropping, and/or recomputing the fields of each message.
A publish/subscribe system may include a large network or xe2x80x9cgraphxe2x80x9d of nodes and arcs, with many possible paths from each publishing node to each subscribing node, each path comprised of possibly many selection and transformation arcs, resulting in a particular sequence of operations between each publisher/subscriber node pair.
Subscribers therefore receive information in the form of event message streams, possibly filtered with transforms and/or selects, where each event is described in a message, such as the price and volume of a new stock trade. However, a subscriber""s access to its event message stream may be interrupted for a number of reasons, e.g., because the subscriber may only be intermittently connected to the network, or because a temporary link or network failure has caused an interruption in the path from the publisher of the events to the subscriber.
What is required are techniques for addressing the problem of delivering equivalent, optimized event sequences to subscribers for, e.g., effectively recovering from an interruption of a subscriber""s stream of event messages, by delivering a potentially shorter sequence of messages having an equivalent effect on a subscriber""s application.
When a subscriber""s access to its network is resumed after a period of interruption, there is likely to be a set of missing events, i.e., the events which would have been delivered to the subscriber had the interruption not occurred. To satisfy the subscription, these missing events would normally have to be delivered in their entirety when access is resumed. Often, however, the nature of the subscriber""s application allows it to tolerate receiving a much smaller set of events, i.e., a summary of the missing events. For example, a subscriber application interested only in the maximum, minimum, and current price of XYZ""s stock can reconstruct this information after receiving just a few messages, rather than all missing messages. Replacing a possibly large set of missing events by a smaller summary is advantageous, since fewer messages have to travel through the network, and the subscriber need not process all missing events. By providing summaries, load on the message delivery system, and processing time in the subscriber, can be reduced.
To inform the system of the type of summary appropriate for a particular subscriber, the subscriber must specify what it considers to be the essential content of an event stream. This specification may be referred to as the event stream interpretation. An event stream interpretation is a rule that, e.g., converts an event stream into a state. Two different event streams are considered equivalent for a particular subscriber if according to the client""s event stream interpretation rule, the two streams yield the same state. In the two examples above, one subscriber would define a state having a tuple  less than max, min, current greater than , and the other would define a state having a count of number of drops below 120. Therefore, associated with each subscriber""s respective state is a rule for deriving that state from the event stream.
The present invention provides a technique for exploiting the fact that the subscribers have defined event interpretation rules, to deliver optimized sequences, that may differ from the published sequences, but which are equivalent under the interpretation rule. One such technique involves monitoring an event stream for each subscriber, and computing the states before and after a subscriber""s interruption using the respective event stream interpretation rule. After the interruption, the technique determines and delivers a (usually smaller) equivalent, but optimized summary of events which according to the given rule, converts the state before the interruption to the state after the interruption. Since these summary events yield the same state according to the subscriber""s event stream interpretation, they are equivalent to the missing events in their entirety, and may be delivered to the subscriber in place of the (usually larger) set of missing events.
The invention includes (1) converting the event stream interpretation rules to a state transition automaton in which each event corresponds to a state transition; (2) using the automaton to generate a xe2x80x9cdistance estimator function,xe2x80x9d estimating the cost in messages to traverse between two states; and (3) using the automaton and the distance estimator together with a shortest-path graph search technique, treating the set of possible states as the vertices of the graph and the events as arcs between vertices. The arcs in the resulting shortest-path between the state before the interruption and the state after the interruption constitute the desired, optimized set of summary events.
Another technique involves detecting that certain events do not affect the state, and need not be delivered.
In this regard, the present invention provides, in one aspect, an event message handling system in which a subscriber node specifies an event interpretation rule for mapping an event message stream into a state. An optimized version of the event message stream is delivered to the subscriber node which correctly maps to the first state using the event interpretation rule. A sequencer node elsewhere in the system may receive the event message stream, and use the event interpretation rule to update a state therein as a function of incoming messages also sent thereto of the event message stream. The sequencer node delivers the optimized version of the event message stream sufficient to correctly update the subscriber""s state such that the subscriber""s state is equivalent to the sequencer node""s representation of that state.
The invention may be used to recover the subscriber node from missing a portion of the event message stream in which case an optimized version of the missing portion of the event message stream is delivered to the subscriber. The missing portion of the event message stream may result from the subscriber temporarily losing, and then regaining, its access to the event message stream.
A shortest path graph search technique may be employed between a start and end of the missing portion of the event message stream to generate the optimized version thereof. If the event interpretation rule is in replacement form such that fields in the first state are replaced by parameters of the incoming messages, optimizations of the shortest path graph search technique are possible. In one optimization, the event interpretation rule in replacement form is extended to an unconstrained replacement form, and the extended unconstrained replacement form rule is used as an estimator function in the shortest path graph search technique. If the interpretation rule is an unconstrained replacement form the rule is extended into uniform unconstrained replacement form, and is thereafter used as the estimator function in the shortest path graph search technique.
A plurality of sequencer nodes may retain their own versions of a plurality of subscribers"" event interpretation rules such that the sequencer nodes can deliver to the subscriber nodes optimized version of their respective event message streams, in accordance with the appropriate rules, to update the respective states in the subscriber nodes.