This invention relates, in general, to event computing systems and, more particularly, to a content-based multicast routing technique which delivers events to consumers of an event computing system interested in a particular set of events.
A common practice for integrating autonomous components within a computing system has been to utilize events. Events are, for example, data generated by a provider and delivered through a communication medium, such as a computer network, hard disk, or random access memory, to a set of interested consumers. The providers and consumers need not know one another's identity, since delivery is provided through intermediary software. This independence between provider and consumer is known as decoupling.
One example of an event computing system is a database event system. Modern database systems include support for event triggers. Event triggers associate a filter, which is a predicate that selects a subset of events and excludes the rest, with an action to take in response to events on the database. An event on a database is any change to the state of the database.
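The pairing of a filter with an action can be sketched as follows. This is a minimal illustration only, not the mechanism of any particular database product; the names `EventTrigger`, `dispatch`, and the event fields `table`, `op`, and `row` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Assumption: an event is modeled as a plain dictionary describing a state change.
Event = Dict[str, Any]


@dataclass
class EventTrigger:
    predicate: Callable[[Event], bool]  # the filter: selects a subset of events
    action: Callable[[Event], None]     # what to do for each selected event


def dispatch(event: Event, triggers: List[EventTrigger]) -> int:
    """Run the action of every trigger whose filter accepts the event.

    Returns the number of triggers that fired.
    """
    fired = 0
    for trigger in triggers:
        if trigger.predicate(event):
            trigger.action(event)
            fired += 1
    return fired


audit_log: List[Event] = []
triggers = [
    EventTrigger(
        predicate=lambda e: e["table"] == "orders" and e["op"] == "insert",
        action=audit_log.append,
    ),
]

# Only the first event satisfies the filter, so only it reaches the action.
dispatch({"table": "orders", "op": "insert", "row": 1}, triggers)
dispatch({"table": "users", "op": "delete", "row": 7}, triggers)
```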
In database event systems, gating tests have been used to determine which consumers of a system are interested in a particular event. That is, gating tests have been used to match filters in event triggers to events. As described in "A Predicate Matching Algorithm for Database Rule Systems," by Hanson et al., Proceedings of SIGMOD (1991), pp. 271-280, gating tests identify a single predicate for each filter as primary, and tests are organized in a data structure based on this primary predicate. Additionally, the data needs to be organized based on the primary predicate.
Another example of an event computing system is a distributed event system, also known as a publish/subscribe system. A publish/subscribe system is a mechanism where subscribers express interest in future information by some selection criterion, publishers provide information, and the mechanism delivers the information to all interested subscribers. Current publish/subscribe systems organize information around groups (also called channels, subjects or streams). Providers or publishers publish events to groups and consumers or subscribers subscribe to all data from a particular group.
One example of a publish/subscribe system is described in detail in U.S. Pat. No. 5,557,798, issued to Skeen et al. on Sep. 17, 1996, and entitled "Apparatus And Method For Providing Decoupling Of Data Exchange Details For Providing High Performance Communication Between Software Processes", which is hereby incorporated herein by reference in its entirety. In U.S. Pat. No. 5,557,798, the publisher of an event annotates each message with a group identifier called a subject, and a subscriber subscribes to a particular subject. Thus, if a subscriber is interested in just a portion of the events having a given subject, it would have to receive the entire subject and then discard the unwanted information.
Based on the foregoing, a need exists for a matching capability that does not require the partitioning of data into subjects. A further need exists for a matching capability that enables a consumer to use any filtering criterion expressible with the available predicates. Additionally, a need exists for a mechanism that allows a consumer to receive only the information that it desires, such that the filtering is done independent of the consumer.
One approach to addressing the above-noted needs is described in the above-incorporated, co-pending U.S. patent application Ser. No. 08/975,280, entitled "Method and System for Matching Consumers to Events." In this approach, referred to herein as a content-based event computing system, the matching facility includes a search data structure (e.g., a search tree or search graph), which is used to determine the consumers' interest in a particular event. Content-based subscription is the ability of subscribers to specify interest in events based on operations limited only by the structure of the events and the operations supported by the pattern language.
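As a rough illustration of content-based subscription, the sketch below matches an event against per-consumer predicates. It uses a simple linear scan purely for brevity; the referenced application describes a search tree or search graph, which this sketch does not attempt to reproduce. The consumer names and the event attributes `symbol`, `price`, and `volume` are assumptions made for the example.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Predicate = Callable[[Event], bool]


def match_consumers(event: Event, subscriptions: Dict[str, Predicate]) -> List[str]:
    """Return, in sorted order, the consumers whose predicate accepts the event.

    Assumption: a linear scan over all subscriptions stands in for the
    search data structure of the referenced application.
    """
    return sorted(name for name, pred in subscriptions.items() if pred(event))


# Hypothetical subscriptions: each consumer filters on event content,
# not on a subject chosen in advance by the publisher.
subscriptions: Dict[str, Predicate] = {
    "alice": lambda e: e["symbol"] == "IBM" and e["price"] > 100,
    "bob": lambda e: e["volume"] >= 10_000,
    "carol": lambda e: e["symbol"] == "IBM",
}

event = {"symbol": "IBM", "price": 95, "volume": 20_000}
matched = match_consumers(event, subscriptions)
```

In this example the event is delivered to `bob` and `carol` only; `alice` never sees it, so no filtering is left for the consumer to perform.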
Applicants have identified a problem with content-based subscription that arises when internet protocol (IP) multicasting is used to distribute an event. In a practical content-based subscription system, there will typically be too many groups of clients or consumers to use a multicast facility.
As one example, the environment of this invention may include content-based, publish/subscribe systems deployed over IP networks such as the Internet. Clients are either publishers or subscribers, and are attached to machines referred to herein as brokers. The publisher's broker receives a published message (also referred to herein as an "event") and delivers it to subscriber brokers at least one of whose attached clients has a subscription matched by the message. These subscriber brokers then forward the message to the at least one attached client. Content-based systems are more flexible and provide more selectivity than subject-based systems. However, the multicast problem for content-based message delivery middleware is more complex than for subject-based delivery once the number of destinations for messages becomes large. It may no longer be straightforward or efficient to use IP multicast groups to distribute messages of a content-based system over a network because the number of such groups required grows rapidly with the number of subscriptions. This number eventually becomes so large that either the supported range of multicast addresses is exceeded or the overhead of setting up and listening to such a large number of multicast addresses becomes excessive.
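The growth in the number of groups can be illustrated with a small sketch. Each event has a destination set, namely the brokers with at least one matching subscription, and every distinct non-empty destination set would need its own multicast group. With n brokers there can be up to 2^n - 1 such sets. The broker names, predicates, and event attributes `x` and `y` below are hypothetical.

```python
from itertools import product

# Assumption: one aggregate predicate per broker, standing in for the
# subscriptions of that broker's attached clients.
broker_subscriptions = {
    "B1": lambda e: e["x"] > 2,
    "B2": lambda e: e["y"] == "a",
    "B3": lambda e: e["x"] < 4 and e["y"] != "c",
}


def destination_set(event):
    """Brokers that have at least one subscription matched by the event."""
    return frozenset(b for b, pred in broker_subscriptions.items() if pred(event))


# Enumerate a small, made-up event space and collect the distinct
# non-empty destination sets that occur.
events = [{"x": x, "y": y} for x, y in product(range(6), "abc")]
exact_groups = {destination_set(e) for e in events} - {frozenset()}

# Worst case: every non-empty subset of brokers is a destination set.
upper_bound = 2 ** len(broker_subscriptions) - 1
```

Here three brokers already produce six distinct destination sets out of a possible seven, which suggests how quickly the count can approach the exponential bound as brokers and subscriptions are added.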
The goal of the present invention is to replace a potentially large set of IP multicast addresses with a much smaller set, which will be a conservative approximation. That is, every broker which might require a message will receive it, but certain brokers may possibly receive messages they do not actually need. These extra messages cause wasted bandwidth in the network and wasted processing time in the receiving broker, which can be quantified by a penalty function. Thus, the present invention collapses a large number of groups to a smaller number of groups, while approximately minimizing the expected penalty. The invention disclosed herein uses dynamic information about the existing set of subscriptions and the expected probabilities of different events in order to perform this collapse.
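One way to picture such a collapse is the greedy sketch below. It is only an illustrative heuristic under stated assumptions and is not presented as the algorithm of the invention. It assumes that the penalty is the expected number of unwanted deliveries per event, that each approximate group is the union of the exact groups assigned to it (so no broker misses a message), and that the event probabilities in `prob` are known. The names `expected_waste` and `collapse` and all numbers are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List

Group = FrozenSet[str]


def expected_waste(cluster: List[Group], prob: Dict[Group, float]) -> float:
    """Expected unwanted deliveries when every exact group in the cluster
    is served by one multicast group equal to the union of the cluster."""
    union = frozenset().union(*cluster)
    return sum(prob[grp] * (len(union) - len(grp)) for grp in cluster)


def collapse(prob: Dict[Group, float], g: int) -> List[List[Group]]:
    """Greedily merge clusters of exact groups until only g clusters remain.

    Each step merges the pair whose merge adds the least expected waste.
    """
    clusters: List[List[Group]] = [[grp] for grp in prob]

    def added_waste(pair):
        i, j = pair
        return (expected_waste(clusters[i] + clusters[j], prob)
                - expected_waste(clusters[i], prob)
                - expected_waste(clusters[j], prob))

    while len(clusters) > g:
        i, j = min(combinations(range(len(clusters)), 2), key=added_waste)
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Made-up probabilities that an event's exact destination set is each group.
prob: Dict[Group, float] = {
    frozenset({"B1"}): 0.4,
    frozenset({"B1", "B2"}): 0.3,
    frozenset({"B3"}): 0.2,
    frozenset({"B3", "B4"}): 0.1,
}

clusters = collapse(prob, g=2)
approx_groups = {frozenset().union(*c) for c in clusters}
total_waste = sum(expected_waste(c, prob) for c in clusters)
```

With these made-up numbers, four exact groups collapse to the two approximate groups {B1, B2} and {B3, B4}, at an expected cost of about 0.6 unwanted deliveries per event. A greedy merge of this kind does not guarantee a minimal penalty; it only shows how subscription knowledge and event probabilities can drive the choice of groups.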
To summarize, the shortcomings of the prior art are overcome and additional advantages are provided through the provision of a method for distributing events to consumers in a content-based publish/subscribe system. The method includes: deriving a set of g approximate multicast groups from a larger set of G possible multicast groups in the publish-subscribe system, the deriving including exploiting knowledge of subscription predicates of at least some consumers of the publish-subscribe system; and using the set of g approximate multicast groups to forward an event to at least one consumer within the publish-subscribe system.
In another aspect, the present invention provides at least one program storage device readable by a machine, tangibly embodying at least one program of instructions executable by the machine to perform a method for distributing events to consumers in a content-based publish-subscribe system, wherein the consumers each have at least one subscription. The method includes: deriving a set of g approximate multicast groups from a larger set of G possible multicast groups in the publish-subscribe system, the deriving including exploiting knowledge of subscription predicates of at least some consumers of the publish-subscribe system; and using the set of g approximate multicast groups to forward an event to at least one consumer within the publish-subscribe system.
In a further aspect, an article of manufacture is provided herein which includes at least one computer usable medium having computer readable program code means embodied therein for causing the distributing of events to consumers in a content-based publish-subscribe system, wherein the consumers each have at least one subscription. The computer readable program code means in the article of manufacture includes: computer readable program code means for causing a computer to effect deriving a set of g approximate multicast groups from a larger set of G possible multicast groups in the publish-subscribe system, the deriving including exploiting knowledge of subscription predicates of at least some consumers of the publish-subscribe system; and computer readable program code means for causing a computer to effect using the set of g approximate multicast groups to forward an event to at least one consumer within the publish-subscribe system.
In a further aspect, a system is provided for distributing events to consumers in a content-based publish-subscribe system. The system includes means for deriving a set of g approximate multicast groups from a larger set of G possible multicast groups in the publish-subscribe system. The means for deriving includes means for exploiting knowledge of subscription predicates of at least some consumers of the publish-subscribe system. The system further includes means for using the set of g approximate multicast groups to forward an event to at least one consumer within the publish-subscribe system.
To restate, there are numerous advantages of a distribution approach such as presented herein in comparison with other techniques for distributing events in a content-based publish-subscribe system. In the following discussion, the term "exact multicast group" refers to a subset of brokers that share a common subscription predicate but does not overlap with any other subscription predicate (also referred to herein as "initial groups"). The term "IP multicast group" refers to an assignment of a multicast system address to a collection of brokers, irrespective of predicates.
Naive use of multicast techniques: Using multicast requires two preparatory steps before messages may be sent: step one is to actually construct all the groups required for a particular application, step two is to instruct each broker to listen to one or more groups. Naive multicast refers to the technique of creating one IP multicast group for every exact multicast group. Relative to naive multicast, the approach presented herein has the following advantages:
1. Address Space Limits: In the present approach, a multicast solution is provided even though the number of exact multicast groups exceeds the number of IP multicast groups allowed by the system.
2. Group Creation Cost: The number of actual multicast groups used may be adjusted in order to minimize the initial cost of group setup (step one above). In contrast, group creation under the naive approach may be extremely expensive due to the large number of groups.
3. Broker Overhead: The number of actual multicast groups used may be adjusted in order to minimize the overhead experienced by each broker (relating to step two above). In contrast, broker overhead under the naive approach may be unusually high due to the large number of groups each broker must listen to.
Broadcast or "flooding" techniques: Flooding is a variant of multicast where one group is created and every broker listens to the same group. Thus, each message sent to this group is received by every broker (hence "flooding"), regardless of whether or not a broker has a client wishing to receive the event. Relative to flooding, the approach presented herein has the following advantages:
1. Waste Messages: Depending on the distribution of client subscriptions, the flooding approach tends to generate large numbers of wasted messages: messages which are sent to brokers that have no client wishing to receive the message. Waste messages cause brokers to waste CPU time on handling the unwanted message. Although waste messages are possible using the present approach, on average far fewer waste messages are generated and thus the present invention imposes lower overhead on broker CPU time.
2. Bandwidth Utilization: Waste messages also waste network bandwidth because messages may be unnecessarily sent down certain network links. Thus, waste messages clog up network links and routers with unnecessary traffic. In the present approach, fewer waste messages are generated and thus network bandwidth is utilized more efficiently.
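The difference in waste messages can be made concrete with a small count. The sketch below is illustrative only: the broker names, the event trace, and the two approximate groups are made-up assumptions, and each event is assumed to be sent to the smallest approximate group that covers its interested brokers.

```python
# Hypothetical set of all brokers in the system.
brokers = frozenset({"B1", "B2", "B3", "B4"})

# Hypothetical approximate multicast groups; together they cover every
# destination set that appears in the trace below.
approx_groups = [frozenset({"B1", "B2"}), frozenset({"B3", "B4"})]

# Made-up trace: for each event, the brokers with an interested client.
trace = [
    frozenset({"B1"}),
    frozenset({"B1", "B2"}),
    frozenset({"B3"}),
    frozenset({"B3", "B4"}),
]


def flooding_group(dest):
    """Flooding: every event goes to the single group of all brokers."""
    return brokers


def approximate_group(dest):
    """Smallest approximate group that contains every interested broker."""
    return min((grp for grp in approx_groups if dest <= grp), key=len)


def waste_messages(trace, group_for):
    """Count deliveries to brokers that have no interested client."""
    return sum(len(group_for(dest) - dest) for dest in trace)


flooding_waste = waste_messages(trace, flooding_group)
approx_waste = waste_messages(trace, approximate_group)
```

For this trace, flooding produces 10 waste messages while the two approximate groups produce 2. The actual ratio depends on the distribution of client subscriptions and events.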
Point-to-point techniques: A publishing broker using the point-to-point approach distributes events either by directly sending the event to each destination broker one at a time, or by sending the event through a "tree" of brokers, where each broker in the tree may forward the event to one or more of its neighbors (also by sending the event to its neighbors one at a time). For example, reference the above-incorporated United States patent application entitled "Routing Messages Within A Network Using the Data Content of the Message." Relative to point-to-point, the present approach has the following advantages:
1. CPU Utilization: A broker using the present invention only needs to make one "transmit" call to distribute an event; the multicast system performs all the routing necessary to ensure the correct delivery of the event. On the other hand, a broker using point-to-point distribution must make a separate "transmit" call for each neighboring broker the event should be delivered to. The latter approach imposes higher overhead on the broker because it must spend time making additional "transmit" calls (when it could be spending that time servicing other messages and hence increasing system throughput).
2. Bandwidth Utilization: Brokers using the point-to-point approach may send a single event multiple times over the same network link in cases where neighboring brokers share a common subset of network links. This approach wastes network bandwidth because, from a network perspective, the additional sends down common links are unnecessary traffic. In contrast, multicast ensures that each network link is used at most once. Therefore, in general, the present approach wastes less bandwidth than point-to-point approaches. Moreover, the amount of bandwidth wasted is a tunable property under the present invention.
3. Latency: Multicast approaches exploit fast internet routers to efficiently deliver events. The point-to-point approach using a tree requires intermediate brokers to process messages and forward them to other brokers. The extra step of receiving a message, processing it, and passing it on to other brokers introduces delays and end-to-end message latency. Because the present approach utilizes multicast, a much lower overall latency is experienced than these multi-hop point-to-point approaches.
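The transmit-call and link-usage comparison above can be sketched with a toy topology. The publishing broker `P`, routers `R1` and `R2`, destination brokers, and paths below are all hypothetical, and multicast is modeled in idealized form as carrying one copy of the event over each link on the delivery tree.

```python
from collections import Counter

# Hypothetical network paths from publishing broker P to each destination.
paths = {
    "B1": ["P", "R1", "B1"],
    "B2": ["P", "R1", "R2", "B2"],
    "B3": ["P", "R1", "R2", "B3"],
}


def links(path):
    """Consecutive hops of a path, as (from, to) pairs."""
    return list(zip(path, path[1:]))


# Point-to-point: one transmit call per destination, and every call
# traverses its whole path, so shared links carry the event repeatedly.
unicast_calls = len(paths)
unicast_load = Counter(link for p in paths.values() for link in links(p))

# Idealized multicast: one transmit call, and each link on the delivery
# tree carries the event exactly once.
multicast_calls = 1
multicast_load = Counter({link: 1 for p in paths.values() for link in links(p)})

total_unicast_traffic = sum(unicast_load.values())
total_multicast_traffic = sum(multicast_load.values())
```

In this toy topology the link from `P` to `R1` carries the event three times under point-to-point delivery but once under multicast, and total link traversals drop from 8 to 5.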
Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention.