This invention relates, in general, to event computing systems and, more particularly, to a content-based multicast routing technique which delivers events to consumers of an event computing system interested in a particular set of events.
A common practice for integrating autonomous components within a computing system has been to utilize events. Events are, for example, data generated by a provider and delivered through a communication medium, such as a computer network, hard disk, or random access memory, to a set of interested consumers. The providers and consumers need not know one another's identity, since delivery is provided through intermediary software. This independence between provider and consumer is known as decoupling.
One example of an event computing system is a database event system. Modern database systems include support for event triggers. Event triggers associate a filter, which is a predicate that selects a subset of events and excludes the rest, with an action to take in response to events on the database. An event on a database is any change to the state of the database.
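By way of illustration only, the pairing of a filter with an action may be sketched as follows. The `Trigger` type, the `dispatch` function, and the event fields `table` and `amount` are hypothetical names chosen for this sketch and do not correspond to any particular database product.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Assumption: an event is modeled as a dictionary describing one state change.
Event = Dict[str, Any]


@dataclass
class Trigger:
    """Associates a filter (a predicate over events) with an action."""
    filter: Callable[[Event], bool]
    action: Callable[[Event], None]


def dispatch(event: Event, triggers: List[Trigger]) -> int:
    """Run the action of every trigger whose filter selects the event.

    Returns the number of triggers that fired.
    """
    fired = 0
    for trigger in triggers:
        if trigger.filter(event):
            trigger.action(event)
            fired += 1
    return fired


# Hypothetical trigger: record large changes to an "orders" table.
audit_log: List[Event] = []
triggers = [
    Trigger(
        filter=lambda e: e["table"] == "orders" and e["amount"] > 1000,
        action=audit_log.append,
    ),
]

dispatch({"table": "orders", "amount": 2500}, triggers)  # selected by the filter
dispatch({"table": "orders", "amount": 10}, triggers)    # excluded by the filter
```

In this sketch the filter excludes the second event, so only the first event reaches the action.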
In database event systems, gating tests have been used to determine which consumers of a system are interested in a particular event. That is, gating tests have been used to match filters in event triggers to events. As described in “A Predicate Matching Algorithm for Database Rule Systems,” by Hanson et al., Proceedings of SIGMOD (1991), pp. 271-280, gating tests identify a single predicate for each filter as primary, and tests are organized in a data structure based on this primary predicate. Additionally, the data needs to be organized based on the primary predicate.
Another example of an event computing system is a distributed event system, also known as a publish/subscribe system. A publish/subscribe system is a mechanism where subscribers express interest in future information by some selection criterion, publishers provide information, and the mechanism delivers the information to all interested subscribers. Current publish/subscribe systems organize information around subjects (also called channels or streams). Providers or publishers publish events to groups and consumers or subscribers subscribe to all data from a particular group.
One example of a publish/subscribe system is described in detail in U.S. Pat. No. 5,557,798, issued to Skeen et al. on Sep. 17, 1996, and entitled “Apparatus And Method For Providing Decoupling Of Data Exchange Details For Providing High Performance Communication Between Software Processes”, which is hereby incorporated herein by reference in its entirety. In U.S. Pat. No. 5,557,798, the publisher of an event annotates each message with an identifier called a subject, and a subscriber subscribes to a particular subject. Thus, if a subscriber is interested in just a portion of the events having a given subject, it would have to receive all events published under that subject and then discard the unwanted information.
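The limitation noted above may be illustrated with a minimal sketch of subject-based delivery. The sketch is not the mechanism of U.S. Pat. No. 5,557,798; the subject name `stock.trades`, the subscriber `alice`, and the functions `subscribe` and `publish` are assumptions introduced purely for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Payload = Dict[str, Any]

# Maps each subject to the subscribers registered for it.
subscriptions: Dict[str, List[str]] = defaultdict(list)


def subscribe(subscriber: str, subject: str) -> None:
    """Register interest in every event published under the subject."""
    subscriptions[subject].append(subscriber)


def publish(subject: str, payload: Payload) -> List[Tuple[str, Payload]]:
    """Deliver the payload, unfiltered, to every subscriber of the subject."""
    return [(name, payload) for name in subscriptions.get(subject, [])]


subscribe("alice", "stock.trades")

received: List[Tuple[str, Payload]] = []
for symbol, price in [("IBM", 120), ("XYZ", 15), ("IBM", 121)]:
    received += publish("stock.trades", {"symbol": symbol, "price": price})

# alice cares only about IBM trades, so the filtering happens at the
# subscriber, after the whole subject has already been delivered.
wanted = [payload for _, payload in received if payload["symbol"] == "IBM"]
discarded = len(received) - len(wanted)
```

Here every event on the subject crosses the network to the subscriber, and the unwanted event is discarded only after delivery.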
Based on the foregoing, a need exists for a matching capability that does not require the partitioning of data into subjects. A further need exists for a matching capability that enables a consumer to use any filtering criterion expressible with the available predicates. Additionally, a need exists for a mechanism that allows a consumer to receive only the information that it desires, such that the filtering is done independent of the consumer.
One approach to addressing the above-noted needs is described in the above-incorporated, co-pending U.S. patent application Ser. No. 08/975,280, entitled “Method and System for Matching Consumers to Events.” In this approach, referred to herein as a content-based event computing system, the matching facility includes a search data structure (e.g., a search tree or search graph), which is used to determine the consumers interested in a particular event. Content-based subscription is the ability of subscribers to specify interest in events based on operations limited only by the structure of the events and the operations supported by the pattern language.
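As a hedged illustration of content-based subscription, the sketch below treats a subscription as a conjunction of predicates over event attributes. It uses a plain linear scan over the subscriptions rather than the search tree or search graph of the referenced application, and the small pattern language in `OPS`, together with the names `matches` and `interested_consumers`, is an assumption made for this example.

```python
import operator
from typing import Any, Dict, List, Tuple

# Assumed pattern language: each predicate is (attribute, operator, constant).
OPS = {
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
    "<=": operator.le,
    ">=": operator.ge,
}

Event = Dict[str, Any]
Predicate = Tuple[str, str, Any]
Subscription = List[Predicate]  # all predicates must hold (conjunction)


def matches(event: Event, subscription: Subscription) -> bool:
    """True if the event carries every tested attribute and passes every test."""
    return all(
        attr in event and OPS[op](event[attr], value)
        for attr, op, value in subscription
    )


def interested_consumers(event: Event, subs: Dict[str, Subscription]) -> List[str]:
    """Linear scan: names of all consumers whose subscription matches the event."""
    return sorted(name for name, sub in subs.items() if matches(event, sub))


subs = {
    "alice": [("symbol", "==", "IBM"), ("price", ">", 100)],
    "bob": [("volume", ">=", 1000)],
}
event = {"symbol": "IBM", "price": 120, "volume": 500}
consumers = interested_consumers(event, subs)
```

A linear scan costs time proportional to the number of subscriptions for each event, which is one reason a shared search data structure is attractive when subscriptions are numerous.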
Applicants have identified a problem with content-based subscription that arises when a group-based multicast facility, such as Internet Protocol (IP) multicast, is used to deliver an event. In a practical content-based subscription system, there will typically be too many distinct groups of clients or consumers for each to be assigned its own multicast group.
As one example, the environment of this invention may include content-based, publish/subscribe systems deployed over IP networks such as the Internet. Clients are either publishers or subscribers, and are attached to machines referred to herein as brokers. The publisher's broker receives a published message (also referred to herein as an “event”) and delivers it to subscriber brokers at least one of whose attached clients has a subscription matched by the message. These subscriber brokers then forward the message to the at least one attached client. Content-based systems are more flexible and provide more selectivity than subject-based systems. However, the multicast problem for content-based message delivery middleware is more complex than for subject-based delivery once the number of destinations for messages becomes large. It may no longer be straightforward or efficient to use IP multicast groups to distribute messages of a content-based system over a network because the number of such groups required grows rapidly with the number of subscriptions. This number eventually becomes so large that either the supported range of multicast addresses is exceeded or the overhead of setting up and listening to such a large number of multicast addresses becomes excessive. The present invention addresses this problem.
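The growth in the number of groups can be made concrete with a short calculation. The sketch assumes a worst case in which every non-empty set of subscriber brokers may be the destination set of some event and would therefore need its own multicast group; the function names are hypothetical. The IPv4 multicast range 224.0.0.0/4 contains 2^28 addresses.

```python
# Size of the IPv4 multicast address range 224.0.0.0/4.
IPV4_MULTICAST_ADDRESSES = 2 ** 28


def possible_destination_sets(num_brokers: int) -> int:
    """Number of non-empty subsets of num_brokers subscriber brokers.

    Assumption: in the worst case each such subset is the exact set of
    brokers matched by some event, and so would need its own group.
    """
    return 2 ** num_brokers - 1


def smallest_broker_count_exceeding(limit: int) -> int:
    """Smallest broker count whose worst-case group demand exceeds limit."""
    n = 1
    while possible_destination_sets(n) <= limit:
        n += 1
    return n


groups_for_ten_brokers = possible_destination_sets(10)
brokers_to_exhaust_ipv4 = smallest_broker_count_exceeding(IPV4_MULTICAST_ADDRESSES)
```

Under this worst-case assumption, a few dozen brokers already demand more groups than the IPv4 multicast range can address, before the overhead of joining and maintaining the groups is considered.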
To summarize, provided herein is a method for implementing a content-based publish-subscribe system using a group-based multicast. The method includes: mapping possible groups of the publish-subscribe system to a smaller number of multicast groups, wherein the smaller number of multicast groups includes brokers, and the brokers have consumers; and using the smaller number of multicast groups to forward an event to consumers within the content-based publish-subscribe system.
In another aspect, the present invention includes at least one program storage device readable by a machine, tangibly embodying at least one program of instructions executable by the machine to perform a method of implementing a content-based publish-subscribe system using a group-based multicast. The method includes: mapping possible groups of the publish-subscribe system to a smaller number of multicast groups, wherein the smaller number of multicast groups comprise brokers, the brokers having consumers; and using the smaller number of multicast groups to forward an event to consumers within the content-based publish-subscribe system.
In a further aspect, a system for implementing a content-based publish-subscribe system using a group-based multicast is provided. The system includes means for mapping possible groups of the publish-subscribe system to a smaller number of multicast groups, wherein the smaller number of multicast groups comprise brokers, and the brokers have consumers. The system further includes means for using the smaller number of multicast groups to forward an event to consumers within the content-based publish-subscribe system.
To restate, the present invention applies clustering to group multicast-based implementations of a content-based publish-subscribe system. Furthermore, as an enhancement, the invention employs thresholding to further reduce the number of groups required. These processes, referred to herein as cluster group multicast (CGM), provide multiple advantages over existing art. For example, under conditions of low match rate (i.e., very few subscribers are interested in any given event) and high regionalism (i.e., subscribers interested in an event are geographically co-located), CGM is superior to flooding (described herein below). In addition, when the cost of fringe-links (i.e., links connecting brokers to the network) is highest, CGM is superior to other group multicast techniques. Advantageously, group assignments in CGM are static and can be created independent of the subscriptions. Furthermore, it is possible to apply CGM to reasonably sized broker networks in Internet Protocol (IP) version 4 and version 6.
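One way in which clustering and thresholding might reduce the number of groups is sketched below. This section does not state the details, so the sketch rests on assumptions: brokers are statically partitioned into clusters, each cluster has one multicast group per non-empty subset of its own brokers, and an event is sent to at most one group per cluster. The thresholding rule shown (send to the whole cluster once more than `max_exact` of its brokers match) is likewise hypothetical, as are all names.

```python
from typing import Dict, FrozenSet, List, Optional, Set


def groups_needed(cluster_sizes: List[int]) -> int:
    """Total groups if each cluster has one group per non-empty subset."""
    return sum(2 ** k - 1 for k in cluster_sizes)


def route(
    destinations: Set[str],
    clusters: Dict[str, Set[str]],
    max_exact: Optional[int] = None,
) -> Dict[str, FrozenSet[str]]:
    """Split a destination set into at most one multicast group per cluster.

    With max_exact set, a cluster in which more than max_exact brokers
    match receives the event on its whole-cluster group (approximate
    delivery), so brokers that did not match must discard the event.
    """
    chosen: Dict[str, FrozenSet[str]] = {}
    for name, members in clusters.items():
        hit = destinations & members
        if not hit:
            continue  # nothing is sent into this cluster
        if max_exact is not None and len(hit) > max_exact:
            hit = members  # fall back to the whole-cluster group
        chosen[name] = frozenset(hit)
    return chosen


clusters = {"east": {"b1", "b2", "b3"}, "west": {"b4", "b5", "b6"}}
matched = {"b1", "b5", "b6"}

exact = route(matched, clusters)                    # exact subset per cluster
thresholded = route(matched, clusters, max_exact=1)  # "west" gets the whole cluster

clustered_total = groups_needed([3, 3])  # groups with two clusters of three
unclustered_total = 2 ** 6 - 1           # one group per subset of all six brokers
```

In this sketch the group assignment depends only on the cluster membership and not on the subscriptions, which is consistent with the static group assignments noted above. Thresholding trades some unwanted deliveries for a smaller set of groups, since only the small subsets and the whole-cluster group need be provisioned.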
Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered part of the claimed invention.