In content-based publish/subscribe systems, publishers post or publish information through the system, and subscribers specify interest in receiving certain information. Subscriber interest is specified using subscriptions that define predicates on the posted information. To improve performance in the delivery of information in the publish/subscribe system, broadcasting every message throughout the entire publish/subscribe system is avoided. In a content-based publish/subscribe system, the information requested varies by subscriber, and any one subscriber may only be interested in a very small portion of the overall amount of information published. For example, when the subscriptions in the publish/subscribe system overlap significantly and match only a small subset of all of the published information, most of the published information does not need to be broadcast throughout significant portions of the publish/subscribe system.
Publish/subscribe systems typically prevent this unnecessary flooding of published information throughout the system by propagating the subscriptions through the system to the publishers of the information and by using the propagated subscriptions to direct the routing of information through the system and to filter out published information that does not have to be routed. This routing and filtering is facilitated by the use of brokers disposed between the publishers and subscribers in the publish/subscribe system. Each broker contains the propagated subscriptions that are relevant to the subscribers to which the broker can route published information. In particular, each broker contains subscriptions for neighboring brokers and subscribers in the system. The broker uses the propagated subscriptions to filter the published information in accordance with the subscriptions as the information passes through the publish/subscribe system. This process is referred to as message filtering and can be used anywhere in the network, even in brokers that are in close proximity to the publishers that are posting the information. Therefore, only information that is relevant to downstream subscribers connected to the brokers is forwarded, and published information is not forwarded to brokers and subscribers in the publish/subscribe system that are not associated with matching subscriptions.
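The message filtering described above can be illustrated with a minimal sketch. It assumes that messages are attribute dictionaries and that subscriptions are Boolean predicates over those attributes. The `Broker` class, its method names, and the neighbor names are hypothetical and are not taken from any particular publish/subscribe system.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Predicate = Callable[[Message], bool]


class Broker:
    """Hypothetical broker that forwards a message only toward matching neighbors."""

    def __init__(self) -> None:
        # Neighbor name -> subscriptions propagated to this broker from that direction.
        self.subscriptions: Dict[str, List[Predicate]] = {}

    def add_subscription(self, neighbor: str, predicate: Predicate) -> None:
        """Record a propagated subscription for the given neighbor."""
        self.subscriptions.setdefault(neighbor, []).append(predicate)

    def route(self, message: Message) -> List[str]:
        """Return the neighbors that have at least one matching subscription."""
        return sorted(
            neighbor
            for neighbor, predicates in self.subscriptions.items()
            if any(predicate(message) for predicate in predicates)
        )


broker = Broker()
broker.add_subscription("broker_b", lambda m: m.get("symbol") == "IBM")
broker.add_subscription("subscriber_1", lambda m: m.get("price", 0) > 100)

# Matches only the subscription held for broker_b.
targets = broker.route({"symbol": "IBM", "price": 90})
# Matches no subscription, so the message is filtered out at this broker.
no_targets = broker.route({"symbol": "XYZ", "price": 50})
```

In this sketch a message that matches no propagated subscription produces an empty target list, which corresponds to the message being dropped rather than flooded to every neighbor.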
As more subscriptions are added to and propagated through the publish/subscribe system or as proximity to the publishers increases, the amount of information about subscriptions that is maintained in each broker for the purpose of information filtering and routing grows and can become cumbersome, adversely affecting the performance of the publish/subscribe system. Conventionally, publish/subscribe systems attempt to avoid this problem by using subscription aggregation or subscription consolidation. In subscription aggregation, for example, if all information that matches a first subscription also matches a second subscription, the second subscription is said to cover the first. When both subscriptions are routed in the same direction with respect to a given broker, only the second subscription is propagated. Many publish/subscribe systems that implement subscription propagation utilize such covering relationships among the subscriptions to reduce the volume of information propagated throughout the system and maintained at each broker.
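As an illustration of a covering relationship, the following sketch assumes a deliberately simple subscription model in which every subscription is a closed range on a single numeric attribute. The `RangeSubscription` class and the `aggregate` function are hypothetical names chosen for this example; real content-based systems use richer predicates, for which covering tests are more involved.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RangeSubscription:
    """Hypothetical subscription matching messages with low <= value <= high."""

    low: float
    high: float

    def covers(self, other: "RangeSubscription") -> bool:
        """True if every message matching `other` also matches this subscription."""
        return self.low <= other.low and other.high <= self.high


def aggregate(subs: List[RangeSubscription]) -> List[RangeSubscription]:
    """Keep only subscriptions not covered by another one routed in the same direction."""
    kept: List[RangeSubscription] = []
    for sub in subs:
        if any(k.covers(sub) for k in kept):
            continue  # an already kept subscription makes this one redundant
        # Drop previously kept subscriptions that the new one covers.
        kept = [k for k in kept if not sub.covers(k)]
        kept.append(sub)
    return kept


result = aggregate([
    RangeSubscription(10, 20),
    RangeSubscription(0, 50),   # covers the range 10..20
    RangeSubscription(40, 60),  # overlaps 0..50 but is not covered by it
    RangeSubscription(0, 50),   # duplicate, covered by the earlier copy
])
```

Here only the ranges 0..50 and 40..60 would be propagated upstream. Overlapping ranges that do not cover one another are both kept, because merging them is a different technique from covering.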
In addition to attempting to match published information with subscriptions as efficiently as possible, publish/subscribe systems are operated to provide in-order, gapless delivery of published information. The need for in-order, gapless delivery of information, even in the presence of system failures, arises from two sources. The first is service level agreements that dictate an uninterrupted flow of information, e.g., it is unacceptable for certain stock traders not to be able to access a trade event that others can access. The second is message interdependencies, for example when messages are used by a subscribing application to accumulate a view of an event, so that missing or re-ordered messages can cause an incorrect state to be displayed. Achieving in-order, gapless delivery, high performance, scalability and high availability within a single system using conventional methods is very difficult.
Loss of connectivity by subscribers, publishers and brokers is common in wide-area network applications due to hardware and software failures and network mis-configurations. To increase system availability, some publish/subscribe systems are built on a redundant overlay network, which provides redundancy in the underlying network links. However, current systems do not efficiently exploit the available redundancy in the overlay network to recover from hardware and software failures in a timely and efficient process. In a typical redundant overlay network of brokers, multiple paths may exist between any two brokers in the network, and the publish/subscribe system automatically load balances published information traffic across these paths. When one of these paths is broken, for example due to a broker or link failure, the publish/subscribe system redirects the published information traffic to available alternate paths.
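The load balancing and redirection behavior described above can be sketched as follows, assuming a simple round-robin policy over a fixed set of pre-known paths. The `PathSelector` class and the broker names are hypothetical, and the sketch deliberately ignores message ordering and loss during the switch, which is the difficult part in practice.

```python
from typing import List, Set


class PathSelector:
    """Hypothetical round-robin selector over redundant broker paths."""

    def __init__(self, paths: List[List[str]]) -> None:
        self.paths = paths
        self.failed_brokers: Set[str] = set()
        self._next = 0

    def mark_failed(self, broker: str) -> None:
        """Record a broker failure; every path through that broker becomes unusable."""
        self.failed_brokers.add(broker)

    def live_paths(self) -> List[List[str]]:
        """Return the paths that contain no failed broker."""
        return [p for p in self.paths if not self.failed_brokers.intersection(p)]

    def next_path(self) -> List[str]:
        """Pick the next live path in round-robin order."""
        live = self.live_paths()
        if not live:
            raise RuntimeError("no path available")
        path = live[self._next % len(live)]
        self._next += 1
        return path


selector = PathSelector([["A", "B1", "C"], ["A", "B2", "C"]])
before = [selector.next_path() for _ in range(2)]  # traffic alternates between paths
selector.mark_failed("B1")
after = [selector.next_path() for _ in range(2)]   # traffic uses only the B2 path
```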
Conventional methods used to provide reliable delivery in redundant overlay networks, however, persistently store messages or message meta-data on the routing path between publishers and subscribers. An approach to supporting reliable delivery without persistently storing messages or message meta-data on the routing path is described in (reference). However, that approach does not consider dynamic subscription changes caused by subscribers connecting to or disconnecting from the system.
Known publish/subscribe systems that can handle dynamic subscription changes do not provide gapless, in-order delivery and do not utilize redundant paths existing in the broker networks. Therefore, the known systems are neither highly scalable nor highly available.
Examples of publish/subscribe systems that support subscription aggregation to achieve scalability are found in A. Carzaniga, D. S. Rosenblum, and A. L. Wolf, Design and Evaluation of a Wide-Area Event Notification Service, ACM Transactions on Computer Systems, 19(3):332-383, August 2001 and R. Chand and P. A. Felber, A Scalable Protocol for Content-Based Routing in Overlay Networks, Proceedings of the IEEE International Symposium on Network Computing and Applications (NCA '03), Cambridge, Mass., April 2003. These systems also support a topology with multiple routes between servers; however, the subscriptions are only propagated along a single selected “best route” in a spanning tree. Because subscriptions are propagated along only a single selected route, the system is slow, and recovering from a spanning tree link failure by dynamically switching to another route is difficult. In addition, these publish/subscribe systems do not provide a mechanism to share the load among multiple available paths and do not support reliable delivery.
In B. Segall, D. Arnold, J. Boot, M. Henderson, and T. Phelps, Content Based Routing with Elvin4, AUUG2K, Canberra, Australia, June 2000, the publish/subscribe system is architected around a single server that filters and forwards messages directly to consumers. The system, however, does not address the issues of scalability or availability.
The publish/subscribe system discussed in A. Snoeren, K. Conley and D. Gifford, Mesh-Based Content Routing using XML, Proceedings of the 18th ACM Symposium on Operating System Principles (SOSP 2001), Alberta, Canada, October 2001 attempts to improve reliability with low latency by sending messages simultaneously over redundant links in a mesh-based overlay network. The protocol uses content-based routing and provides a high level of availability. However, there is no guarantee of in-order, gapless delivery when subscriptions are dynamically added to and removed from the system.
G. Cugola, E. Di Nitto, and A. Fuggetta, The JEDI Event-Based Infrastructure and Its Application to the Development of the OPSS WFMS, IEEE Transactions on Software Engineering, 27(9):827-850, September 2001 discusses a publish/subscribe system that guarantees causal ordering of events, including, as a special case, the ordering of events published by an entity called the Active Object. This system provides two implementations of the event dispatcher. The first version is a centralized version consisting of a single process and addressing the requirements of simple systems. The second version is a distributed version consisting of a set of dispatching servers interconnected into a tree structure. This distributed version, while addressing part of the needs of Internet-wide distributed applications engaging in intense communication, does not accommodate or utilize redundant links between dispatching servers and hence is neither highly available nor easily used for load sharing.
The publish/subscribe system illustrated in B. Zhao, L. Huang, A. Joseph, and J. Kubiatowicz, Exploiting Routing Redundancy Using a Wide-area Overlay, Technical Report UCB/CSD-02-1215, University of California, Berkeley provides fault tolerant routing by dynamically switching traffic onto pre-computed alternate routes. Messages in this system can be duplicated and multicast “around” network congestion and failure hotspots with rapid re-convergence to drop duplicates. However, this system does not support content routing.
A. Rowstron, A. Kermarrec, M. Castro, and P. Druschel, SCRIBE: The design of a Large-Scale Event Notification Infrastructure, Proceedings of 3rd International Workshop on Networked Group Communication (NGC 2001), UCL, London, UK, November 2001 describes a large-scale and fully decentralized event notification system built on top of a peer-to-peer object location and routing substrate overlaid on the Internet. The event notification system leverages the scalability, locality, fault-resilience and self-organization properties of the object location and routing substrate. However, the event notification system does not support content-based routing. In addition, the event notification system builds a separate multicast tree for each individual topic. This multicast tree is created using a scheme similar to reverse path forwarding, a description of which can be found in Y. Dalal and R. Metcalfe, Reverse Path Forwarding of Broadcast Packets, Communications of the ACM, 21(12):1040-1048, 1978, so the routes on which subscription messages were forwarded are inverted to become the routes by which events are later distributed. This makes it impossible to add a redundant node to the multicast tree to share the load without requiring the entire multicast tree to be rebuilt. Although the system can recover from multicast node failures by building a new multicast tree, this is done at the cost of reliable, in-order, gapless delivery. Applications must implement any higher quality of service themselves. In addition, an un-subscription in the event notification system has to be delayed until the first event is received.
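The inversion of subscription routes into event routes can be shown with a small sketch. It assumes a single topic with one root node and models each subscription as the list of nodes it traversed on its way to that root. The function names and node names are hypothetical and do not reproduce the cited system's actual protocol.

```python
from typing import Dict, List, Set

# Parent node -> child nodes toward which events for the topic are forwarded.
children: Dict[str, Set[str]] = {}


def subscribe(route: List[str]) -> None:
    """Record a subscription route, listed from the subscriber up to the topic root."""
    for child, parent in zip(route, route[1:]):
        # Each hop remembers where the subscription came from; that link is
        # later used in the opposite direction to distribute events.
        children.setdefault(parent, set()).add(child)


def deliver(root: str) -> List[str]:
    """Return the nodes an event reaches when sent from the root, depth-first."""
    reached: List[str] = []
    stack = [root]
    while stack:
        node = stack.pop()
        reached.append(node)
        # Push in reverse order so children are visited in sorted order.
        stack.extend(sorted(children.get(node, set()), reverse=True))
    return reached


subscribe(["s1", "n1", "root"])
subscribe(["s2", "n1", "root"])
subscribe(["s3", "n2", "root"])
order = deliver("root")
```

Because each node has exactly one upstream link in this structure, the sketch also makes the limitation visible: there is no alternate parent through which events could reach a node if its upstream link fails.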
Therefore, a need exists for a publish/subscribe system that provides guaranteed in-order, gapless, content-based routing of messages while also achieving high performance, scalability and high availability. In addition, the publish/subscribe system should not require consensus or agreement between the redundant routing members, enabling them to serve as routing and processing alternatives to each other for fault tolerance and load sharing.