In content-based networks, the flow of messages from senders to receivers is driven by the content of messages, rather than by explicit addresses assigned by senders and attached to the messages as in the traditional Internet. Content-based networks are suited to a variety of services, including news distribution, network intrusion detection, distributed electronic auctions and the like.
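The content-driven delivery described above can be illustrated with a minimal sketch. It is not taken from any of the schemes cited in this section; the names `Subscription`, `route`, and the sample subscribers and message fields are hypothetical, and a single flat list of subscriptions stands in for what would be distributed routing state in a real network.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# A message is modeled as a flat set of attribute/value pairs (an assumption).
Message = Dict[str, Any]


@dataclass
class Subscription:
    subscriber: str                       # hypothetical receiver name
    predicate: Callable[[Message], bool]  # interest expressed over message content


def route(message: Message, subscriptions: List[Subscription]) -> List[str]:
    """Return the subscribers whose predicate matches the message content.

    The sender attaches no destination address; receivers are selected
    solely by evaluating each subscription against the message.
    """
    return [s.subscriber for s in subscriptions if s.predicate(message)]


subscriptions = [
    Subscription("news-desk", lambda m: m.get("topic") == "news"),
    Subscription(
        "ids-monitor",
        lambda m: m.get("topic") == "alert" and m.get("severity", 0) >= 3,
    ),
    Subscription(
        "bidder-7",
        lambda m: m.get("topic") == "auction" and m.get("price", 0) < 100,
    ),
]

message = {"topic": "alert", "severity": 5, "source": "sensor-12"}
receivers = route(message, subscriptions)
```

In this sketch the sender of `message` names no receiver, and only the subscription that tests for high-severity alerts selects it.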
A content-based routing scheme is described in A. Carzaniga, M. J. Rutherford, A. L. Wolf, “A Routing Scheme for Content-Based Networking”, Department of Computer Science, University of Colorado, June 2003, the contents of which are herein incorporated by reference.
The biggest challenge associated with content routing is scalability. The address (subscription) used for routing data in a content network is generally much larger than the IP address used in explicitly addressed networks. As a result, the memory required to store all subscriptions across all routers in the network quickly becomes prohibitive for an Internet-scale network. The other difficulty with content routing lies in efficiently disseminating traffic throughout the network without introducing cycles and without any loss of data. The routing algorithm must ensure perfect delivery and optimal bandwidth utilization.
Previous schemes for implicit or content routing do not meet the scalability and robustness requirements for deployment in a real-life carrier or enterprise network.
In “Mesh-Based Content Routing using XML”, Alex C. Snoeren, Kenneth Conley, David K. Gifford, 18th ACM Symposium on Operating Systems Principles (SOSP 2001), pages 160-173, a scheme is described for reliably multicasting time-critical data in mesh-based overlay networks. The technique described focuses on the use of redundant multiple paths to reassemble data content, and does not describe methods for subscription management.
In “A Scalable Protocol for Content-Based Routing in Overlay Networks”, R. Chand, P. A. Felber, Institute EUROCOM Research Report Number 74-RR-03-074, Feb. 26, 2003, a protocol is described for content-based routing in overlay networks, including taking advantage of subscription aggregation to reduce the size of routing tables. The protocol is specifically designed to handle XML-based data dissemination. This prior art suffers from a number of issues that are handled by the current invention, namely: the requirement to compute subscription aggregation at each node in the network through which a subscription passes (when being added or removed), and the requirement to re-compute the subscription aggregation when the network topology changes due to a link failure or recovery, or due to the addition or removal of a link in the overlay network. In addition, the scheme does not describe how the routing protocol is made robust in the face of lost messages. In particular, it assumes that nodes and links do not fail. It also assumes that the number and location of producer nodes (nodes with attached publishers) are known.
In “A Routing Scheme for Content-Based Networking”, Antonio Carzaniga, Matthew J. Rutherford, Alexander L. Wolf, University of Colorado, June 2003, a scheme is proposed for content-based networking in an overlay network, using two routing protocols: a broadcast routing protocol and a content routing protocol. The broadcast protocol processes topological information and maintains the forwarding state necessary to send a message from each node to every other node. This can be done using a global spanning tree (e.g. minimal spanning tree), per-source trees (e.g. shortest-path trees), or another broadcast method such as reverse path forwarding. These schemes are not described further. The content-routing scheme supports predicate-based messages only (as opposed to supporting XML messages), and supports subscription aggregation. It suffers from the same issues as the protocol above: subscription aggregation must be performed at each router to which a subscription propagates, and these calculations must be completely redone when the topology changes. In addition, the scheme described suffers from a number of other shortcomings. First, when a subscription is removed, the removal may not take effect immediately, and documents may continue to flow on paths where they are no longer needed. Second, to compensate for the first shortcoming, an inefficient method is used to repair the routing tables on a timed basis. Its broadcast function also requires the property of “all-pairs path symmetry”, which restricts the paths that can be taken in a given network topology.