Publish/subscribe data processing systems have become very popular in recent years as a way of distributing data messages. Publishers are typically not concerned with where their publications are going, and subscribers are typically not interested in where the messages they receive have come from. Instead, a message broker typically assures the integrity of the message source, and manages the distribution of the message according to the valid subscriptions registered in the broker.
Publishers and subscribers may also interact with a network of brokers, each one of which propagates subscriptions and forwards publications to other brokers within the network. Therefore, when the term “broker” is used herein it should be taken as encompassing a single broker or multiple brokers working together as a network to provide brokering services.
An overview of a typical pub/sub system (e.g. WebSphere(R) MQ Integrator available from IBM Corporation) is described with reference to FIG. 1. Such a system comprises a number of publishers 10, 20, 30 publishing messages to a broker 70 on particular topics (e.g. news, weather, sport). Subscribers 40, 50, 60 register their interest in such topics via subscription requests received at the broker 70. For example, subscriber 40 may request to receive any information published on the weather, whilst subscriber 50 may desire information on the news and sport.
Note that broker 70 might be an identifiable process, a set of processes or another executing component, or might instead be “hidden” inside other application code. The logical function of the broker will, however, exist somewhere in the network.
When broker 70 receives a message on a particular topic from a publisher, the broker determines from its list of subscriptions to whom that message should be sent. The broker then transmits the message to such subscribers.
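The matching step just described can be illustrated with a minimal in-memory sketch. It is not the implementation of any particular product: the `Broker` class and its `subscribe` and `publish` methods are hypothetical names chosen for illustration, and network delivery is reduced to returning a list of (subscriber, message) pairs.

```python
from collections import defaultdict


class Broker:
    """Toy broker (hypothetical): maps each topic to the set of subscriber ids."""

    def __init__(self):
        self._subscriptions = defaultdict(set)

    def subscribe(self, subscriber_id, topic):
        # Register the subscriber's interest in a topic.
        self._subscriptions[topic].add(subscriber_id)

    def publish(self, topic, message):
        # Look up the subscription list and produce one copy per matching
        # subscriber, as in point-to-point pub/sub.
        matching = sorted(self._subscriptions.get(topic, ()))
        return [(subscriber_id, message) for subscriber_id in matching]


broker = Broker()
broker.subscribe(40, "weather")  # subscriber 40 wants weather
broker.subscribe(50, "news")     # subscriber 50 wants news and sport
broker.subscribe(50, "sport")
deliveries = broker.publish("sport", "Final score: 2-1")
```

Here only subscriber 50 is selected for the sport publication; a topic with no registered subscription yields no deliveries.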
A problem with typical pub/sub is scalability. The broker sends one copy of a message to each subscriber who has registered an interest in the topic to which the message relates. Thus, if one hundred subscribers desire to receive information on the topic of sport, one hundred copies of each message relating to sport are sent out, and the whole network of subscribers might be flooded.
For this reason, multicast pub/sub was invented. It scales much better because the network determines the minimum (most efficient) number of message copies necessary to fulfil subscribers' requests.
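The difference in load at the broker can be put in rough numbers. The sketch below is a simplification under stated assumptions: it counts only the copies the broker itself transmits for one message, it assumes all interested subscribers listen on one multicast address, and it does not model the replication that routers perform further along the network. The function names are hypothetical.

```python
def unicast_sends(subscriber_count: int) -> int:
    # Point-to-point: the broker transmits one copy per interested subscriber.
    return subscriber_count


def multicast_sends(subscriber_count: int) -> int:
    # Multicast (assumed single group): the broker transmits one copy to the
    # multicast address if anyone is interested; the network fans it out.
    return 1 if subscriber_count > 0 else 0


sport_subscribers = 100
copies_point_to_point = unicast_sends(sport_subscribers)
copies_multicast = multicast_sends(sport_subscribers)
```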
Unlike point-to-point TCP/IP socket-based pub/sub (where each subscriber listens on its own IP address for messages), subscribers in a multicast system listen on specific multicast addresses. Any number of subscribers may listen on the same multicast address.
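As an illustration of what listening on a multicast address involves, the following sketch joins an IPv4 multicast group over UDP using the standard socket interface. IPv4 and UDP are assumptions, as are the helper names `membership_request` and `open_multicast_listener` and the example group and port mentioned afterwards. Binding behaviour also varies by platform; some systems need `SO_REUSEPORT` rather than `SO_REUSEADDR` for several listeners on one host.

```python
import ipaddress
import socket
import struct


def membership_request(group: str) -> bytes:
    """Build the 8-byte ip_mreq value passed to IP_ADD_MEMBERSHIP (IPv4)."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    # Group address followed by 0.0.0.0, which lets the OS choose the interface.
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))


def open_multicast_listener(group: str, port: int) -> socket.socket:
    """Return a UDP socket that has joined the given multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow more than one listener on this host to bind the same port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock
```

A subscriber would then call, for example, `open_multicast_listener("239.1.2.3", 5007)` and read datagrams with `recvfrom`; any number of other subscribers may join the same group in the same way.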
In a pub/sub system, the number of topics is potentially infinite. However, the range of multicast addresses available is limited. Further, most systems support only a subset of this limited range. Thus there is the very real problem of how to map the “topic space” to the available multicast addresses.
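To give a sense of how limited the range is, the sketch below counts addresses under the assumption of IPv4, which the text above does not specify. The block 224.0.0.0/4 is the whole IPv4 multicast range, and 239.0.0.0/8 is the administratively scoped portion commonly used within a single organisation; which subset a given system actually supports is not stated here.

```python
import ipaddress

# Assumption: IPv4 addressing.
ALL_MULTICAST = ipaddress.ip_network("224.0.0.0/4")   # entire multicast block
ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")    # administratively scoped part

total_addresses = ALL_MULTICAST.num_addresses    # 2**28 addresses
scoped_addresses = ADMIN_SCOPED.num_addresses    # 2**24 addresses
```

Both figures are finite and fixed, whereas the set of possible topic names has no upper bound, so no one-to-one assignment of topics to addresses can cover every topic.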
One well-known pub/sub system (TIBCO Rendezvous) avoids the problem by having all subscribers listen on a single multicast address, whatever their subscription requests. Thus each subscriber receives all publications. Software on each subscriber is then used to filter out information on topics in which the subscriber has no interest. Such a system places a very heavy workload on both the network and the subscribers themselves.
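The subscriber-side filtering described above can be sketched as follows. This is a generic illustration, not the API of the named product: `filter_publications` is a hypothetical helper, and publications are modelled as simple (topic, body) pairs.

```python
def filter_publications(publications, interests):
    """Keep only the publications whose topic the subscriber asked for."""
    return [(topic, body) for topic, body in publications if topic in interests]


# Every subscriber receives every publication on the shared address...
received = [
    ("news", "Election results"),
    ("weather", "Rain later"),
    ("sport", "Final score: 2-1"),
]
# ...and discards what it did not subscribe to.
wanted = filter_publications(received, {"weather"})
discarded = len(received) - len(wanted)
```

In this small case two of the three messages were delivered and processed only to be thrown away, which is the source of the extra workload on both the network and the subscriber.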
Thus there is a need in the industry for an efficient way of mapping the limited range of multicast addresses available to an infinite topic space and for communicating a single multicast address per subscription request.