In a computing environment where large amounts of data are moved between various locations, for example in connection with stock trading, it is desirable to move the data as efficiently as possible. One early method for doing so, as illustrated in FIG. 1, was to transfer the data from a main data source 100 as a whole data file 102 via File Transfer Protocol (FTP) to routers 110, 112, 114 located in different areas where the data would need to be distributed. (The geographic locations noted in FIG. 1 are for illustrative purposes only, to show how widely dispersed the data destinations may be.)
Each of the routers 110, 112, 114 contains a local network file server that parses the data file 102 and generates a plurality of smaller data files 116, which are distributed to local destinations 120a, 120b, 122a, 122b, 124a, 124b. The local destinations shown in FIG. 1 are illustrative; any number of destinations may need to access data from the file 102.
There are two major disadvantages to the arrangement shown in FIG. 1. First, the data is not sent in real time, leading to an undesired delay in processing the data. Second, the entire data file 102 has to be sent to multiple locations 110, 112, 114 in order to be distributed to the ultimate destinations 120a-124b, resulting in large amounts of unnecessary computer network traffic. Under this arrangement, the data file 102 is parsed and divided multiple times, as opposed to as few as once, making the process inefficient, processor intensive, and not in real time.
In a setting like stock trading, access to data in real time is critical in order to be able to make the best possible trades at a given point in time. In an effort to overcome the inefficiencies of FTP-based data transfer, a similar arrangement was used on top of a messaging platform which could distribute the data in real time, as shown in FIG. 2.
Modern computer networks are rarely homogeneously constructed; they are often a collection of old and new systems from a variety of vendors and operate on a variety of platforms. Across an enterprise, it is critical that the disparate parts of a computer network communicate with each other in some form. One solution to this problem is to utilize a messaging platform that runs across various systems while providing a common message format. A common messaging platform typically involves a publish-subscribe metaphor, in which information is published to a particular subject or topic, and any party interested in receiving that information subscribes to that subject (this may also be referred to as consuming off a particular subject). In this environment, a consumer only receives information that is of interest; any other, non-relevant information is not published to the subject. Examples of such a messaging platform include ETX from TIBCO Software, Inc. and MQ Series from International Business Machines Corporation.
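By way of illustration only, the publish-subscribe metaphor described above can be sketched in a few lines of Python. The sketch is a toy, in-memory stand-in and does not represent the interface of ETX, MQ Series, or any other actual messaging platform; the class name Broker, its methods, and the subject names are assumptions introduced solely for this example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broker:
    """Toy in-memory broker (hypothetical; not a real platform API)."""

    def __init__(self) -> None:
        # subject -> callbacks of the parties subscribed to that subject
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, subject: str, callback: Callable[[str], None]) -> None:
        self._subscribers[subject].append(callback)

    def publish(self, subject: str, message: str) -> None:
        # Only subscribers of this exact subject receive the message.
        for callback in self._subscribers.get(subject, []):
            callback(message)


broker = Broker()
equities_inbox: List[str] = []
bonds_inbox: List[str] = []

# Each consumer subscribes only to the subject it is interested in.
broker.subscribe("trades.equities", equities_inbox.append)
broker.subscribe("trades.bonds", bonds_inbox.append)

broker.publish("trades.equities", "BUY 100 XYZ")
```

In this sketch the consumer of the hypothetical subject "trades.bonds" never sees the equities message, which mirrors the statement that a consumer only receives information published to a subject to which it subscribes.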
To route the data to its final destination, it must be published to a subject that the destination subscribes to. Since there is some overhead in terms of time in determining the proper subject on which to publish a message, a message can be published to a “general” subject and the specific subject of the message can be determined thereafter. One solution to this problem is to use a router to examine the message and to determine the specific topic on which the message should be published.
As shown in FIG. 2, a data source 200 publishes messages 202, all of which are consumed by a general data router (GDR) 210. The router 210 parses the messages 202 and publishes the parsed messages on new subjects 212, 214, 216, which are destined for second-level routers 220, 222, 224, respectively. The second-level routers 220, 222, 224 examine the message a second time, and republish the message on a specific subject 226 for a particular end destination 230a, 230b, 232a, 232b, 234a, 234b. 
The router 210 parses a message 202 by examining the contents of the message 202, evaluating a particular key contained within the message 202, and, based upon the value of the key, determining the proper second-level router 220, 222, 224 to which it should publish the message 202. The second-level routers 220, 222, 224 examine the message in the same manner as the router 210, but with a finer level of granularity, in order to determine the specific destination 230a-234b for the message. Simply stated, the message 202, when published, does not have a destination address associated with it, but that address can be built dynamically by the routers 210 and 220, 222, or 224, by looking up what is in the message 202, building the address for the message 202, and publishing the message 202 to its final destination 230a-234b.
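The two-stage construction of the destination address can be illustrated with the following minimal Python sketch. It assumes, purely for illustration, that a message is a dictionary carrying a coarse key ("region") and a finer key ("account"); these field names, the table contents, and the subject strings are hypothetical and are not taken from FIG. 2.

```python
from typing import Dict, Mapping

# Hypothetical first-level table: coarse key value -> subject of a second-level router.
FIRST_LEVEL_SUBJECTS: Dict[str, str] = {
    "east": "router.220",
    "central": "router.222",
    "west": "router.224",
}

# Hypothetical second-level tables: finer key value -> subject of an end destination.
SECOND_LEVEL_SUBJECTS: Dict[str, Dict[str, str]] = {
    "router.220": {"ACCT-1": "dest.230a", "ACCT-2": "dest.230b"},
    "router.222": {"ACCT-3": "dest.232a", "ACCT-4": "dest.232b"},
    "router.224": {"ACCT-5": "dest.234a", "ACCT-6": "dest.234b"},
}


def route_first_level(message: Mapping[str, str]) -> str:
    """Role of router 210: choose a second-level router from the coarse key."""
    return FIRST_LEVEL_SUBJECTS[message["region"]]


def route_second_level(router_subject: str, message: Mapping[str, str]) -> str:
    """Role of routers 220-224: choose the end destination from the finer key."""
    return SECOND_LEVEL_SUBJECTS[router_subject][message["account"]]


def build_destination(message: Mapping[str, str]) -> str:
    """Build the address in two steps; the message carries no address itself."""
    intermediate = route_first_level(message)
    return route_second_level(intermediate, message)


message = {"region": "central", "account": "ACCT-4", "body": "SELL 50 XYZ"}
destination = build_destination(message)
```

The message itself is never altered; only the subject on which it is republished changes at each level.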
One of the goals in using a messaging platform and the multiple routers is to extract some of the complexity from both the publisher and the consumer and to place that logic into a centralized layer, such that it is essentially considered by both end publishers and end consumers to be part of the messaging platform. This is one of the focus points of enterprise application integration (EAI): making it easier for disparate systems to communicate with one another. By placing the routing logic in a centralized location, the administration of the logic is simplified, since only one location needs to be updated when changes are made.
In order to simplify what a particular second-level router 220, 222, 224 needs to understand, it can be specified what is unique about an instance of the application that can be found in the message. But there is still the problem, from the publisher's (200) perspective, of how to identify the specific destination 230a-234b to which the message should be sent. In a publish-subscribe environment, this problem is solved by publishing to a subject subscribed to by the specific destination. If the router 210 were not present, each of the second-level routers 220, 222, 224 would need to discard any messages that were not intended for them; this would merely replicate one of the disadvantages of using FTP as noted above, but in connection with a messaging platform. The router 210 helps to reduce the amount of unnecessary data traffic by reducing the number of messages that need to be sent. Ideally, no message is duplicated, nor is a message sent to more than one location.
One disadvantage of this use of the messaging platform is that there are multiple instances of routers operating at the same time, which creates management issues of having to coordinate several pieces of software. While the routers are executing the same code base, each router is applying different routing rules, depending upon the router's location in the message flowpath. Furthermore, each router is only able to apply one routing rule. To apply multiple routing rules to one message, multiple routers need to be arranged in sequence, necessarily creating a complicated network design. The design shown in FIG. 2 is also a single thread of execution, which limits the throughput of the routing system to about 35 messages per second (assuming an average message size of two kilobytes). In the example noted above of a large stock trading system, a real-time flow of data easily exceeds 35 messages per second.
It is desirable to create a routing system that utilizes a single application to execute multiple routing rules on a single message, that is multithreaded in order to increase the throughput of the system, and that is messaging platform agnostic such that disparate messaging platforms can be used on either side of a publish-subscribe or a point-to-point transaction.
FIG. 3 shows how a single router of the prior art operates while processing a message. A router 300 accepts an inbound message 302, processes the inbound message 302 and outputs an outbound message 304. The contents of the inbound message 302 and the outbound message 304 are identical. The goal of the router 300 is to examine the contents of the inbound message 302, which is published to a general subject, and from those contents determine the specific subject on which the outbound message 304 should be published for consumption by the ultimate recipient of the outbound message 304.
The inbound message 302 is first examined at block 310, where an introspection module is called. The particular introspection module to be called is dependent upon the subject of the inbound message 302 and is retrieved from an introspection module library 312. An introspection module (a/k/a key extraction routine) is a customized routine that complies with a particular interface. It can be loaded dynamically according to a configuration of a particular routing instance and it contains the logic for examining a specific type of message. This code will read the inbound message 302 and extract the information needed to determine how to route the message 302 to the proper specific subject, namely a routing key. The information to be extracted and used as the routing key is defined in the introspection module, which is why a different introspection module is required for each different routing rule to be applied. For example, in the stock trade example, the account number associated with the trade can be used as the routing key.
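The selection and invocation of an introspection module at block 310 can be sketched as follows. The sketch assumes that a message is a dictionary and that the library 312 can be modeled as a mapping from inbound subject to a key extraction function; the function names, field names, and subject strings are hypothetical, and the dynamic loading of modules from configuration is omitted for brevity.

```python
from typing import Callable, Dict, Mapping

# The common "interface" every introspection module complies with in this
# sketch: it takes a message and returns the routing key as a string.
KeyExtractor = Callable[[Mapping[str, str]], str]


def extract_account_number(message: Mapping[str, str]) -> str:
    """Hypothetical module for trade messages: route on the account number."""
    return message["account"]


def extract_ticker_symbol(message: Mapping[str, str]) -> str:
    """Hypothetical module for quote messages: route on the ticker symbol."""
    return message["symbol"]


# Stand-in for the introspection module library 312, keyed by inbound subject.
INTROSPECTION_LIBRARY: Dict[str, KeyExtractor] = {
    "trades.general": extract_account_number,
    "quotes.general": extract_ticker_symbol,
}


def call_introspection_module(subject: str, message: Mapping[str, str]) -> str:
    """Block 310: pick the module for the inbound subject and extract the key."""
    extractor = INTROSPECTION_LIBRARY[subject]
    return extractor(message)


routing_key = call_introspection_module(
    "trades.general", {"account": "ACCT-2", "symbol": "XYZ"}
)
```

Because each function encodes which field serves as the routing key, a different function is needed for each routing rule, consistent with the description above.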
At block 320, the routing key is extracted from the inbound message 302 and the value of the routing key is evaluated. This value is matched against a keymap table 322 to determine the routing tag or target for the inbound message 302. The keymap table 322 is a two column table that lists the values of the routing key in one column and the matching routing tags for those values in another column. Because the router 300 can only operate on one routing rule, the keymap table 322 will be the same for all inbound messages 302. The data in the keymap table 322 can be cached locally within the router 300 for rapid access to the data. During the initialization of the router 300, the keymap table 322 is loaded into the router's memory from an external routing information database 324.
Once the routing tag of the inbound message 302 has been identified, at block 330, the routing tag is used to access an outbound routing table 332 to identify the outbound subject for the inbound message 302. The outbound routing table 332 is a two column table that lists the values of the routing tag in one column and the outbound subjects for those values in another column. As with the keymap table 322, the outbound routing table 332 can be cached in local memory during the initialization of the router 300 by loading the outbound routing table 332 from the routing information database 324. In block 340, the inbound message 302 is published to the new subject as outbound message 304.
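The two table lookups of blocks 320 and 330 can be illustrated with the following minimal Python sketch. The rows shown are hypothetical stand-ins for the contents of the routing information database 324, and the class and method names are assumptions made for this example. The handling of a routing key that has no entry in the keymap table is not described above; in this sketch such a key simply raises a KeyError.

```python
from typing import Dict, Iterable, Tuple

Row = Tuple[str, str]

# Hypothetical rows standing in for the routing information database 324.
KEYMAP_ROWS = [("ACCT-1", "TAG_RETAIL"), ("ACCT-2", "TAG_INSTITUTIONAL")]
OUTBOUND_ROWS = [
    ("TAG_RETAIL", "trades.retail"),
    ("TAG_INSTITUTIONAL", "trades.institutional"),
]


class SingleRuleRouter:
    """Sketch of router 300: one keymap table and one outbound routing table."""

    def __init__(self, keymap_rows: Iterable[Row], outbound_rows: Iterable[Row]) -> None:
        # Both two-column tables are cached in memory once, at initialization.
        self.keymap: Dict[str, str] = dict(keymap_rows)      # routing key -> routing tag
        self.outbound: Dict[str, str] = dict(outbound_rows)  # routing tag -> outbound subject

    def outbound_subject(self, routing_key: str) -> str:
        routing_tag = self.keymap[routing_key]  # block 320
        return self.outbound[routing_tag]       # block 330


router = SingleRuleRouter(KEYMAP_ROWS, OUTBOUND_ROWS)
subject = router.outbound_subject("ACCT-2")
```

The intermediate routing tag allows many routing key values to share one outbound subject, so that a subject can be changed in a single row of the outbound routing table.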
FIG. 4 shows how the prior art applied multiple routing rules to a single inbound message 400. Because each router of the prior art was only capable of applying a single rule, it was necessary to string multiple routers together to be able to apply multiple rules to a single message. (The concept of multiple routing rules will be discussed below in connection with FIG. 5.) As shown in FIG. 4, an inbound message 400 is examined by a first router 410, which applies a first rule to the inbound message 400 and then, if the inbound message 400 meets the criteria of the first rule, publishes the inbound message 400 as an outbound message 412 for a first consumer 414. The inbound message 400 is then passed to a second router 420, which applies a second rule to the inbound message 400 and then, if the inbound message meets the criteria of the second rule, publishes the inbound message 400 as an outbound message 422 for a second consumer 424, and so on.
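The chained arrangement of FIG. 4 can be sketched as a sequence of single-rule routers through which the same message is passed in turn. The rule predicates, message fields, and subject names below are hypothetical, and publication is modeled simply by collecting (subject, message) pairs.

```python
from dataclasses import dataclass
from typing import Callable, List, Mapping, Tuple

Message = Mapping[str, object]


@dataclass
class ChainedRouter:
    """A router that can apply exactly one rule (hypothetical sketch)."""
    rule: Callable[[Message], bool]  # criteria of this router's single rule
    outbound_subject: str            # subject consumed by this router's consumer


def run_chain(routers: List[ChainedRouter], message: Message) -> List[Tuple[str, Message]]:
    """Pass the message through each router in sequence, as in FIG. 4."""
    published: List[Tuple[str, Message]] = []
    for router in routers:
        if router.rule(message):
            # The message meets this rule: publish it for this router's consumer.
            published.append((router.outbound_subject, message))
        # In either case the message is passed on to the next router.
    return published


chain = [
    ChainedRouter(lambda m: m["quantity"] >= 1000, "consumer.414"),
    ChainedRouter(lambda m: m["symbol"] == "XYZ", "consumer.424"),
]
trade = {"symbol": "XYZ", "quantity": 100}
published = run_chain(chain, trade)
```

Applying an additional rule under this arrangement means adding another router to the sequence, which illustrates why the design becomes more complicated as the number of rules grows.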
Some solutions to the general problems posed by the complexities of enterprise application integration have been proposed by various U.S. patents. For example, U.S. Pat. No. 6,256,676 to Taylor et al. relates to a system for integrating a plurality of computer applications, including an adapter configured for each of the applications, the adapter controlling the communication to and from the associated application. The system of Taylor et al. permits communication across a variety of different messaging modes, including point-to-point, publish-subscribe, and request-reply messaging, utilizing message definitions for each type of object to be passed through the system. A number of different types of adapters are required for each application, and for each message definition. While the architecture of this system permits flexibility in system construction, it requires a significant amount of work by the user to properly construct the system. This system adapts to the applications to be connected, rather than requiring the applications to adapt themselves to the system.
U.S. Pat. No. 5,680,551 to Martino, II describes a system for connecting distributed applications across a variety of computing platforms and transport facilities. To implement this system, it is necessary to modify each of the applications to be connected to include the basic operating core (i.e., the application programming interface) of the system. This system does not support a publish-subscribe messaging platform, and any application desiring to receive messages must actively seek out new messages. In order to use this system, a messaging user interface to each application is designed, then the messaging system is integrated into each application to be connected, and finally the system is configured and tested. Following these steps for each application to be connected is both labor-intensive and time-intensive.
In regard to content processing and routing, U.S. Pat. No. 6,216,173 to Jones et al. discloses a method and apparatus for incorporating such intelligence into networks. The system of Jones et al. associates attributes with each service request which allows the system to obtain knowledge about the content and requirements of the request. Using this knowledge, along with knowledge of the available services, the system can route the request to a suitable service for processing. This system also permits communication across disparate networks, by converting the data for transmission across each type of network. The conversion process occurs while the data is being sent from, for example, Node A to Node C. An intermediate stop is made at Node B to convert the data from the format at Node A to the format at Node C. The data conversion occurs during the routing process, not once routing is completed.
While these patents address various problems existing in the prior art, none contemplates the use of a single application that handles all of the routing, that allows the applications at either end of a publish-subscribe or a point-to-point messaging system to run as-is without modification, and that runs in any messaging environment regardless of the specifics of the messaging platform (i.e., is messaging system agnostic).