In the prior art, many message delivery systems exist which offer assured message delivery between endpoints, such as between different applications. Assured (sometimes also called guaranteed, persistent or durable) message delivery offers a “deliver at least once” delivery semantic, although other delivery semantics can also be offered, such as deliver at most once, deliver once and only once, etc. The messages are delivered to destination endpoints based on topics, queues, characteristics of the message content or a combination of criteria such as a topic to queue mapping; an exemplary system is described in U.S. Pat. No. 7,716,525 (Buchko), the contents of which are herein incorporated by reference. Such message delivery systems provide for loosely coupled message delivery between the message source and the receiving application (for one-to-one delivery) or receiving applications (for one-to-many delivery). When a message is sent, a receiving application (or multiple receiving applications) may be offline or part of the network may be unavailable. The messaging system must persist (or store) the message so that it may deliver it to the receiving application when it comes back online or when a communications path to it is restored. As well, the system ensures message delivery to the receiving application(s) even in the presence of message loss between network elements, as may occur due to events such as communications errors, power outages, equipment failures, etc.
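The "deliver at least once" semantic described above can be illustrated with a minimal store-and-forward sketch. It is not taken from any referenced system: the class name `AssuredQueue`, its methods, and the use of an in-memory deque as a stand-in for non-volatile storage are all assumptions made for illustration.

```python
from collections import deque


class AssuredQueue:
    """Hypothetical at-least-once queue for a single receiving application."""

    def __init__(self):
        # Assumption: an in-memory deque stands in for non-volatile storage.
        self._pending = deque()
        self._next_id = 0

    def publish(self, payload):
        """Persist the message before any delivery attempt; return its id."""
        msg_id = self._next_id
        self._next_id += 1
        self._pending.append((msg_id, payload))
        return msg_id

    def deliver(self, receiver_online):
        """Return every unacknowledged message if the receiver is reachable.

        Nothing is removed here, so a message may be sent more than once.
        """
        return list(self._pending) if receiver_online else []

    def ack(self, msg_id):
        """Discard a message only once the receiver has acknowledged it."""
        self._pending = deque(m for m in self._pending if m[0] != msg_id)
```

Because a message is removed only on acknowledgement, a lost message or a lost acknowledgement results in redelivery rather than loss, which is the essence of the at-least-once semantic.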
Of the assured message delivery systems known in the art, some are broker based, where clients communicate via an intermediate system (or broker); in other implementations the clients speak directly to each other, with a replay system monitoring communication and performing the persistence functions. Similarly, assured messaging systems may be assembled from standard components such as servers, disks, software libraries, etc., from custom hardware assemblies such as network processors and FPGAs, or from a combination of standard and custom components. An example of a custom hardware platform for assured messaging is the Solace 3200 Series of middleware appliances from Solace Systems, Inc. Assured messaging systems exhibit undesirable behaviors that result from resource contention under specific traffic patterns, where the behavior of one client can affect the latency, jitter and message rate experienced by another client. The desired behavior of the message delivery system is to protect the resources needed to provide service to real-time message flows, so that contention for resources from non-real-time message flows does not impede the ability of the system to provide the ideal service to the real-time message flows.
Broadly speaking, there are four client behaviors seen in assured message delivery systems: publishing (or producing) client behavior, subscribing (or consuming) streaming client behavior, subscribing recovering client behavior and subscribing offline or slow client behavior. From the point of view of a publishing client, the ideal assured message delivery system will accept messages from the publisher as fast as the publisher can produce them; put another way, the message delivery system will not back-pressure publishers, so as not to impede the overall performance of the publishing application. The message delivery system may back-pressure publishers to prevent congestion, but this is an undesirable behavior from the point of view of the publishing client. Streaming subscribers have no messages queued in the messaging system that cannot be immediately dispatched for delivery, and are able and willing to receive more messages (known in the art as having an open receive window). When the message delivery router receives a message that matches a topic or queue endpoint for a streaming subscriber, it is able to immediately forward a copy of the message to the subscriber. The ideal behavior of an assured message delivery system from the point of view of a streaming subscriber is to deliver messages to the subscriber with the lowest possible latency from the publisher to the subscriber. The recovering subscriber has undelivered messages queued for it on the messaging system and is able to receive messages.
The undelivered messages queued on the message delivery router are often the result of a subscriber application going offline for some period of time (during which the message delivery router stored messages without immediately delivering them); upon coming back online, the subscribing application seeks to catch up on the messages that were queued during the time it was unavailable, and additional arriving messages may be added to the queue(s) for the subscribing application during the recovery phase. The ideal behavior of the assured message delivery system with regard to recovering subscribers is to catch up (reduce the number of undelivered messages queued in the message delivery router to zero) as quickly as possible and transition to the streaming subscriber behavior. Offline or slow subscribers are unable or unwilling to receive new messages at the rate they are being published. In the offline or slow subscriber behavior, the message delivery router is forced to queue messages without the ability to immediately deliver them to subscribers that are either offline or have a closed receive window. In the case of the offline or slow subscriber, the ideal behavior of the assured message delivery system is simply to not lose messages and to minimize the impact on the other classes of participants.
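The three subscriber behaviors described above can be summarized as a small decision rule. The following is only an illustrative sketch: the names `SubscriberState`, `SubscriberView` and `classify`, and the choice of inputs (an online flag, a receive-window flag and a count of queued undelivered messages), are assumptions and do not describe any particular implementation.

```python
from dataclasses import dataclass
from enum import Enum


class SubscriberState(Enum):
    STREAMING = "streaming"
    RECOVERING = "recovering"
    SLOW_OR_OFFLINE = "slow_or_offline"


@dataclass
class SubscriberView:
    """Hypothetical snapshot of what the router knows about one subscriber."""
    online: bool
    receive_window_open: bool
    queued_undelivered: int  # backlog that could not be dispatched on arrival


def classify(sub: SubscriberView) -> SubscriberState:
    """Map a snapshot to one of the three subscriber behaviors."""
    # Offline, or online with a closed receive window: messages can only be queued.
    if not sub.online or not sub.receive_window_open:
        return SubscriberState.SLOW_OR_OFFLINE
    # Able to receive, but still draining a backlog of queued messages.
    if sub.queued_undelivered > 0:
        return SubscriberState.RECOVERING
    # No backlog and an open window: new messages can be forwarded immediately.
    return SubscriberState.STREAMING
```

Under this rule, a recovering subscriber becomes a streaming subscriber as soon as its backlog reaches zero, matching the transition described above.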
The assured message delivery system has a pool of finite resources that it must manage in order to provide ideal (or as close to ideal as possible) service to the four classes of clients previously described. The resources available to an assured message delivery system are processing and memory cycles, internal interconnect bandwidth between system components, network bandwidth and access to non-volatile storage. How these resources are applied by the assured message delivery system to the task of delivering messages will affect how close to the ideal the level of service received by a particular client will be. The level of service the overall system is providing can be measured in terms of the number of ingress messages per second, the number of egress messages per second, the distribution of latencies between the message arrival time from publishers and the delivery times to streaming subscribers, and the time taken for recovering subscribers to catch up with the queued message backlog and transition back to a streaming state. Current generation assured message delivery systems do not distinguish between the four client behaviors previously described and consequently are not able to efficiently allocate resources to service the four classes of clients differentially. An example of an undesirable behavior that results from this arises when slow subscribers are present in the system. It takes more resources to deliver a message to a slow subscriber than it does to a streaming subscriber because messages that are destined to slow subscribers must be retrieved from non-volatile storage (typically disk) for delivery, whereas messages for streaming subscribers (in most implementations) are written to disk or some other form of non-volatile storage but delivered from RAM. Retrieving a message from disk is a relatively expensive operation since accessing disk takes orders of magnitude more time than accessing RAM.
The extra time and system resources spent delivering messages to slow subscribers can cause contention for system resources needed for other tasks, such as processing new messages received from publishers and delivering those messages to streaming subscribers. If the assured message delivery system fails to process acknowledgements from streaming subscribers quickly enough, then it may falsely conclude that the receive window to the streaming subscriber has closed and mistakenly transition that subscriber to a slow subscriber, affecting system throughput and the behavior experienced by the (now slow) subscriber. If the assured message delivery system cannot process and acknowledge inbound messages from publishers quickly enough, then the transmit windows for the publishers will close, back-pressuring the publishers and causing them to slow down, which was previously identified as an undesirable behavior. If the assured message delivery system cannot deliver messages to streaming subscribers in a timely manner, latency-sensitive applications can see excessive and/or unpredictable message latencies and jitter, and eventually a reduced overall message delivery rate.
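The way a closing transmit window back-pressures a publisher can be sketched with a simple counter of unacknowledged messages. This is an illustrative model only; the class `TransmitWindow`, its methods and the fixed window size are assumptions, not a description of any specific protocol.

```python
class TransmitWindow:
    """Hypothetical publisher-side window of unacknowledged messages."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("window size must be positive")
        self.size = size
        self.unacked = 0

    def is_open(self):
        """The window is open while fewer than `size` messages await an ack."""
        return self.unacked < self.size

    def send(self):
        """Try to send one message; return False if the publisher must wait."""
        if not self.is_open():
            return False  # window closed: the publisher is back-pressured
        self.unacked += 1
        return True

    def ack(self, count=1):
        """Record acknowledgements from the messaging system, reopening the window."""
        self.unacked = max(0, self.unacked - count)
```

In this model, the longer the messaging system takes to acknowledge inbound messages, the more often `send` returns False, which is how delayed acknowledgements translate into a slower publisher.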
Current assured message delivery system implementations neither identify different client behaviors nor process client messages differentially according to each client's behavior. A system that can identify these client behaviors and tailor its interactions with clients according to the ideal system behavior will exhibit better system performance. The messaging system, by dedicating resources to specific client flows, prioritizing certain work flows and bundling certain work flows, can make more efficient use of system resources, provide better overall service to clients and create better true real-time decoupling between the clients. The goal is to create a system that provides service as close to the previously described ideal behavior as possible for all clients, regardless of the behavior of other clients. A system in which the behavior of one client does not cause the level of service that another client receives to deviate from the ideal is desirable.