Data processing networks exist in many forms, from relatively small local distributed computing networks to large, remotely connected heterogeneous networks such as the Internet, which is a loose worldwide confederation of servers and browser clients. Connection and communication between points of a network take place on several levels or layers, each with its own rules or protocols, ranging from the hardware level, through basic data transmission and transport levels, to the application level. Different multi-layer models have evolved, the best known being the TCP/IP (Transmission Control Protocol/Internet Protocol) suite, which is commonly described as having four or five layers. Another well-known model is the OSI (Open Systems Interconnection) model, which has seven layers.
In networks conforming to TCP/IP, for example, when one party ceases to require a connection to another party, TCP/IP should explicitly transmit data to signal the end of the connection. This frees both parties, after which reconnection or a new connection can be established. However, if an application ends suddenly, no “end of connection” information is transmitted, and the process at the other end of the connection may not observe that its peer has ended. Subsequent attempts by the application that failed to re-establish the connection may be rejected by the process at the other end, which may believe it is still connected.
To mitigate the risk of connections remaining in this half-ended state forever, TCP/IP provides a liveness checking mechanism which may, optionally, be enabled for all users of a particular TCP/IP implementation (usually this would be all processes run on a particular computer). This mechanism involves periodically asking the party at the other end of a TCP/IP connection whether it is still there and, if it does not reply in a timely fashion, assuming that the connection has ended. In the scenario described above, this is the mechanism by which the process at the other end would eventually notice that a connection had ended and permit the application to re-establish its connection.
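The liveness check described above can be illustrated with a minimal Python sketch. It assumes a platform whose socket API exposes the standard `SO_KEEPALIVE` option, and it enables probing for a single socket rather than for every process on the computer. The function names `enable_keepalive` and `worst_case_detection_s` are hypothetical, and the timing values (7200 seconds idle, 75 seconds between probes, 9 probes) are assumptions chosen to match commonly cited Linux defaults, not values taken from this document.

```python
import socket


def enable_keepalive(sock, idle_s=7200, interval_s=75, probes=9):
    """Turn on TCP keepalive probing for one socket (illustrative helper).

    Returns the tuning values that were actually applied. Which tuning
    options exist depends on the operating system.
    """
    # SO_KEEPALIVE is the portable switch that enables periodic probing.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    applied = {}
    # The per-socket tuning constants are platform specific, so each one
    # is set only when this Python build exposes it and the OS accepts it.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle time before the first probe
        ("TCP_KEEPINTVL", interval_s),  # gap between unanswered probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
        except OSError:
            pass  # option not supported here; keep the system default
    return applied


def worst_case_detection_s(idle_s, interval_s, probes):
    """Approximate seconds before a silently failed peer is noticed."""
    return idle_s + interval_s * probes
```

With the assumed values, `worst_case_detection_s(7200, 75, 9)` evaluates to 7875 seconds, a little over two hours, during which a process could still regard a failed peer as connected.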
The drawback of the liveness checking used by TCP/IP is that it is performed relatively infrequently, so early attempts by one party to re-establish a connection can still be rejected. Liveness monitoring has previously been proposed for Publish/Subscribe systems. In particular, related U.S. Patent Application Publication Nos. 2004/205439A1 and 2004/0250283A1, both entitled “Liveness Monitoring in a Publish/Subscribe Messaging System,” describe the use of liveness monitoring of subscribers to ensure that publication takes place only when there are live subscribers. These applications are silent on the problem stated herein, namely minimizing the period during which reconnection (specifically, resumption of a subscription) is prevented after a failure.
Liveness testing has also been employed outside the messaging environment, for example, in U.S. Pat. No. 6,990,668 B1 entitled “Apparatus and Method for Passively Monitoring Liveness of Jobs in a Clustered Computing Environment” and in U.S. Patent Application Publication No. 2006/0087985 entitled “Discovering Liveness Information in a Federation Infrastructure.” Neither of these examples addresses the problem of denial of reconnection after a failure.