1. Field of the Invention
The present invention pertains to the field of digital data communications and client-server distributed data processing systems, and in particular to techniques for recovering from faults caused by either a communications problem or the failure of a client process.
2. Description of the Related Art
The use of data communications techniques to permit the distribution of data processing across a number of different digital computers is well known in the art. One of the most common organizations for such distributed systems follows the client-server model, where the distributed system has one or more servers assigned particular data processing tasks, and those servers are accessed by client programs whenever the particular data processing task of a server is required. Client programs access servers by sending a request message using an appropriate digital data communications system. Following the receipt of the request message, a server performs the data processing activity indicated by the request message to produce the desired result and sends a reply message containing the results back to the client program. The flow for such processing in a client-server distributed system is illustrated in FIG. 1. Initially, client program 101 is in step 111 performing other processing and server 102 is in step 121 waiting for a request. When client program 101 determines that it needs the processing of server 102, it prepares and sends such a request (step 112). The transfer of the request message from client program 101 to server 102 is indicated by arrow 131 in FIG. 1. Client program 101 then enters its step 113, where it either waits for a reply from server 102 or performs other processing. When request message 131 is received by server 102, it enters step 122 to receive the message and when the message has been completely received enters step 123 to process the request.
When server 102 has finished processing the request, it goes to step 124 where it sends reply message 132 to client program 101, and then enters step 125 where it waits for another request message. Client program 101 receives reply message 132 in step 114 and then proceeds to step 115, where it performs other processing. Messages 131 and 132 in FIG. 1 illustrate only a simplified form of the communications between client program 101 and server 102. In particular, no provision is shown for handling the common problems in distributed processing: the loss of a data communications message or the failure of a server while processing a request. FIGS. 2A and 2B illustrate a well-known technique for handling such common problems. It is based on receiving an acknowledgement message (commonly called an ACK) for each message sent.
FIG. 2A shows the normal operation of the error recovery technique. It illustrates the steps for Process 201 sending a message to Process 202. In step 211 Process 201 sends message 231 to Process 202 and then enters step 212 waiting for an ACK. Process 202 is initially waiting in step 221 and enters step 222 when message 231 is received. After message 231 is completely received by Process 202, step 223 is entered to send an ACK message 232 back to Process 201. ACK 232 is received by Process 201 in step 213, indicating that message 231 was received by Process 202.
FIG. 2B shows how the error recovery technique handles a message being lost during its transmission. As in FIG. 2A, the example starts with Process 205 sending message 271 in step 251 and then entering step 252 to wait for an ACK. However, in the example in FIG. 2B message 271 is dropped and does not reach Process 206. Process 205 remains in step 252 waiting for an ACK (which will never come) until a predefined time period elapses, at which time Process 205 enters step 253 because of the timeout, and then enters step 254 where message 271 is resent to Process 206 as message 272. Message 272 is received by Process 206 in step 262, and Process 206 sends ACK 273 to Process 205 in step 263. ACK 273 is received by Process 205 in step 256, completing the successful transfer of the message even though message 271 was dropped. Often a counter is employed so that only a specified number of retries will be attempted before deciding that a server cannot be accessed. When this occurs, it may be possible to locate an alternative server providing the same capabilities and attempt to access that alternative server.
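The timeout, retry counter, and alternative-server fallback described above can be sketched as follows. This is a minimal illustration only, not an implementation taken from the prior art: the names `send_with_retry`, `send_with_failover`, and `lossy_channel` are hypothetical, and each channel is assumed to be a callable that returns True when an ACK arrives before the timeout and False when the timeout elapses.

```python
from typing import Callable, Optional, Sequence

# Hypothetical channel type: returns True if an ACK arrived before the
# timeout, False if the timeout elapsed (message or ACK was lost).
Channel = Callable[[str], bool]


def send_with_retry(send: Channel, message: str, max_retries: int = 3) -> bool:
    """Send once, then resend up to max_retries times after timeouts."""
    for _attempt in range(1 + max_retries):
        if send(message):
            return True  # ACK received, transfer complete
    return False  # retry counter exhausted, server treated as inaccessible


def send_with_failover(servers: Sequence[Channel], message: str,
                       max_retries: int = 3) -> Optional[int]:
    """Try each equivalent server in turn; return the index that ACKed."""
    for index, send in enumerate(servers):
        if send_with_retry(send, message, max_retries):
            return index
    return None  # no alternative server could be reached


def lossy_channel(drops: int) -> Channel:
    """Simulated channel that drops the first `drops` transmissions."""
    remaining = [drops]

    def send(_message: str) -> bool:
        if remaining[0] > 0:
            remaining[0] -= 1
            return False  # simulated timeout
        return True  # simulated ACK

    return send
```

With `lossy_channel(1)` the first transmission is dropped and the first resend succeeds, which corresponds to the sequence of messages 271, 272, and 273 in FIG. 2B.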
While FIG. 2B indicates how error recovery occurs when message 271 is dropped, it can be readily seen how the same error recovery technique can produce the same result if a server fails after it receives a request but before it sends the ACK, if the ACK message is dropped, if a server fails during processing or before it sends its reply, or if the reply is dropped.
The case where the reply is sent by the server but is dropped and not received by the client presents a particular problem in that the server receives two identical requests and processes both of them. This is not a problem if a server operation is idempotent (can be repeated with no undesirable effects or giving identical results), as would be the case when the request to a server would be to read a particular block in a file. However, if the requested operation were something like doubling a particular value stored by a server, the result would be doubling the value twice (once for each request received), giving an improper result. Techniques are known in the art for handling errors when a server operation is not idempotent, such as recognizing that the multiple messages are the same through the use of unique identifiers and not performing an operation if its requesting message has already been seen.
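The use of unique identifiers to suppress repeated execution of a non-idempotent operation can be sketched as follows, using the value-doubling example above. The class `DoublingServer` and its `handle` method are hypothetical names chosen for illustration. Returning the saved reply for a repeated request is an assumption of this sketch; the technique as described only requires that the operation not be performed a second time.

```python
class DoublingServer:
    """Server whose operation (doubling a stored value) is not idempotent."""

    def __init__(self, value: int) -> None:
        self.value = value
        self._replies = {}  # request identifier -> reply already produced

    def handle(self, request_id: str) -> int:
        # A request whose identifier has been seen is a resend: do not
        # double again, just return the reply produced the first time.
        if request_id in self._replies:
            return self._replies[request_id]
        self.value *= 2
        self._replies[request_id] = self.value
        return self.value
```

If the reply to request "r1" is dropped and the client resends "r1", the stored value is doubled once rather than twice, and the client still receives the correct result.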
Often it is necessary that a client employ a plurality of servers in order to produce its desired result. For example, in an information retrieval system the client may first call a parsing server that converts a query into a form needed for future processing. The parsed result is then passed to an index server that determines documents that possibly match the query. The list of possibly-matching documents is then passed to a searching server, where each document is examined to determine if it matches the original query.
FIGS. 3A, 3B and 3C illustrate three different ways in the prior art that four servers 301, 302, 303, and 304 can act in cascade to process a client 300's request. In FIG. 3A, client 300 sequentially calls server 301, server 302, server 303, and server 304. In particular, client 300 first sends request message 311 to server 301 and receives reply 312. Client 300 then sends request message 313 (which may simply be a copy of reply 312) to server 302 and receives reply 314. Client 300 then sends request message 315 (possibly a copy of reply 314) to server 303 and receives reply 316. Finally, client 300 sends request message 317 (possibly a copy of reply 316) to server 304 and receives reply 318.
FIG. 3B illustrates an alternate flow for communications between client 300 and the four servers 301, 302, 303, and 304. In this example, client 300 first sends request message 321 to server 301. When server 301 completes its processing, it sends its results to server 302 as request 323 (rather than as a reply to client 300). When server 302 completes its processing, it sends its results to server 303 as request 325, and when server 303 completes its processing it sends its results to server 304 as request 327. When the final server 304 completes its processing, it sends its results as reply 328 to server 303. Server 303 receives reply 328 and sends a copy of it to server 302 as reply 326, server 302 receives reply 326 and sends a copy of it to server 301 as reply 324. Finally, server 301 receives reply 324 and sends a copy of it as the final reply 322 to client 300.
FIG. 3C illustrates a third flow for communications between client 300 and the four servers 301, 302, 303, and 304. In this example, client 300 first sends request message 331 to server 301. When server 301 completes its processing, it sends its results to server 302 as message 332; server 302 sends its results to server 303 as message 333; server 303 sends its results to server 304 as message 334. Finally, server 304 sends the final reply 335 to client 300. In essence, the four reply messages 328, 326, 324, and 322 in FIG. 3B are replaced by a single reply message 335 in FIG. 3C, eliminating the need for each server to copy and resend the reply message. This can result in substantial savings in processing and communications bandwidth if the reply messages are large.
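The message counts of the three flows can be tallied directly from the figures: eight messages each in FIGS. 3A and 3B (311 through 318 and 321 through 328), and five in FIG. 3C (331 through 335). The following sketch generalizes that tally to a series cascade of n servers. The function names are hypothetical, and the generalization beyond four servers is an assumption made for illustration.

```python
def message_counts(n_servers: int) -> dict:
    """Total messages exchanged for one client request in each flow."""
    if n_servers < 1:
        raise ValueError("a cascade needs at least one server")
    return {
        "3A": 2 * n_servers,  # one request and one reply per server
        "3B": 2 * n_servers,  # n forwarded requests, n relayed replies
        "3C": n_servers + 1,  # n forwarded messages, one final reply
    }


def final_reply_transmissions(n_servers: int) -> dict:
    """How many times the final result crosses the network."""
    if n_servers < 1:
        raise ValueError("a cascade needs at least one server")
    return {"3B": n_servers, "3C": 1}
```

For four servers this gives 8, 8, and 5 messages, and the final result is transmitted four times in FIG. 3B but only once in FIG. 3C, which is where the savings for large replies arise.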
The four cascaded servers illustrated in FIG. 3C represent one possible server configuration: a series cascade of servers. More complex server configurations are possible, including servers operating in parallel. Robert N. Elens, in his doctoral dissertation at the University of Utah, Sequencing Computational Events in Heterogeneous Distributed Systems (June 1990), describes a technique for dynamically controlling the sequencing and configuration of servers.
In the following discussion, it will be convenient to refer to the sequential relationship of servers (in other words, the ordering of the flow of processing through the servers). If server A sends its result to server B, server A is the predecessor of server B and server B is the successor of server A. In FIG. 3C, for example, server 302 is the predecessor of server 303 and the successor of server 301. Any server which comes after a particular server in a cascade of servers is a subsequent server of that server. For example, in FIG. 3C servers 302, 303, and 304 are subsequent servers of server 301.
The last server in a cascade is the final server. In FIG. 3C, server 304 is the final server. All other servers in a cascade are intermediate servers. In FIG. 3C, servers 301, 302, and 303 are intermediate servers. In the following discussion, when the term "server" is used without a qualifier, it is synonymous with intermediate server.
If a server X is a subsequent server of server Y, and server Y is a subsequent server of server X, then server X and server Y are in a loop. It is unusual to find cascaded servers in a processing loop.
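The terms defined above can be made concrete by representing the flow of processing as a mapping from each server to its successors. The following is a sketch under that assumption; the representation and the names `Flow`, `predecessors`, `subsequent_servers`, `final_servers`, `in_loop`, and `FIG_3C` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set

# Processing flow: each server maps to the servers it sends results to.
Flow = Dict[str, List[str]]

# The series cascade of FIG. 3C, keyed by reference numeral.
FIG_3C: Flow = {"301": ["302"], "302": ["303"], "303": ["304"], "304": []}


def predecessors(flow: Flow, server: str) -> Set[str]:
    """Servers that send their results directly to `server`."""
    return {s for s, successors in flow.items() if server in successors}


def subsequent_servers(flow: Flow, server: str) -> Set[str]:
    """Every server that comes after `server` in the flow of processing."""
    seen: Set[str] = set()
    stack = list(flow.get(server, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(flow.get(current, []))
    return seen


def final_servers(flow: Flow) -> Set[str]:
    """Servers with no successor; all other servers are intermediate."""
    return {s for s, successors in flow.items() if not successors}


def in_loop(flow: Flow, x: str, y: str) -> bool:
    """True if each of x and y is a subsequent server of the other."""
    return (y in subsequent_servers(flow, x)
            and x in subsequent_servers(flow, y))
```

Applied to `FIG_3C`, the subsequent servers of 301 are 302, 303, and 304, the only final server is 304, and no pair of servers is in a loop.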
It is important to note that while the cascaded servers illustrated in FIG. 3C resemble a ring or loop as used in low-level data communications techniques (such as a token ring), they are quite different. In a ring communications system, any node can originate a message to be transferred through the other nodes until it reaches its destination node. To achieve this universal connectivity, it is necessary that the nodes of such a network be in a physical loop. In the cascaded server system illustrated in FIG. 3C, the processing flow originates at client 300 with request message 331, and ends when reply message 335 returns to client 300. What appears to be a loop or ring in FIG. 3C is broken by the presence of client 300, which does not transfer messages.
The concepts of a predecessor server, successor server, subsequent server, intermediate server, final server, and servers not in a processing loop all pertain to the flow of processing through the cascade of servers, and do not imply any particular structure of the underlying low-level data communications system. In particular, if processing flows from server R to server S, but not from server S to server R, server R and server S are not in a loop even though they may be connected by a ring network.
Error recovery is simple in the configuration illustrated in FIG. 3A. If there is either a communications failure between a server and client 300 or if a server fails before it can send its reply, a timeout as described above occurs and client 300 can resend the request to the appropriate server. If the server cannot be accessed, as indicated by more than a given number of retries being made, an attempt is made to locate an alternative server and send the request to that alternative server. These error recovery techniques are well known in the art of data communications systems and particularly client-server distributed systems.
Error recovery is considerably harder in the configurations illustrated in FIGS. 3B and 3C. A timeout in client 300 indicates that a failure has occurred in one of the servers or the various messages, but does not indicate the particular server or message causing the problem. It is necessary for client 300 to retry the entire operation, rather than simply retry the operation at the server that has failed. Because servers that have already successfully completed their processing must recompute their results, processing that should be unnecessary is required.
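The cost of retrying the entire operation can be illustrated with a small calculation. The sketch below assumes a series cascade in which each server has a known processing cost in arbitrary units and in which exactly one server fails; the function name `wasted_work` and the cost figures in the example are hypothetical.

```python
from typing import Sequence


def wasted_work(costs: Sequence[float], failed_index: int) -> float:
    """Processing repeated unnecessarily when the whole cascade is retried.

    costs[i] is the processing cost of the i-th server in the cascade, and
    failed_index identifies the server that failed. Every server before it
    had already completed successfully and must recompute its result.
    """
    if not 0 <= failed_index < len(costs):
        raise IndexError("failed_index must identify a server in the cascade")
    return sum(costs[:failed_index])
```

For example, with costs of 2, 3, 5, and 1 units for the four servers and a failure at the third server, a full retry repeats 5 units of work that a retry confined to the failed server would avoid.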