Distributed systems are an increasingly popular platform for a variety of functions, including storage controller functions. Their popularity arises from the flexibility and scalability such systems offer. High availability is also increasingly important for many applications. Distributed applications depend on network connectivity and communications capability to perform their function. Fault tolerance is therefore implemented in a number of mutually supporting ways, such as redundant network infrastructures or redundant storage attachments. These fault tolerance features improve the availability of the system.
Many systems implement retry algorithms for their network interfaces which operate as follows:
    1. An error happens, such as a packet being dropped.
    2. A timeout interval expires.
    3. The network hardware or protocol stack detects the error.
    4. The error is reported to the issuer of the request.
    5. The issuer attempts the request a second time, possibly using alternative hardware.
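The steps above can be illustrated with a minimal Python sketch. It is not taken from any particular system: `SimClock`, `SimPath`, `PathTimeout`, and `send_with_retry` are hypothetical names, the clock is simulated so that the example runs instantly, and the 10-second error timeout is only an illustrative value.

```python
class PathTimeout(Exception):
    """Reported to the issuer once a path's error timeout interval expires."""


class SimClock:
    """Simulated clock (an assumption), so the sketch never really sleeps."""

    def __init__(self):
        self.now = 0.0


class SimPath:
    """Hypothetical stand-in for one network path (adapter plus link)."""

    def __init__(self, name, clock, error_timeout_s, drops_packets):
        self.name = name
        self.clock = clock
        self.error_timeout_s = error_timeout_s
        self.drops_packets = drops_packets

    def send(self, payload):
        if self.drops_packets:
            # Steps 1-3: the packet is lost, and the loss is detected only
            # after the full error timeout interval has expired.
            self.clock.now += self.error_timeout_s
            # Step 4: the error is reported to the issuer.
            raise PathTimeout(f"{self.name}: no response within "
                              f"{self.error_timeout_s} s")
        return f"{self.name}: delivered {len(payload)} bytes"


def send_with_retry(payload, paths):
    """Try each path in turn; a retry starts only after the previous attempt
    has been reported as failed."""
    if not paths:
        raise ValueError("at least one path is required")
    last_error = None
    for path in paths:
        try:
            return path.send(payload)
        except PathTimeout as err:
            # Step 5: attempt the request again, using alternative hardware.
            last_error = err
    raise last_error


clock = SimClock()
paths = [
    SimPath("primary", clock, error_timeout_s=10.0, drops_packets=True),
    SimPath("secondary", clock, error_timeout_s=10.0, drops_packets=False),
]
result = send_with_retry(b"hello", paths)
print(result, "after", clock.now, "simulated seconds")
```

In this sketch the request eventually succeeds on the secondary path, but only after the whole simulated timeout on the primary path has elapsed, because the retry cannot begin until the first failure has been reported.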
Such schemes are simple, and are adequate when it is acceptable to wait for the original request to fail. However, for some important environments, bare availability (lack of actual failures while attempting to access a service) is insufficient. In these environments, it is important to receive a response within a certain time. Failure to respond within that time carries a penalty comparable to that of a full-scale failure in accessing the service. For example, another application might time out and return an error condition, or an Internet user might click away to a competitor's website in frustration.
Thus, it would be advantageous to provide a method that enables retries within a desired time limit, but existing systems do not naturally enable this. There are a number of problems to be addressed. First, the system might try to detect an error in a more timely fashion. It might be possible to reduce timeout intervals in interface adapters or hardware, but there are often architectural limits on how low these intervals can be set. For example, in Fibre Channel, an exchange that fails to complete normally cannot be reused until the expiry of an error timeout interval defined by the switch, often 10 seconds. Further, many network implementations do not behave robustly if the network error timeout interval is reduced too much.
It might be possible to use some other timeout mechanism, one not associated with the interface software or hardware that has failed, to attempt to redrive the request. This does not solve the problem, however, because the original request is still active. Any attempt to redrive the original request, using an alternative path offered by redundant hardware, will be blocked until the original request has completed, because resources associated with the original request are still in use.
As a concrete example, where a Fibre Channel adapter is used to implement the transmission interface and a multithreaded user process attempts to retransmit a buffer, the second transmission attempt will be blocked by the virtual memory system because the memory is still in use by the original transmission.
Another possible solution might be to attempt to avoid this memory blocking problem by retaining a copy of the transmission data in a private buffer. This copy could then be used to create a second transmission should the first be deemed to have taken too long. The private copy could be accessed without being blocked by the virtual memory manager, but this scheme would increase the cost of every transmission, including the majority which do not encounter a problem, because the data would have to be copied for each transmission before the first transmission attempt is made. Such an additional processing cost is likely to prove unacceptable in most modern networks.
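The cost of the private-copy scheme can be illustrated with a short Python sketch. The `CopyingSender` class and its `transmit` callback are hypothetical and model only the bookkeeping: every payload is copied before the first attempt, whether or not a retransmission is ever needed.

```python
class CopyingSender:
    """Hypothetical sender that keeps a private copy of every payload."""

    def __init__(self, transmit):
        # transmit: callable taking the data and returning True when the
        # transmission completed in time, False when it took too long.
        self.transmit = transmit
        self.bytes_copied = 0
        self.retransmissions = 0

    def send(self, buffer):
        # The copy is made before the first attempt, on every call.
        # bytes() copies the data when given a bytearray.
        private_copy = bytes(buffer)
        self.bytes_copied += len(private_copy)
        if self.transmit(buffer):
            return True
        # The first attempt was too slow. The original buffer is still in
        # use, so the second attempt is built from the private copy.
        self.retransmissions += 1
        return self.transmit(private_copy)


# Simulated outcomes: only the third message's first attempt is too slow.
outcomes = iter([True, True, False, True])
sender = CopyingSender(lambda data: next(outcomes))
for _ in range(3):
    sender.send(bytearray(1024))
print(sender.bytes_copied, "bytes copied for",
      sender.retransmissions, "retransmission(s)")
```

In this sketch all three payloads are copied even though only one of them is retransmitted, which is the per-transmission overhead described above.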