Authorization Pursuant to 37 C.F.R. § 1.17(e)
A portion of the disclosure of this patent document contains command formats and other computer language listings all of which are subject to copyright protection. The copyright owner, EMC Corporation, has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
1. Field of the Invention
The present invention relates generally to a file server employing a plurality of processing units. More particularly, the invention relates to failover services for resuming interrupted operations of a failed processing unit with little or no client involvement.
2. Background Art
Transactional semantics are typically employed between a host computer or client and a data storage system or file server to permit recovery from a failed processor in the data storage system or file server. The host computer or client sends to the data storage system or file server a command or command chain defining a transaction for which the data storage system or file server must acknowledge completion before committing results of any following commands.
In an environment employing commands in the IBM Corporation count-key-data (CKD) format, for example, all of the commands for a single input/output operation for a single logical volume are included in a single channel command word (CCW) chain. The data storage system acknowledges completion of each write command by returning a channel end (CE) and device end (DE) to the host computer. The results of all channel command words of this single input/output operation are to be committed before commitment of the results of any following CCWs. Once the host processor sends the entire chain to the data storage system, it need not poll for a response; instead, the host typically continues with other operations, and is interrupted when the channel adapter responds with a device end (DE) signal indicating that the results of the last CCW in the chain have been committed.
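By way of illustration only, the ordering rule for a command chain may be sketched as follows. The sketch is not CKD channel code: the class `Volume`, the method `run_chain`, the tuple layout of a command, and the use of a staging dictionary are all assumptions made for the example. It shows only that the results of one chain are committed together, and that completion is signalled after the last command in the chain has been committed.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Hypothetical stand-in for a single logical volume."""
    committed: dict = field(default_factory=dict)

    def run_chain(self, chain):
        """Process one command chain as a single transaction.

        Each command is an assumed (op, key, value) tuple. Writes are
        staged first; nothing is committed unless the whole chain is valid.
        """
        staged = {}
        for op, key, value in chain:
            if op != "write":
                # Reject the whole chain; staged results are discarded.
                raise ValueError(f"unsupported command: {op}")
            staged[key] = value
        # Commit the results of every command in the chain together,
        # before any following chain can be processed.
        self.committed.update(staged)
        # Assumed status token standing in for the device end signal.
        return "DE"
```

In this sketch a chain that fails part way leaves the volume unchanged, so a host that re-sends the chain sees the same starting state.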
In an open systems environment, a data storage system typically handles each input/output command as a separate transaction and acknowledges completion of each input/output command. If a problem occurs, the data storage system returns a "unit check" with appropriate sense bytes to the host. This causes the host to retry the input/output operation.
By employing transactional semantics, a failure of a redundant processor in the data storage system or file server will not usually disrupt the operation of the host computer or client any more than a simple failure of the data link between the data storage system or file server and the host computer or client. Upon failing to receive an acknowledgement of completion of a transaction, the host computer or client re-sends the transaction. If the data storage system or file server continues to fail to acknowledge completion of the transaction, the host computer or client may re-send the transaction over an alternative data link to the data storage system or file server.
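The client-side recovery described above may be sketched as follows, for illustration only. The names `Link` and `send_with_retry`, the per-link retry count, and the modelling of a data link as a callable that returns True on acknowledgement are assumptions of the example and are not taken from any particular host or client implementation.

```python
from typing import Callable, Sequence

# Hypothetical model of a data link: a callable that sends a transaction
# and returns True if completion was acknowledged, False otherwise.
Link = Callable[[bytes], bool]


def send_with_retry(transaction: bytes,
                    links: Sequence[Link],
                    retries_per_link: int = 2) -> int:
    """Re-send an unacknowledged transaction, then try alternative links.

    Returns the index of the link on which the transaction was
    acknowledged. Raises ConnectionError if no link acknowledges it.
    """
    for index, link in enumerate(links):
        for _ in range(retries_per_link):
            if link(transaction):
                return index
        # This link never acknowledged; fall through to the next one.
    raise ConnectionError("transaction was not acknowledged on any link")
```

Because the whole transaction is re-sent, the sketch relies on the server not having committed partial results of an unacknowledged transaction.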
The use of transactional semantics and the re-try of unacknowledged transactions is a good technique for contending with processor failures in a data storage system or file server in which the transactions are primarily read and write operations. However, network-attached file servers, and video file servers in particular, perform data streaming operations for which the re-try of unacknowledged transactions has some undesirable consequences. A data streaming operation requires exclusive use of certain valuable resources, such as buffer memory, a dedicated port in the file server, and a dedicated network data link. Therefore, the file server should detect processor failure without reliance on the client in order to free up the dedicated resources as soon as possible. Moreover, a data streaming operation that directs data to a network destination other than the client may also involve other clients or consumers having a special interest in minimizing delay or disruption in the streaming of the data. For example, in a video file server application, the data may be viewed in real time by an ultimate consumer, and any delay in transmission in excess of the consumer's buffering capabilities will interfere with the consumer's viewing pleasure.