Personal computers or workstations may be linked in a computer network to allow the sharing of data, applications, files, and other resources. In a client/server network, the sharing of resources is accomplished through the use of a file server. The file server includes a processing unit that is dedicated to managing centralized resources and to sharing these resources with the personal computers and workstations, which are known as the "clients" of the server.
Each client includes a central processing unit ("CPU"), a memory which is directly accessible to the CPU, and one or more input/output ("I/O") devices such as a screen, keyboard, mouse, and the like. The client's memory may be battery-backed, but more often the memory is volatile memory (unbacked RAM) that loses data if power to the client is interrupted or if the client is rebooted. Some clients also have local non-volatile data stores, such as a local hard disk, while other clients have no local non-volatile store.
Each file server has a CPU, a volatile memory, various I/O devices, and at least one non-volatile data store. The non-volatile data store includes a controller and a non-volatile medium. To access the medium, the server's CPU sends read and write requests to the controller, which then attempts to carry out the request. Write requests are accompanied by file data which is destined for storage on the medium, or by a memory address identifying the present location of such data. Read requests are accompanied by a memory address which identifies memory destined to receive file data copied from the medium.
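The request pattern described above can be illustrated with a minimal sketch. The class name `Controller`, its method signatures, and the buffer sizes are assumptions chosen for illustration; they model only the idea that a write request names the memory holding the data and a read request names the memory that is to receive it.

```python
class Controller:
    """Hypothetical controller in front of a non-volatile medium."""

    def __init__(self, medium_size):
        self.medium = bytearray(medium_size)  # stands in for the disk

    def write(self, offset, memory, address, length):
        # The write request carries a memory address identifying
        # the present location of the file data.
        self.medium[offset:offset + length] = memory[address:address + length]

    def read(self, offset, memory, address, length):
        # The read request carries a memory address identifying
        # the memory destined to receive the file data.
        memory[address:address + length] = self.medium[offset:offset + length]


memory = bytearray(64)            # server memory (size is arbitrary)
memory[0:5] = b"hello"
controller = Controller(medium_size=128)

controller.write(offset=10, memory=memory, address=0, length=5)
controller.read(offset=10, memory=memory, address=32, length=5)
```

After these two requests, bytes 32 through 36 of `memory` hold the data that was copied back from the medium.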
Thus, in the client/server network there are four locations that might reasonably be used to hold part or all of a file that is being used by a process which resides on a particular client: the server's non-volatile store, the server's memory, the client's non-volatile store, and the client's memory. Each location has relative advantages and drawbacks.
As noted, some clients do not have a local non-volatile store. Thus, the client store is not suitable for use as a necessary part of any general approach and will not be discussed further.
Storing file data on the server's non-volatile store is generally favored for several reasons. Placing file data on the server's store makes that data potentially accessible to all clients so that access is denied only as needed for security purposes. For this reason, the larger capacity stores in a network tend to be attached to the server, which in turn makes the server's store more attractive because more storage room is available (at least initially). In addition, keeping the file data on the server store prevents inconsistency because there is only one authoritative copy of the data.
However, placing all file data on the server's store typically decreases network performance. Before the client can read the file data, the data must be transferred by the controller from the non-volatile medium into the server's memory, and then the data must be transferred over the network to the client's memory. Both transfers take time. In particular, the transfer from non-volatile store to server memory tends to be very slow when compared to the time needed to read the same data from memory. Write operations incur similar delays.
Accordingly, many known systems "cache" file data by keeping a copy of the data in a region of the server's memory known as the "server cache." If a client reads data that happens to be in the server cache, no additional transfer from the non-volatile store is needed. Likewise, if a client writes file data that corresponds (i.e., is destined for the same position in the file) to data in the server cache, the new data can be simply written to the server's memory. The new data must be written to the non-volatile store eventually, but that transfer can be delayed until a later time if the controller is busy at present. The server can still avoid inconsistencies by coordinating its memory and non-volatile store usage to ensure that clients see only the most current copy of the data.
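The behavior of a server cache with delayed writes may be sketched as follows. This is a simplified model under stated assumptions, not an implementation of any particular system: the names `ServerStore`, `ServerCache`, and `flush` are hypothetical, and file positions are reduced to integer keys.

```python
class ServerStore:
    """Stands in for the controller and non-volatile medium."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0
        self.writes = 0

    def read(self, pos):
        self.reads += 1
        return self.blocks.get(pos)

    def write(self, pos, data):
        self.writes += 1
        self.blocks[pos] = data


class ServerCache:
    """Region of server memory holding copies of file data."""

    def __init__(self, store):
        self.store = store
        self.cached = {}
        self.dirty = set()  # positions not yet written to the store

    def read(self, pos):
        if pos not in self.cached:        # miss: one slow transfer
            self.cached[pos] = self.store.read(pos)
        return self.cached[pos]           # hit: served from memory

    def write(self, pos, data):
        self.cached[pos] = data           # new data goes to memory only
        self.dirty.add(pos)

    def flush(self):
        # The delayed transfer to the non-volatile store.
        for pos in sorted(self.dirty):
            self.store.write(pos, self.cached[pos])
        self.dirty.clear()


store = ServerStore()
store.blocks[0] = b"old"
cache = ServerCache(store)

cache.read(0)
cache.read(0)                  # second read needs no store transfer
cache.write(0, b"new")
latest = cache.read(0)         # clients see the most current copy
writes_before_flush = store.writes
cache.flush()
```

In this sketch the two reads cost one store transfer, and the write reaches the store only when `flush` is called.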
However, caching data in the server's memory does not eliminate the need to transfer that data over the network between the client and the server. The only way to eliminate the network transfer is to cache the data in a portion of the client's memory known as the "client cache." Unfortunately, client caching can create inconsistent versions of the cached data when more than one client caches the same data.
Consider the situation in which client A and client B each read the same data from the server into their respective client caches and then each modify the data differently without either client being informed that the other client is also modifying the data. If client C then reads the data from the server, C will get the old version instead of getting one of the more recent versions. Moreover, when the modified data is written back to the server for storage, the version that happens to be written last will overwrite the version that was written first. Thus, if A's data is written last, then B's changes will be lost and B will not be informed of the loss.
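The scenario just described can be traced with a small sketch. The dictionary contents and the key `"balance"` are hypothetical values chosen only to make the lost change visible.

```python
# Hypothetical file contents held on the server.
server_copy = {"balance": 100}

# Clients A and B each read the same data into their own caches.
cache_a = dict(server_copy)
cache_b = dict(server_copy)

# Each modifies its cached copy without informing the other.
cache_a["balance"] += 50
cache_b["balance"] -= 30

# Client C reads from the server and gets the old version.
seen_by_c = dict(server_copy)

# B's version is written back first, then A's version overwrites it.
server_copy.update(cache_b)
server_copy.update(cache_a)
```

The server ends with A's version only. B's change is gone, and nothing in this sequence informs B of the loss.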
One solution is to claim exclusive control of the file or of the relevant portion of the file. This may be accomplished by locking the file so that only one client (or process) at a time is allowed to write to the file. If several files are needed, they may be claimed on an exclusive basis by setting a file semaphore. Thus, B may claim the file by locking the file or by setting the semaphore, make its changes, and then release the file by unlocking the file or by clearing the semaphore, as appropriate. A may then claim the file, overwrite B's changes, and then release the file. B's changes are still lost, but notification of the loss can be performed in connection with the file lock or semaphore operations so that B is informed. Moreover, a log of the changes can be maintained so that earlier versions of the data can be recovered at need. The log is updated each time the file is locked or unlocked, and each time a semaphore claiming the file is set or cleared.
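A minimal sketch of the lock, notify, and log sequence follows. The class `LockedFile`, its fields, and the notice format are assumptions made for illustration; the sketch models a whole-file lock only and does not model semaphores that claim several files.

```python
class LockedFile:
    """Hypothetical file with an exclusive lock, a log, and notices."""

    def __init__(self, data):
        self.data = data
        self.owner = None
        self.last_writer = None
        self.log = []       # updated on every lock and unlock
        self.notices = []   # (client informed, message)
        self._data_at_lock = data

    def lock(self, client):
        if self.owner is not None:
            raise RuntimeError(f"file is locked by {self.owner}")
        self.owner = client
        self._data_at_lock = self.data
        self.log.append(("lock", client, self.data))

    def write(self, client, data):
        if self.owner != client:
            raise PermissionError(f"{client} does not hold the lock")
        self.data = data

    def unlock(self, client):
        if self.owner != client:
            raise PermissionError(f"{client} does not hold the lock")
        if self.data != self._data_at_lock:
            # Inform the previous writer that its changes were replaced.
            if self.last_writer not in (None, client):
                self.notices.append(
                    (self.last_writer, f"changes overwritten by {client}"))
            self.last_writer = client
        self.owner = None
        self.log.append(("unlock", client, self.data))


f = LockedFile("original")
f.lock("B"); f.write("B", "B's edit"); f.unlock("B")
f.lock("A"); f.write("A", "A's edit"); f.unlock("A")
```

B's changes are still overwritten, but B receives a notice, and the log entry made at B's unlock retains B's version so that it can be recovered.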
Another approach to preventing inconsistencies caused by client caching is to force write requests all the way through to the server store each time they are made. Under this "write-through" approach, a process that writes data to the client cache is suspended until the controller commits the data to the non-volatile medium on the server and sends an acknowledgement of that committal back to the client. After the acknowledgement arrives, the process resumes and the region of client memory used to cache the data is considered free for other uses.
Thus, write-through minimizes the loss of data that has reached the server cache but has not yet been committed to the store, because data remains in the server cache no longer than is necessary before it is stored. Write-through allows logs and notifications similar to those associated with file locks and semaphores. Write-through also eliminates inconsistencies by placing all file data in one authoritative copy on the server's store.
However, because write-through effectively eliminates caching for writes, it may significantly decrease network performance. Suspending the process while waiting for the network and controller transfers to complete often increases the time required for the process to finish because the process could have been doing other work while these transfers were proceeding. Write-through also tends to increase the number of network and controller transfers. Under a cached approach, for instance, overwriting the same region of the file five times could result in the transfer of only the most recent copy of the data, whereas write-through would cause five network transfers and five transfers to the non-volatile store.
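The transfer counts in the five-overwrite example can be checked with a short sketch. The function names are hypothetical, and the cached case assumes that the client cache is flushed once, after all five writes have been made.

```python
def write_through_transfers(writes):
    """Every write crosses the network and reaches the store."""
    network = 0
    store = 0
    for _pos, _data in writes:
        network += 1   # client to server
        store += 1     # server memory to non-volatile medium
    return network, store


def cached_transfers(writes):
    """Writes accumulate in the client cache; one flush at the end."""
    client_cache = {}
    for pos, data in writes:
        client_cache[pos] = data      # later writes replace earlier ones
    network = len(client_cache)       # only the most recent copy is sent
    store = len(client_cache)
    return network, store


# Overwrite the same region of the file five times.
writes = [(0, f"version {i}") for i in range(5)]

through = write_through_transfers(writes)
cached = cached_transfers(writes)
```

Here `through` is `(5, 5)` and `cached` is `(1, 1)`, matching the counts given above.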
To improve system performance, some approaches consider the file data "written" as soon as the data reaches the server cache. Thus, the process is suspended only for the time needed to transfer the data across the network and to transfer an acknowledgement back to the client. The process is not forced to wait for an acknowledgement that the data has been transferred to the non-volatile medium. Likewise, the portion of client cache holding the data is considered free for re-use as soon as the server acknowledges receipt of the data.
However, treating data as written once it reaches the server cache leaves the data vulnerable to network faults. Network faults may occur as a result of hardware or software problems. Some network faults cause the server to reboot in the midst of a client process, thereby destroying any data in the server cache. Such data is lost if it has not yet been transferred from the server cache to the server store. Rebooting also eliminates file locks or semaphores held by the client.
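The exposure created by early acknowledgement can be sketched as follows. The `Server` class and its methods are hypothetical and model only a volatile cache, a non-volatile store, and a reboot that occurs before the cache has been flushed.

```python
class Server:
    """Hypothetical server with a volatile cache and a durable store."""

    def __init__(self):
        self.cache = {}   # volatile: destroyed by a reboot
        self.store = {}   # non-volatile: survives a reboot

    def receive(self, pos, data):
        self.cache[pos] = data
        return "ack"      # acknowledged before reaching the store

    def flush(self):
        self.store.update(self.cache)

    def reboot(self):
        self.cache = {}


server = Server()
client_cache = {0: "report, second draft"}

ack = server.receive(0, client_cache[0])
if ack == "ack":
    # The client treats the data as written and frees its cache region.
    del client_cache[0]

server.reboot()   # fault occurs before flush() was ever called

recoverable = server.store.get(0) or client_cache.get(0)
```

After the reboot no copy of the data remains in the server cache, the server store, or the client cache, so `recoverable` is `None`.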
Other network faults occur when a cable between the client and server is disconnected. The server regularly "pings" the client to determine if the client is still attached to the server. If the client does not respond to the server's ping after some predetermined period of time, the server logs the client off. From the client's point of view, being unilaterally logged off by the server causes many of the same problems as having the server suddenly reboot.
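The ping and log-off behavior may be sketched with explicit timestamps. The class `SessionTable`, the 30-unit timeout, and the client names are assumptions for illustration; the text above does not specify the length of the predetermined period.

```python
class SessionTable:
    """Hypothetical record of when each client last answered a ping."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_reply = {}

    def log_in(self, client, now):
        self.last_reply[client] = now

    def ping_reply(self, client, now):
        if client in self.last_reply:
            self.last_reply[client] = now

    def expire(self, now):
        # Log off every client that has been silent for too long.
        dropped = [c for c, t in self.last_reply.items()
                   if now - t > self.timeout]
        for c in dropped:
            del self.last_reply[c]
        return dropped


sessions = SessionTable(timeout=30)   # assumed period, arbitrary units
sessions.log_in("A", now=0)
sessions.log_in("B", now=0)

sessions.ping_reply("A", now=20)      # B's cable is disconnected
dropped = sessions.expire(now=40)
```

Client B is logged off without any action on its part, while client A, which answered a ping at time 20, remains attached.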
If the client treats data in the volatile server cache as being effectively written to the non-volatile server store, the client will not be informed when the cached data is damaged or lost. The client may even subsequently overwrite the only remaining copy of the data, which is in the client cache.
File locks and semaphores that are recorded only in server memory are also damaged or lost when a network fault occurs. The client process must either reconstruct the current locks and semaphores and then re-establish them with the server, or else lose the needed exclusive control. Unfortunately, many existing applications are not constructed to maintain an internal copy of the current locks and semaphores, nor are they constructed to restore those locks and semaphores after a network fault. As a result, locks and semaphores are not restored, files are corrupted, and data is lost.
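What re-establishing locks would require of a client can be sketched as follows. This is an illustrative model only: the classes `LockServer` and `Client`, the `restore_locks` method, and the file name are hypothetical, and the sketch assumes that no other client claims the file between the fault and the restoration.

```python
class LockServer:
    """Hypothetical server that records locks only in volatile memory."""

    def __init__(self):
        self.locks = {}   # file name -> client holding the lock

    def lock(self, client, filename):
        holder = self.locks.setdefault(filename, client)
        return holder == client

    def reboot(self):
        self.locks = {}   # all lock records are lost


class Client:
    """Hypothetical client that keeps its own copy of the locks it holds."""

    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.held = set()

    def lock(self, filename):
        if self.server.lock(self.name, filename):
            self.held.add(filename)
            return True
        return False

    def restore_locks(self):
        # Re-establish every remembered lock with the server.
        return {f for f in self.held if self.server.lock(self.name, f)}


server = LockServer()
a = Client("A", server)
a.lock("ledger.dat")

server.reboot()
locks_after_fault = dict(server.locks)
restored = a.restore_locks()
```

An application that keeps no such internal copy has nothing from which to rebuild its locks, which is the situation described above for many existing applications.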
In summary, considering data written once it reaches the server cache improves performance but leaves the data vulnerable to network faults that alter server memory. Using the write-through approach provides file integrity and reduces or eliminates data loss, but does so at a heavy performance cost. Both approaches leave client data vulnerable to the loss of file locks and semaphores.
Thus, it would be an advancement in the art to provide a method and apparatus for restoring file data after a network fault while still obtaining the performance benefits of caching data during at least some write operations.
It would also be an advancement to provide such a method and apparatus which assists in properly restoring the current file locks and semaphore settings after a network fault.
Such a method and apparatus are disclosed and claimed herein.