A microfiche appendix, submitted on CD-R media, that accompanies this substitute specification:
1. has images of FIGS. 4, 5, 6, 9, 10, 11, 12, 13, 14 from published Patent Cooperation Treaty ("PCT") International Application WO 93/24890 that have been previously published in issued U.S. Pat. Nos. 5,611,049, 5,892,914, 6,026,452 and 6,205,475; and
2. contains the following file(s):
FIG04.pdf created 03/30/01 at 09:31a contains 71,147 bytes,
FIG05.pdf created 03/30/01 at 09:31a contains 55,337 bytes,
FIG06.pdf created 03/30/01 at 09:32a contains 27,242 bytes,
FIG09.pdf created 03/30/01 at 09:32a contains 26,513 bytes,
FIG10.pdf created 03/30/01 at 09:33a contains 57,061 bytes,
FIG11.pdf created 03/30/01 at 09:34a contains 132,278 bytes,
FIG12.pdf created 03/30/01 at 09:35a contains 33,926 bytes,
FIG13.pdf created 03/30/01 at 09:35a contains 27,387 bytes,
FIG14.pdf created 03/30/01 at 09:35a contains 10,056 bytes,
is hereby incorporated by reference.
The present invention relates generally to the technical field of multi-processor digital computer systems and, more particularly, to multi-processor computer systems in which:
1. the processors are loosely coupled or networked together;
2. data needed by some of the processors is controlled by a different processor that manages the storage of and access to the data;
3. processors needing access to data request such access from the processor that controls the data;
4. the processor controlling data provides requesting processors with access to it.
Within a digital computer system, processing data stored in a memory; e.g., a Random Access Memory ("RAM") or on a storage device such as a floppy disk drive, a hard disk drive, a tape drive, etc.; requires copying the data from one location to another prior to processing. Thus, for example, prior to processing data stored in a file in a comparatively slow speed storage device such as a hard disk, the data is first copied from the computer system's hard disk to its much higher speed RAM. After data has been copied from the hard disk to the RAM, the data is again copied from the RAM to the computer system's processing unit where it is actually processed. Each of these copies of the data, i.e., the copy of the data stored in the RAM and the copy of the data processed by the processing unit, can be considered to be an image of the data stored on the hard disk. Each of these images of the data may be referred to as a projection of the data stored on the hard disk.
In a loosely coupled or networked computer system having several processors that operate autonomously, the data needed by one processor may be accessed only by communications passing through one or more of the other processors in the system. For example, in a Local Area Network ("LAN") such as Ethernet one of the processors may be dedicated to operating as a file server that receives data from other processors via the network for storage on its hard disk, and supplies data from its hard disk to the other processors via the network. In such networked computer systems, data may pass through several processors in being transmitted from its source at one processor to the processor requesting it.
In some networked computer systems, images of data are transmitted directly from their source to a requesting processor. One operating characteristic of networked computer systems of this type is that, as the number of requests for access to data increases and/or the amount of data being transmitted in processing each request increases, ultimately the processor controlling access to the data or the data transmission network becomes incapable of responding to requests within an acceptable time interval. Thus, in such networked computer systems, an increasing workload on the processor controlling access to data or on the data transmission network ultimately causes unacceptably long delays between a processor's request to access data and completion of the requested access.
In an attempt to reduce delays in providing access to data in networked computer systems, there presently exist systems that project an image of data from its source into an intermediate storage location in which the data is more accessible than at the source of the data. The intermediate storage location in such systems is frequently referred to as a "cache," and systems that project images of data into a cache are referred to as "caching" systems.
An important characteristic of caching systems, frequently referred to as "cache consistency" or "cache coherency," is their ability to simultaneously provide all processors in the networked computer system with identical copies of the data. If several processors concurrently request access to the same data, one processor may be updating the data while another processor is in the process of referring to the data being updated. For example, in commercial transactions occurring on a networked computer system one processor may be accessing data to determine if a customer has exceeded their credit limit while another processor is simultaneously posting a charge against that customer's account. If a caching system lacks cache consistency, it is possible that one processor's access to data to determine if the customer has exceeded their credit limit will use a projected image of the customer's data that has not been updated with the most recent charge. Conversely, in a caching system that possesses complete or absolute cache consistency, the processor that is checking the credit limit is guaranteed that the data it receives incorporates the most recent modifications.
One presently known system that employs data caching is the Berkeley Software Distribution ("BSD") 4.3 version of the Unix timesharing operating system. The BSD 4.3 system includes a buffer cache located in the host computer's RAM for storing projected images of blocks of data, typically 8 k bytes, from files stored on a hard disk drive. Before a particular item of data may be accessed on a BSD 4.3 system, the requested data must be projected from the hard disk into the buffer cache. However, before the data may be projected from the disk into the buffer cache, space must first be found in the cache to store the projected image. Thus, for data that is not already present in a BSD 4.3 system's buffer cache, the system must perform the following steps in providing access to the data:
Locate the buffer in the RAM that contains the Least Recently Used (xe2x80x9cLRUxe2x80x9d) block of disk data.
Discard the LRU block of data which may entail writing that block of data back to the hard disk.
Project an image of the requested block of data into the now empty buffer.
Provide the requesting processor with access to the data.
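The four steps listed above may be illustrated by the following minimal sketch. It is not BSD 4.3 code. The class name BufferCache, its methods, and the use of a Python dictionary in place of the hard disk are all assumptions introduced solely for illustration.

```python
from collections import OrderedDict


class BufferCache:
    """Toy buffer cache. `disk` is a dict standing in for the hard disk (assumption)."""

    def __init__(self, disk, capacity):
        self.disk = disk
        self.capacity = capacity
        # block number -> [data, dirty flag]; least recently used entry comes first
        self.buffers = OrderedDict()

    def _get_buffer(self, block_no):
        if block_no in self.buffers:
            # Image already projected: only the last step is needed.
            self.buffers.move_to_end(block_no)
            return self.buffers[block_no]
        if len(self.buffers) >= self.capacity:
            # Locate the buffer holding the LRU block and discard it,
            # writing it back to the disk first if it was modified.
            lru_no, (data, dirty) = self.buffers.popitem(last=False)
            if dirty:
                self.disk[lru_no] = data
        # Project an image of the requested block into the now empty buffer.
        entry = [self.disk[block_no], False]
        self.buffers[block_no] = entry
        return entry

    def read(self, block_no):
        # Provide the requester with access to the data.
        return self._get_buffer(block_no)[0]

    def write(self, block_no, data):
        entry = self._get_buffer(block_no)
        entry[0] = data
        entry[1] = True  # the disk copy is now out of date
```

In this sketch a modified block reaches the disk only when its buffer is reclaimed, so the disk may lag behind the cache for as long as the block stays in use.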
If the data being accessed by a processor is already present in a BSD 4.3 system's data cache, then responding to a processor's request for access to data requires only the last operation listed above. Because accessing data stored in RAM is much faster than accessing data stored on a hard disk, a BSD 4.3 system responds to requests for access to data that is present in its buffer cache in approximately 1/250th the time that it takes to respond to a request for access to data that is not already present in the buffer cache.
The consistency of data images projected into the buffer cache in a BSD 4.3 system is excellent. Since the only path from processors requesting access to data on the hard disk is through the BSD 4.3 system's buffer cache, out of date blocks of data in the buffer cache are always overwritten by their more current counterpart when that block's data returns from the accessing processor. Thus, in the BSD 4.3 system an image of data in the system's buffer cache always reflects the true state of the file. When multiple requests contend for the same image, the BSD 4.3 system queues the requests from the various processors and sequences the requests such that each request is completely serviced before any processing commences on the next request. Employing the preceding strategy, the BSD 4.3 system ensures the integrity of data at the level of individual requests for access to segments of file data stored on a hard disk.
Because the BSD 4.3 system provides access to data from its buffer cache, blocks of data on the hard disk frequently do not reflect the true state of the data. That is, in the BSD 4.3 system, frequently the true state of a file exists in the projected image in the system's buffer cache that has been modified since being projected there from the hard disk, and that has not yet been written back to the hard disk. In the BSD 4.3 system, images of data that are more current than and differ from their source data on the hard disk may persist for very long periods of time, finally being written back to the hard disk just before the image is about to be discarded due to its "death" by the LRU process. Conversely, other caching systems exist that maintain data stored on the hard disk current with its image projected into a data cache. Network File System ("NFS®") is one such caching system.
In many ways, NFS's client cache resembles the BSD 4.3 system's buffer cache. In NFS, each client processor that is connected to a network may include its own cache for storing blocks of data. Furthermore, similar to BSD 4.3, NFS uses the LRU algorithm for selecting the location in the client's cache that receives data from an NFS server across a network such as Ethernet. However, perhaps one of NFS's most significant differences is that images of blocks of data are not retrieved into NFS's client cache from a hard disk attached directly to the processor as in the BSD 4.3 system. Rather, in NFS images of blocks of data come to NFS's client cache from a file server connected to a network such as Ethernet.
The NFS client cache services requests from a computer program executed by the client processor using the same general procedures described above for the BSD 4.3 system's buffer cache. If the requested data is already projected into the NFS client cache, it will be accessed almost instantaneously. If requested data is not currently projected into NFS's client cache, the LRU algorithm must be used to determine the block of data to be replaced, and that block of data must be discarded before the requested data can be projected over the network from the file server into the recently vacated buffer.
In the NFS system, accessing data that is not present in its client cache takes approximately 500 times longer than accessing data that is present there. About one-half of this delay is due to the processing required for transmitting the data over the network from an NFS file server to the NFS client cache. The remainder of the delay is the time required by the file server to access the data on its hard disk and to transfer the data from the hard disk into the file server's RAM.
In an attempt to reduce this delay, client processors read ahead to increase the probability that needed data will have already been projected over the network from the file server into the NFS client cache. When NFS detects that a client processor is accessing a file sequentially, blocks of data are asynchronously pre-fetched in an attempt to have them present in the NFS client cache when the client processor requests access to the data. Furthermore, NFS employs an asynchronous write behind mechanism to transmit all modified data images present in the client cache back to the file server without delaying the client processor's access to data in the NFS client cache until NFS receives confirmation from the file server that it has successfully received the data. Both the read ahead and the write behind mechanisms described above contribute significantly to NFS's reasonably good performance. Also contributing to NFS's good performance is its use of a cache for directories of files present on the file server, and a cache for attributes of files present on the file server.
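The read ahead behavior described above may be sketched as follows. This is a simplified illustration, not NFS code: the class ReadAheadCache, the fetch_block callable that stands in for a round trip to the file server, and the depth parameter are hypothetical, and the pre-fetch is performed synchronously here whereas the text describes it as asynchronous.

```python
class ReadAheadCache:
    """Toy client cache that pre-fetches when it sees sequential access."""

    def __init__(self, fetch_block, depth=2):
        self.fetch_block = fetch_block  # callable(index) -> data (assumed server access)
        self.depth = depth              # number of blocks to fetch ahead (assumption)
        self.blocks = {}                # block index -> data
        self.last_index = None
        self.misses = 0                 # requests the client had to wait for

    def read(self, index):
        if index not in self.blocks:
            self.misses += 1
            self.blocks[index] = self.fetch_block(index)
        data = self.blocks[index]
        # Sequential access detected: this block immediately follows the last one.
        if self.last_index is not None and index == self.last_index + 1:
            for ahead in range(index + 1, index + 1 + self.depth):
                if ahead not in self.blocks:
                    self.blocks[ahead] = self.fetch_block(ahead)
        self.last_index = index
        return data
```

Under these assumptions, a client reading blocks 0, 1, 2, 3 in order waits for the server only on the first two requests; the later blocks are already present when requested.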
Several features of NFS reduce the consistency of its projected images of data. For example, images of file data present in client caches are re-validated every 3 seconds. If an image of a block of data about to be accessed by a client is more than 3 seconds old, NFS contacts the file server to determine if the file has been modified since the file server originally projected the image of this block of data. If the file has been modified since the image was originally projected, the image of this block in the NFS client cache and all other projected images of blocks of data from the same file are removed from the client cache. When this occurs, the buffers in RAM thus freed are queued at the beginning of a list of buffers (the LRU list) that are available for storing the next data projected from the file server. The images of blocks of data discarded after a file modification are re-projected into NFS's client cache only if the client processor subsequently accesses them.
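The re-validation rule described above, and the window of inconsistency it leaves open, may be sketched as follows. This is an illustrative model only, not the NFS implementation: the classes FileServer and ClientCache, the per-file modification counter, and the injected clock function are assumptions, and the LRU list of freed buffers is omitted.

```python
class FileServer:
    """Hypothetical server: file name -> (modification counter, list of blocks)."""

    def __init__(self):
        self.files = {}

    def write(self, name, blocks):
        version = self.files.get(name, (0, None))[0] + 1
        self.files[name] = (version, list(blocks))

    def version(self, name):
        return self.files[name][0]

    def read_block(self, name, index):
        return self.files[name][1][index]


class ClientCache:
    REVALIDATE_AFTER = 3.0  # seconds, per the text above

    def __init__(self, server, clock):
        self.server = server
        self.clock = clock   # callable returning the current time in seconds
        self.blocks = {}     # (name, index) -> (data, time validated, file version)

    def read(self, name, index):
        key = (name, index)
        now = self.clock()
        image = self.blocks.get(key)
        if image is not None:
            data, validated_at, version = image
            if now - validated_at <= self.REVALIDATE_AFTER:
                return data  # young image: used without contacting the server
            if self.server.version(name) == version:
                self.blocks[key] = (data, now, version)  # re-validated
                return data
            # File modified: remove every projected block of this file.
            for stale in [k for k in self.blocks if k[0] == name]:
                del self.blocks[stale]
        version = self.server.version(name)
        data = self.server.read_block(name, index)
        self.blocks[key] = (data, now, version)
        return data
```

In this model a client that reads a block less than 3 seconds after projecting it receives the cached image even if another client has already changed the file on the server.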
If a client processor modifies a block of image data present in the NFS client cache, to update the file on the file server NFS asynchronously transmits the modified data image back to the server. Only when another client processor subsequently attempts to access a block of data in that file will its cache detect that the file has been modified.
Thus, NFS provides client processors with data images of poor consistency at reasonably good performance. However, NFS works only for those network applications in which client processors don't share data or, if they do share data, they do so under the control of a file locking mechanism that is external to NFS. There are many classes of computer application programs that execute quite well if they access files directly using the Unix File System that cannot use NFS because of the degraded images projected by NFS.
Another limitation imposed by NFS is the relatively small size (8 k bytes) of data that can be transferred in a single request. Because of this small transfer size, processes executing on a client processor must continually request additional data as they process a file. The client cache, which typically occupies only a few megabytes of RAM in each client processor, at best, reduces this workload to some degree. However, the NFS client cache cannot mask NFS's fundamental character that employs constant, frequent communication between a file server and all of the client processors connected to the network. This need for frequent server/client communication severely limits the scalability of an NFS network, i.e., severely limits the number of processors that may be networked together in a single system.
Andrew File System ("AFS") is a data caching system that has been specifically designed to provide very good scalability. Now used at many universities, AFS has demonstrated that a few file servers can support thousands of client workstations distributed over a very large geographic area. The major characteristics of AFS that permit its scalability are:
The unit of cached data increases from NFS's 8 k disk block to an entire file. AFS projects complete files from the file server into the client workstations.
Local hard disk drives, required on all AFS client workstations, hold projected file images. Since AFS projects images of complete files, its RAM is quickly occupied by image projections. Therefore, AFS projects complete files onto a client's local hard disk, where they can be locally accessed many times without requiring any more accesses to the network or to the file server.
In addition to projecting file images onto a workstation's hard disk, similar to BSD 4.3, AFS also employs a buffer cache located in the workstation's RAM to store images of blocks of data projected from the file image stored on the workstation's hard disk.
Under AFS, when a program executing on the workstation opens a file, a new file image is projected into the workstation from the file server only if the file is not already present on the workstation's hard disk, or if the file on the file server supersedes the image stored on the workstation's hard disk. Thus, assuming that an image of a file has previously been projected from a network's file server into a workstation, a computer program's request to open that file requires, at a minimum, that the workstation transmit at least one message back to the server to confirm that the image currently stored on its hard disk is the most recent version. This re-validation of a projected image requires a minimum of 25 milliseconds for files that haven't been superseded. If the image of a file stored on the workstation's hard disk has been superseded, then it must be re-projected from the file server into the workstation, a process that may require several seconds. After the file image has been re-validated or re-projected, programs executed by the workstation access it via AFS's local file system and its buffer cache with response times comparable to those described above for BSD 4.3.
The consistency of file images projected by AFS starts out as being "excellent" for a brief moment, and then steadily degrades over time. File images are always current immediately after the image has been projected from the file server into the client processor, or re-validated by the file server. However, several clients may receive the same file projection at roughly the same time, and then each client may independently begin modifying the file. Each client remains completely unaware of any modifications being made to the file by other clients. As the computer program executed by each client processor closes the file, if the file has been modified the image stored on the processor's hard disk is transmitted back to the server. Each successive transmission from a client back to the file server overwrites the immediately preceding transmission. The version of the file transmitted from the final client processor to the file server is the version that the server will subsequently transmit to client workstations when they attempt to open the file. Thus at the conclusion of such a process the file stored on the file server incorporates only those modifications made by the final workstation to transmit the file, and all modifications made at the other workstations have been lost. While the AFS file server can detect when one workstation's modifications to a file overwrite modifications made to the file by another workstation, there is little the server can do at this point to prevent this loss of data integrity.
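The loss of modifications described above may be reproduced with the following toy model. It is not AFS code: the classes AfsServer and AfsClient and their methods are hypothetical, files are modeled as strings, and the re-validation performed on open is reduced to an unconditional fetch.

```python
class AfsServer:
    """Hypothetical whole-file server: name -> complete file content."""

    def __init__(self, files):
        self.files = dict(files)

    def fetch(self, name):
        return self.files[name]

    def store(self, name, content):
        # A whole-file transmission replaces whatever the server held before.
        self.files[name] = content


class AfsClient:
    def __init__(self, server):
        self.server = server
        self.local = {}      # stands in for the workstation's hard disk
        self.dirty = set()   # names of files modified since they were opened

    def open(self, name):
        self.local[name] = self.server.fetch(name)

    def append(self, name, text):
        self.local[name] += text
        self.dirty.add(name)

    def close(self, name):
        if name in self.dirty:
            self.server.store(name, self.local[name])
            self.dirty.discard(name)
```

If two such clients open the same file, each appends a line, and both then close it, the server retains only the line appended by the client that closed last.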
AFS, like NFS, fails to project images with absolute consistency. If computer programs don't employ a file locking mechanism external to AFS, the system can support only applications that don't write to shared files. This characteristic of AFS precludes using it for any application that demands high integrity for data written to shared files.
An object of the present invention is to provide a digital computer system capable of projecting larger data images, over greater distances, at higher bandwidths, and with much better consistency than the existing data caching mechanisms.
Another object of the present invention is to provide a generalized data caching mechanism capable of projecting multiple images of a data structure from its source into sites that are widely distributed across a network.
Another object of the invention is to provide a generalized data caching mechanism in which an image of data always reflects the current state of the source data structure, even when it is being modified concurrently at several remote sites.
Another object of the present invention is to provide a generalized data caching mechanism in which a client process may operate directly upon a projected image as though the image were actually the source data structure.
Another object of the present invention is to provide a generalized data caching mechanism that extends the domain over which data can be transparently shared.
Another object of the present invention is to provide a generalized data caching mechanism that reduces delays in responding to requests for access to data by projecting images of data that may be directly processed by a client site into sites that are "closer" to the requesting client site.
Another object of the present invention is to provide a generalized data caching mechanism that transports data from its source into the projection site(s) efficiently.
Another object of the present invention is to provide a generalized data caching mechanism that anticipates future requests from clients and, when appropriate, projects data toward the client in anticipation of the client's request to access data.
Another object of the present invention is to provide a generalized data caching mechanism that maintains the projected image over an extended period of time so that requests by a client can be repeatedly serviced from the initial projection of data.
Another object of the present invention is to provide a generalized data caching mechanism that employs an efficient consistency mechanism to guarantee absolute consistency between a source of data and all projected images of the data.
Briefly, in one embodiment the present invention is a network-infrastructure cache that provides proxy services to a plurality of client workstations concurrently requesting access to data stored on a server. The client workstations and the server are interconnected by a network over which:
1. client workstations transmit network-file-services-protocol requests to the server; and
2. the server transmits network-file-services-protocol responses to requesting client workstations.
The network-infrastructure cache includes a network interface that connects to the network. The network interface provides a hardware and software interface to the network through which the network-infrastructure cache receives and responds to network-file-services-protocol requests from client workstations for data for which the network-infrastructure cache provides proxy services. The network-infrastructure cache also includes a file-request service-module for:
1. receiving via the network interface network-file-services-protocol requests transmitted by the client workstations for data for which the network-infrastructure cache provides proxy services; and
2. transmitting to client workstations via the network interface network-file-services-protocol responses to the network-file-services-protocol requests.
The network-infrastructure cache also includes a cache from which the file-request service-module retrieves data that is included in the network-file-services-protocol responses that the file-request service-module transmits to the client workstations. Lastly, the network-infrastructure cache also includes a file-request generation-module for:
1. transmitting to the server via the network interface network-file-services-protocol requests for data specified in network-file-services-protocol requests received by the file-request service-module that is missing from the cache;
2. receiving from the server network-file-services-protocol responses that include data missing from the cache; and
3. transmitting such missing data to the cache for storage therein.
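The cooperation among the file-request service-module, the cache, and the file-request generation-module summarized above may be sketched as follows. This is a minimal in-process illustration under stated assumptions, not the disclosed implementation: requests and responses are modeled as Python dictionaries, the network interface is replaced by direct method calls, whole files are cached by path, and the class names OriginServer, FileRequestGenerationModule, and FileRequestServiceModule are chosen only for illustration. The consistency mechanism is omitted.

```python
class OriginServer:
    """Stands in for the server on which the data is stored (assumption)."""

    def __init__(self, files):
        self.files = dict(files)
        self.requests_seen = 0

    def handle(self, request):
        self.requests_seen += 1
        path = request["path"]
        return {"path": path, "data": self.files[path]}


class FileRequestGenerationModule:
    def __init__(self, server, cache):
        self.server = server
        self.cache = cache

    def fetch_missing(self, path):
        # Transmit a request to the server for data missing from the cache,
        # receive the response, and hand the data to the cache for storage.
        response = self.server.handle({"op": "read", "path": path})
        self.cache[path] = response["data"]


class FileRequestServiceModule:
    def __init__(self, cache, generator):
        self.cache = cache
        self.generator = generator

    def handle(self, request):
        # Receive a client request and respond with data taken from the cache.
        path = request["path"]
        if path not in self.cache:
            self.generator.fetch_missing(path)
        return {"path": path, "data": self.cache[path]}
```

In this sketch, when several clients request the same path the server is contacted only once; every later request is answered from the cache.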
In another embodiment, the present invention is a protocol-bridging network-infrastructure cache in which the file-request service-module:
1. receives via the network interface network-file-services-protocol requests from client workstations that are expressed in a first network-file-services protocol; and
2. transmits to client workstations responses to network-file-services-protocol requests in the first network-file-services protocol.
The file-request generation-module of the protocol-bridging network-infrastructure cache:
1. transmits to the server network-file-services-protocol requests expressed in the first network-file-services protocol for data that is missing from the cache; and
2. receives in the first network-file-services protocol network-file-services-protocol responses that include data missing from the cache.
The protocol-bridging network-infrastructure cache also includes a protocol-translation means which, upon detecting that the server to which network-file-services-protocol requests generated by the file-request generation-module are addressed does not respond to network-file-services-protocol requests expressed in the first network-file-services protocol:
1. translates network-file-services-protocol requests expressed in the first network-file-services protocol into network-file-services-protocol requests expressed in a second network-file-services protocol that differs from the first network-file-services protocol and to which the server responds; and
2. upon detecting that network-file-services-protocol responses received from the server directed to the file-request generation-module are expressed in the second network-file-services protocol, translates the network-file-services-protocol responses into network-file-services-protocol responses expressed in the first network-file-services protocol.
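The two translation steps above may be sketched as follows. Both message formats are invented for illustration and correspond to no real network-file-services protocol: the first protocol is modeled as dictionaries carrying an "op" key, the second as dictionaries carrying a "verb" key, and a server that does not respond to the first protocol is modeled as returning None. The names ProtocolTranslator, SecondProtocolServer, to_second, and to_first are likewise hypothetical.

```python
def to_second(request):
    """Translate a first-protocol request into the assumed second protocol."""
    return {"verb": "get", "name": request["path"]}


def to_first(response):
    """Translate a second-protocol response back into the first protocol."""
    return {"op": "READ_REPLY", "path": response["name"], "data": response["body"]}


class SecondProtocolServer:
    """Hypothetical server that understands only the second protocol."""

    def __init__(self, files):
        self.files = dict(files)

    def handle(self, message):
        if "verb" not in message:
            return None  # does not respond to first-protocol requests
        name = message["name"]
        return {"verb": "got", "name": name, "body": self.files[name]}


class ProtocolTranslator:
    def __init__(self, server):
        self.server = server
        self.needs_translation = False

    def send(self, request):
        if not self.needs_translation:
            response = self.server.handle(request)
            if response is not None:
                return response           # server speaks the first protocol
            self.needs_translation = True  # no response: switch to translating
        response = self.server.handle(to_second(request))
        if "verb" in response:             # response arrived in the second protocol
            response = to_first(response)
        return response
```

Under these assumptions the file-request generation-module continues to send and receive first-protocol messages and is unaware of which protocol the server actually uses.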
These and other features, objects and advantages will be understood or apparent to those of ordinary skill in the art from the following detailed description of the preferred embodiment as illustrated in the various drawing figures.