In modern computer system and networking architectures, a computer system that is a repository for data files is typically not the computer system on which processing of the data files is performed. Consequently, a user at a computer workstation associated with a remote site computer system, such as a laptop computer, networked computer or desktop computer, often will desire to access, i.e., view (read) or modify (write), a data file that is stored in an internal memory, on a disk or in network attached storage of a remotely located central data source computer system. Such remote access of data files is performed over a communications channel, such as a data bus, a communications network or the Internet, which typically introduces a delay or latency in the presentation of the data file at the system accessing the data file. The latency is based on the need to transmit data between the system accessing the data file and the system that produces or stores the data file. In addition, the data file is usually accessed in portions or blocks rather than as a continuous stream, which exacerbates the latency because each block experiences the channel delay upon transmission.
In order to mitigate the effects of channel delays, most current computer systems that perform distributed file system applications, which provide for shared access to data files, implement some form of caching. In caching, a local copy of all or a portion of a data file, which is stored at a central source computer system, is maintained in a cache established at a remote system, such as in the local memory of a workstation associated with the remote system. The workstation can read or write to the cached data file, where the cached data file mirrors all or a portion of the data file stored at the central system. The cache also stores data that tracks any changes made to the cached data file, which are entered by the workstation and ultimately are to be incorporated into the data file stored at the file server. Thus, with caching, channel latency can be mitigated and a user of the workstation of the remote system is not aware that the data file is accessed from a local source rather than a remotely located central source system.
Although caching may reduce latency in certain data file access circumstances, an attempt to access a data file that has not yet been copied (mirrored) into the cache, known as a cache miss, still incurs the latency associated with retrieving a copy of the data file from the file server. To avoid cache misses and consequently improve distributed file system performance, a caching system often implements a read-ahead technique, known as pre-populating the cache, in which data files that are expected to be required for access in the future are stored in the cache in advance.
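The effect of read-ahead on cache misses can be illustrated with a minimal sketch. The names `FileServer`, `ReadAheadCache` and `sequential_read_misses` are hypothetical, the unit of caching is assumed to be a numbered block, and a counter of round trips stands in for channel latency; none of these details comes from a particular prior art system.

```python
class FileServer:
    """Hypothetical central source; each fetch models one delayed round trip."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.round_trips = 0

    def fetch(self, start, count):
        # One round trip returns up to `count` consecutive blocks.
        self.round_trips += 1
        return {i: self.blocks[i]
                for i in range(start, start + count) if i in self.blocks}


class ReadAheadCache:
    """Hypothetical remote-site cache that pre-populates on every miss."""

    def __init__(self, server, read_ahead=0):
        self.server = server
        self.read_ahead = read_ahead  # extra blocks fetched per miss (assumed policy)
        self.pages = {}
        self.misses = 0

    def read(self, index):
        if index not in self.pages:
            # Cache miss: pay the channel delay and fetch the following blocks too.
            self.misses += 1
            self.pages.update(self.server.fetch(index, 1 + self.read_ahead))
        return self.pages[index]


def sequential_read_misses(read_ahead, n_blocks=8):
    """Read blocks 0..n_blocks-1 in order and report how many reads missed."""
    server = FileServer({i: f"block-{i}" for i in range(n_blocks)})
    cache = ReadAheadCache(server, read_ahead)
    for i in range(n_blocks):
        cache.read(i)
    return cache.misses
```

Under these assumptions, a sequential read of eight blocks misses on every block without read-ahead, and only on every fourth block when three additional blocks are fetched per miss.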
In a distributed file system that provides for shared access to data files among a plurality of remote systems, the caching system that is implemented needs to maintain cache coherence and cache consistency to avoid different versions of a data file being accessed by different respective remote systems. Cache coherence is a guarantee that updates and the order of the updates to a cached data file are preserved and safe. Thus, in a coherent distributed file system, there is a guarantee that (i) a remote system does not delete the cached update data before the update data is used to update the corresponding data file stored at the file server, and (ii) no other system updates the data file in a manner that potentially can compromise the update of the data file until the data file at the server has been updated using the update data from the cache. Cache consistency is a guarantee that the updates to an opened, cached data file made by a workstation are reflected in the cached data file in a timely fashion.
The properties of cache coherence and cache consistency are equally important when multiple remote systems access the same data file. In this circumstance, coherence additionally ensures that updates on any cache corresponding to a data file stored at the file server do not override updates by another cache corresponding to the same data file. Cache consistency additionally ensures that updates to the cached data file made at any cache are, in a timely fashion, incorporated into the cached data file at any other cache which is accessing the same data file.
Cache consistency and cache coherence are easily maintained where a caching system includes a write-through architecture, which provides that all updates to the cached data file are immediately transmitted to the central computer system. This immediate transmission results in an immediate update of the data file stored at the file server of the central system. Although such architectures improve the performance associated with having multiple caches perform a read access of the data file from the central system, the latency associated with updating the data file based on write accesses still exists. Hence, this architecture typically performs well only for a distributed file system where data file updates are infrequent.
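A write-through cache can be sketched as follows. The classes `CentralServer` and `WriteThroughCache` are hypothetical, and the round-trip counter is an assumed stand-in for the channel delay that each write incurs.

```python
class CentralServer:
    """Hypothetical central system holding the authoritative data file."""

    def __init__(self):
        self.data = {}
        self.round_trips = 0

    def write(self, offset, value):
        # Every call models one transmission over the communications channel.
        self.round_trips += 1
        self.data[offset] = value


class WriteThroughCache:
    """Hypothetical cache that forwards each update to the server immediately."""

    def __init__(self, server):
        self.server = server
        self.local = {}

    def write(self, offset, value):
        self.local[offset] = value        # update the cached data file
        self.server.write(offset, value)  # and transmit the update at once

    def read(self, offset):
        return self.local[offset]         # reads are served locally
```

In this sketch the server copy never lags the cached copy, which is why consistency is easy to maintain, but the number of round trips grows with the number of writes.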
Another caching architecture, known as write-back, evolved from the write-through architecture in an attempt to solve the latency problems of the latter. In a write-back architecture, a cache stores the updates to the cached data file for a period of time before transmitting (flushing) the cached updates to the central system. Because the flushing is deferred, the cached data file can be updated without significant latency. The simplest form of write-back is the write-behind architecture, where the updates to the cached data file are transmitted to the central source after some delay, rather than immediately, and in the same order in which the updates were stored in the cache. As cached updates are not immediately available to either the central source or other remote systems in write-back caching architectures, such architectures are mostly useful only when a single remote system will be accessing the data file for reading or writing.
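The write-behind behavior described above can be sketched as a queue of pending updates that is flushed in order. The names `CentralServer`, `WriteBehindCache` and `apply_batch` are hypothetical, and sending all pending updates in a single round trip is an assumption made for illustration only.

```python
from collections import deque


class CentralServer:
    """Hypothetical central source that records the order of arriving updates."""

    def __init__(self):
        self.data = {}
        self.applied = []
        self.round_trips = 0

    def apply_batch(self, updates):
        # Assumption: one flush is one transmission, whatever its size.
        self.round_trips += 1
        for offset, value in updates:
            self.data[offset] = value
            self.applied.append((offset, value))


class WriteBehindCache:
    """Hypothetical cache that defers transmission and preserves update order."""

    def __init__(self, server):
        self.server = server
        self.local = {}
        self.pending = deque()  # updates in the order they were stored

    def write(self, offset, value):
        # The write completes locally; nothing is sent over the channel yet.
        self.local[offset] = value
        self.pending.append((offset, value))

    def flush(self):
        # Transmit the stored updates in their original order.
        batch = list(self.pending)
        self.pending.clear()
        if batch:
            self.server.apply_batch(batch)
```

Until `flush` is called, the server in this sketch holds none of the updates, which illustrates why other remote systems cannot see them in the meantime.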
If access to a data file by multiple remote systems is contemplated, the write-back caching system often is enhanced with mechanisms that track updates performed at all of the caches and also at the central source system to ensure consistency of data files. These mechanisms typically substantially increase the complexity and cost of the cache, so as to make such caches impractical in many applications. The performance benefits, however, are significant, which makes these caches very attractive for high performance computing implementations, such as computer systems connected over computer networks.
In a typical computer system architecture having file sharing capabilities, remote computer systems on a local area computer network (“LAN”) access data files over a distributed file system, such as NFS® (Network File System) for UNIX™ or CIFS® (Common Internet File System) for Microsoft Windows™ systems. These file systems provide workstations associated with remote computer systems with a mechanism to access data files stored at a file server of a central computer system. In addition, each remote system utilizes local caching to increase efficiency of access to data files. Typically, the caching is performed at a granularity of pages of a data file, which usually are four-kilobyte blocks of data. The actual number of pages cached is a function of the memory available for caching in a workstation that is incorporated in or coupled to a remote system. In addition, these file systems utilize some measure of write-back caching to achieve acceptable performance.
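Page-granular caching implies that any byte range requested by a workstation is mapped onto whole pages. The helper `pages_for_range` below is a hypothetical illustration that assumes a page size of exactly 4096 bytes and zero-based page numbering.

```python
PAGE_SIZE = 4096  # assumed four-kilobyte page


def pages_for_range(offset, length):
    """Return the indices of the pages that cover bytes [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // PAGE_SIZE
    last = (offset + length - 1) // PAGE_SIZE
    return list(range(first, last + 1))
```

Under these assumptions, a 200-byte access starting at byte 4000 touches two pages, so both pages must be present in the cache to avoid a miss.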
Although cache consistency and cache coherence are important properties for a caching system, these properties are often very difficult to realize in a networked computer system having distributed file system performance capabilities, especially if the system uses write-back caching. Thus, many distributed file systems do not completely satisfy the guarantees of cache consistency and coherence. In practical implementations, a distributed file system relies on a crucial assumption that sharing of the same data file is rare and, therefore, makes a trade-off between performance and correctness when sharing of a data file does occur. For example, NFS currently is not particularly suitable for shared access because (i) it has weak consistency guarantees, namely, modifications to a cached data file for a first remote system may not be reflected in a timely fashion at the central system and, thus, would not necessarily be mirrored at another remote system accessing the data file from the central system; and (ii) it has no coherence guarantees. In addition, although CIFS provides excellent consistency and coherence, shared access has low performance because the consistency and coherence are achieved by utilizing write-through any time that more than one remote system is accessing any given data file.
In addition to automatic measures for maintaining consistency and coherence, NFS and CIFS also provide locking mechanisms that allow a file sharing application to control coherence and consistency aspects. In particular, NFS allows sharing applications to voluntarily cooperate with each other without any operating system control, which is commonly known as advisory byte range locking. CIFS provides operating system controlled locking, known as mandatory byte range locking, as well as explicit file sharing modes, which, for example, permit an application to control the manner in which a file is accessed such that no other application can access the file. The file sharing application can use such mechanisms to improve the coherence and consistency properties provided by such prior art distributed file systems. For example, an application can use byte range locking to provide coherence and consistency even if the underlying system, e.g., NFS, does not have these properties.
Further, the performance issues faced by a networked system over a local area network, where typical latencies are well under a millisecond, are compounded when file sharing is performed over a wide area network (“WAN”). One prior art system, known as Transarc Andrew File System (AFS), was created to overcome the latency existing in WANs that are geographically small, such as a WAN of a university campus. In contrast to NFS and CIFS, which use local memory of the remote system, such as memory of a computer workstation, for storing pages of files, AFS uses an on-disk local file system as a cache for entire files. In AFS, most operations occur on the local copy of the file and there is no need to retrieve data from the file server when access to the data file is requested. As each cached data file is modified and closed, the updates are transmitted (flushed) to the central system to update the corresponding data file at the file server, and then such updated data file is made available for access by other remote sites.
Thus, AFS provides flush on close consistency at file granularity, in other words, updates to a data file are immediately available when the data file is closed, but not as it is being written. AFS, however, weakens the coherence and consistency guarantees considerably to make WAN operation feasible. In particular, AFS lacks coherence because it allows multiple remote systems to simultaneously update respective cached data files, each of which corresponds to a single data file, and provides that the last remote system that closes the file is the remote system that controls the changes to the data file at the server of the central system. In other words, the modifications of such last closing remote system supersede the changes apparently being made to the data file by other remote systems. In addition, the consistency of AFS is weak because modifications are transmitted to the central source only when a remote system closes the file.
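The loss of updates under flush on close semantics can be illustrated with a minimal sketch. The classes `FileServer` and `WholeFileCache` and the function `last_close_wins` are hypothetical and greatly simplified; they model only whole-file caching with flushing at close, not the actual AFS protocol.

```python
class FileServer:
    """Hypothetical central system holding a single data file."""

    def __init__(self, contents):
        self.contents = contents


class WholeFileCache:
    """Hypothetical remote-site cache that holds an entire file while it is open."""

    def __init__(self, server):
        self.server = server
        self.copy = None

    def open(self):
        self.copy = self.server.contents  # fetch the whole file once

    def write(self, contents):
        self.copy = contents              # modify the local copy only

    def close(self):
        self.server.contents = self.copy  # flush on close
        self.copy = None


def last_close_wins():
    """Two remote systems update the same file; return server state after each close."""
    server = FileServer("original")
    site_a, site_b = WholeFileCache(server), WholeFileCache(server)
    site_a.open()
    site_b.open()
    site_a.write("update from A")
    site_b.write("update from B")
    before_close = server.contents  # neither update is visible yet
    site_a.close()
    after_a = server.contents
    site_b.close()
    after_b = server.contents       # the update from A has been superseded
    return before_close, after_a, after_b
```

In this sketch the server still holds the original contents while both files are open, and the contents written by the first remote system to close are replaced when the second remote system closes.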
Consequently, although AFS is useful for a campus wide sharing application, it has multiple disadvantages when implemented in a business enterprise environment. For example, AFS must be installed on all computers. In addition, AFS cannot be operated in conjunction with NFS and CIFS distributed file systems or other like systems which are conventional in the prior art. Furthermore, the lack of consistency and coherence of AFS makes it unsuitable for many enterprise applications that require multiple remote systems to have shared access to a real time version of a data file.
Therefore, a need exists for a system and method for providing real time, shared access to data files through use of a distributed file system, and where the system and method exploit the benefits of caching while also providing data file coherence and consistency and ease of interoperability and interfacing with an existing distributed file system.