1. Field
The subject matter disclosed herein relates to file systems. More particularly, the subject matter disclosed herein relates to a method for reducing network latencies observable by a user application.
2. Description of the Related Art
Information sharing has become a critical component of most computer-based systems. As bandwidth becomes cheaper and networking more ubiquitous, data is increasingly shared over Wide Area Networks (WANs). A significant problem with sharing data over WANs is the access latency of the shared data. Various approaches have been applied to prefetching file data to reduce cache misses. Unlike file data, though, metadata is normally much smaller in size, and prefetching metadata in a Local Area Network (LAN) setting would not have significant benefits. In a WAN environment, however, prefetching metadata could have significant benefits because long latencies would be hidden from user applications that access metadata. Typically, search programs, interactive user sessions, and other applications that provide a browsing interface to a filesystem would want to access a reasonable amount of filesystem metadata and would benefit significantly from metadata caching. Speculatively prefetching files over a WAN is not ruled out, although it could sometimes prove more expensive than useful. Metadata, on the other hand, is less expensive to prefetch.
Prefetching is an age-old concept. In computer science, prefetching has been used for virtual memory paging and for prefetching files and database objects. Prefetching has also been used on Multiple Instruction Multiple Data (MIMD) architectures to improve parallel file access and even for prefetching Java objects. For improved parallel file access for MIMD architectures, see, for example, C. S. Ellis et al., “Prefetching in file systems for MIMD multiprocessors,” Proceedings of the 1989 International Conference on Parallel Processing, St. Charles, Ill., Pennsylvania State Univ. Press, pp. I:306-314, 1989. For prefetching Java objects, see, for example, B. Cahoon et al., “Tolerating latency by prefetching Java objects,” Workshop on Hardware Support for Objects and Microarchitectures for Java, Austin, Tex., October 1999.
Prefetching techniques have also been applied to linked data structures, which bear some structural resemblance to filesystem hierarchies. See, for example, M. Karlsson et al., “A prefetching technique for irregular accesses to linked data structures,” HPCA, pp. 206-217, 2000; A. Roth et al., “Dependence based prefetching for linked data structures,” ACM SIGPLAN Notices, 33(11), pp. 115-126, 1998; and D. Joseph et al., “Prefetching using Markov predictors,” IEEE Transactions on Computers, 48(2), pp. 121-133, 1999. Sequential readahead is a simple form of prefetching within a file. Other prefetching techniques, such as the informed prefetching and caching technique, require hints from an application in order to prefetch. See, for example, R. Hugo Patterson et al., “Informed prefetching and caching,” in “High Performance Mass Storage and Parallel I/O: Technologies and Applications,” edited by Hai Jin et al., IEEE Computer Society Press and Wiley, New York, N.Y., pp. 224-244, 1995.
There are a few instances of work that uses probabilistic methods to prefetch files based on past accesses. See, for example, J. Griffioen et al., “Reducing file system latency using a predictive approach,” in USENIX Summer, pp. 197-207, 1994. H. Lei et al., “An analytical approach to file prefetching,” in 1997 USENIX Annual Technical Conference, Anaheim, Calif., USA, 1997, discloses a file prefetching mechanism that is based on on-line analytic modeling of file accesses to capture intrinsic correlations between the accesses. The file usage patterns are later used to heuristically prefetch files from a file server. Predictive prefetching has also been used to improve latencies in the World Wide Web (WWW). See, for example, V. N. Padmanabhan et al., “Using predictive prefetching to improve World-Wide Web latency,” Proceedings of the ACM SIGCOMM '96 Conference, Stanford University, Calif., 1996. In this case, the clients prefetch based on hints from a server that has seen similar accesses from other clients.
Among all the work in the area of prefetching in filesystems, little has been done about prefetching metadata because metadata is usually a small fraction of the size of the filesystem and needs to be revalidated from time to time anyway. In WAN environments, however, even accessing metadata can become a significant bottleneck. Metadata that has been recently prefetched is considered good enough by most applications that work on remote files. While WAN latencies cannot be remedied, the steadily increasing WAN bandwidth can be leveraged to aggressively prefetch metadata that is likely to be requested soon by applications.
Consequently, what is needed is a technique that prefetches filesystem metadata to reduce the latency over a WAN.
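By way of illustration only, the general idea of hiding per-request latency by prefetching metadata can be sketched as follows. This is a minimal sketch under stated assumptions and does not describe any particular embodiment: the class name MetadataCache, its round_trips counter, and the demo function are hypothetical, and a local os.scandir call stands in for a single round trip to a remote file server. The first metadata request for any entry in a directory fetches the metadata of all entries in that directory, so later requests for sibling entries are served locally. Periodic revalidation of cached metadata is deliberately omitted.

```python
import os
import tempfile
from typing import Dict


class MetadataCache:
    """Hypothetical client-side metadata cache (illustration only)."""

    def __init__(self) -> None:
        self._cache: Dict[str, os.stat_result] = {}
        # Counts simulated round trips to the (assumed) remote server.
        self.round_trips = 0

    def _fetch_directory(self, directory: str) -> None:
        # Assumption: one os.scandir pass models one WAN round trip that
        # returns metadata for every entry in the directory.
        self.round_trips += 1
        with os.scandir(directory) as entries:
            for entry in entries:
                self._cache[entry.path] = entry.stat(follow_symlinks=False)

    def stat(self, path: str) -> os.stat_result:
        path = os.path.abspath(path)
        if path not in self._cache:
            # Cache miss: prefetch metadata for all siblings in one trip.
            self._fetch_directory(os.path.dirname(path))
        if path not in self._cache:
            raise FileNotFoundError(path)
        return self._cache[path]


def demo() -> int:
    """Stat five files in one directory and report the round trips used."""
    with tempfile.TemporaryDirectory() as root:
        names = [f"file{i}.txt" for i in range(5)]
        for name in names:
            with open(os.path.join(root, name), "w") as handle:
                handle.write("x")
        cache = MetadataCache()
        for name in names:
            cache.stat(os.path.join(root, name))
        return cache.round_trips


if __name__ == "__main__":
    print(demo())
```

In this sketch, five metadata requests against the same directory cost one simulated round trip instead of five, which is the kind of saving that matters when each round trip crosses a WAN.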