1. Field of the Invention
The present invention relates to file systems that are accessible across computer networks. More particularly, the present invention relates to a method and an apparatus for reducing network traffic for remote file system accesses by sending information specifying unallocated regions of files from a server to a client across a network.
2. Related Art
As computer networks are increasingly used to link computer systems together, distributed operating systems have been developed to control interactions between computer systems across a computer network. Some distributed operating systems allow client computer systems to access resources on server computer systems. For example, a client computer system may be able to access a file on a server computer system across a network. Such distributed file systems make it easy to manipulate files located on a remote server. However, if such distributed file systems are not designed carefully, they can easily generate unnecessary transfers across the network, which can degrade overall system performance.
Unnecessary transfers may be generated when a file is configured for random accesses. When a file is configured for random accesses, the blocks of the file can be accessed without linearly scanning through intervening blocks in the file. Configuring a file for random accesses allows the file to be created without first allocating storage on disk for blocks that make up the file. The blocks are eventually allocated as needed during subsequent file write operations.
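The lazy allocation described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class name `SparseFile`, its methods, and the 8192-byte block size are hypothetical and are not taken from the invention.

```python
BLOCK_SIZE = 8192  # assumed block size, chosen only for illustration


class SparseFile:
    """Toy model of a random-access file whose blocks are allocated on first write."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        # Maps block index -> block contents. A missing key means "unallocated".
        self.blocks = {}

    def _check(self, index):
        if not 0 <= index < self.num_blocks:
            raise IndexError(f"block {index} is outside the file")

    def is_allocated(self, index):
        self._check(index)
        return index in self.blocks

    def write_block(self, index, data):
        """Allocate the block if needed, then store the data (zero-padded)."""
        self._check(index)
        if len(data) > BLOCK_SIZE:
            raise ValueError("data is larger than one block")
        self.blocks[index] = data.ljust(BLOCK_SIZE, b"\0")

    def read_block(self, index):
        """Return the stored data, or null values for an unallocated block."""
        self._check(index)
        return self.blocks.get(index, bytes(BLOCK_SIZE))
```

In this model, creating a file of 1000 blocks consumes no block storage until a block is written, and any block can be read or written directly without scanning the blocks before it.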
Unnecessary data transfers are generated when an application performs a read operation from a region that is not allocated within a file. Such a read operation will simply return null values (such as zeros) indicating that the requested region of the file has not been allocated. Hence, returning such null values creates unnecessary data transfers across the network. For example, if an application makes a request to read an 8K block of a file located on a remote server and the block is unallocated, the remote server will return a number of packets containing null values across the network to the client. These packets will take up valuable network bandwidth and will cause a number of corresponding interrupts on the client in order to process the packets. These interrupts can be particularly time-consuming for an application on the client, because the application must typically save state in order to service each interrupt. Note that most of this overhead is wasted because only null values are being transferred across the network.
What is needed is a method and apparatus for accessing a file located on a remote server that does not generate unnecessary overhead in processing accesses to unallocated regions within the file.
One embodiment of the present invention provides a system for reducing network traffic for remote file system accesses by receiving information specifying unallocated regions within a file from a remote server. The system operates by receiving a request at a local computer system for an access to a file residing in storage on the remote server. If the request is a read operation, the system determines whether the read operation is directed to a region of the file that is presently unallocated in the storage on the remote server. If so, the system returns a block of null values to the requestor without receiving the block of null values from the remote server. If not, the system sends a request to the remote server to read the data from the file. If the request is a write operation, the system determines whether the write operation is directed to a region of the file that is presently unallocated in the storage on the remote server. If so, the system sends a request to the remote server to allocate storage for the write operation in the storage on the remote server. Next, the system writes the data into a local cache for the file in the local computer system. At a later time, the system copies the data from the local cache to the storage on the remote server.
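The client-side read and write paths summarized above can be sketched as follows. This is a minimal illustration, not the claimed implementation: `FakeServer`, `Client`, the set-based map of unallocated blocks, and the 8192-byte block size are all assumptions introduced for the example, and the network is replaced by direct method calls.

```python
BLOCK_SIZE = 8192  # assumed block size


class FakeServer:
    """Hypothetical stand-in for the remote server; counts payload bytes it returns."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block index -> data for allocated blocks
        self.bytes_sent = 0

    def read(self, index):
        data = self.blocks[index]
        self.bytes_sent += len(data)
        return data

    def allocate(self, index):
        self.blocks.setdefault(index, bytes(BLOCK_SIZE))

    def write(self, index, data):
        self.blocks[index] = data


class Client:
    """Hypothetical local system that already knows which blocks are unallocated."""

    def __init__(self, server, unallocated):
        self.server = server
        self.unallocated = set(unallocated)
        self.cache = {}     # local cache for the file: block index -> data
        self.dirty = set()  # cached blocks not yet copied to the server

    def read(self, index):
        if index in self.cache:
            return self.cache[index]
        if index in self.unallocated:
            # Null values are produced locally; nothing crosses the network.
            return bytes(BLOCK_SIZE)
        data = self.server.read(index)
        self.cache[index] = data
        return data

    def write(self, index, data):
        if index in self.unallocated:
            # Ask the server to allocate storage before accepting the write.
            self.server.allocate(index)
            self.unallocated.discard(index)
        self.cache[index] = data
        self.dirty.add(index)

    def flush(self):
        """At a later time, copy cached writes to the server."""
        for index in sorted(self.dirty):
            self.server.write(index, self.cache[index])
        self.dirty.clear()
```

In this sketch, a read of an unallocated block leaves `server.bytes_sent` unchanged, which is the saving the embodiment is aiming for.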
In one embodiment of the present invention, if there is no information stored on the local computer system regarding which regions of the file have been allocated, the local computer system determines whether the read operation is directed to a region of the file that is presently unallocated by forwarding the read operation to the remote server. If the read operation is directed to a region of the file that is presently allocated, the local computer system receives read data from the remote server. Otherwise, the local computer system receives information specifying which regions of the file have not been allocated.
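The fallback described above, in which a client with no allocation information forwards the read and learns the unallocated regions from the reply, might look like the sketch below. The reply encoding (a `("data", bytes)` or `("holes", set)` tuple), the `LazyClient` class, and the `send_read` callback are hypothetical names chosen for illustration.

```python
BLOCK_SIZE = 8192  # assumed block size


class LazyClient:
    """Hypothetical client that starts with no knowledge of allocated regions."""

    def __init__(self, send_read):
        # send_read(index) stands in for the network round trip. It is assumed
        # to return ("data", block_bytes) for an allocated block, or
        # ("holes", set_of_unallocated_indices) otherwise.
        self.send_read = send_read
        self.holes = None  # None means no allocation information is stored

    def read(self, index):
        if self.holes is not None and index in self.holes:
            return bytes(BLOCK_SIZE)  # answered locally from stored information
        kind, payload = self.send_read(index)
        if kind == "data":
            return payload
        # The server sent allocation information instead of null data.
        self.holes = set(payload)
        return bytes(BLOCK_SIZE)
```

After the first read of an unallocated block, later reads of any block listed in the reply are served without contacting the server.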
In one embodiment of the present invention, before returning the block of null values to the requestor, the system creates the block of null values in a local cache for the file in the local computer system and marks the block of null values as read only.
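One way to picture the read-only null block is a cache entry that carries a protection flag. The sketch below is an assumption-laden illustration: `CacheEntry`, `BlockCache`, and the use of `PermissionError` to signal a write to a protected block are hypothetical, and a real system would more likely rely on page protection than on a Python flag.

```python
from dataclasses import dataclass

BLOCK_SIZE = 8192  # assumed block size


@dataclass
class CacheEntry:
    data: bytearray
    read_only: bool = False


class BlockCache:
    """Hypothetical local cache for one file, keyed by block index."""

    def __init__(self):
        self.entries = {}

    def install_null_block(self, index):
        """Create a block of null values locally and mark it read only."""
        entry = CacheEntry(bytearray(BLOCK_SIZE), read_only=True)
        self.entries[index] = entry
        return entry

    def write(self, index, data):
        entry = self.entries.get(index)
        if entry is not None and entry.read_only:
            # The caller is expected to handle this, for example by asking
            # the server to allocate storage before retrying the write.
            raise PermissionError(f"block {index} is read only")
        self.entries[index] = CacheEntry(bytearray(data))
```

A plausible reading is that the read-only mark lets the system intercept a later write to the block, although the passage itself does not state the purpose.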
Another embodiment of the present invention operates by receiving an access from a remote client to a file residing in storage on a server. If the access is a read operation, the system determines whether the read operation is directed to a region of the file that is presently unallocated in the storage. If so, the system sends information to the remote client specifying regions of the file that have not been allocated in the storage. If not, the system reads the data from the file in the storage, and sends the data to the remote client.
In a variation on the above embodiment, if the access is a write operation directed to a region of the file that is presently unallocated in the storage on the server, the system allocates storage for the write operation in the storage on the server and waits for the data to be sent from the remote client.
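The server-side behavior for reads and for writes to unallocated regions can be sketched as follows. The `Server` class, its method names, the tuple-based replies, and the `pending` set used to model "waiting for the data" are hypothetical choices for this illustration only.

```python
BLOCK_SIZE = 8192  # assumed block size


class Server:
    """Hypothetical server holding one file of num_blocks blocks."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.storage = {}     # block index -> data for allocated blocks
        self.pending = set()  # blocks whose write data has not arrived yet

    def unallocated(self):
        return {i for i in range(self.num_blocks) if i not in self.storage}

    def handle_read(self, index):
        if index not in self.storage:
            # Send allocation information rather than a block of null values.
            return ("holes", self.unallocated())
        return ("data", self.storage[index])

    def handle_write(self, index):
        """Allocate storage if needed, then wait for the client's data."""
        self.storage.setdefault(index, bytes(BLOCK_SIZE))
        self.pending.add(index)

    def handle_data(self, index, data):
        """Called when the client later sends the data for a pending write."""
        if index not in self.pending:
            raise KeyError(f"no pending write for block {index}")
        self.storage[index] = data
        self.pending.discard(index)
```

Here a read of an unallocated block returns a small set of indices instead of 8192 null bytes; a real system would presumably encode the unallocated regions more compactly, for example as ranges.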