A file server is a computer that provides file service relating to the organization of information on writeable persistent storage devices, such as non-volatile memories, tapes or disks. The file server or filer may be embodied as a storage system including a storage operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g. the disks. Each “on-disk” file may be implemented as a set of data structures, e.g. disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories is stored.
A filer may be further configured to operate according to a client/server model of information delivery to thereby allow many client systems (clients) to access shared resources, such as files, stored on a server, e.g. the filer. In this model, the client may comprise an application, such as a database management system (DBMS), executing on a computer that “connects” to the filer over a computer network, such as a point-to-point link, a local area network (LAN), a wide area network (WAN) or a virtual private network (VPN) implemented over a public network, such as the Internet. Each client may request the services of the filer by issuing file system protocol messages (in the form of packets) to the filer over the network. By supporting a plurality of file system protocols, such as the Network File System version 4 (NFSv4) and the Direct Access File System (DAFS) protocols, the utility of the filer may be enhanced for networking clients.
Typical clients of a file server desire the lowest possible latency of input/output operations. As used herein, the term “latency” generally means the time between when the client sends a specific I/O operation request and the time when the client receives an acknowledgement that a file server has completed the requested operation. With reference to read operations, the latency is normally the time from when the client transmits a read operation until the time when the client receives the requested data. Similarly, for a write operation, latency is the time from when the client sends the write operation until the time when the client receives an acknowledgement that the file server has committed the data to some form of persistent storage.
To reduce the latency of write operations, many file servers provide for some form of journaling wherein data to be written to disk is first copied to a persistent medium, such as non-volatile random access memory (NVRAM) or flash RAM, before the data is written to disk. The file server can send an acknowledgement of the write operation once the data has been committed to persistent storage. As writing data to NVRAM or flash RAM is typically many orders of magnitude faster than writing data to disk, a file server can acknowledge a write operation sooner, thereby reducing the latency of the write operation request, that is, the time needed to complete the request and commit it to persistent storage. The data that has been acknowledged is later written to disk at regular intervals when the file system flushes its memory buffers. In the example of a Write Anywhere File Layout (WAFL) file system (such as that available from Network Appliance, Inc. of Sunnyvale, Calif.), the flushing occurs during consistency points where the file system generates a consistent on-disk image of the file system. Typically, during this flush, the data is copied from volatile memory (online system RAM) to disk. If a crash or other error condition occurs, the data can be recovered from the NVRAM or other persistent storage.
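By way of illustration only, the journaling behavior described above may be sketched as follows. This is a simplified model, not any actual storage operating system: the class name JournalingServer, its fields and its methods are hypothetical, and NVRAM, volatile memory and disk are represented by ordinary in-memory Python containers.

```python
from dataclasses import dataclass, field


@dataclass
class JournalingServer:
    """Toy model of the low-latency write path; all names are hypothetical."""
    nvram: list = field(default_factory=list)   # persistent journal (survives a crash)
    memory: dict = field(default_factory=dict)  # volatile buffers (lost on a crash)
    disk: dict = field(default_factory=dict)    # persistent on-disk image

    def write(self, path: str, data: bytes) -> str:
        # Buffer the data in volatile memory and journal it to NVRAM.
        self.memory[path] = data
        self.nvram.append((path, data))
        # The data is now persistent, so the write can be acknowledged
        # without waiting for the disk.
        return "ACK"

    def consistency_point(self) -> None:
        # Flush buffered data from volatile memory to disk, then discard
        # the journal entries that the flush has made redundant.
        self.disk.update(self.memory)
        self.memory.clear()
        self.nvram.clear()

    def crash(self) -> None:
        # Volatile memory is lost; NVRAM and disk contents survive.
        self.memory.clear()

    def recover(self) -> None:
        # Replay the journal in order so that acknowledged writes are not lost.
        for path, data in self.nvram:
            self.memory[path] = data


server = JournalingServer()
ack = server.write("/vol/a", b"hello")  # acknowledged before reaching disk
server.crash()                          # error condition before the flush
server.recover()                        # data is restored from the journal
server.consistency_point()              # data finally reaches disk
```

In this sketch the acknowledged write survives the simulated crash because it is replayed from the journal before the next consistency point.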
The writing and copying of data to the NVRAM or other forms of persistent storage entails an additional consumption of file server processing resources and time. However, this consumption of resources is necessitated to meet the low-latency expectations of clients. In general, many common data handling functions within the server environment dictate a low-latency approach so that the data is persistently committed and otherwise available to users as soon as possible.
Nevertheless, there are a variety of types of applications and/or bulk data operations that can tolerate a relatively high latency for input/output (I/O) operations. In other words, commitment of certain data to persistent storage is less time-dependent. Examples of such applications include the “page cleaning” write operations for a database (in which pages which have been stored by the database server are written to disk), the swapping of virtual memory for operating systems, and lazy write operations where data is buffered and then written to disk when appropriate processing resources are otherwise free to do so. These types of applications are bandwidth sensitive, so as long as they can keep many outstanding I/O operations pending simultaneously, the latency of each individual operation is less important. When the I/O operations can tolerate a high latency, several file systems allow a write operation to include a specified latency for that particular request. If the designated latency is high enough, the file server need not commit the write data to persistent storage (disk, NVRAM, etc.) until the requested latency period has expired. Once the acceptable window of time has passed, the file server must commit the data to persistent storage (typically NVRAM) and send an acknowledgment to the client. If the designated latency is high enough, however, a file server may begin a consistency point before the window expires. In this case, there is no need to first commit the write data to NVRAM; it can be written directly to disk, saving the CPU time and memory bandwidth of writing to NVRAM. When the data is flushed, the data stored in the volatile memory is written to the disk, and the I/O operation is then acknowledged back to the client. While the data is stored only in volatile storage, no acknowledgement is made, and the client logs the existence of an outstanding I/O request to the server.
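The decision described above, in which a write is either journaled when its latency window expires or written directly to disk when a consistency point arrives first, may be sketched as follows. This is a minimal model under stated assumptions: the names PendingWrite, HighLatencyServer, tick and nvram_copies are hypothetical, time is passed in explicitly as a number rather than read from a clock, and storage is modeled with in-memory containers.

```python
from dataclasses import dataclass, field


@dataclass
class PendingWrite:
    path: str
    data: bytes
    deadline: float        # end of the latency window specified by the client
    in_nvram: bool = False
    acked: bool = False


@dataclass
class HighLatencyServer:
    """Toy model of high-latency writes; all names are hypothetical."""
    pending: list = field(default_factory=list)  # data held in volatile memory
    nvram: list = field(default_factory=list)
    disk: dict = field(default_factory=dict)
    nvram_copies: int = 0                        # counts journal copies made

    def write(self, path, data, now, max_latency):
        # Hold the data in volatile memory only; no acknowledgement yet.
        w = PendingWrite(path, data, deadline=now + max_latency)
        self.pending.append(w)
        return w

    def tick(self, now):
        # A window expired before any flush: commit to NVRAM and acknowledge.
        for w in self.pending:
            if not w.acked and now >= w.deadline:
                self.nvram.append((w.path, w.data))
                self.nvram_copies += 1
                w.in_nvram = True
                w.acked = True

    def consistency_point(self):
        # Flush everything to disk. Writes still inside their window reach
        # disk without ever having been copied to NVRAM.
        for w in self.pending:
            self.disk[w.path] = w.data
            w.acked = True
        self.pending.clear()
        self.nvram.clear()


server = HighLatencyServer()
fast = server.write("/db/page1", b"p1", now=0.0, max_latency=1.0)
slow = server.write("/db/page2", b"p2", now=0.0, max_latency=30.0)
server.tick(now=2.0)        # page1's window has expired: journal and acknowledge
server.consistency_point()  # page2 goes directly to disk and is acknowledged
```

Only the write with the short window incurs an NVRAM copy; the write with the long window is acknowledged after the flush and never touches NVRAM.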
Some file system protocols, like NFS, support UNSTABLE writes, which also obviate the need for NVRAM copies. However, these writes must be tracked by the client and explicitly committed using a COMMIT message. DAFS high-latency writes require no such action by the client. With NFS UNSTABLE writes, the client must also be prepared to retransmit unstable requests to the server if the server suffers an error condition before committing the data to disk.
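The client-side bookkeeping that UNSTABLE writes impose may be sketched as follows. This is an illustrative simulation, not an NFS client: ToyServer and ToyClient are hypothetical names, no network traffic is involved, and the write verifier that NFS version 3 and later return from WRITE and COMMIT is modeled, as an assumption, by a simple counter that changes whenever the server restarts.

```python
class ToyServer:
    """Simulated server; the verifier changes on every restart."""

    def __init__(self):
        self.verifier = 1
        self.cache = {}   # volatile: lost on restart
        self.disk = {}    # persistent

    def write_unstable(self, offset, data):
        # Data is held in volatile memory only; reply with the verifier.
        self.cache[offset] = data
        return self.verifier

    def commit(self):
        # Flush cached data to disk; reply with the current verifier.
        self.disk.update(self.cache)
        self.cache.clear()
        return self.verifier

    def restart(self):
        # Error condition: uncommitted data is lost.
        self.cache.clear()
        self.verifier += 1


class ToyClient:
    """Simulated client that tracks its uncommitted writes."""

    def __init__(self, server):
        self.server = server
        self.uncommitted = {}   # offset -> (data, verifier seen at write time)

    def write(self, offset, data):
        verf = self.server.write_unstable(offset, data)
        self.uncommitted[offset] = (data, verf)

    def commit(self):
        while self.uncommitted:
            verf = self.server.commit()
            # A write made under a different verifier may have been lost.
            stale = {o: d for o, (d, v) in self.uncommitted.items() if v != verf}
            self.uncommitted.clear()
            for offset, data in stale.items():
                self.write(offset, data)   # retransmit, then commit again


server = ToyServer()
client = ToyClient(server)
client.write(0, b"aaaa")
server.restart()            # the first write is lost from the server cache
client.write(4096, b"bbbb")
client.commit()             # detects the verifier change and retransmits
```

The client must retain every uncommitted write until a COMMIT confirms it, which is the tracking burden that the text contrasts with DAFS high-latency writes.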
A flow diagram showing an I/O operational environment 100 for the two general types of I/O write operations (high and low-latency) is shown in FIG. 1. The environment 100 includes a main memory 110, an NVRAM 115 and a disk 120. In a conventional write operation 125, a set of requested data 130 is initially placed in the main memory 110 by an appropriate request operation. The storage operating system executing on the file server also copies the data 130 to the NVRAM 115 thereby committing it to persistent storage. The copying to the NVRAM typically occurs substantially simultaneously with the placement of the data in the main memory 110. Once the data 130 is copied to the NVRAM 115, an acknowledgement is sent to the client that originated the write request. At some later time (i.e. a consistency point), the file system flushes the data stored in its buffers in the main memory to disk 120. As the write operation was already acknowledged, the client is able to continue operating without waiting for the data to be written to disk.
When a high-latency write operation 135 (as described generally above) is executed, the data 140 is written to the main memory 110. The data 140 remains in the main memory until such time as a consistency point, or other operation to flush the buffers of main memory, occurs. At such a consistency point, the data 140 is then copied to disk 120. After the data has been copied to disk, an acknowledgement is sent to the originator of the write operation.
One example of a file system protocol that generally supports high-latency operations is the Direct Access File System (DAFS) protocol. The DAFS protocol is a file access and management protocol designed for local file sharing or clustered environments. This distributed file system protocol provides low latency, high throughput and low overhead data movement that takes advantage of memory-to-memory networking technologies. The DAFS protocol, in particular, supports a command entitled DAFS_PROC_BATCH_SUBMIT that initiates a set of I/O requests to and from regular files. The DAFS_PROC_BATCH_SUBMIT operation permits a client to bundle a group of input/output requests into a single operation, which reduces processing overhead in generating DAFS data structure packets. The DAFS server performs the bundled requests and then sends an acknowledgement of the requests. An example of the data structures sent as part of the DAFS_PROC_BATCH_SUBMIT operation is shown in FIG. 2. The data structure 200 includes fields for requests 205, maximum latency 210, number of completions 215, a synchronous flag, and, in alternate embodiments, other fields 225. The requests array 205 contains pointers to the various I/O requests that have been bundled together for the batch I/O operation. The maximum latency field 210 identifies a number of microseconds that the client is willing to wait for an acknowledgment of the operations. Under the DAFS protocol, the file server is not obligated to respond within the time frame set in the maximum latency field 210. However, traditional server implementations will attempt to acknowledge I/O requests as soon as possible if the maximum latency is under a given amount, for example, 5 seconds.
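The fields of the data structure 200 may be pictured with the following sketch. It is not the DAFS wire format or any DAFS library interface: the Python names IORequest, BatchSubmit and should_ack_promptly are hypothetical, the layout of an individual I/O request is assumed for illustration, and the 5 second threshold is simply the example value given above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IORequest:
    # Assumed shape of one bundled request; not taken from the DAFS protocol.
    op: str        # "read" or "write"
    path: str
    offset: int
    length: int


@dataclass
class BatchSubmit:
    """Illustrative stand-in for the data structure 200 of FIG. 2."""
    requests: List[IORequest]   # requests array 205
    max_latency_us: int         # maximum latency field 210, in microseconds
    num_completions: int        # number of completions field 215
    synchronous: bool           # synchronous flag


def should_ack_promptly(batch: BatchSubmit, threshold_us: int = 5_000_000) -> bool:
    # Models the behavior attributed to traditional servers: acknowledge as
    # soon as possible when the maximum latency is under the threshold.
    return batch.max_latency_us < threshold_us


page_cleaning = BatchSubmit(
    requests=[IORequest("write", "/db/data", 0, 8192),
              IORequest("write", "/db/data", 8192, 8192)],
    max_latency_us=30_000_000,   # client is willing to wait 30 seconds
    num_completions=2,
    synchronous=False,
)
urgent = BatchSubmit(
    requests=[IORequest("write", "/db/log", 0, 512)],
    max_latency_us=500_000,      # client is willing to wait 0.5 seconds
    num_completions=1,
    synchronous=False,
)
deferred = not should_ack_promptly(page_cleaning)
prompt = should_ack_promptly(urgent)
```

The values assigned to num_completions and synchronous are placeholders only, since the text does not describe the semantics of those fields.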
In certain instances, a client's activities are blocked due to the presence of an outstanding high-latency I/O request. This blocking effect can occur when, for example, data that is in a high-latency write operation must be read or otherwise modified. In such cases, using known file server implementations, the client cannot continue with its normal operation until the high-latency I/O request has been acknowledged. In one example, the latency may extend for a time period up to the maximum time set in a predetermined protocol latency field (e.g., 30 seconds). Until the I/O request is acknowledged, the client may effectively cease processing as it awaits acknowledgement of the write operation. Such processing delays can eliminate any server performance gains of doing high-latency I/O. It would thus be advantageous to enable a client of a file server to modify the maximum latency to expedite or “hurry up” the execution of a certain I/O request. By permitting a client to expedite a command, a client can, if the client reaches a point where it is blocked from further processing until a command is executed, hurry up the execution of the command and thereby avoid being blocked. If a client can modify the original maximum latency of a given I/O operation, then the client can expedite those commands that may normally cause the client's processing to block (until the command is executed). However, known file system implementations lack the functionality to permit a client to modify the latency of a given I/O operation.