A storage system is a computer that provides storage service relating to the organization of information on writable persistent storage devices, such as memories, tapes or disks. The storage system is commonly deployed within a storage area network (SAN) or a network attached storage (NAS) environment. When used within a NAS environment, the storage system may be embodied as a file server including an operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on, e.g. the disks. Each “on-disk” file may be implemented as a set of data structures, e.g., disk blocks, configured to store information, such as the actual data for the file. A directory, on the other hand, may be implemented as a specially formatted file in which information about other files and directories are stored.
The file server, or storage appliance, may be further configured to operate according to a client/server model of information delivery to thereby allow many client systems (clients) to access shared resources, such as files, stored on the storage appliance. Sharing of files is a hallmark of a NAS system, which is enabled because of its semantic level of access to files and file systems. Storage of information on a NAS system is typically deployed over a computer network comprising a geographically distributed collection of interconnected communication links, such as Ethernet, that allow clients to remotely access the information (files) on the storage appliance. The clients typically communicate with the storage appliance by exchanging discrete frames or packets of data according to pre-defined protocols, such as the Transmission Control Protocol/Internet Protocol (TCP/IP).
In the client/server model, the client may comprise an application executing on a computer that “connects” to the storage appliance over a computer network, such as a point-to-point link, shared local area network, wide area network or virtual private network implemented over a public network, such as the Internet. NAS systems generally utilize file-based access protocols; therefore, each client may request the services of the storage appliance by issuing file system protocol messages (in the form of packets) to the file system over the network identifying one or more files to be accessed without regard to specific locations, e.g., blocks, in which the data are stored on disk. By supporting a plurality of file system protocols, such as the conventional Common Internet File System (CIFS), the Network File System (NFS) and the Direct Access File System (DAFS) protocols, the utility of the storage appliance may be enhanced for networking clients.
A SAN is a high-speed network that enables establishment of direct connections between a storage system and its storage devices. The SAN may thus be viewed as an extension to a storage bus and, as such, an operating system of the storage system enables access to stored information using block-based access protocols over the “extended bus”. In this context, the extended bus is typically embodied as Fibre Channel (FC) or Ethernet media adapted to operate with block access protocols, such as Small Computer Systems Interface (SCSI) protocol encapsulation over FC or TCP/IP/Ethernet.
SCSI is a peripheral input/output (I/O) interface with a standard, device independent protocol that allows different peripheral storage devices, such as disks, to attach to the storage system. In SCSI terminology, clients operating in a SAN environment are initiators that initiate requests and commands for data. The storage system is a target configured to respond to the requests issued by the initiators in accordance with a request/response protocol. The SAN clients typically identify and address the stored information in the form of blocks or disks by logical unit numbers (“luns”).
A SAN arrangement or deployment allows decoupling of storage from the storage system, such as an application server, and some level of information storage sharing at the application server level. There are, however, environments wherein a SAN is dedicated to a single server. In some SAN deployments, the information is organized in the form of databases, while in others a file-based organization is employed. Where the information is organized as files, the client requesting the information maintains file mappings and manages file semantics, while its requests (and server responses) address the information in terms of block addressing on disk using, e.g., a lun.
Packets of information received by a storage appliance from a network interface are typically stored in a memory buffer data structure or mbuf in the memory of the storage appliance. Mbufs are used to organize received information into a standardized format that can be manipulated by various layers of a network protocol stack within a storage operating system. The information stored in a mbuf can include a variety of different data types including, inter alia, source and destination addresses, socket options, user data and file access requests. Further, mbufs can be used as elements of larger data structures, e.g. linked lists, and are particularly useful in dynamically changing structures since they can be created or removed “on-the-fly.” A description of mbuf data structures is provided in TCP/IP Illustrated, Volume 2 by Wright, et al (1995) which is incorporated herein by reference.
Information is often received from a network as data packets of various lengths and these packets are stored in variable length chains of mbufs. In contrast, block-based protocols usually operate on predefined block sizes. Thus, received data is typically converted from variable length mbufs to appropriate fixed size blocks for use by the block-based protocols.
The process of converting data from variable length mbuf data structures to fixed sized blocks consumes system resources, such as memory and central processing unit (CPU) cycles, that could be used for other operations executed by the storage appliance. Furthermore, the latency resulting from this conversion becomes particularly noticeable when a large number of mbuf data structures are converted to fixed sized data blocks. For example, when the storage appliance receives a request to store (via a “WRITE” operation) a large amount of data to disk, the storage appliance must allocate a sufficient amount of memory for mbufs to receive the incoming data and, in addition, must allocate more memory to copy the contents of the mbufs when the received data is divided into fixed block sizes. Not only does such a WRITE operation consume a lot of memory, but it also requires the storage appliance's CPU to implement instructions for partitioning the data, thereby consuming CPU bandwidth that could be used by other processes.
Therefore, it is generally desirable to decrease the latency of processing in-coming data to a storage appliance by decreasing the number of times mbufs are copied and partitioned in the storage appliance's memory. More specifically, it is desirable to minimize the amount of time and system resources needed to write large amounts of data to one or more storage disks in a storage appliance without affecting the resolution of other data access requests, such as iSCSI “READ” requests.