A data storage system is a computer and related storage medium that enables storage or backup of large amounts of data. Storage systems, also known as storage appliances or storage servers, may support a network attached storage (NAS) computing environment. A NAS is a computing environment where file-based access is provided through a network, typically in a client/server configuration. A storage server can provide clients with file-level access to data stored in a set of mass storage devices, such as magnetic or optical storage disks.
A file server (also known as a “filer”) is a computer that provides file services relating to the organization of information on storage devices, such as disks. The filer includes a storage operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on the disks. Each “on-disk” file may be implemented as a set of disk blocks configured to store information, whereas the directory may be implemented as a specially-formatted file in which information about other files and directories is stored. A filer may be configured to operate according to a client/server model of information delivery to allow many clients to access files stored on the filer. In this model, the client may include an application, such as a file system protocol, executing on a computer that connects to the filer over a computer network. The computer network can include, for example, a point-to-point link, a shared local area network (LAN), a wide area network (WAN), or a virtual private network (VPN) implemented over a public network such as the Internet. Each client may request filer services by issuing file system protocol messages (in the form of packets) to the filer over the network.
A common file system type is a “write in-place” file system, in which the locations of the data structures (such as inodes and data blocks) on a disk are typically fixed. An inode is a data structure used to store information, such as metadata, about a file, whereas the data blocks are structures used to store the actual data for the file. The information contained in an inode may include information relating to ownership of the file, access permissions for the file, the size of the file, the file type, and references to locations on disk of the data blocks for the file. The references to the locations of the file data are provided by pointers, which may further reference indirect blocks. Indirect blocks, in turn, reference the data blocks, depending upon the quantity of data in the file. Changes to the inodes and data blocks are made “in-place” in accordance with the write in-place file system. If an update to a file extends the quantity of data for the file, an additional data block is allocated and the appropriate inode is updated to reference that data block.
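The inode layout described above can be illustrated with a minimal sketch. Everything in it is a hypothetical simplification for illustration, not the on-disk format of any particular file system: the names `Inode`, `Disk`, `append_block`, and `overwrite_in_place`, the limit of two direct pointers (`N_DIRECT`), and the use of a Python dictionary as the "disk" are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

N_DIRECT = 2  # hypothetical: number of direct pointers held in the inode


@dataclass
class Inode:
    """Metadata about one file, plus references to its data blocks."""
    owner: str
    permissions: int
    file_type: str = "regular"
    size: int = 0                                    # file size in bytes
    direct: List[int] = field(default_factory=list)  # data block numbers
    indirect: Optional[int] = None                   # number of an indirect block


class Disk:
    """Toy disk: block number -> data bytes, or a list of block numbers."""

    def __init__(self) -> None:
        self.blocks: Dict[int, Union[bytes, List[int]]] = {}
        self.next_free = 0

    def allocate(self, content: Union[bytes, List[int]]) -> int:
        number = self.next_free
        self.next_free += 1
        self.blocks[number] = content
        return number


def block_numbers(disk: Disk, inode: Inode) -> List[int]:
    """Follow direct pointers, then the indirect block if one exists."""
    numbers = list(inode.direct)
    if inode.indirect is not None:
        numbers.extend(disk.blocks[inode.indirect])
    return numbers


def append_block(disk: Disk, inode: Inode, data: bytes) -> int:
    """Extend the file: allocate a data block and update the inode."""
    number = disk.allocate(data)
    if len(inode.direct) < N_DIRECT:
        inode.direct.append(number)
    else:
        if inode.indirect is None:
            inode.indirect = disk.allocate([])  # new, empty indirect block
        disk.blocks[inode.indirect].append(number)
    inode.size += len(data)
    return number


def overwrite_in_place(disk: Disk, inode: Inode, index: int, data: bytes) -> int:
    """Write in-place: the block keeps its fixed location on disk."""
    number = block_numbers(disk, inode)[index]
    inode.size += len(data) - len(disk.blocks[number])
    disk.blocks[number] = data
    return number
```

In this sketch, a file that outgrows its direct pointers gains an indirect block, and an overwrite returns the same block number the data occupied before the update.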
Another file system type is a write-anywhere file system that does not overwrite data on disks. If a data block on a disk is read from the disk into memory and “dirtied” with new data, the data block is written to a new location on the disk to optimize write performance. A write-anywhere file system may initially assume an optimal layout, such that the data is substantially contiguously arranged on the disks. The optimal disk layout results in efficient access operations, particularly for sequential read operations. A particular example of a write-anywhere file system is the Write Anywhere File Layout (WAFL®) file system available from Network Appliance, Inc. The WAFL file system is implemented within a microkernel as part of the overall protocol stack of the filer and associated disk storage. This microkernel is supplied as part of Network Appliance's Data ONTAP® storage operating system, residing on the filer that processes file service requests from network-attached clients.
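The difference from a write in-place system can be sketched as follows. This is a generic illustration of the write-anywhere idea under stated assumptions, not a description of how the WAFL file system is implemented: the class `WriteAnywhereStore`, its append-only `disk` list, and its `block_map` are hypothetical names.

```python
from typing import Dict, List


class WriteAnywhereStore:
    """Toy store in which a dirtied block is never overwritten in place."""

    def __init__(self) -> None:
        self.disk: List[bytes] = []          # physical blocks, in write order
        self.block_map: Dict[int, int] = {}  # logical block -> physical location

    def write(self, logical_block: int, data: bytes) -> int:
        """Write data to a new physical location and remap the logical block."""
        self.disk.append(data)
        location = len(self.disk) - 1
        self.block_map[logical_block] = location
        return location

    def read(self, logical_block: int) -> bytes:
        return self.disk[self.block_map[logical_block]]
```

Writing the same logical block twice yields two different physical locations; the earlier copy remains on the toy disk until some separate mechanism, not modeled here, reclaims it.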
As used herein, the term “storage operating system” generally refers to the computer-executable code operable on a storage system that manages data access. The storage operating system may, in the case of a filer, implement file system semantics, as does the Data ONTAP® storage operating system. The storage operating system can also be implemented as an application program operating on a general-purpose operating system, such as UNIX® or Windows®, or as a general-purpose operating system with configurable functionality, which is configured for storage applications as described herein.
Disk storage is typically implemented as one or more storage “volumes” that comprise physical storage disks, defining an overall logical arrangement of storage space. Currently available filer implementations can serve a large number of discrete volumes.
The disks within a volume can be organized as a Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID implementations enhance the reliability and integrity of data storage through the writing of data “stripes” across a given number of physical disks in the RAID group, and the appropriate storing of parity information with respect to the striped data. In the example of a WAFL® file system, a RAID-4 implementation is advantageously employed, which entails striping data across a group of disks, and storing the parity within a separate disk of the RAID group. As described herein, a volume typically comprises at least one data disk and one associated parity disk (or possibly data/parity partitions in a single disk) arranged according to a RAID-4, or equivalent high-reliability, implementation.
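The role of the parity disk can be illustrated with a short sketch. It assumes the usual RAID-4 convention that the parity block of a stripe is the bytewise XOR of the data blocks in that stripe; the function names `xor_blocks`, `stripe_parity`, and `reconstruct` are hypothetical, and the blocks are assumed to be of equal length.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_parity(data_blocks: List[bytes]) -> bytes:
    """Parity block stored on the dedicated parity disk for one stripe."""
    return xor_blocks(data_blocks)


def reconstruct(surviving_blocks: List[bytes], parity_block: bytes) -> bytes:
    """Rebuild the single missing data block of a stripe."""
    return xor_blocks(surviving_blocks + [parity_block])
```

Because XOR is its own inverse, combining the surviving data blocks with the parity block yields the block that was on the failed disk.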
NAS devices provide access to stored data using standard protocols, e.g., Network File System (NFS), Common Internet File System (CIFS), Internet Small Computer System Interface (iSCSI), etc. To manipulate the data stored on these devices, clients have to fetch the data using an access protocol, modify the data, and then write back the resulting modified data. Bulk data processing sometimes requires small manipulations of the data that need to be processed as fast as possible. This process (fetch-modify-write) is inefficient for bulk data processing, as it wastes processor time on protocol and network processing and increases network utilization. The closer the processing is to the stored data, the less time the data processing will take.
Traditional file systems are not particularly adept at handling large numbers (e.g., more than one million) of small objects (e.g., one kilobyte (KB) files). The typical way of addressing this problem is to use a container to hold several of the small objects. However, this solution leads to the problems of how to manage the containers and how to manage the objects within the container. Managing the containers reintroduces the typical file system problems one level up, at the level of the containers.
In applications that use files for storing a list of records, a deleted record is often marked as “deleted” instead of being physically removed from the file. The file is periodically repacked to purge all of the deleted records and to reclaim space. This process is traditionally carried out by reading the file by an application via NFS, for example; packing the records by the application; and writing the file back to storage via NFS, for example. Again, this process uses the typical fetch-modify-write pattern, which makes the entire repacking process inefficient for the storage device.
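The mark-then-repack pattern can be sketched as follows. The record layout (`Record` with `key`, `payload`, and a `deleted` flag) and the function names are hypothetical; the sketch shows only the packing step that the application performs between fetching the file and writing it back, not the NFS transfers themselves.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    key: int
    payload: str
    deleted: bool = False


def mark_deleted(records: List[Record], key: int) -> List[Record]:
    """Flag a record as deleted without physically removing it."""
    return [r._replace(deleted=True) if r.key == key else r for r in records]


def repack(records: List[Record]) -> List[Record]:
    """Purge records flagged as deleted, preserving the order of the rest."""
    return [r for r in records if not r.deleted]
```

Until `repack` runs, a deleted record still occupies space in the file; repacking is what actually reclaims it.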
Another example of this type of IO-intensive task is reading a file and rewriting the data to another file, with the data being relocated within the destination file. In addition to using resources on the NAS device, this task also incurs a load on the network (sending the file back and forth) and a load on the client that is processing the data.
FIG. 1 is a flow diagram of an existing fetch-modify-write method 100 for manipulating data stored on a storage device. The method 100 operates between a server 102 and a data storage media 104. The server 102 and the data storage media 104 communicate over a network connection. The server 102 requests data to be manipulated from the storage media 104 (step 110). The storage media 104 retrieves the data and sends the data over the network to the server 102 (step 112). The server 102 manipulates the requested data (step 114) and sends the manipulated data back over the network to the storage media 104 (step 116).
As can be seen from FIG. 1, the method 100 requires that the data be sent over the network twice: once from the storage media 104 to the server 102 (step 112) and a second time from the server 102 to the storage media 104 (step 116).
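The network cost of the method 100 can be made concrete with a small simulation. This is a sketch under stated assumptions: the classes `StorageMedia` and `Network`, the byte counter, and the function `fetch_modify_write` are hypothetical stand-ins for the storage media 104, the network connection, and the server 102, and the small request message of step 110 is not counted.

```python
from typing import Callable, Dict


class StorageMedia:
    """Stand-in for the storage media 104: named blobs of stored data."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}


class Network:
    """Stand-in for the network connection; counts payload bytes carried."""

    def __init__(self) -> None:
        self.bytes_sent = 0

    def send(self, payload: bytes) -> bytes:
        self.bytes_sent += len(payload)
        return payload


def fetch_modify_write(
    media: StorageMedia,
    network: Network,
    name: str,
    manipulate: Callable[[bytes], bytes],
) -> bytes:
    """Server-side view of steps 110 through 116 of FIG. 1."""
    data = network.send(media.files[name])      # steps 110 and 112: fetch
    result = manipulate(data)                   # step 114: modify on the server
    media.files[name] = network.send(result)    # step 116: write back
    return result
```

For a manipulation that leaves the data size unchanged, the counter ends at twice the size of the data, which is the double transfer identified above.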
Accordingly, there is a need for a technique for manipulating data on a storage device that avoids the limitations of the prior art solutions.