A storage system, which may also be known as a filer or a file server, is a computer that provides file services relating to the organization of information on storage media such as disks. The storage system includes a storage operating system that implements a file system to logically organize the information as a hierarchical structure of directories and files on the disks. Each “on-disk” file may be implemented as a set of disk blocks configured to store information, whereas a directory may be implemented as a specially formatted file in which information about other files and directories is stored.
Storage systems may issue packets using file-based access protocols, such as the Common Internet File System (CIFS) protocol or the Network File System (NFS) protocol, over the Transmission Control Protocol/Internet Protocol (TCP/IP) when accessing information in the form of files and directories. Alternatively, storage systems may issue packets using block-based access protocols, such as the Small Computer Systems Interface (SCSI) protocol encapsulated over TCP (iSCSI) and SCSI encapsulated over Fibre Channel (FCP), when accessing information in the form of blocks.
A common type of file system for a storage system is a write in-place file system, in which the locations of the data structures (such as inodes and data blocks) on disk are typically fixed. An inode may be a data structure used to store information, such as metadata, about a file, whereas the data blocks are structures used to store the actual data for the file. The information contained in an inode may include information relating to: ownership of the file, access permissions for the file, the size of the file, the file type, and references to locations on disk of the data blocks for the file. The references to the locations of the file data are provided by pointers, which may further reference indirect blocks that, in turn, reference the data blocks, depending upon the quantity of data in the file. Changes to the inodes and data blocks are made “in-place” in accordance with the write in-place file system. If an update to a file extends the quantity of data for the file, an additional data block is allocated and the appropriate inode is updated to reference that data block.
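The inode arrangement described above may be illustrated by the following minimal sketch. The names Inode, Disk, append_block, and overwrite_in_place, as well as the constants NUM_DIRECT and PTRS_PER_INDIRECT, are hypothetical and chosen only for illustration; they are not taken from any particular file system.

```python
from dataclasses import dataclass, field
from typing import Dict, List

NUM_DIRECT = 12           # assumed number of direct pointers held in an inode
PTRS_PER_INDIRECT = 1024  # assumed number of pointers held in one indirect block


@dataclass
class Inode:
    """Metadata about a file plus references to its data blocks."""
    owner: str
    permissions: int
    size: int = 0
    file_type: str = "regular"
    direct: List[int] = field(default_factory=list)    # data block numbers
    indirect: List[int] = field(default_factory=list)  # indirect block numbers


class Disk:
    """Toy disk that maps a block number to the contents of that block."""

    def __init__(self) -> None:
        self.blocks: Dict[int, object] = {}
        self.next_free = 0

    def allocate(self, content) -> int:
        block_no = self.next_free
        self.next_free += 1
        self.blocks[block_no] = content
        return block_no


def append_block(disk: Disk, inode: Inode, data: bytes) -> None:
    """Allocate an additional data block and update the inode to reference it."""
    block_no = disk.allocate(data)
    if len(inode.direct) < NUM_DIRECT:
        inode.direct.append(block_no)
    else:
        # Larger files reference data blocks through indirect blocks.
        last_full = (not inode.indirect
                     or len(disk.blocks[inode.indirect[-1]]) >= PTRS_PER_INDIRECT)
        if last_full:
            inode.indirect.append(disk.allocate([]))
        disk.blocks[inode.indirect[-1]].append(block_no)
    inode.size += len(data)


def data_block_numbers(disk: Disk, inode: Inode) -> List[int]:
    """Follow direct pointers, then indirect blocks, in file order."""
    numbers = list(inode.direct)
    for indirect_no in inode.indirect:
        numbers.extend(disk.blocks[indirect_no])
    return numbers


def overwrite_in_place(disk: Disk, inode: Inode, index: int, data: bytes) -> int:
    """Write-in-place update: the block keeps its fixed location on disk."""
    block_no = data_block_numbers(disk, inode)[index]
    disk.blocks[block_no] = data
    return block_no


def read_file(disk: Disk, inode: Inode) -> bytes:
    return b"".join(disk.blocks[n] for n in data_block_numbers(disk, inode))
```

In this sketch, a file of fourteen blocks would use twelve direct pointers and one indirect block, and an in-place update leaves the block number of the modified block unchanged.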
Another type of file system is a write-anywhere file system that does not overwrite data on disks. If a data block on disk is read from disk into memory and “dirtied” with new data, the data block is written to a new location on the disk to optimize write performance. A write-anywhere file system may initially assume an optimal layout, such that the data is substantially contiguously arranged on the disks. The optimal disk layout results in efficient access operations, particularly for sequential read operations. A particular example of a write-anywhere file system is the Write Anywhere File Layout (WAFL®) file system available from Network Appliance, Inc. The WAFL file system may be implemented within a microkernel as part of the overall protocol stack of the storage system and associated disk storage. This microkernel may be supplied as part of the storage operating system.
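The write-anywhere behavior described above may be contrasted with in-place updating by the following minimal sketch. It is not the WAFL implementation; the class WriteAnywhereStore and its block_map attribute are hypothetical names, and free-space reclamation is omitted.

```python
from typing import Dict, Optional, Tuple


class WriteAnywhereStore:
    """Toy store in which a dirtied block is written to a new disk location."""

    def __init__(self) -> None:
        self.disk: Dict[int, bytes] = {}      # disk block number -> contents
        self.block_map: Dict[int, int] = {}   # logical block index -> disk block number
        self.next_free = 0

    def write(self, logical_index: int, data: bytes) -> Tuple[Optional[int], int]:
        """Write data to a fresh location and repoint the logical block to it.

        Returns (old_location, new_location); old_location is None on first write.
        The old location is left untouched rather than overwritten.
        """
        new_location = self.next_free
        self.next_free += 1
        self.disk[new_location] = data
        old_location = self.block_map.get(logical_index)
        self.block_map[logical_index] = new_location
        return old_location, new_location

    def read(self, logical_index: int) -> bytes:
        return self.disk[self.block_map[logical_index]]


store = WriteAnywhereStore()
store.write(0, b"original")
old, new = store.write(0, b"updated")
```

After the second write in this sketch, the logical block refers to the new location, while the contents at the old location remain intact on the toy disk.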
A storage operating system generally refers to the computer-executable code operable on a storage system that manages data access. In the case of a filer, the storage operating system may implement file system semantics, as does the Data ONTAP® storage operating system provided by Network Appliance, Inc., of Sunnyvale, Calif. The storage operating system may also be implemented as an application program operating on a general-purpose operating system, such as UNIX® or Windows®, or as a general-purpose operating system with configurable functionality that is configured for storage applications.
Disk storage may be implemented as one or more storage volumes that comprise physical storage disks, defining an overall logical arrangement of storage space. Currently available storage system implementations can serve a large number of discrete volumes. Each volume may be associated with its own file system.
The disks within a volume may be organized as a Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID implementations enhance the reliability and integrity of data storage through the writing of data “stripes” across a given number of physical disks in the RAID group, and the appropriate storing of parity information with respect to the striped data. In the example of the WAFL® file system, a RAID 4 implementation is advantageously employed, which entails striping data across a group of disks, and storing parity (a data protection value) on a separate disk in the RAID group. A volume typically comprises at least one data disk and one associated parity disk (or possibly data/parity partitions in a single disk) arranged according to a RAID 4, or equivalent high-reliability, implementation. A person of ordinary skill in the art would understand that other RAID implementations, such as RAID-5 or RAID-DP, can be used as desired.
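The role of the parity disk may be illustrated by the following minimal sketch, which assumes byte-wise XOR parity over equal-length blocks, as is conventional for RAID 4. The function names xor_blocks, write_stripe, and reconstruct are hypothetical and do not correspond to any particular RAID implementation.

```python
from functools import reduce
from typing import List, Sequence, Tuple


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks: Sequence[bytes]) -> Tuple[List[bytes], bytes]:
    """Return the blocks for the data disks and the block for the parity disk."""
    return list(data_blocks), xor_blocks(data_blocks)


def reconstruct(surviving_blocks: Sequence[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


stripe, parity = write_stripe([b"\x01\x02", b"\x0f\xf0", b"\xaa\x55"])
# Simulate the loss of the second data disk and rebuild its block.
rebuilt = reconstruct([stripe[0], stripe[2]], parity)
```

In this sketch, the rebuilt block equals the block that was stored on the failed data disk, since XOR-ing the parity with the surviving blocks cancels their contributions.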
A storage system may mirror (i.e., replicate), distribute, or exchange large amounts of data to another storage system across a network. A network may be a local area network (LAN), a wide area network (WAN), the Internet, a wired network, a wireless network, or a computer bus, as desired. The replication of data may be needed for disaster recovery or data distribution. Since storage systems may each be in different, remote, geographical locations, high latency (i.e., delay) occurs and some data packets may be lost when data is communicated across a network. An undesirable effect of the high latency and lost packets is a decrease in effective system throughput, or data rate, over a network when moving data between storage systems. Effective system throughput decreases since packets lost in a time or transmission window are typically retransmitted by the sending storage system until they are successfully received by the receiving storage system across the network. A transmission window may be a maximum amount of data a storage system may receive within a predetermined time frame.
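The effect of latency and packet loss on throughput may be estimated with the following minimal sketch. It rests on two simplifying assumptions that are not stated in the text above: the predetermined time frame of a transmission window is taken to be one network round trip, and each packet is lost independently with a fixed probability and resent until received. The function names and the example figures are hypothetical.

```python
def window_limited_throughput(window_bytes: int, round_trip_seconds: float) -> float:
    """Upper bound in bytes/second when one window is delivered per round trip."""
    return window_bytes / round_trip_seconds


def goodput_with_loss(throughput: float, loss_probability: float) -> float:
    """Useful bytes/second under independent loss with retransmission.

    Each packet is sent 1 / (1 - p) times on average, so only a fraction
    (1 - p) of the transmitted bytes is new data.
    """
    return throughput * (1.0 - loss_probability)


WINDOW = 64 * 1024  # assumed 64 KiB transmission window

lan = window_limited_throughput(WINDOW, 0.001)  # assumed 1 ms round trip
wan = window_limited_throughput(WINDOW, 0.100)  # assumed 100 ms round trip
wan_lossy = goodput_with_loss(wan, 0.02)        # assumed 2% packet loss

print(f"LAN bound:        {lan:,.0f} bytes/s")
print(f"WAN bound:        {wan:,.0f} bytes/s")
print(f"WAN with 2% loss: {wan_lossy:,.0f} bytes/s")
```

Under these assumptions, a hundredfold increase in round-trip time reduces the throughput bound by the same factor for a fixed window, and retransmission of lost packets reduces the useful data rate further.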
To improve system throughput and overall link utilization, storage systems may use data compression. Conventional storage systems require additional hardware adapters for data compression. The additional hardware results in various incompatible hardware configurations for supporting compression between storage systems.
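For illustration only, the following minimal sketch shows how compression performed in software may reduce the number of bytes placed on a link. It uses the zlib module of the Python standard library as a stand-in; the sample payload is hypothetical and highly repetitive, and the sketch is not asserted to be the approach of any particular storage system.

```python
import zlib

# Hypothetical payload: repetitive data compresses well; random data would not.
payload = b"block of replicated file data " * 1000

compressed = zlib.compress(payload, 6)  # compression level 6 chosen arbitrarily
restored = zlib.decompress(compressed)

# Fraction of the original size that would actually cross the network.
ratio = len(compressed) / len(payload)
print(f"original: {len(payload)} bytes, compressed: {len(compressed)} bytes "
      f"({ratio:.1%} of original)")
```

The achievable reduction depends on the data being transferred, and the processing time spent on compression and decompression would need to be weighed against the bandwidth saved.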
Therefore, a need exists for exchanging large amounts of data between storage systems across a network while maintaining a high data rate without the undesirable need for additional hardware.