Various forms of network-based storage systems exist today. These forms include network attached storage (NAS), storage area networks (SANs), and others. Network storage systems are commonly used for a variety of purposes, such as providing multiple users with access to shared data and backing up critical data (e.g., by data mirroring).
A network-based storage system typically includes at least one storage server, which is a processing system configured to store and retrieve data on behalf of one or more client processing systems (“clients”). In the context of NAS, a storage server may be a file server, which is sometimes called a “filer”. A filer operates on behalf of one or more clients to store and manage shared files. The files may be stored in a storage system that includes one or more arrays of mass storage devices, such as magnetic or optical disks or tapes, by using a storage scheme such as Redundant Array of Inexpensive Disks (“RAID”). Additionally, the mass storage devices in each array may be organized into one or more separate RAID groups.
In a SAN context, a storage server provides clients with block-level access to stored data, rather than file-level access. Some storage servers are capable of providing clients with both file-level access and block-level access, such as certain filers made by Network Appliance, Inc. (NetApp) of Sunnyvale, Calif.
A storage server typically includes one or more file systems. A file system, as the term is used here, is a structured (e.g., hierarchical) set of stored data, such as files, directories and/or other types of data containers. As a file system ages and services client-initiated write requests, data and free space tend to become fragmented, a process that accelerates as the amount of free space decreases. Fragmentation occurs when free space within storage is divided into many small pieces over time because regions of storage of varying sizes are continuously allocated and deallocated. Subsequent allocations attempt to fill the “holes” left behind by the deallocated regions, which are interspersed throughout the storage space. As a result, read and write performance tends to degrade over time because data gets divided into non-contiguous pieces in order to fill the “holes.”
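The fragmentation effect described above can be illustrated with a minimal sketch. It assumes a toy 16-block storage space tracked by a bitmap and a simple allocator that fills free holes in address order; the names `free_extents`, `allocate`, and `release` are hypothetical and do not correspond to any particular file system.

```python
def free_extents(bitmap):
    """Return (start, length) runs of free blocks; False marks a free block."""
    extents, start = [], None
    for i, used in enumerate(bitmap):
        if not used and start is None:
            start = i
        elif used and start is not None:
            extents.append((start, i - start))
            start = None
    if start is not None:
        extents.append((start, len(bitmap) - start))
    return extents


def allocate(bitmap, nblocks):
    """Fill free holes in address order; return the extents the data landed in."""
    extents = free_extents(bitmap)
    if sum(length for _, length in extents) < nblocks:
        raise MemoryError("not enough free blocks")
    placed = []
    for start, length in extents:
        if nblocks == 0:
            break
        take = min(length, nblocks)
        bitmap[start:start + take] = [True] * take
        placed.append((start, take))
        nblocks -= take
    return placed


def release(bitmap, extents):
    """Mark every block of the given extents as free again."""
    for start, length in extents:
        bitmap[start:start + length] = [False] * length


disk = [False] * 16            # assumed toy storage space of 16 blocks
a = allocate(disk, 4)          # four regions fill the space contiguously
b = allocate(disk, 4)
c = allocate(disk, 4)
d = allocate(disk, 4)
release(disk, a)               # deallocation leaves two separate 4-block holes
release(disk, c)
e = allocate(disk, 6)          # a 6-block write must be split across both holes
print(e)
print(free_extents(disk))
```

In this sketch the 6-block allocation cannot be placed contiguously even though 8 blocks are free, so it is divided between the two holes, and reading it back would require access to two non-adjacent regions.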
Moving (relocating) data on disk, a process known as “segment cleaning,” alleviates fragmentation of free space within a disk. Nonetheless, segment cleaning can be expensive from a performance standpoint. Reads of existing data on disk in preparation for relocation, as well as the relocation of the data itself, may compete with the servicing of client requests, discouraging segment cleaning in heavily loaded systems that would most benefit from it.
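The trade-off between reclaimed free space and added I/O can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular cleaner: the disk is modeled as a list of block owners (`None` for a free block), segments are assumed to be fixed at four blocks, and the function `clean_segment` and its cost model (one read plus one write per relocated block) are hypothetical.

```python
def clean_segment(disk, seg_size, victim):
    """Relocate live blocks out of segment `victim`; return the I/O operations spent."""
    lo, hi = victim * seg_size, (victim + 1) * seg_size
    # Free blocks outside the victim segment that can receive relocated data.
    targets = [i for i, blk in enumerate(disk) if blk is None and not lo <= i < hi]
    live = [i for i in range(lo, hi) if disk[i] is not None]
    if len(live) > len(targets):
        raise RuntimeError("not enough free space outside the victim segment")
    io_ops = 0
    for src, dst in zip(live, targets):
        disk[dst] = disk[src]   # assumed cost: one read of the old copy, one write of the new
        disk[src] = None
        io_ops += 2
    return io_ops


# Three 4-block segments; segment 0 holds the fewest live blocks.
disk = ["a", None, None, "b",
        "c", "c", "c", None,
        "d", "d", None, "e"]
cost = clean_segment(disk, seg_size=4, victim=0)
print(disk)
print(cost)
```

After cleaning, segment 0 is entirely free and can satisfy a contiguous 4-block write, but the cleaner spent four I/O operations that, in a loaded system, would have competed with client requests.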