In the computer industry, memory devices such as hard and floppy disks and Random Access Memory (RAM) provide a common means for storing computer information. Space is allocated in and data is stored to and retrieved from these storage media through hardware and software adapted for that purpose. Contiguous free storage space generally provides optimum performance in such memory devices.
An insufficient amount of contiguous space is often the result of storage space having become fragmented. Fragmentation generally exists when only separate, discrete blocks of free storage space are available for use (usually in a randomly scattered fashion across the storage space), rather than a large contiguous block of storage space being available for use. Alternatively, fragmentation exists when use of a given storage area is not in compliance with specified storage management criteria. In particular, fragmentation can result when data is moved within the storage system in units smaller than can be allocated on the storage media within the context of the data storage system.
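The distinction between total free space and contiguous free space can be illustrated with a minimal sketch. The block map, the function names, and the request size below are hypothetical and chosen only for illustration; they are not drawn from any particular storage system.

```python
def free_runs(bitmap):
    """Return the length of each maximal run of free blocks (False = free)."""
    runs, current = [], 0
    for used in bitmap:
        if used:
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


def can_allocate(bitmap, size):
    """A request succeeds only if some single free run is large enough."""
    return any(run >= size for run in free_runs(bitmap))


# Hypothetical block map: True = block in use, False = block free.
bitmap = [True, False, True, False, False, True, False, True, False, False]

total_free = sum(free_runs(bitmap))   # free blocks overall
largest_run = max(free_runs(bitmap))  # largest contiguous free region
```

In this map six blocks are free in total, yet the largest contiguous run is two blocks, so a request for three contiguous blocks would fail even though enough free space exists.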
Two major problems arise with data being fragmented on disk drives and other storage media. First, some requests for space cannot be satisfied. Second, separate processing must occur to defragment the storage (commonly referred to as garbage collection) in order to create more allocatable free space from the fragments of free space that are scattered across the storage system.
Fragmentation also occurs where a plurality of disk storage devices are used together in what is commonly known as a Redundant Array of Independent Disks (RAID or disk array). However, fragmentation in disk arrays presents a more complex management issue due to the various data redundancy schemes that may be employed in the array.
Essentially, there are two common types of disk array data redundancy schemes: (1) mirror sets, in which two or more member disks contain identical images of data, and (2) stripe sets, which interleave data and redundant (parity) data on three or more member disks. From a data management and data redundancy perspective, these broad categories are further identified with differing RAID Levels. For example, the use of disk mirroring is referred to as RAID Level 1, and parity checking as RAID Levels 2, 3, 4, 5, and 6. Although RAID 1 provides the highest data reliability and may provide the best small-write Input/Output (I/O) performance, it uses the most storage space because all data is duplicated. In contrast, RAID Levels 2-6 provide less data reliability than RAID 1 and, typically, reduced small-write performance. However, they do not consume as much disk space as RAID 1 because data is not duplicated but rather interleaved and parity checked across the disk array in a stripe set.
The parity stripe set presents a single virtual disk whose user data capacity is approximately the sum of the capacities of its members, less the storage used for holding the parity (redundant) data of the user data. The mirror set presents a single virtual disk whose user data capacity is the sum of the capacities of one-half of its members, the other half holding the mirrored (redundant) copy of the user data.
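The capacity relationships just described can be worked through in a short sketch. It assumes equal-sized members, two-way mirroring, and one member's worth of parity per stripe set by default; the function names and the example sizes are hypothetical.

```python
def parity_set_capacity(members, member_size, parity_members=1):
    """User capacity of a parity stripe set: all members less the parity share.

    Assumes equal-sized members; parity_members=1 is an assumption that
    matches a single-parity layout.
    """
    if members < 3:
        raise ValueError("a parity stripe set needs three or more members")
    return (members - parity_members) * member_size


def mirror_set_capacity(members, member_size):
    """User capacity of a two-way mirror set: one-half of the members."""
    if members < 2 or members % 2:
        raise ValueError("two-way mirroring assumes an even number of members")
    return (members // 2) * member_size


# Hypothetical array: six members of 4 GB each.
parity_gb = parity_set_capacity(6, 4)
mirror_gb = mirror_set_capacity(6, 4)
```

Under these assumptions the same six members yield 20 GB of user capacity as a parity stripe set and 12 GB as a mirror set, which is the space advantage of parity redundancy referred to above.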
For example, RAID Level 4 uses a stripe set and a dedicated parity disk to store redundant information about the data existing on the several data disks in the disk array. Segments of data from each virtual disk sector are distributed across corresponding sectors of every array member except one (the parity disk), and the parity of the distributed segments is written in the corresponding sector of the parity disk. Parity is commonly calculated using a bit-by-bit Exclusive OR function of corresponding data chunks in a stripe set from all of the data disks.
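The Exclusive OR parity calculation can be sketched as follows. The chunk contents and the helper name `xor_chunks` are hypothetical; the sketch also shows the well-known property that a single lost chunk equals the Exclusive OR of the surviving chunks and the parity.

```python
from functools import reduce


def xor_chunks(chunks):
    """Bit-by-bit Exclusive OR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


# Hypothetical stripe: one two-byte chunk on each of three data disks.
data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]

# The parity chunk that would be written to the parity disk.
parity = xor_chunks(data)

# If the second data disk is lost, its chunk can be rebuilt from the
# surviving data chunks together with the parity chunk.
rebuilt = xor_chunks([data[0], data[2], parity])
```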
RAID Level 5 is similar to RAID 4 in that data is striped, but differs in that the redundant information is distributed across all disks in the array rather than held on a dedicated parity disk. Although RAID 5 data reliability approaches that of mirroring, there is a substantial performance penalty compared to a single disk when data is written in units smaller than a whole stripe or not aligned on stripe boundaries. This write performance penalty also exists in RAID 4.
This write performance penalty is due in part to the read-modify-write overhead associated with calculating and storing the parity of the data. Specifically, whenever data is newly written to a stripe in a disk array that already contains data, the existing parity in the stripe must be read in order to calculate (modify) and write the new parity. Because of this read-modify-write parity storage overhead, present defragmentation schemes generally move data from one fragmented location to a stripe that the system is currently extending. One of the reasons for doing this is to be able to cache the parity of the stripe and avoid some of the read-modify-write overhead of parity writes. Another reason is that prior art teaches that this method is best for maintaining large free storage areas for fast writing. See Rosenblum, Mendel and John K. Ousterhout. "The Design and Implementation of a Log-Structured File System." (Computer Science Div., Dept. of Electrical Engineering and Computer Science, Univ. of California, Berkeley: ACM, 1991); and de Jonge, Wiebren, M. Frans Kaashoek, and Wilson C. Hsieh. "The Logical Disk: A New Approach to Improving File Systems." (Proceedings of 14th ACM Symposium on Operating Systems Principles, Asheville, N.C.: Dec. 5-8, 1993) 15-28.
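The read-modify-write parity update can be sketched as follows, assuming Exclusive OR parity. The function names and chunk values are hypothetical. The new parity is derived from the old parity, the old data chunk, and the new data chunk, so a small write requires reading the old chunk and the old parity before the new chunk and the new parity can be written.

```python
from functools import reduce


def xor_bytes(a, b):
    """Bit-by-bit Exclusive OR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(chunks):
    """Parity computed from scratch over every data chunk in the stripe."""
    return reduce(xor_bytes, chunks)


def rmw_parity(old_parity, old_chunk, new_chunk):
    """Parity after a small write: remove the old chunk's contribution,
    then add the new chunk's contribution."""
    return xor_bytes(xor_bytes(old_parity, old_chunk), new_chunk)


# Hypothetical stripe that already contains data.
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = full_parity(stripe)

# Small write to chunk 1: two reads (old chunk, old parity) precede
# two writes (new chunk, new parity).
new_chunk = b"\xff\x00"
new_parity = rmw_parity(parity, stripe[1], new_chunk)
stripe[1] = new_chunk
```

The updated parity matches what a full recomputation over the modified stripe would produce, but it is obtained without reading the other data chunks. When the parity of the stripe being extended is already cached, the parity read can be avoided, which is the saving referred to above.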
Although RAID 2-6 parity checking generally provides more efficiently used storage space than RAID 1 mirroring, the problem of disk fragmentation remains common in all. Moreover, fragmentation is also inevitable in memory hierarchy systems or systems that move data between multiple storage types. For example, fragmentation commonly occurs when data is deleted or migrated to another redundancy type in a memory hierarchy system, or to another storage medium, such as tape.
A memory hierarchy system employs the simultaneous use of multiple data redundancy schemes and/or storage types to optimize and improve data storage system cost, performance, and reliability. A memory hierarchy system may include a RAID management system operatively coupled to a disk array controller for mapping two virtual storage spaces into the physical storage space of the storage disks. A RAID-level virtual storage space presents the physical storage space as mirror and parity RAID areas, for example, that store data according to RAID Level 1 (mirror redundancy) and RAID Level 5 (parity redundancy). An application-level virtual storage space presents the RAID-level virtual storage space as multiple virtual blocks. The memory hierarchy system moves virtual blocks between the mirror and parity RAID areas so that data undergoes a change in redundancy between RAID Level 1 and RAID Level 5, or vice versa. The process of moving data between the mirror and parity RAID areas, or between storage types, is referred to as "migration."
A memory hierarchy system "tunes" the storage resources of the data storage system according to a function of two parameters: size of the physical storage capacity and size of the present amount of user data being stored in the data storage system. Initially, all data is stored in mirror RAID areas because this affords the highest performance and reliability. As more data is added to the storage system, the data is migrated between mirror RAID areas and parity RAID areas to optimize performance and reliability. As the data storage system approaches full capacity, more and more data is migrated to parity RAID areas in an effort to meet all demands by the user while still providing reliability through redundancy. Accordingly, maximum flexibility and adaptation are provided, and the user is not required to select a specific storage regime; instead, the system dynamically adapts to any demand placed on it by the user.
The importance of memory hierarchy systems relative to this disclosure is that fragmentation inevitably occurs when blocks are migrated from one data redundancy type to another (for example, between parity and mirrored) or from one physical storage type to another (for example, between disk and tape, or between media with different performance characteristics). For example, holes are created in parity storage if a block is moved (migrated) to mirrored storage. Similarly, holes may be created in mirrored storage when blocks are migrated to parity storage. In order to free up at least one stripe that can be allocated as either mirrored or parity storage for the purposes of migration, it is particularly important that fragmentation issues be managed on a continuous basis. Other candidates for the improved hole-plugging garbage collection method of the present invention include any storage hierarchy, such as memory and disks; a log-structured file system; large, slow storage versus small, fast storage; disk and tape; or any combinations of these.
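The way migration creates holes without freeing whole stripes can be illustrated with a minimal sketch. The stripe layout, block labels, and function names are hypothetical, and the sketch models only the creation of holes; it does not represent any particular defragmentation method.

```python
def migrate_out(stripes, block_ids):
    """Remove the named blocks from parity stripes, leaving holes (None)."""
    for stripe in stripes:
        for i, block in enumerate(stripe):
            if block in block_ids:
                stripe[i] = None


def count_holes(stripes):
    """Total number of free block slots scattered across all stripes."""
    return sum(slot is None for stripe in stripes for slot in stripe)


def free_stripes(stripes):
    """Number of stripes that are entirely free and so can be reallocated."""
    return sum(all(slot is None for slot in stripe) for stripe in stripes)


# Hypothetical parity area: three stripes of four data blocks each.
parity_area = [
    ["a", "b", "c", "d"],
    ["e", "f", "g", "h"],
    ["i", "j", "k", "l"],
]

# Five blocks are migrated to mirrored storage.
migrate_out(parity_area, {"b", "e", "g", "k", "l"})

holes = count_holes(parity_area)
whole_stripes_free = free_stripes(parity_area)
```

After the migration there are five holes, more than one stripe's worth of blocks, yet no stripe is entirely free, so no stripe can be reallocated as mirrored or parity storage until the remaining blocks are consolidated.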
Given the ever increasing use of data and the inevitable fragmentation of data storage space (particularly in memory hierarchy systems) and given the cost, performance, and reliability benefits provided by memory hierarchy systems, there is a striking need for improved defragmentation capabilities. Accordingly, objects of the present invention are to provide an improved storage management system and method for defragmenting data storage.