1. Field of the Invention
This invention relates to selecting a repository and more particularly relates to selecting a space efficient repository.
2. Description of the Related Art
Data storage devices store increasing amounts of critical data for organizations and individuals. The data storage devices may be hard disk drives, optical storage devices, holographic storage devices, semiconductor storage devices, and micromechanical storage devices. In one embodiment, data is written to a controller. The controller may destage the data to a storage device. As used herein, destage refers to encoding the data on the storage device. The controller may also stage the data from the storage device and communicate the data to a host. As used herein, stage refers to retrieving data from a storage device.
Because of the value of the data, the data may be redundantly stored on a plurality of storage devices so that if a storage device is lost, the data may still be recovered. For example, a data storage system may employ a redundant array of independent disks (RAID) to store data on a plurality of hard disks. The data may be divided into a plurality of portions of varying granularity such as coarse grained and fine grained. Each portion may be written as a strip to a different hard disk. As used herein, a strip refers to a portion of data written to one hard disk. A strip typically comprises a fixed number of fine grained structures such as tracks, data blocks, or the like. In addition, parity data may be calculated for the data and the parity data written to a hard disk. A group of strips that share parity data comprises a coarse grained structure such as a stride.
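By way of illustration only, the relationship between strips, parity data, and a stride may be sketched as follows. The sketch assumes byte-wise XOR parity, as is common in RAID 5 style arrays; the description above does not specify a parity scheme. The function names split_into_strips, xor_parity, and recover_strip are hypothetical and are introduced solely for this example.

```python
def split_into_strips(data: bytes, strip_size: int) -> list[bytes]:
    """Divide data into fixed-size strips, zero-padding the final strip."""
    if strip_size <= 0:
        raise ValueError("strip_size must be positive")
    strips = []
    for offset in range(0, len(data), strip_size):
        chunk = data[offset:offset + strip_size]
        strips.append(chunk.ljust(strip_size, b"\x00"))
    return strips


def xor_parity(strips: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized strips (assumed parity scheme)."""
    if not strips:
        raise ValueError("at least one strip is required")
    length = len(strips[0])
    if any(len(strip) != length for strip in strips):
        raise ValueError("all strips must have the same length")
    parity = bytearray(length)
    for strip in strips:
        for i, byte in enumerate(strip):
            parity[i] ^= byte
    return bytes(parity)


def recover_strip(surviving: list[bytes], parity: bytes) -> bytes:
    """Rebuild one lost strip from the surviving strips and the parity strip."""
    return xor_parity(surviving + [parity])


# One stride: three data strips plus the parity strip they share.
strips = split_into_strips(b"ABCDEFGHIJKL", strip_size=4)
parity = xor_parity(strips)

# Simulate losing the disk that holds the second strip, then rebuild it.
rebuilt = recover_strip([strips[0], strips[2]], parity)
```

In this sketch each strip would reside on a different hard disk, so the loss of any single disk leaves enough information to reconstruct the missing strip.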
Unfortunately, when only a few tracks of a data set are modified, the data storage system must still calculate parity data for the data set before the data blocks are written to the storage device. Data sets with a relatively small number of modified tracks to be destaged are referred to herein as random data. As a result, the data storage system may accumulate data blocks substantially equivalent to a stride in a cache before writing the data blocks sequentially to the storage devices. However, the frequent destaging of random data may consume excessive storage device space.
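The accumulation of data blocks in a cache prior to destaging may be illustrated by the following minimal sketch. The class StrideCache, its blocks_per_stride parameter, and the destage callback are hypothetical names assumed for this example; they do not describe any particular controller implementation.

```python
from typing import Callable


class StrideCache:
    """Hold modified blocks until roughly one stride's worth has accumulated."""

    def __init__(self, blocks_per_stride: int,
                 destage: Callable[[list[bytes]], None]) -> None:
        if blocks_per_stride <= 0:
            raise ValueError("blocks_per_stride must be positive")
        self.blocks_per_stride = blocks_per_stride
        self.destage = destage          # hypothetical callback that writes a batch
        self.pending: list[bytes] = []  # blocks waiting to be destaged

    def write(self, block: bytes) -> None:
        """Cache a modified block; destage once a full stride is pending."""
        self.pending.append(block)
        if len(self.pending) >= self.blocks_per_stride:
            self.flush()

    def flush(self) -> None:
        """Destage all pending blocks as one sequential batch."""
        if self.pending:
            batch, self.pending = self.pending, []
            self.destage(batch)


# Record each destaged batch in a list instead of writing to real devices.
destaged: list[list[bytes]] = []
cache = StrideCache(blocks_per_stride=4, destage=destaged.append)
for n in range(5):
    cache.write(bytes([n]))
```

Here the first four blocks are destaged together as a single batch, while the fifth block remains in the cache until further blocks arrive or flush is called explicitly.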