This disclosure relates to data processing and storage, and more specifically, to a data storage system, such as a flash memory system, that employs a hot spare storage device to store, and to service accesses to, a dataset that has low associated wear, for example, a dataset that is more frequently read than written.
NAND flash memory is an electrically programmable and erasable non-volatile memory technology that stores one or more bits of data per memory cell as a charge on the floating gate of a transistor. In a typical implementation, a NAND flash memory array is organized in blocks (also referred to as “erase blocks”) of physical memory, each of which includes multiple physical pages each in turn containing a multiplicity of memory cells. By virtue of the arrangement of the word and bit lines utilized to access memory cells, flash memory arrays can generally be programmed on a page basis, but are erased on a block basis.
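The page-program, block-erase asymmetry described above can be illustrated with a minimal sketch. The `EraseBlock` class, its method names, and the page count are hypothetical and chosen only for illustration. The sketch also assumes the usual NAND constraint that a programmed page cannot be reprogrammed until its containing block has been erased.

```python
class EraseBlock:
    """Toy model of one NAND erase block (hypothetical; sizes are illustrative)."""

    def __init__(self, pages_per_block=4):
        # None marks a page that is erased and therefore programmable.
        self.pages = [None] * pages_per_block
        self.erase_count = 0

    def program(self, page_index, data):
        # Programming is performed on a single page at a time.
        # Assumption: a page must be in the erased state to be programmed.
        if self.pages[page_index] is not None:
            raise ValueError("page already programmed; erase the block first")
        self.pages[page_index] = bytes(data)

    def erase(self):
        # Erasure applies to every page in the block at once.
        self.pages = [None] * len(self.pages)
        self.erase_count += 1
```

In this model, updating a single page in place requires erasing the whole block, which is one reason that write-heavy data tends to wear flash memory more than read-heavy data does.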
In data storage systems employing NAND flash memory and/or other storage technologies such as magnetic hard disk drives (HDDs), the availability and/or performance of the data storage system is enhanced by employing some level of data redundancy. For example, data storage systems often employ one or more arrangements (often referred to as “levels”) of redundant array of inexpensive (or independent) disks (RAID). Commonly employed RAID levels include RAID 0, which employs data striping across a set of RAID disks (RAID 0 in and of itself does not improve availability but can improve performance); RAID 1, which involves mirroring of RAID disks; RAID 4, which implements block-level striping across RAID disks and a dedicated parity drive; RAID 5, which implements block-level striping across RAID disks and distributed storage of parity information; and RAID 6, which implements block-level striping across RAID disks and distributed storage of two independent sets of parity information. Various RAID levels can also be used in combination to form hybrid RAID arrays; for example, RAID 10, which combines RAID 1 and RAID 0, implements a striped set of mirrored drives. The data redundancy provided by the various standard or hybrid RAID levels allows the data storage system to recover from various modes of failure, thus generally improving data availability and storage system reliability.
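As an illustration of how parity information permits recovery from a drive failure, the following minimal sketch computes a single XOR parity block over one stripe, as is conventional for RAID 4 and RAID 5, and rebuilds one lost data block from the survivors. The function names `xor_blocks` and `reconstruct` are hypothetical. The second, independent parity of RAID 6 relies on different arithmetic and is not shown.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length byte strings."""
    # zip(*blocks) groups the i-th byte of every block into one tuple.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing data block of a stripe.

    XOR-ing the parity with all surviving data blocks cancels their
    contribution and leaves only the missing block.
    """
    return xor_blocks(list(surviving_blocks) + [parity])
```

For example, with a stripe of three data blocks, `parity = xor_blocks(stripe)` is stored on the parity drive, and if the drive holding `stripe[1]` fails, `reconstruct([stripe[0], stripe[2]], parity)` returns the lost block. A hot spare drive is the natural destination for blocks rebuilt in this way.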
In addition to the data redundancy provided by the various levels of RAID, physical device redundancy can also be provided through the provision of one or more spare storage drives. In many cases, the spare storage drives can be so-called “hot” spare drives in that the storage drives are powered on, formatted (if applicable), and ready to be used to rebuild the data storage array in case of the failure of one or more of the storage drives comprising the data storage array. In many cases, hot spare drives do no useful work until a drive failure occurs. After the replacement of the defective drive, the hot spare drive will then usually be employed as the spare and again do no work until and unless another drive fails. Thus, depending on the failure domain(s) to which a hot spare drive is applied, the hot spare drive may never be used, or at most, may be actively used for only a few hours out of the entire life of the data storage array.
In some prior art literature, it has been proposed to make use of the storage capacity of hot spare drives. For example, the technical disclosure “A method to expand SSD cache using hot spare disks.” IP.com No. 000233970 (Jan. 6, 2014) discloses:
        So, use [of] the SSD hot spare disks to expand the SSD cache memory will enlarge the cache memory, thus improve the IO performance, and increase the resource utilization rate. After configuring the SSD hot spare disks as SSD cache memory, the cache memory will be composed by dedicated cache and dynamic cache, dedicated cache is serves [sic] as cache all the time, and dynamic cache is the SSD hot spare disk.
The technical disclosure “Using Hot Spare Disk Drives for Performance Enhancement in a RAID System.” IP.com No. 000019208 (Sep. 4, 2003) further discloses the following regarding use of a hot spare HDD in an HDD RAID system:
        As part of the RAID system parameter definitions, performance targets are defined for each volume in RAID 100. Some systems may have peak usage times for particular volumes at predictable times, or the system administrator may determine high usage volumes based on performance tracking. In either case, the solution is to treat hot spare 150 as an active spare. RAID controller writes a temporary duplicate of the frequently requested data on hot spare 150, represented in this figure by copied volume 180, an exact duplicate of volume 170. This duplication allows for a greater number of simultaneous data requests to be processed more efficiently by using per-command cost function calculations.
However, the present application recognizes that previously known uses of hot spare drives are limited in their ability to optimize performance of the data storage system.