Technical Field
The present disclosure relates to storage systems and, more specifically, to predictive replacement of storage devices of a storage system.
Background Information
A storage system typically includes a storage array having one or more storage devices into which information may be entered, and from which the information may be obtained, as desired. The storage devices may include hard disk drives (HDDs) embodied as magnetic disk devices having mechanically wearing components (e.g., spindles and moving magnetic heads) and solid state drives (SSDs) embodied as flash storage devices having electronically wearing components. For example, some types of SSDs, especially those with NAND flash components, may be configured with erasable pages or segments, each of which may have a limited endurance, i.e., a limited number of erase cycles, before being unable to store data reliably. Wear-leveling may be employed to address this limitation by arranging the information so that erasure and rewrite operations are distributed evenly across components of the devices. In addition, usage patterns pertaining to input/output (I/O) workloads serviced by the storage system may target substantially all of the SSDs of the array so as to further distribute the information evenly among the drives.
However, such even distribution of operations and workloads may cause wear-out of the SSDs to occur at approximately the same time, thereby leading to a potentially catastrophic failure scenario, i.e., deterioration of redundancy and loss of data. Such simultaneous wear-out is unusual for HDDs, which typically fail only one or two disks at a time. In addition, HDD-based errors (typically arising from mechanical failures) may be reported differently and have different error pattern characteristics than SSD-based errors. As such, a predictive failure technique is needed to detect and replace storage media likely to fail as a group at approximately the same time.
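By way of illustration only, the effect of even write distribution on wear-out timing may be sketched with a simplified model. The sketch below is an assumption-laden toy and not part of the disclosed technique: each drive is reduced to a single erase counter with a fixed endurance limit, and the names simulate_wear, round_robin, and skewed are hypothetical. It shows that an even (round-robin) distribution causes all drives to reach the limit within a few writes of one another, whereas a skewed distribution staggers the wear-out points.

```python
def simulate_wear(num_drives, endurance, writes, pick_drive):
    """Return, per drive, the write number at which it hit its erase limit.

    Simplifying assumption: one erase counter per drive, one erase per write.
    A value of None means the drive never reached the limit.
    """
    erase_counts = [0] * num_drives
    worn_out_at = [None] * num_drives
    for step in range(1, writes + 1):
        drive = pick_drive(step, num_drives)
        if worn_out_at[drive] is not None:
            continue  # a worn-out drive accepts no further writes
        erase_counts[drive] += 1
        if erase_counts[drive] >= endurance:
            worn_out_at[drive] = step
    return worn_out_at


def round_robin(step, num_drives):
    """Even distribution: every drive receives the same share of writes."""
    return step % num_drives


def skewed(step, num_drives):
    """Uneven distribution: drive 0 takes every second write (hypothetical)."""
    if step % 2 == 0:
        return 0
    return 1 + (step // 2) % (num_drives - 1)


even = simulate_wear(4, 1000, 4000, round_robin)
uneven = simulate_wear(4, 1000, 6000, skewed)
print("even:  ", even, "spread:", max(even) - min(even))
print("skewed:", uneven, "spread:", max(uneven) - min(uneven))
```

In this toy model the evenly loaded drives wear out within three writes of each other, leaving essentially no window in which to replace one drive before the next fails, which is the group-failure scenario described above.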