Information drives business. Companies today rely to an unprecedented extent on online, frequently accessed, constantly changing data to run their businesses. Unplanned events that inhibit the availability of this data can seriously damage business operations. Additionally, any permanent data loss, from natural disaster or any other source, will likely have serious negative consequences for the continued viability of a business. Therefore, when disaster strikes, companies must be prepared to eliminate or minimize data loss, and to recover quickly with usable data.
A multivolume file system (e.g., the VERITAS file system VxFS) can distribute a single file system name space across multiple VxVM virtual volumes. Using the Dynamic Storage Tiering (DST) feature of the file system, subsets of these volumes can be organized into administrator-defined storage tiers. Administrators can define policies that cause the file system to place classes of files on specific storage tiers when they are created, and to relocate them between tiers when their states change in certain ways. For example, files can be relocated when they have been inactive for a specified period, or when I/O activity against them has exceeded or dropped below a threshold. DST determines when to relocate files by periodically scanning a file system's entire directory structure or inode list and evaluating each file against the relocation policy rules in effect at the time of the scan. This works well with disk-based storage tiers, where the differences in performance and cost between tiers are relatively narrow (2-4×), and the scan frequency is relatively low (daily or less frequently). But as the number of files in a file system grows into the millions, the I/O and processing overhead of scanning begins to have a noticeable effect on operations, and scanning is best done in off-peak periods.
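The scan-based approach described above can be illustrated with a minimal sketch. It is not the DST implementation. It assumes a single hypothetical rule (`RelocationRule`) that selects files idle for longer than a threshold, and it uses ordinary file timestamps as a stand-in for the activity statistics a real file system would keep. The names `RelocationRule` and `scan_for_relocation` are illustrative only.

```python
import os
import time
from dataclasses import dataclass


@dataclass
class RelocationRule:
    """Hypothetical rule: files idle longer than max_idle_seconds go to target_tier."""
    max_idle_seconds: float
    target_tier: str


def scan_for_relocation(root, rule, now=None):
    """Walk every file under root and return (path, target_tier) for each match.

    Every file is visited on every scan, so the cost grows with the total
    number of files, not with the number of files whose state changed.
    """
    now = time.time() if now is None else now
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file disappeared while the scan was running
            # Assumption: the later of access and modification time
            # approximates the last activity against the file.
            last_activity = max(st.st_atime, st.st_mtime)
            if now - last_activity > rule.max_idle_seconds:
                candidates.append((path, rule.target_tier))
    return candidates
```

Because the loop touches each file's metadata once per scan regardless of whether the file has changed, running it several times a day over millions of files repeats the full cost each time.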
Recently, the rapid rise in popularity of solid-state disks (SSDs) has changed the enterprise storage landscape. SSDs outperform rotating disks by a wide margin, but their cost per byte is roughly an order of magnitude higher. Moreover, the endurance of the current generation of SSDs is limited: after a certain number of writes, flash memory cells begin to fail. These three factors make it doubly important to place the “right” type of files (very active; read-dominated) on SSDs, and to move them off to other storage media quickly when they are no longer active. From a DST standpoint, this might mean multiple relocation scans per day. In file systems containing large numbers of files, multiple scans per day are likely to be impractical from a resource consumption standpoint. These two factors, file systems containing large numbers of files and the need to optimize SSD utilization, provide strong motivation to search for an alternative to periodic relocation based on full file system scans.