Automated tiering and caching are known solutions to performance problems in data storage systems. Storage systems typically include storage devices of different speeds, also called tiers. A fast tier (e.g., consisting of flash-based solid-state drives (SSDs)) will typically have a lower latency in accessing data than a slower tier (e.g., consisting of hard disk drives (HDDs)). Using storage software, the storage system automatically decides on which tier to place data based on data access patterns.
Caching is an acceleration mechanism that typically uses volatile memory or an SSD in front of the storage device or system. Basic forms of caching include “read-only” caching, which operates on data that resides in the slower storage device or system and is copied into the cache after a certain threshold of read activity occurs; and “write-through” caching, which writes data to both the cache and the storage device or system at the same time. In both of these caching methods, write operations are committed at the speed of the slower storage. “Write-back” caching is a method in which write requests are directed to the cache and completion is immediately confirmed to the requestor. This results in low latency and relatively high throughput; however, it carries a data availability risk, because the only copy of the written data is in the cache.
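The difference between the write-through and write-back methods described above can be illustrated with a minimal Python sketch. The class names (SlowStore, WriteThroughCache, WriteBackCache) and the dictionary-based storage are hypothetical stand-ins chosen for illustration only; they are not taken from any particular storage product.

```python
class SlowStore:
    """Hypothetical stand-in for the slower storage device or system."""

    def __init__(self):
        self.blocks = {}

    def write(self, key, value):
        self.blocks[key] = value


class WriteThroughCache:
    """Writes go to the cache and to the slow store before completion."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value
        self.store.write(key, value)  # committed at the slower storage speed


class WriteBackCache:
    """Writes complete once they are in the cache; the store is updated later."""

    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.dirty = set()

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)  # until flush, the only copy is in the cache

    def flush(self):
        for key in sorted(self.dirty):
            self.store.write(key, self.cache[key])
        self.dirty.clear()


store_a, store_b = SlowStore(), SlowStore()
wt = WriteThroughCache(store_a)
wb = WriteBackCache(store_b)
wt.write("blk0", b"data")
wb.write("blk0", b"data")
exposed_before_flush = "blk0" not in store_b.blocks  # write-back exposure window
wb.flush()
```

In this sketch, the period between the write-back write and the flush is the window during which a cache failure would lose the data.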
Tiering typically writes data in its entirety, first to one storage type, and then moves that data to different storage types based on a data access pattern.
Tiering typically moves data between tiers (instead of copying it), both from the slower storage to the faster storage and vice versa, whereas a cache, once it is done with the data it was accelerating, typically invalidates its copy instead of copying it back to the storage area.
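The move-versus-copy distinction can be sketched as follows. This is a simplified illustration, assuming each tier, the cache, and the backing storage are modeled as plain Python dictionaries; the function names are hypothetical, and the cache shown holds only clean (already stored) data.

```python
def promote(slow_tier, fast_tier, key):
    """Tiering: move the block, so exactly one tier holds it."""
    fast_tier[key] = slow_tier.pop(key)


def demote(slow_tier, fast_tier, key):
    """Tiering: move the block back to the slower tier."""
    slow_tier[key] = fast_tier.pop(key)


def cache_fill(backing, cache, key):
    """Caching: copy the block; the backing storage keeps its copy."""
    cache[key] = backing[key]


def cache_evict(cache, key):
    """Caching: drop the cached copy; nothing is copied back."""
    cache.pop(key, None)


# Tiering example: the block lives in one tier at a time.
hdd_tier, ssd_tier = {"blk7": b"payload"}, {}
promote(hdd_tier, ssd_tier, "blk7")
in_slow_after_promote = "blk7" in hdd_tier
demote(hdd_tier, ssd_tier, "blk7")

# Caching example: the backing storage always holds the block.
backing, cache = {"blk7": b"payload"}, {}
cache_fill(backing, cache, "blk7")
in_backing_while_cached = "blk7" in backing
cache_evict(cache, "blk7")
```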
Flash technology, unlike HDD technology, wears out with every write, increasing the risk of SSD failure. An enterprise-grade flash-based SSD is only guaranteed to work for a limited number of write operations (e.g., 1,000 full drive writes). This issue may be dealt with by using expensive over-provisioned SSDs and/or by replacing SSDs often. Thus, high maintenance costs and/or an increased risk of SSD failure are a concern with both tiering and caching technologies.
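As a rough illustration of why write endurance matters, the guaranteed lifetime of an SSD can be estimated from its rated number of full drive writes. The sketch below uses the 1,000 full drive writes figure mentioned above; the 1,000 GB capacity and the 2,000 GB per day write load are assumed values chosen only for the example, and the calculation ignores write amplification.

```python
def ssd_lifetime_days(capacity_gb, rated_full_drive_writes, daily_writes_gb):
    """Estimate the days until the rated write endurance is consumed."""
    total_writable_gb = capacity_gb * rated_full_drive_writes
    return total_writable_gb / daily_writes_gb


# Assumed example: 1,000 GB drive, 1,000 full drive writes, 2,000 GB written per day.
lifetime = ssd_lifetime_days(
    capacity_gb=1000,
    rated_full_drive_writes=1000,
    daily_writes_gb=2000,
)
print(lifetime)  # 500.0 days
```

Under these assumptions the rated endurance is consumed in well under two years, which is why tiering or caching layers that direct heavy write traffic to flash raise replacement costs.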
Newly emerging non-volatile or persistent memory (PM) technology may be implemented through non-volatile media attached to the central processing unit (CPU) of the computer. PM is characterized by low, RAM-like latencies, so it is 1,000 to 100,000 times faster per access than flash-based SSDs and HDDs, respectively.
PM is implemented today using backed-up dynamic random access memory (DRAM), magneto-resistive random-access memory (MRAM), or spin-transfer torque magnetic random-access memory (STT-MRAM) technologies. Other emerging technologies, such as resistive random-access memory (ReRAM) and phase-change memory (PCM), which are very dense, may enable cheaper, though slower, PM components.
File systems are usually block-based and tuned to HDD and/or SSD media, and as such, they do not store or cache user data in memory resources. Typically, a separate software layer manages memory-based software caching. One common example is the Linux virtual file system (VFS) page cache, which caches user data in a volatile manner, so that read requests that can be served from the page cache may not even reach the underlying file system.
Some file systems (for example, ZFS, a combined file system and logical volume manager designed by Sun Microsystems) support tiering technology, whereas other file systems run on top of a multi-tiering block service (for example, IBM EasyTier™).
Some file systems (for example, NetApp WAFL™) use non-volatile RAM (NVRAM), not as a tier for user data, but rather as a persistent write cache for metadata or as a temporary staging area that is cleared, for example, every 10 seconds via a checkpoint mechanism.
Emerging PM-aware file systems (e.g., EXT4-DAX) directly access the PM, avoiding the slow and cumbersome caching and/or memory map services of the VFS layer. However, none of these systems supports tiering, as they all assume that the entire data set resides in a homogeneous PM space.
Thus, no multi-tiering file system uses a non-volatile memory tier directly (e.g., via a memory pointer).
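Direct access via a memory pointer, as opposed to block reads and writes, can be illustrated with a memory-mapped file. The sketch below uses the standard Python mmap module on an ordinary temporary file, which is an assumption made so the example runs anywhere; on such a file the mapping is still served by the page cache, whereas on a file system mounted with direct access on PM the same loads and stores would reach the persistent media itself. The function name store_and_load is hypothetical.

```python
import mmap
import os
import tempfile


def store_and_load(path, offset, payload):
    """Write and read bytes through a shared memory mapping of the file."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as mapping:
            end = offset + len(payload)
            mapping[offset:end] = payload  # a memory store, not a write() call
            mapping.flush()  # ask the OS to make the store durable
            return bytes(mapping[offset:end])  # a memory load, not a read() call


# Create a one-page file to stand in for a region of persistent memory.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * mmap.PAGESIZE)
    demo_path = tmp.name

result = store_and_load(demo_path, 128, b"hello")
os.remove(demo_path)
```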