Existing types of computer storage media include the hard disk drive (HDD), invented in the 1950s; flash memory, a non-volatile computer storage medium such as a solid-state drive (SSD), invented in the 2000s; and the newly emerging persistent memory (PM). PM is capable of storing data structures such that they can continue to be accessed using memory machine instructions (e.g., load/store) or memory application programming interfaces (APIs) even after the end of the process that created or last modified them and/or after a power failure. PM can be implemented through a nonvolatile medium attached to the central processing unit (CPU) of the computer.
Unlike the earlier storage media, PM does not go through the CPU input/output (I/O) subsystem queue and is characterized by low, RAM-like latencies, so it is 1,000 to 100,000 times faster per access than flash and HDD storage, respectively.
PM is implemented today using backed-up dynamic random-access memory (DRAM), magneto-resistive random-access memory (MRAM), or spin-transfer torque magnetic random-access memory (STT-MRAM) technologies. Other emerging technologies, such as resistive random-access memory (ReRAM) and phase-change memory (PCM), which are very dense, may enable cheaper, though slower, PM components.
Given the superior performance of the emerging fast PM and the lower cost of traditional storage and emerging slower PM, it makes sense to use both technologies to create a cost-efficient data storage solution.
Traditional file systems (e.g., XFS, ZFS, BTRFS) use block devices (e.g., HDD and/or SSD) as their storage media. The Linux virtual file system (VFS) layer (and its equivalents in other operating systems (OSs)) accelerates their performance by using volatile memory as a cache in front of such file systems. Volatile memory is also used to support methods of memory-mapped file I/O, such as the Linux memory map system call (mmap()).
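As an illustration of memory-mapped file I/O, the following is a minimal sketch that uses Python's standard mmap module, which wraps the mmap() system call on Linux. The helper name patch_file_via_mmap and its parameters are hypothetical and chosen only for this example. On a traditional block-based file system, the mapped region is served from volatile memory and written back to the block device later.

```python
import mmap


def patch_file_via_mmap(path, offset, data):
    """Overwrite len(data) bytes at `offset` by writing to a memory mapping.

    Hypothetical helper for illustration only. The file must already exist,
    be non-empty, and be at least offset + len(data) bytes long.
    """
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the default access mode is read/write.
        with mmap.mmap(f.fileno(), 0) as mapping:
            # A plain slice assignment replaces what would otherwise be
            # an explicit seek() followed by write().
            mapping[offset:offset + len(data)] = data
            # Ask the OS to write the modified pages back to the file.
            mapping.flush()
```

After the call returns, an ordinary read of the file observes the patched bytes, because the mapping and the regular read path share the same cached pages.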
Emerging PM-aware file systems (e.g., EXT4-DAX) directly access the PM, avoiding the slow and cumbersome caching and/or memory map services of the VFS layer. However, they all assume that the entire data set resides in a homogeneous PM space.
The only file system that keeps hot data (the most frequently accessed data) in volatile memory and colder data (less frequently accessed data) in traditional storage devices is Linux TmpFS, which is, as its name indicates, a volatile file system designed for data that is temporarily useful but not required for long-term storage. TmpFS operates on virtual memory, and the underlying operating system always uses a least recently used (LRU) data management policy that is not specific to file systems and does not adapt to the changing requirements of the workloads using it.
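To make the LRU policy concrete, the following is a minimal sketch of a fixed-capacity LRU cache. The class name LRUCache and its get/put methods are hypothetical; this is not the Linux virtual memory implementation, only an illustration of the eviction rule. The policy has no tunable state: it always evicts the entry that was used least recently, regardless of the workload.

```python
from collections import OrderedDict


class LRUCache:
    """Illustrative fixed-capacity cache with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Ordered from least recently used (front) to most recently used (back).
        self.entries = OrderedDict()

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self.entries:
            return None
        # A hit makes the entry the most recently used one.
        self.entries.move_to_end(key)
        return self.entries[key]

    def put(self, key, value):
        """Insert or update an entry, evicting the LRU entry if over capacity."""
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            # Drop the entry at the front, i.e. the least recently used one.
            self.entries.popitem(last=False)
```

With a capacity of two, inserting keys "a" and "b", reading "a", and then inserting "c" evicts "b", because "b" is the entry that has gone unused the longest.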
Adaptive replacement cache (ARC) is an adaptive cache replacement policy developed at IBM. ARC works within a fixed total queue size, rewarding its queues according to cache hits, and, as opposed to LRU, adapts to the workload. However, ARC does not operate at the file system level and thus is unaware of direct I/O access requests or system calls such as mmap().
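The adaptive behavior of ARC can be sketched as follows. This is a simplified illustration that follows the publicly described ARC algorithm in outline, not a production implementation. The names ARCCache, access, t1, t2, b1, b2 and p are chosen for this example, and the adaptation step uses integer division as a simplifying assumption. The cache holds at most a fixed number of entries split between a recency list (t1) and a frequency list (t2), and hits on recently evicted "ghost" keys (b1, b2) shift the target split p.

```python
from collections import OrderedDict


class ARCCache:
    """Simplified ARC sketch: fixed capacity, adaptive recency/frequency split."""

    def __init__(self, capacity):
        self.c = capacity
        self.p = 0                 # adaptive target size of t1
        self.t1 = OrderedDict()    # cached entries seen once recently
        self.t2 = OrderedDict()    # cached entries seen at least twice
        self.b1 = OrderedDict()    # ghost keys recently evicted from t1
        self.b2 = OrderedDict()    # ghost keys recently evicted from t2

    def _replace(self, key):
        """Evict one cached entry into the matching ghost list."""
        if self.t1 and (len(self.t1) > self.p
                        or (key in self.b2 and len(self.t1) == self.p)):
            old, _ = self.t1.popitem(last=False)
            self.b1[old] = None
        else:
            old, _ = self.t2.popitem(last=False)
            self.b2[old] = None

    def access(self, key, value=None):
        """Reference `key`; return True on a cache hit, False on a miss."""
        if key in self.t1:                      # hit: promote to frequency list
            self.t2[key] = self.t1.pop(key)
            return True
        if key in self.t2:                      # hit: refresh position
            self.t2.move_to_end(key)
            return True
        if key in self.b1:                      # ghost hit: reward recency
            self.p = min(self.c, self.p + max(len(self.b2) // len(self.b1), 1))
            self._replace(key)
            del self.b1[key]
            self.t2[key] = value
            return False
        if key in self.b2:                      # ghost hit: reward frequency
            self.p = max(0, self.p - max(len(self.b1) // len(self.b2), 1))
            self._replace(key)
            del self.b2[key]
            self.t2[key] = value
            return False
        # Complete miss: make room if needed, then insert into t1.
        l1 = len(self.t1) + len(self.b1)
        total = l1 + len(self.t2) + len(self.b2)
        if l1 == self.c:
            if len(self.t1) < self.c:
                self.b1.popitem(last=False)
                self._replace(key)
            else:
                self.t1.popitem(last=False)
        elif total >= self.c:
            if total == 2 * self.c:
                self.b2.popitem(last=False)
            self._replace(key)
        self.t1[key] = value
        return False
```

For example, with a capacity of two and the reference sequence a, b, a, c, b, the final reference to "b" hits the ghost list b1, so the target p grows from 0 to 1 and later evictions favor the frequency list over the recency list.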