Some file systems are known to support de-duplicated data. A single data unit (e.g., a data block) is retained and used to represent different portions of the same file or of different files. As one example, a zero data unit, comprising only zeros, is used repeatedly in different locations of a file, thereby reducing the required storage space.
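The zero data unit example can be illustrated with a minimal sketch. The names `UNIT_SIZE`, `ZERO_UNIT` and `build_unit_map`, as well as the 4096-byte unit size, are assumptions made for illustration only and do not describe any particular file system.

```python
UNIT_SIZE = 4096                 # assumed data unit size, for illustration only
ZERO_UNIT = bytes(UNIT_SIZE)     # a data unit comprising only zeros


def build_unit_map(content: bytes):
    """Split content into data units and share one zero unit among all zero regions.

    Returns (unit_map, stored_units). unit_map holds, for each logical unit of
    the file, an index into stored_units. Index 0 is always the shared zero unit.
    """
    stored_units = [ZERO_UNIT]
    unit_map = []
    for offset in range(0, len(content), UNIT_SIZE):
        # Pad a short trailing unit with zeros so every unit has the same size.
        unit = content[offset:offset + UNIT_SIZE].ljust(UNIT_SIZE, b"\0")
        if unit == ZERO_UNIT:
            unit_map.append(0)                       # reuse the single zero unit
        else:
            stored_units.append(unit)                # store non-zero data once per location
            unit_map.append(len(stored_units) - 1)
    return unit_map, stored_units


# A file with four logical units, three of which contain only zeros.
content = bytes(UNIT_SIZE) * 2 + b"A" * UNIT_SIZE + bytes(UNIT_SIZE)
unit_map, stored_units = build_unit_map(content)
```

In this sketch the four logical units of the file are backed by only two stored units, because every all-zero location refers to the same retained unit.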
In some cases, de-duplication is performed by construction or as a proactive step by the user. De-duplication by construction may include data snapshots and explicit user commands such as “cp --reflink”, in which the user explicitly requests that content be copied, cloned, or duplicated. The proactive approach, by contrast, may require traversing the data, scanning it for duplicated values, and either eliminating them or avoiding building duplicated copies on persistent storage.
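The proactive approach can be sketched as a scan over already stored data units. The function `deduplicate`, the block identifiers and the per-file tables are hypothetical. Using a SHA-256 digest to detect equal content is an assumption of this sketch rather than a method taken from any specific file system, and a production implementation would typically confirm a match with a byte-wise comparison.

```python
import hashlib


def deduplicate(blocks, tables):
    """Merge stored blocks that hold identical content.

    blocks: dict mapping block id -> bytes
    tables: dict mapping file name -> list of block ids
    Returns (kept_blocks, new_tables) in which every duplicate reference
    points at a single surviving block.
    """
    canonical = {}   # content digest -> id of the surviving block
    remap = {}       # original block id -> surviving block id
    for block_id in sorted(blocks):
        digest = hashlib.sha256(blocks[block_id]).digest()
        # The first block seen with a given digest survives; later ones map to it.
        remap[block_id] = canonical.setdefault(digest, block_id)
    kept_blocks = {bid: blocks[bid] for bid in set(remap.values())}
    new_tables = {name: [remap[bid] for bid in ids] for name, ids in tables.items()}
    return kept_blocks, new_tables


blocks = {1: b"aa", 2: b"bb", 3: b"aa", 4: b"bb"}
tables = {"f": [1, 2, 3], "g": [4, 3]}
kept_blocks, new_tables = deduplicate(blocks, tables)
```

After the scan, blocks 3 and 4 are eliminated and both files refer only to blocks 1 and 2.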
In traditional file systems, all accesses to de-duplicated data units are performed via the file system. The file system is responsible for performing a Copy-on-Write (CoW) operation whenever a write access is performed on a shared data unit.
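The Copy-on-Write behavior can be illustrated with a minimal sketch under stated assumptions. The class `CowStore`, its reference counters and its method names are hypothetical and serve only to show the idea that a write to a shared data unit is redirected to a private copy.

```python
class CowStore:
    """Toy store of data units with reference counting and Copy-on-Write."""

    def __init__(self):
        self.units = {}      # unit id -> bytearray holding the unit's data
        self.refcount = {}   # unit id -> number of references to the unit
        self.next_id = 0

    def allocate(self, data):
        """Store data in a new unit referenced once and return its id."""
        uid = self.next_id
        self.next_id += 1
        self.units[uid] = bytearray(data)
        self.refcount[uid] = 1
        return uid

    def share(self, uid):
        """Add another reference to an existing unit (de-duplication)."""
        self.refcount[uid] += 1
        return uid

    def write(self, uid, offset, data):
        """Write into a unit and return the id of the unit now holding the data."""
        if self.refcount[uid] > 1:
            # The unit is shared: drop this reference and write to a private copy.
            self.refcount[uid] -= 1
            uid = self.allocate(self.units[uid])
        self.units[uid][offset:offset + len(data)] = data
        return uid


store = CowStore()
original = store.allocate(b"hello")
clone = store.share(original)            # two references, one stored unit
written = store.write(clone, 0, b"J")    # triggers Copy-on-Write
```

The write through the clone lands in a newly allocated unit, so the original unit still reads `hello`, while a later write to the original, which is no longer shared, would be performed in place.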
File storage is traditionally implemented as non-volatile storage media, such as a magnetic hard-disk drive (HDD) or a Flash-based solid-state drive (SSD), and is employed as a peripheral device to one or more computing devices. Such technologies provide affordable capacity, but at latencies that are many orders of magnitude longer than the latency of volatile memory such as dynamic random-access memory (DRAM).
Newly developed storage media technologies that overcome this problem are now becoming available. One such persistent memory device, for example a Non-Volatile Dual In-line Memory Module (NVDIMM), is a computer random access memory (RAM) that retains data even when electrical power is removed due to normal system shutdown, an unexpected power loss, a system crash, or any other reason. Currently, the main types of available and announced NVDIMM cards include: NVDIMM-N, which is a byte-addressable memory-mapped device; NVDIMM-P, which is a superset of NVDIMM-N; and new storage class memory (SCM) based NVDIMMs, built from, for example, 3D XPoint, MRAM, and other byte-addressable technologies. These devices are typically accessed at memory or near-memory speeds.
Thanks to emerging persistent memory technologies, memory-based file systems are now available. Some memory-based file systems may use NVDIMMs or other persistent memory devices, while others may use non-persistent memory devices. Memory-based file systems may enable direct access (DAX), in which computer programs use memory-mapped I/O to reach the storage media directly, without requiring access to a software cache memory or without any intervention of the file system itself within the data path.
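Memory-mapped I/O of the kind described above can be sketched with the standard `mmap` module. The helper `write_via_mapping` and the temporary file are assumptions made for illustration. On an ordinary file system the mapping in this sketch is still backed by the operating system's page cache, and only on a file system mounted with DAX support would the same loads and stores reach the persistent media directly.

```python
import mmap
import os
import tempfile


def write_via_mapping(path, offset, payload):
    """Store payload into a file through a memory mapping instead of write()."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must already be non-empty.
        with mmap.mmap(f.fileno(), 0) as mapping:
            # A plain memory store; no read()/write() system call per access.
            mapping[offset:offset + len(payload)] = payload
            # Ask the operating system to make the mapped changes durable.
            mapping.flush()


# Create a 4096-byte file of zeros, update it through the mapping, read it back.
fd, path = tempfile.mkstemp()
os.write(fd, bytes(4096))
os.close(fd)
write_via_mapping(path, 16, b"persist")
with open(path, "rb") as f:
    data = f.read()
os.remove(path)
```

Because the program modifies the mapped bytes in place, the file system is not invoked for each individual store within the mapped range.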