Computers frequently operate with less RAM (random-access memory) than the total memory used by all programs. One technique for achieving this is to compress memory that has not been accessed recently and to decompress that memory when it is accessed. Decompressing memory using only software is costly for a number of reasons including, for example, (i) the involvement of a kernel page fault handler, swap-related software layers, and a software compressor/decompressor; (ii) a need to store memory blocks uncompressed, thereby forcing other memory blocks to be compressed or evicted, which causes additional energy consumption and potentially triggers thrashing behavior in a system (e.g., memory is compressed and decompressed continually with very little user-visible progress); and (iii) a need to write entire uncompressed memory blocks back to RAM, thus increasing memory bus contention and the energy consumption of RAM.
Existing software memory compression schemes suffer from the problems (i)-(iii) described above. While some hardware-based blocks performing compression and decompression exist, such compressor-decompressor blocks are not capable of transparently handling cache line misses, and therefore still suffer from problems (ii) and (iii) while also adding hardware overhead for all memory accesses (similar to problem (i)).
Various approaches to software and hardware memory deduplication, serving the same or similar high-level goals, have been proposed over time. Deduplication saves space by detecting and sharing blocks of memory with the same content, whereas compression reduces the space needed to store each block. However, such software deduplication approaches suffer from problems (i)-(iii) described above, while existing hardware deduplication is generally performed at small granularity (e.g., cache lines), causing high metadata overhead. Existing hardware and software deduplication schemes alike require a computation-intensive, space-intensive, and energy-intensive process of finding blocks with duplicate content.
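By way of illustration only, the distinction between deduplication and compression may be sketched in software as follows. The sketch is a simplified assumption and does not represent any particular existing scheme: the 4096-byte block size, the function names `deduplicate` and `compress`, and the use of SHA-256 digests and zlib are all chosen here for illustration.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def deduplicate(blocks):
    """Keep one shared copy of each distinct block (hypothetical helper)."""
    store = {}  # content digest -> single stored copy of the block
    refs = []   # one reference (digest) per original block
    for block in blocks:
        # Hashing every block is the costly duplicate-finding step.
        # A real scheme would also compare bytes to rule out collisions.
        digest = hashlib.sha256(block).digest()
        store.setdefault(digest, block)
        refs.append(digest)
    return store, refs


def compress(blocks):
    """Shrink each block independently; nothing is shared between blocks."""
    return [zlib.compress(block) for block in blocks]


zero_block = bytes(BLOCK_SIZE)
text_block = (b"memory compression " * 256)[:BLOCK_SIZE]
blocks = [zero_block, text_block, zero_block, zero_block]

store, refs = deduplicate(blocks)
compressed = compress(blocks)

# Bytes needed to hold the block contents under each approach
# (reference and metadata overhead is ignored in this sketch).
dedup_bytes = sum(len(b) for b in store.values())
compressed_bytes = sum(len(c) for c in compressed)
print(len(blocks) * BLOCK_SIZE, dedup_bytes, compressed_bytes)
```

In this sketch the three identical zero-filled blocks collapse into one stored copy under deduplication, while compression stores four separate, individually smaller blocks. Both approaches must reconstruct a full uncompressed block before its contents can be used.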