1. Field of the Invention
This invention relates generally to the field of computer processors. More particularly, the invention relates to an apparatus and method for implementing a cache line write back (CLWB) operation.
2. Description of the Related Art
Traditional computing architectures access code and data in two primary stores, volatile memory and persistent mass storage. Volatile memories such as static random access memory (SRAM) or DRAM are typically orders of magnitude faster (in terms of both latency and bandwidth) than persistent mass storage devices (e.g., magnetic disk, Flash). Volatile memory is directly attached to the CPU through a memory bus and hence directly accessible by CPU load/store instructions. Volatile memory has a higher cost/bit and limited capacity compared to mass storage.
Persistent mass storage, on the other hand, has significantly higher access latency and lower bandwidth compared to volatile memory. Mass storage is connected to the platform through an I/O controller (SCSI, SATA, PCI-Express, etc.) and can only be accessed through file system APIs, which incur OS system calls. Persistent mass storage has much lower cost/bit and higher capacity compared to volatile memory.
Emerging “persistent memory” technologies blend the performance characteristics of volatile memory with the cost, persistence, and capacity characteristics of mass storage. In particular, like mass storage, persistent memory is non-volatile. Persistent memory offers higher capacities than dynamic random access memory (DRAM), with performance of a similar order of magnitude. Moreover, persistent memories are byte-addressable (as opposed to the page/block addressability of Flash memory), allowing them to be attached to the processor memory bus. As a result, using persistent memory, memory-intensive software can initialize faster and save state information more quickly. Examples of emerging persistent memory technologies include Phase Change Memory (PCM), Phase Change Memory and Switch (PCMS), Memristor, and STT-RAM.
With the emergence of persistent memory, new systems are being proposed where volatile and persistent memory are both part of the CPU addressable physical address space. In this scheme, volatile regions are managed by the operating system's virtual memory manager, and the persistent regions are managed separately from volatile memory through the operating system storage stack (i.e., the block driver and/or file system).
With such a persistent-memory architecture, system software and applications can access the non-volatile storage using regular load/store instructions, without incurring the overheads of traditional storage stacks (file systems, block storage, I/O stack, etc.). However, stores to persistent memory impose new challenges for software to enforce and reason about the “persistence” of stores. Specifically, there are a number of intermediate volatile buffers between the processor core and persistent memory (e.g., write-back buffers, caches, fill-buffers, uncore/interconnect queues, memory controller write pending buffers, etc.), and a store operation is not persistent until the data has reached some power-fail safe point at the persistent memory controller.
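The persistence problem described above can be illustrated with a toy software model. The following Python sketch is illustrative only and is not part of any described embodiment; the class ToyPersistentMemory, its method names, and the 64-byte line size are hypothetical choices made for this example. It collapses all intermediate volatile buffers into a single dictionary and treats an explicit write-back as immediately durable, which real hardware does not guarantee without additional ordering.

```python
CACHE_LINE = 64  # assumed line size in bytes, chosen for illustration


class ToyPersistentMemory:
    """Toy model: one volatile buffer in front of a durable store."""

    def __init__(self):
        self.volatile = {}  # line base address -> {address: value}, lost on power failure
        self.durable = {}   # address -> value, survives power failure

    @staticmethod
    def _line(addr):
        # Base address of the cache line that contains addr.
        return addr - addr % CACHE_LINE

    def store(self, addr, value):
        # A store lands in the volatile buffer only; it is not yet persistent.
        self.volatile.setdefault(self._line(addr), {})[addr] = value

    def load(self, addr):
        # Loads see pending stores first, then the durable contents.
        pending = self.volatile.get(self._line(addr), {})
        if addr in pending:
            return pending[addr]
        return self.durable.get(addr, 0)

    def write_back(self, addr):
        # Push the line containing addr to the durable store.
        self.durable.update(self.volatile.pop(self._line(addr), {}))

    def power_fail(self):
        # Everything still held in volatile buffers is lost.
        self.volatile.clear()


pm = ToyPersistentMemory()
pm.store(0x1000, 0xAB)
pm.power_fail()
lost = pm.load(0x1000)       # the store never reached the durable point

pm.store(0x1000, 0xAB)
pm.write_back(0x1000)
pm.power_fail()
survived = pm.load(0x1000)   # the line was written back before the failure
```

In this model, a store followed directly by a power failure is lost, while a store followed by a write-back of its line survives. This is the distinction that software must be able to enforce and reason about.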