Emerging non-volatile random access memory (NVRAM) technologies, such as resistive random access memory (ReRAM), phase-change memory (PCM), and spin-transfer torque magnetic random access memory (STT-RAM), among others, have wide applicability in computing systems. Such technologies are replacing existing technologies such as dynamic random access memory (DRAM) and solid state drives (SSDs), and can enable additional capabilities, such as persistent tiers in the memory hierarchy. Unlike DRAM, which stores bit information in the form of electric charge, non-volatile random access memory stores bit information by altering a property (e.g., resistance, physical state, magnetic orientation) of a suitable material in each memory cell.
Non-volatile main memories may have terabytes of capacity on a single device. Memory access latencies for non-volatile random access memories may be higher than those of volatile memory technologies, especially for write requests. A high-latency write request, for example, can block access to the memory bank that contains the corresponding memory cells, which increases the service time of read requests to that same bank and negatively impacts performance.
For example, if a memory controller controls eight banks of memory, then eight write requests received from one or more memory access engines may be processed concurrently. However, non-volatile memory such as NVRAM may have long write latencies, meaning that it can take longer to effect a write to a memory bank than a read from the same bank. Memory latencies may be more important for real-time operations, such as video playback, performed by memory access engines such as graphics processing units (GPUs), central processing units (CPUs), or other processors in the device. Examples include devices that provide real-time video conferencing, live video playback, or any other latency-sensitive data where a user can perceive interruptions if the data is not provided to a display or audio output in a timely manner.
Read latencies have been reduced by techniques such as write pausing, which allows reads to other rows in a bank while that bank is undergoing a high-latency write. The write is paused until the read operation is completed. However, such operations can unnecessarily slow down the memory access operation. For example, write requests to a memory bank can be paused by a memory controller while reads to different memory locations in the same bank occur.
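The trade-off between a blocking write and write pausing can be illustrated with a minimal timing sketch for a single bank. This is not a description of any particular memory controller: the function name `service_read`, its parameters, and the latency values are hypothetical and chosen only for illustration, and the model assumes the write starts at time 0 and that a paused write resumes with no extra overhead.

```python
def service_read(write_latency, read_latency, read_arrival, pause_write):
    """Return (read_done, write_done) for one bank, in arbitrary time units.

    Assumptions (illustrative only): a single write starts at time 0, a
    single read arrives at read_arrival >= 0, and pausing or resuming a
    write costs nothing extra.
    """
    if read_arrival >= write_latency:
        # The write has already finished, so the bank is idle.
        return read_arrival + read_latency, write_latency

    if pause_write:
        # Write pausing: the read is served immediately and the write
        # resumes afterwards, finishing later by the read latency.
        return read_arrival + read_latency, write_latency + read_latency

    # Blocking write: the read waits until the write has completed.
    return write_latency + read_latency, write_latency


# Hypothetical latencies: a write ten times slower than a read.
blocking = service_read(1000, 100, 200, pause_write=False)
pausing = service_read(1000, 100, 200, pause_write=True)
print("blocking (read_done, write_done):", blocking)  # (1100, 1000)
print("pausing  (read_done, write_done):", pausing)   # (300, 1100)
```

Under these assumed numbers, pausing lets the read finish at time 300 instead of 1100, but the write finishes at 1100 instead of 1000. Each additional read served during the pause would push the write completion further out.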
As such, writes or reads to internal memory that cause high latency or memory access contention can reduce performance and significantly degrade the user experience.