1. Field of the Invention
The present invention relates generally to processors, and in particular to methods and mechanisms for processing uncacheable memory requests.
2. Description of the Related Art
Integrated circuits (ICs) often include multiple circuits or agents that have a need to communicate with each other and/or access data stored in memory. In many cases, agents may communicate through various addresses defined in a common memory map or address space. In a typical IC, the address space of the IC may be split up into multiple different regions, including a cacheable region and an uncacheable region. Requests with addresses that fall within the cacheable region are eligible to be cached within the IC, while requests with addresses that fall within the uncacheable region are not expected to be cached within the IC.
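By way of illustration only, the division of an address space into cacheable and uncacheable regions may be sketched as a lookup against a memory map. The region boundaries and the names `MEMORY_MAP` and `is_cacheable` below are hypothetical and are not taken from any particular IC.

```python
# Hypothetical memory map: (start, end_exclusive, cacheable).
# The boundaries are illustrative assumptions, not values from any real IC.
MEMORY_MAP = [
    (0x0000_0000, 0x8000_0000, True),     # assumed cacheable region
    (0x8000_0000, 0x1_0000_0000, False),  # assumed uncacheable region
]


def is_cacheable(addr):
    """Return True if addr falls within a cacheable region of the map."""
    for start, end, cacheable in MEMORY_MAP:
        if start <= addr < end:
            return cacheable
    raise ValueError(f"address {addr:#x} is not mapped")
```

In such a sketch, a request would be classified once, by its address, before any cache lookup is attempted.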
A processor of the IC may be configured to execute various types of memory operations that target both the cacheable and uncacheable regions. The processor may include a memory system with multiple levels of caches for providing low latency access to instructions and data, and memory requests that reference the cacheable regions of the address space may typically be stored at any level of cache without restrictions. However, it is often difficult for processors to maintain coherency throughout the memory system for memory requests that reference the uncacheable regions of the address space.
In multiprocessor ICs, and even in single processor ICs in which other devices access main memory but do not access a given cache, the issue of cache coherence arises. That is, a given data producer can write a copy of data in the cache, and the update to main memory's copy is delayed. In write-through caches, a write operation is dispatched to memory in response to the write to the cache line, but the write is delayed in time. In a writeback cache, writes are made in the cache and not reflected in memory until the updated cache block is replaced in the cache (and is written back to main memory in response to the replacement).
Because the updates have not been made to main memory at the time the updates are made in cache, a given data consumer can read the copy of data in main memory and obtain “stale” data (data that has not yet been updated). A cached copy in a cache other than the one to which a data producer is coupled can also have stale data. Additionally, if multiple data producers are writing the same memory locations, different data consumers could observe the writes in different orders.
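The stale-data scenario described above may be sketched with a minimal model of a writeback cache. The class `WritebackCache` and its methods are hypothetical and greatly simplified (a single cache, no replacement policy, one value per address); the sketch only shows that a consumer reading main memory directly observes the old value until the modified block is written back.

```python
class WritebackCache:
    """Simplified writeback cache model (illustrative assumption only)."""

    def __init__(self, memory):
        self.memory = memory  # backing store: dict of address -> value
        self.lines = {}       # address -> (value, dirty flag)

    def write(self, addr, value):
        # The write is made in the cache only; main memory is not updated.
        self.lines[addr] = (value, True)

    def read(self, addr):
        # On a miss, fill the line from main memory as a clean copy.
        if addr not in self.lines:
            self.lines[addr] = (self.memory[addr], False)
        return self.lines[addr][0]

    def evict(self, addr):
        # On replacement, a dirty line is written back to main memory.
        value, dirty = self.lines.pop(addr)
        if dirty:
            self.memory[addr] = value


memory = {0x40: 1}
cache = WritebackCache(memory)
cache.write(0x40, 2)    # the producer updates the cached copy
stale = memory[0x40]    # a consumer that bypasses the cache reads memory
cache.evict(0x40)       # replacement triggers the writeback
fresh = memory[0x40]    # memory now reflects the producer's write
```

Here `stale` holds the value that was in main memory before the write, and `fresh` holds the written value, which reaches main memory only after the eviction.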
Cache coherence solves these problems by ensuring that various copies of the same data (from the same memory location) can be maintained while avoiding “stale data”, and by establishing a “global” order of reads/writes to the memory locations by different producers/consumers. If a read follows a write in the global order, the data read reflects the write. Typically, caches will track a state of their copies according to the coherence scheme. For example, the popular Modified, Exclusive, Shared, Invalid (MESI) scheme includes a modified state (the copy is modified with respect to main memory and other copies); an exclusive state (the copy is the only copy other than main memory); a shared state (there may be one or more other copies besides the main memory copy); and an invalid state (the copy is not valid). The MOESI scheme adds an Owned state in which the cache is responsible for providing the data for a request (either by writing back to main memory before the data is provided to the requester, or by directly providing the data to the requester), but there may be other copies in other caches. Maintaining cache coherency is increasingly challenging as various types of memory requests referencing uncacheable and cacheable regions of the address space are processed by the processor(s).
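As a rough sketch of how a cache might track the state of one of its copies under a MESI-style scheme, the state changes can be written as a small transition function. The function `next_state`, the event names, and the `others_have_copy` parameter are hypothetical, and the sketch omits bus transactions, writebacks of modified data, and the Owned state of MOESI.

```python
def next_state(state, event, others_have_copy=False):
    """Return the next MESI state of one cache's copy of a block.

    state: one of "M", "E", "S", "I".
    event: "local_read", "local_write", "remote_read", or "remote_write",
           where "remote" events are accesses observed from another cache.
    others_have_copy: whether another cache holds the block when this cache
           fills it on a local read (an assumed input to this sketch).
    """
    if event == "remote_write":
        # Another agent is modifying the block, so this copy becomes invalid.
        return "I"
    if event == "remote_read":
        # Any valid copy becomes shared; an invalid copy stays invalid.
        return "S" if state in ("M", "E", "S") else "I"
    if event == "local_write":
        # The writer ends up holding the only, modified copy.
        return "M"
    if event == "local_read":
        if state == "I":
            return "S" if others_have_copy else "E"
        return state  # a read hit does not change the state
    raise ValueError(f"unknown event: {event}")
```

For example, under these assumptions a block read with no other copies present enters the exclusive state, and a subsequent local write moves it to the modified state without affecting other caches.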
It is noted that throughout this disclosure, memory requests that reference the uncacheable region of the address space may be referred to as “uncacheable memory requests”. Memory requests may also be referred to as “transactions”, “memory access operations”, or “memory operations”, which are a type of instruction operation. In various embodiments, memory operations may be implicitly specified by an instruction having a memory operation, or may be derived from explicit load/store instructions. Furthermore, a “load memory operation” or “load operation” may refer to a transfer of data from memory or cache to a processor, and a “store memory operation” or “store operation” may refer to a transfer of data from a processor to memory or cache. “Load operations” and “store operations” may be more succinctly referred to herein as “loads” and “stores”, respectively.
Furthermore, a load may be referred to as a “cacheable load” if the load addresses a cacheable region of the address space or an “uncacheable load” if the load addresses an uncacheable region of the address space. Similarly, a store may be referred to as a “cacheable store” if the store addresses a cacheable region of the address space or an “uncacheable store” if the store addresses an uncacheable region of the address space.
It is also noted that the terms “uncacheable”, “non-cacheable”, and “uncached” may be used interchangeably throughout this disclosure. Similarly, the terms “cacheable” and “cached” may be used interchangeably throughout this disclosure.