The invention relates to the field of microprocessor architectures. Microprocessor designers continually strive to improve microprocessor performance, designing architectures that provide, for example, increased computational ability, higher operating speeds, reduced power consumption, and/or reduced cost. With many previous microprocessor architectures, it has become increasingly difficult to improve performance by further increasing the operating frequency. As a result, many newer microprocessor architectures have instead focused on parallel processing to improve performance.
One parallel processing technique employed in microprocessor architectures is the use of multiple processing cores. This technique employs multiple independent processors, referred to as cores, operating in parallel to execute software applications. Two or more processing cores may be implemented within the same integrated circuit die, within multiple integrated circuit dies integrated within the same integrated circuit package, or in a combination of these implementations. Typically, multiple processing cores share a common interface and may share other peripheral resources.
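The parallel execution described above can be illustrated with a minimal sketch, assuming a workload that splits cleanly in two; the function name `parallel_sum` and the two-way split are inventions for this example, with each thread standing in for work scheduled on a separate processing core:

```python
# Hypothetical sketch: two worker threads, standing in for two
# processing cores, each summing half of an array in parallel.
import threading

def parallel_sum(values):
    """Split `values` in half and sum each half on its own thread."""
    mid = len(values) // 2
    results = [0, 0]

    def worker(idx, chunk):
        results[idx] = sum(chunk)   # each thread handles one slice

    threads = [
        threading.Thread(target=worker, args=(0, values[:mid])),
        threading.Thread(target=worker, args=(1, values[mid:])),
    ]
    for t in threads:
        t.start()                   # threads may run on different cores
    for t in threads:
        t.join()                    # wait for both halves to finish
    return results[0] + results[1]
```

Whether the two threads actually execute on distinct cores is decided by the operating system scheduler; the program only expresses the available parallelism.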
Microprocessors typically operate much faster than the memory interfaces to which they are connected. Additionally, many types of electronic memory have a relatively long latency period between the time a processor requests data and the time the requested data is received. To minimize the time a microprocessor spends idle while waiting for data, many microprocessors use cache memory to store temporary copies of program instructions and data. Cache memory is typically highly integrated with the microprocessor, often within the same integrated circuit die or at least within the same integrated circuit package. As a result, cache memory is very fast and has low latency. However, this tight integration limits the size of the cache memory.
Cache memory is typically partitioned into a fixed number of cache memory locations, referred to as cache lines. Each cache line is associated with a set of system memory addresses and is adapted to store a copy of program instructions and/or data from one of those associated addresses. When a processor or processor core modifies or updates data stored in a cache line, this data must eventually be copied back into system memory. Typically, the processor or processor core defers this update of system memory, referred to as a writeback operation, until it needs the cache line to store a copy of different data from system memory.
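The deferred-writeback behavior described above can be sketched as a small direct-mapped cache model. This is an illustration only, not a description of any particular processor: the class name, the dict-based "system memory," and the modulo mapping from addresses to lines are all assumptions made for this example.

```python
# Hypothetical model of a write-back cache: each line holds a tag,
# data, and a dirty bit; modified data is copied to "system memory"
# only when the line is evicted to hold a different address.
class WriteBackCache:
    def __init__(self, num_lines, memory):
        self.memory = memory                 # backing system memory (a dict)
        self.num_lines = num_lines
        self.lines = [None] * num_lines      # each entry: [tag, data, dirty]

    def _locate(self, addr):
        # Each line serves the set of addresses sharing its index.
        return addr % self.num_lines, addr // self.num_lines

    def _evict(self, index):
        line = self.lines[index]
        if line is not None and line[2]:     # dirty: perform deferred writeback
            old_addr = line[0] * self.num_lines + index
            self.memory[old_addr] = line[1]

    def write(self, addr, value):
        index, tag = self._locate(addr)
        line = self.lines[index]
        if line is None or line[0] != tag:
            self._evict(index)               # writeback before reuse
        self.lines[index] = [tag, value, True]   # dirty: memory now stale

    def read(self, addr):
        index, tag = self._locate(addr)
        line = self.lines[index]
        if line is not None and line[0] == tag:
            return line[1]                   # cache hit
        self._evict(index)                   # miss: writeback, then refill
        value = self.memory[addr]
        self.lines[index] = [tag, value, False]
        return value
```

In this model, a write leaves system memory stale until the line is needed for a different address, at which point the deferred writeback occurs, mirroring the policy described above.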
Additionally, in processors with multiple processor cores, each processor core can have a separate cache memory. As a result, the processor must ensure that copies of the same data held in different cache memories remain consistent. This property is referred to as cache coherency. Furthermore, one processor core may read from another processor core's cache memory, rather than copying the corresponding instructions and/or data from system memory. This reduces processor idle time and redundant accesses to system memory.
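The cross-core read and the coherency requirement described above can be sketched with a toy two-core model. The snoop-the-peer lookup and write-invalidate behavior shown here are simplifying assumptions for illustration (real coherency protocols such as MESI are considerably more involved), and all names are invented:

```python
# Hypothetical two-core model: on a read miss, a core first checks
# its peer core's cache before falling back to system memory, and a
# write invalidates any stale copy in the peer's cache.
class Core:
    def __init__(self, memory):
        self.cache = {}              # addr -> value (this core's cache)
        self.peer = None             # the other core, set after construction
        self.memory = memory         # shared system memory (a dict)

    def read(self, addr):
        if addr in self.cache:
            return self.cache[addr]            # local cache hit
        if self.peer and addr in self.peer.cache:
            value = self.peer.cache[addr]      # serviced from peer's cache,
        else:                                  # avoiding a system memory access
            value = self.memory[addr]
        self.cache[addr] = value
        return value

    def write(self, addr, value):
        self.cache[addr] = value
        if self.peer:
            self.peer.cache.pop(addr, None)    # invalidate peer's stale copy

memory = {0x100: 7}
core0, core1 = Core(memory), Core(memory)
core0.peer, core1.peer = core1, core0
```

After `core0` writes an address, `core1` can read the updated value directly from `core0`'s cache even though system memory still holds the old value, illustrating both the cross-core read and the coherency obligation.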
It is desirable for a processor to perform writeback operations efficiently. It is also desirable for the processor to ensure that writeback operations and reads between processor core caches do not interfere with each other. It is further desirable for processors to efficiently maintain cache coherency for multiple processor cores with separate cache memories operating independently. It is also desirable to minimize the size and complexity of the portion of the processor dedicated to cache coherency.