1. Field of the Invention
The present invention relates in general to computer architecture, and in particular to a method and system that allow a processor to flush a cache line associated with a linear memory address from all caches in the coherency domain.
2. Description of the Related Art
A cache memory device is a small, fast memory that is available to contain the most frequently accessed data (or “words”) from a larger, slower memory.
Dynamic random access memory (DRAM) provides large amounts of storage capacity at a relatively low cost. Unfortunately, access to dynamic random access memory is slow relative to the processing speed of modern microprocessors. A cost-effective solution is to provide a static random access memory (SRAM) cache memory, or a cache memory physically located on the processor. Even though the storage capacity of the cache memory may be relatively small, it provides high-speed access to the data stored therein.
The operating principle behind cache memory is as follows. The first time an instruction or data location is addressed, it must be accessed from the lower speed memory. The instruction or data is then stored in cache memory. Subsequent accesses to the same instruction or data are done via the faster cache memory, thereby minimizing access time and enhancing overall system performance. However, since the storage capacity of the cache is limited, and typically is much smaller than the storage capacity of system memory, the cache is often filled and some of its contents must be changed as new instructions or data are accessed.
The cache is managed, in various ways, so that it stores the instructions or data most likely to be needed at a given time. When the cache is accessed and contains the requested data, a cache “hit” occurs. Otherwise, if the cache does not contain the requested data, a cache “miss” occurs. Thus, the cache contents are typically managed in an attempt to maximize the cache hit-to-miss ratio.
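The hit and miss behavior described above can be illustrated with a small software model. The following is a minimal sketch only, assuming a direct-mapped cache of four one-word lines; the class name `DirectMappedCache`, the cache size, and the access pattern are hypothetical and are not taken from any particular processor.

```python
class DirectMappedCache:
    """Toy direct-mapped cache (illustrative assumption, reads only)."""

    def __init__(self, num_lines, memory):
        self.num_lines = num_lines
        self.memory = memory              # slower backing store, indexed by address
        self.tags = [None] * num_lines    # tag currently held by each line
        self.data = [None] * num_lines    # cached word for each line
        self.hits = 0
        self.misses = 0

    def read(self, address):
        index = address % self.num_lines  # which line the address maps to
        tag = address // self.num_lines   # identifies the address within that line
        if self.tags[index] == tag:
            self.hits += 1                # hit: served from the cache
        else:
            self.misses += 1              # miss: fetch from slower memory,
            self.tags[index] = tag        # replacing whatever the line held
            self.data[index] = self.memory[address]
        return self.data[index]


memory = list(range(100, 132))            # 32 words of backing memory
cache = DirectMappedCache(4, memory)
for address in [0, 1, 0, 1, 4, 0]:
    cache.read(address)
```

In this access pattern the repeated reads of addresses 0 and 1 hit, while address 4 maps to the same line as address 0 and evicts it, so the final read of address 0 misses again. This is the sense in which a limited cache must change its contents as new data are accessed.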
With current systems, flushing a specific memory address in a cache requires knowledge of the cache memory replacement algorithm.
A cache, in its entirety, may be flushed periodically, or when certain predefined conditions are met. Furthermore, individual cache lines may be flushed as part of a replacement algorithm. In systems that contain a cache, a cache line is the complete data portion that is exchanged between the cache and the main memory. In each case, dirty data is written to main memory. Dirty data is data in the cache or cache line to be flushed that has not yet been written to main memory. Dirty bits, which identify blocks of a cache line containing dirty data, are then cleared. The flushed cache or flushed cache lines can then store new blocks of data.
If a cache flush is scheduled or if predetermined conditions for a cache flush are met, the cache is flushed. That is, all dirty data in the cache is written to the main memory.
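A whole-cache flush of this kind can be sketched in software as follows. This is an illustrative model under stated assumptions, not an actual hardware implementation: the class `WriteBackCache`, its methods, and the use of a dictionary for main memory are all hypothetical, and each line holds a single value with one dirty flag.

```python
class WriteBackCache:
    """Toy write-back cache (illustrative assumption)."""

    def __init__(self, memory):
        self.memory = memory   # main memory: address -> value
        self.lines = {}        # cached lines: address -> [value, dirty]

    def read(self, address):
        if address not in self.lines:
            # Miss: fill from main memory; the line starts out clean.
            self.lines[address] = [self.memory[address], False]
        return self.lines[address][0]

    def write(self, address, value):
        # The write stays in the cache only, so the line becomes dirty.
        self.lines[address] = [value, True]

    def flush_all(self):
        """Write every dirty line to main memory, then empty the cache."""
        written = 0
        for address, (value, dirty) in self.lines.items():
            if dirty:
                self.memory[address] = value
                written += 1
        self.lines.clear()     # drops the lines and their dirty flags
        return written


memory = {0x10: 1, 0x20: 2}
cache = WriteBackCache(memory)
cache.read(0x10)                   # clean line
cache.write(0x20, 99)              # dirty line; memory[0x20] is still 2
stale_value = memory[0x20]
written = cache.flush_all()        # only the dirty line is written back
```

Only the dirty line generates a write to main memory; the clean line is simply discarded, after which the emptied cache can accept new data.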
For the Intel P6 family of microprocessors (e.g., Pentium II, Celeron), for example, there exists a set of micro-operations used to flush cache lines at specified cache levels given a cache set and way; however, there is no such micro-operation to flush a cache line given its memory address.
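The difference between addressing a line by set and way and addressing it by memory address can be made concrete with a short sketch. The cache geometry below (32-byte lines, 128 sets, 4 ways) and the helper names `split_address` and `find_way` are assumptions chosen for illustration; they do not describe the P6 caches or any other specific design.

```python
LINE_SIZE = 32    # bytes per cache line (assumed)
NUM_SETS = 128    # number of sets (assumed)
NUM_WAYS = 4      # lines per set (assumed)


def split_address(address):
    """Split an address into (tag, set index, byte offset within the line)."""
    offset = address % LINE_SIZE
    set_index = (address // LINE_SIZE) % NUM_SETS
    tag = address // (LINE_SIZE * NUM_SETS)
    return tag, set_index, offset


def find_way(tags, address):
    """Return the way holding the line for this address, or None if absent."""
    tag, set_index, _ = split_address(address)
    for way in range(NUM_WAYS):
        if tags[set_index][way] == tag:
            return way
    return None


# tags[set][way] holds the tag stored in that way, or None if the way is empty.
tags = [[None] * NUM_WAYS for _ in range(NUM_SETS)]
address = 0x12345
tag, set_index, offset = split_address(address)
tags[set_index][2] = tag           # pretend replacement placed the line in way 2
```

The set index follows directly from the address, but the way depends on where the replacement algorithm happened to place the line. A flush operation that takes only a set and a way therefore leaves the tag lookup performed by `find_way` to whoever issues the flush, whereas an address-based flush would perform that lookup itself.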
Systems that require high data access continuously flush data as it becomes dirty. The situation is particularly acute in systems that require high data flow between the processor and system memory, as is the case in high-end graphics pixel manipulation for 3-D and video applications. The problem with current systems is that high bandwidth between the cache and system memory is required to accommodate the copies from write combining memory and write back memory.
Thus, what is needed is a method and system that allow a processor to flush the cache line associated with a linear memory address from all caches in the coherency domain.
The cache line flush (CLFLUSH) micro-architectural implementation process and system allow a processor to flush a cache line associated with a linear memory address from all caches in the coherency domain. The processor receives a memory address. Once the memory address is received, it is determined whether the memory address is stored within a cache memory. If the memory address is stored within the cache, the cache line associated with that memory address is flushed from the cache.
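The sequence just described (receive an address, determine whether it is cached, flush it if so) can be modeled in a few lines. The sketch below is a software illustration under stated assumptions, not the micro-architectural implementation: the `Cache` class, the `flush_line` function, the 64-byte line size, and the representation of the coherency domain as a list of caches are all hypothetical. Writing dirty data back before discarding the line follows the general flush behavior described earlier.

```python
class Cache:
    """One cache in the coherency domain (illustrative assumption)."""

    def __init__(self, name):
        self.name = name
        self.lines = {}    # line address -> [data, dirty]


def flush_line(address, caches, memory, line_size=64):
    """Flush the line containing `address` from every cache that holds it.

    Returns the names of the caches the line was flushed from.
    """
    line_address = address - (address % line_size)   # align to the line start
    flushed_from = []
    for cache in caches:
        entry = cache.lines.pop(line_address, None)   # look up and invalidate
        if entry is None:
            continue                                  # not cached here
        data, dirty = entry
        if dirty:
            memory[line_address] = data               # write back dirty data
        flushed_from.append(cache.name)
    return flushed_from


l1, l2 = Cache("L1"), Cache("L2")
memory = {0x1000: "old"}
l1.lines[0x1000] = ["new", True]     # modified copy in the first-level cache
l2.lines[0x1000] = ["old", False]    # clean copy in the second-level cache

first = flush_line(0x1008, [l1, l2], memory)    # any address within the line
second = flush_line(0x1008, [l1, l2], memory)   # line is no longer cached
```

The caller supplies only a memory address; no set, way, or knowledge of the replacement algorithm is needed. A second flush of the same address finds nothing to do, since the line has already been removed from every cache in the modeled domain.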