1. Field of the Invention
This invention relates to computer system memory and, more particularly, to pre-fetching of data.
2. Description of the Related Art
To improve performance, many computer system processors employ some level of caching to reduce the latency with which system memory returns data requested by the processor. A typical cache memory is a high-speed memory unit interposed in the memory hierarchy of a computer system between a slower system memory and a processor. A cache typically stores recently used data to improve effective memory transfer rates and thereby improve system performance. The cache is usually implemented in semiconductor memory devices having speeds that are comparable to the speed of the processor, while the system memory utilizes a less costly, lower speed technology. For example, system memories may use some form of dynamic random access memory (DRAM), while cache memories may use some form of static random access memory (SRAM).
A cache memory typically includes a plurality of memory locations, each of which stores a block or “line” of two or more words. Each line in the cache has an associated address tag that uniquely identifies the address of the line. The address tags are typically stored within a tag array memory device. Additional bits may be stored for each line along with the address tag to identify the coherency state of the line.
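The line organization described above can be sketched in Python. This is a minimal illustration, not a description of any particular hardware: the four-word line size, the three-state coherency enumeration, and the names `CacheLine`, `State`, and `split_address` are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from enum import Enum

WORDS_PER_LINE = 4  # assumed line size; the text only requires two or more words


class State(Enum):
    """Hypothetical coherency states kept alongside the address tag."""
    INVALID = 0
    SHARED = 1
    MODIFIED = 2


@dataclass
class CacheLine:
    tag: int                       # identifies which memory line is held
    state: State = State.INVALID   # extra bits stored with the tag
    words: list = field(default_factory=lambda: [0] * WORDS_PER_LINE)


def split_address(word_address):
    """Split a word address into (tag, offset of the word within its line)."""
    return divmod(word_address, WORDS_PER_LINE)
```

Under these assumptions, word address 13 falls in the line with tag 3 at offset 1, so every word of that line shares a single tag entry.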
A processor may read from or write directly into one or more lines in the cache if the lines are present in the cache and if the coherency state allows the access. For example, when a read request originates in the processor for a new word, whether data or instruction, an address tag comparison is made to determine whether a valid copy of the requested word resides in a line of the cache memory. If the line is present, a cache “hit” has occurred and the data is used directly from the cache. If the line is not present, a cache “miss” has occurred and a line containing the requested word is retrieved from the system memory and may be stored in the cache memory. The requested line is simultaneously supplied to the processor to satisfy the request.
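The read path described above can be sketched as a small Python model. It assumes a four-word line, a word-addressed memory represented as a list, and a dictionary standing in for the tag array; the class name `ReadCache` and its counters are hypothetical, and coherency state is omitted for brevity.

```python
WORDS_PER_LINE = 4  # assumed line size


class ReadCache:
    """Toy model of tag comparison on a read (no capacity limit, no eviction)."""

    def __init__(self, memory):
        self.memory = memory   # backing system memory, one list entry per word
        self.lines = {}        # tag -> list of words held for that line
        self.hits = 0
        self.misses = 0

    def read(self, address):
        tag, offset = divmod(address, WORDS_PER_LINE)
        line = self.lines.get(tag)          # address tag comparison
        if line is None:
            # Miss: retrieve the whole line from system memory and store it.
            self.misses += 1
            base = tag * WORDS_PER_LINE
            line = self.memory[base:base + WORDS_PER_LINE]
            self.lines[tag] = line
        else:
            # Hit: the word is supplied directly from the cache.
            self.hits += 1
        return line[offset]
```

In this model, a first read of address 5 misses and brings in words 4 through 7, so a following read of address 6 is a hit.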
Similarly, when the processor generates a write request, an address tag comparison is made to determine whether the line into which data is to be written resides in the cache. If the line is present, the data may be written directly into the cache (assuming the coherency state for the line allows such modification). If the line is not present in the cache, a line corresponding to the address being written may be allocated within the cache, and the data may be written into the allocated line.
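The write path can be sketched in the same style. The sketch assumes a four-word line, a single `writable` flag standing in for the coherency state, and that an allocated line is first filled from system memory; none of these details are specified in the text, and the name `WriteAllocateCache` is hypothetical.

```python
WORDS_PER_LINE = 4  # assumed line size


class WriteAllocateCache:
    """Toy model of a write with allocation on a miss."""

    def __init__(self, memory):
        self.memory = memory   # backing system memory, one list entry per word
        self.lines = {}        # tag -> {"words": [...], "writable": bool}

    def write(self, address, value):
        tag, offset = divmod(address, WORDS_PER_LINE)
        line = self.lines.get(tag)          # address tag comparison
        if line is None:
            # Miss: allocate a line for this address (assumed filled from memory).
            base = tag * WORDS_PER_LINE
            line = {"words": list(self.memory[base:base + WORDS_PER_LINE]),
                    "writable": True}
            self.lines[tag] = line
            outcome = "allocated"
        elif not line["writable"]:
            # Line present, but its coherency state forbids modification.
            return "denied"
        else:
            outcome = "hit"
        line["words"][offset] = value       # write lands in the cache only
        return outcome
```

The model updates only the cached copy; when and how the modified line reaches system memory is left out of the sketch.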
Some processors may employ one or more levels of cache, such as L1, L2 and even L3 caches. Depending on its type, a cache may be either internal or external to the processor.
To further improve cache performance, many systems use data pre-fetching. In many cases a read request may be followed by further read requests to addresses sequential to the first address. Pre-fetching therefore typically refers to performing read cycles to a number of sequential addresses in memory, in addition to the read cycle to the first address, and storing the resulting cache lines of data within the cache. A subsequent read request to one of the pre-fetched addresses will then result in a cache hit. However, depending on the configuration of the system memory and the bandwidth limitations of the memory bus and associated hardware, some pre-fetching arrangements may not be desirable.
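Sequential pre-fetching of this kind can be sketched as follows. The sketch assumes a four-word line, pre-fetching triggered only on a miss, and a configurable depth; the names `PrefetchingCache`, `prefetch_depth`, and `memory_reads` are hypothetical and introduced only for illustration.

```python
WORDS_PER_LINE = 4  # assumed line size


class PrefetchingCache:
    """Toy model: on a miss, also fetch the next few sequential lines."""

    def __init__(self, memory, prefetch_depth=2):
        self.memory = memory               # backing memory, one entry per word
        self.prefetch_depth = prefetch_depth
        self.lines = {}                    # tag -> list of words
        self.hits = 0
        self.misses = 0
        self.memory_reads = 0              # line-sized read cycles to memory

    def _fill(self, tag):
        base = tag * WORDS_PER_LINE
        if tag in self.lines or base >= len(self.memory):
            return                         # already cached, or past end of memory
        self.lines[tag] = self.memory[base:base + WORDS_PER_LINE]
        self.memory_reads += 1

    def read(self, address):
        # The address is assumed to lie within the backing memory.
        tag, offset = divmod(address, WORDS_PER_LINE)
        if tag in self.lines:
            self.hits += 1
        else:
            self.misses += 1
            self._fill(tag)                # the demand read cycle
            for next_tag in range(tag + 1, tag + 1 + self.prefetch_depth):
                self._fill(next_tag)       # additional pre-fetch read cycles
        return self.lines[tag][offset]
```

With a depth of 2, a miss on address 0 issues three line reads, after which reads of addresses 4 and 8 hit. The `memory_reads` counter makes the cost visible: every miss generates extra traffic whether or not the pre-fetched lines are ever used, which is why such an arrangement may be undesirable when memory bandwidth is limited.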