1. Field of the Invention
The present invention relates to cache memory systems. More particularly, the present invention relates to a cache memory system that prefetches data from memory based upon pointers in the cache.
2. Art Background
Dynamic Random Access Memory (DRAM) components provide an inexpensive form of solid-state storage for computer systems. However, DRAM components are typically slower than the processors that access them. A common technique for lessening the impact of slow DRAM access time (the DRAM latency) on the processor's performance is to employ a cache memory. A cache memory is typically much smaller than the main DRAM memory, but much faster. Its speed is well matched to the processor speed. A cache is typically implemented with Static Random Access Memory (SRAM) components, which store blocks of instructions and data (referred to as lines of data) that are copies of selected main memory locations. Because a cache memory is smaller than the main memory from which it copies data, the cache is not fully addressable and stores an address tag field for each data field. This tag field identifies the main memory address corresponding to a particular data line.
FIG. 1 shows a processor with a level one (L1) cache located between the processor and main memory. When a read request, R1, cannot be satisfied by the cache, a cache miss occurs and the cache must make a read request, R2, to the main memory. Likewise, under a write allocate policy, for example, when a write request, W1, cannot be satisfied by the cache, a miss occurs and the cache must make a read request, R2, to the main memory.
A read request affects a processor's performance more directly than a write request. This is because a processor must usually stall (wait) until the read data it has requested is returned before continuing execution. When a processor makes a write request, the address and data can typically be written into temporary buffers while the processor continues execution.
Write requests can be serviced using techniques such as write through and write back. Using the write through technique, the data line in main memory is always updated with the write data, and the copy in the cache is updated only if the line is present in the cache. Using the write back technique, only the copy in the cache is updated, which requires that the line of data be present in the cache. If the line of data is not present in the cache, then it must first be read in, and then updated. Using this technique, some lines in main memory will be incorrect (out of date). Dirty bits associated with each cache line are used to track which lines in main memory hold incorrect data because they have not yet been updated.
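The two write policies described above can be sketched as follows. This is only an illustrative model, not an implementation taken from the figures: the class names, the use of a Python dict as an unbounded cache, the explicit evict() call, and the byte-granularity write are all assumptions made for the example.

```python
class WriteThroughCache:
    """Toy model: main memory is always updated; the cache only if the line is present."""

    def __init__(self, memory):
        self.memory = memory   # main memory: line address -> bytearray
        self.lines = {}        # cached copies: line address -> bytearray

    def write(self, line_addr, offset, value):
        self.memory[line_addr][offset] = value       # memory is always updated
        if line_addr in self.lines:                  # cache copy only if present
            self.lines[line_addr][offset] = value


class WriteBackCache:
    """Toy model: only the cached copy is updated; a dirty set marks stale memory lines."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}
        self.dirty = set()     # plays the role of the per-line dirty bits

    def write(self, line_addr, offset, value):
        if line_addr not in self.lines:
            # The line must first be read in, then updated.
            self.lines[line_addr] = bytearray(self.memory[line_addr])
        self.lines[line_addr][offset] = value
        self.dirty.add(line_addr)                    # main memory copy is now out of date

    def evict(self, line_addr):
        # Hypothetical helper: write a dirty line back before dropping it.
        if line_addr in self.dirty:
            self.memory[line_addr] = bytearray(self.lines[line_addr])
            self.dirty.discard(line_addr)
        self.lines.pop(line_addr, None)
```

In this sketch, a write through the WriteBackCache leaves main memory unchanged until evict() is called, which is exactly the window during which the dirty marker is needed.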
Modern processor components typically include an L1 cache. However, referring to FIG. 2, the resulting processor-L1 structure can still benefit from a second level (L2) cache which satisfies some portion of the read and write requests (R2 and W2) directed to main memory. FIG. 3 is an exemplary illustration of the structure of an L2 cache. The illustration utilizes the following variables:
L: The number of address bits needed to access a byte in a cache line
S = 2^L: The number of bytes in a cache line
A: Main memory contains 2^(A+1-L) lines (address bits A:L access a line).
C: The number of address bits needed to access a cache line in the cache
2^C: The number of cache lines in the cache
Valid[2^C-1:0]: A value of TRUE means the cache line holds a correct copy of the main memory line. A value of FALSE means it does not.
Tag[2^C-1:0][A:L+C]: The value of the Address[A:L+C] bits of the memory line if the cache line is valid.
Data[2^C-1:0][S-1:0][8:0]: The value of the cache line.
ReqAddr[A:L]: The address of a memory line requested by the processor
ReqData[S-1:0][8:0]: The memory line that is read or written.
Memory[2^(A+1-L)-1:0][S-1:0][8:0]: Main memory organized by [memory lines][bytes][bits].
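As an illustration of how these variables partition an address, the following sketch splits a byte address into the tag bits [A:L+C], the cache index bits [L+C-1:L], and the byte offset bits [L-1:0]. The particular values chosen for L, C, and A, and the function name split_address, are assumptions made for the example; the index field position is inferred from the Tag definition above.

```python
# Assumed example parameters (not taken from the source).
L = 4    # 2^4 = 16 bytes per cache line
C = 8    # 2^8 = 256 cache lines
A = 31   # address bits A:0, so A + 1 = 32 bits in total

BYTES_PER_LINE = 1 << L            # S = 2^L
CACHE_LINES = 1 << C               # 2^C
MEMORY_LINES = 1 << (A + 1 - L)    # 2^(A+1-L)


def split_address(addr):
    """Split a byte address into (tag, index, offset) fields."""
    offset = addr & (BYTES_PER_LINE - 1)                 # bits L-1:0
    index = (addr >> L) & (CACHE_LINES - 1)              # bits L+C-1:L
    tag = (addr >> (L + C)) & ((1 << (A + 1 - L - C)) - 1)  # bits A:L+C
    return tag, index, offset


tag, index, offset = split_address(0x12345678)
print(hex(tag), hex(index), hex(offset))
```

Two addresses with the same index but different tags compete for the same cache line, which is why the Tag field must be stored and compared.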
FIGS. 4a and 4b show exemplary pseudocode illustrating how the above cache and memory structures are manipulated during an R2 read request or a W2 write request. Referring to FIG. 4a, an index into the cache (IndexA) is generated by taking the unsigned integer value of a C-bit field in the ReqAddr field using the UnsignedValue( ) routine. Similarly, a pointer into memory (PointerA) is generated. The rest of the ReqAddr field is then compared to the Tag field located at IndexA of the cache. If there is a match, and the Valid field at IndexA is set to TRUE, then a cache hit has occurred and the ReqData parameter is set to the Data field located at IndexA. If there is a cache miss, then a memory read (R3) is performed to get the requested data from main memory. The ReqData parameter is set to this data, as is the Data field at IndexA of the cache. The Tag and Valid fields are also updated.
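The read routine just described can be sketched in Python as follows. This is a reconstruction from the prose, not a transcription of FIG. 4a: the class name, the tiny value of C, the use of a Python list for main memory, and the miss counter are assumptions, and req_addr stands for the line address ReqAddr[A:L] already expressed as an unsigned integer.

```python
C = 2                    # assumed: 2^2 = 4 cache lines, for illustration only
NUM_LINES = 1 << C


class DirectMappedCache:
    def __init__(self, memory):
        self.memory = memory               # main memory, indexed by line address
        self.valid = [False] * NUM_LINES   # Valid field per cache line
        self.tag = [0] * NUM_LINES         # Tag field per cache line
        self.data = [None] * NUM_LINES     # Data field per cache line
        self.misses = 0                    # hypothetical counter, for observation only

    def read(self, req_addr):
        index_a = req_addr & (NUM_LINES - 1)   # IndexA: the low C bits of the line address
        pointer_a = req_addr                   # PointerA: the line's position in memory
        req_tag = req_addr >> C                # the rest of ReqAddr
        if self.valid[index_a] and self.tag[index_a] == req_tag:
            return self.data[index_a]          # cache hit
        # Cache miss: perform the memory read (R3) and fill the cache line.
        self.misses += 1
        req_data = self.memory[pointer_a]
        self.data[index_a] = req_data
        self.tag[index_a] = req_tag
        self.valid[index_a] = True
        return req_data
```

With 4 cache lines, line addresses 5 and 9 share IndexA 1, so reading them alternately misses every time even though the other three cache lines stay empty.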
Referring to the write routine of FIG. 4b, the IndexA and PointerA unsigned integers are generated from the ReqAddr parameter as in the read routine. The ReqData parameter is written (W3) into main memory at the PointerA line address. If this line is present in the cache at IndexA, then the Data field is also set to ReqData.
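The write routine can be sketched in the same way. Again, this is a reconstruction from the prose rather than the pseudocode of FIG. 4b: the function names, the dict used to hold the cache fields, and the small value of C are assumptions, and req_addr is the line address ReqAddr[A:L] as an unsigned integer.

```python
C = 2                    # assumed: 2^2 = 4 cache lines, for illustration only
NUM_LINES = 1 << C


def make_cache():
    """Create empty Valid, Tag, and Data fields for a direct-mapped cache."""
    return {
        "valid": [False] * NUM_LINES,
        "tag": [0] * NUM_LINES,
        "data": [None] * NUM_LINES,
    }


def write(cache, memory, req_addr, req_data):
    index_a = req_addr & (NUM_LINES - 1)   # IndexA, generated as in the read routine
    pointer_a = req_addr                   # PointerA, generated as in the read routine
    req_tag = req_addr >> C
    memory[pointer_a] = req_data           # W3: main memory is always written
    if cache["valid"][index_a] and cache["tag"][index_a] == req_tag:
        cache["data"][index_a] = req_data  # keep the cached copy consistent
```

Note that a write that misses does not load the line into the cache in this sketch, so the routine behaves as a write through policy without write allocation.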