1. Field of the Invention
This invention relates in general to the field of data transfer in computer systems, and more specifically to a method for transferring burst data in a processing system.
2. Description of the Related Art
Software programs that execute on a microprocessor consist of instructions, which together direct the microprocessor to perform a function. Each instruction directs the microprocessor to perform a specific operation, which is part of the function, such as loading data from memory, storing data in a register, or adding the contents of two registers.
In a desktop computer system, a software program is typically stored on a mass storage device such as a hard disk drive. When the software program is executed, its constituent instructions are copied into a portion of random access memory (RAM). Present-day memories in computer systems consist primarily of devices utilizing dynamic RAM (DRAM) technologies.
Early microprocessors fetched instructions and accessed associated data directly from DRAM because the speed of these microprocessors was roughly equivalent to the DRAM speed. In more recent years, however, improvements in microprocessor speed have far outpaced improvements in DRAM speed. Consequently, today's typical processing system contains an additional memory structure known as a cache. The cache is used to temporarily store a subset of the instructions or data that are in DRAM. The cache is much faster than DRAM, but it is also much smaller. Accessing a memory location whose data is also present in the cache is much faster than accessing the same location in DRAM.
Cache memory is typically located between main memory (i.e., DRAM memory) and the microprocessor. In addition, some microprocessors incorporate cache memory on-chip. Wherever the cache resides, its role is to store a subset of the instructions/data that are to be processed by the microprocessor.
When a processing unit in a microprocessor requests data from memory, the cache unit determines if the requested data is present and valid within the cache. If so, then the cache unit provides the data directly to the processing unit. This is known as a cache hit. If the requested data is not present and valid within the cache, then the requested data must be fetched from main memory (i.e., DRAM) and provided to the processing unit. This is known as a cache miss.
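The hit and miss behavior described above can be illustrated with a minimal sketch. The class name `SimpleCache`, its dictionary-based storage, and the hit/miss counters are hypothetical and chosen only for illustration; an actual cache unit is a hardware structure with lines, tags, and valid bits, none of which are modeled here.

```python
class SimpleCache:
    """Toy model of a cache unit in front of main memory (illustrative only)."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # dict: address -> data, stands in for DRAM
        self.valid_entries = {}         # dict: address -> data present and valid in cache
        self.hits = 0
        self.misses = 0

    def read(self, address):
        # Cache hit: the requested data is present and valid in the cache.
        if address in self.valid_entries:
            self.hits += 1
            return self.valid_entries[address]
        # Cache miss: fetch from main memory, keep a copy, then provide the data.
        self.misses += 1
        data = self.main_memory[address]
        self.valid_entries[address] = data
        return data


dram = {0x10: 42}
cache = SimpleCache(dram)
first = cache.read(0x10)   # miss: data comes from main memory
second = cache.read(0x10)  # hit: data comes from the cache
```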
Structurally, a cache consists of a number of cache lines, each consisting of multiple bytes of data. Typical cache lines range in length from 4 to 256 bytes. Each cache line is associated with, or mapped to, a particular region in main memory. Thus, when a cache miss occurs, the entire cache line must be filled; that is, multiple locations in memory must be transferred to the cache to completely fill the cache line. Although conventional processing systems have data buses capable of transferring multiple bytes of data from memory in a single read operation, with rare exception a given processing system's data bus width is much smaller than its cache line length. Because of this, the cache line is subdivided into banks of data, each bank being equal in size to the data bus width. Thus, the data from each memory read is stored into a distinct cache bank within the cache line. The number of reads required to fill the cache line is equal to the number of cache banks within the cache line.
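The relationship between cache line length, data bus width, and number of reads can be sketched as follows. The function names and the example sizes (a 32-byte line and an 8-byte bus) are assumptions made for illustration and are not taken from any particular system.

```python
def banks_per_line(line_bytes, bus_bytes):
    """Number of banks in a cache line, which equals the reads needed to fill it."""
    if line_bytes % bus_bytes != 0:
        raise ValueError("cache line length must be a multiple of the bus width")
    return line_bytes // bus_bytes


def bank_index(address, line_bytes, bus_bytes):
    """Bank within its cache line that a byte address falls into."""
    offset_in_line = address % line_bytes
    return offset_in_line // bus_bytes


# Assumed example: a 32-byte cache line filled over an 8-byte data bus.
reads_needed = banks_per_line(32, 8)   # 4 reads, one per bank
bank = bank_index(0x1234, 32, 8)       # byte offset 20 in the line, so bank 2
```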
Multiple reads from memory take time. A technique commonly used to overcome the delays inherent in multiple reads is known as a burst read. The burst read is characterized in that, along with control signals, an initial address is provided on an address bus to memory. In response, the memory provides a burst of data items to fill the cache line. The number and sequence of the data items comprising the burst are most often predetermined system parameters. Each data item read from memory is stored in its corresponding bank within the cache line until all of the banks have been transferred from memory. Sequencing of data items within a burst transfer currently is designed to optimize attributes of the processing system such as compatibility with existing memory architectures.
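A burst read of this kind can be modeled with a short sketch. The function `burst_fill`, the dictionary used to stand in for memory, and the particular bank order passed in are all hypothetical; the sketch shows only that a single initial address, together with a predetermined sequence, determines which data item lands in which bank.

```python
def burst_fill(memory, line_base, bank_order, bus_bytes):
    """Fill one cache line from memory, one bank per data item, in bank_order.

    memory     : dict mapping an address to the data item stored there
    line_base  : address of the first byte of the cache line
    bank_order : predetermined sequence of bank numbers for the burst
    bus_bytes  : data bus width in bytes (the size of one bank)
    Returns the filled line and the order in which addresses were read.
    """
    line = [None] * len(bank_order)
    addresses_read = []
    for bank in bank_order:
        address = line_base + bank * bus_bytes
        line[bank] = memory[address]    # each data item goes to its own bank
        addresses_read.append(address)
    return line, addresses_read


# Assumed example: four 8-byte banks starting at address 0x100.
memory = {0x100 + i * 8: f"item{i}" for i in range(4)}
line, order = burst_fill(memory, 0x100, [2, 3, 0, 1], 8)
```

Whatever the transfer order, the line ends up with every item in its corresponding bank; only the time at which each bank becomes available differs.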
In older processing systems utilizing cache memory, the processing unit was not able to access data in a cache line until the entire cache line had been filled. This is not true today. Present-day processing systems provide the capability for the processing unit to access partially filled cache lines.
Because a bank within a partially filled cache line can now be accessed, the sequence of the data items transferred during a burst read becomes significant from the standpoint of anticipated data use. One skilled in the art will readily appreciate that the data most likely to be requested by an instruction, following a request that prompts a cache line fill, is located at the next sequential memory address. That memory address may be mapped to the same bank or to the next sequential bank in the cache. Yet in roughly half the cases for extant sequences, the second cache bank to be transferred during a burst is not the next sequential bank. This is a problem. Because of limitations imposed by current sequences, a processing unit that is capable of accessing a cache bank, and ready to do so, may be forced to wait until the data that it needs is read into the cache.
One skilled in the art will observe from the above that causing a processing unit to wait for data in roughly half of all cache misses will markedly increase the execution time for most programs. Any means of reducing execution time is therefore highly desirable.
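The "roughly half the cases" observation can be checked with a small sketch. The text does not name the extant sequences, so the sketch assumes, purely for illustration, an XOR-based interleaved ordering in which the i-th bank transferred is the starting bank XOR i. It further assumes that the bank count is a power of two and that "next sequential bank" wraps within the line. The function names are hypothetical.

```python
def interleaved_order(start_bank, num_banks):
    """Assumed interleaved burst order: the i-th transfer is start_bank XOR i.

    num_banks is assumed to be a power of two.
    """
    return [start_bank ^ i for i in range(num_banks)]


def second_is_next_sequential(start_bank, num_banks):
    """True if the second bank transferred is the next sequential bank."""
    second = interleaved_order(start_bank, num_banks)[1]
    return second == (start_bank + 1) % num_banks


def fraction_sequential(num_banks):
    """Fraction of starting banks whose second transfer is the next sequential bank."""
    matches = sum(second_is_next_sequential(s, num_banks) for s in range(num_banks))
    return matches / num_banks


orders = [interleaved_order(s, 4) for s in range(4)]
# orders == [[0, 1, 2, 3], [1, 0, 3, 2], [2, 3, 0, 1], [3, 2, 1, 0]]
share = fraction_sequential(4)  # 0.5 under the assumed ordering
```

Under this assumed ordering, a burst that starts on an odd-numbered bank transfers the preceding bank second, so a processing unit requesting the next sequential bank would have to wait.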
Therefore, what is needed is a method for transferring burst data in a microprocessor whose sequencing is optimized for anticipated data use. In addition, what is needed is a burst sequence that is ordered to transfer the data most likely to be required by the following instruction.