1. Field of Invention
This invention relates to microprocessor architecture and, more particularly, to the interface between a microprocessor and main memory.
2. Description of Related Art
The principal component in a modern computer is the microprocessor. It is the microprocessor, often executing hundreds of millions of instructions per second, which actually "runs" our applications. Computers are often evaluated on the basis of the speed of their microprocessor; for example, a machine with a 700 MHz processor is typically considered "better" than one with a 133 MHz processor. Yet, every microprocessor requires memory. The instructions executed by the microprocessor, as well as the data upon which it operates, are accessed from the memory. Thus, the overall performance of a computer depends on how efficiently the microprocessor utilizes memory, as well as the raw speed of its internal logic.
The microprocessor accesses the memory over a bus, which includes address and data lines. The address lines allow the microprocessor to designate a particular memory location to be read from or written to, and the data lines convey the data to or from the selected memory location. The microprocessor, typically operating at a higher speed, can sometimes be encumbered by the slower memory. For example, it may happen that the microprocessor is forced to postpone an instruction fetch because the memory is unavailable, due to a previous operation that has not completed. To deal with such situations, most microprocessors are capable of prolonging their normal instruction cycle through the insertion of "wait states." The wait states effectively slow down the microprocessor's read/write timing to accommodate the memory. However, a memory interface that depended heavily on wait states would effectively handicap the microprocessor.
Instead, various measures may be taken to improve the efficiency with which the microprocessor accesses memory. One approach involves the use of a write buffer. In general, a buffer is a data area shared by hardware devices or program processes that operate at different speeds or with different sets of priorities. The buffer allows one device or process to operate without being held up by another. A buffer is similar to a cache, but exists not so much to accelerate the speed of an activity as to support the coordination of separate activities, typically clocked at different speeds.
The central processing unit (CPU) in a microprocessor normally fetches (i.e., reads) instructions and data from memory and generates results to be written back to memory. Thus, memory access speed directly influences its speed of execution. However, read operations are typically more critical than writes in this regard. To maintain throughput, the CPU must fetch an instruction (and possibly an operand) from memory each instruction cycle. Results, on the other hand, need not be stored to memory immediately, but can be deferred until it is convenient (e.g., when the memory bus becomes available). Since a write buffer generally interfaces directly to the CPU, the CPU is able to write data into it without accessing the memory bus. Thus, the CPU can continuously fetch instructions and data from memory, and store results in the write buffer. The write buffer contents are independently dispatched to memory during times when the bus is not in use. A bus interface unit (BIU) within the microprocessor coordinates the shared use of the memory bus by the CPU and the write buffer. The BIU coordinates differences in operation between the CPU local bus and one or more buses external to the CPU (i.e., the memory bus). In these circumstances, it is often possible to advantageously increase CPU throughput by using a write buffer within the BIU between the CPU and memory.
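The deferral of writes described above can be illustrated with a minimal sketch. It is not the disclosed circuit; the function name `simulate`, the per-cycle tuple format, and the rule that one buffered record drains per idle bus cycle are all assumptions made for illustration.

```python
from collections import deque


def simulate(cycles):
    """Toy model of a write buffer sharing a memory bus with CPU fetches.

    cycles: list of (bus_busy, result) tuples, one per instruction cycle.
      bus_busy is True when the CPU needs the bus for a fetch.
      result is a value the CPU produces that cycle, or None.
    Returns (memory_writes, still_pending). All names are hypothetical.
    """
    pending = deque()      # records waiting in the write buffer
    memory_writes = []     # (cycle index, value) actually sent to memory
    for i, (bus_busy, result) in enumerate(cycles):
        if result is not None:
            # The CPU stores into the buffer without using the memory bus.
            pending.append(result)
        if not bus_busy and pending:
            # Assumption: one buffered record drains per idle bus cycle.
            memory_writes.append((i, pending.popleft()))
    return memory_writes, list(pending)


writes, left = simulate(
    [(True, "a"), (True, "b"), (False, None), (True, None), (False, None)]
)
```

In this sketch the CPU is never stalled by its own stores; the two results produced while the bus is busy reach memory later, on the cycles in which the bus is free.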
A modern microprocessor may interact with memory in a number of ways. For example, a processor equipped with an instruction pipeline often includes the capability to defer load/store operations associated with a current instruction in the pipeline until similar operations associated with previous instructions have completed. Also, many microprocessors today incorporate diagnostic logic (e.g., JTAG-based scan networks), which may require access to the memory while performing test functions.
Diagnostic circuitry based on the Joint Test Action Group (JTAG) standard is now included in many microprocessors. The JTAG standard arose in response to the increasing difficulty in testing complex, high-speed integrated circuits by means of conventional external test instruments. Clock rates for many microprocessors, for example, now approach microwave frequencies. It is difficult, if not impossible, to convey diagnostic information to an external tester at the full operating speed of such devices. In addition, pin spacing on device packages has become so dense that traditional techniques for probing external signals are no longer practical. The JTAG standard provides for the inclusion of diagnostic hardware within integrated circuits, along with the functional circuitry. On-chip diagnostic circuitry, coupled with JTAG-compliant scan registers, makes it possible to load a test vector into the integrated circuit, run a test, and then scan out internal device states.
To support these various modes of interaction, the bus interface unit of the microprocessor may be complicated by multiple special-purpose write buffers and considerable additional logic, dedicated to the specific functions. Unfortunately, this adds to the complexity and manufacturing cost of the microprocessor. Therefore, it would be desirable to have a single write buffer that will support multiple types of processor-memory transactions.
The problems outlined above are addressed by a single write buffer that combines capabilities and features implemented in separate, specialized buffers of prior art microprocessors. In addition to storing the buffered data and its address, a set of control bits is associated with each storage location, by means of which the improved write buffer hereof directs the transfer of data to memory. The control bits can be used to modify the operation of the write buffer, allowing it to support a variety of memory access modes.
In an embodiment, the buffer is coupled to a central processing unit (CPU) and a memory bus. The buffer contains storage locations, into which data records received from the CPU may be stored, and from which these data records may be transferred to the memory bus. As used herein, the term "data record" refers to a discrete plurality of bits, indicative of data to be transferred to memory from a processor, direct memory access (DMA), peripheral device, etc. Associated with each location in the buffer is a set of control bits, which determine the mode in which the data record stored at that location will be transferred to the memory bus.
The storage locations are addressable by an input pointer and an output pointer. The input pointer indicates the storage location into which the next data record received from the CPU will be stored. As each new data record is received from the CPU and stored in the buffer, the input pointer advances to the next location. Similarly, the output pointer indicates the storage location from which the next data record transferred to the memory bus will be taken. As each data record is transferred from the buffer to the memory bus, the output pointer advances to the next location.
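The pointer behavior described above can be sketched in software as follows. This is an illustrative model only: the class name `WriteBuffer`, the default depth of four locations, the wrap-around of each pointer at the end of the buffer, and the refusal of a store when the buffer is full are assumptions not stated in the text.

```python
class WriteBuffer:
    """Hypothetical model of storage locations addressed by two pointers."""

    def __init__(self, depth=4):
        self.depth = depth                # assumed number of locations
        self.slots = [None] * depth       # each slot: (address, data) or None
        self.in_ptr = 0                   # where the next record is stored
        self.out_ptr = 0                  # where the next transfer comes from
        self.count = 0                    # occupied locations

    def store(self, address, data):
        """Store a record from the CPU; return False if the buffer is full."""
        if self.count == self.depth:
            return False                  # assumption: the CPU would stall
        self.slots[self.in_ptr] = (address, data)
        self.in_ptr = (self.in_ptr + 1) % self.depth   # advance, wrapping
        self.count += 1
        return True

    def transfer(self):
        """Remove and return the next record for the memory bus, or None."""
        if self.count == 0:
            return None
        record = self.slots[self.out_ptr]
        self.slots[self.out_ptr] = None
        self.out_ptr = (self.out_ptr + 1) % self.depth  # advance, wrapping
        self.count -= 1
        return record
```

Because both pointers advance in the same direction, records leave the buffer in the order in which they arrived, which is the first-in, first-out behavior the two-pointer arrangement implies.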
In an exemplary embodiment, the control bits associated with each location in the buffer include a valid bit, a sync bit, an EJTAG bit, and store conditional and store conditional pass bits. The valid bit indicates that the respective location in the buffer contains data to be transferred to the memory bus. Thus, the contents of a given location in the write buffer will be transferred to the memory bus when the output pointer reaches that location only if the corresponding valid bit is set. The sync bit is used in conjunction with a SYNC instruction, to ensure that any memory accesses initiated prior to the SYNC instruction are completed before memory accesses associated with subsequent instructions are allowed to begin. This is accomplished by forcing the CPU to delay any pending load operations until every entry in the write buffer for which the sync bit is active has been transferred to the memory bus. The EJTAG bit signifies that the corresponding data record in the write buffer has been received from the EJTAG test module, rather than from the CPU. In this case, the bus interface unit (BIU) may activate special control input/output (I/O) signals when the record is transferred to the memory bus. The store conditional bit is used together with the store conditional pass bit to make the transfer of a record from the buffer to the memory bus contingent upon an external event or signal. This may be useful, for example, for coordinating memory access between multiple processors. If the store conditional pass bit for a given buffer location is not set, the corresponding data record is not transferred to the memory bus. Moreover, the store conditional pass bit can be set or cleared at any point after the data record has been placed in the buffer. Consequently, an external event or signal that controls the store conditional pass bit can serve as a qualifier for the transfer of the data record to the memory bus.
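The role of the control bits can be sketched as a set of predicates over a buffer entry. The field and function names below are hypothetical. The sketch also assumes that the store conditional pass bit gates the transfer only for entries whose store conditional bit is set, and it does not model what ultimately happens to a conditional store whose pass bit is never set.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical write buffer entry: a data record plus control bits."""
    address: int = 0
    data: int = 0
    valid: bool = False            # location holds data to transfer
    sync: bool = False             # entry precedes a SYNC instruction
    ejtag: bool = False            # record came from the EJTAG test module
    store_cond: bool = False       # transfer is conditional
    store_cond_pass: bool = False  # condition satisfied (may change later)


def may_transfer(e):
    """True if the entry is eligible to go to the memory bus."""
    if not e.valid:
        return False
    # Assumption: the pass bit matters only for conditional stores.
    if e.store_cond and not e.store_cond_pass:
        return False
    return True


def loads_must_wait(entries):
    """True while any valid entry with its sync bit set is still buffered."""
    return any(e.valid and e.sync for e in entries)


def uses_ejtag_signals(e):
    """True if the BIU may assert its special control I/O signals."""
    return e.valid and e.ejtag
```

Since `store_cond_pass` is an ordinary mutable field, an external event can set or clear it after the entry has been created, and `may_transfer` will reflect the change the next time it is evaluated.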
The write buffer disclosed herein also supports the use of a write-back data cache. In this context, the write buffer receives data records from the CPU as they are written to a data cache. In this mode, the buffer does not individually transfer the records to the memory bus, but waits until an entire cache line (in an embodiment, a cache line contains four words) has been received. It then transfers all four words to the memory bus in burst fashion.
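The gathering of a cache line before a burst transfer can be sketched as below. The class name `LineGatherer` and the representation of a burst as a Python list are assumptions; the four-word line size follows the embodiment described above.

```python
LINE_WORDS = 4  # words per cache line in the described embodiment


class LineGatherer:
    """Hypothetical model: collect words until a full cache line is held."""

    def __init__(self):
        self.pending = []  # words received so far for the current line

    def write(self, word):
        """Accept one word; return a full line as a burst, else None."""
        self.pending.append(word)
        if len(self.pending) < LINE_WORDS:
            return None                      # keep waiting, nothing sent
        burst, self.pending = self.pending, []
        return burst                         # all four words sent together


g = LineGatherer()
results = [g.write(w) for w in (0xA, 0xB, 0xC, 0xD, 0xE)]
```

Only the fourth write produces a transfer in this sketch; the fifth word begins a new line and is held until three more words arrive.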
Also disclosed herein is a method for storing data records received from a CPU and subsequently transferring them to a memory bus. According to the method, each record is stored in a first-in, first-out (FIFO) buffer at a location indicated by an input pointer. The input pointer indicates the storage location into which the next data record will be stored, and is incremented each time another record is received. Associated with each buffer location is a set of control bits, which determine the mode in which the record at the respective location is transferred to the memory bus. An output pointer indicates the location of the next data record to be transferred to the memory bus, and is incremented each time another record is transferred. In the disclosed method, the control bits serve the same functions as described above.
Also disclosed herein is a microprocessor, comprising a CPU, a memory bus, and a write buffer. The buffer receives data records from the CPU and subsequently transfers them to the memory bus. In an embodiment, the buffer contains several locations, into which the data records are stored and from which they are transferred to the memory bus. An input pointer indicates the location into which the next record will be stored, and is incremented after each record is received. Similarly, an output pointer indicates the location from which the next record will be transferred to the memory bus, and is incremented after each record is transferred. In the microprocessor disclosed herein, the control bits serve the same functions as described above.