1. Field of the Invention
The present invention relates generally to the field of computer systems. More particularly, the present invention relates to the field of memory access for computer systems.
2. Description of Related Art
A processor typically executes instructions at a faster clock speed than that of external memory, such as dynamic random access memory (DRAM). Accessing external memory therefore introduces delays in the execution of instructions, because the processor must fetch both the instructions to be executed and the data to be processed in executing those instructions from memory operating at the relatively slower clock speed.
A typical processor may help minimize delays due to this memory access latency by processing instructions through a pipeline that fetches instructions from memory, decodes each instruction, executes the instruction, and retires the instruction. The operation of each stage of the pipeline typically overlaps in time with that of the other stages to help hide memory access latencies in fetching instructions and data for instruction execution.
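The benefit of overlapping pipeline stages may be illustrated with a minimal sketch. The model below is an assumption made for illustration only: every stage is taken to last exactly one cycle and no stalls occur, and the function names `unpipelined_cycles` and `pipelined_cycles` are hypothetical.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction passes through every stage
    before the next instruction is allowed to start."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when the stages overlap in time: the first instruction
    takes num_stages cycles to pass through the pipeline, and each later
    instruction completes one cycle after the one before it."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


# Four stages (fetch, decode, execute, retire) and ten instructions.
print(unpipelined_cycles(10, 4))  # 40
print(pipelined_cycles(10, 4))    # 13
```

Under these idealized assumptions, the time spent fetching one instruction is hidden behind the decoding, execution, and retirement of earlier instructions. A real pipeline stalls whenever a fetch takes longer than one cycle, which this sketch does not model.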
By identifying instructions that may be executed regardless of whether one or more prior fetched instructions have been executed, a typical processor may also help minimize delays due to memory access latency by executing instructions in parallel, that is, by overlapping in time the execution of two or more instructions, and/or by executing instructions out of order. In this manner, the processor helps hide memory access latencies by continuing to execute instructions while waiting, for example, to fetch data for other instructions. Regardless of the order in which instructions are executed, the processor retires each instruction in order.
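Out of order execution with in order retirement may be sketched as follows. This is a toy model under stated assumptions, not a description of any particular processor: each instruction has a fixed latency in cycles, an instruction may begin as soon as the earlier instructions it depends on have finished, and execution resources are unlimited. The `Instr` class, the function names, and the latency values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    name: str
    latency: int                               # cycles to execute (assumed value)
    deps: list = field(default_factory=list)   # indices of earlier instructions needed


def serial_cycles(program):
    """Total cycles if each instruction waits for the previous one to finish."""
    return sum(ins.latency for ins in program)


def out_of_order_schedule(program):
    """Return (finish, retire) cycle lists.

    An instruction starts once all of its dependencies have finished.
    Retirement is in program order: an instruction retires no earlier than
    its own finish time and no earlier than the previous instruction retires.
    """
    finish, retire = [], []
    for ins in program:
        start = max((finish[d] for d in ins.deps), default=0)
        done = start + ins.latency
        finish.append(done)
        retire.append(max(done, retire[-1] if retire else 0))
    return finish, retire


program = [
    Instr("load r1 <- [mem]", 100),        # long-latency access to external memory
    Instr("add  r2 <- r1 + 1", 1, [0]),    # depends on the load
    Instr("mul  r3 <- r5 * r6", 3),        # independent of the load
    Instr("sub  r4 <- r3 - 1", 1, [2]),    # depends only on the mul
]

finish, retire = out_of_order_schedule(program)
print(serial_cycles(program))  # 105
print(finish)                  # [100, 101, 3, 4]
print(retire)                  # [100, 101, 101, 101]
```

In this sketch the multiply and subtract finish while the load is still outstanding, so their execution is hidden behind the memory access, yet neither retires before the load and the add that precede them in program order.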
The processor may also help minimize memory access latency delays by managing the out of order execution of relatively more instructions at any one time, which helps widen the window in which instructions and/or data may be fetched from memory without introducing significant delays. The processor may, for example, use a larger instruction reorder buffer to manage at any one time relatively more instructions for out of order execution, a larger memory order buffer to manage at any one time relatively more data requests from memory for out of order data fetching, and/or a larger memory request queue to allow relatively more memory requests to be issued at any one time.
A typical processor may further help minimize memory access latency delays by using one or more relatively larger internal cache memories to store frequently accessed instructions and data. As the processor may then access such instructions and data internally, the processor helps reduce accesses to external memory.
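The effect of cache size on accesses to external memory may be illustrated with a minimal sketch. The least recently used (LRU) replacement policy, the one-address-per-line organization, and the name `external_accesses` are assumptions made for illustration; the description above does not specify how the internal cache is organized.

```python
from collections import OrderedDict


def external_accesses(addresses, cache_lines):
    """Count accesses that must go to external memory (cache misses)
    for a fully associative cache with LRU replacement (assumed policy)."""
    cache = OrderedDict()
    misses = 0
    for addr in addresses:
        if addr in cache:
            cache.move_to_end(addr)        # hit: mark as most recently used
        else:
            misses += 1                    # miss: fetch from external memory
            cache[addr] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used entry
    return misses


# A loop that touches the same four addresses four times over.
trace = [0, 1, 2, 3] * 4
print(external_accesses(trace, cache_lines=2))  # 16
print(external_accesses(trace, cache_lines=4))  # 4
```

With the larger cache, only the first pass over the four addresses reaches external memory. With the smaller cache, every access in this particular trace misses, because each address is evicted before it is reused.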
Using larger buffers, queues, and/or cache memories, however, increases the cost and size of the processor.