The present invention relates to the field of computer systems; more particularly, the present invention relates to performing store instructions.
A computer system may be divided into three basic blocks: a central processing unit (CPU), memory, and input/output (I/O) units. These blocks are coupled to each other by a bus. An input device, such as a keyboard, mouse, disk drive, analog-to-digital converter, etc., is used to input instructions and data to the computer system via an I/O unit. These instructions and data can be stored in memory. The CPU retrieves the data stored in the memory and processes the data as directed by a set of instructions. The results can be stored back into memory or outputted via an I/O unit to an output device such as a printer, cathode-ray-tube (CRT) display, digital-to-analog converter, etc.
Data is stored back into memory as a result of the computer system performing a store operation. In the prior art, a store operation includes an address calculation and a data calculation. The address calculation generates the address in memory at which the data is to be stored. The data calculation produces the data that is to be stored at the address generated by the address calculation. These two calculations are performed by different hardware in the computer system and require different resources. In the prior art, the store operation is performed in response to one instruction, or one part of an instruction, wherein the data calculation is performed first and, once it is complete, the address calculation occurs as the operation goes to memory for execution.
One problem with current store operations is that they can stall the execution engine of the CPU during their execution. Many of today's CPUs are pipelined processors in which multiple instructions are executed concurrently, each at a different stage of execution. Often, load instructions follow store instructions in the execution pipeline. When executed, these load instructions cause data at an address specified in the instruction to be loaded into the CPU. If a load instruction is to load data from an address to which a preceding store instruction is going to store data, then the execution of the load instruction must be stalled, thereby stalling the execution engine, until the store instruction has finished execution. Only in this way is the subsequent load instruction guaranteed to load the most current data. If the subsequent load instruction is not to the same address as the preceding store instruction, then its execution does not have to be stalled. Therefore, the determination of whether a particular load instruction must be stalled is based on a comparison between the address of the store operation and the address of each subsequent load operation. However, in the prior art, the address calculation for a store operation does not occur until the data has been calculated and the operation is going to memory, so the comparison cannot be made until then. Thus, it is advantageous to compute the address of store operations as quickly as possible so that the determination of whether to stall a particular load instruction can be made.
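The address comparison described above can be illustrated with a minimal sketch. The names `PendingStore` and `load_must_stall` are hypothetical and do not appear in this description. The sketch also assumes that a load is conservatively stalled whenever an older store has not yet had its address calculated, which is one possible policy and not one stated in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PendingStore:
    """An older store that has not yet finished execution (hypothetical model)."""
    # None until the address calculation for the store has been performed.
    address: Optional[int]


def load_must_stall(load_address: int, older_stores: List[PendingStore]) -> bool:
    """Return True if the load has to wait for an older, unfinished store."""
    for store in older_stores:
        if store.address is None:
            # Assumption: with the store address still unknown, the comparison
            # cannot be made, so the load is stalled to be safe.
            return True
        if store.address == load_address:
            # Same address: the load must wait to see the most current data.
            return True
    # No older store targets this address; the load may proceed.
    return False
```

Under this model, the sooner a store's `address` field is filled in, the sooner a load to a different address can be released, which is the motivation stated in the paragraph above.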
The present invention provides for performing store operations so that the address calculation may be performed sooner than in the prior art. In this manner, any required stalling of subsequent load operations may be identified sooner than in the prior art.
A method and apparatus for executing store instructions in a computer system is described. The present invention includes a unit for separating the store operation into two operations. One of the operations includes an address calculation to determine the destination address in memory to which the data designated by the store instruction is to be stored. The other operation includes a data calculation to produce the data that is to be stored at the destination address. Both operations are executed by an execution unit independently of each other. Once both operations have been executed, the calculated destination address and the produced data are recombined into a single operation that is dispatched by a memory control unit to memory.
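The split-and-recombine flow summarized above can be sketched in software as follows. This is an illustrative model only, not the described hardware: the class names `StoreEntry` and `StoreBuffer`, the `store_id` tag, and the use of a dictionary to stand in for memory are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# A split operation: (kind, store_id, calculation); kind is "address" or "data".
Op = Tuple[str, int, Callable[[], int]]


@dataclass
class StoreEntry:
    """Holds the two halves of one store until both are available."""
    address: Optional[int] = None
    data: Optional[int] = None

    def ready(self) -> bool:
        return self.address is not None and self.data is not None


class StoreBuffer:
    """Hypothetical stand-in for the splitting unit and memory control unit."""

    def __init__(self) -> None:
        self.entries: Dict[int, StoreEntry] = {}
        self.memory: Dict[int, int] = {}  # stand-in for main memory

    def split(self, store_id: int, address_calc: Callable[[], int],
              data_calc: Callable[[], int]) -> Tuple[Op, Op]:
        """Separate one store into an address operation and a data operation."""
        self.entries[store_id] = StoreEntry()
        return ("address", store_id, address_calc), ("data", store_id, data_calc)

    def execute(self, op: Op) -> None:
        """Execute either half; the two halves may arrive in any order."""
        kind, store_id, calc = op
        entry = self.entries[store_id]
        if kind == "address":
            entry.address = calc()
        else:
            entry.data = calc()
        if entry.ready():
            # Recombine both results into a single store dispatched to memory.
            self.memory[entry.address] = entry.data
            del self.entries[store_id]


buf = StoreBuffer()
addr_op, data_op = buf.split(0, lambda: 0x1000 + 8, lambda: 6 * 7)
buf.execute(addr_op)   # the destination address is known before the data
buf.execute(data_op)   # both halves are done; the store goes to memory
```

In this sketch the address half can complete before the data half, so the destination address is available for comparison against later loads while the data is still being produced.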