1. Field of the Invention
This invention relates to the field of pipelining memory accesses. In particular, the invention relates to pipelining ordered memory accesses.
2. Description of Related Art
As processor performance continues to outpace memory performance, reducing memory access latency is critical to achieving high performance in today's computer systems. One method of reducing this latency is pipelining. Pipelining is a mechanism in which several stages or phases in the processing of instructions are carried out in an overlapped or parallel manner so that overall instruction throughput is increased.
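The throughput benefit of overlapping stages can be illustrated with a short calculation. The sketch below is only an idealized model: it assumes every stage takes exactly one cycle and that no stalls occur, and the function names are hypothetical.

```python
def unpipelined_cycles(num_items: int, num_stages: int) -> int:
    """Cycles needed when each item finishes all stages before the next starts."""
    return num_items * num_stages


def pipelined_cycles(num_items: int, num_stages: int) -> int:
    """Cycles needed when stages overlap (idealized: one cycle per stage, no stalls).

    The first item takes num_stages cycles to drain through the pipeline;
    each later item completes one cycle after its predecessor.
    """
    if num_items == 0:
        return 0
    return num_stages + (num_items - 1)


if __name__ == "__main__":
    # 100 items through a 5-stage pipeline under the assumptions above.
    print(unpipelined_cycles(100, 5), pipelined_cycles(100, 5))
```

Under these assumptions, 100 items through five stages take 500 cycles without overlap and 104 cycles with overlap.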
In a typical prior art architecture, bus activity is hierarchically organized into operations, transactions, and phases. An operation is a bus procedure that appears atomic to software even though it may not be atomic on the bus. A transaction is the set of bus activities related to a single bus request. A transaction may contain up to six phases. A phase uses a specific set of signals to communicate a particular type of information. The six phases of the prior art processor bus protocol are: arbitration, request, error, snoop, response, and data.
In the arbitration phase, a bus agent that is not the current bus owner requests the bus. In the request phase, the bus owner drives request and address information on the bus. In the error phase, any parity errors triggered by the request are reported. In the snoop phase, it is determined whether the address references a valid or modified (dirty) cache line. The snoop results also indicate whether a transaction will be completed in order or may be deferred for possible out-of-order completion. In the response phase, the status of the transaction is reported (whether it failed or succeeded, whether completion is immediate or deferred, etc.). The data phase is needed when the bus agent requests a data transfer such as a read or a write.
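The phase sequence described above can be sketched as a small data model. This is an illustrative reading of the text, not a definition of any actual bus protocol: the names `Phase` and `phases_for` are hypothetical, and the sketch assumes that arbitration is skipped when the agent already owns the bus and that the data phase occurs only when a data transfer is requested.

```python
from enum import Enum
from typing import List


class Phase(Enum):
    """The six phases named in the text, in protocol order."""
    ARBITRATION = 1
    REQUEST = 2
    ERROR = 3
    SNOOP = 4
    RESPONSE = 5
    DATA = 6


def phases_for(owns_bus: bool, transfers_data: bool) -> List[Phase]:
    """Return the phases one transaction passes through (hypothetical helper).

    Assumption: an agent that is already the bus owner does not arbitrate,
    and only requests that move data (e.g. read or write) use the data phase.
    """
    phases = []
    if not owns_bus:
        phases.append(Phase.ARBITRATION)
    phases += [Phase.REQUEST, Phase.ERROR, Phase.SNOOP, Phase.RESPONSE]
    if transfers_data:
        phases.append(Phase.DATA)
    return phases


if __name__ == "__main__":
    # A read issued by an agent that does not yet own the bus uses all six phases.
    print([p.name for p in phases_for(owns_bus=False, transfers_data=True)])
```

In this model a transaction contains between four and six phases, which is consistent with the statement that a transaction may contain up to six.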
Instructions are typically executed in order, i.e., in the same order in which they were originally written. Therefore, the transactions initiated by the instructions are required to be ordered. If, for some reason, a transaction cannot be performed immediately, it has to be deferred. The system asserts a signal to inform the processor that the transaction has been deferred, so that the processor is prevented from issuing further order-dependent transactions. Because the deferred transaction takes some time to complete, the processor has to wait until it receives a signal reporting the status of the deferred transaction before it can issue the next order-dependent transaction. This waiting time imposes a penalty on overall system throughput, especially in high performance systems.
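The cost of this waiting can be made concrete with a toy model. The sketch below is a simplification under stated assumptions, not a description of any real processor: the function name, the one-cycle issue cost, and the fixed ten-cycle deferral latency are all hypothetical values chosen for illustration.

```python
from typing import Iterable


def cycles_with_defer_stalls(deferred_flags: Iterable[bool],
                             issue_cycles: int = 1,
                             defer_latency: int = 10) -> int:
    """Cycles until the processor is free to issue a further ordered transaction.

    Each entry in deferred_flags is True if that transaction is deferred.
    Assumption: after a deferred transaction, the processor issues nothing
    order-dependent until defer_latency cycles have passed.
    """
    cycles = 0
    for deferred in deferred_flags:
        cycles += issue_cycles       # time to issue this transaction
        if deferred:
            cycles += defer_latency  # stall until the deferred status arrives
    return cycles


if __name__ == "__main__":
    flags = [False, True, False, False]  # second transaction is deferred
    stalled = cycles_with_defer_stalls(flags)
    no_stall = cycles_with_defer_stalls(flags, defer_latency=0)
    print(stalled, no_stall, stalled - no_stall)
```

With these assumed numbers, a single deferral among four transactions raises the issue time from 4 cycles to 14, so the stall dominates the total.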
It is, therefore, desirable to have a mechanism that allows the processor to continue issuing in-order transactions without waiting for the status of the deferred transactions.