The present invention relates to the field of memories for computers and particularly to random access memory (RAM) for computers.
Most RAMs can perform concurrent reading and writing, but only to the same address. Without the ability to read and write concurrently to different addresses, two cycles are required: one to perform the read and one to perform the write. Such a two-cycle operation takes twice as long as a single-cycle access, and hence performance may be halved in some circumstances.
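The cycle cost described above can be illustrated with a minimal sketch. The function name `cycles_needed` and its parameters are hypothetical and are not part of the described apparatus; the sketch assumes only that a read and a write complete together in one cycle when the addresses match, or when the RAM supports concurrent access to different addresses.

```python
def cycles_needed(read_addr, write_addr, concurrent_different_addresses=False):
    """Return the cycles needed to complete one read and one write.

    Assumption (illustrative only): a RAM limited to same-address
    concurrency serves both requests in a single cycle only when the
    two addresses are equal; otherwise it needs one cycle per request.
    """
    if read_addr == write_addr or concurrent_different_addresses:
        return 1
    return 2


# Same address: one cycle. Different addresses: two cycles unless the
# RAM can read and write different addresses concurrently.
same = cycles_needed(0x10, 0x10)
different = cycles_needed(0x10, 0x20)
improved = cycles_needed(0x10, 0x20, concurrent_different_addresses=True)
```

Under these assumptions, `different` is twice `same`, which is the factor-of-two slowdown noted above, while `improved` restores the single-cycle cost.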
Computers are of many different designs and include, for example, those operating in accordance with the IBM ESA/390 architecture. One such well-known computer is the Amdahl 5995-A computer. In that computer, the instruction unit (I-Unit) pipeline is a six-stage pipeline consisting of stages D, A, T, B, X, and W for the pipeline processing of instructions. Instructions requiring additional processing time may add segments Z, Z+1, Z+2, and so on after the W segment.
The function of the D stage is to collate the information necessary to reference storage in the A, T, and B stages. The D stage generates the effective address and selects the access key to be used in the reference to storage. The A, T, and B stages fetch operands/data from storage using the current key. The X stage executes arithmetic or other operations using the fetched operands/data to form results. The W (write) stage writes the results of operations to architecturally defined registers or storage.
A computer operating in accordance with the above pipelining is represented as follows in TABLE A:
TABLE A
______________________________________
I.sub.1  DATBXW       flow 1
I.sub.2   DATBXW      flow 2
I.sub.3    DATBXW     flow 3
I.sub.4     DATBXW    flow 4
I.sub.5      DATBXW   flow 5
I.sub.6       DATBXW  flow 6
         time →
______________________________________
In TABLE A, the six instructions I.sub.1, I.sub.2, . . . , I.sub.6 are introduced into the instruction register (IDR) with, for example, a one-cycle offset. If processing at any of the D, A, T, B, X, or W segments cannot be completed in one cycle, then the pipeline must be slowed down so that the operation requiring more time can complete.
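The staggered flows of TABLE A can be reproduced with a short sketch. The helper names `stage_schedule` and `render` are hypothetical, cycles are numbered from 1 for illustration, and every stage is assumed to complete in exactly one cycle (no slowdown).

```python
STAGES = "DATBXW"  # the six I-Unit pipeline stages, in order


def stage_schedule(num_instructions, offset=1):
    """Return, per instruction, a dict mapping stage letter to cycle number.

    Assumptions (illustrative only): cycles start at 1, each stage takes
    one cycle, and successive instructions enter `offset` cycles apart.
    """
    return [
        {stage: i * offset + s + 1 for s, stage in enumerate(STAGES)}
        for i in range(num_instructions)
    ]


def render(schedule):
    """Draw one text row per flow, indented by its starting cycle."""
    lines = []
    for i, stages in enumerate(schedule, start=1):
        indent = " " * (stages["D"] - 1)
        lines.append(f"I{i} {indent}{STAGES}  flow {i}")
    return "\n".join(lines)


schedule = stage_schedule(6)
print(render(schedule))
```

With a one-cycle offset, the sixth instruction enters its D stage in the same cycle in which the first instruction is in its W stage, so all six stages are occupied at once.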
In prior art pipeline systems, without the ability to read and write RAM concurrently, an additional interlock was employed in the pipeline, delaying the read flow for the memory until the write flow had finished. Such a delay required slowing down the pipeline and was implemented with an interlock named the Register Access Interlock (RAI). An example is given below of a pipeline sequence attempting to perform concurrent read and write operations.
TABLE B shows attempted concurrent read and write operations using non-RAI operation and TABLE C shows concurrent read and write operations using RAI operation.
TABLE B
__________________________________________________________________________
1. Write RA14  D A T B X W Z Z1           flow 1
2. --            D A T B X W Z Z1         flow 2
3. --              D A T B X W Z Z1       flow 3
4. Read RA16         D A T B X W Z Z1     flow 4
RAM Write   RAM Read
__________________________________________________________________________
In the example of TABLE B, flow 4 may be invalid in prior art systems in which a read and a write cannot occur simultaneously. To overcome the problem of TABLE B, an RAI interlock can be introduced into the pipeline as represented by TABLE C.
TABLE C
__________________________________________________________________________
Write RA14  D A T B X W Z Z1            flow 1
--            D A T B X W Z Z1          flow 2
--              D A T B X W Z Z1        flow 3
Read RA16         D A T B B X W Z Z1    flow 4
RAM Write   RAI Interlock   RAM Read
__________________________________________________________________________
In TABLE C, when a concurrent read and write are requested, the RAI interlock introduces a one-cycle delay into flow 4, as indicated by the two B segments. If flows 2 and 3 also perform write operations, they can introduce up to two more cycles of delay, further reducing performance.
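The stall behavior described above can be approximated by a minimal sketch. The function name `read_cycle_with_rai` is hypothetical, and the cycle numbers in the example are assumptions chosen only for illustration; the tables do not state the exact cycles in which the RAM accesses occur. The sketch assumes that a read is held, one cycle at a time, for as long as a write occupies the RAM.

```python
def read_cycle_with_rai(read_request_cycle, write_cycles):
    """Return (actual_read_cycle, stall_cycles) for an interlocked read.

    Assumption (illustrative only): the RAM cannot read in any cycle in
    which a write is performed, so the read is delayed cycle by cycle
    until it reaches the first cycle with no write.
    """
    busy = set(write_cycles)
    cycle = read_request_cycle
    while cycle in busy:
        cycle += 1
    return cycle, cycle - read_request_cycle


# One conflicting write (as from flow 1): a one-cycle delay.
single = read_cycle_with_rai(7, [7])

# Writes in three consecutive cycles (as if flows 2 and 3 also wrote):
# the original delay plus up to two more cycles.
triple = read_cycle_with_rai(7, [7, 8, 9])
```

Under these assumed cycle numbers, `single` reports one stall cycle and `triple` reports three, matching the one-cycle delay plus the two additional cycles described above.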
Accordingly, because of the delays introduced when concurrent read/write accesses to memory are required, there is a need for an improved RAM complex that enables concurrent reading and writing to different addresses in memory.