A storage device, for example a storage array comprising a number of storage cells, is accessed via read-ports and write-ports. Whenever a read- or write-access to said storage array is to be performed, an address is applied to the storage array, and data is written to or read from the storage array via said read-ports and/or write-ports.
It is known to provide a number of read-ports and/or write-ports, in order to allow for simultaneous, separate read- and write-accesses, which may occur during one and the same clock-cycle.
As long as the simultaneous read- and write-accesses are directed to different addresses, no hazards occur, as each access reads from or writes to a different storage location. Even if two simultaneous read-accesses are directed via two separate read ports to one and the same storage address, there is no hazard. The content of said address is simply forwarded to both read ports that have addressed said storage location. Thus, simultaneous read-accesses to one storage location can easily be performed.
A more difficult situation arises as soon as one write- and one read-access are directed to one and the same address in storage. The data written to said storage address via a write port should be obtained, in the same cycle, by the read-access. This feature of immediately directing the data applied to a write port to a read port addressing the same storage location is called "write-through".
There are two ways in which a write-through can be implemented. A first solution is to divide the clock cycle into a number of subcycles, with the write-access taking place during a first subcycle and the read-access taking place during a second subcycle. Because said read-subcycle is delayed with respect to said write-subcycle, the read-access obtains the data that has been written to said storage address during the write-subcycle. Implementing a write-through via such a cascaded calculation means that several subtasks have to be carried out, one after the other, within the same clock cycle. Each clock cycle therefore has to be at least as long as the sum of the subcycle durations, and the clock frequency cannot be increased beyond the corresponding limit.
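The cascaded approach can be illustrated with a minimal sketch. The function names `subdivided_cycle` and `min_clock_period_ps`, and the subcycle durations used below, are assumptions made for illustration only; they do not describe any particular circuit.

```python
def subdivided_cycle(cells, write, read_addr):
    """Model one clock cycle split into a write-subcycle and a read-subcycle.

    cells:     list holding the array content (modified in place)
    write:     (address, data) pair applied to the write port
    read_addr: address applied to the read port
    """
    waddr, wdata = write
    cells[waddr] = wdata        # first subcycle: perform the write-access
    return cells[read_addr]     # second subcycle: the read sees the new data


def min_clock_period_ps(t_write_ps, t_read_ps):
    """Lower bound on the clock period: both subcycles must fit into one cycle."""
    return t_write_ps + t_read_ps


cells = [0, 0, 0, 0]
read_value = subdivided_cycle(cells, write=(1, 5), read_addr=1)

# Hypothetical subcycle durations of 400 ps and 600 ps.
period_ps = min_clock_period_ps(400, 600)
```

With these assumed durations, the cycle cannot be shorter than 1000 ps, whereas an array that did not have to serialize the two accesses would only be bounded by the longer of the two.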
Another possible way of implementing a write-through is to provide extra logic at each of the write-ports and read-ports, in order to detect address matches between any of the write-ports and any of the read-ports. Whenever such a match is detected, the data that is to be written to a certain storage location is immediately forwarded to the respective read-port addressing said storage location. The disadvantages of the extra logic in the write- and read-paths are the resulting performance degradation and the consumption of chip area.
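The address-match approach can be sketched as a small behavioral model. The class `WriteThroughArray` and its `cycle` method are hypothetical names introduced for illustration, and the model assumes that at most one write-port addresses a given storage location per cycle.

```python
class WriteThroughArray:
    """Toy model of a multiport array with address-match (bypass) logic."""

    def __init__(self, size):
        self.cells = [0] * size

    def cycle(self, writes, read_addrs):
        """Simulate one clock cycle.

        writes:     list of (address, data) pairs, one per active write-port
        read_addrs: list of addresses, one per active read-port
        Assumption: no two writes target the same address in one cycle.
        """
        results = []
        for raddr in read_addrs:
            value = self.cells[raddr]          # content before this cycle
            for waddr, wdata in writes:        # one comparator per port pair
                if waddr == raddr:
                    value = wdata              # forward write data to read-port
            results.append(value)
        for waddr, wdata in writes:            # commit the writes to the array
            self.cells[waddr] = wdata
        return results


array = WriteThroughArray(4)
array.cells[2] = 7
read_values = array.cycle(writes=[(2, 9)], read_addrs=[2, 1, 2])
```

The nested loops make the cost visible: every read-port is compared against every write-port, so the number of comparators grows with the product of the two port counts.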
Next, consider the case in which more than one write-access to a certain storage location occurs during one and the same cycle. In case identical data is written, via two different write ports, to one storage location, the solution to this hazard is rather simple: one pipe has to suppress the write-access of the other pipe, because one single write-access is fully sufficient.
There is, however, also the case in which at least two write-accesses to one address are performed in one and the same cycle, with said write-accesses attempting to write different data. One example is an array of status bits which is accessed via several pipes. Each access may modify different bits, and therefore the data written by each of the different write-accesses might differ. As it is necessary to record all changes to said status bits, even if they occur simultaneously, suppressing all except one of the write-accesses would not lead to correct results.
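The lost update can be made concrete with a minimal sketch. The 4-bit status word, the two pipes, and the function `suppress_all_but_first` are assumptions chosen for illustration.

```python
def suppress_all_but_first(old, writes):
    """Keep only the first write-access to the address and drop the others."""
    return writes[0] if writes else old


old_status = 0b0000
write_a = old_status | 0b0001   # pipe A sets status bit 0
write_b = old_status | 0b0100   # pipe B sets status bit 2

# Both pipes address the same status word in the same cycle.
stored = suppress_all_but_first(old_status, [write_a, write_b])

# What the array should hold if both changes were recorded.
expected = 0b0101
```

Here `stored` ends up as `0b0001`: the change made by pipe B is lost, which is why simple suppression is not acceptable when the write data differs.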
One solution to the problem of simultaneous write-accesses to one storage address is to provide means for a cascaded calculation of the write-accesses. In each subcycle, one write-access is taken care of, and by this sequential consideration of write-accesses, it is possible to maintain a correct status of the array at each point in time. Again, the disadvantage of dividing the clock-cycle is that limits are imposed on the maximum possible clock frequency.
Another solution to the problem of simultaneous write-accesses to one and the same storage address is to implement extra logic in the write paths in order to combine several write-accesses into one resulting write-access. In case an array of status bits is updated via several write pipes, a large amount of complex extra logic would be required, which would slow down performance. Especially in case of a large number of write pipes, all possible combinations of write-accesses would have to be considered in said extra logic.
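One conceivable form of such combining logic is sketched below. It assumes that each write-access carries a bit mask selecting the status bits it modifies, and that a later port takes priority where masks overlap; both the `(mask, value)` representation and the function `combine_writes` are hypothetical and not taken from any specific design.

```python
def combine_writes(old, writes):
    """Merge several masked write-accesses to one address into a single value.

    old:    current content of the status word
    writes: list of (mask, value) pairs, one per write pipe; only the bits
            selected by mask are taken from value
    Assumption: on overlapping masks, the later entry in the list wins.
    """
    result = old
    for mask, value in writes:
        result = (result & ~mask) | (value & mask)
    return result


# Pipe A sets bit 0, pipe B sets bit 2, bit 3 was already set and is untouched.
merged = combine_writes(0b1000, [(0b0001, 0b0001), (0b0100, 0b0100)])

# Pipe A clears bit 3 while pipe B sets bit 0 in the same cycle.
merged_clear = combine_writes(0b1000, [(0b1000, 0b0000), (0b0001, 0b0001)])
```

In software the merge is a short loop; in hardware the same function has to be evaluated combinationally for every set of write pipes that may address the same location in one cycle, which is the source of the logic overhead described above.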
All the solutions described so far are based on one multiport cell being accessed by a plurality of write pipes via several write ports. Consider the example in which said storage array holds status information which is updated, via several write pipes, by various on-chip facilities. This requires physical write paths connecting each of said facilities to the central array of said status bits. Thus, a multiport cell always represents a "hotspot", which has to be connected to many different chip locations and which therefore imposes severe restrictions on chip layout and chip wiring.