Multi-port memories are widely used in electronic applications in which high-speed data transfer is critical, including, but not limited to, data buffering, video processing, and data communications. Multi-port memory (e.g., dual-port memory), unlike its single-port counterpart, is generally characterized by its ability to read data from or write data to the memory on one port while simultaneously reading a second piece of data from or writing a second piece of data to the memory on another port. Hence, each port provides a separate, independent access path for reading data from the memory or writing new data into the memory. One embodiment of a multi-port memory is a two-port memory, such as a single-port read, single-port write (1R1W) memory, which has a dedicated read port and a dedicated write port.
Multi-port memory is typically implemented using static random access memory (SRAM). In a conventional single-port architecture, each bit in an SRAM cell is stored on four transistors that form two cross-coupled inverters operative as a storage element of the memory cell. Two additional transistors serve to control access to the storage element during read and write operations. A typical SRAM cell thus uses six transistors and is often referred to as a 6T SRAM. In a multi-port architecture, two additional access transistors are generally used for each additional port; hence two-port functionality would be provided by an eight-transistor (8T) SRAM, three-port functionality would be provided by a ten-transistor (10T) SRAM, and so on. However, because implementing a true monolithic multi-port memory can consume a significant amount of area and power on an integrated circuit (IC) chip, various memory architectures have been proposed which instead employ single-port memory cells, often referred to as single-port read/write (1RW) memories. Each of these architectures has its own inherent disadvantages.
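The transistor counts given above follow a simple pattern: four transistors for the storage element plus two access transistors per port. The following Python sketch merely restates that arithmetic for illustration. The function name `sram_cell_transistors` is hypothetical, and the formula reflects only the general rule described above, not any particular cell design.

```python
def sram_cell_transistors(ports: int) -> int:
    """Transistors per SRAM cell under the general rule described above.

    Four transistors form the two cross-coupled inverters (the storage
    element); each port adds two access transistors.
    """
    if ports < 1:
        raise ValueError("a memory cell needs at least one port")
    storage_transistors = 4
    access_transistors_per_port = 2
    return storage_transistors + access_transistors_per_port * ports


if __name__ == "__main__":
    for ports in (1, 2, 3):
        print(f"{ports}-port cell: {sram_cell_transistors(ports)}T")
```

Under this rule, each additional port adds two transistors to every cell in the array, which is one reason a true multi-port memory grows in area as ports are added.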
In one known approach, often referred to as double-pumping, time-domain multiplexing of the memory clock is utilized. Using this approach, two-port memory functionality might be achieved using multiple single-port memory cells, with half of the memory clock cycle being dedicated to read operations and the other half being dedicated to write operations. By multiplexing the clock in this manner, conflicts between read and write accesses of the same memory cell during a given memory cycle can be avoided. Although savings in chip area can be achieved using this approach, the data path is now narrower and has less bandwidth compared to an implementation using true two-port memory cells, and thus the memory system must, overall, be slower. Since the memory is, in effect, required to run at twice the clock rate of a memory composed of true two-port memory cells, the maximum operating frequency of the resulting two-port memory is typically low (e.g., about 400 MHz for a 45-nanometer (nm) IC fabrication process).
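By way of illustration only, the double-pumping approach described above might be modeled behaviorally as in the following Python sketch. All class and function names are hypothetical. The ordering of the two half-cycles (read first, then write) and the 800 MHz figure in the usage lines are assumptions made for the example and are not taken from any particular implementation.

```python
class SinglePortMemory:
    """Single-port (1RW) memory: one read OR one write per internal cycle."""

    def __init__(self, depth: int):
        self.cells = [0] * depth
        self.internal_cycles = 0  # internal clock cycles consumed so far

    def read(self, addr: int) -> int:
        self.internal_cycles += 1
        return self.cells[addr]

    def write(self, addr: int, data: int) -> None:
        self.internal_cycles += 1
        self.cells[addr] = data


class DoublePumpedTwoPort:
    """Two-port (1R1W) behavior built on a single-port core.

    Each external cycle is split into two halves. Assumption for this
    sketch: the first half serves the read, the second half the write.
    """

    def __init__(self, depth: int):
        self.core = SinglePortMemory(depth)

    def cycle(self, read_addr=None, write_addr=None, write_data=0):
        read_data = None
        if read_addr is not None:
            read_data = self.core.read(read_addr)      # first half-cycle
        if write_addr is not None:
            self.core.write(write_addr, write_data)    # second half-cycle
        return read_data


def external_clock_mhz(internal_max_mhz: float) -> float:
    """The core runs two accesses per external cycle, so the external
    clock can be at most half the core's maximum frequency."""
    return internal_max_mhz / 2


if __name__ == "__main__":
    mem = DoublePumpedTwoPort(depth=8)
    mem.cycle(write_addr=3, write_data=42)
    # Read and write the same address in one external cycle: no conflict,
    # because the two accesses occupy different half-cycles.
    print(mem.cycle(read_addr=3, write_addr=3, write_data=7))
    print(external_clock_mhz(800))  # 800 MHz is a hypothetical core limit
```

The sketch shows both sides of the trade-off: same-address read and write accesses never collide, but the external clock is limited to half of whatever frequency the single-port core can sustain.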
Another approach is to divide the dual-port memory into banks of single-port memory cells. Provided there are no bank conflicts (i.e., the read address and the write address do not require accessing the same single-port memory bank during the same memory cycle), the memory can theoretically run at the maximum frequency of the single-port memory cells. When a bank conflict does arise, however, a pipeline stall will typically occur, resulting in a latency penalty and the need for complex arbitration or control logic outside of the memory. Moreover, the latency of the memory will not be constant, but will instead be dependent on the specific read and write addresses. The pipeline stall may also reduce effective memory throughput, since only one memory access, rather than two, is performed during a stall.
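The effect of bank conflicts described above can be illustrated with the following minimal Python sketch. It assumes, purely for the example, that addresses are mapped to banks by their low-order bits (address modulo the number of banks) and that a conflict costs exactly one extra cycle. The function names are hypothetical, and an actual design may use a different mapping or stall penalty.

```python
def bank_of(addr: int, num_banks: int) -> int:
    """Assumed mapping for this sketch: low-order address interleaving."""
    return addr % num_banks


def simulate(requests, num_banks: int):
    """Count cycles and stalls for a sequence of (read_addr, write_addr) pairs.

    With no conflict, the read and the write go to different single-port
    banks and both complete in one cycle. With a conflict, the shared bank
    can serve only one access per cycle, so the pair takes two cycles.
    """
    cycles = 0
    stalls = 0
    for read_addr, write_addr in requests:
        if bank_of(read_addr, num_banks) == bank_of(write_addr, num_banks):
            cycles += 2   # one access per cycle on the conflicting bank
            stalls += 1
        else:
            cycles += 1   # read and write proceed in parallel
    return cycles, stalls


if __name__ == "__main__":
    # With 4 banks: (0, 1) and (2, 7) are conflict-free; (4, 8) both map to bank 0.
    requests = [(0, 1), (4, 8), (2, 7)]
    cycles, stalls = simulate(requests, num_banks=4)
    accesses = 2 * len(requests)
    print(f"cycles={cycles} stalls={stalls} accesses/cycle={accesses / cycles}")
```

In this example, six accesses complete in four cycles rather than three, so the effective throughput falls from two accesses per cycle to 1.5, and the latency seen by a given request depends on the addresses involved.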