When an Ethernet switch chip is designed, it is usually necessary to use a large-capacity multi-port memory, such as a 2-read and 1-write (supporting 2 read ports and 1 write port simultaneously) memory, a 1-read and 2-write memory, a 2-read and 2-write memory or a memory with more ports.
A supplier generally provides only a 1-read-or-write memory, a 1-read-and-1-write memory, and a 2-read-or-write memory. Thus, a designer can only construct a multi-port memory based on the basic memory units described above.
As shown in FIG. 1, which is a structural schematic view of a digital circuit of a typical 6T SRAM, it can be known that the 6T SRAM consists of six MOS transistors.
With reference to FIG. 2, for convenience of description, FIG. 1 is simplified such that the four MOS transistors in the middle are represented as two inverters connected back to back. A true value of a memory unit is stored on the left side, an inverted value of the memory unit is stored on the right side, the MOS transistor on the left is connected with a bit line X and the MOS transistor on the right is connected with a bit line Y.
In a data reading process, the bit line X and the bit line Y are charged to certain voltage values in advance. Then, a certain voltage is applied to a word line, and the two MOS transistors on the left and the right are turned on. The MOS transistor on the left outputs the stored true value, e.g., “0”, onto the bit line X. The MOS transistor on the right outputs the stored inverted value, e.g., “1”, onto the bit line Y. Finally, the bit line X and the bit line Y output the true value “0” of the memory unit through a connected differential sense amplifier.
In a data writing process, for example, a datum “1” is written in the memory unit. The bit line X inputs “1”, and the bit line Y inputs “0” to form complementary input. Then, the word line is opened, and the two MOS transistors on the left and the right are turned on. In this way, the true value is forcibly modified into “1” and the inverted value is forcibly modified into “0”. It can thus be known that reading and writing of the 6T SRAM cannot be performed simultaneously. Correspondingly, the supplier can provide a dual-port SRAM memory.
With reference to FIG. 3, which is a structural schematic view of a digital circuit of a dual-port SRAM memory, two MOS transistors are added on the basis of the foregoing 6T SRAM. In addition, to prevent the true value of the SRAM memory from flipping during data reading, the four MOS transistors in the middle are larger than the four corresponding MOS transistors of the 6T SRAM. According to data provided by a certain manufacturer for a 14 nm process, the 6T SRAM occupies 0.064 square microns while an 8T SRAM occupies 0.201 square microns, which is about 3.14 times the former.
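The area ratio quoted above can be checked directly from the two cell areas. The following is only a quick arithmetic sketch; the constant names are illustrative and do not come from any manufacturer's data sheet.

```python
# Cell areas quoted in the text for a 14 nm process, in square microns.
AREA_6T_UM2 = 0.064
AREA_8T_UM2 = 0.201

# Ratio of the 8T cell area to the 6T cell area.
ratio = AREA_8T_UM2 / AREA_6T_UM2
print(f"8T / 6T area ratio: {ratio:.2f}")
```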
With reference to FIG. 4, to keep an SRAM memory that supports both reading and writing from occupying an excessive area and to increase the memory density, usage of the 8T SRAM is avoided as much as possible, and the above problem is instead solved through wiring changes on the basis of the 6T SRAM. Specifically, on the basis of the 6T SRAM, one word line is segmented into a left one and a right one, such that either two read ports operating simultaneously or one write port can be provided. In this way, data can be read from the left MOS transistor and from the right MOS transistor simultaneously. It should be noted that the data read from the right MOS transistor need to be inverted before use. Meanwhile, so as not to affect the data reading speed, the sense amplifier for reading needs to be a pseudo-differential amplifier. Thus, the area of the 6T SRAM cell remains unchanged. The only cost is that the number of word lines is doubled, so the overall memory density remains basically unchanged.
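The split-word-line idea can be illustrated with a purely behavioral sketch. It assumes that the two read ports address rows sharing one bit-line pair, that the left word line drives bit line X and the right word line drives bit line Y, and it ignores precharge and sense-amplifier behavior. The class and method names are hypothetical.

```python
class SplitWordLineColumn:
    """Behavioral model of one bit column of 6T cells with split word lines."""

    def __init__(self, depth):
        # Only the true value is stored; the inverted value is derived from it.
        self.true_values = [0] * depth

    def write(self, row, bit):
        # Write: both halves of the word line of `row` are opened and the
        # bit lines X and Y are driven with complementary values.
        self.true_values[row] = bit & 1

    def read_pair(self, row_left, row_right):
        # Left word line of `row_left` opened: bit line X carries the true value.
        bit_line_x = self.true_values[row_left]
        # Right word line of `row_right` opened: bit line Y carries the inverted value.
        bit_line_y = self.true_values[row_right] ^ 1
        # The value sensed on Y is inverted again before use.
        return bit_line_x, bit_line_y ^ 1


column = SplitWordLineColumn(depth=4)
column.write(1, 1)
column.write(2, 0)
print(column.read_pair(1, 2))  # (1, 0)
```

Each bit line serves only one read in this model, which is why a single-ended (pseudo-differential) sense path is needed instead of a fully differential one.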
Therefore, starting from SRAMs whose port types are 1-read or 1-write, 2-read or 2-write, and 1-write or 2-read, the number of ports of the SRAM is increased either by customized design, for example by modifying the memory unit, or by algorithm design.
As shown in FIG. 5, which is a schematic view of a read-and-write operation procedure of a 2R1W memory formed by customized design in another embodiment of the prior art, the ports of the SRAM can be increased through the customized design. In the example of FIG. 4, one word line is segmented into two word lines, and the number of read ports is increased to two. In the prior art, a time-sharing operation technology, namely, performing a read operation on a rising edge of a clock and a write operation on a falling edge of the clock, can also be adopted to expand a basic 1-read or 1-write SRAM into a 1-read and 1-write SRAM. That is, one read operation and one write operation can be performed in the same clock cycle with the memory density basically unchanged.
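The time-sharing technique can be sketched behaviorally as follows. The model assumes that the underlying array performs exactly one access per clock edge and that a read to the address being written in the same cycle returns the old data, because the read happens on the earlier edge. All names are hypothetical.

```python
class TimeSharedSram:
    """A 1-read-or-1-write array accessed once on each clock edge."""

    def __init__(self, depth):
        self.cells = [0] * depth

    def _access(self, addr, data=None):
        # The underlying array performs one operation per access:
        # a read when `data` is None, otherwise a write.
        if data is None:
            return self.cells[addr]
        self.cells[addr] = data
        return None

    def clock_cycle(self, read_addr, write_addr, write_data):
        # Rising edge: the single port is used for the read.
        read_data = self._access(read_addr)
        # Falling edge: the same port is reused for the write.
        self._access(write_addr, write_data)
        return read_data


mem = TimeSharedSram(depth=8)
print(mem.clock_cycle(read_addr=3, write_addr=3, write_data=7))  # 0 (old data)
print(mem.clock_cycle(read_addr=3, write_addr=0, write_data=1))  # 7
```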
The period of the customized design is generally long, as SPICE simulation is required, and a memory compiler is also needed to generate SRAMs of different sizes and types. For suppliers, it usually takes six to nine months to provide a new type of SRAM, and such a customized design is strongly tied to the specific process (such as 14 nm and 28 nm of GlobalFoundries or 28 nm and 16 nm of TSMC). Once the process changes, the custom-designed SRAM library needs to be redesigned.
As shown in FIG. 6, which is a schematic view of a read-and-write operation procedure of a 2R1W memory formed by an algorithm design in an embodiment of the prior art, the algorithm design realizes the 2R1W memory through an algorithm on the basis of an existing SRAM type provided by the suppliers, and its greatest advantages are that customized design is avoided and time is saved. Meanwhile, it is independent of the technology library and can be easily ported between different technology libraries.
In this embodiment, the construction of a 2R1W SRAM based on an SRAM2P is taken as an example. The SRAM2P is an SRAM which supports 1 read port and 1 read/write port. That is, two read operations, or one read operation and one write operation, can be performed simultaneously on the SRAM2P.
In this embodiment, the 2R1W SRAM is built based on the SRAM2P by replicating one SRAM. In this example, SRAM2P_1 on the right is a copy of SRAM2P_0 on the left. During specific operations, each of the two SRAM2Ps serves as a 1-read and 1-write memory. During data writing, data are written into both the left and the right SRAM2Ps. During data reading, data A are always read from SRAM2P_0, and data B are always read from SRAM2P_1, such that one write operation and two read operations can be performed concurrently.
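The replication scheme described above can be sketched as a small behavioral model. It assumes that a read to the address written in the same cycle returns the old data, which the text does not specify. The class name and the `cycle` interface are hypothetical.

```python
class Replicated2R1W:
    """2R1W memory built from two identical SRAM2P copies."""

    def __init__(self, depth):
        self.sram2p_0 = [0] * depth  # serves read port A
        self.sram2p_1 = [0] * depth  # serves read port B

    def cycle(self, read_addr_a, read_addr_b, write=None):
        # Each copy uses one port for its read ...
        data_a = self.sram2p_0[read_addr_a]
        data_b = self.sram2p_1[read_addr_b]
        # ... and its other port for the write, which goes to both copies
        # so that they always hold the same content.
        if write is not None:
            addr, data = write
            self.sram2p_0[addr] = data
            self.sram2p_1[addr] = data
        return data_a, data_b


mem = Replicated2R1W(depth=16)
mem.cycle(0, 0, write=(5, 42))
print(mem.cycle(5, 5, write=(6, 99)))  # (42, 42)
```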
This algorithm design requires an additional copy of the SRAM2P memory, which doubles the memory area and results in a large overhead.
As shown in FIGS. 7a and 7b, which are schematic views of read-and-write operation procedures of 2R1W memories formed by an algorithm design in another embodiment of the prior art, a logically whole 16384-depth SRAM is segmented into four 4096-depth SRAM2Ps numbered 0, 1, 2 and 3. An additional 4096-depth SRAM2P numbered 4 is added to resolve read-and-write conflicts. For reading of the data A and B, the two read operations can always be performed concurrently. When the addresses of the two read operations are in different SRAM2Ps, since any one of the SRAM2Ps can be configured into a 1R1W type, no read-and-write conflict exists. When the addresses of the two read operations are in the same SRAM2P, e.g., SRAM2P_0, both ports of that SRAM2P are taken up by the two read operations, since one SRAM2P can provide at most 2 ports for simultaneous operation. If a write operation targets SRAM2P_0 at exactly this time, the data are instead written into the additional block SRAM2P_4.
In this embodiment, a memory block mapping table is needed to record which memory block stores the valid data. As shown in FIG. 7b, the memory block mapping table has the same depth as a memory block, namely 4096 entries. After initialization, each entry stores the numbers 0 to 4 of all memory blocks in sequence. In the example of FIG. 7a, as SRAM2P_0 has a read-and-write conflict during data writing, the data are actually written into SRAM2P_4. At this time, the corresponding entry of the memory block mapping table is read. Its original content is {0, 1, 2, 3, 4}, which becomes {4, 1, 2, 3, 0} after modification. The first number and the last number in the entry are exchanged, which indicates that the data are actually stored in SRAM2P_4. Meanwhile, SRAM2P_0 becomes the backup block for this entry.
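The conflict handling and mapping-table update described above can be sketched as a behavioral model. It rests on several assumptions that the text does not state explicitly: the last position of each entry holds the number of the backup block, a conflict occurs only when both reads and the write resolve to the same physical block, and reads return the data stored before the write of the same cycle. All identifiers are hypothetical.

```python
BLOCK_DEPTH = 4096
NUM_LOGICAL_BLOCKS = 4
SPARE_SLOT = NUM_LOGICAL_BLOCKS  # last position of an entry: the backup block


class Banked2R1W:
    """2R1W memory built from five SRAM2P blocks and a block mapping table."""

    def __init__(self):
        # Physical blocks SRAM2P_0 .. SRAM2P_4.
        self.blocks = [[0] * BLOCK_DEPTH for _ in range(NUM_LOGICAL_BLOCKS + 1)]
        # One entry per offset, initialized to [0, 1, 2, 3, 4].
        self.block_map = [list(range(NUM_LOGICAL_BLOCKS + 1))
                          for _ in range(BLOCK_DEPTH)]

    def _locate(self, addr):
        # Translate a logical address into (physical block number, offset).
        logical_block, offset = divmod(addr, BLOCK_DEPTH)
        return self.block_map[offset][logical_block], offset

    def cycle(self, read_addr_a, read_addr_b, write=None):
        phys_a, off_a = self._locate(read_addr_a)
        phys_b, off_b = self._locate(read_addr_b)
        data_a = self.blocks[phys_a][off_a]
        data_b = self.blocks[phys_b][off_b]
        if write is not None:
            addr, data = write
            logical_block, offset = divmod(addr, BLOCK_DEPTH)
            entry = self.block_map[offset]
            target = entry[logical_block]
            if phys_a == phys_b == target:
                # Both ports of the target block are taken by the reads:
                # exchange the block number with the backup number and
                # write into the former backup block instead.
                entry[logical_block], entry[SPARE_SLOT] = (
                    entry[SPARE_SLOT], entry[logical_block])
                target = entry[logical_block]
            self.blocks[target][offset] = data
        return data_a, data_b


mem = Banked2R1W()
mem.cycle(0, 1, write=(5, 55))   # both reads and the write hit SRAM2P_0
print(mem.block_map[5])          # [4, 1, 2, 3, 0]
print(mem.cycle(5, 4096))        # (55, 0)
```

Because each entry is a permutation of the five block numbers, the backup block is always different from the conflicting block, so the redirected write never collides with the two reads.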
During data reading, it is necessary to first read the entry of the memory block number mapping table at the corresponding address, to check which memory block stores the valid data. For example, when data at the address 5123 are to be read, the content stored at the address 1027 (5123−4096=1027) of the memory block number mapping table is read first. Since the address 5123 falls in the second logical block, the content at the address 1027 of the memory block indicated by the second number in that entry is then read.
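The address arithmetic in this example can be written out as a short sketch. It assumes that the logical block number is the quotient of the address by the block depth and that the k-th number of a mapping-table entry belongs to the k-th logical block. The function names are hypothetical.

```python
BLOCK_DEPTH = 4096


def decode(addr):
    """Split a logical address into (logical block number, offset)."""
    return divmod(addr, BLOCK_DEPTH)


def physical_block(mapping_entry, logical_block):
    """Look up which physical block currently holds the valid data."""
    return mapping_entry[logical_block]


logical_block, offset = decode(5123)
print(logical_block, offset)                            # 1 1027
# Freshly initialized entry: the data are still in SRAM2P_1.
print(physical_block([0, 1, 2, 3, 4], logical_block))   # 1
# Hypothetical entry after an earlier redirected write to this address.
print(physical_block([0, 4, 2, 3, 1], logical_block))   # 4
```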
For the data writing operation, the memory block number mapping table needs to provide one read port and one write port. For the two data reading operations, the memory block number mapping table needs to provide two read ports. Thus, the memory block number mapping table needs to provide three read ports and one write port in total, and these four access operations have to be performed simultaneously.
Therefore, in contrast with the first design, this algorithm design saves ¾ of the additional SRAM area (one extra 4096-depth block instead of a full 16384-depth copy) but needs an additional memory block number mapping table. Every read or write operation needs to read the mapping table first, which increases the access delay. Meanwhile, the memory block number mapping table needs four access ports, namely three read ports and one write port, which can only be satisfied by adopting a special SRAM design or by directly adopting a register array.
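The overhead comparison can be made concrete with a short calculation. The extra storage is counted in memory words; the size of the mapping table additionally assumes that each block number is stored in the minimum number of bits needed to distinguish five blocks, which the text does not specify.

```python
LOGICAL_DEPTH = 16384      # depth of the logical 2R1W memory
BLOCK_DEPTH = 4096         # depth of one SRAM2P block
NUM_PHYSICAL_BLOCKS = 5    # SRAM2P_0 .. SRAM2P_4

# Extra storage (in words) beyond the logical depth.
extra_replication = LOGICAL_DEPTH   # first design: a full second copy
extra_banked = BLOCK_DEPTH          # second design: one additional block
saving = 1 - extra_banked / extra_replication
print(f"extra SRAM saved: {saving:.0%}")                 # 75%

# Mapping table: one entry per offset, five block numbers per entry.
# Assumption: 3 bits per number, the minimum for the values 0..4.
bits_per_number = (NUM_PHYSICAL_BLOCKS - 1).bit_length()
mapping_table_bits = BLOCK_DEPTH * NUM_PHYSICAL_BLOCKS * bits_per_number
print(f"mapping table size: {mapping_table_bits} bits")  # 61440 bits
```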