Computer systems may employ a multi-level hierarchy of memory, with relatively fast, expensive but limited-capacity memory at the highest level of the hierarchy and proceeding to relatively slower, lower cost but higher-capacity memory at the lowest level of the hierarchy. At the highest level of the memory hierarchy, computers commonly have register structures implemented, which are typically limited in capacity but provide very fast access thereto. Such register structures may be referred to as "register files," and various such register structures may be implemented for a system, such as an integer register structure and floating point register structure. A register structure enables high speed memory access, and is typically capable of satisfying a memory access request (e.g., a read or write request) in one clock cycle (i.e., one processor clock cycle). Various lower levels of memory may be implemented including a small fast memory called a cache, either physically integrated within a processor or mounted physically close to the processor for speed, as well as the main memory (e.g., DRAM) and mass storage (e.g., the disk drive) of a computer system.
Static random access memory (SRAM) is typically implemented for register structures of a computer system for storing data therein. Generally, SRAM memory is a type of memory that is very reliable and very fast. Unlike dynamic random access memory (DRAM), SRAM does not need to have its electrical charges constantly refreshed. As a result, SRAM memory is typically faster and more reliable than DRAM memory. Unfortunately, SRAM memory is generally much more expensive to produce than DRAM memory. Due to its high cost, SRAM is typically implemented only for the most speed-critical parts of a computer, such as for register structures. However, SRAM memory may be implemented for other memory components of a computer system, as well. Moreover, types of memory other than SRAM (e.g., other types of RAM) may be implemented within a computer system for a register structure.
To enable greater efficiency in processing instructions, multiple ports are commonly implemented within a computer system. For instance, multiple ports may be implemented such that each port is capable of satisfying a memory access request (e.g., a read or write instruction) in parallel with the other ports satisfying such a memory access request. Accordingly, various register structures have been developed to enable access thereto by multiple ports. That is, multi-ported register structures are commonly implemented in the prior art to enable multiple ports to access the register structure to satisfy a memory access request.
Register structures of the prior art are typically implemented with dual-ended writes through an N-channel field effect transistor (NFET) into a latch. FIG. 1A illustrates a typical dual-ended SRAM cell 100 of the prior art. The exemplary implementation of FIG. 1A illustrates a multi-ported SRAM structure, which comprises a typical SRAM cell comprising cross-coupled inverters 126 and 128 for storing data (i.e., for storing one bit of data). Additionally, NFETs 102 and 112 are provided, which enable writes from a first port (i.e., port 0). That is, a write is accomplished to the SRAM cell by passing a voltage level across NFETs 102 and 112 into the cross-coupled inverters 126 and 128. Also, a second port (i.e., port 1) is coupled to the SRAM cell 100 by implementing NFETs 122 and 124, which enable writes from the second port to the SRAM cell 100. The multi-ported SRAM structure 100 of FIG. 1A is well-known in the art and is commonly implemented in integrated circuits of the prior art. The SRAM cell 100 of FIG. 1A is a memory cell capable of storing one bit of data (i.e., a logic 1 or a logic 0). Thus, many of such SRAM cells 100 are typically implemented within a system to provide the desired amount of SRAM memory.
Either of the two ports (i.e., port 0 and port 1) coupled to the SRAM cell 100 may write data into the cell to satisfy a memory write request. As shown, BIT_P0, NBIT_P0, and WORD_0 lines are implemented to enable a write for port 0 to the SRAM cell 100, and BIT_P1, NBIT_P1, and WORD_1 lines are implemented to enable a write for port 1 to the SRAM cell 100. The BIT_P0 and BIT_P1 lines may be referred to herein as data carriers for port 0 and port 1, respectively, and NBIT_P0 and NBIT_P1 may be referred to herein as complementary data carriers for port 0 and port 1, respectively. The operation of this prior art implementation is well known in the prior art, and therefore will be described only briefly herein. Typically, the BIT_P0 and BIT_P1 lines are held to a high voltage level (i.e., a logic 1), unless one of them is actively pulled to a low voltage level (i.e., a logic 0). For instance, when writing data from port 0 to the SRAM cell 100, the BIT_P0 line is actively driven low by an outside source (e.g., an instruction being executed by the processor) if the outside source desires to write a 0 to the SRAM cell 100, and the NBIT_P0 line is held to a high voltage level (the opposite of BIT_P0). Otherwise, if an outside source desires to write a 1 to the SRAM cell 100, the BIT_P0 line remains high and the NBIT_P0 line is pulled low. Thereafter, the WORD_0 line is fired (e.g., caused to go to a high voltage level), at which time the value of the BIT_P0 line is written into the SRAM cell 100. More specifically, the voltage level of BIT_P0 is transferred across NFET 102 and the voltage level of NBIT_P0 is transferred across NFET 112 to accomplish a write of the value of BIT_P0 to DATA of the cross-coupled inverters 126 and 128.
A similar operation is performed when writing data from port 1 to the SRAM cell 100. For instance, when writing data from port 1 to the SRAM cell 100, the BIT_P1 line is actively driven low by an outside source (e.g., an instruction being executed by the processor) if the outside source desires to write a 0 to the SRAM cell 100, and the NBIT_P1 line is held to a high voltage level (the opposite of BIT_P1). Otherwise, if an outside source desires to write a 1 to the SRAM cell 100, the BIT_P1 line remains high and the NBIT_P1 line is pulled low. Thereafter, the WORD_1 line is fired, at which time the value of the BIT_P1 line is written into the SRAM cell 100. More specifically, the voltage level of BIT_P1 is transferred across NFET 122 and the voltage level of NBIT_P1 is transferred across NFET 124 to accomplish a write of the value of BIT_P1 to DATA of the cross-coupled inverters 126 and 128. The data value written into the SRAM cell 100 (e.g., a logic 0 or logic 1) is shown as DATA in FIG. 1A, and the complement of such value is shown as NDATA. The SRAM register structure illustrated in FIG. 1A is referred to as a dual-ended write structure because it utilizes two lines to write a data value into the SRAM cell 100. For instance, it requires both a data carrier and a complementary data carrier (e.g., BIT_P0 and NBIT_P0) to write a value to the SRAM cell 100 from port 0, and it requires both a data carrier and a complementary data carrier (e.g., BIT_P1 and NBIT_P1) to write a value to the SRAM cell 100 from port 1.
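The dual-ended write sequence described above for FIG. 1A may be sketched at the logic level as follows. This is only an illustrative sketch and not a circuit simulation: voltage levels are reduced to 0 and 1, and the class name `DualEndedCell`, its method names, and the choice to reject non-complementary BIT/NBIT values are assumptions introduced here rather than features shown in FIG. 1A.

```python
class DualEndedCell:
    """Logic-level sketch of a one-bit SRAM cell with dual-ended write ports.

    Hypothetical model for illustration; voltages are reduced to 0 and 1.
    """

    def __init__(self, num_ports=2, initial=0):
        self.num_ports = num_ports
        self.data = initial  # DATA node of the cross-coupled inverters

    @property
    def ndata(self):
        # NDATA is always the complement of DATA.
        return 1 - self.data

    def write(self, port, bit, nbit, word):
        """Apply BIT_Pn / NBIT_Pn and, if WORD_n is fired, store BIT_Pn."""
        if not 0 <= port < self.num_ports:
            raise ValueError("no such port")
        if not word:
            return  # pass NFETs are off, so the cell keeps its value
        if bit not in (0, 1) or nbit != 1 - bit:
            # Assumption of this sketch: a write needs complementary lines.
            raise ValueError("BIT and NBIT must be complementary")
        self.data = bit


cell = DualEndedCell()
cell.write(port=0, bit=0, nbit=1, word=1)  # port 0 writes a 0
cell.write(port=1, bit=1, nbit=0, word=0)  # WORD_1 not fired: no change
```

In this sketch, a port changes the stored value only when its word line is fired, which mirrors the role of the pass NFETs in the description above.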
Typically, multiple SRAM cells, such as SRAM cell 100, are connected to a single BIT line (e.g., BIT_P0) and a single NBIT line (e.g., NBIT_P0). Accordingly, a single BIT line may be utilized to carry data to/from multiple ones of SRAM cells 100 for a port. Therefore, even though only SRAM cell 100 is shown, it should be understood that many such SRAM cells may be connected to the BIT_P0 and NBIT_P0 lines for port 0, as well as to the BIT_P1 and NBIT_P1 lines for port 1, to form a group of SRAM cells. Additionally, it should be recognized that additional ports may be coupled to the SRAM cell 100. Thus, even though only two ports (port 0 and port 1) are shown as being coupled to SRAM cell 100, SRAM cell 100 may have any number of ports coupled thereto. In general, it is desirable to have a large number of ports coupled to each SRAM cell 100 in order to increase the number of instructions that may be processed in parallel, and thereby increase the efficiency of a system.
The dual-ended register structure illustrated in FIG. 1A is problematic in that it requires an undesirably large amount of surface area for its implementation. More specifically, the dual-ended register structure of FIG. 1A requires an undesirably large number of high-level metal tracks (or lines) to be implemented. For example, the metal tracks for implementing BIT_P0, NBIT_P0, WORD_0, BIT_P1, NBIT_P1 and WORD_1 are typically high-level metal tracks that span several register structures. Such high-level metal tracks are commonly referred to as "metal-two tracks" or "metal-three tracks," whereas lower-level metal tracks that are implemented within a single register structure, for example, are commonly referred to as "metal-one tracks." Because of the size and spacing requirements of high-level tracks, such high-level tracks often require a relatively large amount of surface area. Typically, high-level tracks consume more surface area than is required for the component parts (e.g., FETs) of a register structure. That is, a small device geometry process may be utilized for the component parts of a register structure, such as FETs, wherein the component parts may require much less surface area than the amount of surface area required for implementing the high-level metal tracks for a register structure. For example, in a small device geometry process commonly used today, component parts of a register structure may be approximately 0.18 micron in process size (i.e., the actual drawn size for the component parts). Additionally, the component parts (e.g., FETs) of a register structure may typically be implemented below the high-level metal tracks. Accordingly, the high-level tracks implemented for register structure 100 typically dictate the amount of surface area required for such register structure.
As a result, it becomes desirable to limit the number of high-level metal tracks that are required to be implemented in order to reduce the overall surface area required for the register structure. More specifically, it is desirable to provide an optimum number of high-level metal tracks that require the minimum amount of surface area below which the actual components of a register structure may be implemented. Ideally, the number of high-level metal tracks required for a register structure design would require no more than the surface area required to implement the actual components (e.g., FETs) of the register structure.
In the prior art implementation of FIG. 1A, three high-level lines are required to be implemented for each port that is coupled to the SRAM cell 100. As shown in FIG. 1A, high-level wires or metal traces must be implemented for three lines for port 1 (i.e., BIT_P1, NBIT_P1, and WORD_1). Therefore, if a third port were implemented for the SRAM cell 100, three additional high-level lines (i.e., BIT_P2, NBIT_P2, and WORD_2) would be required to be added to the design of FIG. 1A. As a result, the prior art multi-ported structure of FIG. 1A is undesirable because it requires an undesirably large number of high-level lines to be implemented for each port coupled to the SRAM cell 100. Thus, the prior art implementation of FIG. 1A results in an undesirably high cost and an undesirably large consumption of surface area for each port implemented therein.
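The per-port growth in high-level lines described above may be illustrated with a short sketch. The function name `fig_1a_high_level_lines` is hypothetical; the line names simply follow the BIT_Pn, NBIT_Pn, and WORD_n naming pattern used in the description of FIG. 1A.

```python
def fig_1a_high_level_lines(num_ports):
    """List the high-level lines a FIG. 1A style cell needs for its write ports.

    Each port contributes three lines: a data carrier, a complementary
    data carrier, and a word line.
    """
    lines = []
    for p in range(num_ports):
        lines += [f"BIT_P{p}", f"NBIT_P{p}", f"WORD_{p}"]
    return lines


print(len(fig_1a_high_level_lines(2)))   # 6
print(fig_1a_high_level_lines(3)[-3:])   # ['BIT_P2', 'NBIT_P2', 'WORD_2']
```

The count grows by three lines for every port added, which is the scaling behavior identified above as undesirable.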
Turning to FIG. 1B, a second implementation of a prior art register structure is illustrated. The exemplary implementation of FIG. 1B illustrates a multi-ported SRAM structure having two ports coupled thereto for performing write operations. The implementation of FIG. 1B utilizes a dual-ended write structure much like the register structure described above in FIG. 1A, except the register structure of FIG. 1B includes an inverter, such as inverter 130, within the individual SRAM cell 150 to locally generate an NBIT signal for a port. Accordingly, because the NBIT signal for each port is included only within the individual SRAM cell 150, the number of higher-level metal tracks required is reduced below that required for the implementation of FIG. 1A. That is, rather than an NBIT line for each port being implemented as a high-level metal track as in FIG. 1A, FIG. 1B provides a design in which the NBIT signal is implemented as a low-level metal track within the individual SRAM cell 150.
The multi-ported SRAM structure of FIG. 1B includes a typical SRAM cell 150 comprising cross-coupled inverters 126 and 128 for storing data (i.e., one bit of data) within the SRAM cell 150. As with the implementation of FIG. 1A, the structure of FIG. 1B further comprises NFETs 102 and 112, which enable writes to the memory cell from a first port (i.e., port 0). Also, a second port (i.e., port 1) is coupled to the SRAM cell 150 by implementing NFETs 122 and 124. The multi-ported SRAM structure 150 of FIG. 1B is also well-known in the art and is commonly implemented in integrated circuits of the prior art.
Either of the two ports (i.e., port 0 and port 1) coupled to the SRAM cell 150 may write data into the cell to satisfy a memory write request. As shown, high-level lines BIT_P0 and WORD_0 and low-level line NBIT_P0 are implemented to enable a write for port 0 to the SRAM cell 150, and high-level lines BIT_P1 and WORD_1 and low-level line NBIT_P1 are implemented to enable a write for port 1 to the SRAM cell 150. The operation of this prior art implementation is well known in the prior art, and therefore will be described only briefly herein. Typically, the BIT_P0 and BIT_P1 lines are held to a high voltage level (i.e., a logic 1), unless one of them is actively pulled to a low voltage level (i.e., a logic 0). For instance, when writing data from port 0 to the SRAM cell 150, the BIT_P0 line is actively driven low by an outside source (e.g., an instruction being executed by the processor) if the outside source desires to write a 0 to the SRAM cell 150, and the NBIT_P0 line is held to a high voltage level (the opposite of BIT_P0). Otherwise, if an outside source desires to write a 1 to the SRAM cell 150, the BIT_P0 line remains high and the NBIT_P0 line is pulled low. Thereafter, the WORD_0 line is fired (e.g., caused to go to a high voltage level), at which time the value of the BIT_P0 line is written into the SRAM cell 150. More specifically, the voltage level of BIT_P0 is transferred across NFET 102 and the voltage level of NBIT_P0 is transferred across NFET 112 to accomplish a write of the value of BIT_P0 to DATA of the cross-coupled inverters 126 and 128.
A similar operation is performed when writing data from port 1 to the SRAM cell 150. For instance, when writing data from port 1 to the SRAM cell 150, the BIT_P1 line is actively driven low by an outside source (e.g., an instruction being executed by the processor) if the outside source desires to write a 0 to the SRAM cell 150, and the NBIT_P1 line is held to a high voltage level (the opposite of BIT_P1). Otherwise, if an outside source desires to write a 1 to the SRAM cell 150, the BIT_P1 line remains high and the NBIT_P1 line is pulled low. Thereafter, the WORD_1 line is fired, at which time the value of the BIT_P1 line is written into the SRAM cell 150. The data value written into the SRAM cell 150 (e.g., a logic 0 or logic 1) is shown as DATA in FIG. 1B, and the complement of such value is shown as NDATA. As with the register structure of FIG. 1A, the SRAM register structure illustrated in FIG. 1B is referred to as a dual-ended write structure because it utilizes two lines to write a data value into the SRAM cell 150. For instance, it requires both a data carrier and a complementary data carrier (e.g., BIT_P0 and NBIT_P0) to write a value to the SRAM cell 150 from port 0, and it requires both a data carrier and a complementary data carrier (e.g., BIT_P1 and NBIT_P1) to write a value to the SRAM cell 150 from port 1.
In this implementation, an inverter is included within the SRAM cell 150 to generate each NBIT signal locally. For example, inverter 130 is implemented to invert the BIT_P0 signal, thereby generating NBIT_P0, and inverter 140 is implemented to invert the BIT_P1 signal, thereby generating NBIT_P1. As shown in FIG. 1B, inverter 130 comprises PFET 132 and NFET 134, and inverter 140 comprises PFET 142 and NFET 144. Accordingly, each port coupled to the register structure 150 for write operations requires an inverter, which comprises two FETs, to be implemented within the register structure 150. Thus, while the implementation of FIG. 1B reduces the number of high-level metal tracks required (e.g., by implementing the NBIT line for each port as a low-level metal track within the individual register structure 150), the implementation of FIG. 1B requires an undesirably large number of components that must be implemented within the register structure 150.
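The local generation of each NBIT signal may be sketched at the logic level as follows. This is an illustrative sketch only: the names `inverter` and `LocalNbitCell` are hypothetical, voltage levels are reduced to 0 and 1, and returning the locally generated NBIT value from `write` is a convenience of the sketch rather than a feature of FIG. 1B.

```python
def inverter(level):
    """Logic-level stand-in for a PFET/NFET inverter such as inverter 130."""
    return 1 - level


class LocalNbitCell:
    """Sketch of a one-bit cell that derives NBIT_Pn inside the cell.

    Hypothetical model: only BIT_Pn and WORD_n arrive from outside the cell.
    """

    def __init__(self, num_ports=2, initial=0):
        self.num_ports = num_ports
        self.data = initial  # DATA node of the cross-coupled inverters

    @property
    def ndata(self):
        return 1 - self.data

    def write(self, port, bit, word):
        """Drive BIT_Pn; return the locally generated NBIT_Pn for inspection."""
        if not 0 <= port < self.num_ports:
            raise ValueError("no such port")
        nbit = inverter(bit)  # generated within the cell, not routed to it
        if word:
            self.data = bit   # word line fired: the cell takes the BIT value
        return nbit


cell_150 = LocalNbitCell()
nbit_p0 = cell_150.write(port=0, bit=1, word=1)  # port 0 writes a 1
```

Compared with a cell that receives both BIT and NBIT from outside, the caller here supplies one fewer signal per port, at the price of one inverter per port inside the cell.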
Typically, multiple SRAM cells, such as SRAM cell 150, are connected to a single data carrier line (e.g., BIT_P0). Accordingly, a single data carrier line may be utilized to carry data to/from multiple ones of SRAM cells 150 for a port. Therefore, even though only SRAM cell 150 is shown, it should be understood that many such SRAM cells may be connected to the BIT_P0 line for port 0, as well as to the BIT_P1 line for port 1, to form a group of SRAM cells. Additionally, it should be recognized that additional ports may be coupled to the SRAM cell 150. Thus, even though only two ports (port 0 and port 1) are shown as being coupled to SRAM cell 150, SRAM cell 150 may have any number of ports coupled thereto. Again, it is generally desirable to have a large number of ports coupled to each SRAM cell 150 in order to increase the number of instructions that may be processed in parallel, and thereby increase the efficiency of a system.
The dual-ended register structure illustrated in FIG. 1B is problematic in that it requires an undesirably large number of components to be implemented for each port coupled to the SRAM cell to perform write operations. In this prior art implementation, two FETs, one inverter (which comprises two additional FETs), and two high-level lines are required to be implemented for each port that is coupled to the SRAM cell 150. As shown in FIG. 1B, two FETs (i.e., NFETs 122 and 124) are required to be implemented to enable port 1 to be coupled to the SRAM cell 150 for write operations. Additionally, one inverter (i.e., inverter 140) that comprises PFET 142 and NFET 144 is required to locally generate NBIT_P1 within the register structure 150 for performing a write operation for port 1. Furthermore, high-level wires or metal traces must be implemented for two lines for port 1 (i.e., BIT_P1 and WORD_1). Therefore, if a third port were implemented for the SRAM cell 150, two additional FETs, one additional inverter, and two additional high-level lines (i.e., BIT_P2 and WORD_2) would be required to be added to the design of FIG. 1B. As a result, the prior art dual-ended register structure of FIG. 1B is undesirable because it requires an undesirably large number of components to be implemented for each port coupled to the SRAM cell 150. In fact, the prior art implementation of FIG. 1B requires more components than are required for the implementation of FIG. 1A because FIG. 1B requires an inverter to be included for each port coupled to the SRAM cell 150 in order to generate the appropriate NBIT signals required for writing data from a port. Thus, the prior art implementation of FIG. 1B results in an undesirably high cost and an undesirably large consumption of surface area for each port implemented therein.
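The per-port trade-off between the two prior art implementations may be tallied with a short sketch. The per-port figures are taken from the descriptions of FIG. 1A and FIG. 1B (two pass FETs and three high-level lines per port, versus two pass FETs plus a two-FET inverter and two high-level lines per port). The names `PortCost`, `FIG_1A_PER_PORT`, `FIG_1B_PER_PORT`, and `write_port_cost` are hypothetical, and the storage inverters 126 and 128 are deliberately left out of the tally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortCost:
    fets: int              # FETs added inside the cell for write ports
    high_level_lines: int  # high-level metal tracks routed to the cell


# Per-port figures as stated in the description (storage inverters excluded).
FIG_1A_PER_PORT = PortCost(fets=2, high_level_lines=3)
FIG_1B_PER_PORT = PortCost(fets=4, high_level_lines=2)


def write_port_cost(per_port, num_ports):
    """Total write-port cost for a cell with num_ports ports."""
    return PortCost(
        fets=per_port.fets * num_ports,
        high_level_lines=per_port.high_level_lines * num_ports,
    )


for ports in (2, 3):
    print(ports, write_port_cost(FIG_1A_PER_PORT, ports),
          write_port_cost(FIG_1B_PER_PORT, ports))
```

For three ports, the tally gives 6 FETs and 9 high-level lines for the FIG. 1A style, against 12 FETs and 6 high-level lines for the FIG. 1B style, which reflects the trade of fewer tracks for more components noted above.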