1. Technical Field
Embodiments of the present invention are related to the field of electronic devices, and in particular, to memory devices.
2. Description of Related Art
Referring to FIG. 1, conventional processor designs typically include one or more register files 10 located on the processor chip to provide data to the execution resources with very low latencies. Typically, a register file 10 includes a memory array 12 and multiple read and write ports (not shown) to access selected word register entries in the memory array 12. The memory array 12 includes columns and rows of memory cells 14. Each memory cell 14 stores a single-bit register entry (logic 0 or logic 1) identified as “Data”, with the register entries in a column forming a word register entry. Each port typically includes an address decoder (not shown) and a word-line driver (not shown). Multiple word-lines 15 are selectively driven one at a time by the word-line driver, with one of the word-lines 15 being coupled to each of the columns of memory cells 14 so as to be able to provide an enable signal EN to each of the word register entries. The register file 10 also includes multiple local bit-lines 16, with one of the local bit-lines 16 being coupled to the memory cells 14 in one of the rows. In response to the enable signal EN provided over a selected word-line 15, a word register entry is fed, one Data bit at a time, over the multiple local bit-lines 16 to a single global bit-line 17. In one design, there are different bit-lines for read operations and write operations.
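The array organization described above can be illustrated with a minimal behavioral sketch. It assumes a small 4-row by 3-column array; the names read_word and memory_array are hypothetical and chosen only for illustration, and ports, decoders and drivers are not modeled.

```python
def read_word(memory_array, word_line):
    """Return the word register entry in the column selected by word_line.

    memory_array[row][column] holds one Data bit (0 or 1). Each row shares
    one local bit-line, and each column shares one word-line.
    """
    # Asserting EN on one word-line places one bit on each row's bit-line.
    return [row[word_line] for row in memory_array]


# Hypothetical contents: 4 rows (bit positions) by 3 columns (words).
memory_array = [
    [0, 1, 1],
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
]

print(read_word(memory_array, 1))
```

In this sketch, selecting a different word-line returns a different column, which mirrors the one-at-a-time driving of the word-lines 15.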
The register file 10 may be implemented with multiple domino logic circuits 18. Each local bit-line 16 forms a domino node for one of the domino logic circuits 18, with each bit-line 16 being coupled to the drain of a PMOS precharge transistor P1 (pull-up device) and the drains of multiple, cascaded pairs of NMOS transistors N1 and N2 (pull-down devices forming domino stages). The gate of the precharge transistor P1 is coupled to a precharge clock signal. The gates of the transistors N1 and N2 are coupled, respectively, to a read-enable signal Rden and the Data state stored in one of the memory cells 14. The signal Rden is provided by ANDing a clock signal CK and the enable signal EN via an AND gate 19. Typically, the precharge clock signal and the clock signal CK have the same frequency and phase. The voltage state of the Data stored in the memory cell 14, consisting of a logic 0 (low voltage state) or a logic 1 (high voltage state), drives the gate of the transistor N2. The precharge transistor P1 is connected between a supply voltage VCC and the associated local bit-line 16. The transistors N1 and N2 provide a series connection between the associated local bit-line 16 and ground.
The operation of each of the domino logic circuits 18 is divided into a precharge phase and an evaluation phase, with the mode of operation being delineated by the clock signals. When the precharge clock signal is low (logic 0), the local bit-line 16 is precharged to the supply voltage VCC by the precharge transistor P1. During this precharge phase, the evaluate transistors N1 are off, so that the pull-down paths to ground are disabled. When the precharge clock signal is high (logic 1), the precharge transistor P1 is off and the evaluate transistor N1 of a selected cell is turned on. For example, to read the Data state in a given cell 14, the signal Rden for that cell is brought high so that the evaluate transistor N1 conducts. If Data is in a low voltage state, the transistor N2 does not conduct and prevents the associated local bit-line 16 from discharging. When the precharge clock signal subsequently goes low, there is no need to recharge the local bit-line 16. If Data is in its high voltage state, the transistor N2 conducts and allows the local bit-line 16 to discharge. When the precharge clock signal subsequently goes low, the local bit-line 16 is recharged through the precharge transistor P1. In summary, the local bit-line 16 is charged to an initial precharged state and then, depending on the voltage state of Data in a selected cell 14, the bit-line either remains precharged or is discharged.
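The precharge and evaluation phases described above may be summarized by a simple behavioral sketch. It is a logic-level model only, under the assumption of ideal switches with no leakage or timing effects; the function name read_local_bitline and its arguments are hypothetical.

```python
def read_local_bitline(cells, selected):
    """Model one precharge/evaluate cycle of a single local bit-line.

    cells:    Data bits (0 or 1) of the cells sharing the bit-line.
    selected: index of the cell whose enable signal EN is asserted.
    Returns the bit-line level after evaluation (1 = at VCC, 0 = discharged).
    """
    # Precharge phase: precharge clock low, P1 conducts, and CK is low,
    # so every Rden is low and all N1 transistors are off.
    bitline = 1

    # Evaluation phase: precharge clock and CK high, P1 off.
    ck = 1
    for index, data in enumerate(cells):
        en = 1 if index == selected else 0
        rden = ck & en  # AND gate: Rden = CK AND EN
        # The series pair N1/N2 pulls the bit-line low only if both conduct.
        if rden and data:
            bitline = 0
    return bitline


print(read_local_bitline([0, 1, 0], selected=1))  # stored Data is 1
print(read_local_bitline([0, 1, 0], selected=0))  # stored Data is 0
```

In this model a stored logic 1 discharges the bit-line and a stored logic 0 leaves it precharged, which matches the summary given above.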
Each of the local bit-lines 16 is coupled to the global bit-line 17 through an inverter 20 and an NMOS pull-down transistor N3. When the local bit-line 16 evaluates to logic 0, the transistor N3 pulls down the global bit-line 17 to logic 0 from its precharged state; hence, the local bit-line 16 provides its Data value to the global bit-line 17. The global bit-line 17 is coupled to a set-dominate latch (SDL) 22. The SDL 22 has coupled thereto its own precharge transistor P2, which is driven by the precharge clock signal. The SDL 22 is a dynamic state device used for holding a logic state, which in this case is the Data value.
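The coupling of the local bit-lines 16 to the global bit-line 17 may be sketched at the logic level as follows. The sketch assumes ideal devices and does not model the SDL 22 or its precharge transistor P2; the function name global_bitline is hypothetical.

```python
def global_bitline(local_bitlines):
    """Return the global bit-line level given the evaluated local bit-lines.

    Each local bit-line drives an inverter whose output gates a pull-down
    transistor N3 connected to the shared, precharged global bit-line.
    """
    global_bl = 1  # precharged state
    for local in local_bitlines:
        inverter_out = 1 - local  # inverter 20
        if inverter_out:          # N3 conducts when its gate is high
            global_bl = 0
    return global_bl


print(global_bitline([1, 1, 1]))  # no local bit-line discharged
print(global_bitline([1, 0, 1]))  # one local bit-line discharged
```

Under these assumptions the global bit-line stays high only while every local bit-line remains precharged, and a single discharged local bit-line is enough to pull it low.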
The register file 10 may create significant power demands on the processor. Domino logic provides greater speed and lower loading than static logic in return for greater power dissipation. The register file 10 has clock power dissipation on the precharge clock nodes. Prior art designs have used clock gating to prevent the precharge clock node from switching under certain conditions when precharge may not be necessary. However, when precharge is necessary and all of the Data voltage states are low (logic 0s), the precharge clock will continue to dissipate power. In other words, even though a stream of zeroes does not cause the domino logic circuit 18 to toggle between charged and discharged states, the gates of transistors P1 and N1 continue to toggle between high and low states in response to the different phases of the clock signals, thereby consuming additional power.
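The power concern described above can be illustrated by counting signal transitions for one local bit-line. This is a rough sketch only: it assumes one read per clock cycle, no clock gating, and that each gate or node transition represents one unit of switching activity. The function name count_transitions is hypothetical, and the counts are not measured power figures.

```python
def count_transitions(data_stream):
    """Count switching events for a sequence of reads on one local bit-line.

    data_stream: the Data bit (0 or 1) read in each clock cycle.
    Returns (clock_gate_transitions, bitline_transitions).
    """
    clock_gate_transitions = 0
    bitline_transitions = 0
    for data in data_stream:
        # The gate of P1 and the gate of the selected N1 each rise and
        # fall once per cycle, whatever the Data value is.
        clock_gate_transitions += 2 * 2
        if data:
            # A stored 1 discharges the bit-line, which is then recharged.
            bitline_transitions += 2
    return clock_gate_transitions, bitline_transitions


print(count_transitions([0, 0, 0, 0]))
print(count_transitions([1, 0, 1]))
```

For a stream of zeroes the bit-line count stays at zero while the clocked gates keep switching every cycle, which is the additional power consumption noted above.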