Memory devices such as register files are used in high-performance microprocessors to store data because of their relatively fast access, ease of design, and area efficiency. A register file is usually organized by bits and entries. FIG. 1 illustrates a prior art register file 100 with sixteen entries.
The prior art register file 100 has two segments of eight entries each, illustrated by the bit cell segment 1 101 with entries [7:0] and the bit cell segment 2 102 with entries [15:8]. The bit cell segment 1 101 is connected with the local bit line 1 140, a pre-charge device or transistor 132, and a keeper device or transistor 134. The bit cell segment 2 102 is connected with the local bit line 2 145 and with its own pre-charge and keeper transistors (not shown).
The bit cell segment 1 101 has eight entries connected in parallel to the local bit line 1 140 to form an 8:1 multiplexer, which is merged with the other eight entries in the bit cell segment 2 102 using the NAND gate 150 to form a 16:1 multiplexer. The 16:1 multiplexer is connected to the global bit line 180. The domino implementation in the prior art register file 100 requires the bit lines 140, 145 and 180 to be pre-charged to a supply voltage when they are inactive or not being accessed. Each entry in the bit cell segments 101 and 102 is a source of leakage current when the bit lines 140, 145 and 180 are pre-charged to the supply voltage.
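The read path described above can be sketched behaviorally. The following is a minimal logic-level model, not a circuit simulation and not taken from the figures. The function name `read_bit` is hypothetical, and the signal polarities (a stored 1 discharges its local bit line, and a high NAND output discharges the global bit line) are assumptions made for illustration.

```python
# Behavioral sketch of the two-level domino read path (logic levels only).
# Assumed polarity: a stored 1 on the selected entry discharges its local
# bit line; a high output of the NAND gate discharges the global bit line.

def read_bit(cells, entry):
    """Return (local1, local2, nand_out, global_bl, value) for one read."""
    if len(cells) != 16 or not 0 <= entry < 16:
        raise ValueError("expected 16 stored bits and an entry in 0..15")

    # Pre-charge phase: all bit lines sit at the supply voltage (logic 1).
    local1, local2, global_bl = 1, 1, 1

    # Evaluate phase: only the selected entry can pull its local bit line low.
    if cells[entry] == 1:
        if entry < 8:
            local1 = 0  # segment 1, entries [7:0]
        else:
            local2 = 0  # segment 2, entries [15:8]

    # The NAND gate merges the two 8:1 segments into a 16:1 multiplexer.
    nand_out = 0 if (local1 and local2) else 1

    # Assumption: a high NAND output discharges the global bit line.
    if nand_out:
        global_bl = 0

    # Under the assumed polarities, a discharged global bit line reads as 1.
    value = 1 - global_bl
    return local1, local2, nand_out, global_bl, value
```

In this model, reading an entry that stores 0 leaves every bit line at the pre-charged level, which is the state in which the text notes that each entry contributes leakage current.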
FIG. 2 illustrates a prior art clocking circuit 200 for the prior art register file 100. Each read word line for each entry in the bit cell segments 101 and 102 is clocked to create a clocked domino. The address signals (ADDR[3:0]) 222 are decoded to activate one of the read word lines (RDWL[15:0]) 238. Each read word line is connected with a respective delay stage in the RDWL delay stages 230. The prior art clocking circuit 200 creates clock distribution requirements and constraints in the prior art register file 100. The main clock (CLK) 202 has to be distributed to all sixteen stages in the RDWL delay stages 230, which may cause routing congestion, grid clock loading, and power dissipation.
Each delay stage slows the read access time of each entry by two gate delays due to the NAND gate 212 and the inverter 214. The prior art clocking circuit 200 also requires clock shielding to reduce the noise on the read word lines. FIG. 3 illustrates a prior art timing diagram 300 of the prior art clocking circuit 200. The rising edge of RDWL[0] 310 occurs after the rising edge of the main clock 202 due to the gate delays in the delay stage 1 232.
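The clocked word line generation and its two-gate delay can be illustrated with a short logic-level sketch. It assumes that each delay stage is a NAND of the main clock with a decoded one-hot select, followed by an inverter. The text names the NAND gate and the inverter but not their inputs, so that wiring is an assumption. The function names and the delay values used with them are hypothetical.

```python
# Logic-level sketch of the clocked read word line (RDWL) generation.
# Assumption: each delay stage computes NOT(NAND(CLK, select)), where
# select is the one-hot decode of ADDR[3:0] for that entry.

def read_word_lines(addr, clk):
    """Return RDWL levels as a list indexed by entry number (0..15)."""
    if not 0 <= addr < 16 or clk not in (0, 1):
        raise ValueError("addr must be in 0..15 and clk must be 0 or 1")
    rdwl = []
    for entry in range(16):
        select = 1 if addr == entry else 0        # decoded address
        nand_out = 0 if (clk and select) else 1   # NAND gate of the stage
        rdwl.append(1 - nand_out)                 # inverter output
    return rdwl


def rdwl_rise_time(clk_rise, t_nand, t_inv):
    """Time at which RDWL rises: clock edge plus the two gate delays."""
    return clk_rise + t_nand + t_inv
```

For example, with hypothetical delays of 20 and 15 time units for the NAND gate and the inverter, `rdwl_rise_time(0, 20, 15)` places the word line edge 35 units after the clock edge, and `read_word_lines(5, 1)` raises only entry 5 while the clock is high.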