(a) Field of the Invention
The present invention relates to a register file and a method for designing a register file and, more particularly, to a technique for designing a register file by using a cell base designing technique.
(b) Description of the Related Art
In general, the design of semiconductor devices, such as an ASIC (application-specific IC) or LSI, employs a cell base design technique, wherein a cell library is used which stores therein a large number of designed circuits in the form of modules. The designed circuits of the cell library, which are generally called hardware cells, have different scales and include small-size basic logic circuits such as an AND gate, OR gate and flip-flop, medium-size circuit blocks such as an ALU (arithmetic logic unit) and adder, and large-size circuit blocks (macro blocks) such as a CPU and RAM. Each semiconductor device is designed by combining these hardware cells in the cell base designing technique, which can reduce the time needed for the design of the semiconductor device and verification of the design accuracy.
For example, in the design of a semiconductor device having therein a built-in processor, the designer selects one of the processor cores from the cell library in consideration of the circuit scale and throughput thereof. Each processor core in the cell library is designed by a dedicated design sector to have an optimum basic architecture and is provided to the cell library in a form that allows easy installation in semiconductor devices. The designer of semiconductor devices installs the selected processor core in a desired semiconductor device while providing thereto the peripheral resources required of the desired semiconductor device. By determining the peripheral resources depending on the desired semiconductor devices, a single processor core having a basic architecture can be installed in a variety of semiconductor devices as processors having different peripheral resources.
FIG. 7 exemplifies the architecture of a register file, which constitutes a part of a semiconductor device designed by using a conventional design technique. The register file includes a plurality of registers and has functions of writing data in a specified register and reading data from a specified register. Such a register file is described in a publication entitled “Computer Organization & Design”, published by Nikkei BP Corp., translated by Mitsuaki Narita from the original written by John L. Hennessy and David A. Patterson, 1996, ISBN 4-8222-8002-0, pp. 678-680.
In the example shown in FIG. 7, the register file 200 includes therein 4-bit registers Fi (i=0 to 3), a selection signal generator 210, an output port selector 220, four write ports WR_DATAj (j=0 to 3), and four read ports RD_DATAk (k=0 to 3).
The selection signal generator 210 includes therein four decoders DECj each corresponding to one of the write ports WR_DATAj, four AND gates ANDj and an OR gate OR. Each decoder DECj decodes a corresponding 2-bit write address signal WR_ADRSj to generate a 4-bit signal. Each AND gate ANDj calculates a logical product of the decoded 4-bit signal and a write enable signal WR_ENj delivered from a processor core (not shown) to deliver the logical product as a selection signal to the register Fi. The OR gate OR calculates a logical sum of the outputs from the AND gates ANDj to deliver the same as an activating signal αi to one of the registers Fi selected by one of the write address signals WR_ADRSj.
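By way of illustration only, the decode, AND and OR steps described above can be modeled behaviorally as in the following Python sketch. The function names `decode` and `selection_signals`, the list-based signal representation and the example addresses are assumptions made for this sketch and do not appear in FIG. 7.

```python
def decode(addr, width=4):
    """One-hot decode: a 2-bit address becomes a 4-bit signal (index i = register Fi)."""
    return [1 if i == addr else 0 for i in range(width)]


def selection_signals(wr_adrs, wr_en):
    """Behavioral model of the selection signal generator (hypothetical helper).

    sel[j][i] is the output of AND gate j for register Fi:
    decoded address bit AND write enable WR_ENj.
    alpha[i] is the logical sum over all write ports j, i.e. the
    activating signal for register Fi.
    """
    sel = [[bit & en for bit in decode(adrs)] for adrs, en in zip(wr_adrs, wr_en)]
    alpha = [int(any(sel[j][i] for j in range(len(sel)))) for i in range(4)]
    return sel, alpha


# Example (assumed values): ports 0 and 2 are enabled and address F2 and F0.
sel, alpha = selection_signals(wr_adrs=[2, 2, 0, 3], wr_en=[1, 0, 1, 0])
print(sel)    # per-port selection signals
print(alpha)  # activating signals for F0..F3
```

In this example only the registers addressed through an enabled write port receive an activating signal; the addresses presented on the disabled ports 1 and 3 are masked by the AND gates.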
The output port selector 220 includes therein four multiplexers MUX corresponding to the output ports RD_DATAk, wherein each of the multiplexers MUX receives the data Qi stored in the registers Fi and delivers one of them as 4-bit data. Each multiplexer MUX selects one of the data Qi to be read out through the read ports RD_DATAk based on a 2-bit read address signal RD_ADRSk.
Each register Fi includes therein an input port selector 230 and a data storage 240. The input port selector 230 includes therein three multiplexers 231 to 233, and acts as a selector having a priority order specified among the input port selectors of the registers. The input port selector 230 selects one of the write ports WR_DATAj to be connected to the data storage 240 based on the selection signal from the selection signal generator 210. If a plurality of write addresses WR_ADRSj concurrently specify the same register Fi, then the input port selector 230 selects the write port WR_DATAj having the highest priority order, i.e., the lowest sequential number. In addition, if none of the write addresses WR_ADRSj specifies the register Fi, then the input port selector 230 selects the data supplied through the write port WR_DATAj having the highest or lowest priority order.
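The priority behavior of the input port selector 230 can be sketched as a chain of three 2-input multiplexers, as below. This is a minimal model under stated assumptions: the helper names `mux2` and `input_port_select` are hypothetical, the ordering of the three stages is a guess rather than a reading of FIG. 7, and the default when no selection signal is active is taken to be the lowest-priority port, which is one of the two possibilities mentioned above.

```python
def mux2(sel, if0, if1):
    """2-input multiplexer: returns if1 when sel is 1, otherwise if0."""
    return if1 if sel else if0


def input_port_select(sel_bits, wr_data):
    """Priority chain of three 2-input multiplexers (assumed structure).

    sel_bits[j] is the selection signal of write port j for this register.
    The lowest-numbered active port wins; if none is active, the data on
    port 3 falls through (assumed default).
    """
    out = mux2(sel_bits[2], wr_data[3], wr_data[2])
    out = mux2(sel_bits[1], out, wr_data[1])
    out = mux2(sel_bits[0], out, wr_data[0])
    return out


data = [0xA, 0xB, 0xC, 0xD]                     # assumed 4-bit data on ports 0..3
print(input_port_select([0, 1, 0, 1], data))    # ports 1 and 3 collide
print(input_port_select([0, 0, 0, 0], data))    # no port selects this register
```

Note that the second call still delivers data to the data storage even though no write is requested, which is the situation the clock gate has to handle.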
The data storage 240 includes therein a memory 241 and a clock gate 242. The memory 241 includes therein synchronous D-type flip-flops (D-FF) in number corresponding to the number of bits of the data to be stored. The D-FFs store therein 4-bit data, received through one of the write ports WR_DATAj, in synchrony with the clock signal CLK on a bit-by-bit basis. The clock gate 242 generates a logical product of the clock signal CLK and activating signal αi. Since the write port WR_DATAj is connected to an external data line or bus (not shown), the data from the external data line is delivered to the data input “D” of the D-FF of each register Fi, even if the corresponding register Fi is not specified for receipt of data. In this case, the clock gate 242 delivers a low-level inactivating signal to the clock inputs “C” of the registers Fi which are not specified to receive the data. Thus, the data stored in the memories 241 of these registers are not updated by the received data.
Hereinafter, the design for the multiplexers in the output port selector 220 will be described with reference to FIGS. 8 to 10. FIGS. 8 to 10 exemplify the design description of the multiplexer in the output port selector 220 during the cell base design, the circuit configuration of a 2-input/1-output multiplexer stored in the cell library with a gate level notation, and the configuration of the 4-input/1-output multiplexer obtained by the cell base design, respectively. The multiplexer shown in FIG. 9 is selected from the cell library as a primitive cell based on the design description.
In general, the cell base design technique is such that the function of a circuit block is described in a hardware description language (HDL), and the resultant description is used for logical synthesis to obtain a circuit configuration of combined primitive cells in a gate level notation.
For designing the semiconductor device shown in FIG. 7, each multiplexer MUX of the output port selector 220 is described in a case statement such as shown in FIG. 8. The cell library stores therein a large number of primitive cells in a gate level notation, the primitive cells including a 2-input/1-output multiplexer such as shown in FIG. 9. Thus, the function shown in FIG. 8 can be implemented by combining the 2-input/1-output multiplexers retrieved from the cell library to configure a 4-input/1-output multiplexer such as shown in FIG. 10. The multiplexer shown in FIG. 10 includes three multiplexers 221 to 223 each having a configuration shown in FIG. 9, and is installed in the semiconductor device to be designed. It is to be noted that each multiplexer of the 4-input/1-output multiplexer shown in FIG. 10 may have a circuit configuration different from the circuit configuration shown in FIG. 9 depending on the tool for the logical synthesis and cell library used for the design.
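The relation between the behavioral description and the synthesized structure can be illustrated with the following Python sketch, which is a software analogy and not the HDL of FIG. 8. It compares a direct, case-style selection against a tree of three 2-input multiplexers for a single bit position. The function names are hypothetical, and the wiring of the tree (least significant address bit on the first stage, most significant bit on the second) is assumed from the description of FIG. 10.

```python
from itertools import product


def mux2(sel, if0, if1):
    """2-input/1-output multiplexer: returns if1 when sel is 1, otherwise if0."""
    return if1 if sel else if0


def mux4_behavioral(addr, q):
    """Case-style description: the address directly picks one of Q0..Q3."""
    return q[addr]


def mux4_structural(addr, q):
    """Tree of three 2-input multiplexers (assumed wiring)."""
    lsb, msb = addr & 1, (addr >> 1) & 1
    first_lo = mux2(lsb, q[0], q[1])   # first stage, Q0/Q1 side
    first_hi = mux2(lsb, q[2], q[3])   # first stage, Q2/Q3 side
    return mux2(msb, first_lo, first_hi)


# Exhaustive check over all addresses and all single-bit register contents.
matches = all(
    mux4_behavioral(addr, q) == mux4_structural(addr, q)
    for addr in range(4)
    for q in product((0, 1), repeat=4)
)
print(matches)
```

The two descriptions agree on the selected output; they differ only in the internal nodes that exist in the structural version.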
There is a possibility that the circuit in the gate level notation obtained by the cell base design technique does not necessarily provide an optimum configuration for the desired semiconductor device because the designer obtains the circuit configuration by using the tool for the logical synthesis. For example, if the multiplexer in the output port selector 220 is designed by logical synthesis while combining together the multiplexers each having a gate level configuration shown in FIG. 9 to have the circuit configuration shown in FIG. 10, the resultant multiplexer does not necessarily provide a lower operating current depending on the data Qi stored in each register Fi, as detailed hereinafter.
It is assumed herein that the zero-th bits of the data Q0 to Q3 stored in the registers F0 to F3 are (0,1,0,1) as viewed from Q0 to Q3, and that the read address RD_ADRS0 is (00). The first-stage multiplexers 221 and 222 in FIG. 10 select and deliver data Q0 and Q2, respectively, based on the least significant bit “0” of the read address RD_ADRS0. The second-stage multiplexer 223 selects and delivers data Q0, or “0”, based on the most significant bit “0” of the read address.
After the read address RD_ADRS0 shifts from (00) to (11), both the first-stage multiplexers 221 and 222 select and deliver data Q1 and Q3, respectively, and the second-stage multiplexer 223 selects and delivers data Q3, or “1”. It is to be noted that the multiplexer 221, the output of which is not selected by the second-stage multiplexer 223, also operates to shift the output thereof from “0” to “1” in this example. In view that each multiplexer dissipates operating current when the output of the multiplexer shifts from “0” to “1”, the multiplexer shown in FIG. 10 wastes the current due to the output shift of the unselected multiplexer 221.
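The example above can be replayed with a small Python model that records which multiplexer outputs change when the read address moves from (00) to (11). The function and variable names are hypothetical, the tree wiring is assumed from the description of FIG. 10, and a changed output is used here only as a rough proxy for dissipated operating current.

```python
def mux2(sel, if0, if1):
    """2-input multiplexer: returns if1 when sel is 1, otherwise if0."""
    return if1 if sel else if0


def mux_tree_outputs(addr, q):
    """Outputs of the two first-stage multiplexers and of the second stage."""
    lsb, msb = addr & 1, (addr >> 1) & 1
    first_lo = mux2(lsb, q[0], q[1])          # first stage fed by Q0 and Q1
    first_hi = mux2(lsb, q[2], q[3])          # first stage fed by Q2 and Q3
    second = mux2(msb, first_lo, first_hi)    # second stage
    return first_lo, first_hi, second


q_bit0 = (0, 1, 0, 1)                         # zero-th bits of Q0..Q3 from the text
before = mux_tree_outputs(0b00, q_bit0)
after = mux_tree_outputs(0b11, q_bit0)
toggled = [b != a for b, a in zip(before, after)]
print(before, after, toggled)
```

All three outputs change, although only the first-stage output that the second stage passes through at address (11) contributes to the read data; the transition on the other first-stage output is the wasted one.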
FIG. 11 shows the configuration of a synchronous D-FF employed in the memory 241 in the register Fi. The D-FF is of a master-slave type, and thus includes a master latch 243 and a slave latch 244. The D-FF stores therein the data input through the data input “D” thereof in synchrony with the rising edge of the clock signal CLK. The master latch 243 shifts the output thereof based on the data input through the data input “D” during the low level of the clock signal CLK. The slave latch 244 stores therein data based on the potential of the output node of the master latch 243 at the rising edge of the clock signal CLK and delivers the stored data through the data output “Q”.
The data input “D” of the D-FF in the memory 241 in FIG. 7 receives data from the external data line due to selection of one of the write ports WR_DATAj even when the write operation is not needed. In this case, even if the clock gate 242 fixes the clock input “C” of the D-FF at a low level, the output node of the master latch 243 follows the data input through the external data line, although the data stored in the slave latch 244 does not shift due to the cut-off by the input transfer gate of the slave latch 244. More specifically, although the data stored in the D-FF does not shift, the output node of the master latch 243 shifts from “0” to “1” or “1” to “0” depending on the data input through the external data line, thereby wasting the electric power.
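The effect described above can be illustrated with a simplified behavioral model of a master-slave D-FF in Python. The class `MasterSlaveDFF`, its attribute names and the input sequence are assumptions for this sketch; the model ignores timing and transistor-level detail and only counts transitions of the master latch output node.

```python
class MasterSlaveDFF:
    """Simplified master-slave D-FF (hypothetical behavioral model)."""

    def __init__(self):
        self.master = 0          # output node of the master latch
        self.q = 0               # data held by the slave latch (output Q)
        self._prev_clk = 0
        self.master_toggles = 0  # transitions of the master output node

    def step(self, clk, d):
        if clk == 1 and self._prev_clk == 0:
            # Rising edge: the slave latch captures the master output node.
            self.q = self.master
        if clk == 0:
            # Clock low: the master latch is transparent and follows D.
            if d != self.master:
                self.master_toggles += 1
            self.master = d
        self._prev_clk = clk


# Gated clock held low while the external data line keeps changing.
ff = MasterSlaveDFF()
for d in [1, 0, 1, 1, 0]:
    ff.step(clk=0, d=d)
print(ff.q, ff.master_toggles)
```

With the clock input held low, the stored data Q never changes, yet the master node toggles each time the data line changes, which corresponds to the wasted electric power noted above.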
In summary, the conventional register file wastes operating current during input of the write data and output of the read data.