This application claims the benefit of Korean Patent Application No. 2004-5310, filed on Jan. 28, 2004, in the Korean Intellectual Property Office, the contents of which are incorporated herein in their entirety by reference.
1. Field of the Invention
The present invention relates to a digital multiplier, and more particularly, to a simplified 4:2 carry save adder (hereinafter, referred to as a CSA) and a 4:2 carry save adding method.
2. Description of the Related Art
Digital computers rely on efficient algorithms to make use of complicated logic circuits and other hardware. Numbers used in a digital computer are expressed as strings of 0s and 1s, and the computer hardware performs simple, basic Boolean operations on these binary strings. Arithmetic operations are organized hierarchically, so that an arithmetic operation is typically built up starting from the simplest operation. The performance of a digital computer depends on how its operations are carried out according to a specific computation method or algorithm.
Multiplication is one of the most commonly executed arithmetic operations in current computer systems, so high-speed multiplication is essential. Fast multiplication can be achieved using a Booth algorithm, which reduces the total number of partial products by recoding the multiplier. Alternatively, fast multiplication can be achieved using a multiplier based on a Wallace tree, which adds partial products in a tree arrangement and gradually reduces the total number of partial products.
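As an illustration of how Booth recoding reduces the number of partial products, the following is a minimal sketch of radix-4 Booth recoding for a two's-complement multiplier. The function names `booth_radix4_digits` and `booth_multiply` are hypothetical and chosen only for this example; the sketch models the arithmetic, not any particular hardware implementation.

```python
def booth_radix4_digits(multiplier: int, bits: int = 32) -> list[int]:
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, 1, 2}, and there are bits // 2 digits,
    so a 32-bit multiplier yields 16 digits (one per partial product).
    """
    if bits % 2:
        raise ValueError("bits must be even")
    pattern = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        b0 = (pattern >> i) & 1
        b1 = (pattern >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, bits: int = 32) -> int:
    """Sum the Booth partial products; digit i has weight 4**i."""
    digits = booth_radix4_digits(multiplier, bits)
    partial_products = [(d * multiplicand) << (2 * i) for i, d in enumerate(digits)]
    return sum(partial_products)
```

In this sketch, the multiplier must fit in `bits` bits as a signed value; the 16 partial products are then added directly, whereas a hardware multiplier would reduce them with carry save adders.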
FIG. 1 illustrates a 32×32 multiplier based on a Wallace tree structure. Referring to FIG. 1, 16 partial products PP0 to PP15 produced using a radix-4 Booth decoding method are provided to CSAs. As illustrated in FIG. 1, 14 CSA cells arranged in 6 stages are required to add the 16 partial products PP0 to PP15. Each of the CSA cells is composed of 3:2 CSA cells; it receives three partial products (e.g., PP1 through PP3) and produces two outputs, a sum SUM and a carry CRY.
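The counts of 14 CSA cells and 6 stages can be checked with a short sketch. It assumes a greedy schedule in which every stage groups as many operands as possible into threes and passes leftover operands through unchanged; the exact wiring of FIG. 1 may differ. The function name `wallace_3to2_schedule` is hypothetical.

```python
def wallace_3to2_schedule(operands: int) -> list[int]:
    """Return the number of 3:2 CSAs used in each stage of the reduction.

    Assumption: each stage uses operands // 3 CSAs, each turning three
    operands into two; the operands % 3 leftovers pass to the next stage.
    The reduction stops when only two operands (sum and carry) remain.
    """
    stages = []
    while operands > 2:
        cells = operands // 3
        stages.append(cells)
        operands = 2 * cells + operands % 3
    return stages


schedule = wallace_3to2_schedule(16)
print(len(schedule), sum(schedule))  # number of stages, total number of CSAs
```

For 16 partial products, the operand count per stage follows 16, 11, 8, 6, 4, 3, 2 under this assumption.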
In this Wallace tree-based multiplier, each of the CSA cells needs a computation space in which 32 to 64 bits are processed. Hence, when the multiplier is implemented in hardware, it occupies a large silicon area. In addition, because the sum calculation is slower than the carry calculation at every level before the final CSA level, glitches occur as the calculations proceed, which decreases the speed of the calculations.
4:2 CSA cells may be used instead of the 3:2 CSA cells to increase the speed of computation in the Wallace tree-based multiplier of FIG. 1. FIG. 2 illustrates a conventional 4:2 CSA cell, which has four inputs A, B, C, and D and two outputs SUM and CRY and is composed of two 3:2 CSA cells 201 and 202. The 3:2 CSA cell 201, which is shown in greater detail in FIG. 3, calculates a sum SUM and a carry CRY of the three inputs A, B, and C. The 3:2 CSA cell includes two XOR gates 301 and 302 and four NAND gates 303, 304, 305, and 306. The sum SUM is obtained by XOR gate 302 performing an exclusive OR (XOR) operation on the input C and the XOR of the inputs A and B obtained by XOR gate 301. The carry CRY is obtained by NAND gate 306 performing a NAND operation on the NAND of the inputs A and B obtained by NAND gate 303, the NAND of the inputs B and C obtained by NAND gate 304, and the NAND of the inputs A and C obtained by NAND gate 305.
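The gate-level behavior of the 3:2 CSA cell described above can be modeled for a single bit position as follows. This is a behavioral sketch only; the function names are hypothetical, and the comments map each operation to the gate reference numerals given in the description of FIG. 3.

```python
def xor(a: int, b: int) -> int:
    """Two-input XOR on single bits."""
    return a ^ b


def nand(*inputs: int) -> int:
    """NAND on single bits: 0 only when every input is 1."""
    return 0 if all(inputs) else 1


def csa32_cell(a: int, b: int, c: int) -> tuple[int, int]:
    """One-bit 3:2 CSA cell returning (SUM, CRY)."""
    ab = xor(a, b)            # XOR gate 301
    total = xor(ab, c)        # XOR gate 302 -> SUM
    n_ab = nand(a, b)         # NAND gate 303
    n_bc = nand(b, c)         # NAND gate 304
    n_ac = nand(a, c)         # NAND gate 305
    cry = nand(n_ab, n_bc, n_ac)  # NAND gate 306 -> CRY (majority of a, b, c)
    return total, cry
```

The NAND of the three pairwise NANDs equals the OR of the three pairwise ANDs, so CRY is 1 exactly when at least two inputs are 1, and A + B + C = SUM + 2·CRY holds for every input combination.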
Because the conventional 4:2 CSA cell is composed of two 3:2 CSA cells as described above, it needs twice the number of logic gates required by a 3:2 CSA cell. That is, the conventional 4:2 CSA cell includes 4 XOR gates and 8 NAND gates. The transistors that form these gates give the conventional 4:2 CSA cell the same problem as the Wallace tree-based multiplier of FIG. 1: it occupies a large silicon area.
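A word-level sketch of a 4:2 CSA built from two 3:2 CSAs, together with the resulting gate count, might look like the following. The interconnection is an assumption, since the wiring of FIG. 2 is not spelled out in the text: the first 3:2 stage adds A, B, and C, and the second adds its sum, its carry, and D. Inputs are assumed to be non-negative integers, the carry word is returned already shifted left by one bit, and all names are hypothetical.

```python
def csa32(a: int, b: int, c: int) -> tuple[int, int]:
    """Word-level 3:2 CSA: returns (SUM, CRY) with a + b + c == SUM + CRY."""
    total = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1  # carry has weight 2
    return total, carry


def csa42_conventional(a: int, b: int, c: int, d: int) -> tuple[int, int]:
    """4:2 CSA from two chained 3:2 CSAs (assumed wiring, see lead-in)."""
    sum1, cry1 = csa32(a, b, c)   # first 3:2 stage (cell 201 in this sketch)
    return csa32(sum1, cry1, d)   # second 3:2 stage (cell 202 in this sketch)


# Gate count per bit position: one 3:2 cell uses 2 XOR and 4 NAND gates,
# so two chained cells use twice as many.
GATES_PER_32_CELL = {"xor": 2, "nand": 4}
GATES_PER_42_CELL = {gate: 2 * n for gate, n in GATES_PER_32_CELL.items()}
```

Under these assumptions, the four inputs are reduced to two words whose ordinary sum equals A + B + C + D, at a cost of 4 XOR gates and 8 NAND gates per bit position.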
Hence, there is a demand for a new 4:2 CSA cell capable of minimizing power consumption by decreasing the logic depth through logic optimization.