The present invention relates to an arithmetic unit and to a digital signal processor. The present invention also relates to a method of scheduling multiplication and addition in an arithmetic unit, a method of selectively delaying adding and a method of selectively adding during a first or second clock cycle.
In many digital signal processors, dedicated blocks of circuitry carry out specific functions, such as multiplication or addition. A wide variety of digital building blocks is known for each such function. Typically, the digital signal processor is designed by selecting and coupling together circuit blocks from a library of standardized designs.
In turn, the circuit blocks in the library represent a series of compromises between the amount of area each of the circuit blocks occupies on an integrated circuit, such as a digital signal processor, and the rapidity with which each of the circuit blocks is able to carry out the function associated with the circuit block. A circuit block that is constructed to reduce delay in providing an output signal is also very likely to require a relatively large circuit area and also a relatively large amount of electrical power. Conversely, circuit blocks that are optimized to require relatively little circuit area within an integrated circuit and to consume relatively less electrical power also tend to be poorly optimized for operational speed.
Digital signal processors typically include an ensemble of large numbers of interconnected circuit blocks. Each of these circuit blocks is selected to meet timing requirements for worst-case input signals, which are often input signals having a most significant bit that is a logical "1". Schedulers that coordinate interactions between these circuit blocks include timing constraints based on the worst-case inputs. As a result, operation of the ensemble of circuit blocks forming the digital signal processor is often slowed relative to what is necessary in order to process the actual input signals, because the actual input signals often differ from and are more benign than the worst-case input signals.
For example, ripple carry adders may be designed to be quite compact. However, because results ripple through ripple carry adders, and because this process takes time, the most significant output bits are available late in the time period allotted for operation of the ripple carry adder. A carry bit may be precalculated to make that portion of the result available earlier in time, but this requires additional circuitry, which also results in a doubling of the area required for the adder. Additionally, the amount of electrical power required in order to provide the result increases.
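The data dependence of ripple carry timing can be illustrated with a small behavioral sketch. The function name ripple_carry_add, the 8-bit default width and the use of the longest run of carries as a rough stand-in for settling delay are assumptions made for illustration only; they do not describe any particular adder circuit.

```python
def ripple_carry_add(a, b, width=8):
    """Add two unsigned integers bit by bit, as a ripple carry adder would.

    Returns (sum modulo 2**width, carry_out, longest_carry_run), where
    longest_carry_run is the longest run of consecutive bit positions
    that each passed a carry of 1 to the next position. It is used here
    only as a rough, assumed proxy for how long the adder takes to settle.
    """
    carry = 0
    result = 0
    run = 0
    longest = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry                      # sum bit of a full adder
        carry = (x & y) | (carry & (x ^ y))    # carry out of a full adder
        result |= s << i
        run = run + 1 if carry else 0
        longest = max(longest, run)
    return result, carry, longest


benign = ripple_carry_add(0x0F, 0x10)   # no carries at all
worst = ripple_carry_add(0xFF, 0x01)    # carry ripples through every bit
```

In this sketch the benign input pair produces no carry chain, while the worst-case pair propagates a carry through all eight positions, which is the situation a worst-case timing budget has to allow for.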
What is needed is a capability for obtaining results as rapidly as is possible from circuit blocks forming digital signal processors, without undue compromise of integrated circuit area or power dissipation.
In a first aspect, the invention provides an arithmetic unit configured to perform multiply and add operations on three operands A, B and C, where A is a multiplicand, B is a multiplier and C is an addend. The arithmetic unit includes a multiplier unit having an input stage configured to receive operands A and B from a data pump, and includes an output to provide a product AB. A register has an input coupled to the multiplier unit output and has an output. A multiplexer has a first data input coupled to the multiplier unit output, a second data input coupled to the register output, a toggle command input and a data output. The arithmetic unit also includes a bypass decision block having an input stage configured to receive the operands A and B, and includes an output coupled to a scheduler and to the toggle command input. The bypass decision block is configured to set the multiplexer to couple the first data input to the data output when most significant bits of the operands A and B have values below a first threshold. The arithmetic unit also includes an adder having a first data input coupled to the multiplexer data output and configured to receive the product AB, a second data input configured to receive the addend C and an output to provide an output AB+C.
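The data path of this first aspect can be sketched behaviorally as follows. The operand width, the number of most significant bits examined, the threshold value and all function names are assumptions chosen for illustration; the text above does not fix any of them, and the sketch does not model clock timing.

```python
WIDTH = 16       # assumed operand width in bits
MSB_COUNT = 4    # assumed number of most significant bits examined
THRESHOLD = 1    # assumed value of the "first threshold"


def msb_field(value):
    """Return the top MSB_COUNT bits of a WIDTH-bit operand."""
    return (value >> (WIDTH - MSB_COUNT)) & ((1 << MSB_COUNT) - 1)


def bypass_decision(a, b):
    """True when the most significant bits of both operands are below the threshold."""
    return msb_field(a) < THRESHOLD and msb_field(b) < THRESHOLD


def multiply_add(a, b, c):
    """Model the multiplier, register, multiplexer and adder.

    Returns (AB + C, bypass), where bypass reports which multiplexer
    input was selected: True for the direct multiplier output, False
    for the register output.
    """
    product = a * b              # multiplier unit output
    registered = product         # value the register would present one clock later
    bypass = bypass_decision(a, b)
    mux_out = product if bypass else registered
    return mux_out + c, bypass


small = multiply_add(3, 5, 7)          # small operands take the bypass path
large = multiply_add(0x8000, 2, 1)     # a set high-order bit selects the register path
```

Both paths yield the same numerical result; what differs is only whether the product reaches the adder directly or by way of the register.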
In another aspect, the invention provides a digital signal processor. The digital signal processor includes a data input, a data pump having an input coupled to the data input and having an output, a scheduler having inputs and an output and an arithmetic unit having inputs coupled to the data pump output. The arithmetic unit operates on the data inputs to provide an output in response to commands from the scheduler. The arithmetic unit includes a multiplier unit having an input stage configured to receive operands A and B from the data pump and an output to provide a product AB. The arithmetic unit also includes a register having an input coupled to the multiplier unit output and having an output and a multiplexer having a first data input coupled to the multiplier unit output, a second data input coupled to the register output, a toggle command input and a data output. The arithmetic unit further includes a bypass decision block having an input stage configured to receive the operands A and B, and an output coupled to the scheduler and to the toggle command input. The bypass decision block is configured to set the multiplexer to couple the first data input to the data output when the operands A and B have values below a multiplier threshold tm. The arithmetic unit additionally includes an adder having a first data input coupled to the multiplexer data output and configured to receive the product AB, a second data input configured to receive the addend C and an output to provide an output AB+C.
In a further aspect, the invention provides a method of scheduling multiplication and addition in an arithmetic unit configured to multiply a multiplicand A and a multiplier B to provide a product AB and to add an addend C to the product AB to provide an output signal AB+C. The method includes coupling the multiplicand A and the multiplier B to first and second inputs to a multiplier. The multiplier provides the product AB at an output. The method also includes coupling the multiplicand A and the multiplier B to first and second inputs of a bypass decision block and determining, by the bypass decision block, when most significant bits of the multiplicand and the multiplier have values below a first threshold. The method further includes toggling a multiplexer to couple a first multiplexer data input coupled to the multiplier output to accept the product AB and couple the product AB from the first multiplexer data input to the multiplexer output when the bypass decision block determines that the most significant bits have values below the first threshold.
In another aspect, the invention provides a method of selectively adding a product AB to an addend C during a first or a second clock cycle in an arithmetic unit configured to multiply a multiplicand A and a multiplier B to provide the product AB and to add the addend C to the product AB. The method includes coupling, during the first clock cycle, the multiplicand A and the multiplier B to first and second inputs to a multiplier having an output to provide the product AB and coupling the multiplicand A and the multiplier B to first and second inputs of a bypass decision block during the first clock cycle. The method also includes determining, by the bypass decision block and during the first clock cycle, when the multiplicand A and the multiplier B have values above a multiplier threshold tm and coupling the product AB from the multiplier output to an adder input during a second clock cycle when the bypass decision block determines that the multiplicand A and the multiplier B have values above the multiplier threshold tm. The method further includes coupling the product AB from the multiplier output to the adder input during the first clock cycle when the bypass decision block determines that the multiplicand A and the multiplier B do not have values above the multiplier threshold tm.
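The choice between the first and second clock cycle can be sketched as a simple scheduling model. The function names, the sample operands and the threshold value 255 are hypothetical. The sketch also assumes that the addition is deferred when either operand exceeds tm, and that operations run one after another without overlap; the text above does not settle either point.

```python
def add_cycle(a, b, tm):
    """Return the clock cycle (1 or 2) in which the product AB reaches the adder.

    Assumption: the addition is deferred to the second cycle when either
    operand exceeds the multiplier threshold tm.
    """
    return 2 if (a > tm or b > tm) else 1


def schedule(ops, tm):
    """Run a list of (A, B, C) operations and count the cycles used.

    Returns (results, total_cycles), where each result is the pair
    (AB + C, cycle in which the addition took place). Operations are
    assumed to execute back to back with no pipelining.
    """
    results = []
    total_cycles = 0
    for a, b, c in ops:
        cycle = add_cycle(a, b, tm)
        results.append((a * b + c, cycle))
        total_cycles += cycle
    return results, total_cycles


results, total_cycles = schedule([(2, 3, 4), (300, 2, 1), (5, 5, 5)], tm=255)
```

Under these assumptions the three sample operations take four cycles in total, compared with six if every addition were held until the second cycle.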