The present invention generally relates to computing machines and Integrated Circuits (ICs), and more specifically to a universal computing unit capable of performing multiple operations without program instructions.
A goal of IC design methodologies is to provide both high performance relative to power consumption and price, and high flexibility. However, traditional IC technologies, such as Application-Specific Integrated Circuits (ASICs) and Digital Signal Processors (DSPs), do not satisfy both goals. An ASIC provides high performance with low power consumption and price, but provides very low flexibility. A DSP provides high flexibility, but provides low performance in relation to power consumption and price because a DSP requires complex programming and extensive control and execution instructions to perform a complete application algorithm.
An IC typically performs multiple functions, such as addition, multiplication, filtering, Fourier transforms, and Viterbi decoding. Units designed with specific rigid hardware have been developed to each solve one computation problem. For example, adder, multiplier, multiply-accumulate (MAC), multiple MACs, Finite Impulse Response (FIR) filtering, Fast Fourier Transform (FFT), and Viterbi decoding units may be included in an IC. The adder unit performs addition operations. The multiplier unit performs multiplication operations. The MAC unit performs multiplication and addition operations. Multiple MACs can perform multiple multiplication and addition operations. The FIR unit performs a basic filter computation. The FFT unit performs Fast Fourier Transform computations. The Viterbi unit performs maximum-likelihood decoding.
The FIR, FFT, and Viterbi units are specially designed to perform complicated filter, transform, and decoding computations. Multiple MACs may be able to perform these operations, but doing so requires complicated software algorithms to sequence each computation. Thus, performing FIR filtering, FFT, and Viterbi decoding computations with multiple MACs requires an enormous amount of processing time, which restricts the operations of the IC.
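As an illustration only, the following minimal sketch shows how an FIR filter computation might be expressed in software as a sequence of MAC operations. The function names `mac` and `fir_filter`, the direct-form structure, and the treatment of samples before the start of the input as zero are assumptions made for this example and are not taken from any particular IC or DSP.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: returns acc + a * b (hypothetical helper)."""
    return acc + a * b


def fir_filter(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * samples[n - k].

    Samples before the start of the input are assumed to be zero.
    """
    outputs = []
    for n in range(len(samples)):
        acc = 0
        for k, c in enumerate(coeffs):
            if n - k >= 0:  # skip taps that would read before the first sample
                acc = mac(acc, c, samples[n - k])
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    # Two-tap filter applied to a short input sequence.
    print(fir_filter([1, 2, 3, 4], [1, 1]))
```

In this sketch each output sample takes up to one MAC step per filter coefficient, and the loop control and indexing around those steps are carried out by software. A dedicated FIR unit would instead perform the equivalent computation in fixed hardware.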
All of these units are implemented in rigid hardware to obtain the best performance for their specific operations. Thus, the functions performed by the units may be performed faster by the IC because the IC includes units that specifically perform certain operations. However, if an application does not need a provided operation, the hardware for the unused operation is wasted. For example, an IC may include FIR, FFT, and Viterbi units. If an application does not need to perform a Viterbi decoding operation, the Viterbi unit is not used by the IC because the unit can only perform Viterbi operations. This results in dead silicon because the silicon used to implement the Viterbi unit goes unused during the execution of the application.