The invention lies within the field of computer chip design. More particularly, the invention relates to a method and an electronic computing circuit for operand width reduction for a modulo adder followed by saturation.
In modern chip design, the reduction of design time is a critical issue. Reuse of building blocks enables a reduction of design effort and design time. However, where different operations are executed by a similar unit, specific instructions sometimes make a more complex design necessary. This applies especially to Single-Instruction-Multiple-Data (SIMD) units such as Vector Multimedia Extension (VMX), Synergistic Processing Elements (SPE) or Streaming SIMD Extensions 4 (SSE4) units. The data width, i.e. the bit width, in these units is variable and depends on the instruction to be performed. Common bit widths are powers of two, such as 8, 16, 32 and 64. Operands and results always have one of these bit widths.
In many cases, intermediate results produced during calculation in the various compute units cannot be represented with the given bit width of the result. Depending on the function performed, the modulo function, saturation or rounding must then be applied to the intermediate result. These options add design effort and make the implementation inhomogeneous.
In some cases, even a combination of the above functions must be applied to intermediate results. For a final adder of a carry network, it is frequently required to first add with the modulo function to an intermediate bit width of N, followed by a saturation towards a smaller result bit width M. Such functionality can be required, e.g., in the final adder stage of a multiplier circuit that saturates the result to M bits.
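By way of illustration only, the combined function described above can be sketched as a small behavioral reference model. The sketch assumes a signed two's-complement interpretation of the operands and of the result, which the text above does not specify, and the function names (`modulo_add`, `to_signed`, `saturate_signed`, `modulo_add_then_saturate`) are hypothetical. It models the arithmetic only and is not a description of any particular circuit.

```python
def modulo_add(a: int, b: int, n: int) -> int:
    """Add two values and keep only the low n bits (carry-out is discarded)."""
    return (a + b) & ((1 << n) - 1)


def to_signed(x: int, n: int) -> int:
    """Interpret an n-bit pattern as a two's-complement signed value (assumption)."""
    return x - (1 << n) if x & (1 << (n - 1)) else x


def saturate_signed(x: int, m: int) -> int:
    """Clamp x to the range representable by an m-bit signed result."""
    lo = -(1 << (m - 1))
    hi = (1 << (m - 1)) - 1
    return max(lo, min(hi, x))


def modulo_add_then_saturate(a: int, b: int, n: int, m: int) -> int:
    """Modulo addition at intermediate width n, then saturation to width m."""
    intermediate = modulo_add(a, b, n)
    return saturate_signed(to_signed(intermediate, n), m)


if __name__ == "__main__":
    # Intermediate width N = 16, result width M = 8 (example values only).
    print(modulo_add_then_saturate(3, 4, 16, 8))
    print(modulo_add_then_saturate(100, 100, 16, 8))
    print(modulo_add_then_saturate(0x7FFF, 1, 16, 8))
```

In this model, the adder operates on the full intermediate width N, which is the property identified as a disadvantage in the discussion of the prior art that follows.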
An electronic computing circuit to perform these functions can be derived from EP 0 209 014 B1 by ignoring the carry output of the adder, i.e. by using a modulo adder. A disadvantage of this microarchitecture is that the adder device must be adapted to process input operands having a bit width of N. If other functions only need adders of the target width M, this approach might increase the delay of these functions. This approach also prevents wider reuse of existing designs or building blocks available for the target width M.
Up to now, no solution is known to perform such functionality within adder building blocks with a bit width equal to the bit width M of the result.