1. Field
This invention relates to the synthesis of a functional block in logic of the type which can be used to perform a sum-of-addends operation, as commonly used for binary multiplication, and to a method and apparatus for manufacturing an integrated circuit using the thus synthesised functional block.
2. Related Art
When modern integrated circuit (IC) designs are produced, they usually start with a high level design specification which captures the basic functionality required but does not include the detail of implementation. High level models of this type are usually written in a high level programming language to derive some proof of concept and validate the model, and can be run on general purpose computers or on dedicated processing devices.
Once this has been completed and the model has been reduced to register transfer level (RTL) using commercially available tools, this RTL model can then be optimised to determine a preferred implementation of the design in silicon.
The implementation of some types of multiplier in hardware often involves the determination of a number of partial products which are then summed, each shifted by one bit relative to the previous partial product. Visually this can be considered as a parallelogram array of bits as shown in FIG. 1, where the black circles represent bits. The example of FIG. 1 shows the multiplication of two 4 bit numbers. The result of the multiplication is an 8 bit number.
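By way of illustration only, the summation of shifted partial products described above can be sketched in software as follows. The function names partial_products and multiply_by_addends are hypothetical and do not appear in this specification; the sketch assumes unsigned operands and models the arithmetic only, not any particular gate-level arrangement.

```python
def partial_products(x, y, n=4):
    """Return the n partial products of two n-bit unsigned integers.

    Row i is x ANDed with bit i of y, shifted left by i places, which
    gives the parallelogram arrangement of bits when the rows are stacked.
    """
    rows = []
    for i in range(n):
        bit = (y >> i) & 1          # bit i of the multiplier
        rows.append((x * bit) << i)  # x or 0, shifted by one more bit per row
    return rows


def multiply_by_addends(x, y, n=4):
    """Multiply by summing the shifted partial products (the addends)."""
    return sum(partial_products(x, y, n))


if __name__ == "__main__":
    for row in partial_products(13, 11):
        print(format(row, "08b"))
    print(multiply_by_addends(13, 11))
```

For two 4 bit operands there are four addends and the sum always fits in 8 bits.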
FIG. 1 illustrates a straightforward multiplier but others can be implemented, such as Booth multipliers which will be discussed later in this specification. Some of these other multipliers do not arrange the bits in this parallelogram pattern. Nevertheless, they still have to perform a sum of addends in order to produce a result.
When such a multiplier is synthesised in RTL it produces a netlist of gates which can then be implemented in silicon. In many cases, the precision of the sum of addends required is lower than that provided by a full summation. Therefore, in some instances truncated multipliers are used which produce a less accurate result. A modified version of FIG. 1 is shown in FIG. 2. In this example the k least significant whole columns of bits are truncated (that is to say, discarded) and, to compensate for this, a constant value represented by the hatched circles is added. Once the summation has been calculated, further truncation of the n-k least significant bits of the result can be performed to leave an approximation to the multiplication result. Thus, the truncation comprises the discarding of some of the columns and the adding of a constant to one or more of the remaining columns to provide the approximation to the multiplication result. Synthesis of such an arrangement in RTL will result in a smaller netlist and will therefore enable a multiplier to be manufactured using fewer gates and thus less silicon. This will reduce the cost.
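A minimal software sketch of this kind of truncation is given below, for illustration only. The function name truncated_multiply and its parameters are hypothetical. The sketch assumes unsigned n bit operands, assumes that the compensating constant has no bits set below column k, and assumes that the final truncation leaves the n most significant bits of the 2n bit result. The constant values used in the example are arbitrary and are not taken from this specification.

```python
def truncated_multiply(x, y, n=4, k=2, constant=0):
    """Approximate the top n bits of x * y using a truncated array.

    Partial-product bits in the k least significant columns are discarded,
    a compensating constant is added to the remaining columns, and the
    n least significant bits of the sum are then dropped.
    """
    total = 0
    for i in range(n):              # row: bit i of y
        for j in range(n):          # position within the row: bit j of x
            column = i + j
            if column >= k and (x >> j) & 1 and (y >> i) & 1:
                total += 1 << column
    total += constant               # assumed to be a multiple of 2**k
    return total >> n


if __name__ == "__main__":
    exact = (15 * 15) >> 4
    print(exact, truncated_multiply(15, 15, constant=0),
          truncated_multiply(15, 15, constant=4))
```

With k set to zero and no constant, the sketch reduces to the full summation followed by the final truncation, which gives a reference against which a truncated configuration can be compared.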
The issue with truncating bits in sum of products operations is that it is complex to determine the effect of truncation. Usually error statistics need to be gathered, which is time consuming and can lead to many iterations being required during RTL synthesis to produce just one sum of addends unit. This problem becomes much worse with larger multipliers than those shown in the examples of FIGS. 1 and 2. Further complexity may arise when many separate multiplications are combined together and summed within a larger array, as is shown in FIG. 3. This shows four separate multiplications x1*y1 to x4*y4 which are all combined in a single summation array. Each one of these multiplications can be truncated by a different number of columns (k). This further increases the complexity of determining the effect of truncation on the error in the thus approximated result. The larger the sum of addends and the more multiplications which are combined, the more the complexity rises.
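To give a sense of why gathering error statistics is time consuming, the following sketch evaluates one truncated configuration exhaustively. It is illustrative only: the names kept_sum and error_statistics are hypothetical, unsigned operands are assumed, and the error is taken here as the exact product minus the approximation rescaled to the same weight. The loop visits every one of the 2^(2n) operand pairs for a single multiplier, and it would have to be repeated for every candidate choice of k and constant.

```python
def kept_sum(x, y, n, k):
    """Sum of the partial-product bits lying in columns k and above."""
    return sum(
        1 << (i + j)
        for i in range(n)
        for j in range(n)
        if i + j >= k and (x >> j) & 1 and (y >> i) & 1
    )


def error_statistics(n, k, constant):
    """Exhaustively gather (min, max, mean, count) of the truncation error."""
    errors = []
    for x in range(1 << n):
        for y in range(1 << n):
            # Drop the n least significant bits, then restore the weight
            # so the approximation can be compared with the exact product.
            approx = ((kept_sum(x, y, n, k) + constant) >> n) << n
            errors.append(x * y - approx)
    return min(errors), max(errors), sum(errors) / len(errors), len(errors)


if __name__ == "__main__":
    for k in range(4):
        print(k, error_statistics(4, k, 0))
```

For 4 bit operands this is only 256 evaluations per configuration, but the count grows exponentially with operand width and multiplies again when several products share one array.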
It is therefore desirable to be able to construct a sum of addends function which minimises hardware implementation cost as the output of RTL synthesis while maintaining a known error profile. In other words, it would be desirable to reduce the complexity of the synthesised logic as much as possible through truncation while maintaining a known error in the thus approximated result, without the time consuming data manipulation required to gather error statistics. Any reduction in complexity results in a reduction in silicon area and hence also in cost and power consumption.