1. Field of the Invention
This invention relates to a digital processor for two's complement computations, and more particularly to a processor of the kind incorporating an array of individual logic cells operating on single bit input. Arrays of this kind are referred to as "bit-level systolic arrays".
2. Discussion of Prior Art
Bit-level systolic arrays are known in the prior art, and are described in, for example, British Patent No. 2,106,287 B (Ref (1)) (corresponding to U.S. Pat. Nos. 4,533,993 and 4,639,857). In FIG. 7 et seq., Ref (1) describes the basic features of one form of bit-level systolic array for matrix-vector multiplication. The FIG. 7 device consists of a rectangular array of individual logic cells each connected to its row and column neighbours. Each cell has a specified logic function, but no gate level constructional details are given. The array includes intercell clocked latches for bit storage and advance along array rows and down array columns. Each cell evaluates the product of input data and coefficient bits received from neighbouring cells. The product is added to input carry and cumulative sum bits. The manner in which bits propagate through the array is governed by the form of computation to be executed. In Ref (1), FIG. 7 relates to multiplication of a vector X by a matrix W to form a product vector Y, the matrix W having coefficients with a value of +1 or -1. The vectors X and Y represent digital numbers in two's complement form.
The matrix W is input to a first array edge one diagonal per clock cycle. Successive coefficients of X are input to a second array edge orthogonal to the first in a bit parallel, word serial, bit staggered manner; i.e. bits of like significance of different coefficients are input in succession to the respective row, but bits of each word with significance differing by one are input to adjacent rows with a time delay of one clock cycle per row. Coefficient bits propagate along array rows, and are multiplied by successive matrix coefficients to provide contributions to product vector element bits at a third array edge. Carry bits pass between adjacent cells evaluating bits one level higher in significance.
FIG. 9 of Ref (1) relates to a logic cell suitable for use with the FIG. 7 array, and extending its application to multiplication by a matrix of coefficients +1, -1 or 0. This requires two bits to define each matrix coefficient as opposed to one bit previously. Each cell has extra inputs to accommodate this. However, no gate level constructional details of the logic cells are given.
FIG. 15 of Ref (1) relates to a processor for executing a convolution operation, i.e. a convolver. It operates on all positive input data and coefficients, and comprises a rectangular array of gated full adder logic cells with row and column connections; i.e. each cell comprises a full adder with an AND gate connected to one input to act as a multiplier of two input bits. Data and coefficient bits propagate in counter flow along array rows, and product bits are accumulated in cascade down array columns. The array is connected to a full adder array arranged to sum contributions to like convolution results. To avoid the generation of unwanted bit-level partial products, individual bits of each word are separated by zeros. This means that part of the array is idle at any given time, since some of the cells are computing zero products.
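The gated full adder cell described above can be illustrated in software. The following is a minimal sketch only, not the cell of Ref (1): the function names `gated_full_adder` and `multiply_unsigned` are hypothetical, and the cells are evaluated one after another in a loop rather than as a clocked systolic array. It shows an AND gate forming the product of two input bits, and a full adder adding that product to the incoming sum and carry bits, for all positive operands.

```python
from typing import List, Tuple


def gated_full_adder(a: int, b: int, s_in: int, c_in: int) -> Tuple[int, int]:
    """One cell: an AND gate forms the bit product a*b, which a full adder
    then adds to the incoming sum bit and carry bit."""
    p = a & b                                        # AND gate acting as 1-bit multiplier
    s_out = p ^ s_in ^ c_in                          # full adder sum output
    c_out = (p & s_in) | (p & c_in) | (s_in & c_in)  # full adder carry output
    return s_out, c_out


def multiply_unsigned(x: int, y: int, n: int) -> int:
    """Multiply two non-negative n-bit words using only gated full adder cells.

    Illustrative assumption: cells are visited sequentially; no latches or
    clocking are modelled.
    """
    acc: List[int] = [0] * (2 * n)  # cumulative sum bits, least significant first
    for j in range(n):
        yj = (y >> j) & 1
        carry = 0
        for i in range(n):
            xi = (x >> i) & 1
            acc[i + j], carry = gated_full_adder(xi, yj, acc[i + j], carry)
        # Ripple any remaining carry into the higher-order sum bits.
        k = n + j
        while carry and k < 2 * n:
            acc[k], carry = gated_full_adder(0, 0, acc[k], carry)
            k += 1
    return sum(bit << k for k, bit in enumerate(acc))


print(multiply_unsigned(5, 3, 3))  # 15
```

The sketch is restricted to non-negative words, in keeping with the all positive data and coefficients of the FIG. 15 convolver.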
British Patent No. 2,144,245 B (corresponding to U.S. Pat. No. 4,686,645) (Ref (2)) relates to a bit-level systolic array for matrix-matrix multiplication. It employs an array of gated full adder logic cells with row and column connections as in Ref (1), FIG. 15. Each cell recirculates its output carry bit to its carry input, since it computes bits in ascending order of significance on successive cycles. Multiplicand matrices move in counterflow along array rows. Contributions to product matrix elements accumulate down array columns. The contributions are grouped appropriately by array output adder trees, which are switchable for separation of different matrix elements.
The use of so-called "guard bands" is described in Ref (2). This relates to the extension of input numbers with extra zero bits to allow output product terms to have greater length without overlap between adjacent terms; this increase in length is commonly referred to as "word growth". Ref (2) largely relates to computations involving all positive numbers, but FIG. 9 illustrates an array cell suitable for processing two's complement numbers. It is a gated full adder as before, and includes a control line to provide for appropriate products to be complemented. In addition, extra electronic components must be added to the array output accumulator to introduce a correction factor: the array itself produces erroneous results, and a correction term must be applied by the output accumulator. This term has the value $2^{m} - 2^{2m-1}$ in the case of multiplication of two words each of m bits. This implementation of two's complement arithmetic arises from the Baugh-Wooley algorithm. However, Ref (2) does not provide a gate level description of the array logic cell construction required to implement complementation of negatively weighted partial products.
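The correction term can be checked numerically. The following is a minimal sketch under stated assumptions, not the circuit of Ref (2): the function name `baugh_wooley_product` is hypothetical, and the partial products are summed in an ordinary loop rather than in an array of cells. Partial products involving exactly one sign bit are negatively weighted; they are complemented, and the single constant 2^m - 2^(2m-1) is then added to the accumulated total.

```python
def baugh_wooley_product(a: int, b: int, m: int) -> int:
    """Multiply two m-bit two's complement words by complementing the
    negatively weighted partial products and adding one correction term.

    a and b are ordinary Python ints in the range -2**(m-1) .. 2**(m-1) - 1.
    """
    # Python's >> on negative ints is arithmetic, so this yields the
    # two's complement bit pattern, least significant bit first.
    a_bits = [(a >> i) & 1 for i in range(m)]
    b_bits = [(b >> j) & 1 for j in range(m)]

    total = 0
    for i in range(m):
        for j in range(m):
            p = a_bits[i] & b_bits[j]
            # Exactly one of the two bits is a sign bit: the partial
            # product is negatively weighted, so complement it.
            if (i == m - 1) != (j == m - 1):
                p ^= 1
            total += p << (i + j)

    # Correction applied once at the accumulator: 2^m - 2^(2m-1).
    total += (1 << m) - (1 << (2 * m - 1))
    return total


print(baugh_wooley_product(-3, 5, 4))  # -15
```

Without the final correction the accumulated total would be in error by the same constant for every pair of operands, which is why the correction can be applied at the output accumulator rather than within the cells.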
British Patent No. 2,147,721 B (Ref (3)) (corresponding to U.S. Pat. No. 4,701,876) relates to a bit-level systolic array for matrix-vector multiplication. It is addressed to the problem of reducing the number of array logic cells which are effectively idle. Ref (3) employs switchable array output accumulation and cell clocking arranged to provide for bit movement in adjacent rows on alternative cycles. By this means, full cell utilisation is achieved. As in Refs (1) and (2), the array is rectangular and multiplicand bits move in counterflow along array rows. In addition, as in Ref (2), carry bits are recirculated on respective cells, and guard bands are employed to extend input digital words to provide for output word growth.
In order to provide for two's complement computation, Ref (3) envisages the use of a control line to complement appropriate products. In addition, output accumulation is to be corrected for the presence of unwanted terms. Here again, there is no gate level description of an array logic cell for two's complement arithmetic.
More recent prior art in the bit-level systolic array area relates to the use of stationary multiplicand coefficients each associated with a respective array logic cell. This is discussed by Urquhart and Wood in the GEC Journal of Research, Vol. 2, No. 1, 1984, Ref (4). It is implemented in published British Patent Application Nos. 2,168,509 A (corresponding to U.S. Pat. No. 4,777,614), 2,187,579 A (corresponding to U.S. Pat. No. 4,885,715) and 2,192,474 A (corresponding to U.S. Pat. No. 4,833,635) (Refs (5), (6) and (7)). Of these, only Ref (5) addresses the problem of two's complement multiplication. It relates to matrix/vector multiplication in the special case when the matrix coefficients are restricted to the values +1, 0 and -1. It observes that multiplication by +1 and by 0 is straightforward, but that multiplication by -1 is more complex. The latter requires bits to be complemented and 1 added to the least significant bit. However, Ref (5) merely specifies a logic function required to implement this procedure; no gate level description of such a cell is given.
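The three cases of multiplication by +1, 0 and -1 can be sketched as follows. This is an illustrative sketch only, not the logic function of Ref (5): the helper names `to_bits`, `from_bits` and `multiply_by_coefficient` are hypothetical. Multiplication by -1 is performed by complementing every bit and adding 1 at the least significant bit, with the carry rippling upwards.

```python
from typing import List


def to_bits(value: int, n: int) -> List[int]:
    """n-bit two's complement pattern of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits: List[int]) -> int:
    """Interpret a least-significant-first bit list as two's complement."""
    n = len(bits)
    magnitude = sum(bit << i for i, bit in enumerate(bits[:-1]))
    return magnitude - (bits[-1] << (n - 1))  # sign bit carries negative weight


def multiply_by_coefficient(x_bits: List[int], w: int) -> List[int]:
    """Multiply a two's complement word by a coefficient w in {+1, 0, -1}."""
    if w == 0:
        return [0] * len(x_bits)
    if w == 1:
        return list(x_bits)
    if w != -1:
        raise ValueError("coefficient must be +1, 0 or -1")
    # w == -1: complement each bit, then add 1 at the least significant bit.
    out = []
    carry = 1
    for bit in x_bits:
        s = (bit ^ 1) + carry
        out.append(s & 1)
        carry = s >> 1
    return out


print(from_bits(multiply_by_coefficient(to_bits(3, 4), -1)))  # -3
```

As in any fixed word length two's complement scheme, negating the most negative representable value overflows and returns the same bit pattern.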
The foregoing prior art demonstrates a general problem in bit-level systolic arrays, that of dealing efficiently with input data which may be positive or negative. There is no difficulty with all positive input data. As has been said, the prior art approach to dealing with positive and negative data has been to employ the Baugh-Wooley algorithm. This is undesirably complex for two reasons, these being the need to use control bits and the need to correct the accumulator. Furthermore, where guard bands are necessary, these must contain zeros, which conflicts with two's complement arithmetic and conventional digital circuits adapted for it. In two's complement arithmetic, the word length of a number is increased by replicating the most significant (sign) bit; for example, the three-bit word 101 would be extended to five bits as 11101. In consequence, conventional digital arithmetic circuits arranged to receive sign extended inputs and generate sign extended outputs are inappropriate. In practice, bit-level systolic arrays employing the Baugh-Wooley algorithm have used specially adapted circuits which are undesirably complex.
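The conflict between zero-filled guard bands and sign extension can be seen in a short sketch. The function names `sign_extend`, `zero_extend` and `twos_complement_value` below are hypothetical and are introduced purely for illustration; words are written most significant bit first, as in the text.

```python
def sign_extend(bits: str, width: int) -> str:
    """Lengthen a two's complement word by replicating its sign bit."""
    return bits[0] * (width - len(bits)) + bits


def zero_extend(bits: str, width: int) -> str:
    """Lengthen a word by prefixing zeros, as a zero-filled guard band does."""
    return "0" * (width - len(bits)) + bits


def twos_complement_value(bits: str) -> int:
    """Value of a bit string read as a two's complement number."""
    value = int(bits, 2)
    return value - (1 << len(bits)) if bits[0] == "1" else value


word = "101"                                        # -3 in three bits
print(sign_extend(word, 5))                         # 11101
print(twos_complement_value(sign_extend(word, 5)))  # -3
print(zero_extend(word, 5))                         # 00101
print(twos_complement_value(zero_extend(word, 5)))  # 5
```

Sign extension preserves the value of a negative word, whereas prefixing zeros changes it when the extended word is read as two's complement.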