Digital signals having words of a number W of bits may be subjected to parallel processing, serial processing, or processing that combines features of parallel and serial processing. Parallel processing of the W-bit words wherein the W bits flow in respective bit streams for simultaneous individual processing allows relatively high rates of processing with relatively low latency. However, processing circuitry is in large part replicated W-fold with attendant cost in terms of operating power and digital hardware. In monolithic integrated circuit constructions, more die area is consumed because of the increased hardware requirements. Serial processing, wherein the W bits of each word are sequentially processed, does not require W-fold replication of hardware. However, processing is slower and latency in terms of clock cycles is longer than for parallel processing.
To obtain favorable trade-offs between speed of processing and digital hardware requirements, the W-bit words can each be divided into W/n subwords or digits of n bits each, provided W is a multiple of n. Then, the digits are serially subjected to parallel processing in n parallel bit streams.
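The division of a word into digits can be illustrated with a short Python sketch. It is only a behavioral model, not a description of any circuit: the function names to_digits and from_digits are hypothetical, and the least-significant-digit-first ordering is an assumption made for the example, since the passage above does not fix an order.

```python
def to_digits(word, W, n):
    """Split a W-bit word into W/n digits of n bits each.

    Assumption for this sketch: digits are returned least significant first.
    """
    if W % n != 0:
        raise ValueError("W must be a multiple of n")
    if not 0 <= word < (1 << W):
        raise ValueError("word does not fit in W bits")
    mask = (1 << n) - 1
    return [(word >> (i * n)) & mask for i in range(W // n)]


def from_digits(digits, n):
    """Reassemble a word from n-bit digits given least significant first."""
    word = 0
    for i, d in enumerate(digits):
        word |= d << (i * n)
    return word


# An 8-bit word split into four 2-bit digits (W = 8, n = 2).
digits = to_digits(0xB6, 8, 2)
```

With W = 8 and n = 2, each clock interval would carry one 2-bit digit on two parallel bit streams, so the word occupies four clock intervals rather than eight.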
M. J. Irwin and R. M. Owens describe a specific system of this general sort in "Digit-Pipelined Arithmetic as Illustrated by the Paste-Up System: A Tutorial", Computer, April 1987, pages 73-85. Irwin and Owens espouse a system wherein successive digits of a word are supplied in order of decreasing significance of their bits. Irwin and Owens in FIG. 3 of their article show a pipelined adder for such a system. That adder requires two single-bit addition steps per bit of each digit. Irwin and Owens avoid the need to wait for a carry that ripples up from the least significant end by using signed-digit or redundant arithmetic.
Such arithmetic is described by A. Avizienis in "Signed-Digit Number Representations for Fast Parallel Arithmetic", IRE Transactions on Electronic Computers, September 1961, pages 389-400. Signed-digit arithmetic is redundant in that positive and negative digits are differently represented. Essentially, signed-digit arithmetic costs a sign bit per digit, rather than just one sign bit per word. A pipelined arithmetic that is efficient is desirable, however, since a digital hardware saving of almost one-nth would then be possible.
If Irwin and Owens are correct in their opinion that most-significant-digit-first processing mandates redundant arithmetic, then least-significant-digit-first processing must be employed in order to use efficient digits.
S. G. Smith and P. B. Denyer describe the use of an efficient digital arithmetic with two-bit digits in "Radix-4 Modules for Higher-Performance Bit-Serial Computation", IEE Proceedings, Vol. 134, Pt. E, No. 6, November 1987, pages 271-276. They denominate normal bit-serial data communication carried out on a single wire as being "radix-two" bit-serial data communication. In radix-two bit-serial data communication, they note, computational elements have one logical input per input operand. A W-bit data word is transmitted least significant bit first and is processed in W clock cycles, one bit per clock cycle. Smith and Denyer propose what they term "radix-four bit-serial" data communication which is performed concurrently on a pair of wires, one carrying even-numbered bits, the other odd-numbered bits. The concurrent bit pairs, or radix-four digits, represent side-by-side bit places from the data word; and data are transmitted in order of increasing bit significance. That is, the relatively less significant digits of a word are transmitted before the relatively more significant digits of the word. Computational elements for the radix-four digital data have two logical inputs per input operand; and a W-bit, (W/2)-digit data word is processed in W/2 clock cycles. Radix-four bit-serial data communication is also described in less particular terms in an earlier-published S. G. Smith, M. S. McGregor and P. B. Denyer paper "Techniques to Increase the Computational Throughput of Bit-Serial Architectures", IEEE Proceedings of ICASSP 1987, April 1987, pages 543-546.
U.S. patent application Ser. No. 182,602, filed 18 Apr. 1988 by P. F. Corbett and R. I. Hartley, entitled A CELL STACK FOR VARIABLE DIGIT WIDTH SERIAL ARCHITECTURE, assigned to General Electric Company and not acknowledged by this disclosure to constitute prior art adversely affecting the patentability of the present invention, is of interest. This application defines "digit-serial arithmetic" wherein W-bit operand words are grouped on the basis of bit significance into W/n successive digits of n bits each, which digits occur in successive clock intervals in order of increasing significance with passage of time. The n parallel bit streams that provide serial digit flow are accompanied by a control signal identifying the partitioning between successive words. Application Ser. No. 182,602 indicates that digit-serial data processing using digits of four to eight bits usually provides the best tradeoffs between throughput rate and efficient utilization of monolithic-integrated-circuit die area. That is, radix-16, radix-32, radix-64, radix-128, or radix-256 digits are indicated to be generally preferable to radix-four digits. Application Ser. No. 182,602 notes that these optima had not been previously appreciated in the serial computational arts.
Of particular interest in the above-referred-to November 1987 Smith and Denyer paper, insofar as the invention herein described is concerned, is the radix-four cascade adder of FIG. 6b in that paper. This adder is similar to cascade adders for digit-serial numbers for higher radix digits, as described in application Ser. No. 182,602.
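The arithmetic performed by a least-significant-digit-first adder can be sketched behaviorally in Python. This is an illustration only and does not reproduce the circuit of FIG. 6b of the Smith and Denyer paper or of application Ser. No. 182,602; the function name digit_serial_add is hypothetical, and clearing the carry at the start of each call stands in for the control signal that marks word boundaries.

```python
def digit_serial_add(a_digits, b_digits, n):
    """Add two words supplied as n-bit digits, least significant digit first.

    One loop iteration models one clock interval: n bits of each operand
    are added together with the carry saved from the previous interval.
    """
    if len(a_digits) != len(b_digits):
        raise ValueError("operands must have the same number of digits")
    mask = (1 << n) - 1
    carry = 0  # cleared at the word boundary (assumed control behavior)
    sum_digits = []
    for a, b in zip(a_digits, b_digits):
        s = a + b + carry
        sum_digits.append(s & mask)  # n-bit sum digit for this interval
        carry = s >> n               # carry held over to the next interval
    return sum_digits, carry


# 0xB6 + 0x5D with 2-bit (radix-four) digits, least significant digit first.
result, carry_out = digit_serial_add([2, 1, 3, 2], [1, 3, 1, 1], 2)
```

Because the carry propagates toward more significant digits, supplying digits in order of increasing significance lets each interval's result be final as soon as it is computed, which is why no redundant representation is needed in this ordering.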
A digit-serial linear combining apparatus that provides for comparison of two digit-serial operands as well as for selectively additively or subtractively combining them is attractive, it is here pointed out, since a plurality of such apparatuses may be utilized together with other elements to perform division of one digit-serial operand by another. This can be done using non-restoring division in two's complement, which type of division is generally described on pages 113-116 of Jean-Loup Baer's book Computer Systems Architecture, copyright 1980, Computer Science Press, Inc., Potomac, Md., which pages are incorporated herein by reference.
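The way non-restoring division relies on a sign comparison followed by a selective addition or subtraction can be seen in the following Python sketch. It is a simplified illustration, not the procedure as given in the cited text: it assumes a non-negative W-bit dividend and a positive divisor, works bit by bit rather than digit-serially, and uses the hypothetical function name nonrestoring_divide.

```python
def nonrestoring_divide(dividend, divisor, W):
    """Non-restoring division of a non-negative W-bit dividend.

    Assumptions for this sketch: 0 <= dividend < 2**W and divisor > 0.
    Returns (quotient, remainder).
    """
    if divisor <= 0 or not 0 <= dividend < (1 << W):
        raise ValueError("operands outside the range assumed by this sketch")
    r = 0  # partial remainder; may go negative between steps
    q = 0
    for i in range(W - 1, -1, -1):
        bit = (dividend >> i) & 1
        if r >= 0:
            r = 2 * r + bit - divisor  # subtract when remainder is non-negative
        else:
            r = 2 * r + bit + divisor  # add back when remainder is negative
        q = (q << 1) | (1 if r >= 0 else 0)  # quotient bit from the sign test
    if r < 0:
        r += divisor  # single final correction of the remainder
    return q, r


quotient, remainder = nonrestoring_divide(13, 3, 4)
```

Each step needs only a sign test on the partial remainder and one add-or-subtract operation, which is the combination of functions that the linear combining apparatus described above provides.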