The present invention is generally directed to a system for processing digital data in which data are processed in portions that are smaller than the word size, the size of the portions being optimally selected to maximize throughput efficiency, as that term is defined herein. More particularly, these digital signal processing systems are neither fully parallel nor fully serial in their architectures, but rather exhibit an intermediate architecture selected on the basis of optimizing a measure of performance based upon speed and circuit size.
In fully parallel (or word-parallel) digital signal processing architectures, all bits of a data word, n in number, are processed simultaneously by the circuitry. This architecture has the advantage of relatively high processing speed, but suffers from the disadvantage that circuit elements and the interconnections between them are replicated for each bit of a word, and each replication tends to consume a commensurate additional amount of die area in a monolithic integrated circuit. Interconnections between monolithic integrated circuits for parallel data are multi-wire, and a considerable number of interconnection terminals or "pins" must be provided for each integrated circuit to implement those multi-wire connections.
On the other hand, fully serial digital signal processing architectures process one bit at a time in each clock cycle. These circuits have the advantage of simplicity, ease of design and, most importantly, they require minimal amounts of circuitry and so take up only a small amount of die area in a monolithic integrated circuit. Also single-wire interconnections between monolithic integrated circuits are made possible by serial digital signals, which is important when the restrictions upon the number of interconnection terminals or "pins" available for such connections are pressing. Within a monolithic integrated circuit the single-conductor interconnections between circuit elements tend to appropriate less chip area than the multi-conductor interconnections between elements that characterize fully parallel architectures.
Serial architectures also tend to exhibit a substantial amount of latency. That is, because of the serial design, a relatively large number of clock cycles can elapse between the time that an input bit is received and the time that output information related to the input bit is provided by the circuitry. However, circuit speed is generally sufficiently fast once the latency period has elapsed. Also, when a number of serial computations are to be performed in a data-flow pipeline, later computations can begin before earlier ones finish, which tends to reduce overall latency in the system. Accordingly, throughput is not so low as to preclude utilization of this architecture. The main advantage of serial computation is the need for only a small area for the processing elements and their electrical interconnections. The drawback, however, is that throughput is often lower than otherwise desired. Equivalent throughput can often be approached by more traditional non-pipelined Von Neumann architectures.
A widely used fully serial architecture employs bit-serial signals in which a serial stream of bits describes a succession of data words bit by bit, in order of increasing significance, where those data words represent two's complement numbers. This serial stream of data bits is accompanied by a signal indicating when one data word finishes and another commences, which signal can be a signal that is a ONE when the most significant bit of a data word occurs in the serial stream of data bits and that is otherwise a ZERO.
Data-flow pipeline architectures are recognized as being appropriate to the implementation of a large class of algorithms such as those that appear in digital signal processing applications. There have been two major approaches to data flow architecture, namely fully parallel and fully serial implementations. These architectures are discussed broadly above. Both of them have been studied extensively.
Many algorithms, especially in the areas of digital signal processing and graphics applications, have a constant throughput and can be performed with a constant latency. These algorithms are suitable for direct implementation in hardware using pipelined data-flow architectures. Unfortunately, many algorithms require more operations, and hence more individual operators than can be accommodated on a single very large scale integrated circuit (VLSI circuit) using fully parallel arithmetic or logic. On the other hand, bit-serial systems often do not provide a sufficiently high throughput. Furthermore, the structure of many algorithms makes it difficult to avoid these problems by decomposing the data processing so as to dispose different portions of the circuitry on separate integrated circuits.
Fully-parallel computational elements have been one of the main objects of study in computer arithmetic. Even with the advent of VLSI, fully-parallel computational elements are not well suited to data-flow architectural treatment, however, because their replicated digital hardware causes a tendency towards excessive size (as measured with respect to utilization of chip area). Furthermore, the multi-conductor interconnections within an integrated circuit are difficult to route unless the die size is allowed to be larger than one would wish.
In connection with fully-parallel computation in data flow architectures, a technique known to designers (particularly those engaged in the design of digital filters) is to employ plural-path networks for "plural-phase" or "polyphase" data processing. See the M. G. Bellanger, G. Bonnerot and M. Coudreuse paper Digital Filtering by Polyphase Network: Application to Sample Rate Alteration and Filter Banks (IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-24, No. 2, pages 109-114, April 1976). See also pages 79-98 of the R. E. Crochiere and L. R. Rabiner book Multirate Digital Signal Processing, copyright 1983 by Prentice-Hall, Inc., Englewood Cliffs, N.J. 07632. In plural-phase data processing a stream of digital words supplied at an original sample rate is considered to comprise a succession of cycles, each cycle containing a plurality, p in number, of successive words. The p words in each cycle are considered as separate phases of the cycle. These phases may be identified by the consecutive ordinal numbers zeroth through (p-1)th assigned in accordance with occurrence of the words representative of those phases in the cycle. Each word phase is used to form a separate sample stream, the sample rate of which is one-pth that of the original sample rate; and calculations are performed at the lower sample rate on each of the sample streams. The results of these plural-phase calculations are then combined to generate results at the original sampling rate. Plural-phase data processing permits a relatively high throughput rate for a system, while calculations can be performed at reduced rates.
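The splitting of a word stream into p phases and the recombination of the per-phase results at the original rate may be illustrated by the following minimal Python sketch. It is not taken from the cited references. The function names split_phases, merge_phases and process_plural_phase are hypothetical, and the per-word operation op is assumed to be memoryless so that each phase can be processed independently of the others; a polyphase filter proper would carry state within each path.

```python
def split_phases(samples, p):
    """Form p sample streams; phase k holds words k, k+p, k+2p, ..."""
    return [samples[k::p] for k in range(p)]


def merge_phases(phases):
    """Interleave the phase streams back into original-rate order."""
    p = len(phases)
    out = [None] * sum(len(ph) for ph in phases)
    for k, ph in enumerate(phases):
        out[k::p] = ph  # phase k fills positions k, k+p, k+2p, ...
    return out


def process_plural_phase(samples, p, op):
    """Apply op to every word, working on each phase stream separately.

    Each inner list stands for a path running at one-pth of the
    original sample rate (assumption: op is memoryless).
    """
    phases = split_phases(samples, p)
    results = [[op(word) for word in phase] for phase in phases]
    return merge_phases(results)
```

For example, split_phases([1, 2, 3, 4, 5, 6, 7], 3) yields [[1, 4, 7], [2, 5], [3, 6]], and merging those three streams restores the original order.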
Another technique that is used by digital circuit designers to slow the rates at which data processing needs to be done is a procedure known as "banking". An operator that is to process a stream of data at a higher throughput rate is simulated by parallelly processing segments of that data stream in a plurality, p in number, of operators operating at a lower throughput rate one-pth as fast as the higher throughput rate. Successive segments of the data streams are displaced one sample word from each other in the banking procedure. When banking is employed in transverse filtering, each segment of the data stream spans the number of sample words in the filter kernel. The same filter kernel weights each segment of data to determine each successive sample word of filter response, and the component filter responses parallelly generated at the lower throughput rate are then sequentially polled at the higher throughput rate to supply the complete filter response at that higher throughput rate.
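Banking as applied to transverse filtering may be sketched in Python as follows. This is an illustrative model only, under the assumption that each of the p operators weights its segment with the kernel in the order given (no kernel reversal) and that outputs are produced only where a full segment is available. The names fir_direct and fir_banked are hypothetical.

```python
def fir_direct(x, h):
    """Reference transverse filter: one operator at the full rate."""
    k = len(h)
    return [sum(h[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]


def fir_banked(x, h, p):
    """Same response from p operators, each at one-pth the rate."""
    k = len(h)
    n_out = len(x) - k + 1
    # Operator b weights the segments starting at words b, b+p, b+2p, ...
    # Successive segments are displaced one word from each other, so
    # consecutive segments go to consecutive operators.
    banks = [
        [sum(h[j] * x[i + j] for j in range(k)) for i in range(b, n_out, p)]
        for b in range(p)
    ]
    # Poll the operators in rotation at the higher throughput rate.
    return [banks[i % p][i // p] for i in range(n_out)]
```

With x = [1, 2, 3, 4, 5, 6], h = [1, 0, -1] and p = 2, both functions return [-2, -2, -2, -2]; the banked version merely distributes the four segment calculations over two slower operators.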
The present inventors perceive that the use of arithmetics that use non-redundant plural-bit digits, including multiple-bit as well as dual-bit digits, greatly expands the range of design alternatives, lying between fully parallel and fully serial architectures, that are available to the integrated circuit designer. One can design systems using a small digit size where high throughput is not so stringent a requirement and the space available on an integrated-circuit die for digit hardware is at a premium, and using a larger digit size where a higher throughput rate is necessary. One can change digit size to adjust to the number of pins available for interconnection between integrated circuits or to solve routing problems for connections within an integrated circuit die.
The particular arithmetic favored by the inventors is a digit-serial arithmetic in which each word is a two's complement number of n bits, n being a positive integer that is a multiple of another positive integer m. The submultiple, m, is the number of bits in each digit of the word. The digits of a word are successively supplied to data flow architecture in order of their significance, least significant digit first and most significant digit last. The order of bits within each digit is prescribed according to the significance of the bits within that digit. The sign bit is the most significant bit of the word and is contained in the last digit of the word. The flow of digits is accompanied by another signal that indicates how the flow of digits may be partitioned into individual words.
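The digit-serial word format just described may be modeled by the following Python sketch, offered only as an illustration. The function names to_digits and from_digits are hypothetical, and the accompanying partitioning signal is assumed here to be a flag that is true during the last digit of each word.

```python
def to_digits(value, n, m):
    """Split an n-bit two's complement word into n/m digits of m bits.

    Returns (digit, last) pairs, least significant digit first.
    The last flag (assumed framing signal) marks the final digit,
    which is the one containing the sign bit.
    """
    if n % m != 0:
        raise ValueError("n must be a multiple of m")
    if not -(1 << (n - 1)) <= value < (1 << (n - 1)):
        raise ValueError("value does not fit in n bits")
    raw = value & ((1 << n) - 1)      # two's complement bit pattern
    mask = (1 << m) - 1
    count = n // m
    return [((raw >> (m * i)) & mask, i == count - 1) for i in range(count)]


def from_digits(stream, m):
    """Rebuild signed words from a flow of (digit, last) pairs."""
    words, raw, shift = [], 0, 0
    for digit, last in stream:
        raw |= digit << shift
        shift += m
        if last:
            if raw >> (shift - 1):    # sign bit set: value is negative
                raw -= 1 << shift
            words.append(raw)
            raw, shift = 0, 0
    return words
```

For n = 8 and m = 4, the word -3 (bit pattern 11111101) is sent as the digit 13 followed by the digit 15, with the flag raised on the second digit. Setting m = 1 gives a bit-serial stream, and m = n gives a single fully parallel word.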
While the indication may be furnished during the first digits of words, the inventors find it is preferable to furnish the indication during the last digits of words. Different digit-serial operations may be controlled during the first digits of words and during the last digits of words, respectively. It is usually more economical of hardware to derive the former indications from the latter indications by unit digit-interval delay than it is to derive the latter indications from the former indications by [(n/m)-1]-digit-interval delay. Bit-serial processing may be considered to be a special case of digit-serial processing, where digit size is one bit wide.
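The hardware economy noted above may be seen in the following Python sketch of a delay line, in which each stored element stands for one register stage. The function name delay_line and its fill parameter are hypothetical. The sketch assumes that the interval preceding the first word is treated as if it had ended a word, so that the delay line deriving the first-digit indication is preloaded with a ONE.

```python
from collections import deque


def delay_line(flags, delay, fill=False):
    """Delay a flag sequence by `delay` digit intervals.

    The deque holds `delay` stored values, one per register stage;
    `fill` is the assumed initial content of every stage.
    """
    line = deque([fill] * delay)
    out = []
    for flag in flags:
        line.append(flag)
        out.append(line.popleft())
    return out


# Two words of n/m = 4 digits each.
last = [False, False, False, True, False, False, False, True]

# First-digit indication from last-digit indication: one stage.
first = delay_line(last, 1, fill=True)

# Last-digit indication from first-digit indication: (n/m)-1 = 3 stages.
last_again = delay_line(first, 3)
```

Here first is true during digit intervals 0 and 4, and last_again reproduces last. The first derivation needs a single stage regardless of word length, whereas the second needs (n/m)-1 stages.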
The invention is directed to digital standards converters that are useful in systems using digit-serial signals.