Today, bit-serial processors are in widespread use. For example, bit-serial processors are commonly used to efficiently carry out pixel, or bit-plane, operations in image processing applications. See, for example, U.S. patent application Ser. No. 09/057,482, entitled "Mesh Connected Computer" (filed on even date herewith in the name of Abercrombie and under Attorney Docket No. 017750-351), which describes a system for performing image processing operations using arrays of coupled bit-serial processors. As the teachings of the present invention are useful in a system such as that described in the aforementioned patent application, the aforementioned patent application is incorporated herein in its entirety by reference. Those skilled in the art will appreciate, however, that the teachings of the present invention are broadly applicable in processors generally, irrespective of the particular form of processor in which the invention is employed (and irrespective of whether the processor is a bit-serial processor or a multiple-bit or parallel processor).
Generally, and for purposes of the discussion that follows, a bit-serial processor is any processor including an arithmetic logic unit configured to operate on single-bit, or few-bit, data and/or control inputs. The arithmetic logic used to construct such a bit-serial processor is typically minimal, and such logic is most often used to access and process only single-bit operands within a given clock cycle. Thus, an individual bit-serial processor typically provides an elemental computing platform. However, when many bit-serial processors are coupled in a strategic fashion, they are quite powerful, and extremely fast, particularly in applications in which a common operation must be performed simultaneously on many single-bit, or few-bit, operands. Such is often the case, for example, in image processing applications, wherein entire pixel-data bit-planes are manipulated in unison. See, for example, the above incorporated patent application.
By definition, then, conventional bit-serial processors require many clock cycles to perform multi-pass operations such as multiplying or dividing two multiple-bit numbers. Whereas a multiple-bit processor can employ considerable arithmetic and control logic to complete multiple-bit computations within a single clock cycle or a very few clock cycles, conventional bit-serial processors expend many clock cycles performing such computations in a multiple-pass fashion. This problem is exacerbated by the fact that bit-serial processors are often required to operate on numbers that are digitally represented using the IEEE standard floating point format or an equivalent.
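By way of illustration only, the multiple-pass character of such computations can be sketched in Python. The sketch assumes a simplified model in which one simulated clock cycle processes one bit position. The function names bit_serial_add and bit_serial_multiply, and the cycle accounting itself, are hypothetical and do not describe any particular processor.

```python
def bit_serial_add(a, b, width):
    """Add two unsigned integers, one bit position per simulated clock cycle."""
    carry = 0
    result = 0
    cycles = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        s = x ^ y ^ carry                    # sum bit of a full adder
        carry = (x & y) | (carry & (x ^ y))  # carry-out of a full adder
        result |= s << i
        cycles += 1                          # assumed cost: one cycle per bit
    return result, carry, cycles


def bit_serial_multiply(a, b, width):
    """Shift-and-add multiply: one full bit-serial addition pass per multiplier bit."""
    product = 0
    cycles = 0
    for i in range(width):
        # Add the shifted multiplicand when multiplier bit i is set, else add zero.
        addend = (a << i) if (b >> i) & 1 else 0
        product, _, used = bit_serial_add(product, addend, 2 * width)
        cycles += used
    return product, cycles


if __name__ == "__main__":
    print(bit_serial_add(5, 3, 8))
    print(bit_serial_multiply(13, 11, 8))
```

Under this assumed model, an 8-bit addition occupies 8 cycles, whereas an 8-bit multiplication occupies 8 passes of 16 cycles each, which illustrates why multi-pass operations dominate the cycle budget of a bit-serial processor.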
The IEEE floating point standard (ANSI/IEEE Std 754-1985, IEEE Standard for Binary Floating-Point Arithmetic) is predicated upon a need to represent as much information as possible in 32 bits (for single precision numbers). Thus, important information (e.g., the zero-ness of a number) is encoded and can be extracted only by examining multiple bits. Further, the standard specifies special treatment for "denormalized" numbers (i.e., operands whose magnitude falls below the range representable with the IEEE 8-bit exponent, but still close enough that incomplete information on the operand value can be conveyed). Thus, the IEEE format is not well suited to bit-serial processors, which require many clock cycles to execute even the most basic operations on numbers so represented.
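As a hedged illustration of the point about encoded information, the following Python sketch tests an IEEE single precision bit pattern for zero-ness and for denormalization. The helper names float32_bits and classify, and the count of bits examined, are assumptions introduced for this example only. The sketch assumes a processor that inspects one bit per step.

```python
import struct


def float32_bits(x):
    """Return the 32-bit IEEE single precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def classify(bits):
    """Inspect a 32-bit pattern; report zero-ness, denormal status, bits examined."""
    exponent = (bits >> 23) & 0xFF   # 8-bit exponent field
    fraction = bits & 0x7FFFFF       # 23-bit fraction field
    any_set = 0
    bits_examined = 0
    # Zero-ness is not stored as a flag: every exponent and fraction bit
    # (31 bits, sign excluded) must be scanned to establish it.
    for i in range(31):
        any_set |= (bits >> i) & 1
        bits_examined += 1
    is_zero = any_set == 0
    # A denormalized number has an all-zero exponent field and a nonzero fraction.
    is_denormal = exponent == 0 and fraction != 0
    return is_zero, is_denormal, bits_examined


if __name__ == "__main__":
    print(classify(float32_bits(0.0)))
    print(classify(float32_bits(1.0)))
    print(classify(0x00000001))  # smallest positive denormalized pattern
```

In this sketch, deciding whether an operand is zero costs 31 single-bit inspections, whereas a format carrying an explicit zero indication could, in principle, answer the same question from a single bit.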
Nonetheless, an ability to quickly perform multiple-bit computations on floating point numbers is often critical to the overall performance of a bit-serial signal processing implementation. Consequently, there is a need for improved methods and apparatus for performing floating point operations in processors.