Programmable logic devices (“PLDs”) are a well-known type of integrated circuit that can be programmed to perform specified logic functions. One type of PLD, the field programmable gate array (“FPGA”), typically includes an array of programmable tiles. These programmable tiles can include, for example, input/output blocks (“IOBs”), configurable logic blocks (“CLBs”), dedicated random access memory blocks (“BRAMs”), multipliers, digital signal processing blocks (“DSPs”), processors, clock managers, delay lock loops (“DLLs”), and so forth. Notably, as used herein, “include” and “including” mean including without limitation.
One such FPGA is the Xilinx Virtex® FPGA available from Xilinx, Inc., 2100 Logic Drive, San Jose, Calif. 95124. Another type of PLD is the Complex Programmable Logic Device (“CPLD”). A CPLD includes two or more “function blocks” connected together and to input/output (“I/O”) resources by an interconnect switch matrix. Each function block of the CPLD includes a two-level AND/OR structure similar to those used in Programmable Logic Arrays (“PLAs”) and Programmable Array Logic (“PAL”) devices. Other PLDs are programmed by applying a processing layer, such as a metal layer, that programmably interconnects the various elements on the device. These PLDs are known as mask programmable devices. PLDs can also be implemented in other ways, for example, using fuse or antifuse technology. The terms “PLD” and “programmable logic device” include but are not limited to these exemplary devices, as well as encompassing devices that are only partially programmable.
For purposes of clarity, FPGAs are described below though other types of PLDs may be used. FPGAs may include one or more embedded microprocessors. For example, a microprocessor may be located in an area reserved for it, generally referred to as a “processor block.”
Floating-point operations based upon values normalized in accordance with the IEEE-754 floating-point format are well known. Generally, values are either single precision or double precision, which are 32 bits wide and 64 bits wide, respectively. A floating-point number consists of three distinct fields referred to as the sign (s), the significand (b) and the exponent (E). The value of a valid floating-point number is defined as V = (−1)^s × 2^E × (b_0.b_1b_2 … b_(p−1)), where b_0 … b_(p−1) is the binary representation of the significand (referred to as f). The value p is the number of bits in the significand.
In single precision, bit positions 0 through 22 [22:0] are for the significand, and bit positions 23 through 30 [30:23] are for the exponent field. Bit position 31 is a sign bit, which is a logic 0 for positive values or a logic 1 for negative values. Additionally, for the exponent value, a bias value of 127 is used as is known. In double precision, bit positions 0 through 51 [51:0] are for the significand, and bit positions 52 through 62 [62:52] are for the exponent field. Bit position 63 is a sign bit, which is a logic 0 for positive values or a logic 1 for negative values. Additionally, for the exponent value, a bias value of 1023 is used as is known. Thus, the format for both single and double precision is basically the same other than the number of bits and the bias.
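The field boundaries described above can be illustrated with a short sketch. It assumes Python's standard struct module to obtain the raw bit pattern; the function names decode_single and decode_double are hypothetical and chosen only for this illustration.

```python
import struct

def decode_single(x):
    """Split x, rounded to single precision, into (sign, exponent field, significand field)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                    # bit position 31
    exponent_field = (bits >> 23) & 0xFF # bit positions [30:23], biased by 127
    significand_field = bits & 0x7FFFFF  # bit positions [22:0]
    return sign, exponent_field, significand_field

def decode_double(x):
    """Split x into the corresponding double precision fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                             # bit position 63
    exponent_field = (bits >> 52) & 0x7FF         # bit positions [62:52], biased by 1023
    significand_field = bits & ((1 << 52) - 1)    # bit positions [51:0]
    return sign, exponent_field, significand_field

if __name__ == "__main__":
    print(decode_single(1.0))   # exponent field equals the bias, 127
    print(decode_double(1.0))   # exponent field equals the bias, 1023
```

Because 1.0 has an unbiased exponent of zero, its stored exponent field equals the bias itself in both formats, which makes it a convenient check of the two bias values.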
For example, the value of 9.9 in single precision has a logic 0 for the sign bit, 10000010 for the exponent field, and 1.00111100110011001100110 for the significand. The IEEE-754 standard requires that all valid floating-point numbers have a single unique representation, so a leading logic 1 is assumed as part of the significand. This floating-point representation is obtained by repeatedly dividing the original value 9.9 by 2 until the result has a leading one, namely until it is at least 1 and less than 2. The number of divisions gives the unbiased exponent. The remaining value is then multiplied by 2^23 for single precision numbers and converted to binary to get the significand. Following this procedure gives the following:
9.9/2=4.95 (unbiased exponent=1): continue
4.95/2=2.475 (unbiased exponent=2): continue
2.475/2=1.2375 (unbiased exponent=3): stop
    i. Biased exponent E=3+127=130=b10000010
    ii. Significand f=1.2375. Multiply by 2^23 to get 10380902.4, which truncates to b100111100110011001100110
Sign=0
The final single precision representation of 9.9, with the leading 1 of the significand implied rather than stored, is
S EEE EEEE E fff ffff ffff ffff ffff ffff
0 100 0001 0 001 1110 0110 0110 0110 0110
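The divide-by-two procedure for 9.9 can be sketched as follows. This is an illustration only: the helper name encode_single_by_division is hypothetical, it assumes a positive input of at least 1, it scales the remaining value by 2^23 (one per stored significand bit), and it truncates rather than applying IEEE-754 rounding, so it will not match a real encoder for every input.

```python
import struct

def encode_single_by_division(value):
    """Encode a positive value >= 1 as a 32-bit pattern by repeated division by 2."""
    sign = 0
    unbiased_exponent = 0
    remaining = value
    while remaining >= 2.0:          # stop once 1 <= remaining < 2
        remaining /= 2.0
        unbiased_exponent += 1
    biased_exponent = unbiased_exponent + 127
    significand = int(remaining * 2**23)   # 24 bits including the leading 1; truncated
    stored = significand & 0x7FFFFF        # drop the implied leading 1
    return (sign << 31) | (biased_exponent << 23) | stored

if __name__ == "__main__":
    bits = encode_single_by_division(9.9)
    (reference,) = struct.unpack(">I", struct.pack(">f", 9.9))
    print(f"{bits:032b}")
    print(bits == reference)
```

For 9.9 the truncated result happens to agree with the bit pattern produced by the struct module, because the discarded remainder is less than half of the last stored bit.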
A value which is expressed as a single digit in front of the radix, with or without some fractional remainder (or mantissa), multiplied by a base number (“base”) to a power is considered normalized form. By convention, values are stored as normalized numbers. If this is base 10, then this normalized form is known as scientific notation. Scientific notation is used in the following examples, as it is more easily followed than base 2, which is used by binary-based computers.
For example, the decimal value of 100 in scientific notation is 1×10^2. However, there are other possible expressions for the decimal value of 100. For example, 100×10^0 and 0.1×10^3 are two alternative ways of expressing the decimal value of 100. However, the expressions 100×10^0 and 0.1×10^3 are not normalized to the accepted standard format, namely a single digit in front of the radix followed by a mantissa, multiplied by a base number with an exponent. Only the value 1×10^2 conforms to this requirement and is therefore considered the unique normalized form of this number.
However, as is known, if two numbers have the same base and the base is raised to the same power, the numbers may be added (or subtracted) using the significands only. So, for example, the normalized value of 2000, namely 2×10^3, may have the significand of the non-normalized value of 20, namely 0.02×10^3, directly added to (or subtracted from) its own significand to yield 2.02×10^3 (or 1.98×10^3). On the other hand, these numbers may be multiplied (or divided) by performing the operation on the significands and then adding (or subtracting) the exponents. So, for example, 2×10^3 may be multiplied by 2×10^1 by multiplying the significands to get 2*2=4 and then adding the exponents to get 3+1=4. In normalized form this would be 4×10^4=40000. The same sort of logic applies to division.
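The decimal examples above can be checked with a small sketch. It assumes each number is held as a (significand, exponent) pair in base 10 and uses Python's decimal module to avoid binary rounding; the function names are hypothetical and serve only this illustration.

```python
from decimal import Decimal

def add_aligned(sig_a, sig_b, exp):
    """Add two significands that already share the exponent exp."""
    return sig_a + sig_b, exp

def sub_aligned(sig_a, sig_b, exp):
    """Subtract two significands that already share the exponent exp."""
    return sig_a - sig_b, exp

def multiply(sig_a, exp_a, sig_b, exp_b):
    """Multiply the significands and add the exponents."""
    return sig_a * sig_b, exp_a + exp_b

def value(sig, exp):
    """Expand a (significand, exponent) pair into a plain decimal value."""
    return sig * Decimal(10) ** exp

if __name__ == "__main__":
    print(add_aligned(Decimal("2"), Decimal("0.02"), 3))   # 2.02 x 10^3
    print(sub_aligned(Decimal("2"), Decimal("0.02"), 3))   # 1.98 x 10^3
    sig, exp = multiply(Decimal("2"), 3, Decimal("2"), 1)  # 4 x 10^4
    print(value(sig, exp))
```

Note that the addition and subtraction helpers take a single shared exponent, which reflects the requirement that the exponents match before the significands are combined, whereas multiplication accepts differing exponents.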
Heretofore, for each floating-point operation that requires matching exponents (add, subtract), normalized inputs were obtained and, if the exponents differed, one or more of the normalized inputs were “exponent adjusted” to non-normalized values so that all inputs had equivalent exponents. This adjustment (also known as “exponent alignment”) means that significand values are shifted such that all inputs have a common exponent. The floating-point operation, which may be any of a variety of known arithmetic operations, is then performed on the sign bits, the significands, and the exponents of the inputs separately to provide a sign output, a significand output, and an exponent output. The three outputs are combined and normalized back to IEEE-754 compliant normalized form with a leading implied value of 1 in front of the radix. This normalization is done prior to a subsequent floating-point operation, and for the subsequent floating-point operation, the number may be exponent adjusted once again.
Floating-point operations that did not require matching exponents (multiply, divide) were performed on their normalized inputs, and the possibly non-normalized result was then normalized back to IEEE-754 compliant form with a leading implied value of 1 in front of the radix. This normalization is done prior to a subsequent floating-point operation, and for the subsequent floating-point operation, the number may be exponent adjusted once again.
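The conventional sequence of exponent adjustment, operation, and normalization can be sketched as follows. This is a simplified model under stated assumptions, not a description of any particular FPU: values are positive only, each is a hypothetical (significand, exponent) pair whose significand is a 24-bit integer with the leading 1 at bit 23, shifted-out bits are truncated, and special values are ignored.

```python
WIDTH = 24  # significand bits including the leading 1 (single precision)

def align(sig_a, exp_a, sig_b, exp_b):
    """Exponent adjustment: shift the smaller-exponent significand right."""
    if exp_a >= exp_b:
        return sig_a, sig_b >> (exp_a - exp_b), exp_a
    return sig_a >> (exp_b - exp_a), sig_b, exp_b

def normalize(sig, exp):
    """Shift until the leading 1 sits at bit WIDTH-1, adjusting the exponent."""
    if sig == 0:
        return 0, 0
    while sig >= (1 << WIDTH):
        sig >>= 1
        exp += 1
    while sig < (1 << (WIDTH - 1)):
        sig <<= 1
        exp -= 1
    return sig, exp

def add(a, b):
    """Align, add the significands, then normalize the result."""
    sig_a, sig_b, exp = align(*a, *b)
    return normalize(sig_a + sig_b, exp)

def multiply(a, b):
    """No alignment needed: multiply significands, add exponents, normalize."""
    (sig_a, exp_a), (sig_b, exp_b) = a, b
    return normalize((sig_a * sig_b) >> (WIDTH - 1), exp_a + exp_b)

def to_float(sig, exp):
    """Expand a pair into a Python float for inspection."""
    return sig * 2.0 ** (exp - (WIDTH - 1))

if __name__ == "__main__":
    one_and_half = (0xC00000, 0)   # 1.5
    half = (0x800000, -1)          # 0.5
    print(to_float(*add(one_and_half, half)))               # 2.0
    print(to_float(*multiply(one_and_half, one_and_half)))  # 2.25
```

In this model every call to add or multiply ends with normalize, and a following add begins with align again, which mirrors the repeated adjust-and-normalize cost described above.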
Current processor-based architectures use this set of three phases, namely exponent adjustment of input if needed, floating-point operation, and normalization of output if required, for each floating-point operation. Conventionally, floating-point operations are processed by a general-purpose floating-point processing unit (“FPU”). For a conventional general-purpose FPU, each value input to the FPU is in a normalized form and each output from the FPU is in a normalized form, and the normalized form for each input and each output is IEEE-754 compliant notation for each floating-point operation performed.
When instantiating circuitry for floating-point operations in programmable logic, conventionally a single general-purpose FPU core is used to process all floating point operations. However, such an FPU core may include unused functionality, and thus significant programmable logic resource overhead consumed by instantiation of such an FPU core may otherwise go unused. Also, the FPU core can create a computational bottleneck in the potentially parallel compute fabric as well as routing congestion to connect all of the disparate floating point operations to this single FPU core.
Accordingly, it would be desirable and useful to provide an FPU that involved less overhead, was not a computational bottleneck and did not require large amounts of connectivity to separate spatial locations.