In computing, a floating-point number is generally an approximation of a real number, represented in a way that can support a wide range of values. These numbers are, in general, represented approximately to a fixed number of significant digits and scaled using an exponent. The term “floating point” refers to the fact that a number's radix point (e.g., decimal point or, more commonly in computers, binary point) can “float”; that is, it can be placed anywhere relative to the significant digits of the number. This position is indicated by the exponent component of the internal representation, and floating point can thus be thought of as a computer realization of scientific notation (e.g., 1.234×10³ versus 1,234, etc.).
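As a minimal illustrative sketch of the scientific-notation analogy above, a value may be split into a significand and a power-of-2 exponent using Python's standard `math.frexp`, and rebuilt with `math.ldexp`. The helper name `decompose` is hypothetical and is used here only for illustration.

```python
import math


def decompose(value: float) -> tuple[float, int]:
    """Split value into (mantissa, exponent) so that value == mantissa * 2**exponent.

    `decompose` is a hypothetical helper name. math.frexp returns a mantissa
    whose magnitude lies in [0.5, 1) for any nonzero finite input.
    """
    return math.frexp(value)


mantissa, exponent = decompose(1234.0)

# math.ldexp performs the inverse operation: mantissa * 2**exponent.
rebuilt = math.ldexp(mantissa, exponent)
print(mantissa, exponent, rebuilt)
```

Here the exponent plays the role of the power of ten in 1.234×10³, except that the base is 2 rather than 10.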
The Institute of Electrical and Electronics Engineers (IEEE) Standard for Floating-Point Arithmetic (IEEE 754) is a technical standard for floating-point computation that was established by the IEEE in 1985. Many hardware floating-point units or circuits are substantially compliant with the IEEE 754 standard. Herein, the term “IEEE 754” refers to standards substantially compliant with the IEEE Standard for Floating-Point Arithmetic, IEEE Std. 754-2008 (29 Aug. 2008) or standards derived from or preceding that standard.
The IEEE 754 standard allows for various degrees of precision. The two more common levels of precision are 32-bit (single) and 64-bit (double) precision. The 32-bit version of a floating-point number includes a 1-bit sign portion (that indicates whether the number is positive or negative), an 8-bit exponent portion (that indicates the power of 2 by which the number is scaled, and thus where the radix point is located), and a 23-bit fraction, significand, or mantissa portion (that indicates the significant digits that are to be multiplied by 2 raised to the power of the exponent portion). The 64-bit version includes a 1-bit sign portion, an 11-bit exponent portion, and a 52-bit fraction portion. It is understood that the above are merely a few illustrative examples to which the disclosed subject matter is not limited.
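The field layouts described above may be illustrated with the following sketch, which extracts the sign, exponent, and fraction portions from the 32-bit and 64-bit encodings using Python's standard `struct` module. The function names `float32_fields` and `float64_fields` are hypothetical. Note that, in IEEE 754, the stored exponent is biased (by 127 for single precision and 1023 for double precision), so the sketch returns the raw stored field rather than the true power of 2.

```python
import struct


def float32_fields(value: float) -> tuple[int, int, int]:
    """Return (sign, stored_exponent, fraction) of value encoded as a 32-bit float.

    Hypothetical helper: packs the value as big-endian single precision and
    reinterprets the same 4 bytes as an unsigned 32-bit integer.
    """
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    sign = bits >> 31                 # 1-bit sign
    exponent = (bits >> 23) & 0xFF    # 8-bit exponent, biased by 127
    fraction = bits & 0x7FFFFF        # 23-bit fraction
    return sign, exponent, fraction


def float64_fields(value: float) -> tuple[int, int, int]:
    """Return (sign, stored_exponent, fraction) of value encoded as a 64-bit float."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    sign = bits >> 63                     # 1-bit sign
    exponent = (bits >> 52) & 0x7FF       # 11-bit exponent, biased by 1023
    fraction = bits & ((1 << 52) - 1)     # 52-bit fraction
    return sign, exponent, fraction


print(float32_fields(-2.5))
print(float64_fields(1.0))
```

For example, −2.5 equals −1.25×2¹, so its single-precision encoding has a sign of 1, a stored exponent of 1 + 127 = 128, and a fraction representing the 0.25 that follows the implicit leading 1.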
In contrast, an integer is generally a number that may be written without a fractional or decimal component (e.g., 21, 4, −2048, etc.). This may be thought of as being comparable to the mantissa portion of a floating-point number. In computer science, the size or range of an integer may be limited by the number of digits or bits used to represent the value (e.g., an unsigned 8-bit integer may represent 0 to 255, an unsigned 16-bit integer may represent 0 to 65,535, etc.). In general, an integer may be signed or unsigned. In such a system, unsigned integers are understood to include only non-negative numbers (e.g., 0 to 255, etc.), whereas signed integers generally shift the range of representable values such that the integer may be either positive or negative (e.g., −128 to 127, −32,768 to 32,767, etc.).
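The ranges given above may be computed from the bit width, as in the following sketch. It assumes a two's-complement signed representation, which is consistent with the example ranges above but is not the only possible scheme; the function names `unsigned_range` and `signed_range` are hypothetical.

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """Return (min, max) for an unsigned integer of the given bit width."""
    return 0, (1 << bits) - 1


def signed_range(bits: int) -> tuple[int, int]:
    """Return (min, max) for a signed integer of the given bit width.

    Assumes two's complement: half of the bit patterns encode negative values.
    """
    half = 1 << (bits - 1)
    return -half, half - 1


for width in (8, 16):
    print(width, unsigned_range(width), signed_range(width))
```

In both cases the number of distinct representable values is the same (2 raised to the bit width); only the interval they cover differs.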