In modern digital computers, the conversion of decimal numerals to a binary representation is not always exact. By performing arithmetic operations using longer binary string mappings of decimal numerals, this imprecision can be reduced, but not eliminated.
The imprecision of binary representations creates problems when two decimal numerals are compared and an exact comparison is desired. Correct results from such comparisons cannot be guaranteed, and incorrect results are quite possible if the compared numerals are very close in value to each other. In digital computing hardware, decimal numerals are represented by finite strings of binary bits, referred to as floating point numbers.
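By way of illustration only, the following minimal Python sketch shows an exact comparison of two nearly equal values producing a result that differs from decimal arithmetic. It assumes IEEE 754 double precision arithmetic, which is what the standard Python float uses on common platforms; the variable names are illustrative and do not come from any cited work.

```python
from decimal import Decimal

lhs = 0.1 + 0.2   # each operand is rounded to binary before the addition
rhs = 0.3         # rounded to binary directly

# In decimal arithmetic the two sides are equal; as binary floating
# point numbers (IEEE 754 doubles assumed) they are not.
exactly_equal = (lhs == rhs)
print(exactly_equal)   # False

# Decimal(x) displays the exact value actually stored for a float x.
print(Decimal(lhs))
print(Decimal(rhs))
```

The two printed decimal expansions differ only from about the seventeenth significant digit onwards, which is sufficient for the exact comparison to fail.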
The set F of floating point numbers that can be represented on a digital computer is not a continuum, or even an infinite set. In fact, the total number of floating point numbers in F can be calculated if machine details are available. Unfortunately, these numbers are not even equally spaced in F. Thus, there is no possibility of representing the continuum of real numbers in any detail. Indeed, each number in F has to represent a whole interval of real numbers. Moreover, real numbers larger in absolute value than the largest number in F cannot be said to be represented at all. And, for many purposes, the same is true of non-zero real numbers smaller in magnitude than the smallest positive number in F.
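The finiteness and uneven spacing of F can be made concrete with a deliberately tiny system. The following Python sketch enumerates every number of an assumed normalized floating point system having base beta, t digits of mantissa, and exponents from L to U inclusive. The parameter names and the toy values (beta = 2, t = 3, L = -1, U = 2) are chosen here for illustration only and do not describe any real machine.

```python
from fractions import Fraction

def toy_floats(beta=2, t=3, L=-1, U=2):
    """Enumerate a toy normalized system: +/- (m / beta**t) * beta**e,
    with the leading mantissa digit non-zero, plus zero itself."""
    values = {Fraction(0)}
    for e in range(L, U + 1):
        # Normalized mantissas: beta**(t-1) <= m < beta**t.
        for m in range(beta ** (t - 1), beta ** t):
            x = Fraction(m, beta ** t) * Fraction(beta) ** e
            values.update((x, -x))
    return sorted(values)

F = toy_floats()
count = len(F)

# Closed-form count for such a normalized system (including zero).
formula = 2 * (2 - 1) * 2 ** (3 - 1) * (2 - (-1) + 1) + 1

positives = [x for x in F if x > 0]
gaps = [b - a for a, b in zip(positives, positives[1:])]

print(count, formula)                 # 33 33
print(positives[0], positives[-1])    # 1/4 7/2
print(min(gaps), max(gaps))           # 1/16 1/2
```

In this toy system the gap between neighbouring numbers grows from 1/16 near the smallest positive number to 1/2 near the largest. No real number above 7/2 in magnitude, and no non-zero real number below 1/4 in magnitude, has a nearby member of F to represent it.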
Forsythe et al (Forsythe, G E, Malcolm, M A, and Moler, C B, Computer Methods for Mathematical Computations, Prentice-Hall, Inc., New Jersey, 1977) state at page 10: “The badly named real number system underlies the calculus and higher analysis to such an extent that we may forget how impossible it is to represent all real numbers in the real world of finite computers. However, as much as the real number system simplifies analysis, practical computing must do without it”. As a simple example, Forsythe et al indicate at page 12 that the floating-point number 0.1 summed 10 times does not result in 1 in a floating-point number system for which the number base is a power of 2, because 1/10 does not have a terminating expansion in powers of ½.
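The summation example can be reproduced with the following short Python sketch. It assumes IEEE 754 double precision arithmetic (base 2), and the variable names are illustrative only.

```python
from fractions import Fraction

total = 0.0
for _ in range(10):
    total += 0.1      # each addition is rounded to the nearest double

print(total == 1.0)   # False
print(total)          # 0.9999999999999999

# Fraction(x) recovers the exact rational value stored for a float x.
stored = Fraction(0.1)
print(stored == Fraction(1, 10))       # False: 1/10 is not stored exactly
print(stored.denominator == 2 ** 55)   # True: the stored value is a dyadic fraction
```

The stored value is the ratio of an integer to a power of 2, so it can only approximate 1/10, and the approximation errors accumulate over the ten additions.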
Forsythe et al further state at page 13 that: “The operations of floating-point addition and multiplication are commutative but not associative, and the distributive law also fails for them. Since these algebraic laws are fundamental to mathematical analysis, analyses of floating-point computations are difficult”.
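The failure of these algebraic laws can be observed directly. The following Python sketch, which assumes IEEE 754 double precision arithmetic with the default round-to-nearest mode, uses operand values chosen here purely for illustration.

```python
# Commutativity holds: the order of the two operands does not matter.
a, b, c = 1e16, -1e16, 1.0
commutes = (a + b == b + a) and (a * c == c * a)
print(commutes)            # True

# Associativity fails: the grouping changes the rounded result.
left = (a + b) + c         # 0.0 + 1.0
right = a + (b + c)        # b + c rounds back to -1e16
print(left, right)         # 1.0 0.0

# The distributive law fails as well.
x, y, z = 100.0, 0.1, 0.2
factored = x * (y + z)
expanded = x * y + x * z
print(factored == expanded)   # False
```

In the associativity case, adding 1.0 to a value of magnitude 1e16 is lost entirely, because neighbouring floating point numbers at that magnitude are 2 apart.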
Higher precision can be provided by using a larger number of binary bits, typically some integer multiple (2 or 4) of the number of bits used for a single precision floating point number. In such cases, each precision level is usually treated as a separate data type, and rules of interaction with other precision levels are coded into the compiler. For example, many compilers have provisions for single and double precision levels of floating point numbers, and also rules for the handling of mixed operations between single and double precision floating point numbers.
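The interaction between precision levels can be sketched as follows. Python has no built-in single precision type, so the helper to_single below (a hypothetical name introduced for this illustration) uses the standard struct module to round a double to the nearest 32-bit value and widen it back, which mimics what a compiler does when a single precision operand is promoted in a mixed operation. IEEE 754 single and double formats are assumed.

```python
import struct

def to_single(x: float) -> float:
    """Round x to the nearest 32-bit float, returned widened to a double."""
    return struct.unpack("f", struct.pack("f", x))[0]

single = to_single(0.1)   # 0.1 rounded to 24 significant bits
double = 0.1              # 0.1 rounded to 53 significant bits

# Promotion to double is exact, but it cannot restore the bits
# that were lost when the value was first rounded to single.
print(single == double)          # False
print(single)                    # 0.10000000149011612

# A value that is exact in both formats compares equal.
print(to_single(0.5) == 0.5)     # True
```

The same decimal numeral therefore maps to different floating point numbers at different precision levels, and a mixed comparison between them can fail even though both operands were written as 0.1.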
In view of the above observations, a need clearly exists for an improved manner of dealing with decimal numerals in arithmetic and relational operations in digital computers.