1. Field of the Invention
The present invention relates to the field of data processing. In particular, the invention relates to an apparatus and method for performing a convert-to-integer operation for converting a floating-point value to a rounded two's complement integer value.
2. Description of the Prior Art
A data processing apparatus may represent numbers in different ways. An integral data value can only represent integers, since all of its bits represent integer values and the radix point is positioned to the right of all the bits. A fixed-point data value is assumed to have a radix point at a fixed location, so that bits to the left of the radix point represent integer values and bits to the right of the radix point represent fractional values. In both an integral data value and a fixed-point data value, the position of the radix point (also known as a binary point) is fixed, so it is not necessary to encode the position of the radix point in the data value itself.
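As an illustration of a fixed radix point, the following minimal sketch reads one bit pattern under two assumed radix positions. The function name fixed_point_value and the choice of an unsigned 8-bit pattern are hypothetical and are used only for this example.

```python
def fixed_point_value(bits: int, frac_bits: int) -> float:
    """Interpret an unsigned bit pattern with frac_bits bits right of the radix point."""
    # Moving the radix point left by frac_bits places divides by 2**frac_bits.
    return bits / (1 << frac_bits)


pattern = 0b01011010

# Integral interpretation: radix point to the right of all the bits.
as_integer = fixed_point_value(pattern, 0)

# Fixed-point interpretation with 4 fractional bits: 0101.1010
as_fixed = fixed_point_value(pattern, 4)
```

The radix position is supplied by the caller rather than stored in the pattern, which mirrors the point that neither representation encodes it in the data value.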
On the other hand, in a floating-point representation, the radix point may float left and right within the data value. A floating-point value is represented using a significand and an exponent, with the significand representing the significant digits of the floating-point number and the exponent representing the position of the radix point relative to the significand. For a given number of bits, the floating-point representation is able to represent a wider range of numbers than the integral or fixed-point representation. However, the extra range is achieved at the expense of reduced precision since some of the bits are used to store the exponent, and so fewer bits are available for the significand.
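The range and precision trade-off can be seen with 32-bit values. The sketch below assumes the IEEE 754 binary32 format (the text does not name a specific format) and uses a hypothetical helper to_float32 that rounds a Python float to binary32 and back.

```python
import struct


def to_float32(x: float) -> float:
    """Round x to the nearest binary32 value and return it as a Python float."""
    (y,) = struct.unpack(">f", struct.pack(">f", x))
    return y


INT32_MAX = 2**31 - 1  # largest 32-bit two's complement integer

# Range: binary32 holds values far beyond the 32-bit integer range.
big = to_float32(2.0**100)

# Precision: 2**24 + 1 fits in a 32-bit integer but needs 25 significant
# bits, one more than binary32 provides, so it is stored as 16777216.0.
odd = to_float32(16777217.0)
```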
Negative numbers are represented in a different way in a floating-point representation compared to an integral or fixed-point representation. In a floating-point representation, negative numbers are represented in a sign-magnitude form. A floating-point value has a sign bit which represents whether the floating-point number is positive or negative. The remaining bits representing the significand and the exponent then represent the magnitude of the value. That is, a floating-point value with sign bit S, exponent exp and significand f corresponds to a numeric value of $N = (-1)^{S} \times 2^{exp} \times \left(1 + \sum_{i=1}^{n} f[i] \times 2^{-i}\right)$, where n is the number of fractional bits of the significand and $f[i] \in \{0, 1\}$ is the i-th most significant fractional bit of the significand. The exponent exp is the unbiased or true exponent of the floating-point value (in some representations the exponent may be biased so that the true exponent exp is obtained by subtracting a bias from the exponent value E of the floating-point value).
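The formula can be checked against a concrete encoding. The sketch below assumes IEEE 754 binary32 (1 sign bit, 8 exponent bits with a bias of 127, n = 23 fraction bits) and handles normal values only; zeros, subnormals, infinities and NaNs are outside its scope. The names decode_float32 and evaluate are hypothetical.

```python
import struct


def decode_float32(x: float):
    """Split a binary32 value into (S, unbiased exp, fraction bits f)."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31
    biased_exp = (raw >> 23) & 0xFF   # exponent value E as stored
    fraction = raw & 0x7FFFFF         # the 23 fraction bits f[1..23]
    return sign, biased_exp - 127, fraction  # true exponent = E - bias


def evaluate(sign: int, exp: int, fraction: int, n: int = 23) -> float:
    """Evaluate N = (-1)^S * 2^exp * (1 + sum of f[i] * 2^-i), i = 1..n."""
    significand = 1.0
    for i in range(1, n + 1):
        f_i = (fraction >> (n - i)) & 1  # i-th most significant fraction bit
        significand += f_i * 2.0 ** -i
    return (-1) ** sign * 2.0 ** exp * significand


fields = decode_float32(-6.5)   # -6.5 = -(1.625 * 2^2)
value = evaluate(*fields)
```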
Hence, for a floating-point value, all the bits of the significand represent positive values, with the magnitude indicated by the significand being multiplied by $2^{exp}$. Whether the number is positive or negative is indicated by the sign bit. Therefore, positive and negative numbers having the same magnitude have the same significand and exponent irrespective of whether the sign bit indicates a positive or negative number.
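A short check of this property, again assuming IEEE 754 binary32 as the encoding: the bit patterns of +3.5 and −3.5 should differ only in the sign bit. The helper float32_bits is a hypothetical name.

```python
import struct


def float32_bits(x: float) -> int:
    """Return the 32-bit encoding of x as an unsigned integer."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return raw


pos = float32_bits(3.5)
neg = float32_bits(-3.5)

# XOR leaves a 1 wherever the two encodings differ.
differing_bits = pos ^ neg
```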
On the other hand, integral data values and fixed-point data values use two's complement representation to represent negative numbers. In two's complement representation, the most significant bit of the value represents a negative value, with all the other bits representing positive values, so that a two's complement number is considered to be a positive number if the most significant bit is 0 and to be negative if the most significant bit is 1. For an 8-bit two's complement value, the most negative number that can be represented is therefore 0b10000000 (−128), and the most positive number that can be represented is 0b01111111 (+127). To convert a positive number into a negative number of the same magnitude (i.e. to determine the two's complement of the positive number), all the bits of the positive number are inverted and a bit value of 1 is added to the least significant bit of the inverted value. For example, to convert 0b01011001 (+89) to its two's complement (−89), the bits are inverted to give 0b10100110 (−90) and one is added to the inverted value to give 0b10100111 (−89).
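The invert-and-add-one procedure can be sketched as follows for an 8-bit width. The function names negate_twos_complement and to_signed are hypothetical; the values are the ones used in the paragraph above.

```python
def negate_twos_complement(bits: int, width: int = 8) -> int:
    """Negate a two's complement bit pattern: invert all bits, then add 1."""
    mask = (1 << width) - 1
    inverted = ~bits & mask          # invert every bit within the width
    return (inverted + 1) & mask     # add 1 at the least significant bit


def to_signed(bits: int, width: int = 8) -> int:
    """Read a bit pattern as a signed two's complement number."""
    sign_bit = 1 << (width - 1)
    # The most significant bit carries a weight of -2**(width - 1).
    return bits - (1 << width) if bits & sign_bit else bits


plus_89 = 0b01011001
minus_89 = negate_twos_complement(plus_89)
```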
Hence, there are different ways in which numbers can be represented, and so it may be desirable to convert between different representations. For example, a floating-point value may be converted to an integral value or to a fixed-point value in two's complement form. Also, it may be desirable to round the fractional part of the floating-point number to an integer value in either the integral or fixed-point form. Different rounding techniques may be used to determine which of two adjacent integers a given fractional value should be rounded to. One such rounding mode is the round to nearest, ties away from zero, rounding mode (RNA rounding) in which a fractional value lying between two adjacent integers is rounded to the nearest of the adjacent integers, with a value lying half way between two integers being rounded away from zero. For example, a value of 2.2 would be rounded to the nearest integer value of 2, a value of 2.5 halfway between 2 and 3 would be rounded to an integer value of 3 (away from zero) and a value of −3.5 halfway between −3 and −4 would be rounded to an integer value of −4 (again, away from zero).
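For reference, the behaviour of RNA rounding followed by a two's complement encoding can be modelled in software as below. This is only a behavioural sketch for finite inputs, not the apparatus or method of the invention. The names round_rna and to_twos_complement are hypothetical, and raising an error on out-of-range results is an assumption, since the text does not specify how overflow is handled.

```python
import math


def round_rna(x: float) -> int:
    """Round to nearest integer, with ties rounded away from zero."""
    magnitude = abs(x)
    whole = math.floor(magnitude)
    # Compare the fractional part directly rather than adding 0.5 first,
    # so that values just below a tie are not pushed up by rounding error.
    if magnitude - whole >= 0.5:
        whole += 1
    return -whole if x < 0 else whole


def to_twos_complement(value: int, width: int = 8) -> int:
    """Encode a signed integer as a width-bit two's complement pattern."""
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise OverflowError("value does not fit in the given width")  # assumed policy
    return value & ((1 << width) - 1)


examples = [round_rna(2.2), round_rna(2.5), round_rna(-3.5)]
encoded = to_twos_complement(round_rna(-3.5))
```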
Hence, the present technique seeks to provide an efficient way of performing a convert-to-integer operation for converting a floating-point value to a rounded two's complement integer value, where RNA rounding is performed.