1. Field of the Invention
The present invention relates to a Single Instruction Multiple Data (SIMD) arithmetic processing apparatus and method, and to an arithmetic processing unit and a compiler used with the SIMD arithmetic processor, each of which obtains the calculation error of a floating-point operation.
2. Description of Related Art
A “double-double” operation is known as a high-precision computation method for a double precision arithmetic processor. The double-double operation is a floating-point operation that represents each value with two 64-bit double precision words: a high word (the MSB side) and a low word (the LSB side). Using the two words together achieves 106-bit accuracy, that is, twice the 53-bit significand of a single double precision word. “MSB” means the “Most Significant Bit”, and “LSB” means the “Least Significant Bit”.
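The two-word representation described above can be illustrated with a minimal sketch. The helper name to_double_double and the choice of 1/3 as a test value are assumptions made only for this illustration; exact rational arithmetic is used solely to measure the representation error.

```python
from fractions import Fraction

def to_double_double(value: Fraction):
    """Split an exact rational into a (hi, lo) pair of doubles (illustrative helper)."""
    hi = float(value)                  # nearest double: the high (MSB-side) word
    lo = float(value - Fraction(hi))   # nearest double to the remainder: the low (LSB-side) word
    return hi, lo

third = Fraction(1, 3)
hi, lo = to_double_double(third)

# Error when only one double precision word is used.
err_double = abs(third - Fraction(hi))
# Error when the high and low words are used together.
err_dd = abs(third - (Fraction(hi) + Fraction(lo)))

print(err_double > Fraction(1, 2 ** 57))   # True
print(err_dd < Fraction(1, 2 ** 106))      # True
```

In this sketch the single word leaves an error near 2^-56, while the pair reduces it below 2^-106, which is consistent with the accuracy stated above.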
Double-double addition is expressed by the formulas below:
(c.hi, c.lo) = (a.hi, a.lo) + (b.hi, b.lo)
a.hi + b.hi = fl(a.hi + b.hi) + err(a.hi + b.hi) = fl.hi + err.hi
a.lo + b.lo = fl(a.lo + b.lo) + err(a.lo + b.lo) = fl.lo + err.lo
Here, fl(op(A)) denotes the normalized (rounded) value of op(A), and err(op(A)) denotes the computing error of op(A).
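As a hedged illustration of these two quantities, the sketch below computes fl(a+b) with ordinary double precision addition and err(a+b) with exact rational arithmetic. The function name fl_and_err and the sample inputs are assumptions for this example; a hardware implementation would not use rationals.

```python
from fractions import Fraction

def fl_and_err(a: float, b: float):
    """Return (fl(a+b), err(a+b)); err is computed exactly with rationals."""
    fl = a + b                              # rounded double precision sum
    exact = Fraction(a) + Fraction(b)       # exact sum of the two inputs
    err = exact - Fraction(fl)              # what the rounding discarded
    return fl, err

fl, err = fl_and_err(1.0, 2.0 ** -60)
print(fl)                             # 1.0
print(err == Fraction(1, 2 ** 60))    # True
```

Here the small addend is lost entirely in fl(a+b) and reappears as err(a+b), so fl plus err reproduces the exact sum.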
To execute the double-double addition, the roundoff error of each double precision addition result must be calculated. Dekker's algorithm and Knuth's algorithm are known methods for calculating this roundoff error.
Knuth's algorithm includes the six instructions below.
x ← a + b
bvirtual ← x − a
avirtual ← x − bvirtual
broundoff ← b − bvirtual
aroundoff ← a − avirtual
y ← aroundoff + broundoff
Here, the computing error y is obtained from the two inputs a and b, where “+” denotes addition and “−” denotes subtraction. Knuth's algorithm, however, has a drawback in that many operations are required to calculate the computing error.
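The six instructions can be sketched directly in Python, whose float type is an IEEE double on common platforms. The function dd_add is an assumption added for illustration: it is one possible way to combine the high and low sums of the double-double formulas using only this six-instruction sequence, and it is not taken from the source. Operands are passed as (hi, lo) tuples.

```python
from fractions import Fraction

def two_sum(a: float, b: float):
    """Knuth's six-instruction sequence: returns (x, y) with x = fl(a+b), y = err(a+b)."""
    x = a + b
    b_virtual = x - a
    a_virtual = x - b_virtual
    b_roundoff = b - b_virtual
    a_roundoff = a - a_virtual
    y = a_roundoff + b_roundoff
    return x, y

def dd_add(a, b):
    """Illustrative double-double addition built only on two_sum (an assumed scheme)."""
    a_hi, a_lo = a
    b_hi, b_lo = b
    fl_hi, err_hi = two_sum(a_hi, b_hi)      # a.hi + b.hi = fl.hi + err.hi
    fl_lo, err_lo = two_sum(a_lo, b_lo)      # a.lo + b.lo = fl.lo + err.lo
    err_hi += fl_lo                          # fold the low sum into the high error
    fl_hi, err_hi = two_sum(fl_hi, err_hi)   # renormalize into (hi, lo)
    err_hi += err_lo                         # fold in the low error
    return two_sum(fl_hi, err_hi)            # final (c.hi, c.lo)

x, y = two_sum(0.1, 0.2)
print(Fraction(x) + Fraction(y) == Fraction(0.1) + Fraction(0.2))   # True

c = dd_add((1.0, 2.0 ** -60), (2.0 ** -80, 0.0))
print(c == (1.0, 2.0 ** -60 + 2.0 ** -80))                          # True
```

The first check confirms that x and y together carry the exact sum. The second shows a double-double result whose low word keeps bits that a single double precision sum would discard.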
On the other hand, Dekker's algorithm may be operated with three instructions as shown below.
x ← a + b
bvirtual ← x − a
y ← b − bvirtual
It is assumed that |a|>|b| in Dekker's algorithm; Knuth's algorithm requires no such ordering of the inputs.
As such, Dekker's algorithm may have an advantage over Knuth's algorithm in that it requires fewer operations to calculate the computing error.
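The three-instruction sequence, and its dependence on the magnitude ordering of the inputs, can be checked with a short sketch. The wrapper fast_two_sum_checked and the helper is_exact are hypothetical names introduced for this illustration; the swap in the wrapper is one assumed way to satisfy the ordering in software.

```python
from fractions import Fraction

def fast_two_sum(a: float, b: float):
    """Dekker's three-instruction sequence; exact only when |a| is not smaller than |b|."""
    x = a + b
    b_virtual = x - a
    y = b - b_virtual
    return x, y

def fast_two_sum_checked(a: float, b: float):
    """Hypothetical wrapper: put the larger-magnitude operand first, then apply Dekker."""
    if abs(a) < abs(b):
        a, b = b, a
    return fast_two_sum(a, b)

def is_exact(a, b, x, y):
    """True when x + y equals a + b exactly (verified with rationals)."""
    return Fraction(x) + Fraction(y) == Fraction(a) + Fraction(b)

big, small = 1.0, 2.0 ** -60

print(is_exact(big, small, *fast_two_sum(big, small)))           # True
print(is_exact(small, big, *fast_two_sum(small, big)))           # False
print(is_exact(small, big, *fast_two_sum_checked(small, big)))   # True
```

With the operands in the wrong order the error term comes out as zero and the small addend is lost. This is why the fewer-operations advantage holds only when the ordering is known or enforced by a comparison.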
As related art, Patent Document 1 discloses a SIMD processor that includes a plurality of processor elements (PEs) for processing a plurality of data. Each processor element of the SIMD processor of Patent Document 1 includes a plurality of comparing elements and a plurality of arithmetic registers, each of which is connected to a corresponding one of the comparing elements. Each comparing element compares the value of its arithmetic register with a single immediate. The immediate is a constant value included in the command being executed by the processor, and it is used as soon as the command is fetched. The result of a logic operation on the individual comparison results is stored in a condition register, which is used to control whether or not each processor element executes an operation.
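The condition-register mechanism described above can be modeled in software as a rough sketch. Everything specific here is an assumption for illustration and is not taken from Patent Document 1: the class and function names, the use of a greater-than comparison, the choice of logical AND as the logic operation, and the accumulator field.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ProcessorElement:
    registers: List[int]       # arithmetic registers, one per comparing element
    condition: bool = False    # condition register that gates execution
    accumulator: int = 0       # illustrative result storage (assumed)

def compare_all(pes: List[ProcessorElement], immediate: int) -> None:
    """Each comparing element tests its register against the single immediate.
    The per-register results are combined with AND (an assumed logic operation)."""
    for pe in pes:
        results = [reg > immediate for reg in pe.registers]
        pe.condition = all(results)

def masked_add(pes: List[ProcessorElement], operand: int) -> None:
    """Apply the same instruction to every PE, but only where the condition is set."""
    for pe in pes:
        if pe.condition:
            pe.accumulator += operand

pes = [ProcessorElement([5, 9]), ProcessorElement([1, 9]), ProcessorElement([7, 8])]
compare_all(pes, 4)      # immediate value 4 carried by the command
masked_add(pes, 10)
print([pe.condition for pe in pes])     # [True, False, True]
print([pe.accumulator for pe in pes])   # [10, 0, 10]
```

The second processor element skips the addition because one of its registers fails the comparison, which mirrors the per-element execution control described in the text.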
Patent Document 2 discloses a basic operation element of a SIMD-type parallel data operation apparatus.
[Non-Patent Document 1] Jonathan Richard Shewchuk, “Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates”, School of Computer Science Carnegie Mellon University Pittsburgh, Pa. 15213
[Patent Document 1] Japanese Patent Laid-Open No. 2004-192405
[Patent Document 2] Japanese Patent Laid-Open No. Hei-7-060430