The term "parallel processing system" is used herein to describe a system in which a plurality of independent, interconnected arithmetical-logical processing elements operate in parallel to perform a multiplicity of processing functions. The processing elements in the system are, typically, substantially identical to one another. In one type of parallel processing system, known in the art as a single instruction multiple data (SIMD) system, a single sequence of instructions is provided to all processing elements. That is, all elements simultaneously receive and perform operations in accordance with the same sequence of instructions. However, each element may perform the operations dictated by the instructions on a different set of data.
The individual processing elements of a SIMD parallel processing system typically have dedicated memories which may be loaded with data on which instructed operations can be performed. Also, each element can perform operations on data transmitted to it from another element, e.g., an adjacent element. Thus, the operations performed by the elements are flexible to the extent that the data upon which each element operates can be varied. However, all processing elements must perform the same operations in accordance with the instructions. For example, in the case of instructed arithmetic operations, one element cannot be instructed to perform addition while another is instructed to perform subtraction.
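The lockstep behavior described above can be illustrated with a minimal Python sketch. It is only a software model under stated assumptions: the function names `broadcast` and `shift_from_left_neighbor` are hypothetical, each list index stands for one processing element, and the list entries stand for that element's dedicated memory.

```python
import operator


def broadcast(op, local_a, local_b):
    """Model one broadcast instruction: every element applies the same
    operation `op`, each to the operands held in its own local memory."""
    return [op(a, b) for a, b in zip(local_a, local_b)]


def shift_from_left_neighbor(values, fill=0):
    """Model a neighbor transfer: each element receives the value held by
    the element to its left; the leftmost element receives `fill`."""
    return [fill] + values[:-1]


# Four hypothetical elements, each holding different operands.
local_a = [1, 2, 3, 4]
local_b = [10, 20, 30, 40]

# One instruction (addition) is performed by all elements on different data.
sums = broadcast(operator.add, local_a, local_b)

# Each element then operates on data transmitted from an adjacent element.
from_left = shift_from_left_neighbor(sums)
diffs = broadcast(operator.sub, sums, from_left)
```

In this model there is no way to pass `operator.add` to one element and `operator.sub` to another within a single `broadcast` call, which mirrors the restriction stated above.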
SIMD parallel processing systems may include the feature that a processing element can be conditioned to perform no operation at all in response to an instruction. Such a feature adds flexibility to system operation. For example, this feature facilitates the performance of calculations in which a series of arithmetic operations is performed, the nature of each operation being determined by the result of a previous operation. With the processing elements operating on different data, different elements will need to perform different arithmetic operations. It is, therefore, not possible to prescribe a single sequence of instructions valid for all elements. In such a case, when an element is instructed to perform an inappropriate operation, it will instead perform no operation.
One example of such a calculation is the nonrestoring division method for binary numbers. One form of this method is generally described in "Digital Computer Arithmetic Design and Implementation" by Joseph J. F. Cavanagh, McGraw-Hill, 1984, pp. 252-258, incorporated herein by reference. Briefly, in the two's complement number system, the dividend and divisor are appropriately aligned, the dividend left-shifted one bit position and the divisor subtracted therefrom. The sign of the subtraction result determines whether the next operational step should be the addition or subtraction of the divisor to or from the result, after the result is shifted one position left. A quotient bit is derived from the sign of the result at each step. Thus, the result at each calculation step determines the type of operation (i.e., addition or subtraction) to be performed at the next step. If nonrestoring division is performed in a parallel processing system, with each element operating on different data, there is no single sequence of addition and subtraction instructions valid for all elements.
As discussed above, one solution to this problem is to condition each processing element to perform no operation if an instruction is inappropriate. For example, in nonrestoring division, an alternating sequence of addition and subtraction instructions is applied to the SIMD parallel processing system. For each instruction, only those processing elements for which the instructed operation is appropriate perform the operation. The balance of the elements perform no operation and wait for the next instruction. This solution results in inefficient system operation since twice as many addition and subtraction instructions are required as are necessary.
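The cost of this approach can be made concrete with a small Python simulation. It is a sketch under several assumptions: operands are non-negative with positive divisors, one dividend bit is brought in per step, a final correction makes the remainder non-negative, and the name `simd_masked_divide` is hypothetical. Each list index models one processing element.

```python
def simd_masked_divide(dividends, divisors, n_bits):
    """Model nonrestoring division on a SIMD array in which an alternating
    subtract/add sequence is broadcast and each element performs no
    operation when the broadcast instruction is inappropriate for it."""
    n = len(dividends)
    r = [0] * n          # per-element partial remainders
    q = [0] * n          # per-element quotients
    issued = 0           # add/subtract instructions broadcast to the array
    for i in range(n_bits - 1, -1, -1):
        # Each element latches which operation it needs for this step.
        needs_sub = [x >= 0 for x in r]
        for instr in ("sub", "add"):
            issued += 1
            for pe in range(n):
                wants_this = needs_sub[pe] == (instr == "sub")
                if not wants_this:
                    continue   # inappropriate instruction: no operation
                bit = (dividends[pe] >> i) & 1
                delta = -divisors[pe] if instr == "sub" else divisors[pe]
                r[pe] = 2 * r[pe] + bit + delta
                q[pe] = (q[pe] << 1) | (1 if r[pe] >= 0 else 0)
    # Final correction, applied only where the remainder is negative.
    r = [x + d if x < 0 else x for x, d in zip(r, divisors)]
    return q, r, issued


quotients, remainders, issued = simd_masked_divide([7, 12], [3, 5], 4)
print(quotients, remainders, issued)   # [2, 2] [1, 2] 8
```

For 4-bit operands the controller broadcasts 8 add/subtract instructions, although each element performs an operation on only 4 of them and idles on the rest.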