The present invention relates to a parallel processing system for use in a parallel computer that executes instructions in parallel with a plurality of processing elements.
FIG. 5 shows the basic concept of a parallel computer in which a group of instructions is processed in n processing elements. In the figure, reference numeral 1 denotes a group of instructions, and 2 denotes processing elements. FIG. 6 is a conceptual view showing a basic processing operation that is executed in a typical conventional parallel computer, in which reference numeral 1a denotes an instruction group in which the instructions 1 are arranged in order in units of rows, 3 denotes an instruction queue, and 2 denotes processing elements. The instructions of the instruction group 1a are sent to the instruction queue 3 and then sent to the processing elements 2, where they are processed.
The operation of the conventional parallel processing system will next be explained. If the group 1 of instructions (see FIG. 5) which is to be inputted to and processed in the processing elements 2 is in the form, for example, of a matrix that comprises n rows and n columns, as illustrated, these instructions are arranged in units of rows to form a group 1a of instructions (see FIG. 6), and the processing elements 2 in which the rows of instructions are to be processed are determined arbitrarily. Each row of instructions in the group 1a is temporarily stored in the instruction queue 3 until a processing element 2 becomes vacant. When the processing of the previous instruction is completed in each processing element 2, the subsequent instruction is sent to the processing element 2 from the instruction queue 3 to execute processing of this instruction.
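The conventional operation described above can be illustrated by the following minimal sketch, which is given by way of example only and is not taken from any actual prior-art implementation. It assumes that each instruction has a known processing time, that a whole row is handed to whichever processing element becomes vacant first, and that rows leave the instruction queue in their original order. The function name `simulate_conventional` and the processing-time values are hypothetical.

```python
import heapq
from collections import deque


def simulate_conventional(rows, num_pes):
    """Dispatch rows of instructions to processing elements in queue order.

    rows    : list of rows; each row is a list of per-instruction
              processing times (hypothetical, arbitrary time units).
    num_pes : number of processing elements (assumed >= 1).

    Returns (schedule, makespan), where schedule holds one tuple
    (row_index, pe_index, start_time, finish_time) per row.
    """
    # Instruction queue 3: rows wait here in their original order.
    queue = deque(enumerate(rows))

    # Min-heap of (time_at_which_pe_becomes_vacant, pe_index).
    pes = [(0, pe) for pe in range(num_pes)]
    heapq.heapify(pes)

    schedule = []
    while queue:
        row_index, row = queue.popleft()
        # Take the processing element that becomes vacant earliest.
        vacant_at, pe = heapq.heappop(pes)
        # Instructions within a row are executed one after another.
        finish = vacant_at + sum(row)
        schedule.append((row_index, pe, vacant_at, finish))
        heapq.heappush(pes, (finish, pe))

    # The whole group is finished when the last element becomes vacant.
    makespan = max(vacant_at for vacant_at, _ in pes)
    return schedule, makespan


if __name__ == "__main__":
    example_rows = [[1, 1, 1], [5, 5, 5], [1, 1, 1]]
    for entry in simulate_conventional(example_rows, num_pes=3)[0]:
        print(entry)
```

In this sketch no attention is paid to how long each row takes, so the order of the rows alone decides which processing element receives which amount of work.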
The conventional parallel processing system suffers, however, from the problems stated below. For example, when a group of instructions in the form of a matrix comprising n rows and n columns is executed, since the prior art gives no consideration to the sequence in which the instructions are arranged, the rows of instructions (and hence the processing elements to which they are assigned) in many cases differ greatly from each other in terms of the length of processing time required, so that the processing elements cannot be used efficiently.
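The inefficiency described above can be expressed numerically in the following minimal sketch, which is given by way of example only. It assumes one row per processing element and a known processing time for each instruction; the function name `utilization` and the example values are hypothetical. The ratio computed is the total busy time of all processing elements divided by the time for which they are all occupied until the longest row finishes.

```python
def utilization(rows):
    """Fraction of processing-element time spent doing useful work.

    rows : list of rows, one row per processing element; each row is a
           list of per-instruction processing times (hypothetical units).
    """
    if not rows:
        raise ValueError("at least one row of instructions is required")

    # Processing time required by each row (i.e. each processing element).
    times = [sum(row) for row in rows]

    # All elements are tied up until the slowest row is finished.
    longest = max(times)
    if longest == 0:
        return 1.0  # nothing to execute, so no time is wasted

    return sum(times) / (len(times) * longest)


if __name__ == "__main__":
    unbalanced = [[1, 1, 1], [5, 5, 5], [1, 1, 1]]
    balanced = [[1, 5, 1], [5, 1, 1], [1, 1, 5]]
    print(utilization(unbalanced))
    print(utilization(balanced))
```

The two example matrices contain the same instructions, but in the first the long instructions are concentrated in one row, so two of the three processing elements stand idle for most of the time, whereas in the second every row requires the same processing time.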