Parallel computer systems are conventionally known in which computing nodes that are adjacent to each other in a space of arbitrary dimension are connected by two-dimensional mesh connection, three-dimensional cube connection, torus connection, or the like, and in which communication between computing nodes that are not adjacent is relayed by other computing nodes located on the path.
In such parallel computer systems, when one job is assigned to computing nodes that are not arranged contiguously, a computing node to which another job is assigned relays communication between the computing nodes to which the job is assigned in some cases. In such a case, the other job is interrupted in some cases, namely when the relayed communication disturbs the computing node to which the other job is assigned or when an abnormality occurs on the computing nodes that execute the job.
To address this, when the parallel computer system assigns a new job, it searches all regions in which idle nodes, that is, computing nodes to which no job is assigned, are continuous, and selects an optimum region for execution of the job from the searched regions. The parallel computer system then assigns the new job to the selected region.
The following describes an example of the processing, performed by the parallel computer system, of searching all the regions as job assignment targets. In the following description, it is assumed that the parallel computer system has a two-dimensional mesh network in which four computing nodes are arranged in the X axial direction, three computing nodes are arranged in the Y axial direction, and adjacent computing nodes are connected.
For example, the parallel computer system selects one computing node and searches for the number of idle nodes continuous from the selected computing node in the Y axial direction while changing the number of computing nodes continuous therefrom in the X axial direction. Furthermore, the parallel computer system searches for the number of idle nodes continuous from the selected computing node in the X axial direction while changing the number of computing nodes continuous therefrom in the Y axial direction. The parallel computer system executes these pieces of processing for all the computing nodes so as to acquire pieces of search data indicating all the regions as the job assignment targets.
FIG. 28 is a view for explaining an example of the processing of searching the idle nodes. FIG. 28 illustrates the pieces of search data that the parallel computer system acquires when all the nodes included in the parallel computer system are idle nodes, grouped by the computing node serving as the origin of each region as a job assignment target. Among the pieces of search data illustrated in FIG. 28, those indicating the same region as another piece of search data are hatched as pieces of invalid data.
In the example illustrated in FIG. 28, when the computing node serving as the origin is arranged at coordinates (a, b), search data in which “e” is the number of idle nodes continuous in the Y axial direction found for “d” computing nodes in the X axial direction is expressed as T[a][b], xy[c]=(d, e). Likewise, search data in which “d” is the number of idle nodes continuous in the X axial direction found for “e” computing nodes in the Y axial direction is expressed as T[a][b], yx[c]=(d, e). It is noted that “c” is an array subscript for identifying search data.
For example, in the example illustrated in FIG. 28, the parallel computer system acquires (4, 3), (3, 3), (2, 3), (1, 3), (4, 3), (4, 2), and (4, 1) as pieces of search data indicating regions whose origin is the computing node arranged at coordinates (0, 0). That is to say, the parallel computer system acquires pieces of search data indicating four regions having three computing nodes in the Y axial direction and one to four computing nodes in the X axial direction, and three regions having four computing nodes in the X axial direction and one to three computing nodes in the Y axial direction.
In addition, the parallel computer system acquires, for each of the other computing nodes, pieces of search data indicating regions whose origin is that computing node. The parallel computer system then sorts the acquired pieces of search data in descending order of the number of computing nodes included in the region. Thereafter, the parallel computer system selects, from the sorted pieces of search data, the search data indicating the smallest region that satisfies the number of computing nodes required as the job assignment targets, and assigns the job to the computing nodes included in the selected region.
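The sorting and selection described above can be sketched as follows. This is only an illustrative sketch: the function name `select_region`, the representation of each piece of search data as a pair of an origin and a (d, e) tuple, and the parameter `required_nodes` are assumptions introduced here. How regions of equal size are ranked is not specified in the description, so the sketch simply keeps the last qualifying entry.

```python
def select_region(search_data, required_nodes):
    """Return the smallest region holding at least `required_nodes` nodes.

    search_data: list of (origin, (d, e)) pairs, where origin is (x, y),
    d is the number of nodes along X and e the number along Y (assumed layout).
    Returns None when no region is large enough.
    """
    # Sort in descending order of the number of nodes in the region (d * e).
    ordered = sorted(search_data,
                     key=lambda item: item[1][0] * item[1][1],
                     reverse=True)

    chosen = None
    for origin, (d, e) in ordered:
        if d * e < required_nodes:
            break  # every later entry is smaller still
        chosen = (origin, (d, e))  # later qualifying entries are smaller or equal
    return chosen


if __name__ == "__main__":
    data = [((0, 0), (4, 3)), ((0, 1), (4, 2)),
            ((2, 0), (2, 3)), ((0, 2), (4, 1))]
    print(select_region(data, 5))
```

With the sample data, a job that needs five computing nodes is given the 2 x 3 region whose origin is (2, 0), because the 4 x 1 region is too small and the other candidates are larger than necessary.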
Examples of the related techniques are described in Japanese Laid-open Patent Publication No. 2012-252591 and International Publication Pamphlet No. WO 2008/114440.
With the above-mentioned techniques of searching the regions as the job assignment targets, all the regions including the continuous idle nodes are searched. As a result, there is a problem in that the search cost increases with the number of computing nodes.
Hereinafter, an example of processing of searching all the regions as the job assignment targets that is performed by the parallel computer system is described with reference to FIGS. 29 to 31. In the following example, the related parallel computer system searches rectangular regions including the continuous idle nodes.
First, the procedure of processing of selecting a computing node that is set as an origin of a region is described with reference to FIG. 29. FIG. 29 is a first flowchart for explaining the procedure of related idle node search processing. In the example illustrated in FIG. 29, the parallel computer system initializes a Y value to “0” (step S1), and determines whether the Y value is smaller than 3 (step S2). If the Y value is smaller than 3 (Yes at step S2), the parallel computer system initializes an X value to “0” (step S3) and determines whether the X value is smaller than 4 (step S4).
If the X value is smaller than 4 (Yes at step S4), the parallel computer system executes Y-axis fixed search processing of searching the number of idle nodes continuous in the X axial direction while fixing the Y value (step S5). Thereafter, the parallel computer system executes X-axis fixed search processing of searching the number of idle nodes continuous in the Y axial direction while fixing the X value (step S6).
Subsequently, the parallel computer system adds 1 to the X value (step S7) and executes step S4. On the other hand, if the X value is equal to or larger than 4 (No at step S4), the parallel computer system adds 1 to the Y value (step S8) and executes step S2. Thereafter, if the Y value is equal to or larger than 3 (No at step S2), the parallel computer system finishes the processing.
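The origin selection loop of FIG. 29 can be sketched as follows. The function name `search_all_origins`, the constants `SIZE_X` and `SIZE_Y`, and the passing of the two search routines as callables are assumptions made for illustration; the step numbers in the comments refer to FIG. 29.

```python
SIZE_X, SIZE_Y = 4, 3  # mesh size assumed in the description


def search_all_origins(y_fixed_search, x_fixed_search):
    """Visit every node as an origin and run both searches on it."""
    y = 0                              # step S1
    while y < SIZE_Y:                  # step S2
        x = 0                          # step S3
        while x < SIZE_X:              # step S4
            y_fixed_search(x, y)       # step S5
            x_fixed_search(x, y)       # step S6
            x += 1                     # step S7
        y += 1                         # step S8


if __name__ == "__main__":
    origins = []
    # Record each origin once; the second callable does nothing here.
    search_all_origins(lambda x, y: origins.append((x, y)),
                       lambda x, y: None)
    print(len(origins), origins[0], origins[-1])
```

Each of the twelve computing nodes is visited once as an origin, row by row, starting at (0, 0) and ending at (3, 2).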
Next, an example of the Y-axis fixed search processing is described with reference to FIG. 30. FIG. 30 is a second flowchart for explaining the procedure of the related idle node search processing. FIG. 30 illustrates the flowchart for explaining the procedure of the Y-axis fixed search processing at step S5 in FIG. 29.
For example, the parallel computer system initializes values of variables XMax, i, c, d, and e to set XMax=4−X, i=X, c=Y, d=0, and e=1 (step S10). The parallel computer system then determines whether the c value is smaller than 3 (step S11). If the c value is smaller than 3 (Yes at step S11), the parallel computer system determines whether the i value is smaller than 4 (step S12).
If the i value is smaller than 4 (Yes at step S12), the parallel computer system determines whether the i value is smaller than the XMax value (step S13). If the i value is smaller than the XMax value (Yes at step S13), the parallel computer system determines whether a computing node T[i][c] arranged at coordinates (i, c) is an idle node (step S14). If the computing node T[i][c] is the idle node (Yes at step S14), the parallel computer system adds 1 to each of the d and i values (step S15) and executes step S12.
On the other hand, if the i value is equal to or larger than 4 (No at step S12), if the i value is equal to or larger than XMax (No at step S13), or if T[i][c] is not the idle node (No at step S14), the parallel computer system executes the following processing. That is, the parallel computer system outputs (d, e) as search data T[X][Y]yx[c] and sets XMax=d, and then, initializes the i and d values to set i=X and d=0, and adds 1 to each of the c and e values (step S16). Subsequently, the parallel computer system executes step S11. If the c value is equal to or larger than 3 (No at step S11), the parallel computer system finishes the Y-axis fixed search processing.
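The Y-axis fixed search of FIG. 30 can be sketched as follows. This is an illustrative sketch rather than the related technique itself: the names `y_axis_fixed_search`, `idle`, `SIZE_X`, and `SIZE_Y` are assumptions, the idle state is assumed to be held in a nested list `idle[x][y]`, and the search data is returned as a list instead of being stored as T[X][Y]yx[c]. In addition, at the point corresponding to step S13 the sketch compares the running count d with XMax; this coincides with comparing i when X is 0, and it is an assumption made here so that origins with X larger than 0 are scanned up to the mesh edge.

```python
SIZE_X, SIZE_Y = 4, 3  # mesh size assumed in the description


def y_axis_fixed_search(idle, x0, y0):
    """Return (d, e) pairs for rows y0.. scanned along X from x0.

    idle[x][y] is True when the node at (x, y) has no job (assumed layout).
    """
    results = []
    x_max = SIZE_X - x0                 # step S10
    i, c, d, e = x0, y0, 0, 1
    while c < SIZE_Y:                   # step S11
        # Steps S12 to S14; the check d < x_max is the assumption noted above.
        while i < SIZE_X and d < x_max and idle[i][c]:
            d += 1                      # step S15
            i += 1
        results.append((d, e))          # step S16: output (d, e)
        x_max = d                       # a wider run is never needed again
        i, d = x0, 0
        c += 1
        e += 1
    return results


if __name__ == "__main__":
    all_idle = [[True] * SIZE_Y for _ in range(SIZE_X)]
    print(y_axis_fixed_search(all_idle, 0, 0))
```

When every node is idle and the origin is (0, 0), the sketch yields (4, 1), (4, 2), and (4, 3), which correspond to the three regions having four computing nodes in the X axial direction in the example of FIG. 28.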
Next, an example of the X-axis fixed search processing is described with reference to FIG. 31. FIG. 31 is a third flowchart for explaining the procedure of the related idle node search processing. FIG. 31 illustrates the flowchart for explaining the procedure of the X-axis fixed search processing at step S6 in FIG. 29.
For example, the parallel computer system initializes values of the variables YMax, i, c, d, and e to set YMax=3−Y, i=Y, c=X, d=1, and e=0 (step S20). The parallel computer system then determines whether the c value is smaller than 4 (step S21). If the c value is smaller than 4 (Yes at step S21), the parallel computer system determines whether the i value is smaller than 3 (step S22).
If the i value is smaller than 3 (Yes at step S22), the parallel computer system determines whether the i value is smaller than YMax (step S23). If the i value is smaller than YMax (Yes at step S23), the parallel computer system determines whether a computing node T[c][i] arranged at coordinates (c, i) is the idle node (step S24). If the computing node T[c][i] is the idle node (Yes at step S24), the parallel computer system adds 1 to each of the e and i values (step S25) and executes step S22.
On the other hand, if the i value is equal to or larger than 3 (No at step S22), if the i value is equal to or larger than YMax (No at step S23), or if T[c][i] is not the idle node (No at step S24), the parallel computer system executes the following processing. That is, the parallel computer system outputs (d, e) as search data T[X][Y]xy[c] and sets YMax=e, and then, initializes the i and e values to set i=Y, e=0, and adds 1 to each of the c and d values (step S26). Subsequently, the parallel computer system executes step S21. If the c value is equal to or larger than 4 (No at step S21), the parallel computer system finishes the X-axis fixed search processing.
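The X-axis fixed search of FIG. 31 mirrors the Y-axis fixed search and can be sketched in the same way. As before, the names `x_axis_fixed_search`, `idle`, `SIZE_X`, and `SIZE_Y`, the nested list `idle[x][y]`, and the returned list are assumptions for illustration. At the point corresponding to step S23 the sketch compares the running count e with YMax, which coincides with comparing i when Y is 0; this is an assumption introduced here and is not stated in the description.

```python
SIZE_X, SIZE_Y = 4, 3  # mesh size assumed in the description


def x_axis_fixed_search(idle, x0, y0):
    """Return (d, e) pairs for columns x0.. scanned along Y from y0.

    idle[x][y] is True when the node at (x, y) has no job (assumed layout).
    """
    results = []
    y_max = SIZE_Y - y0                 # step S20
    i, c, d, e = y0, x0, 1, 0
    while c < SIZE_X:                   # step S21
        # Steps S22 to S24; the check e < y_max is the assumption noted above.
        while i < SIZE_Y and e < y_max and idle[c][i]:
            e += 1                      # step S25
            i += 1
        results.append((d, e))          # step S26: output (d, e)
        y_max = e                       # a taller run is never needed again
        i, e = y0, 0
        c += 1
        d += 1
    return results


if __name__ == "__main__":
    mesh = [[True] * SIZE_Y for _ in range(SIZE_X)]
    mesh[2][1] = False                  # one node is executing another job
    print(x_axis_fixed_search(mesh, 0, 0))
```

In the usage example, the node at (2, 1) is not idle, so the height available from the origin (0, 0) drops from three to one once the region becomes three computing nodes wide.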
When the parallel computer system searches the regions as the job assignment targets in the above-mentioned manner, the parallel computer system repeatedly executes the condition determination processing included in the Y-axis fixed search processing and the X-axis fixed search processing. For this reason, the number of times the parallel computer system executes the condition determination processing increases as the number of computing nodes increases. As a result, the parallel computer system is unable to complete the processing within a practical time in some cases.
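The growth in the number of determinations can be illustrated with a rough count. The following sketch is based on assumptions made here and not on figures from the description: it counts only the idle-node determinations (steps S14 and S24), assumes that every node is idle, and assumes that each row or column is scanned from the origin to the mesh edge. The function name `count_idle_checks` is hypothetical.

```python
def count_idle_checks(size_x, size_y):
    """Count idle-node determinations on an all-idle size_x by size_y mesh."""
    checks = 0
    for y0 in range(size_y):            # every node serves as an origin
        for x0 in range(size_x):
            width = size_x - x0         # nodes from the origin to the X edge
            height = size_y - y0        # nodes from the origin to the Y edge
            # Y-axis fixed search: `height` rows, `width` nodes checked per row.
            checks += height * width
            # X-axis fixed search: `width` columns, `height` nodes per column.
            checks += width * height
    return checks


if __name__ == "__main__":
    for size in [(4, 3), (8, 6)]:
        print(size, count_idle_checks(*size))
```

Under these assumptions, the 4 x 3 mesh needs 120 idle-node determinations and an 8 x 6 mesh needs 1512, so a fourfold increase in the number of computing nodes raises the count by more than a factor of twelve.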