Conventionally, a parallel computer system is known in which adjacent computation nodes are connected together within a space of arbitrary dimensionality by using a two-dimensional mesh connection, a three-dimensional cube connection, a torus connection, or the like, and in which communication between computation nodes that are not adjacent is relayed by another computation node positioned on the path between them. The torus connection is a mode of connection in which a plurality of computation nodes are connected together in a toric form.
In such a parallel computer system, when a job (hereinafter, “the first job”) is allocated to computation nodes that are not in successively-arranged positions, the communication between the computation nodes to which the first job is allocated may be relayed by another computation node to which another job (hereinafter, “the second job”) is allocated. In that situation, the communication between the computation nodes to which the first job is allocated may disturb the computation node to which the second job is allocated. Further, when an abnormality has occurred in any of the computation nodes executing the first job, the second job may be interrupted.
Thus, when allocating a new job, the parallel computer system searches all the regions, one by one, for regions consisting of successive free nodes to which no job is allocated, and selects, from among the regions found in the search, an optimal region for executing the job. After that, the parallel computer system allocates the new job to the selected region.
However, when allocation and freeing (the latter being performed when the execution of a job is finished) are repeatedly performed in units of rectangles or rectangular parallelepipeds containing successive computation nodes, more and more fragments occur over the course of time, and it becomes difficult to allocate large jobs (jobs that require a large number of computation nodes). FIG. 23A is a drawing of a situation where many fragments have occurred in a parallel computer system in which computation nodes are arranged two-dimensionally. In the example in FIG. 23A, because the positional arrangement of the four three-node jobs is not satisfactory, there are many fragments, and the largest free space, which is the largest rectangular space containing successive free computation nodes, has only eight nodes.
To minimize occurrence of those situations, the parallel computer system can be configured to select computation nodes in such a manner that a large free space remains while smaller jobs (jobs that require a small number of computation nodes) are gathered to one or more sides. FIG. 23B is a drawing of a situation where computation nodes are selected in such a manner that a large free space remains while the smaller jobs are gathered to the side. In FIG. 23B, because the four three-node jobs are gathered to the side, the largest free space has eighteen nodes. Thus, the largest free space is larger than in the example illustrated in FIG. 23A where many fragments have occurred.
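The effect of job placement on the largest free space can be checked with a brute-force sketch. The two 6-by-5 layouts below are hypothetical and are not the layouts of FIG. 23A or FIG. 23B; the function name `largest_free_rectangle` and the string encoding ('.' for a free node, '#' for an allocated node) are likewise assumptions made only for illustration. Both layouts hold four three-node jobs, and no wrap-around is considered.

```python
def largest_free_rectangle(rows):
    """Return the node count of the largest all-free rectangle (mesh, no wrap-around).

    rows is a list of equal-length strings, one per Y coordinate;
    '.' marks a free node and '#' marks a node allocated to a job.
    """
    ymax, xmax = len(rows), len(rows[0])
    best = 0
    for y0 in range(ymax):
        for x0 in range(xmax):
            width = xmax - x0  # widest rectangle still possible from (x0, y0)
            for y in range(y0, ymax):
                # Measure the free run in this row, capped by the width so far.
                run = 0
                while run < width and rows[y][x0 + run] == ".":
                    run += 1
                width = run
                if width == 0:
                    break
                best = max(best, width * (y - y0 + 1))
    return best


# Hypothetical layouts: four three-node jobs scattered, then gathered to one side.
scattered = ["###...", "...###", "......", "###...", "...###"]
gathered = ["######", "######", "......", "......", "......"]

print(largest_free_rectangle(scattered))  # 6
print(largest_free_rectangle(gathered))   # 18
```

With the same twelve allocated nodes, gathering the jobs to one side triples the largest free rectangle in this hypothetical case.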
A two-dimensional torus connection is a mode of connection in which the computation nodes positioned on the two ends in each of the X direction and the Y direction of a two-dimensional mesh connection are connected to each other; the two-dimensional mesh connection is thus included in the two-dimensional torus connection. Likewise, a three-dimensional torus connection is a mode of connection in which the computation nodes positioned on the two ends in each of the X direction, the Y direction, and the Z direction of a three-dimensional mesh connection are connected to each other; the three-dimensional mesh connection is thus included in the three-dimensional torus connection. In other words, a torus connection is a mode of connection in which elements are connected in a toric form via a connection end in a direction of a predetermined axis. Because the two ends of the mesh connection included in a torus connection are connected to each other, the parallel computer system conducts a search that crosses the connection end formed by those two ends. In the above description, the X direction refers to the direction of the X-axis, the Y direction refers to the direction of the Y-axis, and the Z direction refers to the direction of the Z-axis. FIG. 23C is a drawing of an example of a largest free space positioned across the X-axis in a two-dimensional torus connection. In this example, as a result of searching for successive free computation nodes by crossing the X-axis, the largest free space has fifteen nodes.
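The difference between a two-dimensional mesh connection and a two-dimensional torus connection can be illustrated by listing the nodes adjacent to a given node. The following is a minimal sketch; the function names `mesh_neighbors` and `torus_neighbors` are assumptions introduced for illustration, and the torus adjacency is expressed with modulo arithmetic on the coordinates.

```python
def mesh_neighbors(x, y, xmax, ymax):
    """Adjacent nodes in a 2D mesh: nodes at the ends have fewer neighbors."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return sorted((cx, cy) for cx, cy in candidates
                  if 0 <= cx < xmax and 0 <= cy < ymax)


def torus_neighbors(x, y, xmax, ymax):
    """Adjacent nodes in a 2D torus: the two ends of each axis are connected."""
    candidates = [((x - 1) % xmax, y), ((x + 1) % xmax, y),
                  (x, (y - 1) % ymax), (x, (y + 1) % ymax)]
    # A set removes duplicates that arise when an axis has only one or two nodes.
    return sorted(set(candidates) - {(x, y)})


print(mesh_neighbors(0, 0, 4, 3))   # [(0, 1), (1, 0)]
print(torus_neighbors(0, 0, 4, 3))  # [(0, 1), (0, 2), (1, 0), (3, 0)]
```

In the torus case, the corner node (0,0) gains the neighbors (3,0) and (0,2) through the connection ends, which is why a search for successive free nodes may continue across those ends.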
In a parallel computer system in which connections are made with a two-dimensional torus connection network, to perform a process of searching for successive free computation nodes, search data having a data structure such as that illustrated in FIG. 24A is used. FIG. 24A is a drawing of search data about the number of successive free computation nodes toward the maximum X-Y coordinates for the computation node at each set of X-Y coordinates, in an example of a parallel computer system of which the X size is 4 and the Y size is 3, where the phrase “the X size is 4” means that four computation nodes are arranged in the X direction, and the phrase “the Y size is 3” means that three computation nodes are arranged in the Y direction.
The data for each of the X-Y coordinate positions includes search data about a successive X size obtained by incrementing the successive Y size by 1 at a time from the pair of coordinates toward the Y=2 coordinates. In other words, the data for each of the X-Y coordinate positions includes search data T[a][b].yx[c]=f(d,e) expressing a successive X size “d” and the number of successive free nodes “f” obtained while the successive Y size “e” is fixed, where “a” denotes the X-coordinate of the computation node, “b” denotes the Y-coordinate of the computation node, and c is an array subscript having a value from 0 to 2. The successive Y size is the size of successive free computation nodes in the Y direction, whereas the successive X size is the size of successive free computation nodes in the X direction. Further, all the values except for “f” and “d” are each a fixed value and are configured in initial setting. The values of “f” and “d” are each set to “0” in the initial setting. In FIG. 24A, the values of “f” and “d” found in the search are indicated in bold text.
For example, the data for the origin T[0][0] includes search data about the successive X size corresponding to when the successive Y size=1, 2, or 3 is satisfied. The data for T[3][2], which is the maximum X-Y coordinates, includes search data about the successive X size corresponding to when the successive Y size=1 is satisfied. FIG. 24A illustrates search data corresponding to the situation where all the computation nodes are in a free state. Further, the areas indicated with hatching in FIG. 24A represent pieces of invalid data. For example, if Y=1, the successive Y size “e” is either 1 or 2 and cannot be 3. Thus, the array subscript is either 1 or 2. If Y=2, because the successive Y size “e” can only be 1, the array subscript is only 2.
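The layout of the search data in its initial setting can be sketched as follows. This is only an illustration under stated assumptions: the class names `YxEntry` and `NodeSearchData` and the function `init_search_data` are hypothetical, invalid (hatched) entries are represented by `None`, and the subscript c is assumed to run from the Y-coordinate of the node to the maximum Y-coordinate, in line with the Y-fixed searching process described later, in which c is initialized to Y.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class YxEntry:
    e: int      # successive Y size, fixed in the initial setting
    d: int = 0  # successive X size, filled in by the search
    f: int = 0  # number of successive free nodes, filled in by the search


@dataclass
class NodeSearchData:
    # yx[c] is None where the entry would be invalid (hatched in FIG. 24A).
    yx: List[Optional[YxEntry]] = field(default_factory=list)


def init_search_data(xmax, ymax):
    """Build T[a][b].yx[c] with f and d set to 0 (assumed subscript: c >= b)."""
    T = []
    for a in range(xmax):
        column = []
        for b in range(ymax):
            yx = [YxEntry(e=c - b + 1) if c >= b else None for c in range(ymax)]
            column.append(NodeSearchData(yx=yx))
        T.append(column)
    return T


T = init_search_data(4, 3)
print([entry.e for entry in T[0][0].yx])  # [1, 2, 3]
print(T[3][2].yx)                         # [None, None, YxEntry(e=1, d=0, f=0)]
```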
FIG. 24B is a drawing of a result obtained by sorting the search data illustrated in FIG. 24A according to the numbers of successive free computation nodes. It is possible to leave a large free space, i.e., a rectangle containing successive free computation nodes, by selecting the smallest rectangle containing successive free computation nodes that satisfies a shape and the number of computation nodes required by a user's job, from the sorted result illustrated in FIG. 24B.
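The selection from the sorted result can be sketched as below for the all-free 4-by-3 example. The names `Candidate`, `all_free_candidates`, and `select_smallest_fit` are hypothetical, the interpretation of "satisfies a shape" as "at least the requested X size and Y size" is an assumption, and ties between rectangles with the same node count are broken by coordinates purely to make the result deterministic. Wrap-around is not considered.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    x: int  # X-coordinate of the rectangle's origin
    y: int  # Y-coordinate of the rectangle's origin
    d: int  # successive X size
    e: int  # successive Y size
    f: int  # number of successive free nodes (d * e)


def all_free_candidates(xmax, ymax):
    """Search data for a grid in which every node is free (no wrap-around)."""
    return [Candidate(x, y, xmax - x, e, (xmax - x) * e)
            for y in range(ymax)
            for x in range(xmax)
            for e in range(1, ymax - y + 1)]


def select_smallest_fit(candidates, need_x, need_y):
    """Return the free rectangle with the fewest nodes that still fits the job."""
    for cand in sorted(candidates, key=lambda c: (c.f, c.y, c.x)):
        if cand.d >= need_x and cand.e >= need_y:
            return cand
    return None  # no rectangle of successive free nodes is large enough


candidates = all_free_candidates(4, 3)
print(select_smallest_fit(candidates, 2, 2))  # Candidate(x=2, y=0, d=2, e=2, f=4)
print(select_smallest_fit(candidates, 5, 1))  # None
```

Choosing the smallest fitting rectangle places a 2-by-2 job at the edge of the grid (origin (2,0)) rather than at the origin, so the larger rectangles remain available for later jobs.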
It should be noted, however, that the data structure of the search data illustrated in FIG. 24A does not include any search data with X-axis crossing and Y-axis crossing. To process the X-axis crossing, because the computation node positioned on the right end is continuous with the computation node positioned on the left end, it is necessary to conduct a search by turning around from the right end to the left end, starting with each of the virtual origins, as illustrated in FIG. 25A. Similarly, to process the Y-axis crossing, because the computation node positioned on the upper end is continuous with the computation node positioned on the lower end, it is necessary to conduct a search by turning around from the upper end to the lower end, starting with each of the virtual origins, as illustrated in FIG. 25B.
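The turning-around from the right end to the left end can be illustrated for a single row. The following is a minimal sketch, assuming a hypothetical helper `free_run_x` that counts successive free nodes in the X direction from a virtual origin `x0`; the run is capped at the number of nodes in the row so that a fully free row is not counted more than once. The Y direction would be handled analogously.

```python
def free_run_x(row, x0, wrap):
    """Count successive free nodes in one row, starting at virtual origin x0.

    row[x] is True when the node at X-coordinate x is free. With wrap=True the
    search turns around from the right end to the left end, as in a torus.
    """
    xmax = len(row)
    limit = xmax if wrap else xmax - x0
    run = 0
    while run < limit and row[(x0 + run) % xmax]:
        run += 1
    return run


row = [True, False, False, True]  # hypothetical row: nodes 0 and 3 are free
print(free_run_x(row, 3, wrap=False))  # 1
print(free_run_x(row, 3, wrap=True))   # 2
```

Starting from the virtual origin at X=3, the search without crossing finds one free node, whereas the search with X-axis crossing also reaches the node at X=0.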
FIG. 26 is a flowchart of a flow in a rectangle searching process for two-dimensional successive free computation nodes. As illustrated in FIG. 26, a searching unit configured to search for a rectangle in a parallel computer system initializes a variable Y to 0, the variable Y indicating the coordinate position of the computation node in the Y direction (step S201). After that, the searching unit judges whether Y is smaller than Ymax, which denotes the number of computation nodes arranged in the Y direction (step S202). If Y is not smaller than Ymax, the search has been conducted with respect to the Y direction with all the computation nodes used as the search target, and the process is therefore ended.
On the other hand, if Y is smaller than Ymax, the searching unit initializes a variable X to 0, the variable X indicating the coordinate position of the computation node in the X direction (step S203). After that, the searching unit judges whether X is smaller than Xmax, which denotes the number of computation nodes arranged in the X direction (step S204). If X is smaller than Xmax, the searching unit performs a Y-fixed searching process (X,Y) with respect to the computation node at the coordinates (X,Y), to search for successive free computation nodes in the X direction while the successive Y size is fixed (step S205). After that, the searching unit increments X by 1 (step S206), and the process returns to step S204, so that the searching unit conducts a search by using the next computation node in the X direction as a starting point. On the other hand, if X is not smaller than Xmax, the search has been conducted with respect to the X direction with all the computation nodes used as the search target; the searching unit therefore increments Y by 1 (step S207), and the process returns to step S202.
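The flow of FIG. 26 as described above amounts to two nested loops. The sketch below follows steps S201 to S207; the function name `rectangle_search` and the callback parameter `y_fixed_search`, which stands in for the Y-fixed searching process (X,Y), are assumptions made for illustration.

```python
def rectangle_search(xmax, ymax, y_fixed_search):
    """Visit every node as a search origin, in the order described for FIG. 26."""
    y = 0                         # step S201
    while y < ymax:               # step S202
        x = 0                     # step S203
        while x < xmax:           # step S204
            y_fixed_search(x, y)  # step S205
            x += 1                # step S206
        y += 1                    # step S207


# Record the order of the origins for a grid with X size 4 and Y size 3.
visited = []
rectangle_search(4, 3, lambda x, y: visited.append((x, y)))
print(visited[:5])  # [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1)]
```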
FIG. 27 is a flowchart of a flow in the Y-fixed searching process (X,Y). As illustrated in FIG. 27, the searching unit initializes “XSize” to Xmax-X, “XSize” indicating the search size in the X direction. Further, the searching unit initializes an iteration variable “i” in the X direction to X, initializes an iteration variable “c” in the Y direction to Y, initializes the successive X size “d” to 0, and initializes the successive Y size “e” to 1 (step S211).
After that, the searching unit judges whether c is smaller than Ymax (step S212). If c is not smaller than Ymax, the search has been conducted with respect to all the successive Y sizes, and the process is therefore ended.
On the contrary, if c is smaller than Ymax, the searching unit judges whether i is smaller than Xmax (step S213). If i is smaller than Xmax, the searching unit judges whether i is smaller than XSize (step S214). If i is smaller than XSize, the searching unit judges whether T[i][c] indicates a free state, T[i][c] expressing whether the computation node at the coordinates (i,c) is free or not (step S215). If T[i][c] indicates a free state, the searching unit adds 1 to d and adds 1 to i (step S216). After that, the process returns to step S213, so that the searching unit checks the free state of the next computation node in the X direction.
On the other hand, if T[i][c] does not indicate a free state, if i is not smaller than XSize, or if i is not smaller than Xmax, the search for the successive X size has been finished, and the searching unit therefore calculates f by multiplying d by e. After that, the searching unit sets the values of f and d in the search data T[X][Y].yx[c], sets XSize to the current value of d, resets i to X and d to 0, and further adds 1 to c and to e (step S217). The process then returns to step S212, and the searching unit conducts a search for the next successive Y size.
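The Y-fixed searching process can be sketched as follows. The function name `y_fixed_search`, the grid representation `free[x][y]`, and the returned dictionary are assumptions made for illustration. One deliberate deviation is also an assumption: the description compares the coordinate i with XSize, which coincides with comparing the running width d with XSize when X is 0; the sketch compares d with XSize so that the width cap also behaves as a width for origins with X greater than 0.

```python
def y_fixed_search(free, X, Y):
    """Return {c: (d, e, f)} for origin (X, Y); free[x][y] is True when free."""
    xmax, ymax = len(free), len(free[0])
    x_size = xmax - X  # step S211: widest run still possible
    c, e = Y, 1
    result = {}
    while c < ymax:  # step S212
        i, d = X, 0
        # Steps S213 to S216: extend the run while inside the grid, within the
        # width cap (assumed: d < XSize), and on a free node.
        while i < xmax and d < x_size and free[i][c]:
            d += 1
            i += 1
        result[c] = (d, e, d * e)  # step S217: f = d * e
        x_size = d                 # a taller rectangle cannot be wider
        c += 1
        e += 1
    return result


all_free = [[True] * 3 for _ in range(4)]  # X size 4, Y size 3
print(y_fixed_search(all_free, 0, 0))  # {0: (4, 1, 4), 1: (4, 2, 8), 2: (4, 3, 12)}

one_busy = [[True] * 3 for _ in range(4)]
one_busy[2][1] = False                 # hypothetical: node (2,1) is allocated
print(y_fixed_search(one_busy, 0, 0))  # {0: (4, 1, 4), 1: (2, 2, 4), 2: (2, 3, 6)}
```

Because XSize is replaced by the width found for the previous successive Y size, each d is the width of the widest all-free rectangle of height e anchored at (X,Y), and the product d times e is its node count.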
FIG. 28 is a drawing for explaining an outline of a process in a Y-fixed searching process (0,0). As illustrated in FIG. 28, to search for a successive X size corresponding to when the successive Y size is 1, the searching unit initializes XSize (1), judges whether i is smaller than XSize (2), and judges whether the computation node is free or not (3). After that, the searching unit repeatedly performs judging processes (4), (6), and (8) to judge whether i is smaller than XSize and judging processes (5), (7), and (9) to judge whether the computation node is free, until either i is no longer smaller than XSize or the computation node is no longer free.
After that, to search for a successive X size corresponding to when the successive Y size is 2, the searching unit initializes XSize to the successive X size corresponding to when the successive Y size is 1 (10). The reason why the searching unit initializes XSize to the successive X size corresponding to when the successive Y size is 1 is that there is no possibility that the successive X size corresponding to when the successive Y size is 2 is larger than the successive X size corresponding to when the successive Y size is 1. After that, the searching unit repeatedly performs judging processes (11), (13), (15), and (17) to judge whether i is smaller than XSize and judging processes (12), (14), (16), and (18) to judge whether the computation node is free, until either i is no longer smaller than XSize or the computation node is no longer free.
After that, to search for a successive X size corresponding to when the successive Y size is 3, the searching unit initializes XSize to the successive X size corresponding to when the successive Y size is 2 (19). After that, the searching unit repeatedly performs judging processes (20), (22), (24), and (26) to judge whether i is smaller than XSize and judging processes (21), (23), (25), and (27) to judge whether the computation node is free, until either i is no longer smaller than XSize or the computation node is no longer free.
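The numbering (1) to (27) in the outline above can be reproduced by tracing the Y-fixed searching process (0,0). The sketch below is restricted to the origin (0,0) and assumes that all the computation nodes are free, which is consistent with four judgments of each kind per successive Y size; the function name `trace_origin_search` and the wording of the logged messages are hypothetical.

```python
def trace_origin_search(free):
    """Number the XSize initializations and judgments for origin (0, 0)."""
    xmax, ymax = len(free), len(free[0])
    steps = []

    def log(text):
        steps.append((len(steps) + 1, text))

    x_size = xmax
    for c in range(ymax):
        log(f"initialize XSize to {x_size} for successive Y size {c + 1}")
        i = d = 0
        while i < xmax:
            log(f"judge whether i={i} is smaller than XSize={x_size}")
            if i >= x_size:
                break
            log(f"judge whether node ({i},{c}) is free")
            if not free[i][c]:
                break
            d += 1
            i += 1
        x_size = d  # the next successive Y size cannot yield a larger X size
    return steps


steps = trace_origin_search([[True] * 3 for _ in range(4)])
for number, text in steps:
    print(f"({number}) {text}")
```

For the 4-by-3 all-free grid the trace has 27 entries, with the XSize initializations at (1), (10), and (19), as in the outline.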
Incidentally, a conventional technique is known by which processes are automatically arranged in optimal positions in order to shorten the time period required by the execution of parallel programs, in a PC cluster system having a three-dimensional torus connection structure (see, for example, Patent Document 1). Further, another conventional technique is also known by which one-to-many or many-to-many communication is efficiently performed in a parallel computer system configured with a torus connection network (see, for example, Patent Document 2). Furthermore, yet another conventional technique is also known by which a plurality of rectangles having mutually-different shapes are efficiently arranged, while ensuring that no rules are violated, into a rectangular region serving as a positioning region (see, for example, Patent Document 3).