Parallel processing systems, which have a plurality of computers (sometimes referred to as “computing nodes”) connected to a network and perform large-scale computation through parallel operation of the plurality of computing nodes, are already in use. Jobs are inputted into a parallel processing system from one or more users. When a job is inputted, launch information, such as the file name of a user program to be launched, may be designated, as well as execution conditions, such as the number of nodes to be used. The parallel processing system performs so-called “job scheduling”, which assigns one or more computing nodes to each of the inputted jobs in consideration of the designated execution conditions.
Note that a management server that assigns tasks to a plurality of operators has also been proposed. When a new task occurs, the proposed management server searches for existing tasks that are similar to the new task and preferentially assigns the new task to the operator or operators who were assigned the similar existing tasks. To calculate similarity, the management server acquires character information relating to the existing tasks and character information relating to the new task. The management server performs morphological analysis to extract words from the character information and calculates similarity based on the proportion of words that appear in both an existing task and the new task. Alternatively, the management server calculates similarity according to a method such as n-gram comparison or edit distance.
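The two similarity measures mentioned above can be sketched as follows. This is a minimal illustration and not the proposed management server's implementation: the function names are hypothetical, a simple regular-expression word split stands in for morphological analysis, and the "proportion of common words" is assumed here to be the Jaccard ratio (shared words divided by all distinct words).

```python
import re


def tokenize(text):
    # Assumption: a lowercase alphanumeric word split stands in for
    # morphological analysis, which a real system would use instead.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_overlap(a, b):
    # Assumed definition: shared words / all distinct words (Jaccard ratio).
    words_a, words_b = tokenize(a), tokenize(b)
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def edit_distance(s, t):
    # Levenshtein distance computed row by row with dynamic programming.
    prev = list(range(len(t) + 1))
    for i, char_s in enumerate(s, 1):
        curr = [i]
        for j, char_t in enumerate(t, 1):
            cost = 0 if char_s == char_t else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


overlap = word_overlap("reset user password", "reset admin password")
distance = edit_distance("kitten", "sitting")
```

A higher word overlap and a lower edit distance both indicate greater similarity, so a system combining them would need to normalize one of the two.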
A parallel processing system that estimates the resource usage of a new job has also been proposed. The proposed parallel processing system stores an execution history that includes the job attributes and resource usage state of jobs that have been executed. The parallel processing system searches for executed jobs whose job attributes are similar to those of a new job and estimates the resource usage of the new job based on the resource usage state of the similar executed job or jobs. The job attributes include a program name.
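As a rough illustration of history-based estimation, the sketch below averages the recorded usage of executed jobs whose program names are similar to that of the new job. The history layout, the threshold value, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for this example; the proposed system may compare other job attributes and combine usage records differently.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    # Similarity in [0, 1] from the Python standard library; an assumed
    # stand-in for whatever attribute comparison a real system uses.
    return SequenceMatcher(None, a, b).ratio()


def estimate_usage(history, program_name, threshold=0.8):
    # history: list of (program_name, recorded_usage) pairs (hypothetical layout).
    # Returns the mean usage of sufficiently similar jobs, or None if none match.
    similar = [usage for name, usage in history
               if name_similarity(name, program_name) >= threshold]
    if not similar:
        return None
    return sum(similar) / len(similar)


history = [("sim_v1.exe", 100.0), ("sim_v2.exe", 120.0), ("render.bin", 300.0)]
estimate = estimate_usage(history, "sim_v3.exe")
```

In this example only the two `sim_v*.exe` entries pass the threshold, so the estimate is their average; a name with no similar history yields `None`.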
See, for example, the following documents:
International Publication Pamphlet No. WO2013/128555; and
Japanese Laid-open Patent Publication No. 2016-42284.
The higher the usage rate of the computing nodes (i.e., the higher the proportion of computing nodes that have been assigned a job and are therefore not free), the higher the power consumption of the parallel processing system. For economic and equipment-related reasons, however, it is not realistic for a parallel processing system to make unlimited use of power. For this reason, when scheduling jobs, it is preferable for the parallel processing system to adjust the combination of jobs to be executed simultaneously so that the total power consumption does not exceed an upper limit. To do so, the parallel processing system estimates the power consumption of jobs awaiting execution.
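The idea of limiting the combination of simultaneously executed jobs by total power can be sketched as a simple first-fit pass over the waiting queue. This is only one possible policy, chosen for brevity; the function name, the tuple layout, and the example wattages are hypothetical, and the estimated power of each job is assumed to be already available.

```python
def select_jobs(queue, free_nodes, power_budget_w):
    # queue: list of (job_id, required_nodes, estimated_power_w) in arrival order.
    # Admits each job that still fits within both the free nodes and the
    # remaining power budget (first-fit; not an optimal packing).
    selected = []
    for job_id, nodes, est_power_w in queue:
        if nodes <= free_nodes and est_power_w <= power_budget_w:
            selected.append(job_id)
            free_nodes -= nodes
            power_budget_w -= est_power_w
    return selected


queue = [("a", 4, 800.0), ("b", 4, 900.0), ("c", 2, 300.0)]
running = select_jobs(queue, free_nodes=8, power_budget_w=1200.0)
```

Here job `b` fits the free nodes but would exceed the power limit, so it is skipped while `a` and `c` are started. The quality of such a decision depends directly on how accurate the per-job power estimates are.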
However, the power consumption of a job depends not only on the number of nodes used but also on the characteristics of the user program to be launched. As one example, the power consumption of a job may depend on the memory access frequency, the disk access frequency, and/or the communication frequency, and may even depend on the pipeline processing efficiency of the user program and the extent to which SIMD (Single Instruction Multiple Data) instructions are used. In addition, jobs inputted into a parallel processing system do not always designate the same program names repeatedly; jobs may be inputted with program names that change gradually, so an estimate may not be obtainable simply by looking up past jobs that have an identical program name. This raises the problem of how to estimate the power consumption of jobs awaiting execution.