Multicore systems are widely used to speed up computational operations. Conventional multicore systems usually comprise at least one processor with a plurality of identical cores. By distributing machine code to the individual cores and executing it on these cores in parallel, parallel computation can be achieved. As long as all of the cores are identical, a supervising instance only has to identify an available core and transfer the respective instructions to this core in order to execute the machine code on it. Since the cores are identical, every core can execute the same machine code, and usually all cores will require the same amount of time to complete an operation.
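The dispatch step for identical cores can be illustrated by a minimal sketch. The function name `pick_available_core` and the list-of-flags representation of core state are assumptions made for illustration only; they are not taken from any particular system.

```python
from typing import List, Optional


def pick_available_core(busy: List[bool]) -> Optional[int]:
    """Return the index of the first idle core, or None if all are busy.

    Because the cores are assumed identical, any idle core is equally
    suitable, so no further comparison between cores is needed.
    """
    for core_id, is_busy in enumerate(busy):
        if not is_busy:
            return core_id
    return None


# Hypothetical state of a four-core homogeneous processor.
busy = [True, False, True, False]
core = pick_available_core(busy)
if core is not None:
    busy[core] = True  # mark the chosen core as occupied by the new task
print(core)
```

In a homogeneous system the selection reduces to an availability check, which is the simplification that no longer holds once the cores differ.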
In addition, multicore systems having a heterogeneous architecture have become more popular. Such heterogeneous multicore systems may comprise a plurality of cores which run at different clock rates and/or which have different instruction sets. Due to such a heterogeneous architecture, the same operation may be completed in a different amount of time depending on the core performing it.
However, since the cores have different instruction sets, the optimum core may depend on the operation to be performed. This means that, for a first operation, a first core may be the appropriate core, performing the operation in minimum time, whereas a second core may be a better choice for a second, different type of operation. Such an operation may be, for example, a mathematical operation which is computed very efficiently on an arithmetical core, or a processing of video data which is performed efficiently on a graphical core. Hence, to improve computational speed, it is important to choose the optimum core for performing each operation.
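The dependence of the optimum core on the type of operation can be illustrated by a short sketch. The operation names, core names and durations below are hypothetical values chosen for illustration; they are not measurements of any real system.

```python
# Hypothetical durations in milliseconds for two operation types on two
# cores; illustrative values only, not measurements.
DURATION_MS = {
    "matrix_multiply": {"arithmetic_core": 2.0, "graphics_core": 5.0},
    "video_decode": {"arithmetic_core": 9.0, "graphics_core": 3.0},
}


def fastest_core(operation: str) -> str:
    """Return the core with the shortest duration for the given operation."""
    per_core = DURATION_MS[operation]
    return min(per_core, key=per_core.get)


for op in DURATION_MS:
    print(op, "->", fastest_core(op))
```

With these assumed values the arithmetical core is selected for the mathematical operation and the graphical core for the video operation, so no single core is the best choice for both.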
Due to the different instruction sets of the individual cores in heterogeneous multicore systems, the machine code for performing an operation has to be adapted to the respective core at the compiling stage, when machine code is generated for the whole computer program. However, assigning a particular core when compiling the code is a major challenge. Very precise knowledge of the executing system is necessary in order to estimate the execution time on the individual cores and to select an appropriate core for each task.
CA 2631255 A describes a task-to-device mapping based on predicted estimation of running time. A running time of a task is estimated for each available device and the task is assigned to the device having the minimal estimated running time.
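The general idea of mapping each task to the device with the minimal estimated running time can be sketched as follows. This is a minimal illustration and is not taken from the cited document; the function `map_tasks`, the estimator callback and the table `TOY_ESTIMATES_MS` are assumptions introduced here.

```python
from typing import Callable, Dict, Iterable


def map_tasks(
    tasks: Iterable[str],
    devices: Iterable[str],
    estimate: Callable[[str, str], float],
) -> Dict[str, str]:
    """Assign each task to the device with the smallest estimated time.

    `estimate(task, device)` is a caller-supplied prediction of the
    running time; how it is obtained is outside the scope of this sketch.
    """
    device_list = list(devices)
    if not device_list:
        raise ValueError("at least one device is required")
    return {
        task: min(device_list, key=lambda dev: estimate(task, dev))
        for task in tasks
    }


# Hypothetical estimated running times in milliseconds.
TOY_ESTIMATES_MS = {
    ("fft", "cpu"): 8.0,
    ("fft", "dsp"): 2.0,
    ("blur", "cpu"): 6.0,
    ("blur", "dsp"): 7.5,
}


def toy_estimate(task: str, device: str) -> float:
    """Look up the assumed estimate for a task on a device."""
    return TOY_ESTIMATES_MS[(task, device)]


mapping = map_tasks(["fft", "blur"], ["cpu", "dsp"], toy_estimate)
print(mapping)
```

The quality of such a mapping depends entirely on the accuracy of the estimator, since the selection itself is only a minimum search over the predicted values.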
US 2006/0123401 A describes a method for parallelization of program code. Program code is analysed to determine an optimal strategy for parallelization when compiling the code.
US 2007/0283358 A describes a task-to-device mapping when compiling program code for a heterogeneous multicore system. A compiler estimates a required running time on each device by a static prediction. Based on this prediction an optimum hardware device is selected and machine code is generated for the selected device.
US 2007/0283337 A describes a method for automatically identifying tasks which can be executed in parallel. Based on this analysis, execution time is estimated and the task is assigned to a processing unit.
However, predicting the execution time on a particular device requires a complex model describing the properties of that device. Furthermore, a separate model is required for each device that is to be considered when compiling the program code. For each newly introduced device, a user has to provide input describing the properties of the new device. Even then, the estimated running time is subject to large inaccuracies, and it is therefore very difficult to generate optimum machine code for heterogeneous multicore systems.
Accordingly, it is an object of the present invention to provide an enhanced assignment of a core in a heterogeneous multicore system.