This invention relates to a method of controlling a heterogeneous multiprocessor system containing a plurality of processor units of different types, the method enabling efficient operation of the plurality of processor units, and to a compiler for producing a program.
As semiconductor manufacturing techniques have progressed and semiconductor elements have become finer, a large number of transistors can now be integrated. At the same time, the operating frequencies of processors have increased. However, power consumption during operation has increased, and standby power has also increased due to leakage current. Consequently, the performance improvements that conventional processors achieved through improved logic designs and higher operating frequencies can no longer be expected to continue.
Thus, at present, multiprocessor systems (single-chip multiprocessor systems) are regarded as a promising means for achieving both improved performance and low power consumption. In such systems, a plurality of processor elements (hereinafter, referred to as “PEs”) such as conventional CPUs and DSPs are mounted on a single chip and perform processing in parallel, whereby high calculation performance can be obtained without increasing the operating frequencies. Among those multiprocessor systems, a system composed of a plurality of processor elements that are identical in structure and calculation performance and that execute the same instruction set is a homogeneous multiprocessor system. A system composed of a plurality of processor elements having different structures and instruction sets is a heterogeneous multiprocessor system.
Homogeneous multiprocessor systems that improve processing performance by promoting parallel processing are known (refer to, for example, JP 2004-252728 A or JP 2001-175619 A).
For homogeneous multiprocessor systems, research on automatic parallelizing compilers is already under way. Such a compiler analyzes an input program, extracts the portions of the program that can be executed in parallel, and allocates those portions to a plurality of PEs so that they can be executed at the same time. For instance, JP 2004-252728 A discloses a compile system that produces an object program with which a multiprocessor system can be operated more efficiently. The compile system analyzes an input source program, divides the program into blocks (tasks) of various granularities such as subroutines and loops, analyzes parallelism among the tasks, divides the tasks and the data accessed by those tasks into sizes that fit in a cache memory or a local memory, and optimally allocates the divided tasks to the respective PEs. Also, JP 2001-175619 A discloses a chip multiprocessor architecture that supports multigrain parallel processing.
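By way of illustration only, allocating mutually dependent tasks to identical PEs can be sketched as a greedy list scheduler. This is a generic textbook heuristic and is not the method disclosed in JP 2004-252728 A. The function name `schedule`, the task names, and the cost values are hypothetical and are introduced solely for this sketch.

```python
from typing import Dict, List, Tuple


def schedule(costs: Dict[str, int],
             deps: Dict[str, List[str]],
             num_pes: int) -> Dict[str, Tuple[int, int, int]]:
    """Greedy list scheduling on identical PEs (illustrative assumption).

    costs: estimated execution time of each task.
    deps:  for each task, the tasks that must finish before it starts.
    Returns {task: (pe_index, start_time, finish_time)}.
    """
    pe_free = [0] * num_pes          # time at which each PE becomes idle
    placed: Dict[str, Tuple[int, int, int]] = {}
    remaining = set(costs)

    while remaining:
        # A task is ready once all of its predecessors have been placed.
        ready = sorted(t for t in remaining
                       if all(d in placed for d in deps.get(t, [])))
        if not ready:
            raise ValueError("dependency cycle detected")

        # Heuristic: place the most expensive ready task first.
        task = max(ready, key=lambda t: costs[t])

        # The task cannot start before its last predecessor finishes.
        earliest = max((placed[d][2] for d in deps.get(task, [])), default=0)

        # Choose the PE on which the task can start soonest.
        pe = min(range(num_pes), key=lambda p: max(pe_free[p], earliest))
        start = max(pe_free[pe], earliest)
        finish = start + costs[task]

        placed[task] = (pe, start, finish)
        pe_free[pe] = finish
        remaining.remove(task)

    return placed


if __name__ == "__main__":
    costs = {"init": 2, "loop_a": 4, "loop_b": 3, "merge": 1}
    deps = {"loop_a": ["init"], "loop_b": ["init"],
            "merge": ["loop_a", "loop_b"]}
    for name, slot in sorted(schedule(costs, deps, 2).items()):
        print(name, slot)
```

In this sketch the two loops have no dependency on each other, so they are placed on different PEs and overlap in time, which is the effect that extracting parallel-operable program portions is intended to achieve.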
On the other hand, in heterogeneous multiprocessor systems, general-purpose CPUs capable of executing various tasks are combined with PEs, such as DSPs, that execute specific processing at high speed. Heterogeneous multiprocessor systems capable of optimizing processing are disclosed in, for example, JP 2004-171234 A and JP 2004-252900 A.
JP 2004-171234 A discloses the following technique. Tasks are provisionally allocated to processors whose instruction sets differ from each other. When a task is to be executed, a judgment is made as to whether its allocation should be changed to a processor having a different instruction set, and the allocation is changed in a case where doing so would increase execution efficiency.
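By way of illustration only, a judgment of this general kind can be sketched as a comparison of estimated execution times. The sketch below is not taken from JP 2004-171234 A. The function name `choose_processor`, the estimated-time table, and the `switch_cost` parameter (an assumed overhead for changing the allocation) are all hypothetical.

```python
from typing import Dict, Iterable, Tuple


def choose_processor(task: str,
                     current: str,
                     candidates: Iterable[str],
                     est_time: Dict[Tuple[str, str], float],
                     switch_cost: float = 0.0) -> str:
    """Return the processor on which the task should run.

    The task keeps its provisional allocation unless another processor
    is estimated to be faster even after paying switch_cost
    (an assumed cost of changing the allocation).
    """
    best, best_time = current, est_time[(task, current)]
    for pe in candidates:
        # Skip the current processor and processors with no estimate.
        if pe == current or (task, pe) not in est_time:
            continue
        t = est_time[(task, pe)] + switch_cost
        if t < best_time:
            best, best_time = pe, t
    return best


if __name__ == "__main__":
    # Hypothetical estimated execution times, in arbitrary units.
    est = {("fft", "cpu"): 10.0, ("fft", "dsp"): 3.0,
           ("parse", "cpu"): 4.0, ("parse", "dsp"): 9.0}
    print(choose_processor("fft", "cpu", ["cpu", "dsp"], est, switch_cost=1.0))
    print(choose_processor("parse", "cpu", ["cpu", "dsp"], est, switch_cost=1.0))
```

Including the switching overhead in the comparison keeps a task on its provisionally allocated processor when the estimated gain from moving it is too small to pay for the change.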
Also, JP 2004-252900 A discloses the following technique. Tasks divided from a source program are allocated to the respective processors based on power consumption. Further, a task is allocated preferentially to a processor that can execute only a few types of tasks rather than to a core capable of executing many types of tasks.
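By way of illustration only, a preference of this kind can be sketched as a selection rule over the processors able to execute a given task. The sketch below is not taken from JP 2004-252900 A. The `Processor` record, the `power_mw` figures, and the tie-breaking order (fewest supported task types first, then lowest power) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Processor:
    name: str
    supported: FrozenSet[str]   # task types this processor can execute
    power_mw: float             # assumed power draw while executing


def pick_processor(task_kind: str, processors: List[Processor]) -> Processor:
    """Prefer the most specialized capable processor, then the lowest power."""
    able = [p for p in processors if task_kind in p.supported]
    if not able:
        raise ValueError(f"no processor can execute {task_kind!r}")
    # Fewer supported task types means a more specialized processor.
    return min(able, key=lambda p: (len(p.supported), p.power_mw))


if __name__ == "__main__":
    cpu = Processor("cpu", frozenset({"control", "filter", "fft"}), 500.0)
    dsp = Processor("dsp", frozenset({"filter", "fft"}), 200.0)
    print(pick_processor("fft", [cpu, dsp]).name)      # specialized DSP
    print(pick_processor("control", [cpu, dsp]).name)  # only the CPU can run it
```

Sending a task to the specialized processor whenever one is capable of executing it leaves the general-purpose core free for the tasks that only it can execute.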