1. Field of the Invention
This invention relates to the field of compiling computer programs. More particularly, this invention relates to the compilation of computer programs to exploit available parallelism within a target processing system without exceeding the available processing resources within that target processing system.
2. Description of the Prior Art
It is known to provide compilation techniques by which a computer program written in a high level computer language can be automatically analysed by a compiler program to generate low level program instructions, or machine instructions, which can be used by processing hardware to execute the high level program. It is important that the compiler should generate code which is able to efficiently and rapidly execute the desired high level program. Efficiency can be considered in a variety of different ways, and includes fast execution and execution which does not consume excessive processing resources, such as register resources, memory, bandwidth or the like.
A known class of data processing systems is that of VLIW (very long instruction word) processors, which provide a high degree of parallelism through the provision of multiple processing units and independent data paths. Such VLIW processors are particularly well suited for many DSP type operations where large volumes of data need to be processed rapidly and efficiently. Given this high degree of parallelism, it is important that computer program compilers for such VLIW processors are able to generate appropriate VLIW instructions that properly exploit the parallelism potential of the hardware concerned.
Within compiler operations it is known that a computer program to be compiled will be analysed to generate a data flow graph in which vertices represent the data processing operations performed and the links between these vertices (edges) represent data sources or data sinks. When such a data flow graph has been generated, it can be used by the compiler to analyse the data flow and dependencies associated with the computer program to be compiled and accordingly facilitate the generation of efficiently compiled computer code as output. A typical known strategy when analysing a data flow graph and attempting to schedule the various processing operations represented by the vertices is to schedule these to be performed either as soon as possible, or as late as possible, in accordance with the dependencies discovered. Whilst this tends to improve the degree of parallelism achieved, it can result in the available processing resources of the system being exceeded, e.g. the available number of registers for storing operands may be exceeded, resulting in slow and inefficient spills to memory. It will be appreciated that simply scheduling an operation to be performed as soon as possible may not be efficient, since that operation will consume register resources as soon as it is scheduled and may compete with other operations which require those resources, even though the overall speed of operation would not be adversely impacted if the processing operation were scheduled to commence later.
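By way of illustration only, the known as-soon-as-possible and as-late-as-possible strategies described above can be sketched as follows. This is a minimal sketch and not the technique of the present invention: it assumes every operation takes one cycle, that the number of processing units is unlimited, and that a result occupies a register from the cycle in which it is produced until the cycle of its last consumer. The names `asap_schedule`, `alap_schedule`, `peak_live_values`, `OPS` and `DEPS` are hypothetical, and the example graph (four loads feeding a chain of three additions) is invented for the purpose.

```python
from collections import defaultdict

# Hypothetical data flow graph: four loads feeding a chain of additions.
# OPS is listed in topological order; DEPS maps each operation to its producers.
OPS = ["a", "b", "c", "d", "t1", "t2", "t3"]
DEPS = {"t1": ["a", "b"], "t2": ["t1", "c"], "t3": ["t2", "d"]}


def asap_schedule(ops, deps):
    """Place each operation one cycle after its latest producer (unit latency assumed)."""
    cycle = {}
    for op in ops:
        cycle[op] = max((cycle[p] + 1 for p in deps.get(op, [])), default=0)
    return cycle


def alap_schedule(ops, deps, length):
    """Place each operation one cycle before its earliest consumer, within `length` cycles."""
    consumers = defaultdict(list)
    for op, producers in deps.items():
        for p in producers:
            consumers[p].append(op)
    cycle = {}
    for op in reversed(ops):
        cycle[op] = min((cycle[c] - 1 for c in consumers[op]), default=length - 1)
    return cycle


def peak_live_values(cycle, deps):
    """Largest number of results held in registers at the end of any one cycle.

    Assumption: a result is live from its producing cycle up to, but not
    including, the cycle of its last consumer. Results with no consumer are ignored.
    """
    last_use = {}
    for op, producers in deps.items():
        for p in producers:
            last_use[p] = max(last_use.get(p, 0), cycle[op])
    length = max(cycle.values()) + 1
    return max(
        sum(1 for p, use in last_use.items() if cycle[p] <= t < use)
        for t in range(length)
    )


asap = asap_schedule(OPS, DEPS)
alap = alap_schedule(OPS, DEPS, length=max(asap.values()) + 1)

print("ASAP:", asap, "peak registers:", peak_live_values(asap, DEPS))
print("ALAP:", alap, "peak registers:", peak_live_values(alap, DEPS))
```

In this example both schedules complete in four cycles, but the as-soon-as-possible schedule issues all four loads in the first cycle and so holds four results at once, whereas the as-late-as-possible schedule never holds more than two. On a target with fewer than four free registers the first schedule would spill to memory without finishing any sooner. On graphs of a different shape the as-late-as-possible strategy can suffer the corresponding problem.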
Within the field of VLIW processors, clustering based upon data flow and matrix heuristics is known. Such matrix heuristics assume certain behaviours and characteristics which may not in fact be generally applicable, and may focus on only some limitations.
The present technique seeks to address the above problems and provide a system which is able to generate efficiently compiled computer code which exploits parallelism to a high degree whilst avoiding exceeding available processing resources in an undesirable fashion.