The invention relates to a method for structuring a multi-instruction computer program as containing a plurality of basic blocks, each composed of internal instructions and external jumps organized in an internal directed acyclic graph. Structuring such multi-instruction computer programs for faster execution is a continual target of industry. A particular aim is to enable parallel processing at the level of the single instruction, which has become feasible through the introduction of so-called Very Long Instruction Word (VLIW) processors and so-called superscalar processors. State of the art is the book by David A. Patterson & John L. Hennessy, Computer Architecture: A Quantitative Approach, Morgan Kaufmann 1996, p. 240-288, herein incorporated by reference. Patterson and Hennessy describe how VLIW processors use multiple, independent functional units and package multiple operations into one long instruction. In superscalar processors, parallelism may be attained by scheduling the program at run time, during actual execution. Alternatively, in VLIW processors, the parallelism may be partially exploited by scheduling at compile time.
A general rule is that parallelism can be exploited better when a greater number of operations can be processed concurrently, given the available extent of hardware facilities. Such a set of operations will hereinafter be called a scheduling unit or basic block. In its simplest embodiment, such a scheduling unit may be organized as a Directed Acyclic Graph (DAG) that consists of internal operations and one or more external (conditional) jumps to other scheduling units. The graph may be reached from one or more other graphs via respectively associated input operations, each of which reads an initial value from an associated specific register. Likewise, output will involve a write operation to a possibly selectable specific register.
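By way of illustration only, a scheduling unit organized as a DAG may be sketched as follows. The operation names (in_r1, add, jump and so on), the unit latency, and the assumption of unlimited functional units are hypothetical and serve only to show how independent operations in one scheduling unit can share an issue cycle.

```python
from collections import defaultdict

# Hypothetical scheduling unit: each operation maps to the operations it depends on.
# "in_r1"/"in_r2" stand for input reads of specific registers,
# "jump" stands for the external conditional jump.
deps = {
    "in_r1": [],
    "in_r2": [],
    "add": ["in_r1", "in_r2"],
    "mul": ["in_r1", "in_r2"],
    "cmp": ["add", "mul"],
    "jump": ["cmp"],
}

def asap_levels(deps):
    """Earliest cycle per operation, assuming unit latency and unlimited units."""
    level = {}

    def visit(op):
        if op not in level:
            # An operation issues one cycle after its latest predecessor;
            # operations without predecessors issue in cycle 0.
            level[op] = 1 + max((visit(p) for p in deps[op]), default=-1)
        return level[op]

    for op in deps:
        visit(op)
    return level

def issue_groups(deps):
    """Group operations by earliest cycle; each group could issue in parallel."""
    groups = defaultdict(list)
    for op, cycle in asap_levels(deps).items():
        groups[cycle].append(op)
    return [sorted(groups[c]) for c in sorted(groups)]

print(issue_groups(deps))
```

In this sketch the two input reads share a cycle, as do the add and mul operations, whereas the compare and the jump each occupy a cycle of their own. A larger scheduling unit generally offers more such independent operations per cycle.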
P. Y. T. Hsu and E. S. Davidson, Highly Concurrent Scalar Processing, Univ. of Illinois at Urbana-Champaign, Proc. 13th Ann. Int. Symp. on Computer Architecture, June 1986, p. 386-395, have proposed expanding the size of scheduling units by introducing guarded instructions that reduce the penalty of conditional branches, in combination with decision tree (dtree) scheduling.
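The general idea of guarded instructions may be sketched as follows. This is a generic illustration of guarded execution and is not taken from the cited paper: the tuple layout, the register names a, b, p and x, and the helper run_guarded are all hypothetical. A conditional branch is replaced by a straight-line sequence in which each operation carries a guard, and an operation writes its result only if its guard holds.

```python
def branching(a, b):
    """Reference version with a conditional branch."""
    if a > b:
        x = a - b
    else:
        x = b - a
    return x

# Guarded form of the same computation, without a branch.
# Each entry is (guard register, required guard value, destination, function, sources);
# a guard of None means the operation always executes.
GUARDED = [
    (None, True, "p", lambda a, b: a > b, ("a", "b")),
    ("p", True, "x", lambda a, b: a - b, ("a", "b")),
    ("p", False, "x", lambda a, b: b - a, ("a", "b")),
]

def run_guarded(ops, regs):
    """Step through every operation; write a result only when the guard holds."""
    regs = dict(regs)
    for guard, required, dest, fn, srcs in ops:
        if guard is None or regs[guard] == required:
            regs[dest] = fn(*(regs[s] for s in srcs))
    return regs

print(run_guarded(GUARDED, {"a": 7, "b": 3})["x"])
```

Because the two guarded subtractions no longer depend on a branch, they belong to the same scheduling unit as the comparison and may be scheduled together with the surrounding operations.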
Alternatively, S. A. Mahlke et al., Effective Compiler Support for Predicated Execution Using the Hyperblock, Univ. of Illinois at Urbana-Champaign, Proc. 25th Ann. Int. Workshop on Microprogramming, Portland OR, Dec. 1992, p. 45-54, have mapped their basic blocks onto a linear chain of basic blocks by duplicating basic blocks, so that each internal basic block has only a single predecessor.
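The effect of duplicating basic blocks so that each block has only a single predecessor may be sketched as follows. This is a simplified, generic illustration and not the hyperblock formation procedure of the cited paper. The control flow graph, the block names and the function unfold_to_tree are hypothetical, and the graph is assumed to be acyclic.

```python
def unfold_to_tree(cfg, entry):
    """Copy shared blocks so that every block has at most one predecessor.

    cfg maps each block name to its list of successors and must be acyclic.
    """
    tree = {}
    copies = {}  # number of copies made so far, per original block

    def copy(block):
        n = copies.get(block, 0)
        copies[block] = n + 1
        # The first visit keeps the original name; later visits get a suffix.
        name = block if n == 0 else f"{block}'{n}"
        tree[name] = [copy(s) for s in cfg[block]]
        return name

    copy(entry)
    return tree

# Hypothetical diamond: block D is reachable from both B and C.
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
tree = unfold_to_tree(cfg, "A")
print(tree)
```

In the result, block D appears twice, once below B and once as the copy D'1 below C, so that no block is entered from more than one predecessor. The price is code growth, which increases with the number of paths that reach a shared block.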
However, the present inventors have found that in many cases the above guarding can be both modified and extended to attain an improved degree of parallelism, by mapping a Directed Acyclic Graph of basic blocks onto a single higher-level basic block for inclusion in a higher-level tree of higher-level basic blocks.