The goal of compiler optimization is to transform code so that it performs better. One important factor in optimization is scheduling operations, using predicated and speculative operations, so that the program executes faster. The present invention relates to optimizing code executed on an Explicit Parallel Instruction Computing (EPIC) architecture with full predication and speculation support, and performs the global task of detecting and refining potential parallelism in the source code being compiled.
The present compiler transforms the source-code program, represented as a set of Basic Blocks, into Extended Scalar Blocks (ESBs) by applying a compiler technique called if-conversion, which replaces conditional branches in the code with comparison instructions that set a predicate. Each predicated instruction is guarded by a Boolean source operand whose value determines whether the instruction is executed or nullified.
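As an illustration only, the effect of if-conversion can be sketched with a toy interpreter for predicated operations. The names PredOp, run, and IF_CONVERTED are hypothetical and do not come from the described compiler; the sketch merely shows a branch being replaced by a comparison that sets a predicate and by two operations guarded by that predicate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class PredOp:
    """Hypothetical predicated operation: dest = fn(regs) if the guard holds."""
    dest: str
    fn: Callable[[Dict[str, int]], int]
    guard: Optional[str] = None  # predicate register name; None means unguarded
    negate: bool = False         # execute when the predicate is false instead


def run(ops: List[PredOp], regs: Dict[str, int]) -> Dict[str, int]:
    """Execute ops in order; an op whose guard fails is nullified (skipped)."""
    regs = dict(regs)
    for op in ops:
        if op.guard is not None:
            holds = bool(regs[op.guard])
            if op.negate:
                holds = not holds
            if not holds:
                continue  # nullified: no register is written
        regs[op.dest] = op.fn(regs)
    return regs


def branchy(a: int, b: int) -> int:
    """Original form with a conditional branch."""
    if a > b:
        return a - b
    return b - a


# If-converted form: one comparison sets predicate "p"; both arms are guarded.
IF_CONVERTED = [
    PredOp("p", lambda r: int(r["a"] > r["b"])),
    PredOp("x", lambda r: r["a"] - r["b"], guard="p"),
    PredOp("x", lambda r: r["b"] - r["a"], guard="p", negate=True),
]

result = run(IF_CONVERTED, {"a": 5, "b": 3})["x"]
```

Both forms compute the same value; the predicated form contains no branch, so the two guarded operations can be scheduled side by side.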
Extended Scalar Blocks are regions of predicated code in which all dependencies between operations are represented explicitly, each as a relation between two operations, across a considerable number of operations. For each ESB, the compiler works out the critical path, defined as the sequence of operations that takes the longest CPU time and cannot be executed in parallel because of dependencies.
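A minimal sketch of a critical path computation follows, assuming that each operation has a known latency in cycles and that the dependencies form an acyclic graph. The function name critical_path and the sample operations are hypothetical; the approach shown is the ordinary longest-path computation over a dependency graph, not necessarily the method used by the described compiler.

```python
from functools import lru_cache
from typing import Dict, List, Tuple


def critical_path(latency: Dict[str, int],
                  deps: Dict[str, List[str]]) -> Tuple[int, List[str]]:
    """Return (length in cycles, operations) of the longest dependency chain.

    latency maps each operation to its cycle count; deps maps an operation
    to the operations whose results it needs. The graph must be acyclic.
    """
    @lru_cache(maxsize=None)
    def finish(op: str) -> int:
        # Earliest cycle at which op can complete, given its dependencies.
        return latency[op] + max((finish(d) for d in deps.get(op, ())),
                                 default=0)

    last = max(latency, key=finish)       # operation that completes last
    path = [last]
    while deps.get(path[-1]):
        # Step back to the dependency that completes latest.
        path.append(max(deps[path[-1]], key=finish))
    path.reverse()
    return finish(last), path


# Hypothetical block: a load feeding an add/mul chain and a compare.
latency = {"load": 3, "add": 1, "mul": 2, "cmp": 1, "store": 1}
deps = {"add": ["load"], "mul": ["add"], "cmp": ["load"],
        "store": ["mul", "cmp"]}

height, path = critical_path(latency, deps)
```

In this example the chain load, add, mul, store determines the height of the block, while cmp lies off the critical path and can execute in parallel with add.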
The ESBs are blocks in a control flow graph where control may enter only from the top but may exit from one or more locations. A control flow graph is a set of nodes and directed edges. For an ESB, a node represents some speculative and predicated code that is executed from the beginning of the block to one of its exits. An edge between a first node and a second node indicates that the first node may pass control to the second node. FIG. 1 depicts a control flow indicating some properties of the ESB. The ESB is preceded by control flow predecessor blocks CFP 1 and 2. Note that control only enters from the top of the ESB. In this example control exits from the side and bottom of the ESB to control flow successors CFS 1 and 2.
ESBs may have different lengths and total run times depending on factors such as the execution counter, the number of operations, and the dependencies between them. The executable code will be more efficient when the execution workload is balanced between blocks in the control flow, i.e., when the execution workload of some "hard" blocks is shifted onto other blocks that are executed less frequently and/or have some free execution resources.
Migration of operations between Basic Blocks (blocks not in predicated form) is disclosed in U.S. Pat. No. 5,557,761. However, no techniques applicable to ESBs are disclosed. Accordingly, existing compilers do not have facilities for efficiently balancing the ESBs in the control flow.
According to one aspect of the invention, a variant of code motion optimization is developed for more powerful regions than Basic Blocks, i.e., for Extended Scalar Blocks, which are a predicated form of the intermediate representation of a program based on such features of modern architectures as speculative execution, full predicated execution, and enough processor resources for instruction level parallelism (ILP). Optimization is therefore performed on a considerable number of operations with explicitly expressed dependencies between them, which gives more precise criteria for selecting "hard" regions (taking profiling information into account) and for selecting the number of operations to migrate.
According to another aspect of the invention, the "hardness" of various blocks in a control flow is analyzed to identify which blocks consume more than a threshold level of execution resources. Blocks identified are then analyzed to determine whether resource consuming operations can be "unloaded" to be executed in parallel with operations of the blocks which are control flow predecessors.
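The selection of hard blocks might be sketched as follows. The hardness metric used here, execution count multiplied by block height, and the threshold expressed as a share of the total are assumptions made for illustration; the text above states only that a threshold level of execution resources is applied. BlockProfile and hard_blocks are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockProfile:
    """Hypothetical per-block profile data."""
    name: str
    exec_count: int  # how often the block ran, from profiling
    height: int      # critical-path length of the block in cycles


def hard_blocks(blocks: List[BlockProfile],
                threshold_share: float = 0.5) -> List[str]:
    """Return names of blocks whose estimated share of total cycles
    exceeds threshold_share (assumed metric: exec_count * height)."""
    total = sum(b.exec_count * b.height for b in blocks)
    if total == 0:
        return []
    return [b.name for b in blocks
            if b.exec_count * b.height / total > threshold_share]


profile = [
    BlockProfile("A", exec_count=100, height=2),
    BlockProfile("B", exec_count=1000, height=12),
    BlockProfile("C", exec_count=900, height=3),
]
selected = hard_blocks(profile)
```

Here block B accounts for roughly four fifths of the estimated cycles, so it alone is selected as a candidate for unloading.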
According to another aspect of the invention, operations are unloaded to control flow predecessors when the predecessors have free resources to execute the unloaded operations without increasing the overall execution time of the program.
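One simplified way to test for free resources in a predecessor is sketched below. It assumes the predecessor's schedule is known as a count of occupied issue slots per cycle, that an unloaded operation must both start and complete within the existing schedule, and that operand dependencies are ignored; a real scheduler would have to account for them. The function find_free_slot is hypothetical.

```python
from typing import List, Optional


def find_free_slot(used_per_cycle: List[int], issue_width: int,
                   latency: int) -> Optional[int]:
    """Return the earliest cycle at which an operation of the given latency
    can be issued without lengthening the schedule, or None if none exists.

    Simplifying assumptions: one issue slot per operation, no operand
    dependencies, and the result must be ready by the end of the block.
    """
    last_start = len(used_per_cycle) - latency
    for cycle in range(last_start + 1):  # empty range when last_start < 0
        if used_per_cycle[cycle] < issue_width:
            return cycle
    return None


# Hypothetical predecessor: four cycles on a machine that issues 4 ops/cycle.
schedule = [4, 4, 2, 4]
slot_for_two_cycle_op = find_free_slot(schedule, issue_width=4, latency=2)
slot_for_three_cycle_op = find_free_slot(schedule, issue_width=4, latency=3)
```

The two-cycle operation fits into the half-empty third cycle, whereas the three-cycle operation would extend the predecessor and is therefore not unloaded under these assumptions.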
According to one aspect of the invention, an optimizing compiler includes code for migrating operations out of a hard ESB, i.e., an ESB having an excessive number of operations, to control flow predecessors of the hard ESB.
According to another aspect of the invention, critical operations are identified in a hard ESB and successively migrated out of the hard ESB to reduce its height.
According to another aspect of the invention, criteria for identifying critical operations include whether an operation requires dynamic memory access, requires multiple cycles, or has a large number of successors.
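The three criteria can be expressed as a simple predicate, sketched here under the assumption that "a large number of successors" means more than some fixed limit. The Op record and the default limit of 3 are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Op:
    """Hypothetical summary of one operation in an ESB."""
    name: str
    latency: int           # cycles needed to produce the result
    accesses_memory: bool  # True for dynamic memory accesses such as loads
    successors: int        # number of operations that depend on this one


def is_critical(op: Op, max_successors: int = 3) -> bool:
    """Apply the three criteria: memory access, multi-cycle, many successors.

    max_successors is an assumed cutoff for "a large number of successors".
    """
    return (op.accesses_memory
            or op.latency > 1
            or op.successors > max_successors)


ops = [
    Op("load_x", latency=3, accesses_memory=True, successors=2),
    Op("add", latency=1, accesses_memory=False, successors=1),
    Op("mul", latency=2, accesses_memory=False, successors=1),
    Op("addr", latency=1, accesses_memory=False, successors=5),
]
critical = [op.name for op in ops if is_critical(op)]
```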
According to another aspect of the invention, migrated operations are followed by a register write operation in the predecessor block and replaced by a register read operation in the hard ESB.
According to another aspect of the invention, operations migrated to a control flow predecessor block store result operands in virtual registers, and critical operations in a source block are replaced by read operations.
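The write and read pairing described in the two preceding aspects can be sketched as a transformation on simple operation lists. Blocks are modeled as lists of (destination, expression) pairs, which is an assumed representation, and the migrate function is hypothetical. The sketch also assumes that the operation is copied into every control flow predecessor, so that the virtual register holds a valid value whichever way control reaches the hard block.

```python
from typing import List, Tuple

Operation = Tuple[str, str]  # (destination, expression), assumed representation
Block = List[Operation]


def migrate(hard_block: Block, index: int, predecessors: List[Block],
            vreg: str) -> Tuple[Block, List[Block]]:
    """Move hard_block[index] into each predecessor.

    In each predecessor the operation is followed by a write of its result
    to the virtual register vreg; in the hard block it is replaced by a
    read of vreg. New lists are returned and the inputs are left unchanged.
    """
    dest, expr = hard_block[index]
    new_preds = [pred + [(dest, expr), (vreg, f"write {dest}")]
                 for pred in predecessors]
    new_hard = list(hard_block)
    new_hard[index] = (dest, f"read {vreg}")
    return new_hard, new_preds


# Hypothetical hard block whose first operation is a multi-cycle load.
hard = [("t1", "load [a]"), ("t2", "t1 + 1")]
preds = [[("a", "base + 8")], [("a", "base + 16")]]

new_hard, new_preds = migrate(hard, 0, preds, vreg="v0")
```

After the transformation the load executes in the predecessors, possibly speculatively if a predecessor can exit elsewhere, and the hard block starts from a register read instead of waiting on memory.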
Other features and advantages of the invention will be apparent in view of the following detailed description and appended drawings.