An overview of the technology relating to the present invention will be stated first. In behavioral synthesis, in order to decide the order of execution of operations and perform scheduling for deciding the execution steps from a control/data flow graph (CDFG), which represents the flow of control and data of a circuit that includes conditional branches, it is necessary to analyze the mutual exclusiveness of execution of operations in the control/data flow graph and determine whether identical arithmetic units can be shared. Such mutual exclusiveness can be determined by directly examining the control/data flow graph to check whether the target operations belong to different branches.
<Speculative Execution of Single Conditional Branch>
Speculative execution of a single conditional branch will be described. Speculative execution in scheduling means scheduling operations in a branch and operations in a conditional clause in control steps that precede a determination as to whether a condition is true or false.
In a case where code motion is carried out, instruction codes inside a condition are actually moved outside of the conditional clause.
The types of speculative execution will be described first. Speculative execution is divided broadly into the following types:
speculative execution with respect to a single conditional branch statement; and
speculative execution among multiple conditional branch statements (parallelization of conditional branch statements).
FIGS. 1A-1E illustrate several typical examples of speculative execution with respect to a single conditional branch. In FIGS. 1A-1E, the rectangles indicate basic blocks, and characters T and F represent true (T) and false (F) branches, respectively, of a conditional branch. The solid-line arrows between the basic blocks indicate control dependency between the basic blocks.
FIG. 1A illustrates a case where operations are executed “in order”, i.e., in the order set forth in the behavioral description:
t=b+c;  (1)
if (c>a*b)
{t1=x+y; t2=t1+z; t3=t2+u; o=t3+s;}
else
{t1=x+t; o=t1+3;}
y=x*c;
Examples in which this is speculatively executed are as shown in FIG. 1B to FIG. 1E. It should be noted that in FIG. 1B to FIG. 1E, all of the images shown are those obtained after code motion.
“Branch node speculation and push-down” in FIG. 1B is an example (“Push-down”) in which two operations (t1=x+y and t2=t1+z) of the THEN clause of a conditional statement have been speculatively executed and an operation [t=b+c; (1) in FIG. 1A] that precedes the conditional statement [if (c>a*b)] has been moved to the side of the ELSE clause. This utilizes the fact that the variable t is referred to solely by the ELSE clause. By scheduling the addition operation (b+c) at (1) to a cycle later than the IF clause, it is caused to be executed solely by the ELSE clause. The meaning of the description is not changed by this code motion. In this case, code correction referred to as “bookkeeping” is unnecessary.
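The claim that push-down leaves the meaning unchanged can be checked with a minimal sketch. The Python functions below are illustrative only: the names `in_order` and `push_down` and the renamed temporaries `spec_t1` and `spec_t2` are assumptions and do not appear in the behavioral description. The first function follows the description of FIG. 1A as written; the second models the arrangement described for FIG. 1B.

```python
def in_order(a, b, c, x, y, z, u, s):
    """Behavioral description of FIG. 1A, executed in the written order."""
    t = b + c                      # statement (1)
    if c > a * b:
        t1 = x + y
        t2 = t1 + z
        t3 = t2 + u
        o = t3 + s
    else:
        t1 = x + t
        o = t1 + 3
    y_out = x * c                  # statement after the branch (y=x*c)
    return o, y_out


def push_down(a, b, c, x, y, z, u, s):
    """Sketch of FIG. 1B: two THEN additions hoisted, (1) pushed into ELSE."""
    spec_t1 = x + y                # executed before the condition is known
    spec_t2 = spec_t1 + z
    if c > a * b:
        t3 = spec_t2 + u
        o = t3 + s
    else:
        t = b + c                  # statement (1), now executed only here
        t1 = x + t
        o = t1 + 3
    y_out = x * c
    return o, y_out
```

Both functions return the same pair (o, y) for any inputs, because the hoisted additions have no side effects and t is read only in the ELSE clause.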
“Late conditional execution” in FIG. 1C is speculative execution in which the THEN clause {t1=x+y; t2=t1+z; t3=t2+u; o=t3+s;} and the ELSE clause {t1=x+t; o=t1+3;} are both executed in the cycle preceding the operation of the conditional clause. This is a useful conversion in a case where there are plentiful arithmetic units and execution of the conditional clause is not performed in an earlier cycle.
“Early conditional execution” in FIG. 1D corresponds to a case that is the opposite of FIG. 1C. Here the conditional operation is executed in a cycle that precedes the operation [the statement t=b+c; (1) in FIG. 1A] of the statement preceding the conditional branch statement. That is, the conditional operation is executed as early as possible. Although t=b+c; (1) is executed by either the THEN clause or the ELSE clause, it is copied to the THEN clause and ELSE clause in a case where code motion is performed, and thus is moved to both of these clauses. In the case of a processor having condition-flagged instructions, etc., the results of conditional branches may be retained and operations inside a conditional may execute the condition-flagged instructions.
“Duplication-up” in FIG. 1E is speculative execution in which the operational instruction (y=x*c;) of the statement that succeeds the conditional statement is moved by being copied into the respective conditional clauses of THEN and ELSE.
Depending on how FIGS. 1D and 1E are utilized, it is possible to reduce the number of execution cycles with the same number of arithmetic units.
Although an example of an IF statement (two branches) is illustrated in FIGS. 1A-1E, similar speculative execution is conceivable also with multiple branch instructions such as a switch statement in the C language.
As illustrated above, a variety of types of speculative execution are conceivable with respect to a single conditional branch. In this specification, speculative execution with respect to a single conditional branch is referred to as “local” speculative execution.
It should be noted that with scheduling according to the present invention, speculative executions, including ones described later, can all be implemented efficiently, as will be set forth below.
<CDFG>
A control/data flow graph (CDFG) will be described next. A CDFG represents control and data dependencies simultaneously. FIG. 2 illustrates one example of this. A CDFG utilized in behavioral synthesis will be described below. CDFGs have a format in which the interior of a basic block is held by the DFG (Data Flow Graph) and the control relationship between basic blocks is held by the CFG (Control Flow Graph) (a directed graph having basic blocks as nodes) (see FIGS. 1A-1E).
However, with this CDFG representation format, the control-dependency relationship between basic blocks is mainly expressed and data lines between basic blocks are not directly connected. In order to perform global scheduling, it is necessary to move several operators (apply code motion) between basic blocks, and it is difficult to find motion that is effective as a whole.
For this reason, a method (trace scheduling) of creating specific paths (traces) separately has been adopted. With this method, however, all traces cannot be optimized. The CDFG described below is a data representation that overcomes the above problems and readily takes global scheduling into consideration.
With the CDFG, the control structure of a conditional branch is given for every data line.
In FIG. 2,
Δ (referred to as a “fork node”) represents data FORK, and
∇ (referred to as a “join node”) represents data JOIN.
Here JOIN is considered. That is, the values of variables a, b, and c are forked (FORK) at the value of the conditional operation (a>b) and are selected (JOIN) by the result of the conditional operation (a>b) after both branches are computed.
The lines from “a>b”, which is a conditional operation, to the FORK nodes and JOIN node are referred to as “condition lines”. The lines indicating flow of data are referred to as “data lines”.
With the conventional method, basic blocks are expressed explicitly. With a CDFG, however, it is noteworthy that basic blocks are not expressed explicitly and that the data-dependency relationships among a plurality of basic blocks are expressed directly.
A plurality of basic blocks can be represented as a single CDFG without grouping being performed on a block-by-block basis. This CDFG is suited for performing global scheduling.
In a case (“in-order”) where a conditional is resolved before an operation inside a conditional branch, an ordinary branch operation is carried out. When the conditional is executed late [“late conditional execution” at FIG. 1C], FORK and JOIN operations are actually performed (both are executed and the value is selected last).
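The difference between an ordinary branch and the FORK/JOIN behavior of late conditional execution can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the helper `join` are hypothetical, and the clauses are taken from the behavioral description of FIG. 1A with the chained additions written as single expressions.

```python
def join(cond, then_value, else_value):
    """JOIN node: select one of two already-computed data values."""
    return then_value if cond else else_value


def in_order_branch(a, b, c, x, y, z, u, s):
    """Ordinary branch: the conditional is resolved first, one side runs."""
    if c > a * b:
        return x + y + z + u + s
    return x + (b + c) + 3


def late_conditional(a, b, c, x, y, z, u, s):
    """Late conditional execution: both sides run, then JOIN selects."""
    then_value = x + y + z + u + s     # THEN side, computed unconditionally
    else_value = x + (b + c) + 3       # ELSE side, computed unconditionally
    cond = c > a * b                   # conditional resolved last
    return join(cond, then_value, else_value)
```

The two forms agree only because the operations in both clauses are free of side effects; the selection at the JOIN node discards the value of the clause that was not taken.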
The CDFG is capable of representing these various speculative executions with the form of the speculative execution of a single conditional branch statement being kept as is (i.e., without code motion). (In actuality, it is possible for a “condition vector”, described below, to be appended to each operation node.)
<Condition Vector>
A condition vector is vector representation introduced in order to detect mutual exclusiveness of operations that are based upon a conditional branch. By attaching a condition vector to the operation node of a CDFG, all speculative executions can be represented in a unified manner and scheduling for the purpose of optimizing all paths can be carried out efficiently. In other words, as mentioned above, a scheduler can handle operations inside multiple conditional branches in the manner of operations inside a single basic block without being aware of the basic block.
First, a condition vector will be described and then scheduling that is based upon a condition vector will be described.
<Basic Condition Vector>
A condition vector represents, in the form of a vector, the execution conditions of a conditional branch (an IF statement or SWITCH statement in C language). This will be described with reference to FIGS. 3A-3D. The behavioral description on the left side in FIG. 3A indicates three levels of nested IF statements by pseudo-C code (see Patent Documents 1 and 2, etc.).
In FIG. 3A, “a;”, “b;”, etc., designate “statements” in the C language, and “c1”, “c2”, etc., signify operations in a conditional. In FIG. 3C, the control structure of the IF statements of this behavioral description is represented by binary trees.
A condition vector (CV) is attached as follows: First, a one-hot encoded vector (a vector of a code in which only one component is “1”) is created at each node that is a “leaf” of the “tree” representing the conditional branches (shown on the right).
In the example in FIG. 3C, there are three levels and four branches; c, e, f and g are leaf nodes, and the following vectors are applied to the respective nodes, by way of example:
c: [1,0,0,0]
e: [0,1,0,0]
f: [0,0,1,0]
g: [0,0,0,1]
Each vector has only one “1” component, and the vectors are mutually perpendicular (the inner product of any two distinct vectors is equal to zero; the vectors are linearly independent).
The bitwise logical sum (OR) of the vectors of the child nodes is applied to the respective parent node. For example, the condition vector of node d is calculated as the bitwise logical OR of the vector [0,1,0,0] of e and the vector [0,0,1,0] of f, which are the child nodes of d, and therefore the condition vector of node d is as follows:
d: [0,1,1,0]
Furthermore, since the condition vector of node b is calculated as the bitwise logical OR of the vector [1,0,0,0] of c and the vector [0,1,1,0] of d, the condition vector of node b is as follows:
b: [1,1,1,0]
Node a is the result of all the bitwise logical ORs and is [1,1,1,1].
The condition vector of conditional c1 is the same as that of a, namely [1,1,1,1], and the condition vector of conditional c2 is the same as that of b, namely [1,1,1,0].
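The bottom-up assignment described above can be sketched in a few lines of Python. This is an illustration only: the parent/child table is inferred from the text (a has children b and g, b has children c and d, d has children e and f), and the names `CHILDREN`, `LEAF_ORDER` and `basic_condition_vectors` are hypothetical.

```python
# Control structure inferred from the text for FIG. 3C (an assumption):
# a -> (b, g), b -> (c, d), d -> (e, f); c, e, f, g are leaves.
CHILDREN = {"a": ["b", "g"], "b": ["c", "d"], "d": ["e", "f"]}
LEAF_ORDER = ["c", "e", "f", "g"]   # order in which one-hot codes are assigned


def basic_condition_vectors(children, leaf_order):
    """Return {node: vector} with one 0/1 component per leaf."""
    n = len(leaf_order)
    bcv = {}
    for i, leaf in enumerate(leaf_order):
        bcv[leaf] = [1 if j == i else 0 for j in range(n)]   # one-hot code

    def visit(node):
        if node not in bcv:                       # interior node
            vec = [0] * n
            for child in children[node]:
                child_vec = visit(child)
                vec = [p | q for p, q in zip(vec, child_vec)]   # bitwise OR
            bcv[node] = vec
        return bcv[node]

    for node in children:
        visit(node)
    return bcv


BCV = basic_condition_vectors(CHILDREN, LEAF_ORDER)
```

With this table, `BCV["d"]` is [0,1,1,0], `BCV["b"]` is [1,1,1,0] and `BCV["a"]` is [1,1,1,1], matching the values stated in the text.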
Illustrated in FIG. 3B are two-level nested IF statements (four branches) and the corresponding basic condition vectors. Although FIGS. 3A and 3B have different conditional branch structures, the number of leaf nodes is equal. The numbers of dimensions therefore are the same and the condition vectors of the leaf nodes are similarly assigned one-hot codes. The vectors of nodes other than leaf nodes also are obtained by taking the bitwise logical OR between child BCVs (Basic Condition Vectors).
The number of dimensions of a vector is equal to the number of leaves. In a case where IF statements are nested, therefore, we have the following:
number of dimensions of vector=“number of conditionals of IF statements+1”.
When there is a SWITCH statement, the condition vector is defined in a similar manner. Whereas an IF statement is a two-branch structure, a SWITCH statement has multiple branches. Since the number of branches in the case of SWITCH only is the number of leaves, this is equal to the “number of CASE clauses of the SWITCH statement”. If, in a case where IF statements overlap in a SWITCH statement, a plurality of SWITCH statements overlap, then the number of dimensions is reduced by one for each overlap. What is noteworthy is that the number of dimensions grows only linearly, namely as the number of IF statements+1, rather than exponentially with the number of IF statements.
The components of a condition vector are the binary values “0” and “1”. A condition vector is usually held in an int (integer type) of 32 bits (or 64 bits), one bit per dimension, so that the bitwise logical OR operation necessary for computation can be implemented at high speed (in one cycle). In the implementation of a program, a tree structure composed of IF statements and SWITCH statements is depth-first searched (or tree walked) and the number of leaves is counted. Further, one-hot encoding in which “1” is appended is performed in the order of the depth-first search, starting from the higher-order bits. More specifically, [1,0,0 . . . ], [0,1,0 . . . ] are allocated in order starting from the leaves nearest the top of the behavioral description.
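A packed-integer version of this depth-first allocation might look like the following sketch. The nested-tuple tree encoding, the function name `assign_bcv` and the example tree (the three-level nest as inferred from the text for FIG. 3C) are assumptions made for illustration; Python integers stand in for the 32-bit or 64-bit int mentioned above.

```python
def assign_bcv(tree):
    """Depth-first walk over a nested-tuple branch tree.

    A leaf is a string label; an interior node is (label, [subtrees...]).
    Returns (number_of_leaves, {label: condition vector packed in an int}).
    """
    def count(node):
        if isinstance(node, str):
            return 1
        return sum(count(child) for child in node[1])

    leaves = count(tree)
    bcv = {}
    next_bit = [leaves - 1]           # start from the highest-order bit

    def walk(node):
        if isinstance(node, str):
            bcv[node] = 1 << next_bit[0]     # one-hot code for this leaf
            next_bit[0] -= 1
            return bcv[node]
        label, subtrees = node
        value = 0
        for child in subtrees:
            value |= walk(child)             # bitwise OR of the child vectors
        bcv[label] = value
        return value

    walk(tree)
    return leaves, bcv


# Hypothetical encoding of the nest: a -> (b, g), b -> (c, d), d -> (e, f).
TREE = ("a", [("b", ["c", ("d", ["e", "f"])]), "g"])
LEAVES, PACKED = assign_bcv(TREE)
```

Printing `format(PACKED["b"], "04b")` gives the string 1110, which reads the same as the vector [1,1,1,0] because the leftmost component is held in the highest-order bit.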
<BCV>
A condition vector thus defined in terms of the behavioral description is referred to as a “basic condition vector”. A basic condition vector is defined as follows with respect to one nested conditional branch:
<Definition of Basic Condition Vector (BCV)>
number of dimensions=number of leaves;
BCV(n)=one-hot encoding (in a case where operation n is at a leaf);
BCV(n)=BCV(s) or BCV(t) or BCV(u) or BCV(v) (in a case where n is other than a leaf);
where
operations s, t, u, v are operations that follow n (along depth direction of the branch), and
“or” denotes a bitwise logical OR.
One component of a condition vector is such that the execution condition of an operation is indicated by the OR of leaf conditions. Accordingly, in a case where the same components in relation to condition vectors of two operation nodes do not have a 1 in common, operations are executed only mutually exclusively (if one operation node is executed, then the other is not). Such mutual exclusiveness is referred to as “conditional mutual exclusiveness”.
On the other hand, mutual exclusiveness in a case where an operation is executed in a different cycle is referred to as “temporal mutual exclusiveness”.
Conditionally mutually exclusive operations are never executed simultaneously, just as in the case of temporal mutual exclusiveness, and therefore it is possible for them to share the same arithmetic unit.
When condition vectors are utilized, the exclusiveness of two operations can be discriminated immediately from the fact that the bitwise logical AND of the condition vectors is “0” (the test can be executed by a single-clock instruction of the CPU).
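The exclusiveness test reduces to one AND and one comparison. The sketch below is illustrative; the function name `conditionally_exclusive` is hypothetical, and the constants are the condition vectors of nodes c, d, e, f and g of FIG. 3C packed into integers with the leftmost component in the highest bit.

```python
def conditionally_exclusive(cv1, cv2):
    """True when two packed condition vectors share no '1' component."""
    return (cv1 & cv2) == 0


# Condition vectors of FIG. 3C: c=[1,0,0,0], d=[0,1,1,0], e=[0,1,0,0],
# f=[0,0,1,0], g=[0,0,0,1].
C, D, E, F, G = 0b1000, 0b0110, 0b0100, 0b0010, 0b0001
```

For example, `conditionally_exclusive(C, D)` is true, so operations at c and d could share an arithmetic unit, whereas `conditionally_exclusive(D, E)` is false because an operation at d is executed whenever leaf e is reached.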
Conditional mutual exclusiveness based upon a branch instruction includes conditional mutual exclusiveness based upon data dependency besides that described explicitly in a behavioral description to the effect that the operations are in respectively different branches. This is implicit mutual exclusiveness that does not appear explicitly in terms of the behavioral description. This is referred to as “data-dependent conditional mutual exclusiveness” or “implicit conditional mutual exclusiveness”.
The mutual exclusiveness based upon data dependence is conditional mutual exclusiveness that appears between operations in a branch, when the result of the operations is not used in the branch. For example, in the behavioral description at FIG. 4A, the result of “t=b+c;” is not used in the THEN clause of the succeeding IF statement and is used only by the ELSE clause (o=x+t;). In other words, the value of the variable t is utilized only in a case where the result of the conditional “(a*b<c)” is false; the value of variable t is not utilized in a case where the result of the conditional is true.
With regard to this implicit mutual exclusiveness, conditional mutual exclusiveness can be output explicitly without changing the meaning of the behavior by code-moving “t=b+c” into the ELSE clause, as in the example of implementation based upon code motion in FIG. 4B. However, code motion is not necessarily advantageous in raising speed and is disadvantageous in a case where it is more advantageous to execute “t=b+c” before the conditional.
From a global view, whether code motion is to be performed or not is comparatively difficult to select so as to be actually advantageous for scheduling.
<ECV>
Data-dependent mutual exclusiveness is analyzed by applying condition vectors in a CDFG. A condition vector defined in a CDFG is referred to as an “Extended Condition Vector” (ECV).
How an ECV is applied will be described with reference to FIG. 4C. First, a basic condition vector (BCV) is applied to the input of a JOIN node, and the BCV is applied as the ECV of the operation node directly connected to the JOIN node (this BCV is a value the same as that of the BCV possessed by the operation node itself).
The ECVs of other operation nodes can be found by the bitwise logical ORs of ECVs of the succeeding operations of the data flow.
A BCV obtained in the behavioral description is decided by the position at which an operation is described (e.g., the position in a tree structure). By contrast, with an ECV, a logical OR is taken along the data line of a CDFG, and therefore implicit conditional mutual exclusiveness that takes data dependency into account can be detected.
In FIG. 4A, the BCV of “t=b+c” mentioned earlier is [1,1]. The ECV, however, is [0,1], as indicated at FIG. 4C. The ECV of [0,1] indicates that the result “t” of this operation (“t=b+c”) is used solely by the ELSE clause.
Here the addition operation “t=b+c” cannot actually share an arithmetic unit with the addition operation of the THEN clause immediately.
The reason for this is that it is necessary to compute and store the value of t before it is resolved whether the result of the conditional is true or false. That is, in order to share an arithmetic unit based upon data-dependent mutual exclusiveness, it is necessary that the condition clause (“a*b<c”) be executed (scheduled) before “t=b+c”.
This relationship indicates that conditional mutual exclusiveness changes dynamically depending upon whether the reverted condition clause has been resolved or not.
An extended condition vector ECV(n) of a certain operation node n is defined as set forth below.
ECV(n)=[1] in a case where n is a node immediately preceding the terminal node;
ECV(n)=BCV(n) in a case where n is a node immediately preceding a JOIN node or is a conditional operation; and
ECV(n)=ECV(k) or ECV(l) or . . . or ECV(m) in a case other than the two cases mentioned above;
(where k, l and m are successor nodes to the node n, and [1] and [0] indicate vectors in which all of the respective components are 1, 0).
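The three cases of this definition can be sketched as a memoized walk along the data lines. Everything in the block is an assumption made for illustration: the function and parameter names, the node names, and the small graph, which models only the operations named in the text for FIG. 4A (“t=b+c”, the conditional “a*b<c”, “o=x+t” in the ELSE clause) plus one hypothetical THEN addition. Vectors are packed so that [1,0] is 0b10 and [0,1] is 0b01, and the graph is assumed to be acyclic.

```python
def extended_condition_vectors(successors, bcv, before_join, conditionals,
                               before_terminal, dims):
    """Return {node: ECV} following the three-case definition."""
    ones = (1 << dims) - 1          # the vector [1]: every component is 1
    ecv = {}

    def visit(n):
        if n in ecv:
            return ecv[n]
        if n in before_terminal:
            value = ones
        elif n in before_join or n in conditionals:
            value = bcv[n]
        else:
            value = 0
            for k in successors[n]:
                value |= visit(k)   # OR over the data-flow successors
        ecv[n] = value
        return value

    for n in successors:
        visit(n)
    return ecv


# Hypothetical operation nodes; successors list only data-flow successors
# that are themselves operation nodes.
SUCCESSORS = {"mul": ["lt"], "lt": [], "t_add": ["else_add"],
              "then_add": [], "else_add": []}
BCV = {"mul": 0b11, "lt": 0b11, "t_add": 0b11,
       "then_add": 0b10, "else_add": 0b01}
ECV = extended_condition_vectors(SUCCESSORS, BCV,
                                 before_join={"then_add", "else_add"},
                                 conditionals={"lt"},
                                 before_terminal=set(), dims=2)
```

Here `ECV["t_add"]` is 0b01, i.e. [0,1], although its BCV is [1,1]; this is the data-dependent mutual exclusiveness described for “t=b+c”.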
<ACV>
Dynamic mutual exclusiveness is mutual exclusiveness decided dynamically depending upon whether the true/false decision regarding a conditional clause in a conditional branch has been completed or not. More specifically, each type of speculative execution possesses this dynamic mutual exclusiveness, and what expresses this dynamic mutual exclusiveness is an “Active Condition Vector” (ACV).
A case where a behavioral description in FIG. 5A is speculatively executed will be considered. In a case where only the two additions in “x+y+z” of the THEN clause are speculatively executed before the conditional clause, the result is FIG. 5B if code motion is performed. Whether or not such code motion is effective changes depending upon the number of arithmetic units which can be used. This means that performing code motion uniformly is not advisable.
On the other hand, such speculative execution can be implemented utilizing an active condition vector (ACV).
FIG. 5C illustrates an example in which scheduling is performed “in order” with the constraint that the number of adders is one. FIG. 5D illustrates the result of scheduling that takes speculative execution into consideration with the constraint that the number of adders is one.
In view of FIGS. 5C and 5D, the maximum number of cycles can be reduced from “5” to “3” while the number of adders is kept the same.
In-order execution is such that after the conditional clause “(a*b<c)” ends, the THEN clause or ELSE clause is executed in line with the result of the condition test. The execution cycles necessary for each are five cycles or three cycles.
In a case where the conditional clause has been resolved, the active condition vector ACV has the same value as that of the extended condition vector ECV.
In a case where speculative execution is carried out, the active condition vector ACV changes dynamically in dependence upon whether speculative execution has been executed or not, as illustrated in FIG. 5D.
This indicates that the active condition vector ACV expresses that the execution condition actually changes in a case where speculative execution has been performed.
In FIG. 5D, the initial addition “x+y” and the second addition “+z” of the THEN clause in FIG. 5A have been scheduled speculatively into the first cycle and the second cycle, respectively (a cycle into which an operation is scheduled is referred to as a “control step”).
In the operation (“a*b<c”) of the conditional clause, “*” and “<” have been scheduled in the first control step and the second control step, respectively, and the true/false of the conditional operation has not been resolved.
In this case, whether the THEN clause or the ELSE clause is to be executed is unclear. In FIG. 5D, therefore, the two addition operations of the THEN clause must always be executed regardless of the result of the condition. In other words, the operation is that of FIG. 5B, which is an example in which code motion has been implemented.
That is, if a certain operation is scheduled by speculative execution in a control step for which a condition has not been resolved, then the conditional mutual exclusiveness of the THEN clause and ELSE clause vanishes in this control step.
In FIG. 5D, the condition vector of the addition node (+) of control step 2 is [1,1] as a result of taking the bitwise logical OR between [1,0] and the condition vector [1,1] of the indeterminate condition [(a*b<c)].
What expresses mutual exclusiveness converted in accordance with whether a conditional is resolved or unresolved is the active condition vector ACV.
The active condition vector ACV of an operation node is calculated as the bitwise logical OR of its extended condition vector ECV and the ECV of an unresolved conditional operation of the conditional clause to which the operation reverts. In FIG. 5D, the ECVs of the two additions (the two additions in s=x+y+z) which are speculatively executed in control steps 1 and 2 are both [1,0]. However, since the OR is taken with the ECV [1,1] of the operation “<” of the unresolved condition of the conditional clause to which the operation reverts, the ACVs of the two additions are both [1,1].
The meaning of this is that the two additions (the two additions in s=x+y+z) are originally contained in the THEN clause and therefore this ACV is [1,0]. However, if the additions are speculatively executed and scheduled to control steps for which conditions have not been resolved, then the operations should have been executed under the conditions of both the THEN clause and the ELSE clause. Therefore the ACV may be considered to have been a vector [1,1].
That is, the active condition vector ACV of an operation node can be computed by a simple (executable by a single cycle instruction of the CPU) operation, namely bitwise logical OR operation of the ECV of the node and the ECV of an unresolved conditional operation reverting to the node, and conditional mutual exclusiveness at the time of a speculative execution can be expressed correctly.
At this time it is important that expression be performed with no change in the configuration or topology (connection relationship) of the CDFG and only with a change in the value of the condition vector attached to the operation node.
Since the ACV is just calculated one time when an operation node is scheduled, it is not necessary to re-perform computation again and again. In other words, the amount of computation involving the ACV is very small [proportional to the number of conditionals (usually on the order of two to seven)].
In FIGS. 5A-5D, scheduling is carried out under an arithmetic-unit constraint that the number of utilizable adders is one. In FIG. 5C, an addition having the [1,0] vector and an addition having the [0,1] vector utilize the adder mutually exclusively in step 3 (mutually exclusively because the same components do not have a 1 in common).
In FIG. 5D, [1,0] and [0,1] share a single adder in step 3. In steps 1 and 2, on the other hand, addition is always executed and therefore the arithmetic unit cannot be shared.
Another feature of the ACV is that the necessary number of arithmetic units can be computed upon taking into consideration these dynamically changing possibilities of sharing of the resources. This will be described later.
<Definition of Active Condition Vector ACV(n) with Respect to Operation Node n>
ACV(n)=ECV(n) or ECV(Ci) or ECV(Cj) or . . . or ECV(Ck)
where Ci, Cj, . . . and Ck each indicate the final node (an operation that decides the true/false of a condition) of a conditional clause of a conditional branch statement to which the node n belongs, and which is unresolved (not yet scheduled) in the control step for which the ACV is to be obtained.
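The definition can be sketched directly as an OR over the conditionals that are still unresolved. The function name, the dictionary of enclosing conditionals and the set of resolved conditionals are hypothetical conveniences; the example values are those given for FIG. 5D, with [1,0] packed as 0b10 and [1,1] as 0b11.

```python
def active_condition_vector(ecv_n, enclosing_conditionals, resolved):
    """ACV(n) = ECV(n) OR ECV(C) for each enclosing conditional C unresolved.

    enclosing_conditionals: {conditional name: its ECV as a packed int}
    resolved: set of conditional names already resolved in this control step
    """
    acv = ecv_n
    for name, ecv_c in enclosing_conditionals.items():
        if name not in resolved:
            acv |= ecv_c
    return acv


# THEN-clause addition of FIG. 5D: ECV [1,0]; its conditional has ECV [1,1].
SPECULATED = active_condition_vector(0b10, {"c1": 0b11}, resolved=set())
AFTER_COND = active_condition_vector(0b10, {"c1": 0b11}, resolved={"c1"})
```

`SPECULATED` is 0b11 ([1,1]) and `AFTER_COND` is 0b10 ([1,0]), so only the topology-independent vector value changes between the speculative and the resolved case.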
Illustrated in FIG. 6A are basic condition vectors BCV, extended condition vectors ECV and active condition vectors ACV with respect to the following behavioral description:
1: t=b+c;
2: if (a*b<c) /*c1*/
3: o=x+y+z+s+u;
4: else
5: o=x+t;
Here ACV indicates the value when the condition “c1” is unresolved.
The BCV of the addition (“t=b+c”, referred to as “addition 1”) on the first line is [1,1], and the ECV thereof is [0,1], indicating that there is data-dependent mutual exclusiveness.
With respect to this behavioral description, FIG. 6B illustrates an example in which speculative execution of the two additions of the THEN clause (the two addition operations that constitute v=x+y+z) and push-down of addition 1 (t=b+c) to the ELSE clause have been performed by code motion. It can be said that it is difficult to perform such code motion with the aim of achieving overall optimization.
FIGS. 6C and 6D show the results of scheduling, performed by a technique that uses condition vectors, under the constraint that there is one adder available as an arithmetic unit.
FIG. 6C is a diagram illustrating an example of in-order scheduling, and FIG. 6D is a diagram illustrating an example of scheduling in which speculative execution of a THEN clause and push-down to an ELSE clause have been performed.
With in-order execution in FIG. 6C, a maximum of six cycles are required. In out-of-order execution in FIG. 6D, an improvement is made to a maximum of four cycles (in this case, however, the minimum number of cycles increases from three to four).
In FIG. 6C, addition 1 is executed before condition “c1” is resolved. The ACV, therefore, is [1,1].
In FIG. 6D, on the other hand, out-of-order scheduling is executed, the two additions of the THEN clause (the two additions in v=x+y+z) are assigned to control steps S1 and S2, and the ACV thereof is [1,1]. On the other hand, since addition 1 (t=b+c;) has been scheduled to control step 3 following the conditional operation “c1”, its ACV is [0,1] and data-dependent mutual exclusiveness can be utilized effectively.
With scheduling using a condition vector, calculation and management of the necessary number of arithmetic units need only be performed using the ACV, without any need to be explicitly aware of control dependency.
With the method of performing code motion in a case where various speculative executions have been performed, an operation referred to as “bookkeeping” is necessary in order to assure consistency of operations. In a case where a condition vector is used, however, code correction (compensation) corresponding to bookkeeping of a compiler is not required. Control in which all execution paths have been optimized systematically can be generated correctly utilizing a scheduled ACV or the results of addition (described later) of ACV when control logic is finally generated.
Scheduling is carried out by comparing the delay time and clock period of an arithmetic unit and taking into consideration whether several sequential operations are allocated to one control step. Accordingly, in a case where a conditional operation and an operation inside a branch have been scheduled in the same cycle, whether the conditional operation has been resolved or is unresolved is decided upon taking delay time of the operation into account.
FIGS. 7A and 7B illustrate the relationship among operation delay, clock period, active condition vector ACV and data path (circuit configuration) synthesized. Behavioral description is illustrated at the top, the corresponding CDFG is illustrated therebelow, and the data path is shown at the bottom.
In FIGS. 7A and 7B, except for the fact that the delays of the conditional operations [(a>b) and (a&&c)] are different, namely 5 ns and 0.5 ns, respectively, the CDFGs are of the same type.
In FIGS. 7A and 7B, it is assumed that the clock period is 7 ns and that the delay of the adder is 5 ns. In the case of FIG. 7A, the arithmetic sum of the delay of the conditional operation a>b and the delay of the addition is 10 ns, which is greater than the clock period of 7 ns. Accordingly, execution cannot be performed sequentially and is performed in parallel. The operation of the conditional a>b therefore is unresolved in a case where the addition operation is executed. Accordingly, the active condition vector ACV of the adder is calculated by the bitwise logical ORed value between its ECV and the ECV [1,1] of the conditional operation and is [1,1].
In the example of FIG. 7B, on the other hand, the sum of the delay of the conditional operation && and the delay of the addition is less than the clock period of 7 ns. Accordingly, the selecting and adding of variables a and c after the condition has been resolved is possible within one clock. [At the time of actual scheduling, scheduling is performed taking the delay of the selector (multiplexer) into consideration. For the sake of simplicity, however, the delay of the selector will not be discussed here.] The active condition vectors ACV of the addition nodes remain [1,0] and [0,1] because the condition a&&c has been resolved.
These data paths (circuit diagrams) are as illustrated at the bottom of FIGS. 7A and 7B. In FIG. 7A, two adders are required. In FIG. 7B, adder sharing is possible and implementation can be performed with one adder. Delay of an operation thus falls within the clock period.
The arrangement of FIG. 7A means that speculative execution in one clock period is executed. The addition operations (a+b and b+c) are speculatively executed and one is selected by a selector (MUX) depending upon the result of the conditional operation. These data paths also can be formed in a simple manner from the active condition vector ACV.
In FIG. 7A also, in a case where the clock period is 13 ns and greater than the sum of the delays of the comparator and adder, it is possible to begin the addition after the comparison operation is resolved. In this case, therefore, the ACVs of the addition operations remain [1,0], [0,1] and implementation is possible with only one adder. However, there are also cases where even when the clock period is large, it is better to utilize two adders and not share arithmetic units.
That is, the ACV only indicates that sharing is possible. Just because an ACV indicates that sharing is possible does not necessarily mean that sharing is necessary.
As mentioned above, the ACV is decided depending upon whether a conditional operation is resolved or unresolved upon taking into consideration the relationship between delays of a conditional operation and operation inside a branch and the clock period, even in one control step.
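By way of illustration only, the decision described above can be sketched in Python as follows. The function name and its parameters are hypothetical. The comparator delay of 5 ns is inferred from the stated total of 10 ns, the 1 ns delay for the && operation is merely an assumed value that satisfies the condition of FIG. 7B, and treating a sum exactly equal to the clock period as resolved is likewise an assumption.

```python
def active_condition_vector(op_ecv, cond_ecv, cond_delay_ns, op_delay_ns, clock_ns):
    """Return the ACV of an operation inside a branch for one control step.

    If the condition and the operation fit sequentially within one clock,
    the condition is resolved and the operation keeps its own vector.
    Otherwise the operation runs in parallel with the unresolved condition
    and its ACV is the bitwise OR of the two vectors.
    """
    resolved = cond_delay_ns + op_delay_ns <= clock_ns  # "<=" is an assumption
    if resolved:
        return list(op_ecv)
    return [a | b for a, b in zip(op_ecv, cond_ecv)]


# FIG. 7A: comparator 5 ns (inferred) + adder 5 ns = 10 ns > 7 ns clock
acv_7a = active_condition_vector([1, 0], [1, 1], 5, 5, 7)
# FIG. 7A with a 13 ns clock: the comparison resolves before the addition
acv_7a_slow_clock = active_condition_vector([1, 0], [1, 1], 5, 5, 13)
# FIG. 7B: && delay of 1 ns is a hypothetical value below 2 ns
acv_7b = active_condition_vector([0, 1], [1, 1], 1, 5, 7)

print(acv_7a, acv_7a_slow_clock, acv_7b)
```

In this sketch a result of [1,1] means that the operation cannot share an arithmetic unit with the operation of the other branch, whereas [1,0] or [0,1] means that sharing remains possible.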
A scheduler is adapted to investigate delay and try various schedulings even in one control step. If an ACV is utilized, however, it is possible for mutual exclusiveness in such a case to be expressed accurately.
This decision regarding conditional mutual exclusiveness can be applied not only to arithmetic operations but also to all other operations, such as array access and signal input/output. Condition vectors are attached to all operation nodes of a CDFG and various processing is executed efficiently.
Described next is a method of calculating the minimum necessary number of arithmetic units at each scheduled control step utilizing condition vectors.
FIGS. 8A and 8B illustrate the number of operations f (function) utilized in a behavioral description in the example of FIG. 3C. For example, “TWO” is written above operation node c. This indicates that two of the operations f exist at this position. The same is true for the other nodes as well. This diagram illustrates that operation f does not exist at nodes not having attached numerals indicating number of operations.
Described next is a technique for computing how many arithmetic units to which the operation f is allocated are necessary in a case where two operations f at node c, one operation f at node d, two operations f at node e, one operation f at node f and three operations f at node g in FIG. 8 are scheduled in a certain control step for which all of the conditions c1, c2 and c3 have been resolved.
It is necessary to obtain the minimum necessary number of arithmetic units taking into consideration the fact that node c is mutually exclusive with nodes d, e, f and g, while node d is not exclusive with nodes e and f. This computation can be performed easily and efficiently using condition vectors.
Specifically, first the vector sum of the condition vectors of all nodes of the certain operation f is obtained. Since the sum vector represents the necessary number of arithmetic units for every leaf condition, the largest component thereof indicates the smallest necessary number of arithmetic units.
The circumstances of computation of the sum vector are illustrated in FIG. 8B. For example, since “TWO” holds for node c, there are two of c:[1,0,0,0] and the vector is [2,0,0,0].
Furthermore, if all of the vectors of c:two, d:one, e:two, f:one, g:three are vector-summed, we obtain [2,3,2,3].
In other words, this indicates that three of the operations f are required by leaf conditions e and g and two of the operations f are required by the leaf conditions c and f. Accordingly, it will be understood that it will suffice if a minimum of three are prepared.
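The computation above can be sketched in Python as follows, purely as an illustration. Only the vector [1,0,0,0] of node c and the sum [2,3,2,3] are stated in the text; the vectors assigned to nodes d, e, f and g are assumptions chosen to be consistent with the stated exclusiveness relations and with the stated sum, and the function name is hypothetical.

```python
def sum_vector_and_min_units(nodes):
    """nodes: list of (number_of_operations, condition_vector) pairs.

    Returns the vector sum and its largest component, i.e. the minimum
    number of arithmetic units needed for the operations in one step.
    """
    width = len(nodes[0][1])
    total = [0] * width
    for count, vector in nodes:
        for k, bit in enumerate(vector):
            total[k] += count * bit
    return total, max(total)


fig8_nodes = [
    (2, [1, 0, 0, 0]),  # node c (vector given in the text)
    (1, [0, 1, 1, 0]),  # node d (assumed: covers the leaves of e and f)
    (2, [0, 1, 0, 0]),  # node e (assumed)
    (1, [0, 0, 1, 0]),  # node f (assumed)
    (3, [0, 0, 0, 1]),  # node g (assumed)
]

sum_vector, units_needed = sum_vector_and_min_units(fig8_nodes)
print(sum_vector, units_needed)
```

Under these assumed vectors the sketch reproduces the sum vector [2,3,2,3] and the minimum of three arithmetic units.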
<FUV>
A vector sum of condition vectors of all operation nodes allocated to a certain arithmetic unit is referred to as an FUV (Function unit Utility condition Vector). The example of FIG. 8 uses the basic condition vector BCV. The method of calculating the necessary number of arithmetic units using the FUV can be applied in exactly the same manner in a case where the active condition vector ACV is used.
Accordingly, even in the case of speculative execution, the necessary number of arithmetic units can be calculated merely by obtaining the FUV in similar fashion.
It should be noted that the FUV is similarly defined and utilized not only by an arithmetic unit but also by a register to which a variable is allocated, a memory to which an array is allocated and an input/output port to which an input/output variable is allocated. By obtaining the maximum values of the components of these FUVs, it is possible to calculate also the necessary numbers of registers, memories and input/output ports.
<FUV Definition>
FUV(f)(i)=ΣACV(f)(i)
Here FUV(f)(i) is the FUV of the operation nodes allocated to an arithmetic unit f in a certain step i, and the sum is taken over ACV(f)(i), the ACVs of those operation nodes that have been scheduled in the certain step i.
<FUVall>
A vector sum of ACVs of all operation nodes of the CDFG scheduled to the certain step i is expressed by FUVall(i).
FUVall is utilized, for example, when the control logic (Finite State Machine) of a scheduled circuit is generated. The details of this will be described in relation to the generation of control logic (see FIGS. 10A and 10B).
The minimum necessary number of arithmetic units is calculated as set forth below.
“necessary number of arithmetic unit f”=maximum value of components of FUV(f)(i)
At each control step, the maximum value of the components of FUV(f) is the maximum value of the necessary number of operations at this control step. If the maximum value is found among all of the control steps, this will be the necessary number of arithmetic units overall.
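As a minimal sketch of this calculation over several control steps, the following Python code accumulates FUV(f)(i) for each step and then takes the overall maximum. The schedule data, the unit names "add" and "sub" and the function names are hypothetical and serve only to illustrate the calculation.

```python
def fuv_per_step(schedule, unit):
    """Vector-sum the ACVs of all nodes allocated to `unit`, step by step.

    schedule: iterable of (control_step, unit_name, acv) tuples.
    Returns {control_step: FUV of `unit` in that step}.
    """
    fuv = {}
    for step, op_unit, acv in schedule:
        if op_unit != unit:
            continue
        current = fuv.setdefault(step, [0] * len(acv))
        for k, value in enumerate(acv):
            current[k] += value
    return fuv


def required_units(schedule, unit):
    """Largest FUV component over all control steps (0 if the unit is unused)."""
    fuv = fuv_per_step(schedule, unit)
    return max((max(vector) for vector in fuv.values()), default=0)


# Hypothetical scheduling result with three leaf conditions
example_schedule = [
    (1, "add", [1, 0, 0]),
    (1, "add", [0, 1, 0]),
    (1, "add", [1, 1, 1]),  # unconditional addition
    (2, "add", [0, 1, 0]),
    (2, "sub", [1, 0, 0]),
]

print(fuv_per_step(example_schedule, "add"))
print(required_units(example_schedule, "add"), required_units(example_schedule, "sub"))
```

In this hypothetical schedule the adder FUV is [2,2,1] in step 1 and [0,1,0] in step 2, so two adders and one subtractor would be required overall.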
<Division of Operating Nodes, Arithmetic-Device Use Vector and Necessary Number of Control Cycles>
Of the various examples of speculative execution illustrated in FIGS. 1A-1E, the speculative executions (out-of-order executions) of FIGS. 1B to 1D can be expressed by an active condition vector ACV.
However, “Duplication-up” at FIG. 1E cannot be implemented by expression of the ACV alone.
Further, “Push-down” at FIG. 1B also can be expressed by the ACV since operation (1) is merely placed inside the ELSE clause in this case. However, if this is inserted by being copied to both the THEN clause and ELSE clause (Duplication-down), expression by the ACV is not possible.
Accordingly, the portions corresponding to Duplication-up and Duplication-down are dealt with by a scheduling technique that utilizes the properties of the condition vector.
An operation for which the number of “1” components of a condition vector is equal to or greater than two can be divided into two or more operation nodes without changing the original operation.
By way of example, the condition vector [1,1,0,1] can be decomposed in the following manner: [1,0,0,0] and [0,1,0,1], or [1,0,0,0], [0,1,0,0] and [0,0,0,1], etc.
This decomposition corresponds directly to Duplication-up and Duplication-down.
For example, a case where an operation node has been divided into nodes in which there is only one “1” component in the vector, such as in [1,0,0,0], corresponds to division into leaf nodes.
Further, if the vector is decomposed into [1,1,0,0], this corresponds to node division to a level higher than a leaf node.
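The node division described above can be sketched in Python as follows. This is an illustration under the assumption that a valid division is a set of vectors whose vector sum equals the original condition vector; the function names are hypothetical.

```python
def split_into_leaves(vector):
    """Divide a condition vector into one vector per "1" component."""
    width = len(vector)
    return [
        [1 if k == i else 0 for k in range(width)]
        for i, bit in enumerate(vector)
        if bit
    ]


def is_valid_division(vector, parts):
    """True if the parts vector-sum back to the original condition vector."""
    total = [sum(column) for column in zip(*parts)]
    return total == list(vector)


original = [1, 1, 0, 1]
leaves = split_into_leaves(original)

print(leaves)
print(is_valid_division(original, [[1, 0, 0, 0], [0, 1, 0, 1]]))  # two nodes
print(is_valid_division(original, [[1, 1, 0, 0], [0, 0, 0, 1]]))  # above leaf level
print(is_valid_division(original, [[1, 1, 0, 0], [0, 1, 0, 1]]))  # overlapping parts
```

Because every component of the original vector is 0 or 1, requiring the parts to sum to the original also guarantees that no two parts cover the same leaf condition, so the overlapping division in the last line is rejected.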
If node division is performed before scheduling, this runs somewhat counter to the principle that an operation node of a CDFG is not subjected to code motion at the time of scheduling. In a scheduling algorithm, however, this node division is performed “virtually”, optimum scheduling is carried out for every “1” component of a condition vector and node division is performed only in a case where it is judged to be advantageous.
Since the divided nodes can be scheduled in different control steps for each “1” component of the condition vector, scheduling conforming to each path is possible and all paths can be sped up.
FIGS. 9A-9C illustrate an example of node division and scheduling utilizing this node division.
FIG. 9C illustrates an example in which an addition (n1), which is the last line s=s+5; of the following behavioral description, is divided among the three branches of the conditional branch above it:
if (c1) x=a+b−c+d;
else if (c2) x=y+z+u−v;
s=s+5; /*n1*/
Since the addition s=s+5; (n1) in FIG. 9A is outside the branch, it is always executed and the condition vector is [1] (where [1] is a vector in which all components are 1).
The addition node (n1) is divided among the three branches of the conditional branch. That is, it is divided into the three nodes [1,0,0], [0,1,0], [0,0,1].
FIG. 9B is the result of scheduling the behavioral description in FIG. 9A under the constraint that there is one adder and one subtractor. Since the one adder is in use in the control steps S1 to S3, the addition (n1) can only be allocated to control step S4.
The reason is that allocating the addition n1, whose ACV is [1,1,1], to any of the control steps S1 to S3 would make the maximum component of FUV+ (arithmetic-device utilization-degree vector) equal to “2” in that control step. It should be noted that the “+” of FUV+ indicates that this is the FUV of the adder.
However, if the operation is divided into three nodes, as illustrated at FIG. 9C, each divided node can be scheduled to any control cycle in which the corresponding component of the arithmetic-device utilization-degree vector FUV+ for the addition is “0” (a condition for which addition is not performed), whereby the maximum component of FUV+ is held at “1” and allocation is performed within control steps S1 to S3. The three divided nodes (the ACVs are [0,0,1], [1,0,0] and [0,1,0], respectively) are allocated to control steps S1 to S3 and the respective FUV+'s are [1,1,1], [1,0,0] and [1,1,0]. It is thus possible to perform scheduling to the three control steps S1, S2 and S3.
That is, in the above branches of the behavioral description, adders are not used under all conditions in all states. Rather, by performing scheduling utilizing a portion that is not used under a certain condition in a certain state, as illustrated at FIG. 9C, the number of execution cycles can be reduced.
Since an FUV indicates the necessary number of arithmetic units for every condition, it is easy to optimize scheduling of every condition as in this example.
FUVall on the right side at FIGS. 9B and 9C is the overall vector sum of ACVs of all operation nodes scheduled to this control step.
For example, at control step S2 in FIG. 9B, there is subtraction at [1,0,0] and addition at [0,1,0]. Accordingly, FUVall(S2) is given by the vector sum [1,0,0]+[0,1,0] and we obtain FUVall(S2)=[1,1,0]. The same is true for the others as well.
Consider a case where a component of FUVall is “0”. In FUVall, the condition vectors of all operations are summed. Therefore, at control step S2, for example, a leaf condition of a “0” component indicates that no operation whatsoever is carried out. That is, with this leaf condition, it is indicated that this control step can be skipped.
Accordingly, a path-length vector, which is a vector obtained by counting, for each component, the number of control steps in which that component of FUVall is non-zero, indicates the necessary number of steps for every leaf condition of the condition vector.
The path-length vector is [4,4,1] in FIG. 9B and [3,3,1] in FIG. 9C.
This indicates that while four cycles, four cycles and one cycle were required for every leaf condition in FIG. 9B, the numbers are reduced to three cycles, three cycles and one cycle, respectively, in FIG. 9C.
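The path-length vector can be computed as in the following Python sketch. Only FUVall(S2)=[1,1,0] of FIG. 9B and the two resulting path-length vectors are stated in the text; the remaining FUVall values below are assumptions chosen solely so that the stated results are reproduced, and the function name is hypothetical.

```python
def path_length_vector(fuvall_by_step):
    """Count, per leaf condition, the control steps with a non-zero FUVall."""
    width = len(fuvall_by_step[0])
    return [
        sum(1 for fuvall in fuvall_by_step if fuvall[k] != 0)
        for k in range(width)
    ]


# FUVall for control steps S1..S4 of FIG. 9B (S2 from the text, others assumed)
fuvall_fig9b = [[1, 1, 0], [1, 1, 0], [1, 1, 0], [1, 1, 1]]
# FUVall for control steps S1..S3 of FIG. 9C (all values assumed)
fuvall_fig9c = [[1, 1, 1], [2, 1, 0], [1, 2, 0]]

path_9b = path_length_vector(fuvall_fig9b)
path_9c = path_length_vector(fuvall_fig9c)
print(path_9b, path_9c)
```

A component of FUVall larger than 1 still counts as a single control step, since the path length depends only on whether any operation is executed under that leaf condition.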
FIGS. 10A and 10B illustrate an example in which a state transition of an FSM (Finite State Machine) of a control circuit is generated from FUVall, which has been obtained from a certain scheduling result, by skipping zero components of FUVall. FIG. 10A is a diagram illustrating FUVall with respect to a scheduling result, and FIG. 10B is a diagram illustrating generated state transitions of an FSM for control.
A control-logic generating algorithm with respect to a single (nested) conditional branch comprises steps 1) to 5) below.
1) Adopt a first control step as a first state.
2) current_step=first state
3) For every leaf condition of the condition vector of current_step, find the next control step in which the FUVall component is other than “0”, and enter it as next_step. This gives the next state of every leaf condition from current_step.
4) current_step=current_step+1;
5) End if current_step is the final step, or transition to step 3) if current_step is not the final step.
In the example of FIGS. 10A and 10B, the initial state is assumed to be S1. Further, the leaf conditions of the condition vectors are denoted c1, c2 and c3 in order.
The next-state transition destinations of initial state S1 (FUVall is [1,2,3]) are S2, S4 and S3 in the order of the bits of the condition vectors.
The transition destination from state S2 is state S3 only.
From step S3 a transition is made to step S4 with conditions c1 and c3. Accordingly, the following state transitions are obtained:
  switch (ST) {
  case S1:
    if (c1) goto S2;
    if (c2) goto S4;
    if (c3) goto S3;
  case S2: goto S3;
  case S3: goto S4;
  case S4: end, go to next statement
  }
Here ST indicates a state register, and goto S2 indicates transition to S2.
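Steps 1) to 5) above can be sketched in Python as follows, as an illustration rather than a definitive implementation. Only FUVall=[1,2,3] of state S1 is stated in the text; the FUVall values assumed for S2 to S4 are chosen so that the state transitions listed above are reproduced, and the function name is hypothetical. Leaf conditions c1, c2 and c3 correspond to indices 0, 1 and 2.

```python
def next_states(fuvall_by_step):
    """Generate FSM transitions by skipping zero components of FUVall.

    fuvall_by_step: FUVall vectors for control steps S1, S2, ... in order.
    Returns {step: {leaf_index: next_step or None}} with 1-based steps;
    None means that no later step is active, i.e. the FSM ends.
    """
    n_steps = len(fuvall_by_step)
    transitions = {}
    for current in range(1, n_steps + 1):
        row = {}
        for leaf, value in enumerate(fuvall_by_step[current - 1]):
            if value == 0:
                continue  # this state is never entered under this leaf condition
            row[leaf] = None
            for later in range(current + 1, n_steps + 1):
                if fuvall_by_step[later - 1][leaf] != 0:
                    row[leaf] = later  # first later step with work to do
                    break
        transitions[current] = row
    return transitions


# S1 is taken from the text; S2, S3 and S4 are assumed values
fuvall_fig10 = [[1, 2, 3], [1, 0, 0], [1, 0, 1], [1, 1, 1]]
fsm = next_states(fuvall_fig10)
for step, row in fsm.items():
    print("S%d:" % step, row)
```

With these assumed values, S1 transitions to S2, S4 and S3 under c1, c2 and c3 respectively, S2 transitions only to S3, S3 transitions to S4 under c1 and c3, and S4 is the final state, in agreement with the state transitions listed above.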
This example indicates a control-logic generating method with respect to a single top-level conditional branch. (As will be described later, the present invention proposes a method of parallelizing multiple top-level conditional branches.)
With regard to FUV and FUVall, refer to the content of Non-Patent Document 1, etc.