1. Field of the Invention
The present invention relates to a high-level synthesis method for automatically synthesizing a digital circuit based on an operating description of an LSI, and a storage medium storing the method.
2. Description of the Related Art
Conventionally, high-level synthesis methods are known as techniques which are particularly effective in designing application-specific integrated circuits (ASIC) and the like where a short period of time is required for designing.
High-level synthesis is a technique for automatically synthesizing a circuit based only on an operating description of a processing algorithm, without information on hardware structure. The high-level synthesis technique is described in detail in “High-level Synthesis” (Kluwer Academic Publishers).
Hereinafter, as an example, a method for automatically synthesizing a circuit by high-level synthesis, based on an operating description represented by the following expression (1), will be described:
x=(a+b)*(b+c)    (1)
A typical high-level synthesis method is performed in accordance with a flowchart shown in FIG. 1. In the high-level synthesis method, control and data flow in executing an operating description are initially analyzed and converted to a model called a control data flow graph (CDFG) (see step S1 in FIG. 1). A CDFG is a graph similar to a flowchart. A CDFG is comprised of nodes and input/output (I/O) branches. The I/O branches indicate the flow of data or control signals. Nodes indicate operations. Inputs and outputs of the operations correspond to I/O branches of the nodes.
For example, the operating description represented by expression (1) is represented by the CDFG shown in FIG. 2. The CDFG of FIG. 2 includes first and second addition nodes 11 and 12, each representing an addition operation, and a multiplication node 13 representing a multiplication operation. In the CDFG of FIG. 2, an input “a” and an input “b” are added together, the input “b” and an input “c” are added together, and the results of both addition operations are multiplied together, the result of the multiplication operation being represented by an output “x”.
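The CDFG described above can be illustrated by the following minimal sketch. The node names (add11, add12, mul13), the dictionary layout, and the evaluate function are assumptions introduced only for illustration; they are not part of the described method.

```python
# Hypothetical encoding of the CDFG of FIG. 2: each node maps to
# (operation, list of I/O branches feeding the node).
CDFG = {
    "add11": ("+", ["a", "b"]),          # first addition node 11
    "add12": ("+", ["b", "c"]),          # second addition node 12
    "mul13": ("*", ["add11", "add12"]),  # multiplication node 13
}

OPERATIONS = {
    "+": lambda p, q: p + q,
    "*": lambda p, q: p * q,
}


def evaluate(cdfg, name, inputs):
    """Evaluate a node by following its I/O branches back to the inputs."""
    if name in inputs:                    # a primary input such as "a"
        return inputs[name]
    op, branches = cdfg[name]
    args = [evaluate(cdfg, branch, inputs) for branch in branches]
    return OPERATIONS[op](*args)


# Output "x" of expression (1) for a=1, b=2, c=3: (1+2)*(2+3) = 15
x = evaluate(CDFG, "mul13", {"a": 1, "b": 2, "c": 3})
```

In this sketch the I/O branches are represented implicitly by the input lists of each node, which is one of several possible encodings.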
After the operating description represented by expression (1) has been converted to the CDFG shown in FIG. 2, scheduling is performed (see step S2 in FIG. 1). Scheduling determines the timings of executing operations corresponding to the nodes of a CDFG, i.e., determines in which clock step each operation corresponding to a node of a CDFG is executed. In this case, taking into account the operating time of each operation, all of the operations scheduled in a given clock step need to be completed within one clock period.
FIG. 3 shows an example of scheduling the CDFG of FIG. 2. In FIG. 3, two addition operations and one multiplication operation are scheduled to be executed within a clock step (step 1). In this case, the scheduling is performed in such a manner that the total of the operating times of such arithmetic executions does not exceed the period of one clock step. For example, when the operating times of an adder and a multiplier are 5 nsec and 60 nsec, respectively, and the clock period is 65 nsec or more, all of the operations can be scheduled within one clock step (step 1) as shown in FIG. 3.
Identical operations which are scheduled in different clock steps can be executed in a single operator. For this reason, as shown in FIG. 5, a first addition node 11 representing a first addition and a second addition node 12 representing a second addition are scheduled to be performed in different steps 1 and 2, respectively, so that both the operations can be executed in a single adder. In this case, if the clock period is 65 nsec or more, the second addition whose operating time is 5 nsec and a multiplication whose operating time is 60 nsec can be scheduled to be performed in a second clock (clock step 2).
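The timing constraint and the operator sharing described above can be checked with a short sketch. The operating times (5 nsec and 60 nsec) and the clock period of 65 nsec are taken from the example above; the node names, the schedule dictionaries, and the helper functions are hypothetical.

```python
DELAY_NSEC = {"+": 5, "*": 60}           # operating times from the example
NODES = {
    "add11": ("+", ["a", "b"]),
    "add12": ("+", ["b", "c"]),
    "mul13": ("*", ["add11", "add12"]),
}


def step_delay(schedule, step):
    """Longest chain of operating times among nodes placed in one clock step."""
    def finish(name):
        # Primary inputs and values produced in another step add no delay here.
        if name not in NODES or schedule[name] != step:
            return 0
        op, inputs = NODES[name]
        return DELAY_NSEC[op] + max(finish(i) for i in inputs)
    return max((finish(n) for n in NODES if schedule[n] == step), default=0)


def fits(schedule, clock_nsec):
    """True when every clock step completes within the clock period."""
    return all(step_delay(schedule, s) <= clock_nsec
               for s in set(schedule.values()))


def operators_needed(schedule, op):
    """Identical operations in different steps can share one operator."""
    steps = [schedule[n] for n, (o, _) in NODES.items() if o == op]
    return max((steps.count(s) for s in set(steps)), default=0)


fig3 = {"add11": 1, "add12": 1, "mul13": 1}   # schedule of FIG. 3
fig5 = {"add11": 1, "add12": 2, "mul13": 2}   # schedule of FIG. 5
```

Under these assumptions both schedules fit a 65 nsec clock period, while the schedule of FIG. 3 needs two adders and the schedule of FIG. 5 needs only one.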
After an operating description has been scheduled, allocation is performed (see step S3 in FIG. 1). Allocation generates operators, registers and the like required for executing a scheduled CDFG. For example, in an allocation, operators are allocated as operations of a CDFG, and registers, selectors and the like are allocated as I/O branches across borders between adjacent clock steps. By allocation, a circuit for executing the operating description represented by expression (1) is generated (see step S4 in FIG. 1).
FIG. 4 shows a circuit generated by allocation from the scheduling result of FIG. 3. In FIG. 4, first and second adders 14 and 15 corresponding to the first and second addition operations, respectively, are generated and allocated as the first and second addition nodes 11 and 12, and a multiplier 16 corresponding to the multiplication operation is generated and allocated as the multiplication node 13. In this case, no register or selector is generated since there is no branch across a border between adjacent clock steps.
FIG. 6 shows a circuit generated by allocation from the scheduling result of FIG. 5. As shown in FIG. 5, the first and second addition nodes 11 and 12 corresponding to the first and second addition operations are scheduled to be performed in different steps 1 and 2. Therefore, only one adder 14 is generated for the addition nodes 11 and 12. There are I/O branches 21 and 22 (FIG. 5) across a border between clock steps 1 and 2. Therefore, a selector 23 (indicated by “sel” in FIG. 6) and a register 24 (indicated by “reg” in FIG. 6) are generated for the I/O branches 21 and 22, respectively.
In the circuit shown in FIG. 6, inputs “a” and “c” are supplied to the selector 23, and the output of the selector 23 is input to the adder 14. Input “b” is also supplied to the adder 14. The output of the adder 14 is input to the register 24 and the multiplier 16. In this case, a controller 25 (indicated by “Controller” in FIG. 6) which produces control signals for each of the generated selector 23 and register 24 is generated. The controller 25 outputs a select signal k1 to the selector 23. The selector 23 outputs input “a” when the select signal k1 is “1”, or outputs input “c” when the select signal k1 is “0”. The controller 25 also outputs an enable signal k2 to the register 24. The register 24 stores an input value at the rising of a clock signal when the enable signal k2 is “1”, or maintains a stored value and outputs the stored value when the enable signal k2 is “0”.
The operation of the circuit shown in FIG. 6 will be described. In clock step 1, the select signal k1 and the enable signal k2 are both “1”, and the selector 23 selects input “a” and outputs it to the adder 14. Then, the adder 14 outputs a value “a+b”, resulting from an addition of “a” and “b”, to the register 24. The register 24 stores “a+b” output from the adder 14 since the enable signal k2 is “1”.
In clock step 2, the select signal k1 and the enable signal k2 are both “0”, and the selector 23 selects the input value “c” and outputs it to the adder 14. Then, the adder 14 outputs a value “c+b”, resulting from an addition of “c” and “b”, to the multiplier 16. The register 24 maintains the stored value “a+b” and outputs it to the multiplier 16 since the enable signal k2 is “0”. The multiplier 16 multiplies the input value “c+b” by the input value “a+b”, and outputs the resultant product as “x”.
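The two-step operation of the circuit of FIG. 6 can be traced with the following sketch. The function name run_fig6 is hypothetical, and the register is modeled, as an assumption, as being updated at the clock edge that ends a clock step in which the enable signal k2 is 1.

```python
def run_fig6(a, b, c):
    """Trace the circuit of FIG. 6 over clock steps 1 and 2."""
    reg24 = 0                             # register 24 (initial value assumed)
    x = None
    # (k1, k2) issued by the controller 25 in clock steps 1 and 2.
    for k1, k2 in [(1, 1), (0, 0)]:
        sel23 = a if k1 == 1 else c       # selector 23
        add14 = sel23 + b                 # adder 14
        x = add14 * reg24                 # multiplier 16: adder output times register output
        if k2 == 1:
            reg24 = add14                 # stored at the clock edge ending the step
    return x                              # valid at the end of clock step 2


result = run_fig6(1, 2, 3)                # (1+2)*(3+2) = 15
```

The multiplier output during clock step 1 is not meaningful in this model; only the value produced in clock step 2 corresponds to output “x” of expression (1).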
In the circuit shown in FIG. 4, the operating description is completed in one clock step (for example, within 100 nsec), but two adders 14 and 15 are required. In contrast, in the circuit shown in FIG. 6, two clock steps (for example, 200 nsec) are required, but only one adder 14 is used. In such a high-level synthesis method, the circuit of FIG. 4 is generated when a high-speed circuit is required, while the circuit of FIG. 6 is generated when a circuit having a small area is required.
As is understood from the scheduling result shown in FIG. 5, two addition operations can be performed using a single adder. However, when the number of operations is increased, a procedure for determining which operations are executed in a single operator is required. This procedure is allocation. Scheduling and allocation are well known in the high-level synthesis field (see the above-mentioned “High-level Synthesis” (Kluwer Academic Publishers)).
Next, a description is given of the case where an operating description includes a conditional branch. High-level synthesis of an operating description including conditional branching requires generation of a control signal for controlling the conditional branching. The method is otherwise basically the same as the method shown in FIG. 1 for an operating description which does not include conditional branching. The method including conditional branching is, for example, disclosed in Japanese Laid-Open Publication No. 11-250112.
Hereinafter, a method for high-level synthesizing an operating description including conditional branching will be described with reference to FIG. 7. In an operating description shown in FIG. 7, in the case of the if-condition where “d>e”, x=a−b, while in the case of the else-condition, x=a*(b−c). FIG. 8 shows a CDFG corresponding to such an operating description and a result of scheduling the CDFG. As shown in FIG. 8, the CDFG includes an IF node 32 as conditional branching.
Referring to FIG. 8, a comparison node 31 representing a comparison operation is scheduled in clock step 1, and the IF node 32 is scheduled in clock step 2. The comparison node 31 receives I/O branches from inputs “d” and “e”, and is also connected with a control I/O branch 33 for transmitting a control signal k. The control I/O branch 33 is indicated by a dashed line in FIG. 8.
The IF node 32 scheduled in clock step 2 includes first and second sub-CDFGs 34 and 35. The first sub-CDFG 34 can include an IF node. The first sub-CDFG 34 is executed when the control signal k is “1” (true). The second sub-CDFG 35 is executed when the control signal k is “0” (false). Thus, the comparison node 31 and the IF node 32 require a controller, so that the nodes 31 and 32 are scheduled in different clock steps.
A first subtraction node 36 receiving inputs “a” and “b” is provided in the sub-CDFG 34 indicated by “true”. A second subtraction node 37 and a multiplication node 38 are provided in the sub-CDFG 35 indicated by “false”. The second subtraction node 37 receives I/O branches from inputs “b” and “c”. The multiplication node 38 receives I/O branches from the input “a” and the second subtraction node 37. Since the sub-CDFGs 34 and 35 are not simultaneously executed, the same single subtracter 45 (FIG. 9) can be allocated as each of the first and second subtraction nodes 36 and 37 provided in the sub-CDFGs 34 and 35, respectively.
FIG. 9 shows a circuit for executing the scheduling result of FIG. 8. In the circuit of FIG. 9, a comparator 41 is allocated as the comparison node 31, a first selector 43 is allocated for selecting inputs “a” and “b”, and a second selector 44 is allocated for selecting inputs “b” and “c”. Further, the single subtracter 45 is allocated as each of the first and second subtraction nodes 36 and 37. A multiplier 46 is allocated as the multiplication node 38. A third selector 47 is allocated for selecting the outputs of the subtracter 45 and the multiplier 46. Each of the first through third selectors 43, 44, and 47 is controlled by a control signal k from a controller 42. When the value of the control signal k is “1”, the first selector 43 selects the input “a”, the second selector 44 selects the input “b”, and the third selector 47 selects the output of the subtracter 45.
Next, the operation of such a circuit will be described. In clock step 1, the input values “d” and “e” are compared by the comparator 41, and the result of the comparison is output to the controller 42. The controller 42 generates a control signal k based on the comparison result input from the comparator 41 and outputs the control signal k in the next clock step 2. The control signal k is “1” (true) when the comparison result of the comparator 41 is “d>e”, and otherwise is “0” (false). In clock step 2, when the control signal k is “1”, the first and second selectors 43 and 44 select inputs “a” and “b”, respectively, the subtracter 45 executes the subtraction “a−b”, and the third selector 47 selects the output of the subtracter 45 and outputs the result of the subtraction “a−b” as output “x”.
When the value of the control signal k is “0”, the first and second selectors 43 and 44 select inputs “b” and “c”, respectively, the subtracter 45 executes the subtraction “b−c”, and the subtraction result is output to the multiplier 46, which in turn multiplies the subtraction result of the subtracter 45 by the input value “a” and outputs the multiplication result to the third selector 47. The third selector 47 selects the output of the multiplier 46 and outputs the result of the multiplication “a*(b−c)” as output “x”.
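The behavior of the circuit of FIG. 9 in both branches can be summarized by the following sketch. The function name run_fig9 and the variable names are hypothetical and simply mirror the reference numerals used above.

```python
def run_fig9(a, b, c, d, e):
    """Trace the circuit of FIG. 9 over clock steps 1 and 2."""
    # Clock step 1: comparator 41 compares d and e; controller 42 latches k.
    k = 1 if d > e else 0

    # Clock step 2: k drives the first through third selectors.
    sel43 = a if k == 1 else b            # first selector 43
    sel44 = b if k == 1 else c            # second selector 44
    sub45 = sel43 - sel44                 # single shared subtracter 45
    mul46 = a * sub45                     # multiplier 46 (used when k is 0)
    return sub45 if k == 1 else mul46     # third selector 47 drives output x


x_true = run_fig9(5, 3, 1, d=2, e=1)      # d>e, so x = a-b = 2
x_false = run_fig9(5, 3, 1, d=1, e=2)     # otherwise x = a*(b-c) = 10
```

The sketch makes the cost visible: one shared subtracter is obtained at the price of three selectors and an extra clock step for generating k.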
Scheduling and allocation in a high-level synthesis method are very complicated. When an operating description includes a number of operations, the number of nodes in a corresponding CDFG is large, so that computation of the high-level synthesis requires a significantly long time.
For a high-level synthesis, a total operating time is calculated based on the operating times of operators corresponding to the respective nodes in a CDFG. For example, referring to FIG. 10, first and second “NOT” nodes 51 and 52 each having an operating time of 2 nsec and an “AND” node 53 having an operating time of 8 nsec are scheduled within a clock period of 10 nsec. As a result of scheduling, a circuit is generated as shown in FIG. 11A. In FIG. 11A, “NOT” operators 54 and 55 are allocated as the “NOT” nodes 51 and 52, respectively, and an “AND” operator 56 is allocated as the “AND” node 53.
The circuit shown in FIG. 11A can be optimized into a circuit shown in FIG. 11B, since the pair of “NOT” operators 54 and 55 and the one “AND” operator 56 are logically equivalent to a single “NOR” operator 57. As a result, the total operating time is reduced to the operating time of the single “NOR” operator 57, which is 6 nsec, for example, and which thus differs from the operating time (10 nsec) estimated in the high-level synthesis. That is, the estimated operating time is longer than necessary. This leads to an increase in the number of clock steps in a scheduled CDFG, which slows the operation of a resulting circuit.
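The logical equivalence and the resulting overestimate can be confirmed with a short sketch. The operating times are the example values given above (2 nsec, 8 nsec, and 6 nsec); the function and variable names are hypothetical.

```python
from itertools import product


def fig11a(p, q):
    """Two NOT operators feeding one AND operator (FIG. 11A)."""
    return (not p) and (not q)


def fig11b(p, q):
    """A single NOR operator (FIG. 11B)."""
    return not (p or q)


# Exhaustive check over all four input combinations.
equivalent = all(fig11a(p, q) == fig11b(p, q)
                 for p, q in product([False, True], repeat=2))

# The two NOT operators work in parallel, so the estimate is NOT + AND.
estimated_nsec = 2 + 8                    # estimate used during scheduling
optimized_nsec = 6                        # single NOR operator (example value)
overestimate_nsec = estimated_nsec - optimized_nsec
```

With these example values the scheduler budgets 4 nsec more than the optimized logic actually needs.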
Moreover, in a high-level synthesis, a selector is sometimes inserted before an operator such as an adder, as shown in FIG. 6 where the selector 23 is provided before the adder 14, for example. In this case, the operating time is increased by the delay of the selector, so that the circuit may fail to operate normally.
When an operating description including conditional branching as shown in FIG. 7 is subjected to high-level synthesis, if the number of I/O branches of the conditional branching node is large, a synthesized circuit includes a number of selectors. Thus, there is a problem that the scale of the circuit is enlarged.
Furthermore, when a control signal for conditional branching is generated by the controller 42 as shown in FIG. 9, the controller 42 typically operates in synchronization with a clock. Therefore, a condition needs to be judged in a clock step before executing one of the conditional branches. Thus, a clock step in which the control signal for the conditional branching is generated is separated from a clock step in which one of the conditional branches is executed. In this case, the circuit operation is likely to be slow.
According to one aspect of the present invention, a high-level synthesis method comprises the steps of: converting an operating description describing one or more operations to a control data flow graph (CDFG) including one or more nodes representing the one or more operations and one or more I/O branches representing a flow of data; scheduling the CDFG obtained by the converting step; and allocating one or more logic circuits required for executing the CDFG obtained by the scheduling step. A portion of the CDFG in the converting step is subjected to logical synthesis in advance to generate a node, and the portion of the CDFG is replaced with that node.
In one embodiment of this invention, the portion of the CDFG corresponds to conditional branching.
In one embodiment of this invention, the conditional branching includes a branch to be executed when a condition of the conditional branching is satisfied, and a branch to be executed when the condition of the conditional branching is not satisfied, and the branches are processed in a same clock period.
In one embodiment of this invention, a specific node originally included in the portion of the CDFG is placed outside the portion of the CDFG.
In one embodiment of this invention, a specific node originally included in the portion of the CDFG is placed outside the portion of the CDFG by dividing the portion of the CDFG.
In one embodiment of this invention, when a plurality of identical operators are allocated in the portion of the CDFG in the allocating step and when an area of one of the plurality of identical operators is as large as or larger than a constant times an area of a selector, nodes corresponding to the plurality of identical operators are placed outside the portion of the CDFG.
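The area condition of this embodiment can be expressed as a simple predicate. The following is a minimal sketch only: the function name, the parameter names, and the numeric values in the usage lines are hypothetical and are not specified in this description.

```python
def place_outside_portion(operator_count, operator_area, selector_area, constant):
    """Return True when nodes of identical operators are to be placed outside.

    The condition holds when a plurality of identical operators is allocated
    in the portion of the CDFG and the area of one such operator is as large
    as or larger than a constant times the area of a selector.
    """
    return operator_count >= 2 and operator_area >= constant * selector_area


# Hypothetical areas in arbitrary units, with an assumed constant of 2.
large_operator = place_outside_portion(2, operator_area=40.0,
                                       selector_area=1.5, constant=2.0)
small_operator = place_outside_portion(2, operator_area=2.0,
                                       selector_area=1.5, constant=2.0)
```

Under these assumed values the large operator satisfies the condition and the small operator does not.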
According to another aspect of the present invention, a storage medium stores a program including a high-level synthesis method comprising the steps of: converting an operating description describing one or more operations to a control data flow graph (CDFG) including one or more nodes representing the one or more operations and one or more I/O branches representing a flow of data; scheduling the CDFG obtained by the converting step; and allocating one or more logic circuits required for executing the CDFG obtained by the scheduling step. A portion of the CDFG in the converting step is subjected to logical synthesis in advance to generate a node, and the portion of the CDFG is replaced with that node.
Thus, the invention described herein makes possible the advantages of providing a high-level synthesis method in which an operating time required is short, the accuracy of estimation of the operating time is high, the operating speed of a high-level synthesized circuit is high, and the size of the high-level synthesized circuit is small.
These and other advantages of the present invention will become apparent to those skilled in the art upon reading and understanding the following detailed description with reference to the accompanying figures.