1. Field of the Invention
The present invention generally relates to electronic design automation (EDA). More particularly, the present invention relates to dynamically changing the evaluation period to accelerate design debug sessions.
2. Description of Related Art
In general, electronic design automation (EDA) is a computer-based tool configured in various workstations to provide designers with automated or semi-automated tools for designing and verifying a user's custom circuit designs. EDA is generally used for creating, analyzing, and editing any electronic design for the purpose of simulation, emulation, prototyping, execution, or computing. EDA technology can also be used to develop systems (i.e., target systems) which will use the user-designed subsystem or component. The end result of EDA is a modified and enhanced design, typically in the form of discrete integrated circuits or printed circuit boards, that is an improvement over the original design while maintaining the spirit of the original design.
The value of software simulating a circuit design followed by hardware emulation is recognized in various industries that use and benefit from EDA technology. Nevertheless, current software simulation and hardware emulation/acceleration are cumbersome for the user because of the separate and independent nature of these processes. For example, the user may want to simulate or debug the circuit design using software simulation for part of the time, use those results and accelerate the simulation process using hardware models during other times, inspect various register and combinational logic values inside the circuit at select times, and return to software simulation at a later time, all in one debug/test session. Furthermore, as internal register and combinational logic values change as the simulation time advances, the user should be able to monitor these changes even if the changes are occurring in the hardware model during the hardware acceleration/emulation process.
Co-simulation arose out of a need to address some problems with the cumbersome nature of using two separate and independent processes of pure software simulation and pure hardware emulation/acceleration, and to make the overall system more user-friendly. However, co-simulators still have a number of drawbacks: (1) co-simulation systems require manual partitioning, (2) co-simulation uses two loosely coupled engines, (3) co-simulation speed is as slow as software simulation speed, and (4) co-simulation systems encounter race conditions.
First, partitioning between software and hardware is done manually, instead of automatically, further burdening the user. In essence, co-simulation requires the user to partition the design (starting with behavior level, then RTL, and then gate level) and to test the models themselves among the software and hardware at very large functional blocks. Such a constraint requires some degree of sophistication by the user.
Second, co-simulation systems utilize two loosely coupled and independent engines, which raise inter-engine synchronization, coordination, and flexibility issues. Co-simulation requires synchronization of two different verification engines: software simulation and hardware emulation. Even though the software simulator side is coupled to the hardware accelerator side, only external pin-out data is available for inspection and loading. Values inside the modeled circuit at the register and combinational logic level are not available for easy inspection and downloading from one side to the other, limiting the utility of these co-simulator systems. Typically, the user may have to re-simulate the whole design if the user switches from software simulation to hardware acceleration and back. Thus, if the user wanted to switch between software simulation and hardware emulation/acceleration during a single debug session while being able to inspect register and combinational logic values, co-simulator systems do not provide this capability.
Third, co-simulation speed is as slow as simulation speed. Co-simulation requires synchronization of two different verification engines: software simulation and hardware emulation. Each of the engines has its own control mechanism for driving the simulation or emulation. This implies that the synchronization between the software and hardware pushes the overall performance to a speed that is as low as software simulation. The additional overhead to coordinate the operation of these two engines adds to the slow speed of co-simulation systems.
Fourth, co-simulation systems encounter set-up, hold time, and clock glitch problems due to race conditions among clock signals. Co-simulators use hardware driven clocks, which may find themselves at the inputs to different logic elements at different times due to different wire line lengths. This raises the uncertainty level of evaluation results as some logic elements evaluate data at some time period and other logic elements evaluate data at different time periods, when these logic elements should be evaluating the data together.
Another problem encountered by a typical designer is the relatively slow speed of logic evaluators. The typical logic evaluator has a common execution flow involving:
(1) taking the input signals, both clock and data,
(2) evaluating the design logic until all output signals stabilize, and
(3) returning to step 1 and repeating the process.
The amount of time needed in step 2 (evaluation step) determines the speed of the logic evaluator; that is, the shorter the evaluation time, the faster the logic evaluator. Several factors determine the evaluation time. These factors include the interconnect technology between the FPGA logic devices and chips, the speed of the FPGA components, and the logic evaluation method. So, if faster FPGA components are used, the evaluation time should generally decrease.
Based on these factors, current logic evaluators utilize a fixed and statically calculated evaluation time for all possible input signals. This evaluation time may vary from one logic evaluator to another based on the factors mentioned above. So, a logic evaluator designed and manufactured by one company may be faster than a logic evaluator designed and manufactured by another company. However, within a logic evaluator, the evaluation time is fixed. Thus, having selected the interconnect technology, the FPGA components, and the logic evaluation method, the designer of the logic evaluator would calculate a constant time that would be needed to evaluate the inputs to this logic evaluator. For example, the designer may have to determine the longest trace length or circuit path from input to output to determine the longest evaluation time for this logic evaluator. By compensating for the longest possible circuit path, the designer has ensured that the calculated evaluation time is sufficiently long for all of the possible inputs to be evaluated to a stable output. This constant and statically calculated evaluation time raises two problems: performance and static loops.
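The static calculation described above can be pictured as a longest-path computation over the mapped design. The following is a minimal sketch under stated assumptions: the design is modeled as a directed acyclic graph, each node carries a single hypothetical delay value, and the names `worst_case_delay`, `EXAMPLE_DELAYS`, and `EXAMPLE_FANIN` are illustrative only and do not come from any particular logic evaluator.

```python
import graphlib  # standard library, Python 3.9+


def worst_case_delay(delays, fanin):
    """Return the longest input-to-output delay of an acyclic design.

    delays: maps each node name to its own delay (hypothetical units).
    fanin:  maps a node name to the list of nodes that drive it.
    """
    # Visit nodes so that every driver is processed before the node it drives.
    graph = {node: fanin.get(node, ()) for node in delays}
    order = graphlib.TopologicalSorter(graph).static_order()

    arrival = {}
    for node in order:
        # A node settles only after its slowest driver has settled.
        latest_input = max((arrival[p] for p in fanin.get(node, ())), default=0)
        arrival[node] = latest_input + delays[node]
    return max(arrival.values())


# Hypothetical four-gate example: two parallel paths that reconverge at g3.
EXAMPLE_DELAYS = {"in": 0, "g1": 2, "g2": 3, "g3": 1, "out": 0}
EXAMPLE_FANIN = {"g1": ["in"], "g2": ["in"], "g3": ["g1", "g2"], "out": ["g3"]}

if __name__ == "__main__":
    print(worst_case_delay(EXAMPLE_DELAYS, EXAMPLE_FANIN))
```

In this sketch the fixed evaluation time would be set to the returned value for every input, regardless of which path a given input actually exercises. A combinational feedback loop in `fanin` causes `graphlib` to raise `CycleError`, which illustrates why a purely static analysis struggles with designs that contain loops.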
With respect to performance, the logic evaluator must be designed with an evaluation time that is long enough to handle the worst possible evaluation time needed for the inputs to be processed and stabilize at the output. So, for example, the longest trace length or circuit path must be considered in calculating the worst possible evaluation time. However, this approach is inefficient and sacrifices performance. Some internal studies have been done on a large number of ASIC designs and indicate that this statically calculated evaluation time is indeed inefficient and unnecessary.
For most input sequences to a given design, a very small percentage (about 1%) of the inputs requires the worst possible evaluation time. So, essentially 99% of all inputs are subject to longer-than-necessary evaluation times. Indeed, a large percentage (about 80%) of all the inputs requires less than 1/100 of the worst possible evaluation time. Similarly, a significant percentage (about 20%) of all the inputs requires between 1/100 and 1/10 of the worst possible evaluation time. By designing the evaluation cycle for the worst possible time, the logic evaluator is forced to execute at the slowest possible speed, which 99% of its inputs do not warrant. This is highly inefficient.
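To make the cost of a fixed worst-case period concrete, the following minimal sketch computes an upper bound on the average evaluation time if each input were given only the time it needs. The bucket shares are an assumption: the figures quoted above (about 1%, 80%, and 20%) are approximate and sum to slightly more than 100%, so the middle bucket is taken as 19% here purely for illustration.

```python
# Illustrative arithmetic only; the bucket shares are assumed, not measured.
# Each entry: (share of inputs, upper bound on the required time expressed
# as a fraction of the worst-case evaluation time).
BUCKETS = [
    (0.80, 1 / 100),  # about 80% of inputs need less than 1/100 of worst case
    (0.19, 1 / 10),   # assumed 19% need between 1/100 and 1/10 of worst case
    (0.01, 1.0),      # about 1% of inputs need the full worst-case time
]


def average_dynamic_time(buckets):
    """Upper bound on the mean evaluation time, in units of the worst case."""
    return sum(share * bound for share, bound in buckets)


def speedup_over_static(buckets):
    """A static evaluator always spends 1.0 (the worst case) per input."""
    return 1.0 / average_dynamic_time(buckets)


if __name__ == "__main__":
    print(f"average dynamic time: {average_dynamic_time(BUCKETS):.3f}")
    print(f"speedup over static:  {speedup_over_static(BUCKETS):.1f}x")
```

Under these assumed shares the bound works out to roughly 0.037 of the worst-case time, or a speedup of roughly 27 times over a fixed worst-case period. Because each bucket is charged its upper bound, the true average under the same shares could only be lower.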
On a related matter, the worst possible evaluation time is difficult to calculate when static loops exist. As mentioned above, the worst possible evaluation time is typically calculated by statically analyzing the design and determining the worst possible propagation delay after the design is mapped to the logic evaluator. In many cases, a design can have many static combinational feedback loops. Generally speaking, the worst propagation time is exponential in the nesting level of the loops. This not only makes the delay calculation difficult, but the calculated worst possible delay is too long to be practical for either simulation acceleration or emulation applications. On the other hand, for most practical designs, the static feedback loops are just false paths that cannot be resolved at compile time and do not exist at run time.
Accordingly, a need exists in the industry for a system or method that addresses problems raised above by currently known simulation systems, hardware emulation systems, hardware accelerators, co-simulation, and coverification systems.
One embodiment of the present invention provides a dynamic logic evaluation system and method which dynamically calculates the minimum evaluation time for each input. Thus, this system and method will remove the performance burden that a fixed and statically calculated evaluation time would introduce. By dynamically calculating different evaluation times based on the input, the overall evaluation time is shortened by 10 to 100 times compared to the current statically calculated constant evaluation time techniques. In addition, the static loop problem will no longer be an issue.
In accordance with one embodiment of the present invention, the dynamic logic evaluation system and method comprises a global control unit coupled to a propagation detector, where the propagation detector is placed in each FPGA chip. The propagation detector in the FPGA chip alerts the global control unit of any input data that is currently propagating within the FPGA chips. A master clock controls the operation of this dynamic evaluation system and method. As long as any input data is propagating, the global control unit will prevent the next input from being provided to the FPGA chips for evaluation. In effect, so long as the output has not stabilized with the given input, the next set of inputs will not be processed. Once the output has stabilized, the global control unit will then instruct the system to accept and process the next set of input data.
Thus, the global control unit in conjunction with the propagation detectors can dynamically provide varying evaluation time periods based on the needs of the input data. Whether the system needs longer or shorter evaluation times, the system will dynamically adjust the amount of time necessary to properly process that input and then move on to the next evaluation time for the next set of inputs. The sooner the signals stabilize, the faster the logic evaluation process. For the 1% case where the input requires the worst possible evaluation time, the global control unit will delay the expiration of the evaluation time until the output has stabilized.
The global control unit includes a global propagation delay register (PDR) and a global propagation delay counter (PDC). The PDR contains the value of a particular number of cycles. This number can range from 1 to 10; however, other values beyond 10 are also possible. The PDC is a down counter. The PDC counts down at every master clock cycle from whatever value is in the counter. The PDC normally gets the counter value from the PDR. When the down counter PDC reaches 0, the signal to process the next input is triggered. However, until this down counter PDC reaches 0, the next set of inputs will not be processed.
The propagation detector (PD) tells the global control unit when the system still contains data that has not stabilized yet; in other words, the input data is still being evaluated and the output has not stabilized yet. When the PD informs the global control unit that data is still propagating in the circuit design, the global control unit will load the value in the PDR into the down counter PDC.
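The interaction between the PDR, the PDC, and the propagation detectors can be sketched in software as follows. This is a behavioral illustration only, not the hardware of any embodiment: the class and function names are hypothetical, the outputs of all propagation detectors are assumed to be combined into a single boolean per master clock cycle, and the PDC is assumed to be loaded from the PDR when a new input is applied.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class GlobalControlUnit:
    pdr: int      # propagation delay register: reload value, in master clock cycles
    pdc: int = 0  # propagation delay counter: a down counter

    def start_evaluation(self) -> None:
        """Assumed behavior: load the PDC from the PDR when an input is applied."""
        self.pdc = self.pdr

    def tick(self, propagating: bool) -> bool:
        """Advance one master clock cycle.

        Returns True when the PDC has reached 0, i.e. the next set of
        inputs may be accepted.
        """
        if propagating:
            # A detector still sees activity: reload the PDC from the PDR.
            self.pdc = self.pdr
            return False
        if self.pdc > 0:
            self.pdc -= 1
        return self.pdc == 0


def cycles_to_evaluate(pdr: int, activity: Sequence[bool]) -> int:
    """Count master clock cycles until the next input may be accepted.

    activity[i] is True if any propagation detector reports activity on
    cycle i; cycles past the end of the sequence are treated as quiet.
    """
    gcu = GlobalControlUnit(pdr)
    gcu.start_evaluation()
    cycle = 0
    while True:
        propagating = activity[cycle] if cycle < len(activity) else False
        cycle += 1
        if gcu.tick(propagating):
            return cycle


if __name__ == "__main__":
    print(cycles_to_evaluate(3, []))                   # input settles immediately
    print(cycles_to_evaluate(3, [True, True, False]))  # two cycles of activity first
```

In this sketch an input that causes no detected propagation finishes after `pdr` quiet cycles, while each cycle of detected activity postpones completion by reloading the counter, so the evaluation period stretches only as far as a given input requires.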
These and other embodiments are fully discussed and illustrated in the following sections of the specification.