With innovative technologies, ASIC manufacturers are able to deliver system complexities of up to 15 million gates on a single chip. Complete systems comprising cores, memories, and random logic are integrated on a single piece of silicon. Designers design and verify a complete system for sign-off by submitting complex system models to electronic design automation (EDA), verification, and floor-planning tools.
FIG. 1 is a block diagram illustrating a conventional ASIC design flow. The design flow includes a front-end design process that creates a logical design for the ASIC, and a back-end design process that creates a physical design for the ASIC. The front-end design process begins with providing a design entry 10 for an electronic circuit that is used to generate a high-level electronic circuit description, which is typically written in a Hardware Description Language (HDL) 12. Although many proprietary HDLs have been developed, Verilog HDL and VHDL are the major standards.
The design includes a list of interconnections that need to be made between the cells of the circuit, but physical properties for the interconnects have yet to be determined. Therefore, the designer needs an estimation of physical properties to help determine timing within the circuit. Interconnect data from previous designs is used to generate interconnect statistical data to use as the estimation in step 14. The interconnect statistical data is used to create a wire load model 16, which defines the resistance, capacitance, and the area of all nets in the design. The statistically generated wire load model 16 is used to estimate the wire lengths in the design and define how net delays are computed.
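As an illustration only, a wire load model of this kind can be sketched as a lookup from fanout to estimated wire length, combined with per-unit-length resistance and capacitance. The fanout-based table, the class and field names, the lumped RC delay formula, and every numeric value below are assumptions chosen for the sketch; they are not taken from the flow of FIG. 1.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WireLoadModel:
    """Hypothetical fanout-based wire load model (illustrative values only)."""
    lengths_um: Tuple[float, ...]  # lengths_um[i] = estimated length for fanout i + 1
    slope_um: float                # extra length per fanout beyond the table
    res_ohm_per_um: float          # wire resistance per micron
    cap_ff_per_um: float           # wire capacitance per micron, in fF

    def length_um(self, fanout: int) -> float:
        """Estimate wire length from fanout, extrapolating past the table."""
        if fanout < 1:
            raise ValueError("fanout must be at least 1")
        if fanout <= len(self.lengths_um):
            return self.lengths_um[fanout - 1]
        extra = fanout - len(self.lengths_um)
        return self.lengths_um[-1] + self.slope_um * extra

    def net_rc(self, fanout: int) -> Tuple[float, float]:
        """Return the estimated (resistance in ohms, capacitance in fF) of a net."""
        length = self.length_um(fanout)
        return length * self.res_ohm_per_um, length * self.cap_ff_per_um

    def net_delay_ps(self, fanout: int, load_ff: float) -> float:
        """Lumped RC estimate: R * (C_wire / 2 + C_load), converted to ps."""
        res, cap = self.net_rc(fanout)
        delay_fs = res * (cap / 2.0 + load_ff)  # ohm * fF = femtoseconds
        return delay_fs / 1000.0


# Assumed example model: three table entries, then linear extrapolation.
wlm = WireLoadModel(lengths_um=(10.0, 20.0, 30.0), slope_um=5.0,
                    res_ohm_per_um=0.5, cap_ff_per_um=0.25)
```

A synthesis tool would consult such a model once per net, which is why the quality of the statistics behind the table directly affects the timing estimates.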
The HDL 12 and the wire load model 16 are then input into a logic synthesis tool 18 to generate a list of logic gates and their interconnections, called a “netlist” 20. It is important to use wire load models 16 when synthesizing a design; otherwise, the timing information generated from synthesis will be optimistic because net delays are absent. The timing information will also be inaccurate when a poor wire load model 16 is used.
Next, system partitioning is performed in step 22 in which the physical design is partitioned to define groupings of cells small enough to be timed accurately with wire load models 16 (local nets). The resulting design typically includes many cells with many interconnect paths. A prelayout simulation is then performed in step 24 with successive refinement to the design entry 10 and to logic synthesis 18 to determine if the design functions properly.
After prelayout simulation 24 is satisfactory, the back-end design process begins with floor planning in step 26 in which the blocks of the netlist 20 are arranged on the chip. The locations of the cells in the blocks are then determined during a placement process in step 28. A routing process makes connections between cells and blocks in step 30. Thereafter, circuit extraction determines the resistance and capacitance of the interconnects in step 32.
After circuit extraction, a parasitic extraction process is performed in which a delay prediction application calculates the net delays across the entire design in step 33. The net delays are then input to a post-layout simulation in step 34, with successive refinement to floor planning 26 as necessary.
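To make the net delay calculation concrete, the following sketch computes delays over an extracted RC tree using the Elmore delay approximation. The Elmore model is a substituted, well-known technique chosen for illustration; the text above does not state which delay model the delay prediction application uses. The function name, the data layout, and the example values are likewise assumptions.

```python
from collections import deque


def elmore_delays(parent, res, cap):
    """Elmore delay from the driver (root) to every node of an RC tree.

    parent: node -> parent node, with the root mapped to None
    res:    node -> resistance of the segment from its parent (ohms)
    cap:    node -> capacitance lumped at the node (farads)
    """
    children = {node: [] for node in parent}
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            children[par].append(node)

    # Breadth-first order guarantees a parent is visited before its children.
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(children[node])

    # Accumulate the total capacitance at and below each node, leaves first.
    downstream = {node: cap.get(node, 0.0) for node in parent}
    for node in reversed(order):
        par = parent[node]
        if par is not None:
            downstream[par] += downstream[node]

    # Each segment contributes R_segment * (capacitance it must charge).
    delay = {root: 0.0}
    for node in order[1:]:
        delay[node] = delay[parent[node]] + res[node] * downstream[node]
    return delay


# Assumed example net: a driver feeding node "a", which branches to "b" and "c".
tree = {"drv": None, "a": "drv", "b": "a", "c": "a"}
delays = elmore_delays(tree,
                       res={"a": 2.0, "b": 3.0, "c": 1.0},
                       cap={"a": 1.0, "b": 2.0, "c": 4.0})
```

In this example the segment into "a" must charge all 7 units of capacitance, so every sink inherits a delay of 14 before its own branch is added.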
FIG. 2 is a block diagram illustrating a conventional delay prediction process performed after parasitic extraction. Delay prediction is typically performed by a monolithic software application 40 running on a server 42 or mainframe that estimates the delays across all the gates represented in the netlist 20, and outputs the estimated delays as a standard delay format (SDF) output file 44.
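For orientation, the sketch below formats a handful of interconnect delays as a simplified fragment in the style of an SDF file. It is an assumption-laden illustration: the function name, the design and pin names, the choice of SDF version 3.0, and the use of identical minimum, typical, and maximum values are all hypothetical, and the output has not been checked against any SDF reader.

```python
def format_sdf(design, net_delays_ns):
    """Format interconnect delays as a minimal SDF-style text fragment.

    net_delays_ns maps (source_pin, sink_pin) -> delay in nanoseconds.
    """
    lines = [
        "(DELAYFILE",
        '  (SDFVERSION "3.0")',
        f'  (DESIGN "{design}")',
        "  (DIVIDER /)",
        "  (TIMESCALE 1ns)",
        "  (CELL",
        f'    (CELLTYPE "{design}")',
        "    (INSTANCE)",
        "    (DELAY",
        "      (ABSOLUTE",
    ]
    for (src, dst), delay in sorted(net_delays_ns.items()):
        # Same value reused for min:typ:max and for rise and fall (assumption).
        triple = f"({delay:.3f}:{delay:.3f}:{delay:.3f})"
        lines.append(f"        (INTERCONNECT {src} {dst} {triple} {triple})")
    lines += ["      )", "    )", "  )", ")"]
    return "\n".join(lines) + "\n"


# Hypothetical instance and pin names.
sdf_text = format_sdf("top", {("u1/Y", "u2/A"): 0.12, ("u1/Y", "u3/B"): 0.2})
```

A real delay prediction run emits entries of this kind for every net and cell in the netlist, which is part of why the output file for a multi-million gate design is large.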
The conventional delay prediction process is slow and resource intensive. For example, delay prediction for a 10 million gate ASIC design may take up to two days to complete. If the subsequent post-layout simulation proves inaccurate or the netlist 20 is changed, then the entire physical design process, including the delay prediction, must be repeated. The process is resource intensive because it requires a large physical memory configuration and extremely fast CPUs to run the delay prediction software application 40. For example, to adequately perform a delay prediction calculation for a 10 million gate design, an enterprise class server with 16 GB of physical memory is required, such as a SUN E4500 server. And because the delay prediction is so resource intensive, the use of other software tools on the server is precluded. Therefore, the delay prediction process is not only time-consuming, but is also expensive in terms of hardware requirements.
Previous attempts have been made to speed up the delay prediction process. One approach attempted to increase the efficiency of the sequential delay prediction calculations by identifying performance bottlenecks in the delay prediction application 40 and optimizing them to speed up the overall process. This approach resulted in only minimal improvements and failed to have a significant effect on the overall runtime of the application 40. Another approach attempted to improve runtime by using a multi-threaded delay prediction application, rather than a single-threaded application. Although multithreading improved runtime somewhat, it was not sufficient to reduce delay prediction runtimes on very large designs. In addition, both approaches still required expensive hardware to run the delay prediction application 40.
Accordingly, what is needed is an improved method for performing delay prediction of multi-million gate sub-micron ASIC designs. The delay prediction process should result in significant time improvements over the monolithic application approach and require less expensive hardware resources. The present invention addresses such a need.