For many years, the performance of digital machine designs has been evaluated by performing static timing analysis on the designs. Timing analysis is a design automation tool which provides an alternative to the hardware debugging of timing problems. The program is intended to establish whether all the paths within the design meet stated predetermined timing criteria, i.e., whether data signals arrive at storage elements early enough for a valid capture, but not so early as to cause premature capture.
Propagation Segments, Timing Points, and Timing Graph:
Static timing analysis (STA) of digital systems has been used for decades to analyze the performance of digital designs. When performing an STA, elements of the design which can delay signals, such as logic gates, wires, and combinations thereof, are represented by propagation segments, also referred to as ‘psegs’, which represent the delays of those elements. Known in the art are systems where delays are determined by circuit simulations, as well as systems where delays are computed by various approximations to a simulated result. Each pseg includes two timing points, a “from” timing point representing an input to the delay element and a “to” timing point representing an output of the delay element. The timing graph of the design as a whole includes a full set of all the timing points and psegs of all logic gates, interconnect wires, and like delay elements in the design.
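By way of illustration only, the relationship between psegs, timing points, and the timing graph can be sketched as a small data structure. The class names `Pseg` and `TimingGraph`, the method `late_arrival_times`, and the delay values are hypothetical and do not come from any particular tool; the sketch assumes an acyclic graph and propagates only late mode arrival times.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Pseg:
    from_point: str  # "from" timing point: input of the delay element
    to_point: str    # "to" timing point: output of the delay element
    delay: float     # delay of the element, in arbitrary time units


class TimingGraph:
    """All timing points and psegs of a design (assumed acyclic here)."""

    def __init__(self):
        self.points = set()
        self.psegs = []
        self.fanout = defaultdict(list)  # from-point -> psegs leaving it

    def add_pseg(self, from_point, to_point, delay):
        pseg = Pseg(from_point, to_point, delay)
        self.points.update((from_point, to_point))
        self.psegs.append(pseg)
        self.fanout[from_point].append(pseg)
        return pseg

    def late_arrival_times(self, source_ats):
        """Propagate the latest arrival time to every reachable point."""
        pending = {point: 0 for point in self.points}
        for pseg in self.psegs:
            pending[pseg.to_point] += 1
        ats = dict(source_ats)
        ready = [point for point, count in pending.items() if count == 0]
        while ready:  # points are visited in topological order
            point = ready.pop()
            for pseg in self.fanout.get(point, []):
                if point in ats:
                    candidate = ats[point] + pseg.delay
                    ats[pseg.to_point] = max(ats.get(pseg.to_point, candidate), candidate)
                pending[pseg.to_point] -= 1
                if pending[pseg.to_point] == 0:
                    ready.append(pseg.to_point)
        return ats


graph = TimingGraph()
graph.add_pseg("A", "Z", 3.0)  # gate pseg from input A to output Z
graph.add_pseg("B", "Z", 2.0)  # gate pseg from input B to output Z
graph.add_pseg("Z", "Y", 1.0)  # wire pseg from Z to the next timing point
arrival = graph.late_arrival_times({"A": 0.0, "B": 2.0})
```

In this example the late arrival time at Z is set by input B (2.0 + 2.0), and the wire pseg then carries it to Y.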
Delay Models, Delay Model Construction:
For large hardware designs, static timing analysis is preferably done hierarchically. In a hierarchical static timing analysis, the STA is performed by generating abstracted timing models for each subsection of the design, starting at the lowest level with the logic macros. An abstracted timing model is a representation of the effect of a subsection of the design on the timing analysis of the design as a whole. For example, it may consist of a subset of the psegs copied from the timing graph of that subsection, typically omitting psegs which purely affect the internal timing of that subsection. The models are then combined into successively higher level timing analyses and abstracted timing models until the analysis of the entire system is complete. This is necessary in order to keep the memory requirements of the analysis affordable. Hierarchical static timing analysis requires that delay models be generated for lower level subsections of the design in order to use them for performing the static timing analysis of the larger sections of the design that contain them. A delay model represents the delay through a circuit topologically with psegs, but unlike the STA of a lowest level section of a design, the delay model calculates the delay through a pseg without circuit simulations at the time that the delay model is used. Instead, a delay model calculates delays based on stored information from circuit simulations which were done when the delay model was being constructed.
Delay Sampling:
When the delay model is being constructed for a subsection of a design, various parameters affecting the delays through the subsection in the complete design are generally not known. For instance, the input slew (a measure of the time it takes for the input to transition) and output load for a subsection of a design are generally unknown at the time when the delay model is constructed. This is generally handled by sampling the variation of delay with respect to these parameters, varying these parameters through a number of values, performing a circuit simulation for each combination, and capturing the delay in the model.
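The sampling described above can be sketched as a two-dimensional table that is filled at model construction time and interpolated at model use time. This is a minimal sketch under stated assumptions: `simulate_delay` is a hypothetical analytical stand-in for a circuit simulation, the sample points are arbitrary, and bilinear interpolation with clamping at the table edges is only one of several possible lookup schemes.

```python
from bisect import bisect_right


def simulate_delay(slew, load):
    """Hypothetical stand-in for one circuit simulation (not a real model)."""
    return 10.0 + 0.5 * slew + 2.0 * load


def build_delay_table(slews, loads, simulate=simulate_delay):
    """Run one simulation per (slew, load) combination and store the delay."""
    return [[simulate(slew, load) for load in loads] for slew in slews]


def _bracket(axis, x):
    """Return the lower index and the fractional position of x on a sorted axis.

    The axis needs at least two sample points. Values outside the sampled
    range are clamped to the nearest edge rather than extrapolated.
    """
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, min(max(t, 0.0), 1.0)


def lookup_delay(table, slews, loads, slew, load):
    """Bilinear interpolation of the stored delays; no simulation is run."""
    i, ts = _bracket(slews, slew)
    j, tl = _bracket(loads, load)
    low = table[i][j] * (1 - tl) + table[i][j + 1] * tl
    high = table[i + 1][j] * (1 - tl) + table[i + 1][j + 1] * tl
    return low * (1 - ts) + high * ts


slews = [10.0, 50.0, 100.0]  # sampled input slews (arbitrary units)
loads = [1.0, 5.0, 20.0]     # sampled output loads (arbitrary units)
table = build_delay_table(slews, loads)  # 3 x 3 = 9 simulations
delay = lookup_delay(table, slews, loads, slew=30.0, load=3.0)
```

With three sample points on each of the two axes, nine simulations are performed when the model is built, and every later lookup is answered from the stored table.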
Single Input Switching, Multiple Input Switching, Input Skew, MIS Penalty:
In the simplest form of STA, only the input at the “from” end of a pseg switches in any given circuit simulation. In this form, any other inputs are held at constant logical values. For example, a NOR gate with two inputs A and B and an output Z will be simulated with input A rising while input B is held at zero to produce one of the delays on the pseg from A to Z. This is referred to as a single input switching (SIS) event or simulation. More sophisticated analyses also include simulations where more than one input switches. For example, if two inputs to a NOR gate rise, the output also falls, but with a different delay than if a single input switches. This is referred to as a multiple input switching (MIS) event or simulation. The value of the delay from the simulation depends upon the relative timing between the two switching inputs, i.e., the skew therebetween, as illustrated with reference to FIGS. 1a and 1b. The delay model for a subsection of the design must report the worst possible delay for each pseg. Adding information about MIS simulations, in addition to the information about the SIS simulations, makes the reported delays worse. This is the MIS penalty for a pseg (see FIGS. 1c and 1d). In other words, the MIS penalty is the difference between the worst possible pseg delay, including both MIS and SIS simulation results, and the worst delay when only the SIS results are included. Static timing analysis typically is used to find both maximum (late mode) and minimum (early mode) arrival times. The MIS penalty may be either positive (increasing the SIS delay) or negative (decreasing the SIS delay), and a positive MIS penalty would preferably be used only for late mode analysis, while a negative MIS penalty would preferably be used only for early mode analysis. A delay change due to MIS typically occurs because multiple transistors within a gate, or CCC (channel connected component), can contribute to an output transition.
Multiple input switching on inputs connected to gates of transistors driving an output through a series path, such as the PFETs in the NOR gate of FIG. 1a, will typically cause a positive MIS penalty. Multiple input switching on inputs connected to gates of transistors driving an output through parallel paths, such as the NFETs in the NOR gate of FIG. 1a, will typically cause a negative MIS penalty. The example of FIGS. 1a-d illustrates a falling output transition driven by one or both of the parallel NFETs, and a resulting negative MIS penalty.
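The definition of the MIS penalty given above can be written out as a short calculation. The function name `mis_penalty` and the delay values are hypothetical; the sketch assumes that the worst delay is the maximum in late mode and the minimum in early mode.

```python
def mis_penalty(sis_delays, mis_delays, mode):
    """Worst delay over SIS and MIS results minus worst delay over SIS only."""
    if mode == "late":
        worst = max  # late mode: the largest delay is the worst
    elif mode == "early":
        worst = min  # early mode: the smallest delay is the worst
    else:
        raise ValueError("mode must be 'late' or 'early'")
    sis_worst = worst(sis_delays)
    combined_worst = worst(list(sis_delays) + list(mis_delays))
    return combined_worst - sis_worst


# Hypothetical falling-output delays for a pseg of a two-input NOR gate:
# both parallel NFETs conducting together pull the output down faster.
sis = [20.0]
mis = [14.0, 16.0]
early_penalty = mis_penalty(sis, mis, "early")  # negative: MIS is faster
late_penalty = mis_penalty(sis, mis, "late")    # zero: SIS is already worst
```

Computed this way, the late mode penalty can never be negative and the early mode penalty can never be positive, which matches the preference for applying positive penalties only in late mode and negative penalties only in early mode.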
MIS Window:
MIS simulations include switching events on a number of circuit inputs, separated by the skews between the input switching events. If the skews become sufficiently large, the circuit has time to settle between each successive input switch. In this limit, the circuit is essentially responding to a sequence of SIS events, and the MIS penalty no longer applies. This occurs when the skew between any two inputs becomes sufficiently large in either a positive or negative direction. Thus, there is a window of skew within which the MIS penalty applies, as illustrated in FIG. 1d. 
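As a rough illustration, the skew window can be estimated from a set of sampled MIS simulations by finding the range of skews over which the delay departs from the SIS delay. The function `mis_window`, the tolerance parameter, and the sample data are hypothetical, and the result is only as fine as the sampled skew spacing.

```python
def mis_window(samples, sis_delay, tolerance=0.0):
    """Return (min_skew, max_skew) over which the delay differs from SIS.

    samples: iterable of (skew, delay) pairs from MIS simulations.
    Returns None when no sample deviates by more than the tolerance.
    """
    affected = [skew for skew, delay in samples
                if abs(delay - sis_delay) > tolerance]
    if not affected:
        return None
    return min(affected), max(affected)


# Hypothetical sweep of the skew between two switching inputs.
sweep = [(-40.0, 20.0), (-20.0, 18.0), (0.0, 14.0), (20.0, 17.0), (40.0, 20.0)]
window = mis_window(sweep, sis_delay=20.0)
```

In this made-up sweep the delay returns to the SIS value at skews of -40 and +40, so the window is reported as the outermost sampled skews that still show a deviation.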
MIS simulations create skew dependence problems for delay model generation in certain portions of a macro. In the portion of the macro from the PIs (primary inputs) to the data inputs of the first level of non-transparent latches or flip-flops (represented by paired boxes), the skew between inputs to each CCC typically depends on the skew, as seen in FIG. 2, between the macro PIs, which is not normally known when the abstracted timing rule is being generated. Even more seriously, since some macros are used in multiple places within an overall design, and multiple usages may have different skews between their PIs' arrival times (ATs), the skew cannot be uniquely specified when the abstract is generated, as illustrated in FIG. 3. Delay models for individual gates are also typically reused for many different instances of a gate within a design with different skews between the ATs of the gate inputs, and thus the MIS penalty will differ between different instances of the gate and cannot be determined at model generation time.
There are other portions of a macro (latch to latch and latch to output paths, i.e., paths beyond the first level of non-transparent latches) where relative skew can be determined safely when the abstract is generated. In these areas, all CCC primary input arrival times are traceable directly or through LCBs (local clock buffers) to the macro clock PI such that the relative skew is fixed, and this relative skew can be applied as a known quantity in the simulation of each CCC in these portions of the macro, as illustrated in FIG. 4. In some cases macros may have multiple clock inputs, and if the skews between these clock inputs can vary or are not known at model generation time, the input AT skews for CCCs driven by latches controlled by different input clocks and the resultant MIS penalties will also be unknown at model generation time.
More generally, any non-transparent or edge-triggered logic structure (such as some domino circuits, if modeled non-transparently) can also bound the portion of the macro affected by macro PIs' ATs' skew. Any CCC whose input ATs are determined by the AT of a single PI (data or clock) will have a fixed skew which can be applied as a known quantity during delay model generation.
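One way to picture this classification is to trace, for every timing point, which PIs can influence its arrival time, and then to test whether all inputs of a CCC trace back to a single PI. The sketch below is illustrative only: the function names, the point names, and the edge list are hypothetical, and it assumes that the caller has already omitted the data-to-output edges of non-transparent latches, so that a latch output depends only on its clock.

```python
from collections import defaultdict


def pi_sources(edges, primary_inputs):
    """Map each timing point to the set of PIs whose AT can reach it."""
    fanout = defaultdict(list)
    for src, dst in edges:
        fanout[src].append(dst)
    sources = defaultdict(set)
    for pi in primary_inputs:
        stack, seen = [pi], {pi}
        while stack:  # depth-first traversal of the fanout cone of this PI
            point = stack.pop()
            sources[point].add(pi)
            for nxt in fanout.get(point, []):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return sources


def has_fixed_skew(ccc_inputs, sources):
    """True when every input AT of the CCC traces back to one and the same PI."""
    driving = set()
    for point in ccc_inputs:
        driving |= sources.get(point, set())
    return len(driving) == 1


edges = [
    ("d0", "g1.a"), ("d1", "g1.b"),                # PI-to-latch region
    ("clk", "lcb"),                                # clock PI to local clock buffer
    ("lcb", "latch1.q"), ("lcb", "latch2.q"),      # clock-to-output of each latch
    ("latch1.q", "g2.a"), ("latch2.q", "g2.b"),    # latch-to-latch region
]
sources = pi_sources(edges, ["d0", "d1", "clk"])
g1_fixed = has_fixed_skew(["g1.a", "g1.b"], sources)  # driven by d0 and d1
g2_fixed = has_fixed_skew(["g2.a", "g2.b"], sources)  # driven by clk only
```

Here the CCC fed by the two data PIs has a skew that depends on the macro's usage, while the CCC fed by two latches on the same clock has a skew that is known at model generation time.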
Capturing delays due to MIS simulations where the skews cannot be predetermined poses a problem for the delay model construction. Because these delays are sensitive to the skew between inputs, for maximum accuracy, one would like to sample various combinations of these skews as one does for the other parameters that the delay depends on. The difficulty is that the number of samples needed grows exponentially with the number of inputs (see FIG. 5). This greatly increases the cost of constructing the delay model. The prior art includes a number of approaches:
i) Creating a delay model based on single input switching delays and ignoring MIS or adding an averaged penalty factor. This has the advantage of avoiding the cost of any multiple input switching simulations during abstract rule generation. This has the disadvantage of approximating the MIS effects with a global average insensitive to the magnitude of the effect on each specific circuit.
ii) Creating a delay model by explicitly sampling the possible skews that might feed into each CCC, performing each of these simulations during abstract generation, and capturing the resulting delays as a table for interpolation during delay model use. For a sensitization with N inputs switching there are N−1 relative skews to be sampled (ignoring differences in slews on the inputs), so this expands the space to be sampled during abstract generation by N−1 dimensions. For example, creating a delay model for a 4-input NAND while accounting for only SIS simulations requires a delay rule which depends upon input slew and output load, i.e., two dimensions. In contrast, creating a rule which also accounts for MIS simulations requires a delay rule which depends upon input slew, output load, the skew between input 1 and input 2, the skew between input 2 and input 3, and the skew between input 3 and input 4, a total of five dimensions which must be sampled, i.e., three dimensions beyond those needed for the SIS rule. This has the advantage of capturing the full skew dependence, but the disadvantage of very high simulation costs during abstract generation and very high model size, as both the number of simulations and the size of the resulting data table grow exponentially with the number of dimensions. For K sampling points in each of D dimensions, the number of required simulations and table entries is K^D.
iii) Similar to (ii), but fitting the delay model to a linear or other functional form within a window. This has the advantage of capturing most of the dependence on macro PI AT skew for use in the upper level timing analysis and reducing the size of the resulting delay model, but retains the disadvantage of performing many simulations to sample possible skews between inputs.
iv) Creating a delay model with a single simulation of the MIS sensitization with a worst case alignment of the side input ATs, and worst-casing the resulting delay value into the delay model for use in all higher level timing analyses. This has the advantage of confining the cost to a single simulation, but it also has the disadvantage of pessimism, applying this worst (minimum for early mode or maximum for late mode) result in the higher level analyses regardless of the macro PI AT skew. Note that because an MIS penalty is normally applied only to one of the early and late mode analyses, the SIS delay characterization is still needed for the other mode.
v) Similar to (iv) but when characterizing the delay from a particular input, allowing the ATs of all other side inputs to move in optimistic directions (move earlier for late mode analysis and later for early mode analysis) to reduce the penalty from the MIS simulations when determining the delay from the particular input. Typically this movement is allowed until the AT is the same as that of the particular input, or until it would (using delays from SIS modeling) cause the same AT at the output. For the input with the worst AT (maximum for late mode or minimum for early mode) the MIS penalty would be the same as for (iv). Again, this has the advantage of confining the cost to one MIS simulation per input. Although it reduces the pessimism of the abstracted rule, it also creates the possibility of optimism.
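The cost of approach (ii) in the list above can be made concrete with a short calculation. The function name `mis_table_size` and the choice of five sampling points per dimension are hypothetical; the two base dimensions are the input slew and output load named in the text.

```python
def mis_table_size(points_per_dim, switching_inputs, base_dims=2):
    """Return (dimensions, simulations) for an explicitly sampled delay rule.

    base_dims counts input slew and output load; each additional switching
    input adds one relative skew dimension (N - 1 skews for N inputs).
    """
    if switching_inputs < 1:
        raise ValueError("at least one input must switch")
    dims = base_dims + (switching_inputs - 1)
    return dims, points_per_dim ** dims


sis_dims, sis_sims = mis_table_size(5, switching_inputs=1)  # SIS-only rule
mis_dims, mis_sims = mis_table_size(5, switching_inputs=4)  # 4-input NAND
```

With five sampling points per dimension, the SIS rule for the 4-input NAND needs 25 simulations, while the fully sampled MIS rule needs 3125, and the stored table grows by the same factor.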
In view of the foregoing, there is a need to improve the trade-off between the cost of generating an abstracted timing rule for a logic macro from a transistor level timing which contains contributions from multiple input switching simulations and the pessimism of timing rules.