Logic devices such as field programmable gate arrays (FPGAs) are used to implement large systems that may include millions of gates and megabits of embedded memory. The complexity of large systems often requires the use of electronic design automation (EDA) tools to create and optimize a design for the system onto physical target devices. Among the procedures performed by EDA tools in a computer aided design (CAD) flow are synthesis, placement, and routing.
Timing analysis is an important aspect of design that allows the EDA tools to determine whether certain synthesis, placement, and/or routing decisions allow a design to satisfy system timing requirements. If a particular synthesis, placement, and/or routing decision does not satisfy system timing requirements, alternate strategies may be explored and/or notification may be provided to the system designer. Timing analysis may be performed during or after synthesis, placement, and routing.
One component of timing analysis is the computation of look-up table (LUT) delay. A LUT delay is the numerical delay through a LUT, an elementary programmable logic block on an FPGA. A LUT may have multiple binary inputs and a single output. A LUT is programmable to represent any Boolean function of its inputs through its LUTMASK, which is a vector of binary values. A LUT delay is specified by the input and output port names, the input and output signal transitions, and the configuration RAM (CRAM) bits selected.
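The relationship between a LUTMASK and the Boolean function it represents can be illustrated with a minimal Python sketch. The function name `lut_output`, the example masks, and the bit ordering (the first input is the least significant bit of the index into the LUTMASK) are assumptions made for illustration; actual devices may order the bits differently.

```python
def lut_output(lutmask: int, inputs: tuple) -> int:
    """Return the LUT output for the given binary inputs.

    Assumption for illustration: the inputs form an index into the LUTMASK,
    with inputs[0] as the least significant bit of that index.
    """
    index = 0
    for position, bit in enumerate(inputs):
        index |= (bit & 1) << position
    # The selected LUTMASK bit is the value of the Boolean function.
    return (lutmask >> index) & 1


# Hypothetical example masks.
AND4_MASK = 1 << 15   # 4-input AND: only index 0b1111 yields 1
XOR2_MASK = 0b0110    # 2-input XOR: indices 1 and 2 yield 1

and_result = lut_output(AND4_MASK, (1, 1, 1, 1))
xor_result = lut_output(XOR2_MASK, (1, 0))
```

Under these assumptions, changing only the LUTMASK changes the function the same LUT implements, which is why the set of LUTMASK bits involved is part of what identifies a LUT delay.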
One approach used by timing analyzers in the past was to use a predetermined delay value that represented the worst-case delay over all paths through a LUT for a data input. This approach reduced the amount of work required of the timing analyzer, but produced a result that was pessimistic and less accurate. Another approach used by timing analyzers was LUTMASK-based delay modeling, which computed more accurate delays by taking into account the actual CRAM bit paths. When performing LUTMASK-based delay modeling using the Liberty Model, timing analyzers treated LUTs as black boxes and computed delays for all possible LUTMASK configurations. This required 2^16 delay computations for a single LUT delay. Given that each 4-input LUT has 32 delays (4 inputs × 4 transitions × 2 min/max values), timing analyzers were required to compute and store at least 32 × 2^16 LUT delay values. This approach required additional run time and memory resources.
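The scale of the exhaustive approach can be checked with a short Python calculation. This is only a sketch of the arithmetic for a 4-input LUT, which has 16 LUTMASK bits and therefore 2^16 possible configurations; the variable names are illustrative.

```python
NUM_INPUTS = 4    # inputs of the LUT
TRANSITIONS = 4   # input/output signal transition combinations
MIN_MAX = 2       # minimum and maximum delay values

# One LUTMASK bit per input combination, and every bit may be 0 or 1.
lutmask_bits = 2 ** NUM_INPUTS
configurations = 2 ** lutmask_bits

delays_per_lut = NUM_INPUTS * TRANSITIONS * MIN_MAX
total_delay_values = delays_per_lut * configurations

print(f"{lutmask_bits} LUTMASK bits, {configurations} configurations")
print(f"{delays_per_lut} delays per LUT, {total_delay_values} values in total")
```

For these parameters the total is 32 × 65,536 = 2,097,152 stored delay values, which accounts for the run time and memory cost noted above.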