1. Technical Field
The present invention relates generally to the field of static timing analysis of logic circuitry, and more particularly to a method of distributed static timing analysis for a network which has been partitioned into at least two partitions with each partition being assigned to a separate timing analysis process.
2. Background of Invention
Static timing analysis may be applied to a logic circuit to determine if it meets predetermined specifications. Two broad classes of static timing analysis methods are known: path oriented and block oriented. The path oriented methods trace each path through the logic network and compute the arrival times at nodes separately for each path. These methods often take a long time to complete, as the number of paths in a network can rise exponentially with the size of the network. The number of paths considered in a timing analysis can be pruned down, but this cannot always be done perfectly and some paths may be missed. In block oriented methods, the arrival time at each node other than a primary input is computed only once, as the maximum, over all in-edges of the node, of the in-edge source arrival time plus the logic circuit delay. This approach gives much better control over the run time of the analysis, and is the preferred approach.
In the block oriented methods, there are two techniques which are commonly used to control the order of the initial arrival time calculation. One technique requires that the network be levelized so that each node is assigned a number which is greater than that of the nodes which precede it. The arrival times are then computed in the order of the level of the node. When using this technique, another method must be used in order to detect loops in the network.
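The levelized form of the block oriented computation can be illustrated with a minimal Python sketch. The function names `levelize` and `arrival_times`, the edge dictionary layout, and the sample delays are assumptions made for illustration only and are not taken from any particular timing tool. The sketch assumes every node without in-edges is a primary input with a known arrival time; its completeness check only reports that levelization failed and does not locate or cut a loop.

```python
from collections import defaultdict

def levelize(nodes, edges):
    """Give each node a level greater than that of every predecessor.

    `edges` maps (source, sink) -> delay (hypothetical layout).
    """
    preds_left = {n: 0 for n in nodes}
    succs = defaultdict(list)
    for src, dst in edges:
        succs[src].append(dst)
        preds_left[dst] += 1
    level = {n: 0 for n in nodes}
    ready = [n for n in nodes if preds_left[n] == 0]
    done = 0
    while ready:
        node = ready.pop()
        done += 1
        for nxt in succs[node]:
            level[nxt] = max(level[nxt], level[node] + 1)
            preds_left[nxt] -= 1
            if preds_left[nxt] == 0:
                ready.append(nxt)
    if done != len(nodes):
        # Some node never became ready; a separate method would be
        # needed to find the loop responsible.
        raise ValueError("network could not be fully levelized")
    return level

def arrival_times(nodes, edges, primary_inputs):
    """Compute each arrival time once, in level order."""
    level = levelize(nodes, edges)
    preds = defaultdict(list)
    for (src, dst), delay in edges.items():
        preds[dst].append((src, delay))
    at = dict(primary_inputs)
    for node in sorted(nodes, key=level.get):
        if node not in at:
            # Maximum over all in-edges of source arrival time plus delay.
            at[node] = max(at[src] + d for src, d in preds[node])
    return at

nodes = ["a", "b", "c", "d"]
edges = {("a", "c"): 2, ("b", "c"): 1, ("c", "d"): 3, ("a", "d"): 1}
times = arrival_times(nodes, edges, {"a": 0, "b": 4})
print(times["c"], times["d"])  # prints "5 8"
```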
The second technique performs a recursive depth first search (DFS) from the point where an arrival time is requested. The present invention is based on a block oriented method using this latter technique. The arrival time for a node is computed once all of its predecessor nodes have had their arrival times computed. If a node is encountered for which an arrival time has already been computed, the DFS along that branch is terminated and the stored arrival time is used. If no arrival time is present, the in-edges of the delay graph are looped through and, for each in-edge, a request is made for the source arrival time and the logic circuit delay. These are added, and the maximum sum is used as the arrival time at the node. The DFS can also detect and cut local loops. Although the above described methods work well in certain situations, they are not without limitations.
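The recursive DFS technique can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any specific tool: the name `request_arrival_time`, the `preds` layout (sink mapped to a list of `(source, delay)` pairs), and the policy of cutting the in-edge on which a loop is first noticed are all hypothetical choices. The `known` dictionary holds the primary input arrival times and doubles as the store of computed values.

```python
def request_arrival_time(node, preds, known, visiting=None, cut_edges=None):
    """Return the arrival time at `node`, computing predecessors on demand."""
    if visiting is None:
        visiting = set()
    if cut_edges is None:
        cut_edges = set()
    if node in known:
        # Arrival time already stored: terminate the DFS on this branch.
        return known[node]
    visiting.add(node)
    best = None
    for src, delay in preds.get(node, []):
        if src in visiting:
            # The source is still being computed, so this in-edge closes
            # a local loop. Cut it (assumed policy) and move on.
            cut_edges.add((src, node))
            continue
        candidate = request_arrival_time(src, preds, known, visiting, cut_edges) + delay
        if best is None or candidate > best:
            best = candidate
    visiting.discard(node)
    if best is None:
        raise ValueError(f"no usable in-edge for node {node!r}")
    known[node] = best
    return best

# Hypothetical delay graph with a local loop c -> d -> c.
preds = {
    "c": [("a", 2), ("b", 1), ("d", 1)],
    "d": [("c", 3), ("a", 1)],
}
known = {"a": 0, "b": 4}
cuts = set()
result = request_arrival_time("d", preds, known, cut_edges=cuts)
print(result, cuts)  # prints "8 {('d', 'c')}"
```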
Design of logic networks is becoming increasingly complex and may include thousands of logic elements and many interconnections between the elements. As a result, actually performing a static timing analysis in one of the above described ways cannot be done efficiently as the analysis may become too large to be held in memory, run too slowly on one processor, or require too many resources.
U.S. Pat. No. 5,602,754 to Beatty et al. proposes a method to solve these problems. This method partitions the network in some manner. Each partition of the network is then assigned to a separate timing analysis process, each of which may be running on separate computers. Communications are established between the processes in order to exchange data therebetween. This allows the complex task of timing analysis to be processed in parallel by the separate processes, thereby increasing the system performance.
However, the method of Beatty et al. has several limitations. It is not able to handle global loops in the network, i.e., loops which span two or more of the partitions. Global loops are becoming increasingly common with the use of transparent single-latch designs. If the method of Beatty et al. encounters a global loop, it enters an infinite recursion and the analysis fails. In addition, Beatty et al. does not handle inexact synchronization between portions of a distributed timing analysis process. Finally, Beatty et al. cannot perform incremental initial timing analysis: as described, it always performs an analysis on the entire network, even when only a subset of the timing information is needed.
There is a need for a method of static timing analysis which can be distributed to run on several processors in parallel, which can detect and cut global loops, and which can perform incremental initial timing analysis and timing updates on a portion of the logic network.
The invention comprises a method of distributed timing analysis for a network which has been partitioned into at least two partitions, with each partition being assigned to a separate timing analysis process which communicates with the other processes. In accordance with one embodiment, a request is made for timing information at a node in the logic network and a depth first search (DFS) is performed from the requested node. This is done in a known manner, and the loops which are local to a process, i.e., within a given partition, are detected and cut. Local primary inputs with known timing values are added to a propagation queue. Requests are issued to other processes, i.e., those handling other partitions, for nodes which require timing information from those partitions. When a request for timing information from another partition is answered, the associated node is placed in the propagation queue. Each node in the propagation queue is processed and a timing value is computed for it. As a node is processed, those of its successors which have had all their predecessors processed are added to the propagation queue. This continues until a timing value is computed for the requested node and is returned to the requester.
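The propagation queue described in this embodiment can be sketched for a single partition as follows. This is a simplified illustration under several assumptions: `analyze_partition`, `remote_inputs`, and `request_remote` are hypothetical names; the request to another process is modeled as a synchronous callback rather than inter-process communication; and local loops are taken to have been cut already.

```python
from collections import deque

def analyze_partition(target, preds, local_inputs, remote_inputs, request_remote):
    """Return the arrival time at `target` within one partition."""
    # DFS backward from the requested node to collect the nodes it depends on.
    cone, stack = set(), [target]
    while stack:
        node = stack.pop()
        if node not in cone:
            cone.add(node)
            stack.extend(src for src, _ in preds.get(node, []))

    succs = {n: [] for n in cone}
    pending = {n: 0 for n in cone}
    for n in cone:
        for src, delay in preds.get(n, []):
            succs[src].append((n, delay))
            pending[n] += 1

    at, queue = {}, deque()
    for n in cone:
        if n in local_inputs:
            # Local primary input with a known timing value.
            at[n] = local_inputs[n]
            queue.append(n)
        elif n in remote_inputs:
            # Value owned by another partition; once the request is
            # answered the node joins the propagation queue.
            at[n] = request_remote(n)
            queue.append(n)

    while queue:
        n = queue.popleft()
        for succ, delay in succs[n]:
            at[succ] = max(at.get(succ, float("-inf")), at[n] + delay)
            pending[succ] -= 1
            if pending[succ] == 0:
                # All predecessors processed: the value is final.
                queue.append(succ)
    return at[target]

preds = {"n1": [("pi", 1), ("x_in", 2)], "out": [("n1", 3)]}
asked = []

def request_remote(node):
    # Stand-in for a message to the process owning the other partition.
    asked.append(node)
    return {"x_in": 5}[node]

value = analyze_partition("out", preds, {"pi": 0}, {"x_in"}, request_remote)
print(value)  # prints "10"
```

Only the boundary node that the requested node actually depends on is requested from the other process, so the work is limited to the portion of the network feeding the requested node.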
According to another embodiment of the invention, global loops which exist between the partitions are detected and cut as follows. Each process generates an input/output (I/O) connection map for its partition, listing each input/output pair between which a path exists in the partition. This I/O map is then communicated to each neighboring partition. Each process receives the I/O maps from its neighboring partitions and merges them with its own I/O map, and the merged map is then passed on, and so on. Each process checks the I/O maps it receives, and when it receives a map which includes some of its own inputs or outputs, it determines whether these connections result in a global loop. The processes then negotiate with each other to determine the best place to cut the loop. Thereafter, the DFS may continue as already described.
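The effect of merging I/O maps can be illustrated with a minimal sketch. For brevity it merges all maps in one place instead of passing them from neighbor to neighbor, and it does not model the negotiation over where to cut. The function name `find_global_loop`, the pin names, and the `connections` set describing which partition output drives which partition input are assumptions made for illustration.

```python
def find_global_loop(io_maps, connections):
    """Return the boundary pins on a global loop, or None if there is none.

    io_maps: {partition: set of (input_pin, output_pin)} pairs between
             which a path exists inside that partition.
    connections: set of (output_pin, input_pin) links between partitions.
    """
    graph = {}
    for pairs in io_maps.values():
        for inp, out in pairs:
            graph.setdefault(inp, set()).add(out)
    for out, inp in connections:
        graph.setdefault(out, set()).add(inp)

    state, path = {}, []

    def visit(pin):
        state[pin] = "active"
        path.append(pin)
        for nxt in sorted(graph.get(pin, ())):
            if state.get(nxt) == "active":
                # Reached a pin already on the current path: a loop.
                return path[path.index(nxt):]
            if nxt not in state:
                loop = visit(nxt)
                if loop:
                    return loop
        state[pin] = "done"
        path.pop()
        return None

    for pin in sorted(graph):
        if pin not in state:
            loop = visit(pin)
            if loop:
                return loop
    return None

io_maps = {
    "P1": {("p1.in", "p1.out")},
    "P2": {("p2.in", "p2.out")},
}
forward_only = {("p1.out", "p2.in")}
with_feedback = {("p1.out", "p2.in"), ("p2.out", "p1.in")}
print(find_global_loop(io_maps, forward_only))   # prints "None"
print(find_global_loop(io_maps, with_feedback))
```

In the second call the feedback connection from the second partition to the first closes a loop that neither partition could see from its own I/O map alone.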
In another embodiment of the present invention, a method is provided for generating updated timing value information for the network after changes have been made to the network. This is done using the most current available information from all the other processes. A global request is transmitted to all processes to process their timing queues up to the necessary level and to send the resulting information to each partition subject to change.
In another embodiment, the updated timing values may be generated using the most current information already received from the other processes without waiting for other information which has not yet been propagated. This is done by having the processes send unrequested updates to processes which depend on them for values.