1. Technical Field
The present invention relates generally to the automated layout of integrated circuits. In particular, the present invention is directed toward automatic generation of optimized wire routing in very large scale integration (VLSI) circuits.
2. Description of Related Art
In recent years, it has become commonplace for integrated circuit designers to build an integrated circuit layout from libraries of reusable high-level modules, sometimes referred to as “macro blocks.” Proprietary macro blocks are often referred to as “intellectual property blocks” (“IP blocks”), to emphasize their relatively intangible, yet proprietary nature. Computerized integrated circuit design tools may be used to store, retrieve, and combine macro blocks into complete integrated circuits. This design philosophy of combining reusable macro blocks to produce a complex integrated circuit is known as “system-on-a-chip” (SoC) design.
Designing a “system-on-a-chip” involves designing the interconnections between macro blocks. Despite the apparent simplicity of SoC design, this is often not a trivial task, because the connections themselves are physical components (i.e., wires) with non-ideal properties. Like all electrical conductors, integrated circuit connections suffer from delay and signal loss due to physical properties such as resistance, capacitance, and fundamental limits on the speed at which electrical signals can propagate. To ensure that all components in an integrated circuit remain properly synchronized, it is important to take these factors into account when designing interconnections between macro blocks, so as to minimize signal loss and to allow operation within acceptable timing specifications.
The “Fast Path” algorithm, described in Hai Zhou, D. F. Wong, I-Min Liu, and Adnan Aziz, “Simultaneous Routing and Buffer Insertion with Restrictions of Buffer Locations,” IEEE Trans. Computer Aided Design, vol. 19, pp. 819-824, July 2000, hereby incorporated by reference, is an algorithm that computes a path connecting two nodes in an integrated circuit layout, where the path is optimized for minimal delay. The Fast Path algorithm also allows for an optimal placement of buffers within the path in order to further minimize delay.
The Fast Path algorithm is based on the observation that minimization of delay in a circuit layout is a special case of the well-known problem in computer science of finding the “shortest path” in a weighted graph. In an integrated circuit, points along the surface of the integrated circuit may be thought of as vertices in a graph. The interconnections made between the points may be thought of as edges. Each possible interconnection has an associated delay value, which may be thought of as an edge weight. When this graph representation is adopted, finding a minimum-delay path between two points reduces down to the problem of finding the minimum total-weight path between the two vertices representing the two points (i.e., finding the shortest path).
Dijkstra's Algorithm, one of the true classics of computer science, is an algorithm for finding the shortest path from a single source vertex in a weighted directed graph where the weights are non-negative (such as in the case of delays). Dijkstra's Algorithm and single-source shortest-paths algorithms in general are described in Cormen, Leiserson, and Rivest, Introduction to Algorithms, MIT Press, 1990, pp. 514-532. The Fast Path algorithm is based on Dijkstra's Algorithm, and an understanding of Dijkstra's Algorithm goes a long way in helping one to understand the Fast Path algorithm and its limitations.
Dijkstra's Algorithm is what is known as a “greedy algorithm,” because it exploits a property of the shortest path problem that is known as a “greedy-choice property.” A problem has a greedy-choice property if finding an optimal solution to some sub-problem (called making a “greedy choice”) always yields an optimal solution to the problem as a whole. In the case of the shortest-paths problem, a subpath of the shortest path between two vertices in a graph is itself the shortest path between its end vertices.
Dijkstra's Algorithm, in its most general sense, takes as input a graph G=(V,E), where V is the set of vertices and E⊆{(u,v)|u,v∈V} is the set of edges in the graph, a source vertex s∈V, and a weight function w mapping each edge to a non-negative weight value. Dijkstra's Algorithm also maintains a set S of vertices for which the shortest path has already been determined, a data structure d that maps each vertex to a current estimate of the total weight of the shortest path from the source vertex s, and a priority queue Q that contains all the vertices in V−S, keyed by their d values. The solution may be represented using a predecessor function π, mapping each vertex v to its predecessor vertex π(v) in the shortest path from the source vertex s to vertex v. A priority queue is a data structure that allows the vertex with the lowest value of d to be extracted from the data structure using an “EXTRACT_MIN” function. One particularly useful data structure that may be used to implement a priority queue is known as a “Fibonacci heap,” and is described in Cormen, Leiserson, and Rivest, Introduction to Algorithms, MIT Press, 1990, pp. 420-439. Pseudocode for Dijkstra's Algorithm is provided in Table I, below:
TABLE I
DIJKSTRA(G = (V, E), w, s) {
1.  for each vertex v ∈ V {
        d[v] ← ∞
        π[v] ← undefined
    }
    d[s] ← 0
2.  S ← ∅
    Q ← V
3.  while Q ≠ ∅ {
4.      u ← EXTRACT_MIN(Q)
        S ← S ∪ {u}
5.      for each edge e = (u,v) ∈ E adjacent to u {
            if d[v] > d[u] + w(e) {
                d[v] ← d[u] + w(e)
                π[v] ← u
            }
        }
    }
}
In each iteration of Dijkstra's Algorithm, the vertex u with the smallest estimated shortest-path weight is extracted from priority queue Q (step 4 in Table I). When the algorithm is first started, this vertex is the source vertex. Each edge e=(u,v) that proceeds from u is then examined to see if the path from s to u to v has a total weight that is less than the current estimated weight d[v] of the shortest path from s to v (step 5). If the total weight of the path from s to u to v is lower than d[v], then d[v] and π[v] are updated to reflect that the path from s to v through u is now the shortest known path from s to v. This modification of d[v] to reflect the shortest path currently known from s to v is the “greedy choice.”
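The steps of Table I can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the adjacency-list representation, the function name dijkstra, and the example graph are assumptions made here, and Python's binary heap (heapq) with lazy deletion of stale entries stands in for the Fibonacci heap mentioned above.

```python
import heapq
import math


def dijkstra(graph, source):
    """graph maps each vertex to a list of (neighbor, weight) pairs.

    Returns (d, pi): shortest-path weights and predecessors from source.
    """
    d = {v: math.inf for v in graph}   # step 1: initial estimates
    pi = {v: None for v in graph}
    d[source] = 0
    settled = set()                    # step 2: the set S
    queue = [(0, source)]              # heap of (estimate, vertex)
    while queue:                       # step 3
        du, u = heapq.heappop(queue)   # step 4: EXTRACT_MIN
        if u in settled:
            continue                   # stale entry left by an earlier update
        settled.add(u)
        for v, w in graph[u]:          # step 5: relax each edge leaving u
            if d[v] > du + w:
                d[v] = du + w
                pi[v] = u
                heapq.heappush(queue, (d[v], v))
    return d, pi


# Hypothetical four-vertex example graph.
example_graph = {
    "s": [("a", 4), ("b", 1)],
    "a": [("t", 1)],
    "b": [("a", 2), ("t", 6)],
    "t": [],
}
dist, pred = dijkstra(example_graph, "s")
```

In this example the direct edge from s to a (weight 4) is replaced by the cheaper route through b (weight 1 + 2), which is the kind of update performed in step 5.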
The Fast Path algorithm extends Dijkstra's Algorithm to the problem domain of integrated circuit routing. In the Fast Path algorithm, an integrated circuit is modeled as a “grid graph” G=(V,E), where each vertex v∈V represents a position on a Cartesian grid and each vertex v is connected to each orthogonally adjacent vertex in the grid. Intuitively, a grid graph can be pictured as a sheet of graph paper, where the intersections between the lines are the vertices and the line segments connecting adjacent vertices are the edges. The weight of each edge in the Fast Path algorithm is the delay associated with the wire connecting the two points on the integrated circuit surface represented by the two end-vertices of the graph edge. The Fast Path algorithm also takes into account the existence of physical obstacles, such as IP blocks, that may constrain routing choices. A label function p is defined such that p(v)=0 if v overlaps a physical obstacle and p(v)=1 otherwise, for all v∈V.
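As a rough illustration of this model, the following Python sketch builds a small grid graph together with the label function p. The function name build_grid_graph, the coordinate-tuple vertices, and the obstacle set are assumptions made for this example. Obstacles are recorded only in p and no edges are removed, on the assumption that p restricts where buffers may be placed rather than where wires may run.

```python
def build_grid_graph(width, height, obstacles=frozenset()):
    """Return (edges, p) for a width x height grid graph.

    edges maps each vertex (x, y) to its orthogonally adjacent vertices;
    p maps each vertex to 0 if it overlaps an obstacle and 1 otherwise.
    """
    vertices = [(x, y) for x in range(width) for y in range(height)]
    edges = {v: [] for v in vertices}
    for x, y in vertices:
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbor = (x + dx, y + dy)
            if neighbor in edges:          # stay inside the grid
                edges[(x, y)].append(neighbor)
    p = {v: 0 if v in obstacles else 1 for v in vertices}
    return edges, p


# Hypothetical 3 x 3 grid with an obstacle covering the center position.
grid_edges, grid_p = build_grid_graph(3, 3, obstacles={(1, 1)})
```

Each interior vertex has four neighbors and each corner has two, so the number of edges is at most 4n for n vertices.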
The delay associated with a particular edge is, in the Fast Path algorithm, calculated using the Elmore delay metric. The Elmore delay metric is described in R. Gupta, B. Tutuianu, and L. T. Pileggi, “The Elmore Delay as a Bound for RC Tree with Generalized Input Signals,” IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, vol. 16, no. 1, pp. 95-104 (January 1997), which is hereby incorporated by reference. The delay of a particular edge in a grid graph representing a circuit is affected by both the geometry of the physical conductor associated with that edge and whether any buffers have been placed at the circuit nodes represented by the edge's end vertices. The insertion of buffers along a route in an integrated circuit is one means of reducing the Elmore delay associated with that route, as described in L. P. P. P. van Ginneken, “Buffer Placement in Distributed RC-tree Networks for Minimal Elmore Delay,” Proc. Int. Symp. Circuits and Systems, 1990, pp. 865-868.
For each edge (u,v)∈E, let R(u,v) and C(u,v) denote the resistance and capacitance, respectively, of a wire connecting u to v. Let R(g), K(g), and C(g) respectively denote the resistance, intrinsic delay, and input capacitance of each buffer g∈B, where B is a library of non-inverting buffers. Then the Elmore delay of each possible wire or buffer may be calculated (e.g., using a resistance-capacitance (RC) π-model to represent the wires and a switch-level model to represent the gates).
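To make the delay calculation concrete, the sketch below evaluates the Elmore delay of a single wire and a single buffer. It assumes the standard RC π-model, in which half of the wire capacitance is lumped at each end of the wire resistance, and a switch-level gate model consisting of an output resistance plus an intrinsic delay. The function names and the numeric values are hypothetical, and the units are arbitrary.

```python
def wire_delay(r_wire, c_wire, c_load):
    """Elmore delay of a wire (pi-model) driving a downstream capacitance.

    The wire resistance charges the far-end half of the wire
    capacitance plus the load capacitance.
    """
    return r_wire * (c_wire / 2.0 + c_load)


def buffer_delay(r_buf, k_buf, c_load):
    """Delay of a buffer (switch-level model) driving a downstream capacitance."""
    return r_buf * c_load + k_buf


# Hypothetical values: a wire with R = 2, C = 4 driving a load of 1,
# and a buffer with R = 0.5, K = 3 driving a load of 5.
example_wire = wire_delay(2.0, 4.0, 1.0)      # 2 * (4/2 + 1) = 6.0
example_buffer = buffer_delay(0.5, 3.0, 5.0)  # 0.5 * 5 + 3 = 5.5
```

The capacitance seen looking into the wire is c_load + c_wire, whereas the capacitance seen looking into a buffer is only its input capacitance C(g). This is why inserting a buffer can reduce the delay of the stages upstream of it.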
A “path” from node s to t in the grid graph G is a sequence of nodes (s=v1, v2, . . . , vk=t) with an associated labeling m(s)=gs, m(t)=gt, and m(vi)∈B∪{0}. B is the set of buffers that may be inserted on a node in the path between s and t. gs is the driving circuit (logic gate) at s, gt is the sink circuit (logic gate) at t, and each internal node v may either have a buffer from the set B or no buffer at all, denoted by m(v)=0. A path is “feasible” if and only if p(v)=1 whenever m(v)∈B.
The main idea behind the Fast Path algorithm is to extend Dijkstra's shortest path algorithm to do a general labeling based on Elmore delays. The priority queue Q is used to store partial solutions to the problem as quadruples defined as follows. In priority queue Q, each quadruple α=(c,d,m,v) represents a partial solution to the routing problem at node v, where c is the current input capacitance seen at v, d is the delay from v to t, and m is a labeling function for the buffered path from v to t. The priority queue Q is used to extract the partial solution quadruple having the minimum delay (d).
An additional optimization is obtained by “pruning” priority queue Q to eliminate inferior partial solutions. The partial solution α1=(c1,d1,m1,v) is said to be inferior to α2=(c2,d2,m2,v) if c1≧c2 and d1≧d2. Any buffered path from s to v to t that uses the subpath represented by α1 to go from v to t is guaranteed to be no better than a path from s to t containing the same subpath from s to v, but using the subpath represented by α2 to go from v to t.
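The inferiority test and the pruning step can be sketched as follows. This is an illustrative fragment only: candidates at a single vertex are reduced to (c, d) pairs, and the names is_inferior and push_and_prune are assumptions introduced here.

```python
def is_inferior(a1, a2):
    """True if candidate a1 = (c1, d1) is inferior to a2 = (c2, d2)."""
    c1, d1 = a1
    c2, d2 = a2
    return c1 >= c2 and d1 >= d2


def push_and_prune(candidates, new):
    """Add `new` to the candidates for one vertex and drop inferior ones.

    Returns the updated list; `new` is discarded if an existing
    candidate is at least as good in both capacitance and delay.
    """
    if any(is_inferior(new, old) for old in candidates):
        return candidates
    kept = [old for old in candidates if not is_inferior(old, new)]
    kept.append(new)
    return kept


# Hypothetical sequence of candidates arriving at the same vertex.
cands = []
cands = push_and_prune(cands, (3.0, 10.0))
cands = push_and_prune(cands, (2.0, 12.0))  # lower c, higher d: both kept
step_two = list(cands)
cands = push_and_prune(cands, (2.0, 9.0))   # makes both earlier ones inferior
cands = push_and_prune(cands, (5.0, 9.5))   # inferior to (2.0, 9.0): discarded
```

Only candidates that trade capacitance against delay survive, so the list kept at each vertex never contains two candidates where one is at least as good as the other in both respects.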
Pseudocode for the Fast Path algorithm is provided in Table II below:
TABLE II
FASTPATH(G = (V,E), B, s, t, m′) {
1.  Q ← {(C(m′(t)), 0, m′, t)}
2.  while Q ≠ ∅ {
3.      (c, d, m, u) ← EXTRACT_MIN(Q)
4.      if c = 0 {
            return labeling m
        }
5.      if u = s {
            d′ ← d + R(m(s)) · c + K(m(s))
            push (0, d′, m, u) onto Q and prune
            continue
        }
6.      for each (u,v) ∈ E {
            c′ ← c + C(u,v)
            d′ ← d + R(u,v) · (c + C(u,v)/2)
            push (c′, d′, m, v) onto Q and prune
        }
7.      if p(u) = 1 and m(u) = 0 {
8.          for each b ∈ B {
                c′ ← C(b)
                d′ ← d + R(b) · c + K(b)
                m(u) ← b
                push (c′, d′, m, u) onto Q and prune
            }
        }
    }
}
The algorithm begins by initializing Q to hold a partial solution corresponding to the sink alone, having an initial labeling function m′ representing a graph that is devoid of buffers, with the exception of the source and sink circuits, which are already known (step 1). In each iteration (step 2), the partial solution having the minimum delay is extracted from Q (step 3). This partial solution is then extended either by adding an edge (step 6) or by adding a buffer from the library (steps 7 and 8). If the source is reached, the corresponding solution is pushed onto Q in step 5, and when that solution is eventually extracted from Q, it is returned as the optimum solution (step 4). With each addition to the queue, candidates for the current vertex are checked for inferiority and pruned accordingly. If it is assumed that G has n vertices, |E|≦4n (which is true for a grid graph), and |B|=k, the complexity of Fast Path is O(n²k² log nk).
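The procedure described above can be sketched in Python as shown below. This is an illustrative reading of the pseudocode, not the published implementation, and it makes several assumptions: wire delay follows the π-model R·(C/2 + c); a completed solution is marked with an explicit done flag rather than by c = 0; the labeling is carried as an immutable tuple so that partial solutions do not share state; pruned entries are skipped lazily when extracted; and the names fast_path, Buffer, and Gate, together with all numeric values, are hypothetical.

```python
import heapq
import itertools
from collections import namedtuple

Buffer = namedtuple("Buffer", "name R K C")  # resistance, intrinsic delay, input cap
Gate = namedtuple("Gate", "R K C")           # driver uses R and K; sink uses C


def fast_path(edges, wire_rc, p, buffers, s, t, driver, sink):
    """Return (delay, path from s to t, {vertex: buffer name}) or None."""
    tie = itertools.count()  # tie-breaker so the heap never compares paths
    frontier = {}            # (vertex, done) -> non-inferior (c, d) pairs
    heap = []

    def push(c, d, v, path, bufs, done):
        seen = frontier.setdefault((v, done), [])
        if any(c >= c2 and d >= d2 for c2, d2 in seen):
            return           # inferior to an existing candidate: prune
        seen[:] = [(c2, d2) for c2, d2 in seen if not (c2 >= c and d2 >= d)]
        seen.append((c, d))
        heapq.heappush(heap, (d, next(tie), c, v, path, bufs, done))

    push(sink.C, 0.0, t, (t,), (), False)                # step 1
    while heap:                                          # step 2
        d, _, c, u, path, bufs, done = heapq.heappop(heap)  # step 3
        if (c, d) not in frontier[(u, done)]:
            continue         # pruned after it was pushed
        if done:                                         # step 4
            return d, path, dict(bufs)
        if u == s:                                       # step 5: add the driver
            push(0.0, d + driver.R * c + driver.K, u, path, bufs, True)
            continue
        for v in edges[u]:                               # step 6: add a wire
            r, cw = wire_rc[(u, v)]
            push(c + cw, d + r * (cw / 2.0 + c), v, (v,) + path, bufs, False)
        if p[u] == 1 and u != t and u not in dict(bufs):  # step 7
            for b in buffers:                            # step 8: add a buffer
                push(b.C, d + b.R * c + b.K, u, path,
                     bufs + ((u, b.name),), False)
    return None


# Hypothetical example: five nodes in a row, unit wire R and C per segment.
nodes = range(5)
line_edges = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 4] for i in nodes}
line_rc = {(u, v): (1.0, 1.0) for u in nodes for v in line_edges[u]}
library = [Buffer("buf", 1.0, 1.0, 1.0)]
drv, snk = Gate(1.0, 0.0, 0.0), Gate(0.0, 0.0, 1.0)

free = fast_path(line_edges, line_rc, {i: 1 for i in nodes}, library, 0, 4, drv, snk)
blocked = fast_path(line_edges, line_rc, {0: 1, 1: 1, 2: 0, 3: 1, 4: 1},
                    library, 0, 4, drv, snk)
```

With these values the unbuffered route has a delay of 17.0. A single buffer at the middle node reduces it to 15.0, and when the middle node is covered by an obstacle the best achievable delay is 16.0.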
The Fast Path algorithm thus provides a simple solution to the routing problem for a path including wires and buffers from a buffer library. The Fast Path algorithm is somewhat limited in its application, however. In a large, high-speed integrated circuit, the overall delay associated with a wiring route may exceed the circuit's clock cycle. In such a case, synchronizing elements such as registers may need to be inserted in the path. The Fast Path algorithm is not adapted for use in the situation where one or more registers may need to be inserted in the path. In addition, some circuits, particularly those utilizing a combination of IP blocks, will require that signals be transmitted between differing clock domains. Special synchronization circuitry is needed in such instances, and the Fast Path algorithm is not adapted to design optimal routing paths under those circumstances, either. Thus, a need exists for an automated system for designing optimal routing paths over multiple clock cycles of delay and in multiple clock-domain circuits.