1. Field of the Invention
The present invention generally relates to the fabrication and design of semiconductor chips and integrated circuits, and more particularly to a method of designing the physical layout (placement) of latches in a net having a common clock domain.
2. Description of the Related Art
Integrated circuits are used for a wide variety of electronic applications, from simple devices such as wristwatches to the most complex computer systems. A microelectronic integrated circuit (IC) chip can generally be thought of as a collection of logic cells with electrical interconnections between the cells, formed on a semiconductor substrate (e.g., silicon). An IC may include a very large number of cells and require complicated connections between the cells. A cell is a group of one or more circuit elements such as transistors, capacitors, resistors, inductors, and other basic circuit elements grouped to perform a logic function. Cell types include, for example, core cells, scan cells and input/output (I/O) cells. Each of the cells of an IC may have one or more pins, each of which in turn may be connected to one or more other pins of the IC by wires. The wires connecting the pins of the IC are also formed on the surface of the chip. For more complex designs, there are typically at least four distinct layers of conducting media available for routing, such as a polysilicon layer and three metal layers (metal-1, metal-2, and metal-3). The polysilicon layer, metal-1, metal-2, and metal-3 are all used for vertical and/or horizontal routing.
An IC chip is fabricated by first conceiving the logical circuit description, and then converting that logical description into a physical description, or geometric layout. This process is usually carried out using a “netlist,” which is a record of all of the nets, or interconnections, between the cell pins. A layout typically consists of a set of planar geometric shapes in several layers. The layout is then checked to ensure that it meets all of the design requirements, particularly timing requirements. The result is a set of design files known as an intermediate form that describes the layout. The design files are then converted into pattern generator files that are used to produce patterns called masks by an optical or electron beam pattern generator. During fabrication, these masks are used to pattern a silicon wafer using a sequence of photolithographic steps. The process of converting the specifications of an electrical circuit into a layout is called the physical design.
Cell placement in semiconductor fabrication involves a determination of where particular cells should optimally (or near-optimally) be located on the surface of an integrated circuit device. Due to the large number of components and the details required by the fabrication process for very large scale integrated (VLSI) devices, physical design is not practical without the aid of computers. As a result, most phases of physical design extensively use computer-aided design (CAD) tools, and many phases have already been partially or fully automated. Automation of the physical design process has increased the level of integration, reduced turnaround time and enhanced chip performance. Several different programming languages have been created for electronic design automation (EDA) including Verilog, VHDL and TDML. A typical EDA system receives one or more high level behavioral descriptions of an IC device, and translates this high level design language description into netlists of various levels of abstraction.
Placement algorithms are typically based on a simulated annealing, top-down cut-based partitioning, or analytical paradigm (or some combination thereof). Recent years have seen the emergence of several new academic placement tools, especially in the top-down partitioning and analytical domains. The advent of multilevel partitioning as a fast and extremely effective algorithm for min-cut partitioning has helped spawn a new generation of top-down cut-based placers. A placer in this class partitions the cells into either two (bisection) or four (quadrisection) regions of the chip, then recursively partitions each region until a global (coarse) placement is achieved. Analytical placers may allow cells to temporarily overlap in a design. Legalization is achieved by removing overlaps either via partitioning or by introducing additional forces and/or constraints to generate a new optimization problem. The classic analytical placers, PROUD and GORDIAN, both iteratively use bipartitioning techniques to remove overlaps. Eisenmann's force-based placer uses additional forces besides the well-known wire-length-dependent forces to reduce cell overlaps and to consider the placement area. Analytical placers optimally solve a relaxed placement formulation, such as minimizing total quadratic wire length. Quadratic placers generally use various numerical optimization techniques to solve a linear system. Two popular techniques are known as conjugate gradient (CG) and successive over-relaxation (SOR). The PROUD placer uses the SOR technique, while the GORDIAN placer employs the CG algorithm.
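By way of illustration only, the relaxed quadratic formulation mentioned above can be sketched in one dimension. Minimizing the weighted sum of squared distances between connected cells (with some cells tied to fixed pads) reduces to a linear system A x = b, which the sketch below solves with a plain SOR iteration. The function names `build_system` and `solve_sor`, the two-pin net model, and the relaxation factor are assumptions made for this example. They are not taken from PROUD, GORDIAN, or any other placer, and every movable cell is assumed to have at least one connection.

```python
def build_system(num_cells, cell_nets, pad_nets):
    """Assemble A x = b for 1-D quadratic wire length minimization.

    cell_nets: (i, j, weight) two-pin nets between movable cells i and j.
    pad_nets:  (i, pad_position, weight) nets from cell i to a fixed pad.
    Both formats are hypothetical and exist only for this sketch.
    """
    A = [[0.0] * num_cells for _ in range(num_cells)]
    b = [0.0] * num_cells
    for i, j, w in cell_nets:
        A[i][i] += w
        A[j][j] += w
        A[i][j] -= w
        A[j][i] -= w
    for i, pos, w in pad_nets:
        A[i][i] += w
        b[i] += w * pos
    return A, b


def solve_sor(A, b, omega=1.2, tol=1e-10, max_iters=10000):
    """Solve A x = b by successive over-relaxation (requires 0 < omega < 2)."""
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iters):
        max_change = 0.0
        for i in range(n):
            # Gauss-Seidel estimate for x[i] using the latest neighbor values.
            sigma = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel = (b[i] - sigma) / A[i][i]
            # Blend the old value and the estimate with the relaxation factor.
            new_value = (1.0 - omega) * x[i] + omega * gauss_seidel
            max_change = max(max_change, abs(new_value - x[i]))
            x[i] = new_value
        if max_change < tol:
            break
    return x


# Chain: pad at 0.0, cell 0, cell 1, pad at 3.0, all net weights equal to 1.
A, b = build_system(2, [(0, 1, 1.0)], [(0, 0.0, 1.0), (1, 3.0, 1.0)])
positions = solve_sor(A, b)
```

For this small chain the system is A = [[2, -1], [-1, 2]] and b = [0, 3], so the two cells settle at evenly spaced positions between the pads. Note that nothing in this formulation prevents cells from overlapping, which is why a separate legalization step is needed.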
While these techniques provide adequate placement of cells with regard to their data interconnections, the designer faces an additional challenge in constructing a clock network for the cells, and this challenge is becoming more difficult with the latest technologies such as low-power, 65-nanometer integrated circuits. FIG. 1 illustrates a typical clock tree for a net of clock sinks such as latches 2 in a common clock domain that have been placed using conventional techniques. The clocking source 4 (e.g., an oscillator signal or a gating signal used to gate a clock) is located in a centralized area of the latches and branches out to multiple buffers 6, which are further connected to other buffers 8 or clusters of latches 2. Placement algorithms have a tendency to spread out the latches, creating a relatively large clock domain size. A larger clock domain is undesirable because it can result in increased power consumption and lead to problems caused by variations in the delays of the paths from source 4 to the various clock sinks. A smaller clock domain size has additional advantages, such as a smaller number of clock buffers and shorter clock wires, leading to smaller clock tree latency and less clock skew.
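The notion of clock domain size can be made concrete with a simple proxy. The sketch below assumes, purely for illustration, that the size of a domain is measured as the half-perimeter of the bounding box enclosing its latches. This metric, the function names, and the coordinates are assumptions of the example and are not defined by the text above.

```python
def domain_bounding_box(latch_positions):
    """Return (x_min, y_min, x_max, y_max) enclosing all latch positions."""
    xs = [x for x, _ in latch_positions]
    ys = [y for _, y in latch_positions]
    return min(xs), min(ys), max(xs), max(ys)


def domain_size(latch_positions):
    """Half-perimeter of the bounding box, used here as a size proxy."""
    x_min, y_min, x_max, y_max = domain_bounding_box(latch_positions)
    return (x_max - x_min) + (y_max - y_min)


# Hypothetical latch coordinates for one clock domain, in arbitrary units.
spread_out = [(0.0, 0.0), (4.0, 1.0), (2.0, 3.0)]
clustered = [(1.0, 1.0), (2.0, 1.0), (1.5, 2.0)]

spread_size = domain_size(spread_out)    # width 4 + height 3
cluster_size = domain_size(clustered)    # width 1 + height 1
```

Under this proxy the spread-out placement has a larger domain than the clustered one, which corresponds to the longer clock wires and additional buffers described above.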
The traditional approach to clock tree construction relies on movebounds to simply constrain the placement of domains. This approach, however, is cumbersome and cannot produce optimal results due to overly restrictive constraints. It can also be difficult to predict where to place the domains. An alternative approach creates an artificial net connecting all latches in the same domain, but it is difficult to control the degree of attraction imposed by the artificial net, which leads to poor wirelength, congestion and timing. A third approach is to interleave clock tree construction with placement as taught in U.S. Pat. No. 6,536,024. While this method provides some optimization of clock power, it is hard to properly represent clock tree structures in placement engines, which leads to an undue amount of runtime overhead.
In light of the foregoing, it would be desirable to devise an improved placement method which could take clock tree construction into consideration to reduce the clock domain size without requiring excess runtime. It would be further advantageous if the method could optimize signal paths for timing closure without imposing severe design constraints.