1. Field of Invention
The present disclosure is generally directed to monolithic integrated circuits, and more specifically to a repeated, product-term processor and macrocell module design for use within Programmable Logic Devices (PLD's). It is even more specifically directed to a product-term processor and macrocell module design as applied to a subclass of PLD's known as High-Density Complex Programmable Logic Devices (HCPLD's).
2a. Cross Reference to Related Applications
The following U.S. patent application is owned by the owner of the present application, and its disclosure is incorporated herein by reference:
(A) Ser. No. 09/721,153 filed Nov. 22, 2000 by Om P. Agrawal et al. and originally entitled, "SCALABLE ARCHITECTURE FOR HIGH DENSITY CPLD's HAVING TWO-LEVEL HIERARCHY OF ROUTING RESOURCES".
2b. Cross Reference to Related Patents
The disclosures of the following U.S. patents are incorporated herein by reference:
(A) U.S. Pat. No. 6,184,713 B1 issued Feb. 6, 2001 to Om P. Agrawal et al, and entitled, "SCALABLE ARCHITECTURE FOR HIGH DENSITY CPLD's HAVING TWO-LEVEL HIERARCHY OF ROUTING RESOURCES";
(B) U.S. Pat. No. 6,150,841 issued Nov. 21, 2000 to Om P. Agrawal et al, and entitled, ENHANCED MACROCELL MODULE FOR HIGH DENSITY CPLD ARCHITECTURES;
(C) U.S. Pat. No. 5,811,986 issued Sep. 22, 1998 to Om Agrawal et al, and entitled, FLEXIBLE SYNCHRONOUS/ASYNCHRONOUS CELL STRUCTURE FOR HIGH DENSITY PROGRAMMABLE LOGIC DEVICE;
(D) U.S. Pat. No. 5,764,078 issued Jun. 9, 1998 to Om Agrawal et al, and entitled, FAMILY OF MULTIPLE SEGMENTED PROGRAMMABLE LOGIC BLOCKS INTERCONNECTED BY A HIGH SPEED CENTRALIZED SWITCH MATRIX;
(E) U.S. Pat. No. 5,818,254 issued Oct. 6, 1998 to Om Agrawal et al, and entitled, MULTI-TIERED HIERARCHICAL HIGH SPEED SWITCH MATRIX STRUCTURE FOR VERY HIGH DENSITY COMPLEX PROGRAMMABLE LOGIC DEVICES;
(F) U.S. Pat. No. 5,789,939 issued Aug. 4, 1998 to Om Agrawal et al, and entitled, METHOD FOR PROVIDING A PLURALITY OF HIERARCHICAL SIGNAL PATHS IN A VERY HIGH DENSITY PROGRAMMABLE LOGIC DEVICE;
(G) U.S. Pat. No. 5,621,650 issued Apr. 15, 1997 to Om Agrawal et al, and entitled, PROGRAMMABLE LOGIC DEVICE WITH INTERNAL TIME-CONSTANT MULTIPLEXING OF SIGNALS FROM EXTERNAL INTERCONNECT BUSES; and
(H) U.S. Pat. No. 5,185,706 issued Feb. 9, 1993 to Om Agrawal et al.
2c. Reservation of Extra-patent Rights and Resolution of Conflicts
After this disclosure is lawfully published, the owner of the present patent application has no objection to the reproduction by others of textual and graphic materials contained herein provided such reproduction is for the limited purpose of understanding the present disclosure of invention and of thereby promoting the useful arts and sciences. The owner does not however disclaim any other rights that may be lawfully associated with the disclosed materials, including but not limited to, copyrights in any computer program listings or art works or other works provided herein, and to trademark or trade dress rights that may be associated with coined terms or art works provided herein and to other otherwise-protectable subject matter included herein or otherwise derivable herefrom.
If any disclosures are incorporated herein by reference and such incorporated disclosures conflict in part or whole with the present disclosure, then to the extent of conflict, the present disclosure controls. If such incorporated disclosures conflict in part or whole with one another, then to the extent of conflict, the later-dated disclosure controls.
3. Description of Related Art
Field-Programmable Logic Devices (FPLD's) have continuously evolved to better serve the unique needs of different end-users. From the time of introduction of simple PLD's such as the Advanced Micro Devices 22V10™ Programmable Array Logic device (PAL), the art has branched out in several different directions.
One evolutionary branch of FPLD's has grown along a paradigm known as Field Programmable Gate Arrays or FPGA's. Examples of such devices include the XC2000™ and XC3000™ families of FPGA devices introduced by Xilinx, Inc. of San Jose, Calif. The architectures of these devices are exemplified in U.S. Pat. Nos. 4,642,487; 4,706,216; 4,713,557; and 4,758,985; each of which is originally assigned to Xilinx, Inc.
An FPGA may be generally characterized as a monolithic, integrated circuit that has an array of user-programmable, lookup tables (LUT's) that can each implement any Boolean function to the extent allowed by the address space of the LUT. User-programmable interconnect is typically provided for interconnecting primitive, LUT-implemented functions and for thereby defining more complex functions.
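By way of non-limiting illustration, the LUT-based function implementation described above may be modeled in software as a truth table addressed by the input bits. The following is a minimal sketch only; the 4-input size and the names make_lut and lut_eval are assumptions chosen for illustration and are not taken from any of the cited devices.

```python
from itertools import product

def make_lut(func, k):
    """Build a 2**k-entry truth table for a Boolean function of k inputs.

    Entries are ordered so that the input bits, read MSB first,
    form the table address.
    """
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=k)]

def lut_eval(table, inputs):
    """Return the stored output for the given input bits (MSB first)."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | (bit & 1)
    return table[addr]

# Hypothetical example: 4-input parity (XOR), stored as 16 table entries.
parity = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d, 4)
```

Any function of k inputs fits in the 2**k entries of such a table, which is the sense in which a LUT is limited only by its address space. The same 4-input parity function, by contrast, requires eight product terms when expressed in sum-of-products form.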
Because LUT-based function implementation tends to be functionally more exhaustive (broader) but speed-wise slower than gate-based (e.g., AND/OR-based) function implementation, FPGA's are generally recognized in the art as having a relatively more expansive capability of implementing a wide variety of functions (broad functionality) but at relatively slower speed. Also, because length of signal routings through the programmable interconnect of an FPGA can vary significantly, FPGA's are generally recognized as providing relatively inconsistent signal delays whose values can vary substantially depending on how partitioning, placement and routing software configures the FPGA.
A second evolutionary chain in the art has branched out along a paradigm known as Complex PLD's or CPLD's. This paradigm is characterized by devices such as the Lattice Semiconductor ispMACH™ family. Examples of CPLD circuitry are seen in U.S. Pat. No. 5,015,884 (issued May 14, 1991 to Om P. Agrawal et al.) and U.S. Pat. No. 5,151,623 (issued Sep. 29, 1992 to Om P. Agrawal et al.) as well as in other CPLD patents cited above, including U.S. Pat. No. 5,811,986.
A CPLD device can be generally characterized as a monolithic, integrated circuit (IC) that has four major features as follows.
(1) A user-accessible, configuration-defining memory means, such as EPROM, EEPROM, anti-fused, fused, SRAM, or other, is provided in the CPLD device so as to be at least once-programmable (if not reprogrammable) by device users for defining user-provided configuration instructions. Static Random Access Memory (SRAM) is, of course, a form of reprogrammable memory that can be differently programmed many times. Electrically Erasable and reprogrammable ROM (EEPROM) is an example of nonvolatile reprogrammable memory. The configuration-defining memory of a CPLD device can be formed of a mixture of different kinds of memory elements if desired (e.g., SRAM and EEPROM). Typically it is of the nonvolatile, In-System reProgrammable (ISP) kind such as EEPROM.
(2) Input/Output means (IO's) are provided for interconnecting internal circuit components of the CPLD device with external circuitry. The IO's may have fixed configurations or they may include configurable features such as variable slew-output drivers whose characteristics may be fine tuned in accordance with user-provided configuration instructions stored in the configuration-defining memory means.
(3) Programmable Logic Blocks (PLB's) are provided for carrying out user-programmed logic functions as defined by user-provided configuration instructions stored in the configuration-defining memory means. Typically, each of the many PLB's of a CPLD has at least a Boolean sum-of-products generating circuit (e.g., an AND/OR array) or a Boolean product-of-sums generating circuit (e.g., an OR/AND array) that is user-configurable to define a desired Boolean function, to the extent allowed by the number of product terms (PT's) or sum terms (ST's) that are acquirable and combinable by that circuit.
Each PLB may have other resources such as input signal pre-processing resources and output signal post-processing resources. The output signal post-processing resources may include result storing and/or timing adjustment resources such as clock-synchronized registers. Although the term 'PLB' was adopted by early pioneers of CPLD technology, it is not uncommon to see other names being given to the repeated portion of the CPLD that carries out user-programmed logic functions and timing adjustments to the resultant function signals.
(4) An interconnect network is generally provided for carrying signal traffic within the CPLD between various PLB's and/or between various IO's and/or between various IO's and PLB's. At least part of the interconnect network is typically configurable so as to allow for programmably-defined routing of signals between various PLB's and/or IO's in accordance with user-defined routing instructions stored in the configuration-defining memory means.
In contrast to LUT-based FPGA's, gate-based CPLD's are generally recognized by workers in the art as having a relatively less-expansive capability of implementing a wide variety of functions (in other words, they may not be able to implement all Boolean functions for a given input space), but as being able to implement the functions they do support at relatively higher speeds. Wide functionality is sacrificed to obtain shorter, pin-to-pin signal delays. Also, because length of signal routings through the programmable interconnect of a CPLD is often arranged so it will not vary significantly despite different signal routings, CPLD's are generally recognized as being able to provide relatively consistent signal delays whose values do not vary substantially based on how partitioning, placement and routing software configures the CPLD. Many devices in the Lattice/Vantis ispMACH™ family provide such a consistent signal delay characteristic under the Lattice trade name of SpeedLocking™. The more generic term, Speed-Consistency, will be used interchangeably herein with the term, SpeedLocking™.
A newly evolving sub-branch of the growing families of CPLD devices is known as High-Density Complex Programmable Logic Devices (HCPLD's). This sub-branch may be generally characterized as monolithic IC's that have large numbers of I/O terminals (e.g., Input/Output pins) in the range of about 50 or more (e.g., 64, 96, 128, 192, 256, 320, etc.) and/or have large numbers of result-storing macrocells in the range of about 200 or more (e.g., 256, 320, 512, 1024, etc.). The process of concentrating large numbers of I/O pins and/or large numbers of macrocells into a single CPLD device raises new challenges for achieving relatively broad functionality, high speed, and Speed-Consistency (SpeedLocking™) in the face of wide varieties of configuration software.
The various operations performed by CPLD configuring software are discussed in more detail in the above-cited U.S. application Ser. No. 09/721,153. That discussion is not repeated here, except to briefly note the following.
Configuration software can produce different results, good or bad, depending in part on what broadness of functionalities, what timing flexibilities, and what routing flexibilities are provided by the architecture of a target CPLD. The present disclosure focuses on the broadness of functionalities and timing flexibilities that are provided by repeated structures referred to herein as product-term processors and macrocell modules.
When confronted with a given design problem, CPLD-configuring software typically cycles through a series of phases, referred to commonly as 'partitioning', 'placement', and 'routing'. Differently designed CPLD's can have differently designed PLB's with respectively different, logic-implementing capabilities, and/or timing capabilities. Partitioning software may have to comply with certain, fixed floor-plan constraints placed on where certain functionalities are to be implemented, for example, next to a particular pin and/or pad whose location and use are pre-specified. Partitioning software has to account for the maximum size and speed of circuitry that each PLB is able to implement within the specific CPLD device that has been designated to implement the original and whole circuit design.
By way of example, each PLB of a given, first CPLD architecture may be able to generate in one pass (where the one pass does not include the use of a feedback loop) a sum-of-products (SoP) function signal of the expressive form:

$$ f^{N}_{\mathrm{SoP.1}} = \sum^{N} \left( PT_i^{\,K_i / K_{\max} / L} \right) \qquad \{\text{Exp. A}\} $$
In this sum-of-products expression (Exp. A), the capital N factor represents a maximum number of product terms (PT's) that can be generated within, and thereafter summed by, a respective PLB for defining the one sum-of-products function signal, $f^{N}_{\mathrm{SoP.1}}$. (A PLB may be able to output more than one fSoP signal of course, each with its own N value and its own Ki value.) The Kmax factor represents, in the same Exp. A, a maximum number of independent, PLB input signals that can be acquired from a set of L available lines extending beside the PLB. Ki is the number of actual signals that are used as a subset of Kmax for defining a corresponding, i-th product term, PTi. The acquired subset of Ki signals are ANDed together in the respective PLB to define each respective, i-th product term (PTi). If Ki=0, then PTi=0 and that PTi does not contribute to the Boolean sum.
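The relationship among N, Kmax, Ki and PTi in Exp. A may be illustrated by the following minimal sketch. It assumes a hypothetical model in which each product term is a list of (signal index, inverted) literals drawn from the acquired PLB inputs; the availability of complemented inputs, the data layout, and the function names are assumptions made for illustration and are not taken from the present disclosure. Consistent with the text above, an empty product term (Ki=0) is treated as 0.

```python
def eval_product_term(literals, inputs):
    """AND together the Ki literals of one product term PTi.

    literals: list of (index, inverted) pairs (assumed representation).
    An empty list (Ki = 0) yields 0, so the term does not contribute.
    """
    if not literals:
        return 0
    return int(all(inputs[i] ^ inv for i, inv in literals))

def eval_sop(product_terms, inputs, n_max, k_max):
    """OR together at most n_max product terms over at most k_max inputs."""
    if len(product_terms) > n_max:
        raise ValueError("more product terms than N allows")
    if len(inputs) > k_max:
        raise ValueError("more input signals than Kmax allows")
    return int(any(eval_product_term(pt, inputs) for pt in product_terms))

# Hypothetical example: f = (a AND b) OR (NOT c), with inputs ordered (a, b, c).
terms = [[(0, 0), (1, 0)], [(2, 1)]]
result = eval_sop(terms, (1, 1, 1), n_max=5, k_max=8)
```

The limits n_max=5 and k_max=8 in the example are arbitrary placeholders rather than parameters of any particular device.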
In order to fit partitioning results inside the maximal fSoP capabilities of each PLB, the partitioning part of CPLD configuring software has to cast its primitive sums-of-products such that they are each equal to or less than the N-defined and Kmax-defined limits of the fSoP results that can be produced by respective PLB's of the targeted CPLD. If the architecture of the targeted CPLD is such that each of the above-described factors, N, Kmax and L (Exp. A) is relatively large, then the maximal fSoP results per PLB will tend to be relatively large and the design partitioning phase will be advantageously allowed to work with larger-sized, partition chunks. Fewer inter-PLB routing resources will be needed, which will make the job of the post-partitioning router easier. It will also tend to minimize the signal propagating delay through the CPLD because intra-PLB delays (due to routing within the PLB) tend to be smaller than inter-PLB delays (due to routing outside and between plural PLB's).
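The fit test that partitioning software applies against the N-defined and Kmax-defined limits may be sketched as follows. This is an illustrative simplification only: it assumes each candidate chunk is a single sum-of-products given as a list of sets of input-signal names, it ignores the L available lines and all timing constraints, and the function name fits_in_plb is hypothetical.

```python
def fits_in_plb(product_terms, n_max, k_max):
    """Return True if one SoP chunk fits the assumed per-PLB limits.

    product_terms: list of sets of input-signal names, one set per PT.
    n_max: assumed maximum number of summable product terms (N).
    k_max: assumed maximum number of distinct PLB input signals (Kmax).
    """
    distinct_inputs = set().union(*product_terms)
    return len(product_terms) <= n_max and len(distinct_inputs) <= k_max

# Hypothetical chunk: three product terms drawing on four distinct signals.
chunk = [{"a", "b"}, {"c"}, {"a", "d"}]
```

With this chunk, a PLB modeled with n_max=5 and k_max=4 accepts the chunk, whereas one modeled with k_max=3 rejects it, so the partitioner would have to split the chunk across PLB's and consume inter-PLB routing.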
Designing a CPLD with the ability to only provide maximal fSoP results per PLB is not a good idea however. Silicon resources may be wasted and speed may be sacrificed if the to-be-partitioned, original design calls mostly for small chunks rather than PLB-consuming large chunks. So a judicious balance has to be struck among: (1) being able to make large the number, N, of summable product terms per sum-of-products function signal, fSoP, output by each programmable logic block (PT's/fSoP/PLB); (2) minimizing the die-space costs of implementing such a result; and (3) minimizing the signal-propagation delay created by such an implementation. This is not an easy task.
Besides being able to comply with pre-specified speed criteria, and pre-specified complexity-of-function specifications, users of CPLD's also usually want a certain degree of re-design agility (flexibility). Even after an initial design is successfully implemented by a CPLD, users may wish to make slight tweaks or other changes to their original design. The re-design agility of a given CPLD architecture may include the ability to re-design certain internal circuits without changing I/O timings. Re-design agility may also include the ability to re-design certain internal circuits without changing the placement of various I/O terminals (e.g., pins). Such re-design agilities are sometimes referred to respectively as re-design Speed-Locking™ and Pin-Retention (the former term is a trademark of Lattice Semiconductor Corp., headquartered in Hillsboro, Oreg.). The more generic terms of: 're-design Speed-Consistency' and 're-design PinOut-Consistency' will be respectively used herein interchangeably with 're-design Speed-Locking™' and 're-design Pin-Retention'.
In addition to speed, re-design agility, and full Boolean correctness, users of CPLD's typically ask for optimal emulation of an original design or a re-design in terms of good function packing density, low cost, low power usage, and so forth.
Some previous CPLD architectures meshed well with specific bus sizes of specific design problems. However, preferences tend to change over time. Industry standards may, at first, favor designs where address and data words have a size in the range of 8 to 16 bits. Industry standards may later migrate towards larger-sized organizations of signals such as address and data words having sizes in the range of 32 to 64 bits each.
A CPLD that has an architecture optimized for bus-oriented word sizes of 8 to 16 bits may not be able to efficiently accommodate designs where word sizes, and particularly, control word sizes, increase into a range of, say, 32 to 64 bits. What is needed is an architecture that can efficiently accommodate dense design problems having word sizes in the range of 32 to 64 bits or more without losing speed and re-design agility. At the same time, if word sizes drop to a lower range for some supplied design problems, and workable solutions can be arrived at with use of relatively simpler circuit chunks, the flexible CPLD architecture should be able to make efficient use of resources that might otherwise go unused because of the drop to the smaller word sizes and/or to simpler partition chunks.
An improved CPLD device in accordance with the present disclosure of invention includes a plurality of flexible, or variable-grain, product-term processors which each operate on a respective 'cluster' of at least 4 or 5 product term inputs (PTi's). The PT signals of these clusters can be summed locally in one step to provide a first, cluster-based, sum-of-products signal, $f^{N<6}_{\mathrm{SoP.1}}$, whose production delay may be relatively small, but whose functional-complexity (e.g., N<6) is also relatively small. In accordance with the disclosure, expansion means are provided for producing in each product-term processor, a second, cluster-based, sum-of-products signal, $f^{N>5}_{\mathrm{SoP.2}}$, whose production delay is somewhat larger than that of the first $f^{N<6}_{\mathrm{SoP.1}}$ signal, but still fairly small, while its functional-complexity (e.g., N>5) can be made relatively larger.
Outputs of respective ones of the expansion means are cross-laced in a cascading manner into inputs of other expansion means at an interval (e.g., J+7) that fairly minimizes or avoids overlap of function-producing capabilities while allowing for continuous incremental build up of functional-complexity (e.g., N=10, 15, 20, 25, etc.) as longer sequences of the cross-lacing option are used. The outputs of the expansion means are further fed to a sums sharing array whose internal structure correlates with the lacing interval chosen for the cross-lacing of the outputs and inputs of the plural expansion means. These and other aspects of the disclosure will become clearer from the detailed description below.
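The incremental build up of functional-complexity through cascading may be illustrated, in a purely behavioral way, by the following sketch. It assumes clusters of exactly 5 product terms and represents a cascade simply as a list of cluster indices; it does not model the lacing interval, the sums sharing array, or any signal delay, and all function names are hypothetical.

```python
def cluster_sum(pts):
    """First-level sum: OR of the product-term values in one cluster."""
    return int(any(pts))

def cascaded_sum(clusters, chain):
    """Second-level sum: OR of the local sums of the clusters named in chain."""
    return int(any(cluster_sum(clusters[j]) for j in chain))

def max_terms(cluster_size, chain_length):
    """Largest N reachable by cascading chain_length clusters."""
    return cluster_size * chain_length

# Hypothetical product-term values for three clusters of five PT's each.
clusters = [[0, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 0, 0, 0]]
growth = [max_terms(5, m) for m in range(2, 6)]
```

Under these assumptions, cascading two through five clusters gives N values of 10, 15, 20 and 25, matching the example progression given above.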