1. Field of the Invention
The present invention relates to devices or systems in which the interconnect (wiring, communication) structure is defined after fabrication and interconnect patterns persist for long times, such as Field-Programmable Gate Arrays (FPGAs) and coarse-grained reconfigurable devices. Specifically, the present invention addresses how to accelerate the discovery of high-quality routes on these devices or systems by disclosing a router and a hardware-assisted fast routing method.
2. Description of the Prior Art
FIGS. 1 to 3 schematically describe a programmable network which has been generally discussed in C. E. Leiserson, Fat Trees: Universal Networks for Hardware Efficient Supercomputing, IEEE Transactions on Computers, C-34(10):892-901, October 1985 and further expanded in William Tsu, Kip Macy, Atul Joshi, Randy Huang, Norman Walker, Tony Tung, Omid Rowhani, Varghese George, John Wawrzynek, and André DeHon, “HSRA: High-Speed, Hierarchical Synchronous Reconfigurable Array,” in Proceedings of the International Symposium on Field Programmable Gate Arrays, February 1999, pp. 125-134.
With reference to FIG. 1, the programmable network comprises an array of interconnects having a first set of switches or switchpoints 1 and a second set of switches or switchpoints 2. The switches of the first set are usually T-switches, and the switches of the second set are usually π-switches, as later described in FIGS. 2 and 3. The HSRA also comprises a third set of switches 3, the connection-box switches, and a plurality of network endpoints 4.
The network endpoints 4 can be, for example, lookup tables (LUTs) or processors. In the example shown in the Figure, each endpoint is connected to seven connection-box switches 3. A connection-box switch is a matrix of switching transistors (like the ones shown in the following FIGS. 2 and 3) each of which is able to connect, based on the status of a control bit, a vertical connection (e.g. a connection into the hierarchical network) with a horizontal connection (a connection into the endpoint). The connection-box matrix may be partially populated (as shown). See, for example, André DeHon, Entropy, Counting and Programmable Interconnect, FPGA '96, ACM-SIGDA Fourth International Symposium on FPGAs, Feb. 11-13, 1996, Monterey Calif., FIG. 2.
A first feature of the HSRA network is that the number of switches in each hierarchical switchbox is linear in the number of wires in the switchbox and the total number of switches 1, 2, 3 in the network is linear in the number of endpoints 4. See, for example, hierarchical switchboxes 5, 6, and 7 of FIG. 1.
A further feature of the HSRA network is that there is a unique set of switchboxes between any source endpoint and sink endpoint of the network, so that global routing (identification of a set of switchboxes from a source to a sink) is trivial. However, detail routing, i.e. identification of the precise set of switches from source to sink, is not trivial.
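The global routing step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, a binary tree with 2**levels endpoints numbered from 0, in which a switchbox is identified by a hypothetical (level, index) pair with index = endpoint >> level. The function name and the numbering scheme are not taken from the cited references. The sketch returns only the switchbox sequence; it does not select individual switches, which is the detail routing problem.

```python
def switchbox_path(src, dst, levels):
    """Return the unique switchbox sequence between two endpoints.

    Assumed model (not from the source): a binary tree with 2**levels
    endpoints; the switchbox above endpoint e at a given level is
    (level, e >> level), with level 1 directly above the endpoints.
    """
    size = 1 << levels
    if not (0 <= src < size and 0 <= dst < size):
        raise ValueError("endpoint out of range")
    if src == dst:
        return []

    up, down = [], []
    level = 1
    # Climb from both endpoints until they share the same switchbox.
    while (src >> level) != (dst >> level):
        up.append((level, src >> level))
        down.append((level, dst >> level))
        level += 1

    apex = (level, src >> level)  # lowest common switchbox
    # Go up from the source, cross at the apex, then descend to the sink.
    return up + [apex] + down[::-1]


if __name__ == "__main__":
    print(switchbox_path(0, 7, levels=3))
```

Under these assumptions the path is found by comparing shifted endpoint numbers alone, with no search, which is why global routing is trivial on such a network.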
FIG. 2 shows an embodiment of a switch identified by the numeral 1 in FIG. 1. The switch 1 of FIG. 2 is called a T-switch (three-side switch) and comprises switching transistors 21, 22, and 23 having respective configuration bit control inputs 24, 25, and 26. The switching transistors allow a connection to be made between any two of the three sides A, B, and C, where side A is usually called the “parent” and sides B and C are usually called the “children”, or between all three sides. For example, if bit control inputs 24 and 26 are set to 1, and input 25 is set to 0, the parent A is connected to the child B and the child B is connected to the child C, but the parent A is not directly connected with the child C.
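The T-switch example above can be modeled with a short sketch. The mapping of control inputs to side pairs is partly an assumption: the example in the text implies that input 25 controls the A-C connection, while the assignment of input 24 to A-B and input 26 to B-C is chosen here for illustration only. The function names are hypothetical.

```python
# Assumed mapping of configuration bits to the pair of sides each joins.
# Only 25 -> (A, C) follows from the example in the text; the roles of
# 24 and 26 could be swapped in the actual figure.
T_SWITCH_PAIRS = {24: ("A", "B"), 25: ("A", "C"), 26: ("B", "C")}


def direct_connections(bits):
    """Return the side pairs joined by a transistor whose bit is set."""
    return {frozenset(T_SWITCH_PAIRS[b]) for b, v in bits.items() if v}


def connected_groups(bits):
    """Group sides that are joined directly or through another side."""
    groups = {side: {side} for side in "ABC"}
    for pair in direct_connections(bits):
        x, y = tuple(pair)
        merged = groups[x] | groups[y]
        for side in merged:
            groups[side] = merged
    return {frozenset(g) for g in groups.values()}


if __name__ == "__main__":
    example = {24: 1, 25: 0, 26: 1}
    print(direct_connections(example))
    print(connected_groups(example))
```

With inputs 24 and 26 set and input 25 cleared, the sketch reports direct connections A-B and B-C only, while all three sides fall into one connected group because A reaches C through B.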
FIG. 3 shows an embodiment of a switch identified by the numeral 2 in FIG. 1. The switch 2 of FIG. 3 is called a π-switch (four-side switch) and comprises switching transistors 31, 32, 33, 34, and 35 having respective configuration bit control inputs 36, 37, 38, 39, and 40. A π-switch allows, for example, side F or side G to be connected to sides D and/or E, according to the status of the control inputs.
Switches with more than two children links and/or more than two parent links are also known. See, for example, Andre DeHon, Rent's Rule Based Switching Requirements, System-Level Interconnect Prediction, SLIP 2001, Mar. 31-Apr. 1, 2001, pp. 197-204.
The current dominant approach to HSRA detail routing is a software approach based on a routine known as PathFinder. See, for example, Larry McMurchie and Carl Ebeling, “PathFinder: A Negotiation-Based Performance-Driven Router for FPGAs,” in Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, ACM, February 1995, pp. 111-117. Further approaches provide for the presence of multiple processors, where parallel software implementation of PathFinder is provided. See Pak K. Chan and Martine D. F. Schlag, “Acceleration of an FPGA Router,” in Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines, IEEE, April 1997, pp. 175-181, and Pak K. Chan and Martine D. F. Schlag, “New parallelization and convergence results for nc: A negotiation-based FPGA router,” in Proceedings of the 2000 International Symposium on Field-Programmable Gate Arrays (FPGA '00), ACM/SIGDA, February 2000, pp. 165-174.
With reference to more traditional, mesh-based FPGA routing networks, several attempts have been made to improve the performance of software-based FPGA routers, as shown in J. S. Swartz, V. Betz, and J. Rose, A Fast Routability-Driven Router for FPGAs, Proceedings of the 1998 International Symposium on Field-Programmable Gate Arrays (FPGA '98), pp. 140-149, ACM/SIGDA, February 1998 and in R. Tessier, Negotiated A* Routing for FPGAs, Proceedings of the 5th Canadian Workshop on Field Programmable Devices, June 1998.
A major problem with these entirely software-based approaches is that they usually require billions of software cycles, which is too slow to make runtime routing viable in circumstances where, for example, (1) the specific computing task is not known or defined until runtime, (2) the task may be used for only a few million cycles, or (3) the task must be operational in seconds (or less) instead of minutes or hours.
Hardware-based approaches are also known, as disclosed, for example, in A. Iosupovici, A Class of Array Architectures for Hardware Grid Routers, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 5(2):245-255, April 1986 and in T. Ryan and E. Rogers, An ISMA Lee Router Accelerator, IEEE Design and Test of Computers, pp. 38-45, October 1987.
Therefore, there is a need for a method and a device that make runtime routing more viable than is currently possible, by substantially reducing the time needed to find a high-quality set of routes.