The present invention relates, in general, to the field of systems and methods for reconfigurable, or adaptive, data processing. More particularly, the present invention relates to an extremely compact reconfigurable processor module comprising hybrid stacked integrated circuit (xe2x80x9cICxe2x80x9d) die elements.
In addition to current commodity IC microprocessors, another type of processing element is commonly referred to as a reconfigurable, or adaptive, processor. These reconfigurable processors exhibit a number of advantages over commodity microprocessors in many applications. Rather than using the conventional xe2x80x9cload/storexe2x80x9d paradigm to execute an application using a set of limited functional resources as a microprocessor does, the reconfigurable processor actually creates the number of functional units it needs for each application in hardware. This results in greater parallelism and, thus, higher throughput for many applications. Conventionally, the ability for a reconfigurable processor to alter its hardware compliment is typically accomplished through the use of some form of field programmable gate array (xe2x80x9cFPGAxe2x80x9d) such as those produced by Altera Corporation, Xilinx, Inc., Lucent Technologies, Inc. and others.
In practice however, the application space over which such reconfigurable processors, (as well as hybrids combining both microprocessors and FPGAs) can be practically employed is limited by several factors.
Firstly, since FPGAs are less dense than microprocessors in terms of gate count, those packaged FPGAs having sufficient gates and pins to be employed as a general purpose reconfigurable processor (xe2x80x9cGPRPxe2x80x9d), are of necessity very large devices. This size factor alone may essentially prohibit their use in many portable applications.
Secondly, the time required to actually reconfigure the chips is on the order of many hundreds of milliseconds, and when used in conjunction with current microprocessor technologies, this amounts to a requirement of millions of processor clock cycles in order to complete the reconfiguration. As such, a high percentage of the GPRP""s time is spent loading its configuration, which means the task it is performing must be relatively long-lived to maximize the time that it spends computing. This again limits its usefulness to applications that require the job not be context-switched. Context-switching is a process wherein the operating system will temporarily terminate a job that is currently running in order to process a job of higher priority. For the GPRP this would mean it would have to again reconfigure itself thereby wasting even more time.
Thirdly, since microprocessors derive much of their effective operational speed by operating on data in their cache, transferring a portion of a particular job to an attached GPRP would require moving data from the cache over the microprocessor""s front side bus to the FPGA. Since this bus runs at about 25% of the cache bus speed, significant time is then consumed in moving data. This again effectively limits the reconfigurable processor to applications that have their data stored elsewhere in the system.
These three known limiting factors will only become increasingly significant as microprocessor speeds continue to increase. As a result, the throughput benefits that reconfigurable computing can offer to a hybrid system made up of existing, discrete microprocessors and FPGAs may be obviated or otherwise limited in its potential usefulness.
In accordance with the disclosure of a representative embodiment of the present invention, FPGAs, microprocessors and cache memory may be combined through the use of recently available wafer processing techniques to create a particularly advantageous form of hybrid, reconfigurable processor module that overcomes the limitations of present discrete, integrated circuit device implementations of GPRP systems. As disclosed herein, this new processor module may be conveniently denominated as a Stacked Die Hybrid (xe2x80x9cSDHxe2x80x9d) Processor.
Tru-Si Technologies of Sunnyvale, Calif. (http://www.trusi.com) has developed a process wherein semiconductor wafers may be thinned to a point where metal contacts can traverse the thickness of the wafer creating small bumps on the back side much like those of a BGA package. By using a technique of this type in the manufacture of microprocessor, cache memory and FPGA wafers, all three die, or combinations of two or more of them, may be advantageously assembled into a single very compact structure thus eliminating or ameliorating each of the enumerated known difficulties encountered with existing reconfigurable technology discussed above.
Moreover, since these differing die do not require wire bonding to interconnect, it is now also possible to place interconnect pads throughout the total area of the various die rather than just around their periphery. This then allows for many more connections between the die than could be achieved with any other known technique.
Particularly disclosed herein is a processor module with reconfigurable capability constructed by stacking and interconnecting bare die elements. In a particular embodiment disclosed herein, a processor module with reconfigurable capability may be constructed by stacking thinned die elements and interconnecting the same utilizing contacts that traverse the thickness of the die. As disclosed, such a processor module may comprise a microprocessor, memory and FPGA die stacked into a single block.
Also disclosed herein is a processor module with reconfigurable capability that may include, for example, a microprocessor, memory and FPGA die stacked into a single block for the purpose of accelerating the sharing of data between the microprocessor and FPGA. Such a processor module block configuration advantageously increases final assembly yield while concomitantly reducing final assembly cost.
Further disclosed herein is an FPGA module that uses stacking techniques to combine it with a memory die for the purpose of accelerating FPGA reconfiguration. In a particular embodiment disclosed herein, the FPGA module may employ stacking techniques to combine it with a memory die for the purpose of accelerating external memory references as well as to expand its on chip block memory.
Also further disclosed is an FPGA module that uses stacking techniques to combine it with other die for the purpose of providing test stimulus during manufacturing as well as expanding the FPGA""s capacity and performance. The technique of the present invention may also be used to advantageously provide a memory or input/ouput (xe2x80x9cI/Oxe2x80x9d) module with reconfigurable capability that includes a memory or I/O controller and FPGA die stacked into a single block.