A wide variety of potential applications exist for extremely compact, ultra-high-capacity computational modules in both embedded and high-performance computing environments. These applications include, among others, sensor signal processing, sensor fusion, image processing, feature identification, pattern recognition, smart cameras, artificial multilayer retinas, early vision systems, neural networks, multiprocessor access to shared memories, parallel-access 3-D memory, highly parallel rendering engines for computer animation and graphics, high-resolution two-dimensional and three-dimensional display drivers, and high-bandwidth image displays. All of these applications are computation-intensive, and in many cases must be carried out in operational environments that are restricted in both available power and the space allowed for computational elements.
To handle these and other related "grand challenge" problems, advanced computational systems must employ distributed parallel processing elements in an architecture that is amenable to the compact integration of multiple processor and memory chips, operates at low power, and supports high-bandwidth parallel input/output (I/O). Parallel processing can be accommodated both by a multiplicity of single-chip processing elements and by the incorporation of parallel processing on each chip. Certain applications (such as parallel multiprocessors, 1-D sensor signal processing, and 1-D sensor fusion) may ultimately require only a few complex processors per chip, while other applications (such as image processing, 2-D sensor signal processing, and early vision) segment more readily into many simple processors (e.g., smart pixels) on each chip. The integration of both fine-grained and coarse-grained processing elements within a computational multichip module requires advanced packaging concepts that increase manufacturability and enable enhanced chip-to-chip interconnection capacity, thereby augmenting the aggregate computational performance of the resultant processor/interconnection system. The requisite features of such an advanced packaging architecture and its associated packaging technology include the capacity for parallel transmission of intermediate computational results and the availability of dense local and global interconnections.
Electronic multichip module integration techniques have previously been employed in an attempt to provide these requisite features. Two separate approaches are well known to those skilled in the art: (1) the horizontal integration of electronic chips on a common substrate that contains electrical chip-to-chip interconnections, and (2) the vertical integration of electronic chips in a three-dimensional (3-D) stack.
The advantages of horizontal multichip module integration include ease of chip placement and rework, mature wire-bonding and tape-automated bonding (TAB) techniques for electrical interconnection of each individual chip to the substrate, and compatibility with planar heat removal and thermal management techniques. However, although horizontal multichip module integration has been successfully applied to a wide range of applications, this integration scheme exhibits relatively low I/O bandwidth and high power dissipation, due primarily to long off-chip lead lengths and their associated high capacitance.
The advantages of vertical multichip module integration include increased I/O bandwidth and reduced power dissipation, due primarily to the shortening of off-chip lead lengths. However, the integration of more than two chips in the vertical dimension by means of electrical interconnections alone requires either the routing of all I/O signals to the edge of each chip for interconnection by means of an edge-mounted electrical interconnection network, or the incorporation of vertical electrical vias through each chip. The former (edge-mounted interconnection) approach has shown promise for multichip memory and certain sensor applications in which the memory or processor architectures lead naturally to I/O ports arranged along the individual chip edges. In other highly parallel, computation-intensive applications such as those envisioned herein, the routing of all I/O ports to the chip edges proves to be impractical, limiting in terms of overall I/O signal capacity, or expensive in terms of the additional chip area that must be incorporated to allow for multiple interconnection routing on the chip and for multiplexing and demultiplexing of each I/O port. The latter (vertical electrical via) approach has been intensively investigated for many years, but to date has not proven to be commercially viable.
Given the current limitations of electronic multichip module integration as applied to the computational and display tasks outlined above, several investigators have proposed the interconnection of multiple electronic processors with both optical I/O and electronic I/O, with optical I/O employed for dense parallel chip-to-chip interconnections, and electronic I/O used for lateral control signals and local cache memory access as appropriate. Two primary approaches have been investigated thus far: (1) the use of free-space optical interconnection techniques, and (2) the incorporation of proximity-coupled photonic sources (such as light-emitting diodes or vertical-cavity surface-emitting lasers) and associated detectors to provide compact optical interconnection channels. Free-space optical interconnections provide increased aggregate signal bandwidth and capacity for both local and global interconnectivity, but also require relatively immature bulk-optical packaging technologies and large system volumes. The incorporation of proximity-coupled photonic sources and detectors to provide for plane-to-plane interconnections can significantly reduce the required system volume, but at the current state of the art carries high power dissipation penalties at the desired interconnection bandwidths. As a consequence of this high power dissipation, such approaches are limited in aggregate interconnection capacity, as expressed by the product of the number of interconnection channels per unit area and the bandwidth of each interconnection channel. Furthermore, although electronic I/O and its associated packaging issues are well understood and highly developed at this point, packaging techniques that incorporate optical interconnections have not yet achieved technological break-even.
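The aggregate interconnection capacity figure of merit referred to above can be illustrated with a minimal sketch. The function name, the units (channels per square centimeter and bits per second), and the numerical values below are hypothetical and chosen only for illustration; they do not represent measured data for any particular technology.

```python
def aggregate_capacity(channels_per_cm2: float, bandwidth_per_channel_bps: float) -> float:
    """Aggregate interconnection capacity per unit area.

    Computed as the product of the channel density and the bandwidth of
    each channel. The units (channels/cm^2 and bit/s) are an assumption
    made for this illustration.
    """
    if channels_per_cm2 < 0 or bandwidth_per_channel_bps < 0:
        raise ValueError("channel density and bandwidth must be non-negative")
    return channels_per_cm2 * bandwidth_per_channel_bps


# Hypothetical operating points: many slow channels versus few fast channels.
dense_slow = aggregate_capacity(10_000, 1e8)   # 10,000 channels/cm^2 at 100 Mbit/s each
sparse_fast = aggregate_capacity(1_000, 1e9)   # 1,000 channels/cm^2 at 1 Gbit/s each

# Both operating points give the same aggregate capacity per unit area.
print(dense_slow, sparse_fast)
```

Because the figure of merit is a product, a limit on either factor (for example, a power dissipation ceiling that restricts channel density at a given per-channel bandwidth) limits the aggregate capacity as a whole.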
The invention described herein is directed to these ends: producing a manufacturable electronic/photonic packaging technology for the dense, high-bandwidth interconnection of sets of processing elements, microprocessors, and memory modules or arrays distributed over multiple chips, with increased chip-to-chip interconnection density and aggregate interconnection bandwidth as well as reduced power consumption.