Optical interconnect systems internal to computers are receiving increasing attention, as presented, for example, in Neil Savage, “Linking With Light”, IEEE Spectrum, August 2002, pages 32-36. This paper hypothesizes that within a few years many of the copper connections in computers will yield to high-speed optical interconnect in which photons, rather than electrons, will pass signals from board to board, from chip to chip, or even from one part of a chip to another. In these computer systems, an electrical signal from the processor would modulate a miniature laser beam, which would travel through air or a waveguide to a photodetector. The photodetector in turn would pass the signal to the electronics. Assuming that processing elements will continue to be mostly electronics-based, the closer the optical interconnect is to the processing elements, the more challenging the introduction of optics as a means of communication therebetween becomes, due to the need to operate at high speeds and the resulting power requirements.
Modern computer design fabricates processing elements and the highest level of the memory hierarchy (e.g., first-level cache) on the same large computer chip, which is manufactured by Very Large Scale Integration (VLSI) technology. FIG. 1 shows schematically a processor/cache arrangement implemented on a large computer chip 10, where processing elements 12, such as a Control Unit, ALU, FPU (floating point unit), etc., and registers 14, as well as cache memory 16, are fabricated on the large computer chip 10 by means of VLSI technology. A motivation for using recent VLSI technology is to permit larger memories and higher-bandwidth interconnect to be included in the large computer chip. Positioning a plurality of processing and memory modules, as well as an interconnect fabric, on a single cutting-edge (or next-generation) 0.065 micron chip using the VLSI approach results in high manufacturing costs as well as a rather lengthy and complicated manufacturing process associated with the extensive number of photolithographic steps performed on the same large chip.
In parallel computing, although massively parallel processors (MPPs) provide the strongest available machines, recent studies demonstrate that, due to their coarse-grained parallelism, MPPs have not been a success for some general-purpose applications, in particular applications that have irregular parallelism. Achieving programmable, high-performance, general-purpose parallel computing has been an objective of the explicit multi-threaded (XMT) fine-grained parallel on-chip computer architecture framework. A substantial challenge for an XMT design is to provide on-chip connectivity between the many execution units and the many cache modules.
For these purposes, an all-electronic architecture for building a parallel computer on a chip was outlined in D. Naishlos, J. Nuzman, C-W. Tseng and U. Vishkin, “Towards a First Vertical Prototyping of an Extremely Fine-Grained Parallel Programming Approach”, Theory of Computing Systems, 36 (2003), 521-552 (Special Issue of SPAA2001). In this approach, there are processing elements, organized in clusters, and memory modules. The computer memory is hierarchical, where, subject to chip capacity limitations, the highest level of the hierarchy (comprising the first-level cache) is on the chip itself. What distinguishes XMT is that the processors have no local memories besides their registers, and the whole memory is shared among all the processors. An important clarification is that the memory is partitioned among the memory modules using a hashing method, and the cache coherence problem never occurs, since the hashing method designates exactly one physical memory module for each logical memory address. The communication between processor clusters and memory modules is done through an electronic interconnection network.
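The hashed partitioning described above may be illustrated by the following minimal Python sketch. It is offered only as an illustration under stated assumptions and is not the hashing method of the cited reference: the multiplicative hash, the module count of 8, and the names `module_for_address` and `SharedMemory` are all hypothetical. The sketch shows only the property relied upon above, namely that each logical address is served by exactly one memory module, so that no second copy of the data exists to be kept coherent.

```python
NUM_MODULES = 8  # hypothetical number of on-chip memory modules


def module_for_address(addr: int, num_modules: int = NUM_MODULES) -> int:
    """Map a logical address to exactly one memory module.

    Assumption: a 32-bit multiplicative hash stands in for the
    unspecified hashing method; any deterministic hash would do.
    """
    h = (addr * 2654435761) & 0xFFFFFFFF  # keep the low 32 bits of the product
    return (h >> 16) % num_modules        # use the better-mixed upper bits


class SharedMemory:
    """Shared memory split across modules; processors hold no local copies."""

    def __init__(self, num_modules: int = NUM_MODULES) -> None:
        # Each module is modeled as a dictionary from address to value.
        self.modules = [dict() for _ in range(num_modules)]

    def _home(self, addr: int) -> dict:
        # The single module designated for this address by the hash.
        return self.modules[module_for_address(addr, len(self.modules))]

    def write(self, addr: int, value: int) -> None:
        self._home(addr)[addr] = value

    def read(self, addr: int) -> int:
        # Unwritten addresses read as 0 in this sketch.
        return self._home(addr).get(addr, 0)


mem = SharedMemory()
mem.write(100, 7)      # a write issued by one processor cluster
value = mem.read(100)  # a read by any other cluster reaches the same module
```

Because every read and write of a given address is routed to the same module, the value written by one cluster is the value seen by any other, without any coherence protocol between modules.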
It is therefore highly desirable to provide an alternative, less expensive processor/memory arrangement that replaces the single large computer chip approach and allows an optical interconnection between processing elements and cache memories deep inside the microprocessor module of the computer system.