The present invention has particular application to optical computer interfaces. The following discussion and the preferred embodiments to be described are directed specifically to this area. It will be understood, however, that the invention in its general sense also has application in other areas. Some of these alternate uses will also be discussed.
A computer consists of one or more processors, which do the actual computation and decision making, and memory, which stores programs and data. Most systems, even today, have a single processor. As the demand for faster computers grew, multiprocessor systems were developed. In theory, a system with p processors could run p times faster than a single-processor system.
For example, consider a database search, such as a physician searching a medical database for all recent references on a given disease. The database can be partitioned into p parts, and each processor can search its own part. Because the p processors search in parallel, there is a potential speedup of a factor of p.
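The partitioned search just described can be sketched as follows. This is a minimal illustration only, not part of the disclosed apparatus: the function names `partition`, `search_part`, and `parallel_search` are hypothetical, records are assumed to be plain strings, and worker threads stand in for the p processors. In CPython, threads show the structure of the partitioning rather than a true factor-p speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, p):
    """Split records into p contiguous parts of near-equal size."""
    size, extra = divmod(len(records), p)
    parts, start = [], 0
    for i in range(p):
        # The first `extra` parts each take one additional record.
        end = start + size + (1 if i < extra else 0)
        parts.append(records[start:end])
        start = end
    return parts


def search_part(part, keyword):
    """Search one part of the database (the work of one processor)."""
    return [record for record in part if keyword in record]


def parallel_search(records, keyword, p):
    """Search all p parts concurrently and merge the hits in order."""
    parts = partition(records, p)
    with ThreadPoolExecutor(max_workers=p) as pool:
        results = pool.map(search_part, parts, [keyword] * p)
    return [hit for part_hits in results for hit in part_hits]
```

For instance, `parallel_search(records, "measles", 4)` would search four parts of a hypothetical list `records` and return the matching entries in their original order.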
However, this factor-p speedup has yet to be realized in a real system, chiefly due to contention among the p processors for limited resources. If two processors try to access the same memory module at the same time, one of the processors must wait. In the database application mentioned above, even though the database has been partitioned, a given memory module may contain several parts. Thus, the processors may still clash occasionally, which results in loss of the parallelism depended on for the speedup. Also, processors may have to contend for the interconnect between the processors and the memory modules.
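The effect of memory-module contention can be illustrated with a simple model. The model is an assumption introduced here for illustration and is not taken from the references cited: in each cycle, every one of p processors requests one of m memory modules chosen uniformly at random, each module serves one request, and the remaining requests wait. The names `expected_served` and `simulate_served` are hypothetical.

```python
import random


def expected_served(p, m):
    """Expected number of requests served per cycle under the assumed model.

    A given module is requested by at least one of the p processors
    with probability 1 - (1 - 1/m)**p, and there are m modules.
    """
    return m * (1.0 - (1.0 - 1.0 / m) ** p)


def simulate_served(p, m, cycles=10000, seed=0):
    """Monte Carlo estimate of the same quantity."""
    rng = random.Random(seed)
    served = 0
    for _ in range(cycles):
        # Each distinct module requested in this cycle serves one processor.
        requested = {rng.randrange(m) for _ in range(p)}
        served += len(requested)
    return served / cycles
```

Under these assumptions, 8 processors sharing 8 modules complete roughly 5.25 accesses per cycle rather than 8, which is one way of seeing why the realized speedup falls short of p.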
These problems have resulted in processing speedup factors much smaller than the theoretical value p, and have continued to plague multiprocessor technology to the present time. See, for instance, Siewiorek et al., Computer Structures: Principles and Examples, McGraw-Hill, 1982; Hwang et al., Computer Architecture and Parallel Processing, McGraw-Hill, 1984; and Agrawal, Advanced Computer Architecture, IEEE Computer Society, 1986. Essentially no real time solutions have been found. For example, Cray Research, Inc. recently released the Cray X-MP, a multiprocessor version of the Cray-1 supercomputer. A number of investigations (Bailey, "Vector Computer Memory Bank Contention," IEEE Transactions on Computers, 1987, C-36, 3, 293-298; Cheung et al., "A Simulation Study of the Cray X-MP Memory System," IEEE Transactions on Computers, 1986, C-36, 7, 613-622; and Oed et al., "Modeling, Measurement and Simulation of Memory Interference in the Cray X-MP," Parallel Computing, 1986, 343-358) quickly showed the system to suffer from slowdowns due to both contention for shared memory and contention for the network which connects the processors to that memory.
Perhaps an even more dramatic example is the S-1, a TC multiprocessor system developed at Lawrence Livermore National Laboratories (Hwang et al., 1984). Throughout the period of development of this system, it was hailed as one of the most advanced multiprocessor projects in existence. However, recently the project was discontinued, in spite of all the favourable publicity, and the very extensive funds expended. One of the primary reasons given for the discontinuation was that the project engineers had found that the contention for shared memory in the system would be much greater than they had anticipated. They are now beginning to work on a completely new design.
Another obstacle to achieving factor-p speedup is that memory "chips" have a very small pins-to-bits ratio. A memory chip can store thousands or even millions of bits of information, yet this information is accessible through only a small number, such as 8 or 16, of data pins. This has been a problem even in uniprocessor systems; some processors can consume data much faster than the rate at which it can be accessed in a memory chip.
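The pins-to-bits limitation can be put in rough numbers. The sketch below assumes, for illustration only, that each transfer moves one bit per data pin; the function name `transfers_needed` and the 1,048,576-bit chip size are hypothetical examples, not figures from any particular device.

```python
def transfers_needed(bits, data_pins):
    """Number of transfers required to read `bits` bits through `data_pins` pins.

    Assumes one bit per pin per transfer; rounds up for a partial final transfer.
    """
    return (bits + data_pins - 1) // data_pins


# A hypothetical chip holding 2**20 bits, read through 8 or 16 data pins.
for pins in (8, 16):
    print(pins, transfers_needed(2 ** 20, pins))
# prints:
# 8 131072
# 16 65536
```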
In conventional interconnect technology, the simplest interconnect is a bus, which consists of a single set of wires. All processors and memory modules are attached to the bus. Since processors access memory solely through this single path, it is immediately clear that the contention between processors and memory modules for the interconnect is very severe.
At the other extreme among interconnect structures is the crossbar. Here there are essentially mp processor-memory paths, one for each processor-memory module pair (m being the number of memory modules). There is no interconnect contention for this structure. However, as m and p get large, the size of the product mp grows at a very rapid rate, rendering the crossbar far too expensive a solution. Also, the more complex the crossbar, the more delay is added by the crossbar switching elements, i.e. although there are a sufficient number of paths to memory, each path gets slower. Furthermore, the crossbar still does not solve the problems of accessing the same memory module at the same time and the small pins-to-bits ratio.
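The contrast between the single path of a bus and the mp paths of a crossbar can be tabulated directly. The following is a small arithmetic sketch; the function names `bus_paths` and `crossbar_paths` are hypothetical, and the sizes chosen (p equal to m) are illustrative.

```python
def bus_paths(p, m):
    """A bus offers a single shared path regardless of p and m."""
    return 1


def crossbar_paths(p, m):
    """A crossbar offers one path per processor-memory module pair."""
    return p * m


# Illustrative sizes with p == m.
for n in (4, 16, 64):
    print(n, bus_paths(n, n), crossbar_paths(n, n))
# prints:
# 4 1 16
# 16 1 256
# 64 1 4096
```

When p and m grow together, the number of crossbar paths grows with their product, which is the cost concern noted above.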
Due to the expense and delay associated with a crossbar, a large number of intermediate designs have been proposed (Siegel, Interconnection Networks for Large-Scale Parallel Processing, Heath, 1985). Such designs are aimed at providing almost as many processor-memory paths as does a crossbar, but with considerably less complexity. However, the problems of memory module access and small pins-to-bits ratio again remain, and the problem of processors having to contend for the interconnect between the processors and the memory modules remains in part.
Several recent articles have discussed the merits of optical interconnects for VLSI systems. See, for instance, Goodman et al., "Optical Interconnections for VLSI Systems", Proc. of the IEEE, vol. 72, no. 7, pp. 850-865, 1984; and Neff, "Alternative to VLSI", Defense Science and Electronics, May 1986, pp. 23-29. These have shown that the use of optical and electro-optical technologies should be able to overcome pin-limitation problems and increase system operating speeds. Electro-optical conversion of data is one way of realizing optical interconnects but has been hindered by a lack of suitable systems/materials for such conversion.
The loading of an array of "data" into an integrated circuit is described in International Application No. PCT/GB85/00404 of Ullman et al. for "Method and Apparatus for Loading Information into an Integrated Circuit Semiconductor Device", published as International Publication No. WO 86/01931. This loading does not, however, include the transfer of information electro-optically out of an integrated circuit. It relies instead on the use of a spatially modulated mask to define the information or data loaded.