While there are numerous examples of existing optical computing apparatus, it is believed that all such apparatus are lacking in simplicity and generality. As such, it is difficult to characterize such apparatus as general purpose optical computers.
For example, the approach taken in the optical computing apparatus of the above identified copending application relies heavily upon preconditioning of the input bits prior to processing the preconditioned bits with an AND-OR or AND-OR-INVERT optical operation. The more complex the operation sought to be implemented, the more complex the preconditioning required.
Another example of an optical computing apparatus is disclosed in U.S. Pat. No. 3,680,080, issued Jul. 25, 1972 to Maure, in which a dual rail input, dual rail switch interconnect is employed. A clear disadvantage of this approach is the complexity of the fixed interconnect scheme and the limiting effect that the dual rail interconnect scheme has upon the kinds of functions that can be implemented.
In the past, computer architects have driven their systems toward parallel configurations to achieve speed and avoid bus interface bottlenecks. Historically, the first parallel machines were MIMD (Multiple Instruction, Multiple Data path) machines. These were, in effect, many processors placed on the same bus. The result was shared memory and a MIMD compiler which in itself presented extreme software inefficiencies that ultimately and severely limited performance. The next generation of parallel machines were the SIMD (Single Instruction, Multiple Data path) machines and the systolic machines. These machines implement parallelism at the Do-loop level, most typically for multiply/accumulate intensive problems such as linear algebra problems, FFTs, and other N.sup.3 and above problem classes. These machines require "vectorizing" and/or "matricizing" compilers. Depending on the degree of parallel compilation, the efficiency of these machines could be improved.
Almost all code that exists today is von Neumann in nature, i.e., single instruction, sequential. What is desired is a fast von Neumann machine without I/O bottlenecks.
The architecture described in the subject application provides such a solution. Parallelism is identified by the compiler and exploited at the microcode level; that is, each instruction can be written as parallel combinatorial functionals. Data reuse is achieved by operating on the data several times within one instruction, thereby avoiding the I/O bottleneck.
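The data-reuse principle stated above can be illustrated with a minimal sketch. The example below is not taken from the subject application; it is a hypothetical software analogy in which three successive operations on a data set are fused into a single pass, so that each datum is fetched once rather than three times, in the spirit of performing several operations on the data within one instruction.

```python
# Illustrative analogy only (not the patent's apparatus): fusing several
# operations into one pass over the data to avoid repeated I/O traffic.

def three_pass(data):
    # Naive form: each operation re-reads the entire data set,
    # yielding three separate passes over memory.
    a = [x + 1 for x in data]
    b = [x * 2 for x in a]
    return [x - 3 for x in b]

def one_pass(data):
    # Fused form: all three operations are applied while the datum
    # is at hand, so the data crosses the memory boundary only once.
    return [((x + 1) * 2) - 3 for x in data]

print(one_pass([0, 1, 2]))  # [-1, 1, 3], identical to three_pass
```

Both functions compute the same result; the fused form simply trades repeated data movement for a longer combinatorial operation per datum, which is the software analogue of the data reuse described above.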