1. Field of the Invention
The present invention relates to improvements in data processing systems. More particularly, the invention is directed to a massively parallel data processing system containing an array of closely spaced cells where each cell has direct output means as well as means for input, processing and memory.
2. Description of the Prior Art
Parallel computer systems are well known in the art. IBM's 3084 and 3090 mainframe computers, for example, use parallel processors sharing a common memory. While such shared memory parallel systems do remove the von Neumann single processor bottleneck, the funnelling of memory access from all the processors through a single data path rapidly reduces the effectiveness of adding more processors. Parallel systems that overcome this bottleneck through the addition of local memory are also known in the art. U.S. Pat. No. 5,056,000, for example, discloses a system using both local and shared memory, and U.S. Pat. No. 4,591,981 discloses a local memory system where each "local memory processor" is made up of a number of smaller processors sharing that "local" memory. While in these systems each local memory processor has its own local input and output, that input and output is done through external devices. This necessitates having complex connections between the processors and external devices, which rapidly increases the cost and complexity of the system as the number of processors is increased.
Massively parallel computer systems are also known in the art. U.S. Pat. Nos. 4,622,632, 4,720,780, 4,873,626, 4,905,145, 4,985,832, 4,979,096, 4,942,517 and 5,058,001, for instance, disclose examples of systems comprising arrays of processors where each processor has its own memory. While these systems do remove the von Neumann single processor bottleneck and the multi-processor memory bottleneck for massively parallel applications, the output of the processors is still gathered together and funnelled through a single data path to reach a given external output device. This creates an output bottleneck that limits the usefulness of such systems for output-intensive tasks, and the reliance on connections to external input and output devices increases the size, cost and complexity of the overall systems.
Even massively parallel computer systems where separate sets of processors have separate paths to I/O devices, such as those disclosed in U.S. Pat. Nos. 4,591,980, 4,933,836 and 4,942,517 and Thinking Machines Corp.'s Connection Machine CM-5, rely on connections to external devices for their input and output. Having each processor set connected to an external I/O device also necessitates having a multitude of connections between the processor array and the external devices, thus greatly increasing the overall size, cost and complexity of the system. Furthermore, output from multiple processor sets to a single output device, such as an optical display, is still gathered together and funnelled through a single data path to reach that device. This creates an output bottleneck that limits the usefulness of such systems for display-intensive tasks.
Input arrays are also known in the art. State-of-the-art video cameras, for example, use arrays of charge-coupled devices (CCDs) to gather parallel optical inputs into a single data stream. Combining a direct input array with a digital array processor is disclosed in U.S. Pat. No. 4,908,751, and is mentioned as an alternative input means in U.S. Pat. No. 4,709,327. Direct input arrays that do analog processing of the incoming data have been pioneered by Carver Mead et al. (Scientific American, May 1991). While such direct-input/processor arrays do eliminate the input bottleneck to the processor array, these array elements lack direct output means and hence do not overcome the output bottleneck. Reliance on connections to external output devices also increases the size, cost and complexity of the overall systems.
Output arrays where each output element has its own transistor are also known in the art and have been commercialized for flat-panel displays, and some color displays use display elements with one transistor for each color. Since the limited "processing power" associated with each output element cannot add or subtract or edit-and-pass-on a data stream, such display elements can do no data decompression or other processing, and thus the output array still requires a single uncompressed data stream, creating a bandwidth bottleneck as array size increases.
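By way of illustration only, the growth of the uncompressed stream with array size can be sketched as simple arithmetic. The function name and every parameter below (array dimensions, bits per element, refresh rate) are hypothetical and are not taken from any cited display; the sketch merely assumes that every element is refreshed once per frame over the single data path.

```python
def uncompressed_bits_per_second(width: int, height: int,
                                 bits_per_element: int, refresh_hz: int) -> int:
    """Bits per second a single data path must carry when no element can
    decompress: every element receives its full value on every refresh.

    All four parameters are illustrative assumptions.
    """
    return width * height * bits_per_element * refresh_hz


# Doubling both dimensions of the array quadruples the required rate.
small = uncompressed_bits_per_second(1000, 1000, 24, 60)
large = uncompressed_bits_per_second(2000, 2000, 24, 60)
ratio = large // small
```

Under these assumptions the required rate grows in direct proportion to the number of output elements, which is the bottleneck described above.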
Portable computer systems are also known in the art. Smaller and smaller systems are being introduced every year, but the most compact systems suffer from extremely limited processing power, cramped keyboards, and limited battery life. Traditional system assembly techniques assemble systems from many separate pieces, which leads to inefficient use of space. Current processor architectures use much of the area of each processor chip for long-distance communication wiring. Furthermore, lithography errors limit the size of processor and memory chips, so many separate chips must be used in a system. Processor chips and memory chips are produced on separate thin semiconductor wafers, and these wafers are diced into their component chips, a number of which are then encapsulated in bulky packages and affixed to even bulkier printed circuit boards. These boards are then connected to separate external devices for input and output, creating systems many orders of magnitude bigger than the component chips themselves.
Integrated circuits fabricated from amorphous silicon, as opposed to crystalline silicon, are also known in the state of the art. Amorphous silicon, though, is far less consistent a substrate, making it far more difficult to fabricate super-miniature components, and larger components are slower as well as bulkier than smaller ones. Since processor speed is the main bottleneck in the uni-processor computers that dominate the computer world, and since information gathering speed is a growing bottleneck in the massively parallel systems that are trying to replace them, the slower amorphous silicon integrated circuits have not been competitive with crystalline silicon in spite of their lower per-circuit fabrication costs.
It is therefore one object of the present invention to provide an ultra-high-resolution display containing an array of closely spaced cells where each cell has optical direct output means, input means, and memory and processing means just sufficient to extract a datum from a compressed data stream and to transmit that datum through the direct output means, thus maximizing the number of cells that can be fabricated in a given area.
It is another object of the present invention to overcome the drawbacks in current parallel processing systems by providing a massively parallel data processing system containing an array of closely spaced cells where each cell has direct output means, input means, and means for sufficient memory and processing to perform general data processing, allowing the array to handle a wide range of parallel processing tasks without processor, memory or output bottlenecks.
It is another object of the present invention to provide a massively parallel data processing system that minimizes the distances between input, output, memory and processing means, allowing lower voltages to be used and less power to be consumed during operation.
It is another object of the present invention to provide an array of closely spaced cells where each cell has direct input means, direct output means and means for memory and processing, allowing the array to communicate with external devices without physical connections to those devices.
It is another object of the present invention to provide a data processing system containing an array of closely spaced cells interconnected with spare cells in a network that is highly tolerant of defective cells, allowing large arrays to be fabricated as single units with high production yields in spite of defective cells.
It is another object of the present invention to provide a data processing architecture that maximizes system speed relative to component speed, thereby making practical the fabrication of components from lower-cost, but slower, amorphous silicon.
It is another object of the present invention to provide a data processing architecture that simplifies the implementation of continuous manufacturing processes through the at-least-linear replication of all complex components.
It is a further object of the present invention to provide a method for implementing any of the aforementioned objects of the present invention in a single thin sheet.
In accordance with one aspect of the invention, there is thus provided an apparatus containing an array of closely spaced cells, each cell having access to a global input and having direct optical output means as well as minimal memory and processing means, allowing the array to receive, decompress and display data transmitted by another apparatus, such as a computer, a TV station or a VCR.
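By way of illustration only, the following sketch shows how a cell with minimal memory and processing means might extract its own datum from a compressed stream presented on a global input. The run-length encoding, the class name `DisplayCell` and the function `broadcast` are assumptions made for this example and are not prescribed by the invention; each cell is assumed to hold only its own index, a running counter and one latched value.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

Run = Tuple[int, int]  # (run_length, value): hypothetical run-length encoding


@dataclass
class DisplayCell:
    """One cell: knows its own index, keeps a counter and one latched value."""
    index: int
    seen: int = 0                 # positions already covered by earlier runs
    datum: Optional[int] = None   # value this cell would drive on its output

    def observe(self, run: Run) -> None:
        length, value = run
        # Latch the value if this run covers the cell's own position.
        if self.datum is None and self.seen <= self.index < self.seen + length:
            self.datum = value
        self.seen += length


def broadcast(stream: Iterable[Run], cells: List[DisplayCell]) -> List[Optional[int]]:
    """Present every run on the shared (global) input to every cell."""
    for run in stream:
        for cell in cells:
            cell.observe(run)
    return [cell.datum for cell in cells]


cells = [DisplayCell(i) for i in range(6)]
frame = broadcast([(3, 7), (1, 0), (2, 5)], cells)
```

In this sketch three runs suffice to set six cells, so the stream carried on the global input is shorter than the uncompressed frame, and no cell needs more than a comparison and an addition per run.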
In accordance with another aspect of the invention, there is thus provided an apparatus containing an array of closely spaced cells, each cell having means for communication with neighboring cells as well as direct optical output means and minimal memory and processing means, allowing the array to receive, decompress and display a large number of parallel input streams transmitted by another apparatus such as a computer or a VCR, and allowing all array cells to be logically identical and to be produced with identical lithographic patterns.
The present invention also provides, in another aspect, a system containing an array of closely spaced cells, each cell having its own direct input means and direct output means as well as means for memory, means for processing and means for communication with neighboring cells, each cell being, in short, a complete miniature data processing system in its own right, as well as being part of a larger network, providing a massively parallel data processing system that overcomes the I/O and memory bottlenecks that plague parallel processors as well as the von Neumann bottleneck of single processor architectures, and eliminating physical interconnections between the processor/memory array and external input and output devices.
In accordance with still another aspect of the invention, there is thus provided a system containing an array of closely spaced cells, each cell having direct input means and direct output means as well as means for memory, means for processing and means for communication with neighboring cells, where all cells are identical in logical characteristics and can be produced with identical lithographic patterns, simplifying the fabrication of the array with continuous linear production techniques.
In accordance with still another aspect of the invention, there is thus provided a system comprising an array of closely spaced cells, each cell having multiple direct output means and sufficient memory and processing capabilities to simulate several smaller cells each with direct output means, increasing the output resolution of the array relative to the cell density.
In accordance with still another aspect of the invention, there is thus provided a system comprising an array of closely spaced cells, each cell having direct output means, means for memory and means for processing, interconnected with spare cells in a manner such that one or more spare cells can replace the functions of any defective cell.
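By way of illustration only, the replacement of defective cells by spare cells can be sketched as a remapping table. The functions `assign_spares` and `resolve`, the integer cell identifiers, and the simplification that any working spare can stand in for any defective cell with exactly one spare per defective cell are assumptions made for this example; the actual interconnection between cells and spares is not modeled.

```python
from typing import Dict, List, Set


def assign_spares(defective: Set[int], spares: List[int]) -> Dict[int, int]:
    """Map each defective regular cell to a working spare cell.

    Assumes (for illustration) that any working spare can replace any cell.
    Raises ValueError if there are not enough working spares.
    """
    working_spares = [s for s in spares if s not in defective]
    broken_cells = sorted(c for c in defective if c not in spares)
    if len(broken_cells) > len(working_spares):
        raise ValueError("not enough working spare cells")
    return dict(zip(broken_cells, working_spares))


def resolve(cell: int, remap: Dict[int, int]) -> int:
    """Return the physical cell that performs the function of `cell`."""
    return remap.get(cell, cell)


# Cells 2 and 5 are defective, and so is spare 101.
remap = assign_spares({2, 5, 101}, [100, 101, 102])
```

Working cells resolve to themselves, so the rest of the array is unaffected by the substitution.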
The present invention also provides, in another aspect thereof, a method for producing any of the above arrays of closely spaced cells where the entire array is fabricated as a single thin sheet.
By the expression "massively parallel" as used herein is meant a problem, a task, or a system with at least 1000 parallel elements.
By the expression "array" as used herein is meant elements arranged in a two dimensional pattern or as the surface of a three dimensional shape.
By the expression "closely spaced cells" as used herein is meant that the average center-to-center distance between neighboring cells is less than one centimeter.
By the expression "direct output means" as used herein is meant means for a given cell to send an output signal to a device outside the array (such as a human eye) without that output signal being relayed through a neighboring cell, through a physical carrier common to the cells, or through a separate external output device.
By the expression "direct input means" as used herein is meant means for a given cell to receive an input signal from a device outside the array without that input signal being relayed through a neighboring cell, through a physical carrier common to the cells, or through a separate external input device.
By the expression "global input" as used herein is meant means for an individual cell to pick up an input signal from a physical carrier common to the cells, such as a global data bus.
By the expression "external output device" as used herein is meant an output device fabricated as a separate physical entity from the cell array.
By the expression "external input device" as used herein is meant an input device fabricated as a separate physical entity from the cell array.
By the expression "means for communication with neighboring cells" as used herein is meant input means to receive a signal from at least one neighboring cell and output means to send a signal to at least one other neighboring cell without the signals being relayed through a global data bus or through an external device.
By the expression "thin sheet" is meant a sheet whose total thickness is less than 1 centimeter.
The expression "could be produced with identical lithographic patterns" is used solely to describe the similarity of the structures and is not to be construed as limiting the invention to embodiments produced with lithography.
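By way of illustration only, the numeric thresholds in the foregoing definitions can be restated as simple predicates. The constant and function names are hypothetical; only the threshold values (at least 1000 elements, less than one centimeter average spacing, less than 1 centimeter total thickness) are taken from the definitions.

```python
MASSIVELY_PARALLEL_MIN_ELEMENTS = 1000  # "at least 1000 parallel elements"
CLOSE_SPACING_MAX_CM = 1.0              # average spacing "less than one centimeter"
THIN_SHEET_MAX_CM = 1.0                 # total thickness "less than 1 centimeter"


def is_massively_parallel(element_count: int) -> bool:
    """True if the element count meets the 'massively parallel' definition."""
    return element_count >= MASSIVELY_PARALLEL_MIN_ELEMENTS


def is_closely_spaced(avg_center_to_center_cm: float) -> bool:
    """True if the average center-to-center distance is under the limit."""
    return avg_center_to_center_cm < CLOSE_SPACING_MAX_CM


def is_thin_sheet(total_thickness_cm: float) -> bool:
    """True if the total sheet thickness is under the limit."""
    return total_thickness_cm < THIN_SHEET_MAX_CM
```

Note that the first threshold is inclusive while the two length limits are strict, following the wording "at least" and "less than" respectively.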