This invention relates generally to high speed computing systems, and, more particularly, it relates to a computing system architecture for decomposing mathematical operations into a series of block operations on discrete blocks of data, and for performing those block operations with a block processor by supplying and removing blocks of data to and from the block processor.
Certain classes of mathematical problems require high speed computers if their solutions are to be useful. For example, a complex weather forecasting program is only useful if it produces predictions for a particular day before that day has come and gone. Similarly, applications such as film animation and seismic exploration require high speed computing power to complete their tasks in a useful time period.
Higher computing speeds have been approached in a variety of ways, such as faster semiconductor technology and the use of multiple processors in a parallel or pipelined configuration. In pipelined architectures, for example, an operation is divided into several stages and a sequence of relatively simple operations is performed on long vectors of data. Parallel processors achieve increases in speed by simultaneously performing the same operation on different data elements.
While the availability of relatively low cost processing elements has made such multiple-processor architectures possible, the overall speed of the computing system is limited by the speed with which data stored in main memory can be supplied to and retrieved from the processors. In the prior art, methods for local storage of data (buffer memories), pipelining the transmission of data from memory, and double buffering of data have been developed, but, as processor speeds increase, the "I/O bottleneck" remains.
In the mathematical arts, various methods of problem decomposition have been developed for decomposing an operation on large arrays of data into an equivalent sequence of operations on smaller discrete blocks of data. What is needed is a computing system architecture which supports the use of block decomposition algorithms in multiprocessor systems by supplying and removing blocks of data to and from the processing elements at a speed which allows the processors to operate virtually continuously.
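By way of illustration only, the kind of block decomposition referred to above can be sketched with matrix multiplication, a common example of an operation on large arrays that can be rewritten as a sequence of operations on smaller blocks. The sketch below is not taken from this specification: the function names `matmul_naive` and `matmul_blocked` and the `block` parameter are hypothetical, and the code runs on a single processor, so it shows only the decomposition itself and not the movement of blocks to and from a block processor.

```python
def matmul_naive(a, b):
    """Reference product of two matrices given as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def matmul_blocked(a, b, block=2):
    """Same product, computed as a sequence of block operations.

    `block` (an illustrative parameter) is the edge length of each
    square block; edge blocks may be smaller when the matrix size is
    not a multiple of `block`.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    # The three outer loops step over blocks rather than single elements.
    for i0 in range(0, n, block):
        for j0 in range(0, p, block):
            for k0 in range(0, m, block):
                # One block operation:
                # C[i0.., j0..] += A[i0.., k0..] * B[k0.., j0..]
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, p)):
                        acc = c[i][j]
                        for k in range(k0, min(k0 + block, m)):
                            acc += a[i][k] * b[k][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    b = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
    # The blocked result matches the element-by-element result.
    print(matmul_blocked(a, b, block=2) == matmul_naive(a, b))
```

Each pass through the innermost three loops touches only one block of each operand, which is the property that would allow a block of data to be supplied, processed, and removed as a unit.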