In recent decades computer electronics have rapidly advanced through four generations: vacuum tubes, transistors, integrated circuits, and very large scale integrated (VLSI) circuits. These four generations have improved computer performance by several orders of magnitude while at the same time dramatically reducing cost. Despite these tremendous advances, a present need exists for further improvement by several orders of magnitude in order to solve foreseeable problems in advanced technical areas.
Most computers have been designed using the serial processor model proposed by John von Neumann. In the classic von Neumann architecture, a single stream of instructions is fed to a single processor and processed in the order received. Data is taken from memory bin locations, processed, and then returned to the same or a different bin location. The programming languages are sequential, being designed to follow the serial architecture of the computer. In such serial systems, processing speed depends on the operating speed of the components. Since the component operating speed is not likely to increase significantly in the future, various parallel or concurrent processing techniques have been explored.
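The serial model described above can be illustrated with a minimal sketch. The instruction format (operation, two source bins, one destination bin) and the names OPS and run_serial are assumptions made for illustration only and do not correspond to any particular machine.

```python
import operator

# Hypothetical instruction set for illustration only.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def run_serial(program, memory):
    """Execute instructions one at a time, in the order received.

    Each instruction is (op, src_a, src_b, dst): two operands are taken
    from memory bins, processed, and the result is returned to a bin.
    """
    memory = list(memory)  # work on a copy so the caller's list is unchanged
    for op, src_a, src_b, dst in program:
        memory[dst] = OPS[op](memory[src_a], memory[src_b])
    return memory


final_memory = run_serial(
    [("add", 0, 1, 2), ("mul", 2, 2, 3)],  # bin2 = bin0 + bin1; bin3 = bin2 * bin2
    [3, 4, 0, 0],
)
```

Because the second instruction reads the bin written by the first, the two cannot be reordered, and total run time is simply the sum of the individual instruction times.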
The commercial efforts from companies like Cray Research and Control Data in the United States and Hitachi Ltd., Fujitsu Ltd. and NEC Corp. of Japan have resulted in the so-called "supercomputers". These computer systems make extensive use of concurrent processing techniques called pipelining and vectoring. In pipelining, tasks are divided up so they can be performed concurrently. For example, one section may fetch instructions while a different section processes a previously fetched instruction. Thus, while one instruction is being executed, the next instruction is simultaneously being fetched. By overlapping tasks in this fashion the operating speed of the computer can be increased. Vectoring is a somewhat similar technique involving organization of data for processing in pipeline fashion. These supercomputers also sometimes use more than one high speed processor in combination with "look-ahead" techniques in order to co-process the program where data needs do not overlap. However, the supercomputers are basically very high speed serial processors rather than parallel processors and, therefore, the extent to which concurrent processing can be incorporated is limited.
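The overlap of fetching and executing can be sketched as a cycle-by-cycle schedule. This is a simplified model that assumes every stage takes exactly one cycle and that no instruction stalls; the function names are hypothetical.

```python
def serial_cycles(n_instructions, n_stages=2):
    """Cycles needed when each instruction finishes all stages before the next starts."""
    return n_instructions * n_stages


def pipelined_schedule(n_instructions, stages=("fetch", "execute")):
    """Return one dict per cycle mapping stage name -> index of the instruction in it."""
    if n_instructions == 0:
        return []
    n_cycles = n_instructions + len(stages) - 1
    schedule = []
    for cycle in range(n_cycles):
        active = {}
        for depth, stage in enumerate(stages):
            instr = cycle - depth  # instruction occupying this stage in this cycle
            if 0 <= instr < n_instructions:
                active[stage] = instr
        schedule.append(active)
    return schedule


schedule = pipelined_schedule(4)
```

Under these assumptions, four instructions need eight cycles serially but only five when pipelined, since in every cycle after the first one instruction is being executed while the next is being fetched.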
Other efforts, mostly experimental efforts at universities, have been directed toward developing systems utilizing huge numbers of processors operating in parallel. Many of these efforts, like the Illiac IV computer at the University of Illinois, have involved special purpose computations, usually problems of a generally parallel character. For example, picture processing systems normally divide the picture into individual pixels, each of which can then be processed concurrently following the same program. Such systems are characterized by a single instruction, multiple data (SIMD) type of flow through the system. This approach does not have general application and is limited in usefulness to what are basically parallel problems.
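The single instruction, multiple data flow can be illustrated by applying one program to every pixel of a picture. The sketch below uses a thread pool from the Python standard library purely to model the structure; the threshold operation, the cutoff value, and the worker count are assumptions chosen for illustration, and the sketch does not represent the Illiac IV design.

```python
from concurrent.futures import ThreadPoolExecutor


def threshold(pixel, cutoff=128):
    """The single 'program' every processor runs: binarize one pixel."""
    return 255 if pixel >= cutoff else 0


def simd_apply(program, pixels, workers=4):
    """Apply the same program to each pixel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(program, pixels))


binary_image = simd_apply(threshold, [0, 127, 128, 200])
```

The approach works here because no pixel's result depends on any other pixel. A problem lacking that independence would not map onto this structure, which is the limitation noted above.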
Parallel processor systems with dynamic configurations have been proposed, for example, in the Blue Chip project at Purdue University. These systems include a large switching array capable of connecting each of the processors to selected ones of its neighbors. Each processor includes its own local memory and program. Unfortunately, the switching array becomes inordinately complex as the computer increases in size because of the increasing number of switching possibilities. Thus, the concept is difficult to expand into a large computing machine.
Several experimental computers have been constructed following the data-flow concept. According to this concept, control of the program execution is determined by the arrival of the data. The data is normally tagged so the data for a particular calculation can be identified on arrival. The processors operate individually without a central program counter and send out tagged results to other processors when the processing operation is complete. The data-flow computer automatically exploits the parallelism existing in problems because all instructions for which data is available can be executed at the same time if sufficient processors are available. Although this approach is thought to be promising, effective software has not yet been devised. This approach also suffers from either high data switching overhead or data transfer bottlenecks, and from the need for data recognition processing at the individual processors.
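The firing rule of the data-flow concept can be sketched as follows. Each token carries a tag (destination node and operand port), and a node executes as soon as all of its operands have arrived, with no program counter. The node table format and the function run_dataflow are hypothetical, and the sketch processes tokens one at a time rather than on separate processors.

```python
import operator
from collections import deque


def run_dataflow(nodes, initial_tokens):
    """Simulate tagged-token data-flow execution.

    nodes: name -> (function, operand count, [(destination node, port), ...])
    initial_tokens: iterable of (node name, port, value) tagged tokens.
    Returns (outputs of nodes with no destinations, order in which nodes fired).
    """
    pending = {name: {} for name in nodes}  # operands that have arrived so far
    queue = deque(initial_tokens)
    outputs, fired = {}, []
    while queue:
        name, port, value = queue.popleft()
        func, n_inputs, dests = nodes[name]
        pending[name][port] = value  # tag identifies which operand this is
        if len(pending[name]) == n_inputs:  # all data present: the node fires
            args = [pending[name][p] for p in range(n_inputs)]
            pending[name] = {}
            result = func(*args)
            fired.append(name)
            if dests:
                for dest, dest_port in dests:
                    queue.append((dest, dest_port, result))  # send tagged result
            else:
                outputs[name] = result
    return outputs, fired


# Computes (a + b) * (c - d) with a=3, b=4, c=7, d=2; tokens arrive out of order.
graph = {
    "add": (operator.add, 2, [("mul", 0)]),
    "sub": (operator.sub, 2, [("mul", 1)]),
    "mul": (operator.mul, 2, []),
}
outputs, fired = run_dataflow(
    graph, [("sub", 1, 2), ("add", 0, 3), ("sub", 0, 7), ("add", 1, 4)]
)
```

In this example the "add" and "sub" nodes share no data, so on a real data-flow machine they could execute at the same time on separate processors. The per-token tag matching in the loop corresponds to the data recognition processing mentioned above.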
A binary tree configuration has been proposed by Gyula Mago at the University of North Carolina. The main processing elements, including individual memory units, are referred to as the leaves and can pass data up the tree (actually an inverted tree) or receive data moving down the tree. Data passing through the node points can also be processed. The computer operates by moving data in waves up and down the tree structure, which permits lateral as well as vertical data movement. The concept is interesting but has yet to be implemented in a large high speed computer.
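How an upward wave followed by a downward wave can move information laterally between leaves may be illustrated with a running-sum computation. This is a generic tree sweep chosen for illustration and is not a description of Mago's machine; the class and function names are hypothetical.

```python
class Node:
    """A leaf holds a value; an internal node holds two children."""

    def __init__(self, left=None, right=None, value=0):
        self.left, self.right, self.value = left, right, value
        self.total = 0  # filled in by the upward wave


def build(values):
    """Build a binary tree whose leaves hold the given values, left to right."""
    if not values:
        raise ValueError("at least one leaf value is required")
    if len(values) == 1:
        return Node(value=values[0])
    mid = len(values) // 2
    return Node(build(values[:mid]), build(values[mid:]))


def wave_up(node):
    """Upward wave: each internal node processes (sums) data from its children."""
    if node.left is None:
        node.total = node.value
    else:
        node.total = wave_up(node.left) + wave_up(node.right)
    return node.total


def wave_down(node, from_left=0, out=None):
    """Downward wave: each node passes down the sum of everything to the left."""
    if out is None:
        out = []
    if node.left is None:
        out.append(from_left + node.value)
    else:
        wave_down(node.left, from_left, out)
        wave_down(node.right, from_left + node.left.total, out)
    return out


root = build([3, 1, 4, 1, 5])
grand_total = wave_up(root)
running_sums = wave_down(root)
```

After the two waves, each leaf holds the sum of its own value and all values to its left, so data has effectively moved sideways across the leaves even though every transfer was vertical.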