1. Field of the Invention
This invention relates to computer systems and more particularly to multi-processor computer systems wherein a host processor controls the actions of a plurality of peripheral processors.
2. Description of the Prior Art
In certain high data rate applications such as radar processing, seismic processing, voice processing, and others, the required data processing rate may be too great for a single data processor of the desired size to process all data in the time allowed. In one solution to such problems relating to data processing capacity, it is known to reduce the load on the main, or host, data processor by providing a secondary data processor, controlled by the host, which performs part of the processing for the host. Such a secondary data processor will be herein referred to as a peripheral processor or PP.
Typically, in prior art systems, the host processor transfers a block of data into a data memory associated with the PP. The PP then transforms the data in some desired fashion under the control of its own independent stored program, if the PP includes a computer, or otherwise under the control of a fixed logic arrangement. The host then reads the partially processed data out of the PP data memory. A typical example of such a host processor coupled with a single PP is found in the article "The Omen Computers: Associative Array Processors" by L. C. Higbie, IEEE Computer Society International Conference, 1972, pages 288 and 289.
If the required data processing rate is greater than can be accomplished with the aid of a single PP, additional PPs may be added to the host to perform additional job steps. In the prior art this has been accomplished by interfacing multiple PPs with the host computer data bus, each PP having the appearance of a peripheral input/output device. The host must then read results from one PP data memory and then write these results into the data memory of the next PP in turn. This type of arrangement is referred to, for example, at pages 159-160 of Electronics, Vol. 50, No. 5, Mar. 3, 1977. As the number of PPs increases, the load on the input/output data bus and the load on the host memory access circuitry both increase. In some applications it may be desirable to use a substantial number of PPs. One such application arises in speech analysis problems such as word recognition, speaker verification and pitch detection. Steps such as digital bandwidth filtering, fast Fourier transform, convolution, correlation, and others may be provided by PPs. In such applications the processing rate may be limited by the total number of data accesses required to transfer data from the data memory of one PP to the next.
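By way of illustration only, the growth in bus load described above can be sketched as a simple count. The sketch assumes that every PP receives and returns a block of the same size and that each word crosses the host input/output bus once per read and once per write; the function name and parameters are hypothetical and do not appear in the cited references.

```python
def host_bus_transfers(num_pps: int, block_words: int) -> int:
    """Count word transfers over the host I/O bus for one block
    passed through a chain of PPs (illustrative model only)."""
    transfers = 0
    for _ in range(num_pps):
        transfers += block_words  # host writes the block into the PP data memory
        transfers += block_words  # host reads the processed block back out
    return transfers


one_pp = host_bus_transfers(1, 256)
four_pps = host_bus_transfers(4, 256)
```

Under these assumptions the bus traffic rises in direct proportion to the number of PPs, even though the host performs no processing of its own on the relayed data.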
One prior art solution to the transfer rate problem for multiple PPs has been to provide a crossbar switch to interconnect multiple PP processors with multiple PP data memories. An example of one such system is found in U.S. Pat. No. 3,551,894 by Lehman et al. Connections are rearranged through the crossbar switch to associate the partially processed data left in each data memory with the next PP processor which is to act upon it. This technique suffers from the disadvantages that the crossbar switch is complex and nonmodular in structure, and that the amount of hardware required grows approximately as the square of the number of PPs involved.
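The crossbar arrangement may be pictured, in a simplified and purely illustrative form, as a table recording which data memory is attached to each PP processor. The sketch below assumes an equal number of processors and memories and assumes that the memory released by the last processor is returned to the first; both function names are hypothetical and are not taken from the cited patent.

```python
def crosspoint_count(num_processors: int, num_memories: int) -> int:
    """A full crossbar needs one switch point per processor/memory pair."""
    return num_processors * num_memories


def advance_stage(connection: list[int]) -> list[int]:
    """connection[i] is the index of the memory attached to processor i.
    After a step, the memory that processor i-1 worked on is attached
    to processor i (wrapping around is an assumption of this sketch)."""
    n = len(connection)
    return [connection[(i - 1) % n] for i in range(n)]


switch_points_4 = crosspoint_count(4, 4)
switch_points_8 = crosspoint_count(8, 8)
next_connection = advance_stage([0, 1, 2])
```

Doubling the number of PPs from four to eight quadruples the number of switch points in this model, which is the quadratic growth referred to above.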
Another technique for reducing the necessity to transfer data between the host and PP data memory is to arrange for each PP to access the host memory on a cycle-stealing basis. Host processor memory thus provides a common pool of memory for the PPs. With this technique the effect of transferring data from one PP data memory to the next is typically achieved by altering pointer information used to access host processor memory, so that the physical memory locations accessed by a given PP processor can be easily altered as processing proceeds. However, as the number of PPs increases, the PPs occupy an increasing portion of the available memory access time, possibly interfering with the host processor. In extreme cases the host processor may actually be prevented from performing useful work as the PPs make demands on the host memory. This problem is discussed in Computer, Vol. 10, No. 4, April 1977 in the article "Interprocessor Communication for Multi-Microcomputer Systems" by P. M. Russo at page 69.
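The pointer-passing technique and its cost in host memory cycles may be sketched as follows. This is a minimal model under stated assumptions: each PP read or write is taken to steal exactly one host memory cycle, and the class, the function, and the two example processing steps are hypothetical.

```python
class SharedHostMemory:
    """Host memory shared by the PPs; counts the cycles they steal."""

    def __init__(self, size: int) -> None:
        self.words = [0] * size
        self.stolen_cycles = 0

    def read(self, addr: int) -> int:
        self.stolen_cycles += 1  # assumed: one stolen cycle per access
        return self.words[addr]

    def write(self, addr: int, value: int) -> None:
        self.stolen_cycles += 1
        self.words[addr] = value


def run_stage(memory, base, length, word_fn):
    """One PP step: process the block in place, then hand on its base pointer."""
    for addr in range(base, base + length):
        memory.write(addr, word_fn(memory.read(addr)))
    return base  # only the pointer moves; the data words stay where they are


mem = SharedHostMemory(16)
mem.words[4:8] = [1, 2, 3, 4]
ptr = 4
ptr = run_stage(mem, ptr, 4, lambda w: w * 2)  # first PP step
ptr = run_stage(mem, ptr, 4, lambda w: w + 1)  # second PP step, same locations
result = mem.words[4:8]
```

No block is copied between the two steps, but every access by a PP is charged against the host memory, so the count of stolen cycles rises with each PP added.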
The prior art arrangements discussed so far suffer from problems which are solved by the present invention. In those prior art arrangements wherein data is physically transferred from one data memory to the next, data processing must await data transfer. Since data transfer takes place on a word-by-word sequential basis, there is opportunity for performing some of the required data processing while the data is in transit and prior to the beginning of processing by the PP processor. Further, in prior art systems data is either physically moved or left in place for the succeeding PP processing step in discrete blocks of contiguous words. Data words are thus made available to the succeeding PP in whatever arrangement is convenient to the prior PP. The succeeding PP may first have to rearrange the data words before actual processing can get under way. Each of these problems decreases the overall processing rate of each PP step and increases the complexity of the processing by the PP.
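The opportunity noted above may be illustrated, in general terms only, by a word-by-word transfer that applies a simple operation to each word and places it at a rearranged destination address as it passes. The sketch is not a description of any particular apparatus; the function name, the per-word operation, and the reversed ordering are assumptions chosen for the example.

```python
def transfer_with_processing(source, word_fn, address_fn):
    """Move words one at a time, transforming each word in transit and
    storing it at the destination address chosen by address_fn."""
    dest = [None] * len(source)
    for i, word in enumerate(source):
        dest[address_fn(i)] = word_fn(word)
    return dest


source_block = [10, 20, 30, 40]
dest_block = transfer_with_processing(
    source_block,
    word_fn=lambda w: w // 10,                          # assumed per-word scaling
    address_fn=lambda i: len(source_block) - 1 - i,     # assumed reversed ordering
)
```

In this model the succeeding PP finds its data already scaled and already in the order it requires, so neither step consumes any of its own processing time.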