The present invention relates to parallel processors.
Wilson, U.S. Pat. No. 5,129,092 (Wilson '092), describes a single instruction multiple data (SIMD) parallel processor for processing data matrices such as images and spatially related data. As shown and described in relation to FIGS. 1 and 2, the processor includes a linear chain of neighborhood processing units with direct data communication links between adjacent processing units. A single controller sends a sequence of instructions to the processing units, so that all processing units receive the same instruction at any given cycle in the instruction sequence. Each processing unit has an associated memory that is a single bit wide, to and from which data is transferred through shift registers. Similarly, each processing unit receives data from and provides data to adjacent processing units using shift registers, which are used for data input and output as described at col. 8 line 24-col. 9 line 14.
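The SIMD arrangement described above can be illustrated with a minimal sketch, which is not taken from Wilson '092. It assumes each processing unit holds one integer, that the controller's broadcast instruction is modeled as a single function applied by every unit, and that units at the ends of the chain see a fixed boundary value. The names `simd_step`, `neighborhood_sum`, and `edge` are hypothetical.

```python
from typing import Callable, List


def simd_step(values: List[int],
              op: Callable[[int, int, int], int],
              edge: int = 0) -> List[int]:
    """Apply one broadcast instruction to every unit in a linear chain.

    Each unit sees its left neighbor's value, its own value, and its
    right neighbor's value. Units at either end of the chain see `edge`
    (an assumed boundary value) in place of the missing neighbor.
    """
    left = [edge] + values[:-1]    # value arriving from the left neighbor
    right = values[1:] + [edge]    # value arriving from the right neighbor
    # Every unit executes the same operation in the same cycle.
    return [op(l, c, r) for l, c, r in zip(left, values, right)]


def neighborhood_sum(left: int, center: int, right: int) -> int:
    """Hypothetical instruction: sum of a unit's three-value neighborhood."""
    return left + center + right


row = [1, 2, 3, 4]
result = simd_step(row, neighborhood_sum)
```

Here `result` holds one output per processing unit, each computed only from that unit's own value and those of its immediate neighbors.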
As Wilson '092 shows and describes in relation to FIGS. 1, 2, and 5, the processing units form groups of eight, and a host computer and the controller can both send data to or receive data from the groups via eight bit lines referred to as data byte lines. One of these lines is coupled to an output selector within each processing unit; the output of the selector can be written into memory by enabling a three-state gate. Similarly, each processing unit can deliver data from memory to its line by enabling a three-state gate.
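The grouping into eights can be sketched as follows. This is an illustration under stated assumptions rather than the circuit of Wilson '092: it assumes groups are addressed one at a time, that line k of the eight data byte lines is coupled to unit k of the addressed group, and that bit 0 is the least significant bit. The function names are hypothetical.

```python
from typing import List

GROUP_SIZE = 8  # eight processing units share the eight data byte lines


def write_bytes_to_groups(data: bytes) -> List[int]:
    """Deliver one byte to each group, one bit per processing unit.

    Assumed mapping: bit k of the byte travels on line k and is written
    into the memory of unit k of the addressed group.
    """
    bits: List[int] = []
    for byte in data:
        bits.extend((byte >> k) & 1 for k in range(GROUP_SIZE))
    return bits


def read_bytes_from_groups(bits: List[int]) -> bytes:
    """Collect one bit from each unit and reassemble one byte per group."""
    if len(bits) % GROUP_SIZE != 0:
        raise ValueError("number of units must be a multiple of the group size")
    return bytes(
        sum(bits[start + k] << k for k in range(GROUP_SIZE))
        for start in range(0, len(bits), GROUP_SIZE)
    )
```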
Wilson '092 describes transpose in and transpose out operations for transposing data between memory and an accumulator in relation to FIGS. 6A and 6B. The look-up table and histogram applications described at col. 16 line 53-col. 18 line 18 both include operations that change data between vertical and horizontal formats, as illustrated in FIGS. 6A and 6B.
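A change between horizontal and vertical formats can be sketched generically as a bit-level transpose. The sketch below is not the transpose in or transpose out operation of Wilson '092. It assumes that horizontal format means a whole word held together, that vertical format means bit k of every word held in row k of a one-bit-wide memory, and that words are 8 bits wide. The function names and the `width` parameter are hypothetical.

```python
from typing import List


def to_vertical(words: List[int], width: int = 8) -> List[List[int]]:
    """Horizontal to vertical: plane k holds bit k of every word."""
    return [[(w >> k) & 1 for w in words] for k in range(width)]


def to_horizontal(planes: List[List[int]]) -> List[int]:
    """Vertical to horizontal: rebuild each word from its column of bits."""
    if not planes:
        return []
    count = len(planes[0])
    return [
        sum(planes[k][i] << k for k in range(len(planes)))
        for i in range(count)
    ]


words = [5, 255, 0, 18]
planes = to_vertical(words)          # one row of bits per bit position
restored = to_horizontal(planes)     # round trip back to whole words
```

Applying the two functions in sequence returns the original words, which is the property an application needs when it moves data into vertical format for bit-serial processing and back again.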
Wilson, EP-A 293 701 (Wilson '701), describes another such parallel processor. The data input operation is described in relation to FIGS. 1 and 2 at page 5 col. 7 lines 22-43 and the data output operation at page 8 col. 13 lines 9-41.
Hillis, U.S. Pat. No. 5,113,510, describes techniques for operating cache memory in a multi-processor, apparently developed for use in the Connection Machine from Thinking Machines Corporation, a SIMD parallel processor. As shown and described in relation to FIG. 3, each processor in a multi-processor system is connected to a corresponding cache. When a cache memory outputs a miss signal, a bus arbitration unit provides a signal indicating that each successive cache may not perform an update while the present update is being performed, so that the first cache in the priority chain to request an update temporarily disables all other requests for update. Upon receiving a request for update, a shared memory obtains data at a specified address and outputs a Data ready signal, the address, and the data. When the address is within a specified range of addresses of interest to a cache or when the cache is the source of the request for update, the cache memory accepts and stores the address and data signals. As a result, all caches receive updated data from main memory, limited only by an optional range detector. As shown and described in relation to FIG. 4, the bus arbitration circuitry can be arranged in a hierarchical tree-like configuration.
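The update sequence described above can be modeled with a minimal sketch. It is an illustration under stated assumptions and not the circuitry of Hillis: the priority chain is modeled as list order, the shared memory as a dictionary, and the optional range detector as a Python `range` (with `None` standing for a cache that has no detector and therefore accepts every broadcast). All class, field, and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Cache:
    name: str
    interest: Optional[range] = None       # assumed model of the range detector
    lines: Dict[int, int] = field(default_factory=dict)
    pending: Optional[int] = None          # address of an outstanding miss

    def accepts(self, address: int, is_requester: bool) -> bool:
        """Accept if this cache requested the update or the address is of interest."""
        return is_requester or self.interest is None or address in self.interest


def arbitrate(caches: List[Cache]) -> Optional[int]:
    """Grant the first cache in the priority chain that has a pending miss."""
    for index, cache in enumerate(caches):
        if cache.pending is not None:
            return index
    return None


def service_one_update(caches: List[Cache], memory: Dict[int, int]) -> Optional[int]:
    """Perform one update: arbitrate, fetch from shared memory, broadcast."""
    winner = arbitrate(caches)
    if winner is None:
        return None
    address = caches[winner].pending
    data = memory[address]                 # shared memory supplies address and data
    for index, cache in enumerate(caches):
        if cache.accepts(address, index == winner):
            cache.lines[address] = data
    caches[winner].pending = None          # other requests may now proceed
    return address
```

In this model, a cache lower in the chain with a pending miss simply waits until an earlier requester has been serviced, and it may already hold the data it needs if the earlier broadcast fell within its range of interest.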