One of the bottlenecks of high-speed data processing is the interface between separate data processing units, such as between memories and CPUs. This interface is an essential pathway in the entire processing system, as it is responsible for carrying an incessant flow of program instructions and data. Parallel processing increases speed, but it may introduce time-management problems with regard to the proper distribution of data, depending on the specific data processing task.
A system that supports a parallel architecture as presented in the preamble is known from U.S. Pat. No. 5,103,311 (PHN 12,376). This reference discloses a processor system for processing video samples on a real-time basis in a modular and hierarchical architecture. The system has at least one processor unit that in turn has at least one module with a plurality of processing elements. The elements operate in parallel and are connected to a cross-bar switch for the appropriate routing of the input signals to the elements of the module and of the output signals from the elements. Each switch point of the cross-bar switch is preferably provided with a register to resolve conflicts that may arise from simultaneous access requests to the same processor element. There is a fixed relationship between the sample frequency and the clock frequency at all levels in the system.
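The role of the switch-point registers can be illustrated with a minimal sketch. It is not taken from the cited reference: the class name `Crossbar`, the methods `route` and `deliver`, and the rule that the lowest-numbered input is served first are all assumptions made for illustration only.

```python
class Crossbar:
    """Hypothetical model of a cross-bar with one register per switch point."""

    def __init__(self, n_inputs, n_elements):
        # regs[src][dst] holds at most one pending sample for that switch point
        self.regs = [[None] * n_elements for _ in range(n_inputs)]

    def route(self, requests):
        """Latch samples; requests maps input index -> (element index, sample)."""
        for src, (dst, sample) in requests.items():
            if self.regs[src][dst] is not None:
                raise RuntimeError("switch-point register already occupied")
            self.regs[src][dst] = sample

    def deliver(self):
        """One cycle: each element accepts at most one pending sample.

        Assumed arbitration: the lowest-numbered input is served first.
        """
        delivered = {}
        for dst in range(len(self.regs[0])):
            for src in range(len(self.regs)):
                if self.regs[src][dst] is not None:
                    delivered[dst] = self.regs[src][dst]
                    self.regs[src][dst] = None
                    break
        return delivered


# Two inputs request the same processing element in the same cycle.
xbar = Crossbar(n_inputs=2, n_elements=2)
xbar.route({0: (1, "s0"), 1: (1, "s1")})
cycle1 = xbar.deliver()  # element 1 receives the sample from input 0
cycle2 = xbar.deliver()  # the buffered sample from input 1 follows
```

In this sketch neither sample is lost: the second request simply waits in its own switch-point register until the element is free.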
Another system of the type mentioned in the preamble is known from U.S. Pat. No. 4,521,874 and concerns an interconnect cross-bar matrix with additional memory elements at interconnect nodes. Distribution of the memory elements facilitates the operation and programming of a parallel-processor structure. The storage protocol per element is based on consolidation, i.e., re-compaction of the stored data items in functionally contiguous memory locations after a particular data item has been purged from the memory upon reading. This data re-compaction requires additional circuitry. It also sets an upper limit on the operation speed, since it requires additional time between the execution of a last-read instruction and the execution of a next write instruction. Accordingly, this system is not suitable for processing video signals on a real-time basis.
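The cost of such re-compaction can be illustrated with a minimal software sketch. It is not the circuit of the cited reference: the class name `CompactingBuffer`, its methods, and the `moves` counter are hypothetical, and the model assumes that one item is shifted by one location per step.

```python
class CompactingBuffer:
    """Hypothetical model of one memory element with re-compaction on read."""

    def __init__(self, capacity):
        self.cells = [None] * capacity
        self.count = 0   # valid items occupy cells[0:count]
        self.moves = 0   # item moves spent on re-compaction so far

    def write(self, item):
        if self.count == len(self.cells):
            raise OverflowError("memory element is full")
        self.cells[self.count] = item
        self.count += 1

    def read(self, index):
        if not 0 <= index < self.count:
            raise IndexError("no valid item at this location")
        item = self.cells[index]
        # Purge the item, then shift every later item down by one location
        # so that the remaining items stay contiguous.
        for i in range(index, self.count - 1):
            self.cells[i] = self.cells[i + 1]
            self.moves += 1
        self.count -= 1
        self.cells[self.count] = None
        return item


buf = CompactingBuffer(capacity=4)
for sample in ("a", "b", "c", "d"):
    buf.write(sample)
first = buf.read(0)  # reading the oldest item forces three moves
```

In this model, reading the item at the lowest location of a full element of four items requires three moves before the next write can proceed, which illustrates the additional time referred to above.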