The present invention relates to reconfigurable components in general and, in particular but not exclusively, to the decoupling of data processing within the reconfigurable component and/or within parts of the reconfigurable component from data streams, specifically data streams both within the reconfigurable component and to and from peripherals, mass memories, host processors, and the like (see, e.g., German Patent Application Nos. DE 101 10 530.4 and DE 102 02 044.2).
Memories are assigned to a reconfigurable module (VPU) at the inputs and/or outputs to achieve decoupling of internal data processing, the reconfiguration cycles in particular, from the external data streams (to/from peripherals, memories, etc.).
Reconfigurable architecture includes modules (VPUs) having a configurable function and/or interconnection, in particular integrated modules having a plurality of unidimensionally or multidimensionally positioned arithmetic and/or logic and/or analog and/or storage and/or internally/externally interconnecting modules, which are interconnected directly or via a bus system.
These generic modules include in particular systolic arrays, neural networks, multiprocessor systems, processors having a plurality of arithmetic units and/or logic cells and/or communication/peripheral cells (IO), interconnecting and networking modules such as crossbar switches, as well as conventional modules including FPGA, DPGA, Chameleon, XPUTER, etc. Reference is also made in particular in this context to the following patents and patent applications of the same applicant: P 44 16 881.0-53, DE 197 81 412.3, DE 197 81 483.2, DE 196 54 846.2-53, DE 196 54 593.5-53, DE 197 04 044.6-53, DE 198 80 129.7, DE 198 61 088.2-53, DE 199 80 312.9, PCT/DE00/01869, now U.S. Pat. No. 8,230,411, DE 100 36 627.9-33, DE 100 28 397.7, DE 101 10 530.4, DE 101 11 014.6, PCT/EP00/10516, EP 01 102 674.7, DE 196 51 075.9, DE 196 54 846.2, DE 196 54 593.5, DE 197 04 728.9, DE 198 07 872.2, DE 101 39 170.6, DE 199 26 538.0, DE 101 42 904.5, DE 101 10 530.4, DE 102 02 044.2, DE 102 06 857.7, DE 101 35 210.7, EP 02 001 331.4, EP 01 129 923.7 as well as the particular parallel patent applications thereto. The entire disclosures of these documents are incorporated herein by reference.
The above-mentioned architecture is used as an example to illustrate the present invention and is referred to hereinafter as VPU. The architecture includes an arbitrary number of arithmetic, logic (including memory) and/or memory cells and/or networking cells and/or communication/peripheral (IO) cells (PAEs—Processing Array Elements), which may be positioned to form a unidimensional or multidimensional matrix (PA); the matrix may have different cells of any desired configuration. Bus systems are also understood here as cells. A configuration unit (CT) which affects the interconnection and function of the PA is assigned to the entire matrix or parts thereof.
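Purely for illustration, the relationship between the PAEs, the matrix (PA), and the configuration unit (CT) may be sketched as a small data model. The class names, the two-dimensional layout, and the `configure` method below are hypothetical and chosen only for this sketch; they do not describe any actual VPU implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Coord = Tuple[int, int]


@dataclass
class PAE:
    """One processing array element; its function is set by configuration."""
    function: Optional[Callable[[int, int], int]] = None


@dataclass
class ProcessingArray:
    """Two-dimensional matrix (PA) of PAEs plus configurable interconnections."""
    rows: int
    cols: int
    cells: Dict[Coord, PAE] = field(default_factory=dict)
    links: List[Tuple[Coord, Coord]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Populate the matrix with unconfigured cells.
        for r in range(self.rows):
            for c in range(self.cols):
                self.cells[(r, c)] = PAE()


class ConfigurationUnit:
    """CT (hypothetical model): sets function and interconnection of the PA."""

    def __init__(self, pa: ProcessingArray) -> None:
        self.pa = pa

    def configure(
        self,
        functions: Dict[Coord, Callable[[int, int], int]],
        links: List[Tuple[Coord, Coord]],
    ) -> None:
        # Assign a function to each addressed cell, then set the interconnection.
        for coord, fn in functions.items():
            self.pa.cells[coord].function = fn
        self.pa.links = list(links)
```

In this sketch, a CT instance assigned to a 2x2 PA could configure cell (0, 0) as an adder and connect it to cell (0, 1) by calling `configure({(0, 0): operator.add}, [((0, 0), (0, 1))])`; cells not addressed remain unconfigured.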
Memory access methods for reconfigurable modules which operate according to a DMA principle are described in German Patent No. P 44 16 881.0, where one or more DMAs are formed by configuration. In German Patent Application No. 196 54 595.1, DMAs are fixedly implemented in the interface modules and may be triggered by the PA or the CT.
German Patent Application No. DE 196 54 846.2 describes how internal memories are written by external data streams and data is read out of the memory back into external units.
German Patent Application No. DE 199 26 538.0 describes expanded memory concepts according to DE 196 54 846.2 for achieving more efficient and easier-to-program data transmission. U.S. Pat. No. 6,347,346 describes a memory system which corresponds in all essential points to German Patent Application No. DE 196 54 846.2 and has an explicit bus (global system port) to a global memory. U.S. Pat. No. 6,341,318 describes a method for decoupling external data streams from internal data processing by using a double-buffer method, in which one buffer records or reads out the external data while another buffer records or reads out the internal data. As soon as the buffers are full or empty, depending on their function, the buffers are switched, i.e., the buffer formerly responsible for the internal data now sends its data to the periphery (or reads new data from the periphery) and the buffer formerly responsible for the external data now sends its data to the PA (or reads new data from the PA). These double buffers are used in the application to buffer a cohesive data area.
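The double-buffer principle described above may be illustrated by the following minimal sketch, which simulates the input direction only (periphery to PA) in a sequential manner. The names `DoubleBuffer` and `stream_through`, the list-based buffers, and the flushing of a final, partially filled buffer are assumptions made for illustration; they are not taken from U.S. Pat. No. 6,341,318.

```python
from typing import Iterable, Iterator, List


class DoubleBuffer:
    """Two buffers of equal capacity that exchange roles when switched."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.external: List[int] = []  # currently filled from the periphery
        self.internal: List[int] = []  # currently read out by the PA

    def can_swap(self) -> bool:
        # Switch only when the external buffer is full and the internal one is empty.
        return len(self.external) == self.capacity and not self.internal

    def swap(self) -> None:
        # The buffer formerly on the external side now serves the PA, and vice versa.
        self.external, self.internal = self.internal, self.external


def stream_through(data: Iterable[int], capacity: int) -> Iterator[int]:
    """Write `data` on the external side and yield it on the internal side."""
    buf = DoubleBuffer(capacity)
    for word in data:
        buf.external.append(word)
        if buf.can_swap():
            buf.swap()
            # The internal side reads its buffer out completely.
            while buf.internal:
                yield buf.internal.pop(0)
    # Assumption of this sketch: a final, partially filled buffer is flushed.
    buf.swap()
    while buf.internal:
        yield buf.internal.pop(0)
```

In this simulation the data reach the internal side only in blocks of `capacity` words, since no word is passed on until a buffer switch has taken place.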
Such double-buffer configurations have enormous disadvantages in data streaming in particular, i.e., where large volumes of data that stream successively into a processor field or the like must always be processed in the same way.