German Patent No. 44 16 881 describes a method of processing data, where homogeneously arranged cells which can be configured freely in function and interconnection are used.
Independently of the above-mentioned patent, field programmable gate array (FPGA) units are being used to an increasing extent to assemble arithmetic and logic units and data processing systems from a plurality of logic cells.
Another known method is to assemble data processing systems from fixed program-controlled arithmetic and logic units with largely fixed interconnection, referred to as systolic processors.
Problems
Units Described In German Patent No. 44 16 881
Units described in German Patent No. 44 16 881 (referred to below as "VPUs") are very complicated to configure owing to the large number of logic cells. To control one logic cell, several control bits must be specified in a static memory (SRAM). There is one SRAM address for each logic cell. The number of SRAM cells to be configured is very large, so a great deal of space and time is needed for configuring and reconfiguring such a unit. The great amount of space required is problematic because the processing power of a VPU increases with the number of cells, and the usable area of a unit is limited by chip manufacturing technologies. The price of a chip increases approximately proportionally to the square of the chip area. It is impossible to broadcast data to multiple receivers simultaneously because of the repeated next-neighbor interconnection architecture. If VPUs are to be reconfigured on site, it is absolutely essential to achieve short reconfiguration times. However, the large volume of configuration data required to reconfigure a chip stands in the way of this. There is no possibility of separating cells from the power supply or having them cycle more slowly to minimize the power loss.
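The cost argument above can be made concrete with a short worked relation. This is only a sketch: it assumes the stated proportionality holds exactly, and the symbols P (price), A (chip area) and k (a constant of proportionality) are introduced here for illustration and do not appear in the source.

```latex
P(A) \approx k A^{2}
\quad\Longrightarrow\quad
\frac{P(2A)}{P(A)} \approx \frac{k\,(2A)^{2}}{k\,A^{2}} = 4
```

Under that assumption, doubling the chip area to accommodate more cells and their configuration memory roughly quadruples the price.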
FPGAs
FPGAs used in the area described here usually include multiplexers or look-up table (LUT) architectures. SRAM cells are used for implementation. Because of the plurality of small SRAM cells, they are very complicated to configure. Large volumes of data are required, necessitating a correspondingly large amount of time for configuration and reconfiguration. SRAM cells take up a great deal of space, and the usable area of a unit is limited by the chip manufacturing technologies. Here again, the price increases approximately proportionally to the square of the chip area. SRAM-based technology is slower than directly integrated logic due to the SRAM access time. Although many FPGAs are based on bus architectures, there is no possibility of broadcasting for rapid and effective transmission of data to multiple receivers simultaneously. If FPGAs are to be reconfigured at run time, it is absolutely essential to achieve short configuration times. However, the large volume of configuration data required stands in the way. FPGAs do not offer any support for reasonable reconfiguration at run time. The programmer must ensure that the process takes place properly without interfering effects on data and surrounding logic. There is no intelligent logic to minimize power loss. There are no special function units to permit feedback on the internal operating states to the logic controlling the FPGA.
Systolic Processors
Reconfiguration is completely eliminated with systolic processors, but these processors are not flexible because of their rigid internal architecture. Commands are decoded anew in each cycle. As described above, there are no functions which include broadcasting or efficient minimization of power loss.
The present invention relates to a cascadable arithmetic and logic unit (ALU) which is configurable in function and interconnection. No decoding of commands is needed during execution of the algorithm. It can be reconfigured at run time without any effect on surrounding ALUs, processing units or data streams. The volume of configuration data is very small, which has positive effects on the space required and the configuration speed. Broadcasting is supported through the internal bus systems in order to distribute large volumes of data rapidly and efficiently. The ALU is equipped with a power-saving mode to shut down power consumption completely. There is also a clock rate divider which makes it possible to operate the ALU at a slower clock rate. Special mechanisms are available for feedback on the internal states to the external controllers.
The present invention is directed to the architecture of a cell as described in, for example, German Patent No. 44 16 881, or, for example, conventional FPGA cells. An expanded arithmetic and logic unit (EALU) with special extra functions is integrated into this cell to perform the data processing. The EALU is configured by a function register which greatly reduces the volume of data required for configuration. The cell can be cascaded freely over a bus system, the EALU being decoupled from the bus system over input and output registers. The output registers are connected to the input of the EALU to permit serial operations. A bus control unit is responsible for the connection to the bus, which it connects according to the bus register. The unit is designed so that distribution of data to multiple receivers (broadcasting) is possible. A synchronization circuit controls the data exchange between multiple cells over the bus system. The EALU, the synchronization circuit, the bus control unit and registers are designed so that a cell can be reconfigured on site independently of the cells surrounding it. A power-saving mode which shuts down the cell can be configured through the function register; clock rate dividers which reduce the working frequency can also be set.
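The cell behavior described above can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the circuit of the invention: the opcode table, the field names of `FunctionRegister`, and the classes `Cell` and `Bus` are hypothetical and chosen only to show how a single small function register can select the operation, the feedback path for serial operations, the power-saving mode and the clock rate divider, and how a bus can broadcast to several cells.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical opcode table: the source does not define an encoding.
OPS: Dict[int, Callable[[int, int], int]] = {
    0: lambda a, b: a + b,
    1: lambda a, b: a - b,
    2: lambda a, b: a & b,
    3: lambda a, b: a | b,
}


@dataclass
class FunctionRegister:
    """One small configuration word per cell (field names are assumptions)."""
    opcode: int = 0          # selects the EALU operation
    feedback: bool = False   # feed the output register back into operand A
    sleep: bool = False      # power-saving mode: the cell does nothing
    clock_divider: int = 1   # the cell works on every n-th bus clock (n >= 1)


@dataclass
class Cell:
    freg: FunctionRegister = field(default_factory=FunctionRegister)
    in_a: int = 0            # input registers decouple the EALU from the bus
    in_b: int = 0
    out: int = 0             # output register
    ticks: int = 0           # clocks seen since the last configuration

    def configure(self, freg: FunctionRegister) -> None:
        """Reconfigure this cell only; other cells are left untouched."""
        self.freg = freg
        self.ticks = 0

    def load(self, a: int, b: int) -> None:
        """Latch operands from the bus into the input registers."""
        self.in_a, self.in_b = a, b

    def tick(self) -> None:
        """Advance one bus clock, honoring sleep mode and the clock divider."""
        if self.freg.sleep:
            return
        self.ticks += 1
        if self.ticks % self.freg.clock_divider != 0:
            return
        a = self.out if self.freg.feedback else self.in_a
        self.out = OPS[self.freg.opcode](a, self.in_b)


@dataclass
class Bus:
    receivers: List[Cell] = field(default_factory=list)

    def broadcast(self, a: int, b: int) -> None:
        """Deliver the same operands to every attached cell in one step."""
        for cell in self.receivers:
            cell.load(a, b)
```

In this model, reconfiguring a cell means replacing one `FunctionRegister` value, which mirrors the small configuration volume the text attributes to the function register. No command is decoded per cycle: `tick` only applies the operation already selected.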