This invention is related in general to digital processing architectures and more specifically to the design of a reconfigurable processing node for use in an adaptive computing engine.
The advances made in the design and development of integrated circuits (“ICs”) have generally produced information processing devices falling into one of several distinct types or categories having different properties and functions, such as microprocessors and digital signal processors (“DSPs”), application specific integrated circuits (“ASICs”), and field programmable gate arrays (“FPGAs”). Each of these types or categories of information processing devices has distinct advantages and disadvantages. Microprocessors and DSPs, for example, typically provide a flexible, software programmable solution for a wide variety of tasks. The flexibility of these devices requires a large amount of instruction decoding and processing, resulting in a comparatively small amount of processing resources devoted to actual algorithmic operations. Consequently, microprocessors and DSPs require significant processing resources, in the form of clock speed or silicon area, and consume significantly more power compared with other types of devices.
ASICs, while having comparative advantages in power consumption and size, use a fixed, “hard-wired” implementation of transistors to implement one or a small group of highly specific tasks. ASICs typically perform these tasks quite effectively; however, ASICs are not readily changeable, essentially requiring new masks and fabrication to realize any modifications to the intended tasks.
FPGAs allow a degree of post-fabrication modification, enabling some design and programming flexibility. FPGAs are composed of small, repeating arrays of identical logic devices surrounded by several levels of programmable interconnects. Functions are implemented by configuring the interconnects to connect the logic devices in particular sequences and arrangements. Although FPGAs can be reconfigured after fabrication, the reconfiguring process is comparatively slow and is unsuitable for most real-time, immediate applications. Additionally, FPGAs are expensive and are inefficient for implementing particular functions. An algorithmic operation implemented on an FPGA may require orders of magnitude more silicon area, processing time, and power than its ASIC counterpart, particularly when the algorithm is a poor fit to the FPGA's array of homogeneous logic devices.
Matrix operations are used in a wide variety of applications. Image and video applications, audio applications, and signal processing applications can all use matrix operations to perform frequency domain transforms, such as discrete cosine and fast Fourier transforms. Image processing applications can use matrix operations to perform down sampling, color conversion, and quantization. Video applications can use matrix operations to perform video compression or decompression, for example MPEG-4. Signal processing applications can use matrix operations to implement finite impulse response (FIR) filters. Matrix operations also are used to interpolate data, correlate sets of data, and perform complex-valued mathematical operations.
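As an illustration of how an FIR filter can be expressed as a matrix operation, the following minimal sketch builds a lower-triangular convolution (Toeplitz) matrix from the filter taps and multiplies it by a block of input samples. It is offered only as an example under stated assumptions and is not drawn from the invention itself: the function names (fir_matrix, mat_vec, fir_direct), the tap values, and the sample values are hypothetical, and the filter is assumed to be causal with zero initial state.

```python
def fir_matrix(taps, n_samples):
    """Build the n_samples x n_samples convolution matrix for a causal FIR filter.

    Entry (i, j) holds taps[i - j] when that index is valid, else 0.0.
    Hypothetical helper, shown for illustration only.
    """
    return [
        [taps[i - j] if 0 <= i - j < len(taps) else 0.0 for j in range(n_samples)]
        for i in range(n_samples)
    ]


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def fir_direct(taps, samples):
    """Reference FIR filter: y[n] = sum over k of taps[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:  # assume zero input before the first sample
                acc += h * samples[n - k]
        out.append(acc)
    return out


# Illustrative values only (assumed, not from the source text).
taps = [0.5, 0.25, 0.25]
samples = [1.0, 2.0, 3.0, 4.0]

y_matrix = mat_vec(fir_matrix(taps, len(samples)), samples)
y_direct = fir_direct(taps, samples)
print(y_matrix)  # [0.5, 1.25, 2.25, 3.25]
```

Both formulations produce the same output block. The matrix form makes the repeated multiply-accumulate structure explicit, which is the kind of regular workload that dedicated matrix hardware is intended to accelerate.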
Most matrix operations must be performed in real time, so processing speed is an important design consideration. In addition, in some applications, for example mobile communication devices, limited battery capacity makes power consumption a consideration. Cost is also a consideration; thus, the efficient use of silicon area is a priority for many applications.
Thus, it is desirable to provide a node for use in an adaptive computing engine specifically adapted to performing matrix operations. It is further desirable that this node provide fast performance, flexible configuration, low power consumption, and low cost for a wide variety of applications.