1. Field of the Invention (Technical Field)
The present invention relates to computer architectures for improving the execution speed of repeating short blocks of instructions, particularly for general-purpose algorithms or data filters.
2. Background Art
Many computer programs perform the largest portion of their processing (measured in time or CPU cycles) in one or several comparatively short blocks of instructions. Programmers have traditionally been taught to make programs "maintainable", even though that may produce some inefficient computer instruction code, and then to go back to identify and optimize the blocks of instructions where the most time is being spent. With a device according to the present invention, compute-intensive blocks of instructions may be placed dynamically into programmable logic devices (PLDs), making them execute quickly and efficiently.
In the prior art, special-purpose co-processors and accelerators exist, such as numeric co-processors, graphic display accelerators, digital signal processors, and the like, but these are designed to expedite a specific function or narrow set of functions. In other words, they are not general purpose. In the present invention, the data might just as easily be a picture, a series of samples from a data acquisition unit, or the result of the last process step. Processors like the Intel numeric co-processors (8087, 80287, etc.) have a fixed set of microcoded numeric instructions that they perform on demand. Graphic display accelerators are designed to perform graphic manipulations on data passing between the system's bus and the video display unit. Although digital signal processors (DSPs) can be flexibly programmed, their instruction set is designed for signal processing, and their instructions must still be decoded before being executed, so routines written for them do not execute as quickly as routines implemented at the logic gate level, as in the present invention. Image Enhancement Co-Processors (IMECOs) employ PLDs and memory blocks, but do not provide the bi-directional data flows, multiple algorithms, algorithm caching, or memory sub-sampling that the present invention provides to increase throughput and reduce data congestion of the host microprocessor.
Patents disclosing uses of PLDs quite different from the present invention include: U.S. Pat. No. 5,497,498, to Taylor, entitled "Video Processing Module Using a Second Programmable Logic Device Which Reconfigures a First Programmable Logic Device for Data Transformation"; U.S. Pat. No. 5,537,601, to Kimura et al., entitled "Programmable Digital Signal Processor for Performing a Plurality of Signal Processings"; and U.S. Pat. No. 5,603,043, to Taylor et al., entitled "System for Compiling Algorithmic Language Source Code for Implementation in Programmable Hardware". Each of these disclosures is directed to a specific task rather than to the general acceleration strategy of the present invention.