To date, the most common method of implementing functions on an integrated circuit is to design the function or functions specifically, placing on silicon an interconnected group of digital circuits in a non-modifiable manner (a hard-wired or fixed-function implementation). These circuits are designed to provide the fastest possible operation in the least amount of silicon area. In general, they are made up of an interconnection of various amounts of random-access memory and logic circuits. Complex systems on silicon are broken up into separate blocks, and each block is designed separately to perform only the function for which it was intended. In such systems, each block has to be individually tested and validated, and then the whole system has to be tested to make sure that the constituent parts work together. This process is becoming increasingly complex as we move into future generations of single-chip system implementations. Systems implemented in this way generally tend to be the highest-performing systems, since each block in the system has been individually tuned to provide the expected level of performance. This method of implementation may also be the smallest (cheapest in terms of silicon area) when compared with the three other distinct ways of implementing such systems today. Each of these other three has its problems and generally does not tend to be the most cost-effective solution. These other methods are explained below.
The first method is a software implementation: any system can be functionally implemented in software using a microprocessor and an associated computing system. Such systems would not, however, be able to deliver real-time performance in a cost-effective manner for the class of applications described above. Today such systems are used to model the subsequent hard-wired/fixed-function system before considerable design effort is put into the system design.
The second method of implementing such systems is by using a digital signal processor, or DSP. This class of computing machines is useful for real-time processing of certain speech, audio, video and image processing problems. DSPs may also be effective in certain control functions, but they are not cost-effective when it comes to performing certain real-time tasks that do not have a high degree of parallelism in them, or tasks that require multiple parallel threads of operation, such as three-dimensional graphics.
The third method of implementing such systems is by using field programmable gate arrays, or FPGAs. These devices are made up of a two-dimensional array of fine-grained logic and storage elements which can be connected together in the field by downloading a configuration stream that essentially routes signals between these elements. This routing of the data is performed by pass-transistor logic. FPGAs are by far the most flexible of the three methods mentioned. The problem with trying to implement complex real-time systems with FPGAs is that, although there is greater flexibility for optimizing the silicon usage in such devices, the designer has to trade that flexibility off against an increase in cost and a decrease in performance. The performance may (in some cases) be increased considerably at a significant cost, but it still would not match the performance of hard-wired fixed-function devices.
It can be seen that the above-mentioned systems neither reduce the cost nor increase the performance relative to fixed-function silicon systems. In fact, as far as performance is concerned, fixed-function systems still outperform the above-mentioned systems for the same cost.
The three systems mentioned can theoretically reduce cost by removing redundancy from the system. Redundancy is removed by re-using computational blocks and memory. The only problem is that these systems themselves are increasingly complex, and therefore, their computational density when compared with fixed-function devices is very high.
Most systems on silicon are built up of complex blocks of functions that have varying data bandwidth and computational requirements. As data and control information move through the system, the processing bandwidth varies enormously. Regardless of the fact that the bandwidth varies, fixed-function systems have logic blocks that exhibit a "temporal redundancy" that can be exploited to drastically reduce the cost of the system. This is true because, in fixed-function implementations, all possible functional requirements of the necessary data processing have to be implemented on the silicon, regardless of the final application of the device or the nature of the data to be processed. Therefore, if a fixed-function device has to adaptively process data, it has to commit silicon resources to process all possible flavors of the data, even though only some of those resources are in use at any given time. Furthermore, state-variable storage in all fixed-function systems is implemented using area-inefficient storage elements such as latches and flip-flops.
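The temporal redundancy argument above can be made concrete with a small arithmetic sketch. The block names, area figures and schedule below are hypothetical values chosen only for illustration; they do not come from any real device. The sketch compares the area a fixed-function design must commit (every variant at once) with the largest area that is actually busy in any one time step.

```python
# Hypothetical blocks and relative silicon areas (arbitrary units).
BLOCK_AREA = {"mode_a_filter": 40, "mode_b_filter": 55, "mode_c_filter": 30}

# Hypothetical schedule: the set of blocks the incoming data needs at each time step.
SCHEDULE = [
    {"mode_a_filter"},
    {"mode_b_filter"},
    {"mode_a_filter"},
    {"mode_c_filter"},
]


def fixed_function_area(block_area):
    """Area committed when every variant is placed on silicon, used or not."""
    return sum(block_area.values())


def peak_active_area(block_area, schedule):
    """Largest area that is actually in use during any single time step."""
    return max(sum(block_area[name] for name in step) for step in schedule)


committed = fixed_function_area(BLOCK_AREA)
peak = peak_active_area(BLOCK_AREA, SCHEDULE)

# Fraction of the committed area that is idle even at the busiest time step.
idle_fraction = 1 - peak / committed
print(committed, peak, round(idle_fraction, 2))
```

Under these assumed numbers, more than half of the committed area is idle even at the busiest step; that idle share is the redundancy the passage describes.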
It is the object of the present invention to provide a new method and apparatus for implementing systems on silicon or other material which will give the user a means of achieving the performance of fixed-function implementations at a lower cost. The lower cost is achieved by removing redundancy from the system. The redundancy is removed by re-using groups of computational and storage elements in different configurations. The cost is further reduced by employing only static or dynamic RAM as a means for holding the state of the system. This invention provides a means of effectively adapting the configuration of the circuit to varying input data and processing requirements. All of this reconfiguration can take place dynamically at run time without any degradation of performance relative to fixed-function implementations.
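As a loose software analogy only, and not a description of the claimed apparatus, the idea of re-using one group of computational elements in different configurations while holding all state in RAM might be sketched as follows. The class `ReconfigurableUnit`, the two configurations and the RAM layout are all hypothetical names introduced for this illustration.

```python
# Hypothetical configurations: each maps (state, sample) -> (new_state, output).
def accumulate(state, sample):
    total = state + sample
    return total, total


def running_max(state, sample):
    largest = max(state, sample)
    return largest, largest


class ReconfigurableUnit:
    """One shared compute unit; per-task state lives in a small RAM array."""

    def __init__(self, configs, ram_words):
        self.configs = configs        # configuration name -> function
        self.ram = [0] * ram_words    # state RAM, one word per task context

    def step(self, config_name, ram_addr, sample):
        # Select a configuration, read state from RAM, compute, write state back.
        new_state, output = self.configs[config_name](self.ram[ram_addr], sample)
        self.ram[ram_addr] = new_state
        return output


unit = ReconfigurableUnit({"acc": accumulate, "max": running_max}, ram_words=2)

# The same unit alternates between two tasks; each keeps its state at its own address.
outputs = [
    unit.step("acc", 0, 3),
    unit.step("max", 1, 9),
    unit.step("acc", 0, 4),
    unit.step("max", 1, 5),
]
print(outputs, unit.ram)
```

The point of the sketch is that neither task needs dedicated logic or dedicated registers: the unit is reconfigured per step, and each task's state survives in RAM between its turns.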