Advances in multimedia technology are placing new demands on the functionality of integrated circuits (ICs): chips must be capable of processing data flows at high speed, performing large numbers of high-speed addition, multiplication, fast Fourier transform (FFT), and discrete cosine transform (DCT) operations, and having their functionality updated quickly to meet fast-changing market demands.
Traditional general-purpose processors (CPUs) and digital signal processors (DSPs) are both functionally flexible. By updating the corresponding software, a CPU or DSP can be used for different applications to meet the needs of users. However, because the computing resources in a CPU are limited, the CPU does not have adequate capability for data flow processing or high throughput, and its range of applications is therefore limited. Even when a multi-core architecture is used in the CPU, the computing resources are still limited, and the degree of parallelism is also limited by the available software applications. Further, the allocation of computing resources in the CPU is restricted, and its throughput remains unsatisfactory.
Compared with the CPU, a DSP applies certain optimizations to its computing resources and adds more operation units. However, the computing resources of the DSP are still limited. In certain DSP chips, such as the asynchronous array of simple processors (ASAP) from UC Davis, multipliers, adders, shifters, and other components are implemented directly in a module, and the module is then reused so that a chip has a large number of computing resources. However, such a chip may have insufficient flexibility because the ways in which it can be configured are limited.
An application-specific integrated circuit (ASIC) chip is capable of data flow processing and high throughput, and can meet the demand for a large number of high-speed data operations. However, an ASIC chip often has a long design period and a high cost. For example, for a 90 nm ASIC chip, the non-recurring engineering (NRE) cost can easily exceed several million dollars. At the same time, the ASIC chip lacks flexibility and cannot change its functionality when market demands change. Instead, a new ASIC chip must be designed. If an ASIC chip is to support different operation modes, such as compatibility with different video decoding standards, a separate module may need to be designed for each video decoding standard and all of the modules integrated into a single chip, which may significantly increase cost.
Compared with the ASIC, a field programmable gate array (FPGA) is more flexible and can be configured according to different applications. Currently, the FPGAs on the market are mainly based on lookup tables (LUTs), and a chip designed based on an FPGA often has low NRE and low design cost. However, the FPGA is mainly used for random logic and can easily realize logic operations such as two-input or three-input exclusive AND and exclusive OR. When the FPGA is used for arithmetic operations such as multiplication, however, the synthesized multipliers and other operation units often occupy a large chip area. Although the FPGA often has its own multiplier, such as an 18×18 multiplier, it may be difficult to configure such a multiplier into a different multiplier, such as a 32×32 multiplier or an 8×8 multiplier. Further, when the FPGA is used in a design, the interconnection delay accounts for a large part of the total FPGA delay. For example, when Lattice's model LFSC25 FPGA is used to implement a cyclic redundancy check (CRC) calculation, the interconnection delay may account for 78.3% of the total delay. Therefore, the interconnection delay in the FPGA may substantially limit its performance. In addition, the FPGA interconnection delay is often not known until the design is mapped onto the FPGA. Thus, in an FPGA-based design, because even the approximate interconnection delay is not known at design time, the design may need to be revised several times in order to reduce the delay, which may prolong the design period.
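As a rough illustration of the multiplier-width point above, the following Python sketch shows the standard way a 32×32 unsigned product can be assembled from narrower partial products. It is not taken from the disclosure: the function name mul32_from_16 and the choice of a 16-bit split (so that each partial product would fit an 18×18 hardware multiplier) are assumptions made only for this example. It suggests why a wider multiply typically consumes several fixed-width multiplier blocks plus additional adders, rather than a single reconfigured block.

```python
MASK16 = 0xFFFF  # selects the low 16 bits of an operand


def mul32_from_16(a, b):
    """Multiply two unsigned 32-bit values using only 16x16 partial products.

    Hypothetical helper for illustration. Returns (product, multiplier_count),
    where multiplier_count is the number of narrow multiplications used.
    """
    if not (0 <= a < 2**32 and 0 <= b < 2**32):
        raise ValueError("operands must be unsigned 32-bit values")

    # Split each operand into a high and a low 16-bit half.
    a_lo, a_hi = a & MASK16, a >> 16
    b_lo, b_hi = b & MASK16, b >> 16

    # Each entry is (partial product, bit position it must be shifted to).
    # Every operand here is below 2**16, so each multiply is at most 16x16.
    partials = [
        (a_lo * b_lo, 0),
        (a_lo * b_hi, 16),
        (a_hi * b_lo, 16),
        (a_hi * b_hi, 32),
    ]

    # The shifted partial products are summed; in hardware this sum
    # corresponds to extra adder logic and routing between the blocks.
    product = sum(p << shift for p, shift in partials)
    return product, len(partials)


if __name__ == "__main__":
    product, count = mul32_from_16(123456789, 987654321)
    print(product == 123456789 * 987654321, count)
```

Under these assumptions, one 32×32 multiplication uses four narrow multiplications and the adders that combine them. Conversely, mapping an 8×8 multiplication onto a single 18×18 block would leave most of that block's width unused.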
The disclosed methods and systems are directed to solving one or more of the problems set forth above and other problems.