A digital signal processor (DSP) is a special-purpose computer designed to optimize performance for digital signal processing applications, such as fast Fourier transforms, digital filters, image processing, signal processing in wireless systems and speech recognition. DSP applications are typically characterized by real-time operation, high interrupt rates and intensive numeric computations. In addition, they tend to be intensive in memory access operations and to require the input and output of large quantities of data. DSP architectures are typically optimized for performing such computations efficiently.
The core processor of a DSP typically includes a computation block, a program sequencer, an instruction decoder and all other elements required for performing digital signal computations. The computation block is the basic computation element of the DSP and typically includes one or more computation units, such as a multiplier and an arithmetic logic unit (ALU), and a register file.
Digital signal computations are frequently repetitive in nature. That is, the same or similar computations may be performed multiple times with different data. Because each computation is repeated many times, any increase in the speed of an individual computation is likely to provide a significant enhancement in the overall performance of the DSP.
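By way of illustration only, the repetitive character of such computations can be seen in a finite impulse response (FIR) digital filter, in which every output sample is formed by the same sequence of multiply-accumulate operations applied to different data. The following is a minimal sketch written in Python for readability; the function name `fir_filter` and the sample and coefficient values are hypothetical and are not taken from any particular DSP or from the reference cited herein.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR filter: y[n] = sum over k of coeffs[k] * samples[n - k].

    Hypothetical illustration; samples before the start of the input
    are assumed to be zero.
    """
    outputs = []
    for n in range(len(samples)):
        acc = 0  # accumulator for the current output sample
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                # One multiply-accumulate: the same operation is repeated
                # for every coefficient and for every output sample.
                acc += c * samples[n - k]
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    # Two-tap filter with unit coefficients applied to a short ramp.
    print(fir_filter([1, 2, 3, 4], [1, 1]))
```

For an input of N samples and a filter of K coefficients, the inner multiply-accumulate step executes on the order of N times K times, so a speedup of that single step carries over to nearly the entire computation.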
Some applications, such as base stations in wireless systems, have performance and timing requirements that exceed the capabilities of current DSPs. To meet these requirements, designers have used DSPs in combination with application-specific integrated circuits (ASICs) and/or field-programmable gate arrays (FPGAs). Such systems lack flexibility and are relatively expensive. Further, the required performance increases as next-generation wireless systems are introduced. In addition, high power dissipation is usually a problem in high-performance processors.
DSP designs may be optimized with respect to different operating parameters, such as computation speed, power consumption and ease of programming, depending on the intended applications. Furthermore, DSPs may be designed for different word sizes. A 32-bit architecture that utilizes a long instruction word and wide data buses, and that achieves high operating speed, is disclosed in U.S. Pat. No. 5,954,811, issued Sep. 21, 1999 to Garde, the entire disclosure of which is incorporated by reference herein. The core processor of that architecture includes dual computation blocks. Notwithstanding its very high performance, the disclosed processor does not provide an optimum solution for all applications.
Moreover, even DSPs that incorporate multiple computation blocks generally suffer from latency as instructions or data circulate and recirculate among all or a subset of the computation blocks.
Accordingly, there is a need for further innovations in DSP architecture and performance.