This invention relates to digital signal processors and, more particularly, to computational core architectures that facilitate both complex digital signal processing computations and microcontroller operations.
A digital signal computer, or digital signal processor (DSP), is a special purpose computer that is designed to optimize performance for digital signal processing applications, such as fast Fourier transforms, digital filters, image processing and speech recognition. Digital signal processor applications are typically characterized by real-time operation, high interrupt rates and intensive numeric computations. In addition, digital signal processor applications tend to be intensive in memory access operations and to require the input and output of large quantities of data. Digital signal processor architectures are typically optimized for performing such computations efficiently.
Microcontroller applications, by contrast, involve the handling of data but typically do not require extensive computation. Microcontroller application programs tend to be longer than DSP programs. In order to limit the memory requirements of microcontroller application programs, it is desirable to provide a high degree of code density in such programs. Thus, architectures that are optimized for DSP computations typically do not operate efficiently as microcontrollers. Also, microcontrollers typically do not perform well as digital signal processors. Nonetheless, a particular application may require both digital signal processor and microcontroller functionality.
Digital signal processor designs may be optimized with respect to different operating parameters, such as computation speed and power consumption, depending on intended applications. Furthermore, digital signal processors may be designed for 16-bit words, 32-bit words, or other word sizes. A 32-bit architecture that achieves very high operating speed is disclosed in U.S. Pat. No. 5,954,811 issued Sep. 21, 1999 to Garde.
Digital signal processors frequently utilize architectures wherein two or more data words are stored in each row of memory, and two or more data words are provided in parallel to the computation unit. Such architectures provide enhanced performance, because several instructions and/or operands may be accessed simultaneously.
Notwithstanding the performance levels of current digital signal processors, there is a need for further enhancements in digital signal processor performance.
According to a first aspect of the invention, a computation unit is provided. The computation unit is preferably configured for performing digital signal processor computations. The computation unit comprises an execution unit for performing an operation on a first operand and a second operand in response to an instruction, a register file for storing operands, first and second operand buses coupled to the register file, and first and second data selectors. The first and second operand buses each carry a high operand and a low operand. The first data selector supplies the high operand or the low operand from the first operand bus to the execution unit in response to a first operand select value contained in the instruction. The second data selector supplies the high operand or the low operand from the second operand bus to the execution unit in response to a second operand select value contained in the instruction.
The execution unit may comprise an arithmetic logic unit, a multiplier and an accumulator. In one embodiment, the register file comprises first and second register banks, each having two read ports and two write ports. In another embodiment, the register file comprises a single register bank having four read ports and four write ports.
According to another aspect of the invention, a computation unit is provided. The computation unit comprises an execution unit for performing an operation on first and second operands in response to an instruction, a register file for storing operands, an operand bus coupled to the register file, the operand bus carrying a high operand and a low operand, and a data selector, responsive to an operand select value contained in the instruction, for supplying the high operand or the low operand from the operand bus to the execution unit.
According to another aspect of the invention, a method is provided for performing a digital computation. The method comprises the steps of storing operands for the computation in a register file, supplying operands from the register file on first and second operand buses, each carrying a high operand and a low operand, selecting the high operand or the low operand from the first operand bus in response to a first operand select value contained in an instruction and supplying a selected first operand to an execution unit, selecting the high operand or the low operand from the second operand bus in response to a second operand select value contained in the instruction and supplying a selected second operand to the execution unit, and performing an operation specified by the instruction on the operands selected from the first and second operand buses.
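By way of illustration only, the operand selection described in the foregoing aspects can be modeled in a few lines of Python. The sketch assumes that each operand bus carries a 32-bit word whose halves are 16-bit operands, and that a select value of 1 chooses the high operand and 0 the low operand. The names `select_half` and `execute`, the operation strings, and these widths and encodings are assumptions made for the example and are not taken from the disclosure.

```python
MASK16 = 0xFFFF  # assumed operand width: 16 bits


def select_half(bus_word: int, select_high: int) -> int:
    """Model of a data selector: pick the high or low 16-bit operand of a bus word."""
    if select_high:
        return (bus_word >> 16) & MASK16
    return bus_word & MASK16


def execute(op: str, bus0: int, bus1: int, sel0: int, sel1: int) -> int:
    """Apply `op` to the operands chosen from each bus by the select values."""
    a = select_half(bus0, sel0)  # first data selector
    b = select_half(bus1, sel1)  # second data selector
    if op == "add":
        return (a + b) & MASK16  # 16-bit wraparound add
    if op == "mul":
        return a * b             # full-width product, no truncation
    raise ValueError("unsupported operation: " + op)
```

With bus words 0x00030004 and 0x00050006, selecting the high operand from the first bus and the low operand from the second bus adds 3 and 6; reversing both select values multiplies 4 by 5.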
According to another aspect of the invention, a digital signal processor computation unit is provided. The digital signal processor computation unit comprises first and second execution units for performing operations in response to an instruction and for producing first and second results, a result register for storing the results of the operations, the result register having first and second locations, and result swapping logic, coupled between the first and second execution units and the result register, for swapping the first and second results between the first and second locations in the result register in response to result swapping information contained in the instruction.
The first and second execution units may comprise first and second arithmetic logic units for performing add and subtract operations. The first and second execution units are separately controllable and may perform the same or different operations in response to operation code information contained in the instruction. The first and second arithmetic logic units may comprise 16-bit arithmetic logic units which are configurable as a 32-bit arithmetic logic unit. The first and second locations in the result register may comprise high and low halves of the result register. The result register may comprise a register in a register file.
According to another aspect of the invention, a method is provided for performing digital signal computations. The method comprises the steps of performing operations in first and second execution units in response to an instruction and producing first and second results, storing the results of the operations in a result register having first and second locations, and swapping the first and second results with respect to the first and second locations in the result register, in response to result swapping control information contained in the instruction.
According to another aspect of the invention, a digital signal processor computation unit is provided. The digital signal processor computation unit comprises first and second execution units for performing operations in response to an instruction and for producing first and second results, a result register for storing the results of the operations, the result register having first and second locations, and means for swapping the first and second results with respect to the first and second locations in the result register, in response to result swapping control information contained in the instruction.
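The result swapping described above may be pictured with the following minimal sketch. It assumes a 32-bit result register whose high and low halves serve as the first and second locations, and a single swap flag standing in for the result swapping information of the instruction. The function name `pack_results` and the flag encoding are hypothetical.

```python
MASK16 = 0xFFFF  # assumed width of each result location


def pack_results(first: int, second: int, swap: bool) -> int:
    """Store two 16-bit results in a 32-bit register, optionally swapped.

    Without swap, the first result goes to the high half and the second
    result to the low half; with swap, the halves are exchanged.
    """
    first &= MASK16
    second &= MASK16
    high, low = (second, first) if swap else (first, second)
    return (high << 16) | low
```

Because the swap is applied on the way into the result register, no separate instruction is needed to reorder the two halves afterwards.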
According to another aspect of the invention, a digital signal processor computation core is provided. The digital signal processor computation core comprises first and second execution units for performing first and second operations in response to control signals, and control logic for providing the control signals to the first and second execution units in response to control information contained in an instruction for individually controlling the first and second operations.
In one example, the first and second execution units comprise first and second arithmetic logic units. The first and second operations may be selected from add operations and subtract operations, and may be the same or different.
The computation core may further comprise a register file for storing operands and results of the first and second operations, and first and second operand buses coupled between the register file and the first and second execution units, each of the first and second operand buses carrying a high operand and a low operand, wherein the first execution unit performs the first operation on the high operands and the second execution unit performs the second operation on the low operands.
According to another aspect of the invention, a method is provided for performing digital signal computations. The method comprises the steps of performing first and second operations in first and second execution units, and individually controlling the first and second operations in response to control information contained in an instruction.
According to a further aspect of the invention, a digital signal processor computation core is provided. The digital signal processor computation core comprises first and second execution units for performing first and second operations in response to control signals, and means responsive to control information contained in an instruction for providing the control signals to the first and second execution units for individually controlling the first and second operations, wherein the first and second operations may be the same or different.
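As a hedged illustration of individually controlled execution units, the sketch below models two 16-bit arithmetic logic units, one operating on the high operands and one on the low operands of two 32-bit bus words. The operation strings "add" and "sub", the 16-bit wraparound behavior, and the names `alu16` and `dual_op` are assumptions introduced for this example.

```python
MASK16 = 0xFFFF  # assumed ALU width: 16 bits


def alu16(op: str, a: int, b: int) -> int:
    """One 16-bit ALU performing an add or a subtract with wraparound."""
    if op == "add":
        return (a + b) & MASK16
    if op == "sub":
        return (a - b) & MASK16
    raise ValueError("unsupported operation: " + op)


def dual_op(op_high: str, op_low: str, bus0: int, bus1: int) -> int:
    """Run independently selected operations on the high and low operands."""
    high = alu16(op_high, (bus0 >> 16) & MASK16, (bus1 >> 16) & MASK16)
    low = alu16(op_low, bus0 & MASK16, bus1 & MASK16)
    return (high << 16) | low
```

For example, `dual_op("add", "sub", 0x00050009, 0x00030004)` adds the high operands (5 + 3) and subtracts the low operands (9 - 4) in response to a single modeled instruction.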
According to a further aspect of the invention, a computation core is provided for executing programmed instructions. The computation core comprises an execution block for performing digital signal processor operations in response to digital signal processor instructions and for performing microcontroller operations in response to microcontroller instructions, a register file for storing operands for and results of the digital signal processor operations and the microcontroller operations, and control logic for providing control signals to the execution block and the register file in response to the digital signal processor instructions and the microcontroller instructions for executing the digital signal processor instructions and the microcontroller instructions.
Preferably, the digital signal processor instructions are configured for high efficiency digital signal computations, and the microcontroller instructions are configured for code storage density. In one example, the microcontroller instructions have a 16-bit format and the digital signal processor instructions have a 32-bit format. The digital signal processor instructions may contain information indicating whether one or more related instructions follow. The related instructions may comprise load instructions.
According to a further aspect of the invention, a method is provided for executing programmed instructions. The method comprises the steps of executing digital signal processor operations in an execution block in response to digital signal processor instructions configured for efficient digital signal computation, and executing microcontroller operations in the execution block in response to microcontroller instructions configured for code storage density. An application program having a mixture of digital signal processor instructions and microcontroller instructions is characterized by high code storage density and efficient digital signal computation.
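To illustrate how 16-bit and 32-bit instruction formats might coexist in one program, the following sketch splits a stream of 16-bit halfwords into instructions. The encoding is purely hypothetical: the sketch assumes that bit 15 of the first halfword marks a 32-bit digital signal processor instruction, which the disclosure does not state. The function name `split_instructions` and the "mcu"/"dsp" labels are likewise assumptions.

```python
def split_instructions(halfwords):
    """Split 16-bit halfwords into ("mcu", (h,)) and ("dsp", (h0, h1)) entries.

    Hypothetical encoding: bit 15 set in the first halfword means the
    instruction is 32 bits long and occupies two halfwords.
    """
    decoded = []
    i = 0
    while i < len(halfwords):
        first = halfwords[i]
        if first & 0x8000:  # assumed marker for a 32-bit instruction
            if i + 1 >= len(halfwords):
                raise ValueError("truncated 32-bit instruction")
            decoded.append(("dsp", (first, halfwords[i + 1])))
            i += 2
        else:
            decoded.append(("mcu", (first,)))
            i += 1
    return decoded
```

Under this assumed encoding, a program dominated by control code occupies roughly one halfword per instruction, while computation-intensive inner loops use the wider format only where it is needed.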
According to another aspect of the invention, a digital signal processor having a pipeline structure is provided. The digital signal processor comprises a computation block for executing computation instructions, the computation block having one or more computation stages of the pipeline structure, and a control block for fetching and decoding the computation instructions and for accessing a memory, the control block having one or more control stages of the pipeline structure. The computation stages and the control stages are positioned in the pipeline structure such that a result of the memory access is available to the computation stages without stalling the computation stages.
The computation stages and the control stages may be positioned in the pipeline structure so as to avoid stalling the computation stages when a computation instruction immediately follows a memory access instruction and requires the result of the memory access instruction. The computation stages and the control stages may be positioned in the pipeline structure such that the control block has one or more idle stages following completion of the memory access. The computation stages and the control stages may be positioned in the pipeline structure such that the computation block has one or more idle stages prior to a first computation stage.
According to another aspect of the invention, a method is provided for a digital signal computation. The method comprises the steps of executing computation operations in a computation block having one or more computation stages, executing control operations, including fetching instructions, decoding instructions and accessing a memory, in a control block having one or more control stages, wherein the computation stages and the control stages are configured in a pipeline structure, and positioning the computation stages relative to the control stages in the pipeline structure such that a result of a memory access is available to the computation stages without stalling the computation stages.
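The effect of stage positioning on stalls can be examined with a simplified timing model. The sketch assumes an in-order pipeline in which every stage takes one cycle, stages are numbered from 0, the loaded data becomes usable in the cycle after the memory access stage, and a dependent computation instruction needs the data when it enters the first computation stage. The function name `stall_cycles` and its parameters are hypothetical and are not drawn from the disclosure.

```python
def stall_cycles(mem_stage: int, first_compute_stage: int, issue_gap: int = 1) -> int:
    """Stall cycles for a compute instruction that depends on an earlier load.

    mem_stage: pipeline stage index in which the load accesses memory.
    first_compute_stage: stage index of the first computation stage.
    issue_gap: cycles between issuing the load and the dependent instruction
               (1 means the instruction immediately follows the load).
    """
    ready = mem_stage + 1                      # cycle, relative to load issue, when data is usable
    needed = issue_gap + first_compute_stage   # cycle when the dependent instruction needs it
    return max(0, ready - needed)
```

In this model, placing the first computation stage at or after the memory access stage gives zero stall cycles even when the computation instruction immediately follows the load, whereas an earlier computation stage incurs a stall equal to the difference in stage positions.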
According to a further aspect of the invention, a method is provided for determining an output of a finite impulse response digital filter having L filter coefficients in response to a set of M input samples. The method comprises the steps of (a) loading a first input sample into a first location in a first register, (b) loading a second input sample into a second location in the first register, (c) loading two coefficients into a second register, (d) computing intermediate results using the contents of the first and second registers, (e) loading a new input sample into the first location in the first register, (f) computing intermediate results using the contents of the first and second registers, (g) repeating steps (b)-(f) for L iterations to provide two output samples, and (h) repeating steps (a)-(g) for M/2 iterations to provide M output samples.
Step (d) may comprise a multiply accumulate operation on a first coefficient in the second register and the input sample in the first location in the first register, and a multiply accumulate operation on the first coefficient in the second register and the input sample in the second location in the first register. Step (f) may comprise a multiply accumulate operation on a second coefficient in the second register and the input sample in the first location in the first register, and a multiply accumulate operation on the second coefficient in the second register and the input sample in the second location in the first register.
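One way to picture the register usage of steps (a) through (h) is the Python sketch below, which computes two output samples at a time with two accumulators. Several points are assumptions of the sketch rather than statements of the method: samples before the start of the input are taken as zero, L and M are even, the assignment of each product to a particular accumulator is chosen so that the result matches the usual convolution sum, and the inner loop consumes two coefficients per pass and therefore makes L/2 passes. The name `fir_two_at_a_time` is hypothetical.

```python
def fir_two_at_a_time(x, c):
    """Compute y[n] = sum over k of c[k] * x[n - k] for n = 0..M-1, two outputs per pass."""
    L, M = len(c), len(x)
    if L % 2 or M % 2:
        raise ValueError("this sketch assumes even L and even M")

    def sample(i):
        # Assumed boundary condition: samples outside the input are zero.
        return x[i] if 0 <= i < M else 0

    y = []
    for n in range(0, M, 2):            # (h) one pass per pair of outputs
        acc0 = acc1 = 0                 # acc0 builds y[n], acc1 builds y[n + 1]
        first = sample(n + 1)           # (a) first location of the sample register
        for k in range(0, L, 2):        # (g) one pass per pair of coefficients
            second = sample(n - k)      # (b) second location of the sample register
            c0, c1 = c[k], c[k + 1]     # (c) two coefficients in the second register
            acc1 += c0 * first          # (d) first coefficient times both samples
            acc0 += c0 * second
            first = sample(n - k - 1)   # (e) new sample into the first location
            acc0 += c1 * first          # (f) second coefficient times both samples
            acc1 += c1 * second
        y.extend([acc0, acc1])
    return y
```

Each pass loads only one new sample per pair of multiply accumulate operations, because the sample left in the other location of the first register is reused by the next coefficient.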
It will be understood that the foregoing aspects of the invention may be practiced separately or in any combination.