This invention relates to digital signal processors and, more particularly, to a digital signal processor having a computation block architecture that facilitates high speed digital signal computations.
A digital signal computer, or digital signal processor (DSP), is a special purpose computer that is designed to optimize performance for digital signal processing applications such as fast Fourier transforms, digital filters, image processing and speech recognition. Digital signal processor applications are typically characterized by real time operation, high interrupt rates and intensive numeric computations. In addition, digital signal processor applications tend to be intensive in memory access operations and to require the input and output of large quantities of data. Thus, designs of digital signal processors may be quite different from those of general purpose computers.
One approach that has been used in the architecture of digital signal processors to achieve high speed numeric computation is the Harvard architecture, which utilizes separate, independent program and data memories so that the two memories may be accessed simultaneously. This architecture permits an instruction and an operand to be fetched from memory in a single clock cycle. Frequently, the program occupies less memory space than the operands for the program. To achieve full memory utilization, a modified Harvard architecture utilizes the program memory for storing both instructions and operands. Typically, the program and data memories are interconnected with the core processor by separate program and data buses.
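By way of illustration only, the benefit of simultaneous program and data memory access can be sketched with a toy cycle count in Python. The function name `fetch_cycles`, the assumption that every instruction needs exactly one operand, and the assumption that each memory serves one access per clock cycle are hypothetical simplifications and do not describe any particular processor.

```python
def fetch_cycles(num_instructions, separate_memories):
    """Count memory cycles when each instruction needs one operand fetch.

    Assumption (hypothetical): each memory serves one access per clock cycle.
    """
    cycles = 0
    for _ in range(num_instructions):
        if separate_memories:
            # Harvard style: program and data memories are accessed in the same cycle.
            cycles += 1
        else:
            # Single shared memory: the instruction fetch and the operand fetch
            # must take turns.
            cycles += 2
    return cycles


harvard = fetch_cycles(8, separate_memories=True)
shared = fetch_cycles(8, separate_memories=False)
```

Under these assumptions the shared-memory case needs twice as many memory cycles as the case with separate memories.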
The core processor of a digital signal processor typically includes a computation block, a program sequencer, an instruction decoder and all other elements required for performing digital signal computations. The computation block is the basic computation element of the digital signal processor and typically includes one or more computation units, such as a multiplier and an arithmetic logic unit (ALU), and a register file. The register file receives operands from memory and supplies the operands to the computation units for use in the digital signal computations. The results of the digital signal computations are returned by the computation units to the register file for temporary storage. Final results are written to memory, and intermediate results are forwarded by the register file to one or more of the computation units for further computation.
Digital signal computations are frequently repetitive in nature. That is, the same or similar computations may be performed multiple times with different operands. Thus, any increase in the speed of individual computations is likely to provide significant enhancements in the performance of the digital signal processor.
Multiport register files which support flow-through of data, wherein data presented at an input port of the register file during a given clock cycle can be passed to an output port of the register file in the same cycle, are disclosed in U.S. Pat. No. 4,811,296, issued Mar. 7, 1989 to Garde and U.S. Pat. No. 5,111,431, issued May 5, 1992 to Garde. While the disclosed multiport register files exhibit generally satisfactory performance, it is desirable to provide computation block architectures with further performance enhancements.
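The flow-through behavior described above can be sketched, purely as an illustration, with a small Python model. The class `FlowThroughRegisterFile` and its `cycle` method are hypothetical names, and the single write port and single read port are assumptions of the sketch. It is not the circuit disclosed in the cited patents.

```python
class FlowThroughRegisterFile:
    """Toy register file whose read port can see a write made in the same cycle."""

    def __init__(self, size):
        self.regs = [0] * size

    def cycle(self, write=None, read=None):
        """Run one clock cycle.

        write: optional (index, value) pair presented at the input port.
        read:  optional register index requested at the output port.
        Returns the value seen at the output port, or None if nothing is read.
        """
        result = None
        if read is not None:
            if write is not None and write[0] == read:
                # Flow-through: the input port value is passed to the
                # output port in the same cycle.
                result = write[1]
            else:
                result = self.regs[read]
        if write is not None:
            index, value = write
            self.regs[index] = value
        return result


rf = FlowThroughRegisterFile(4)
same_cycle_value = rf.cycle(write=(2, 7), read=2)  # write and read register 2 together
```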
According to a first aspect of the invention, a computation block for performing digital signal computations is provided. The computation block comprises a register file for storage of operands and results of the digital signal computations, first and second computation units for executing the digital signal computations using the operands and producing the results, one or more operand buses each coupled between an operand output of the register file and an operand input of the first and second computation units, and one or more result buses each coupled to a result output of the first and second computation units, to an intermediate result input of the first and second computation units and to a result input of the register file. An intermediate result of the digital signal computation may be transferred directly from the result output of one of the computation units to the intermediate result inputs of one or both of the first and second computation units for use in a subsequent computation without first transferring the intermediate result to the register file.
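The role of the result bus can be illustrated with a minimal Python sketch, assuming a single shared result bus and two computation units. The class `ComputationBlock`, the `"BUS"` selector and the register names are hypothetical and serve only to show an intermediate result reaching a second unit without passing through the register file.

```python
class ComputationBlock:
    """Toy model: two units share one result bus that also feeds the register file."""

    def __init__(self):
        self.registers = {}       # register file contents
        self.result_bus = None    # value most recently driven by a computation unit
        self.register_reads = 0   # operand reads from the register file, for comparison

    def _operand(self, source):
        # "BUS" selects the intermediate result input; any other name is a register.
        if source == "BUS":
            return self.result_bus
        self.register_reads += 1
        return self.registers[source]

    def multiply(self, a, b):
        # First computation unit (multiplier) drives the result bus.
        self.result_bus = self._operand(a) * self._operand(b)

    def add(self, a, b):
        # Second computation unit (ALU) drives the result bus.
        self.result_bus = self._operand(a) + self._operand(b)

    def write_back(self, dest):
        # The result bus is also coupled to the result input of the register file.
        self.registers[dest] = self.result_bus


block = ComputationBlock()
block.registers.update(x=3, y=4, z=5)
block.multiply("x", "y")   # intermediate result 12 appears on the result bus
block.add("BUS", "z")      # ALU takes 12 directly from the bus, reads only z
block.write_back("out")    # final result is stored in the register file
```

In this sketch the product is never stored in the register file, and only three register reads are needed for the two computations.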
The first computation unit may comprise a multiplier for performing multiplication operations, and the second computation unit may comprise an ALU for performing arithmetic operations. The computation block may further include a third computation unit comprising a shifter for performing shifting operations. In a preferred embodiment, the computation block comprises two or more result buses each coupled to result outputs of one or more of the first and second computation units, to the intermediate result inputs of the first and second computation units and to result inputs of the register file.
Each computation unit may comprise a first latch coupled to the operand bus and the result bus, a first multiplexer having inputs coupled to the first latch, a second latch having inputs coupled to the operand bus and the result bus, a second multiplexer having inputs coupled to the second latch, a computation circuit receiving first and second operands from the first and second multiplexers, and an output latch having an input coupled to the computation circuit and having an output coupled to the result bus. The register file may comprise a plurality of registers, an operand latch having an input coupled to the registers and an output coupled to the operand bus and a result latch having an input coupled to the result bus and an output coupled to the registers.
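The latch and multiplexer arrangement of a computation unit can be sketched as follows. This is a minimal Python model under stated assumptions: the class `ComputationUnit`, its method names and the `"operand"`/`"result"` selector strings are hypothetical, and each input latch is assumed to capture one value from an operand bus and one value from the result bus per cycle.

```python
class ComputationUnit:
    """Toy model of one computation unit with two input latches and an output latch."""

    def __init__(self, circuit):
        self.circuit = circuit   # computation circuit, e.g. a multiply or add function
        self.latch1 = {"operand": 0, "result": 0}
        self.latch2 = {"operand": 0, "result": 0}
        self.output_latch = 0

    def capture(self, operand_a, operand_b, result_bus):
        # Each input latch holds both the operand bus value and the result bus value.
        self.latch1 = {"operand": operand_a, "result": result_bus}
        self.latch2 = {"operand": operand_b, "result": result_bus}

    def execute(self, select1, select2):
        # Each multiplexer picks either the latched operand or the latched result.
        a = self.latch1[select1]
        b = self.latch2[select2]
        # The computation circuit output is held in the output latch,
        # which drives the result bus.
        self.output_latch = self.circuit(a, b)
        return self.output_latch


alu = ComputationUnit(lambda a, b: a + b)
alu.capture(operand_a=2, operand_b=3, result_bus=10)
from_registers = alu.execute("operand", "operand")  # both inputs from the operand bus
with_forwarding = alu.execute("result", "operand")  # first input from the result bus
```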
According to a further aspect of the invention, a method is provided for operating a computation block that performs digital signal computations, the computation block comprising a register file for storage of operands and results and first and second computation units for executing the digital signal computations. A first digital signal computation is performed with the first computation unit, and an intermediate result is produced. The intermediate result is transferred from a result output of the first computation unit to an intermediate result input of the second computation unit without first transferring the intermediate result to the register file. A second digital signal computation is performed by the second computation unit using the intermediate result to produce a final result or a second intermediate result.
The intermediate result may be transferred from the result output of the first computation unit to an intermediate result input of the first computation unit without first transferring the intermediate result to the register file. The intermediate result may be used by the first computation unit to perform a third digital signal computation.
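The method can be illustrated with a dot product, in which each product from the multiplier is passed directly to the ALU and each partial sum is fed back to the ALU's own input. The following Python sketch is an assumption-laden illustration: the function names are hypothetical, and the baseline in which every intermediate result is first written to the register file is a simplified comparison case introduced only for this sketch.

```python
def dot_product_forwarded(xs, ys):
    """Return (result, register_file_writes) when intermediate results are forwarded."""
    accumulator = 0                # partial sum fed back to the ALU's own input
    for x, y in zip(xs, ys):
        product = x * y            # first computation: multiplier
        accumulator += product     # second computation: ALU uses the forwarded product
    register_file_writes = 1       # only the final result is written back
    return accumulator, register_file_writes


def dot_product_via_register_file(xs, ys):
    """Hypothetical baseline: every product and partial sum goes to the register file."""
    accumulator = 0
    register_file_writes = 0
    for x, y in zip(xs, ys):
        product = x * y
        register_file_writes += 1  # product stored before the ALU reads it
        accumulator += product
        register_file_writes += 1  # partial sum stored before the next addition
    return accumulator, register_file_writes


forwarded = dot_product_forwarded([1, 2, 3], [4, 5, 6])
baseline = dot_product_via_register_file([1, 2, 3], [4, 5, 6])
```

Both functions compute the same value. In this toy accounting, the forwarded version writes to the register file once, whereas the baseline writes twice per element.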
According to a further aspect of the invention, a computation block for performing digital signal computations is provided. The computation block comprises first and second computation units for executing the digital signal computations and a distributed register file for storage of operands and results of the digital signal computations. The distributed register file comprises a central register file portion coupled to the first and second computation units by one or more operand buses and by one or more result buses, and first and second local register file portions respectively associated with the first and second computation units. An intermediate result produced by one of the computation units may be transferred to the local portions of the distributed register file for use in subsequent digital signal computations without first transferring the intermediate result to the central portion of the distributed register file.
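A distributed register file of this kind can be sketched, for illustration only, as a central store plus one small local store per computation unit. In the Python model below, the class `DistributedRegisterFile`, its method names, the unit labels and the rule that a unit looks in its local portion before the central portion are all assumptions of the sketch rather than features stated above.

```python
class DistributedRegisterFile:
    """Toy model: a central portion plus one local portion per computation unit."""

    def __init__(self, units):
        self.central = {}
        self.local = {unit: {} for unit in units}
        self.central_writes = 0   # counted to show what bypasses the central portion

    def write_central(self, name, value):
        self.central[name] = value
        self.central_writes += 1

    def forward_to_locals(self, name, value):
        # An intermediate result is delivered to the local portions directly,
        # without first being written to the central portion.
        for registers in self.local.values():
            registers[name] = value

    def read(self, unit, name):
        # Assumption: a unit checks its own local portion first.
        if name in self.local[unit]:
            return self.local[unit][name]
        return self.central[name]


drf = DistributedRegisterFile(["multiplier", "alu"])
drf.write_central("x", 3)
drf.write_central("y", 4)
product = drf.read("multiplier", "x") * drf.read("multiplier", "y")
drf.forward_to_locals("p", product)      # intermediate result skips the central portion
total = drf.read("alu", "p") + drf.read("alu", "x")
```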