1. Field of the Invention
The present invention generally relates to digital signal processors, and more particularly to a multi-mode buffer for a digital signal processor.
2. Discussion of the Related Art
As is known, digital signal processors (DSPs) are used in a wide variety of practical applications. Although circuit architectures may vary from chip to chip, DSPs are generally characterized by a multiplier component. As is known, multipliers perform the multiplication operation at an extremely high rate of speed (often within a single clock cycle). In comparison, a typical microprocessor architecture, which contains shifters, adders, and accumulators, performs a number of shift, add, and accumulate operations to carry out a single multiplication operation. This manner of performing a single multiplication operation requires a relatively large number of clock cycles. As a result, arithmetic computations requiring many multiplication operations are preferably performed with a DSP.
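By way of illustration only, the shift, add, and accumulate approach described above may be sketched in software as follows. The function name `shift_add_multiply` and the use of a loop-iteration count as a rough stand-in for clock cycles are assumptions made for this sketch; it does not model any particular microprocessor.

```python
from typing import Tuple


def shift_add_multiply(a: int, b: int) -> Tuple[int, int]:
    """Multiply two non-negative integers using only shifts and adds.

    Returns (product, steps). `steps` counts loop iterations and serves
    only as a rough, hypothetical proxy for clock cycles on hardware
    that lacks a dedicated multiplier.
    """
    if a < 0 or b < 0:
        raise ValueError("this sketch handles non-negative operands only")
    product = 0
    steps = 0
    while b:
        if b & 1:          # lowest multiplier bit set: accumulate the multiplicand
            product += a
        a <<= 1            # shift the multiplicand left for the next bit position
        b >>= 1            # shift the multiplier right to expose the next bit
        steps += 1
    return product, steps


if __name__ == "__main__":
    print(shift_add_multiply(13, 11))
```

In this sketch the number of iterations grows with the bit length of the multiplier operand, whereas a dedicated hardware multiplier of the kind described above may produce the product in a single clock cycle.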
As merely one example, DSP chips are used in electronic communications, and virtually all modems include an on-board DSP chip. As is known by those skilled in the communications art, the coding, filtering, error-correction, and other processes associated with electronic communications all demand relatively extensive mathematical computations. In order to achieve the desired speed for communications--and the faster, the better--DSP chips are used to perform this processing.
To facilitate this discussion, reference is made to FIG. 1, which illustrates a very basic DSP architecture, as is known in the prior art. As will be appreciated by those skilled in the art, many other processing and control elements are present in an actual DSP chip. However, only those relevant to the illustration of the present invention have been depicted herein.
In the microprocessor art, the Harvard architecture has been well known for years. This Harvard architecture employs two separate memory devices: one memory for storing instructions, and one memory for storing data. A similar architecture is typically employed in DSP architectures, wherein one memory is used to store data and one memory is used to store coefficients. More specifically, when repeatedly calculating equations of the form:
output = coefficient × data,
one memory 12 is configured to store coefficient values and the other memory 14 is used to store data values. In this regard, the bit length of the coefficient values is often different than the bit length of the data values. Thus, for example, the coefficient memory 12 may be n bits in size (data path) and the data memory 14 may be m bits in size. In such an architecture, however, it is important to synchronize or coordinate the storage and retrieval of data and coefficient values to and from the respective memories.
Registers or buffers 16 and 18 are disposed in communication with the memories 12 and 14, such that data read out from the memories 12 and 14 may be clocked into the buffers 16 and 18. A multiplier 20 is configured to multiply the output values of the buffers 16 and 18. Then, an adder 22 and accumulator 24 combination sums the values successively output from the multiplier 20. In this regard, the adder 22 is typically an asynchronous device, and the accumulator 24 is a registered device (although they may be combined into a single registered adder). Since an accumulated value is fed back to the adder, registering the summed value allows for controlled addition (without the fed-back value over-accumulating).
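The multiply and accumulate path described above may be modeled, purely as a behavioral sketch, as follows. The function name `multiply_accumulate` and the treatment of each loop iteration as one clock cycle are assumptions of this sketch, and no timing or bit-width behavior of an actual device is implied.

```python
def multiply_accumulate(coefficients, data):
    """Behavioral sketch of the multiplier 20, adder 22, accumulator 24 path.

    Each loop iteration stands in for one clock cycle (an assumption of
    this sketch). The accumulator is updated exactly once per iteration,
    mirroring a registered device whose output is fed back to the adder.
    """
    if len(coefficients) != len(data):
        raise ValueError("coefficient and data sequences must be the same length")
    accumulator = 0                      # registered value (accumulator 24)
    for coefficient, value in zip(coefficients, data):
        product = coefficient * value    # multiplier 20
        total = accumulator + product    # adder 22 (combinational)
        accumulator = total              # latched once per clock into accumulator 24
    return accumulator


if __name__ == "__main__":
    print(multiply_accumulate([1, 2, 3], [4, 5, 6]))
```

Because the accumulator is assigned only once per iteration, the fed-back value is added exactly once per product, which corresponds to the controlled addition noted above.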
Therefore, a simple multiplication operation on this type of architecture may be carried out in the sequence set forth immediately below.
______________________________________
Clk Cycle   Instruction(s)
______________________________________
1           fetch(coefficient); fetch(data)
2           coefficient^real × data^real
3           coefficient^imaginary × data^real
4           coefficient^real × data^imaginary
5           coefficient^imaginary × data^imaginary; fetch(coefficient); fetch(data)
. . .       . . .
______________________________________
In the first clock cycle, data may be fetched (or loaded) from the memory devices 12 and 14 into the buffers 16 and 18. Then, the multiplication operation is carried out. Since arithmetic computations often involve numbers having both real and imaginary components, the example used herein illustrates the computation accordingly. In this regard, the buffers 16 and 18 are illustrated in groups of two, wherein one buffer (of each pair) may carry the real component and the other may carry the imaginary component. Alternatively, the buffer may simply be large enough to hold both values; for example, a sixteen bit buffer may be used, wherein the first eight bits hold the real component and the last eight bits hold the imaginary component. As is known, this multiplication of two complex numbers actually requires four separate multiplication operations, as shown above. Then, commensurate with the last multiplication operation, the data and coefficient values for the next multiplication operation may be read into the buffers.
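The four multiplication operations, and the alternative of holding both components in a single sixteen bit buffer, may be illustrated with the following sketch. The function names, the placement of the real component in the upper eight bits, and the use of signed two's complement values are assumptions of this sketch and are not stated in the description above. The combination of the four partial products into real and imaginary results follows the ordinary rule for complex multiplication.

```python
def complex_multiply(coefficient, data):
    """Multiply two complex values given as (real, imaginary) pairs.

    The four real multiplications are issued in the same order as clock
    cycles 2 through 5 of the sequence shown above.
    """
    c_re, c_im = coefficient
    d_re, d_im = data
    rr = c_re * d_re   # cycle 2: coefficient real x data real
    ir = c_im * d_re   # cycle 3: coefficient imaginary x data real
    ri = c_re * d_im   # cycle 4: coefficient real x data imaginary
    ii = c_im * d_im   # cycle 5: coefficient imaginary x data imaginary
    return (rr - ii, ir + ri)


def pack(real, imag):
    """Pack two signed 8-bit values into one 16-bit word.

    Assumption: the real component occupies the upper eight bits.
    """
    if not (-128 <= real <= 127 and -128 <= imag <= 127):
        raise ValueError("components must fit in signed 8 bits")
    return ((real & 0xFF) << 8) | (imag & 0xFF)


def unpack(word):
    """Recover the (real, imaginary) pair from a 16-bit word built by pack()."""
    def to_signed(byte):
        return byte - 256 if byte & 0x80 else byte
    return to_signed((word >> 8) & 0xFF), to_signed(word & 0xFF)


if __name__ == "__main__":
    coefficient = unpack(pack(1, 2))
    data = unpack(pack(3, 4))
    print(complex_multiply(coefficient, data))
```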
It has been found that this prior art structure does, however, suffer various shortcomings. For example, if the two memories share a common address bus and data bus, it will take longer to load the data and coefficient values into the memories, since the memories must be written to one at a time. Alternatively, if the memories are designed on separate address and data busses, the circuitry for controlling the loading and operations becomes more complex, as separate address and control logic must be employed, then synchronized with each other to ensure that the data and coefficient values are stored and read appropriately.
Accordingly, there is a need to provide an improved DSP architecture that overcomes these and other related shortcomings of the prior art.