1. Field of the Invention
The present invention relates to a data processor for high-speed digital signal processing and a method of processing data for high-speed digital signal processing.
2. Description of the Background Art
Digital signal processors (DSPs) having an architecture suitable for signal processing have been used as data processors designed specifically for high-speed digital signal processing. These DSPs execute, at high speed, processing frequently used in signal processing such as the multiply-add operation. An example of a DSP is the Motorola DSP56000. The DSP56000 includes two address pointers, two data memories, and a multiply-add operation unit. Parallel loading of data (e.g., the load of coefficients and data) from two 1-word memories specified respectively by the address pointers, updating of the two address pointers, and execution of the multiply-add operation are combined so that the multiply-add operation is executed with a high throughput (see DSP56000 Digital Signal Processor Family Manual, 1992). In this manner, the DSP normally has two memories, and data are distributed between the two memories. Some DSPs use a 2-port RAM for efficient data transfer.
An example of a microprocessor incorporating the DSP function is the Motorola CPU16. The CPU16 may repeatedly perform the multiply-add operation and a 2-word load in response to one RMAC instruction. However, because one multiply-add operation requires 12 cycles, it is difficult for the CPU16 to achieve performance competitive with the DSPs (CPU16 Reference Manual, 1993).
In recent years, as operating frequencies have improved, some microprocessors have been intended to implement signal processing by means of software. To improve the arithmetic performance, some of the microprocessors additionally provide multiply-add operation instructions and make the most of sophisticated parallel processing techniques such as superpipeline and superscalar to achieve DSP-level performance. For example, the PowerPC603 (Motorola and IBM) may execute a single-precision floating-point multiply-add operation with a throughput of one clock cycle by using 3-stage pipeline processing. This requires a large amount of hardware and significantly complicated control. To perform one multiply-add operation for each clock cycle, two words of data must be supplied per clock cycle. The PowerPC603 may load a maximum of one word for each clock cycle, resulting in an insufficient supply of operands (Proceedings of COMPCON 1994: "The PowerPC603 Microprocessor: A High Performance, Low Power, Superscalar RISC Microprocessor"; PowerPC603 RISC Microprocessor User's Manual, 1994).
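The operand-supply limit described above can be checked with simple arithmetic. The following is only an illustrative sketch: the function name `peak_mac_rate` and its parameters are hypothetical, and it assumes a single multiply-add unit that needs two freshly loaded operand words per operation, with load bandwidth as the only other limit.

```python
from fractions import Fraction

def peak_mac_rate(words_loaded_per_cycle, words_per_mac=2):
    """Upper bound on multiply-add operations per clock cycle.

    Assumes one multiply-add unit (at most one operation per cycle) and
    that each operation consumes `words_per_mac` newly loaded words.
    """
    supply_limit = Fraction(words_loaded_per_cycle, words_per_mac)
    return min(Fraction(1), supply_limit)

# Loading one word per cycle feeds at most half a multiply-add per cycle;
# loading two words per cycle is needed to sustain one per cycle.
one_word = peak_mac_rate(1)
two_words = peak_mac_rate(2)
```

Under these assumptions, a processor limited to a one-word load per cycle cannot keep a one-cycle multiply-add unit busy, which is the shortfall the passage attributes to the PowerPC603.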
The DSPs, which must include two memories, have a complicated memory construction and require very cumbersome data management for distribution of data between the two memories. The use of a 2-port RAM adds to the area and costs of the data processor. Additionally, the DSP is in general an accumulator machine, which makes it difficult to execute complicated data processing.
The microprocessors, which require only one memory, have a relatively simple memory construction. However, the microprocessors are not efficient in signal processing, unlike the DSPs wherein hardware directly represents the flow of signal processing. To achieve DSP-level performance, state-of-the-art microprocessors require an increased amount of hardware, adding to the costs of the data processor. Further, it is difficult to reduce the power consumption of the microprocessors because of the need for operation at high frequencies.
According to a first aspect of the present invention, a data processor comprises: a first memory portion for storing an instruction including a first operation code and a second operation code; a second memory portion for storing data; an instruction decode unit for receiving the instruction stored in the first memory portion, the instruction decode unit including first and second decoders for decoding the first and second operation codes in parallel, respectively; a register file portion including a plurality of registers for storing data to transfer data from and to the second memory portion; an operation unit for receiving first data stored in a first register of the register file portion to perform an arithmetic operation using the first data in response to a control signal, the control signal being the first operation code decoded by the first decoder of the instruction decode unit; and an operand access unit operated in parallel with the operation unit for causing second and third data stored in the second memory portion to be transferred in parallel and stored in second and third registers of the register file portion, respectively, in response to a control signal, the control signal being the second operation code decoded by the second decoder of the instruction decode unit.
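The parallel behavior of the first aspect may be pictured with a small software model. This is a minimal sketch under stated assumptions, not the claimed hardware: the mnemonics `mac`, `ld2w`, and `nop`, the register names, the use of a register to hold the load address, and the function `execute` are all hypothetical. Both halves of the instruction read the register values that existed before the instruction, which is how the sketch models the operation unit and the operand access unit working in parallel.

```python
def execute(instr, regs, mem):
    """Execute one instruction made of two operation codes.

    instr: (alu_op, mem_op), each a tuple starting with a mnemonic.
    regs:  dict of register name -> value (modified in place).
    mem:   list of words standing in for the second memory portion.
    """
    alu_op, mem_op = instr
    before = dict(regs)  # register state seen by both units

    # Operation unit, controlled by the first operation code.
    if alu_op[0] == "mac":  # hypothetical multiply-add mnemonic
        _, dst, src_a, src_b = alu_op
        regs[dst] = before[dst] + before[src_a] * before[src_b]
    elif alu_op[0] != "nop":
        raise ValueError(f"unknown operation code {alu_op[0]!r}")

    # Operand access unit, controlled by the second operation code.
    if mem_op[0] == "ld2w":  # hypothetical two-word load mnemonic
        _, reg_x, reg_y, addr_reg = mem_op
        addr = before[addr_reg]
        regs[reg_x], regs[reg_y] = mem[addr], mem[addr + 1]
    elif mem_op[0] != "nop":
        raise ValueError(f"unknown operation code {mem_op[0]!r}")
    return regs

regs = {"r0": 0, "r1": 3, "r2": 4, "r3": 0, "r4": 0, "r8": 0}
mem = [10, 20, 30, 40]
execute((("mac", "r0", "r1", "r2"), ("ld2w", "r3", "r4", "r8")), regs, mem)
```

After the call, `r0` holds the multiply-add result and `r3` and `r4` hold the two words loaded from the address in `r8`, all produced by a single instruction.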
Preferably, according to a second aspect of the present invention, the second and third data each are n bits (n is a natural number) in length, and the second and third data are combined together into 2n-bit data when the second and third data are transferred to the register file portion.
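Combining two n-bit data into one 2n-bit datum for transfer can be sketched as follows. The helper names `pack_pair` and `unpack_pair`, the default of n = 16, and the choice to place the first datum in the upper half are assumptions made for illustration; the text above does not specify the ordering.

```python
def pack_pair(first, second, n=16):
    """Combine two n-bit values into one 2n-bit value (first in the upper half)."""
    mask = (1 << n) - 1
    return ((first & mask) << n) | (second & mask)

def unpack_pair(word, n=16):
    """Split a 2n-bit value back into its two n-bit halves."""
    mask = (1 << n) - 1
    return (word >> n) & mask, word & mask

packed = pack_pair(0x1234, 0xABCD)
halves = unpack_pair(packed)
```

With n = 16, the two 16-bit values travel as one 32-bit word and are separated again on arrival.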
According to a third aspect of the present invention, a data processor comprises: a first memory portion for storing an instruction including a first operation code and a second operation code; a second memory portion for storing data; an instruction decode unit for receiving the instruction stored in the first memory portion, the instruction decode unit including first and second decoders for decoding the first and second operation codes in parallel, respectively; a register file portion including a plurality of registers for storing data to transfer data from and to the second memory portion; an operation unit for receiving first data stored in a first register of the register file portion to perform an arithmetic operation using the first data in response to a control signal, the control signal being the first operation code decoded by the first decoder of the instruction decode unit; and an operand access unit operated in parallel with the operation unit for causing second and third data stored respectively in second and third registers of the register file portion to be transferred in parallel and stored in the second memory portion in response to a control signal, the control signal being the second operation code decoded by the second decoder of the instruction decode unit.
Preferably, according to a fourth aspect of the present invention, the second and third data each are n bits (n is a natural number) in length, and the second and third data are combined together into 2n-bit data when the second and third data are transferred to the second memory portion.
Preferably, according to a fifth aspect of the present invention, the operation unit includes a multiplier for multiplying together the first data and fourth data stored in a fourth register of the register file portion, and an adder for adding at least two data together, the adder adding together the result of multiplication of the multiplier and data stored in a register of the register file portion to cause a register of the register file portion to store the result of addition.
Preferably, according to a sixth aspect of the present invention, the operation unit includes a multiplier for multiplying together the first data and fourth data stored in a fourth register of the register file portion, and an adder for adding at least two data together, the adder adding together the result of multiplication of the multiplier and data stored in a register of the register file portion to cause a register of the register file portion to store the result of addition.
Preferably, according to a seventh aspect of the present invention, the operation unit includes a multiplier for multiplying together the first data and fourth data stored in a fourth register of the register file portion, an adder for adding at least two data together, and an accumulator for holding a result of an operation, the adder adding together the result of multiplication of the multiplier and the data held in the accumulator to cause the accumulator to hold the result of addition.
Preferably, according to an eighth aspect of the present invention, the operation unit includes a multiplier for multiplying together the first data and fourth data stored in a fourth register of the register file portion, an adder for adding at least two data together, and an accumulator for holding a result of an operation, the adder adding together the result of multiplication of the multiplier and the data held in the accumulator to cause the accumulator to hold the result of addition.
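The two operation unit arrangements described in the fifth to eighth aspects differ only in where the running sum is kept. The sketch below contrasts them; the function `mac_register`, the class `AccumulatorUnit`, and the register names are hypothetical and serve only to illustrate the distinction.

```python
def mac_register(regs, dst, addend, src_a, src_b):
    """Multiply two registers, add a register, store the sum in a register."""
    regs[dst] = regs[addend] + regs[src_a] * regs[src_b]

class AccumulatorUnit:
    """Multiply two registers and add the product to a dedicated accumulator."""

    def __init__(self):
        self.acc = 0

    def mac(self, regs, src_a, src_b):
        self.acc = self.acc + regs[src_a] * regs[src_b]

regs = {"r0": 5, "r1": 3, "r2": 4}
mac_register(regs, "r0", "r0", "r1", "r2")  # sum kept in the register file

unit = AccumulatorUnit()
unit.mac(regs, "r1", "r2")                  # sum kept in the accumulator
unit.mac(regs, "r1", "r2")
```

In the first form the addend and the destination are ordinary registers, so any register may hold a partial sum; in the second form the partial sum stays inside the operation unit.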
According to a ninth aspect of the present invention, a data processor comprises: a memory portion for storing data; an instruction decode unit for receiving a first instruction including first and second operation codes and a second instruction including third and fourth operation codes and to be processed after the first instruction to decode the first and second operation codes and the third and fourth operation codes in parallel; a register file portion connected to the memory portion and including a plurality of registers each for storing data or an operand address; an operation unit for performing an arithmetic operation of the data stored in the register file portion; and a memory access portion operated in parallel with the operation unit for causing the operand address stored in the register file portion to be applied to the memory portion and for updating the operand address, wherein, in a first processing, the instruction decode unit receives the first instruction, and executed is parallel processing of (a) the operation unit to receive first data stored in a first register of the register file portion to perform an arithmetic operation in response to a control signal which is outputted from the instruction decode unit decoding the first operation code, and (b) the memory access portion to cause a first operand address stored in a second register of the register file portion to be applied to the memory portion to cause second data stored in the memory portion to be transferred to a third register of the register file portion in response to a control signal which is outputted from the instruction decode unit decoding the second operation code and to update the first operand address to write a second operand address into the second register in response to the control signal, and wherein, in a second processing, the instruction decode unit receives the second instruction, and executed is parallel processing of (c) the operation unit to receive the second data stored in the third register of the register file portion to perform an arithmetic operation in response to a control signal which is outputted from the instruction decode unit decoding the third operation code, and (d) the memory access portion to cause the second operand address stored in the second register of the register file portion to be applied to the memory portion to cause third data stored in the memory portion to be transferred to a fourth register of the register file portion in response to a control signal which is outputted from the instruction decode unit decoding the fourth operation code and to update the second operand address to write a third operand address into the second register in response to the control signal, the first processing and the second processing being executed by pipeline control.
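The overlap described in the ninth aspect, where each instruction operates on the datum loaded by the preceding instruction while loading the next datum and updating the operand address, can be sketched in software. This is an illustrative model only: the function `run_pipelined_sum`, the register names, the use of addition as the arithmetic operation, and an address update of one word per load are all assumptions, and `count` is assumed to be at least 1.

```python
def run_pipelined_sum(mem, start, count):
    """Sum mem[start:start+count], using each word one instruction after its load."""
    regs = {"acc": 0, "addr": start, "d0": 0, "d1": 0}
    data_regs = ["d0", "d1"]

    # Prologue: load the first word; nothing is available to operate on yet.
    regs["d0"] = mem[regs["addr"]]
    regs["addr"] += 1

    for k in range(count):
        use = data_regs[k % 2]         # filled by the previous instruction
        load = data_regs[(k + 1) % 2]  # filled by this instruction
        before = dict(regs)            # both units see the pre-instruction state

        # Operation unit: arithmetic on the previously loaded datum.
        regs["acc"] = before["acc"] + before[use]

        # Memory access portion: load the next datum and update the address.
        if k + 1 < count:
            regs[load] = mem[before["addr"]]
            regs["addr"] = before["addr"] + 1
    return regs["acc"], regs["addr"]

total, next_addr = run_pipelined_sum([1, 2, 3, 4, 5], start=0, count=5)
```

Alternating the destination register between `d0` and `d1` keeps the datum being consumed separate from the datum being loaded, so the load and the arithmetic operation of one instruction never touch the same register.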
A tenth aspect of the present invention is intended for a method of processing data by a data processor which includes a memory portion for storing data, a register file portion connected to the memory portion and including a plurality of registers each for storing data or an operand address, an operation unit for receiving the data stored in the register file portion to perform an arithmetic operation, and a memory access portion for causing the operand address stored in the register file portion to be applied to the memory portion. According to the present invention, the method comprises the steps of: (a) transferring first and second data stored in a first area of the memory portion in parallel to write the first and second data into first and second registers of the register file portion, respectively; (b) transferring third and fourth data stored in a second area of the memory portion in parallel to write the third and fourth data into third and fourth registers of the register file portion, respectively; (c) applying the first data stored in the first register and the third data stored in the third register to the operation unit to perform an arithmetic operation of the first and third data by the operation unit; and (d) applying the second data stored in the second register and the fourth data stored in the fourth register to the operation unit to perform an arithmetic operation of the second and fourth data by the operation unit.
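Steps (a) to (d) can be sketched as a loop over two arrays held in different areas of one memory. The sketch assumes that the arithmetic operation is a multiply-add into a running sum, which the tenth aspect itself does not require; the names `load_pair` and `dot_product_paired` and the base-address parameters are hypothetical.

```python
def load_pair(mem, addr):
    """Transfer two adjacent words from memory in one access."""
    return mem[addr], mem[addr + 1]

def dot_product_paired(mem, first_base, second_base, length):
    """Dot product of two arrays stored in two areas of the same memory.

    `length` must be even because each access transfers two words.
    """
    if length % 2:
        raise ValueError("length must be even")
    acc = 0
    for i in range(0, length, 2):
        r1, r2 = load_pair(mem, first_base + i)   # step (a): first area
        r3, r4 = load_pair(mem, second_base + i)  # step (b): second area
        acc += r1 * r3                            # step (c): first and third data
        acc += r2 * r4                            # step (d): second and fourth data
    return acc

mem = [1, 2, 3, 4, 5, 6, 7, 8]  # words 0..3: first area, words 4..7: second area
result = dot_product_paired(mem, first_base=0, second_base=4, length=4)
```

Two paired loads supply the four operands that two arithmetic operations consume, so a single memory suffices where the DSPs of the background art rely on two.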
Preferably, according to an eleventh aspect of the present invention, the method further comprises the steps of: (e) transferring fifth and sixth data stored in a third area of the memory portion in parallel to write the fifth and sixth data into fifth and sixth registers of the register file portion, respectively; and (f) transferring seventh and eighth data stored in a fourth area of the memory portion in parallel to write the seventh and eighth data into seventh and eighth registers of the register file portion, respectively, wherein one of the steps (c) and (d) is executed in parallel with at least one of the steps (e) and (f).
Preferably, according to a twelfth aspect of the present invention, the third area is the same as the first area, and the fourth area is the same as the second area.
Preferably, according to a thirteenth aspect of the present invention, the first and second data each are n bits (n is a natural number) in length, and the first and second data are combined together into 2n-bit data when the first and second data are transferred to the register file portion.
Preferably, according to a fourteenth aspect of the present invention, the step (c) comprises the steps of: multiplying the first and third data together; and adding data stored in a ninth register to the result of multiplication to store the result of addition as ninth data in the ninth register, and the step (d) comprises the steps of: multiplying the second and fourth data together; and adding the ninth data stored in the ninth register to the result of multiplication to store the result of addition in the ninth register.
In accordance with the first aspect of the present invention, the data processor comprises the instruction decode unit including the first and second decoders, the register file, the operation unit, and the operand access unit. The first and second operation codes are decoded and executed in parallel, and the arithmetic operation and the memory access to two data are executed in parallel, achieving high-speed data processing. In particular, a microprocessor achieves DSP-level signal processing performance.
The simple construction may reduce the costs of the data processor.
The parallel processing of the multiply-add operation instruction and the memory access to two data allows one multiply-add operation to be performed per clock cycle.
In accordance with the data processor of the ninth aspect of the present invention, a plurality of instructions are processed by means of a pipeline processing technique, each instruction including an operation code for specifying the transfer of a memory operand to the register file while updating an address, with the register contents used as the address, and an operation code for specifying the execution of the arithmetic operation with reference to a register value. This permits software to arrange the arithmetic operations so that they are executed without operand interference, improving the processing performance.
In accordance with the tenth aspect of the present invention, the method of processing data comprises loading the first and second data in parallel from the memory to the register, loading the third and fourth data in parallel from the memory to the register, performing the arithmetic operation of the first and third data, and performing the arithmetic operation of the second and fourth data. The access to the memory and the arithmetic operation are executed efficiently by using one memory, improving the performance of the data processor. In particular, digital signal processing performance is greatly improved under simple control.
It is therefore an object of the present invention to provide an inexpensive high-performance microprocessor-type data processor which readily reduces power consumption under relatively simple control.
It is another object of the present invention to provide a data processor having DSP-level digital signal processing performance.
It is still another object of the present invention to provide a method of processing data which may achieve high-performance data processing control.
These and other objects, features, aspects and advantages of the present invention will become more apparent from the following detailed description of the present invention when taken in conjunction with the accompanying drawings.