1. Field of the Invention
The present invention relates to a processor core for using an external extended arithmetic unit efficiently and a processor incorporating the same, and more particularly to a technique to improve processing efficiency in a processor core that makes an external extended arithmetic unit execute a part of an arithmetic operation instruction by using the external extended arithmetic unit efficiently.
2. Description of the Background Art
FIG. 1 is a block diagram showing an example of a basic arrangement of a conventional processor.
The processor includes a processor core 70 and a data memory 80. The processor core 70 comprises a fetch PC (FPC) 701 for fetching an instruction, an address incrementer 702 for incrementing an address in an instruction memory, an instruction memory 703 for storing an instruction, an instruction register 704 for retaining an instruction read out from the instruction memory 703, a register file 705, a decode circuit 706 for decoding an instruction, a source (1) register 707 and a source (2) register 708, an execute stage control register 710, an arithmetic circuit 711 for executing an arithmetic operation, a pipeline control circuit 712 for controlling pipeline processing in the processor, a memory stage data register 713, a memory stage control register 715, a register write stage data register 716, a register write stage control register 717, and an instruction validating register 718.
The processor core 70 executes an instruction by, for example, a 5-stage pipeline structure. In other words, in the processor core 70, the pipeline stages include a pipeline stage 1 (instruction fetch stage), a pipeline stage 2 (register read stage), a pipeline stage 3 (execute stage), a pipeline stage 4 (memory stage), and a pipeline stage 5 (register write stage).
The instruction fetch stage is a stage where an instruction is read out from the instruction memory 703, and the register read stage is a stage where a register specified by the instruction read out in the instruction fetch stage is read out from the register file 705 while the instruction is decoded by the decode circuit 706. The execute stage is a stage where the instruction is executed by the arithmetic circuit 711 in accordance with a value in the register read out in the register read stage and the decode information of the instruction, and the memory stage is a stage where an access is made to the data memory 80 in case that the instruction is a memory load or store instruction. The register write stage is a stage where an execution result in the execute stage or load data in case of a memory load instruction is written into the register file 705.
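The overlap of the five stages described above can be illustrated by a minimal sketch. The stage names follow the description; the function name `pipeline_timeline`, the instruction mnemonics, and the assumption of an ideal pipeline with no stalls or flushes are illustrative only and do not appear in the conventional processor as described.

```python
# Illustrative model only: an ideal 5-stage pipeline with no stalls or flushes.
STAGES = ["fetch", "register_read", "execute", "memory", "register_write"]


def pipeline_timeline(instructions):
    """Return, for each clock cycle, the instruction occupying each stage."""
    n_cycles = len(instructions) + len(STAGES) - 1
    timeline = []
    for cycle in range(n_cycles):
        row = {}
        for depth, stage in enumerate(STAGES):
            # The instruction in this stage entered the pipeline `depth` cycles ago.
            idx = cycle - depth
            row[stage] = instructions[idx] if 0 <= idx < len(instructions) else None
        timeline.append(row)
    return timeline


# Hypothetical three-instruction program.
timeline = pipeline_timeline(["add", "load", "store"])
```

In this sketch, each instruction advances by one stage per clock, so that up to five instructions are in flight at once.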
However, the conventional processor core merely executes a pre-defined instruction, and is not provided with an extended function for connecting an arithmetic unit to its exterior. In other words, because the conventional processor is not provided with a mechanism or an interface signal for connecting an arithmetic unit to its exterior efficiently, the system performance cannot be improved by connecting an arithmetic unit suitable for application systems to the processor core efficiently.
Pre-integrating an arithmetic unit suitable for application systems, for example, a product-sum arithmetic circuit, into the processor core may eliminate the above problem. However, not all application systems use the product-sum arithmetic circuit. Hence, incorporating the product-sum arithmetic circuit in every processor core may produce useless hardware, thereby increasing the cost unnecessarily.
FIG. 2 is a block diagram showing a second arrangement of the conventional processor.
The processor of the arrangement shown in FIG. 2 is connected to a coprocessor at its exterior. The coprocessor receives an instruction directed to the coprocessor from the processor core and executes the same.
The coprocessor includes in its interior a coprocessor register file 705b, a coprocessor source (1) register 707b and a coprocessor source (2) register 708b, a coprocessor arithmetic circuit 711b, and a pipeline register 720b. 
The coprocessor executes an arithmetic operation specified by the instruction directed to the coprocessor by reading out a value in each register in the coprocessor register file 705b specified by the above instruction, and using the read out values as input data to the coprocessor arithmetic circuit 711b. 
With the processor core of the arrangement shown in FIG. 2, by connecting the coprocessor to its exterior, the function can be extended, but data used in the arithmetic operation carried out by the coprocessor arithmetic circuit 711b is still limited to the content of the coprocessor register file 705b in the coprocessor. For this reason, the coprocessor arithmetic circuit 711b is not allowed to execute an arithmetic operation by directly using the content of the processor register file 705 in the processor core.
As has been discussed, the conventional processor core is not provided with an interface function for connecting an arithmetic unit to its exterior efficiently, and there has been a need for an extended function for executing an arithmetic operation efficiently by using an external arithmetic unit connected to the processor core.
It is therefore an object of the present invention to provide a processor core for connecting an arithmetic unit to its exterior efficiently, so that the system performance can be improved drastically without increasing the size thereof by connecting the arithmetic unit to its exterior to furnish an efficient interface function therebetween.
To achieve the object, an aspect of the invention provides a processor, comprising: a processor core; a data memory accessed by the processor core; and an extended arithmetic unit, connected to an exterior of the processor core, for processing a particular instruction, the extended arithmetic unit executing an arithmetic operation by using arithmetic operation data retained in a register file in the processor core and outputting a result of an arithmetic operation directly to the processor core, the processor core saving the result of the arithmetic operation executed by the extended arithmetic unit and inputted therefrom in the register file in the processor core.
Another aspect of the invention provides a processor, comprising: a processor core; a data memory accessed by the processor core; and an extended arithmetic unit, connected to an exterior of the processor core, for processing a particular instruction, the processor core, at least including: an instruction memory for storing an instruction to be executed; an instruction decode unit for reading out an instruction from the instruction memory to decode the instruction, in case that the instruction decoded is an extended arithmetic unit control instruction that should be executed by the extended arithmetic unit connected to the exterior of the processor core, the instruction decode unit also outputting at least an instruction code of the extended arithmetic unit control instruction to the extended arithmetic unit; a register file for retaining arithmetic operation data of an arithmetic operation that should be executed by the instruction decoded, in case that the arithmetic operation data is data of the extended arithmetic unit control instruction, the register file also outputting the arithmetic operation data to the extended arithmetic unit; a first operational section for executing the instruction decoded; and an extended arithmetic unit, at least including, a second operational section for executing an arithmetic operation specified by the extended arithmetic unit control instruction by using the arithmetic operation data retained in the register, and outputting an execution result of the arithmetic operation to the processor core.
Preferably, in case that the instruction decoded is the extended arithmetic unit control instruction, the processor core outputs to the extended arithmetic unit at least an instruction code that specifies an action involved in an arithmetic operation in the extended arithmetic unit and an instruction valid signal that indicates the instruction code is valid.
Preferably, the arithmetic operation data outputted to the extended arithmetic unit is a value read out from the register file in the processor core in accordance with a register number specified by a part of the extended arithmetic unit control instruction.
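The hand-off described above, in which the processor core reads operands from its own register file and passes them to the extended arithmetic unit together with an instruction code and an instruction valid signal, may be sketched as follows. The tuple encoding `(opcode, code, rd, rs1, rs2)`, the operation names `mul`, `max`, and `add`, and the function names are hypothetical and are introduced only for illustration.

```python
# Illustrative sketch only. The instruction encoding (opcode, code, rd, rs1, rs2)
# and the operation names are assumptions, not taken from the description.
EXT_OPCODE = "ext"


def extended_unit(code, valid, src1, src2):
    """Stand-in for the external extended arithmetic unit."""
    if not valid:
        return None  # the instruction code is ignored unless marked valid
    if code == "mul":
        return src1 * src2
    if code == "max":
        return max(src1, src2)
    raise ValueError("unknown extended instruction code: " + code)


def core_execute(regs, instr):
    """Execute one instruction, delegating extended instructions to the unit."""
    opcode, code, rd, rs1, rs2 = instr
    # Operands come from the core's own register file in both cases.
    src1, src2 = regs[rs1], regs[rs2]
    if opcode == EXT_OPCODE:
        result = extended_unit(code, True, src1, src2)
    elif code == "add":
        result = src1 + src2
    else:
        raise ValueError("unknown core instruction code: " + code)
    regs[rd] = result  # the result is saved in the core's register file
    return regs


regs = core_execute([0, 3, 4, 0], ("ext", "mul", 3, 1, 2))
```

The point of the sketch is that the extended unit holds no register file of its own: both its operands and its result pass through the register file in the processor core.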
Preferably, the processor core includes a pipeline control unit for controlling pipeline processing in an interior of the processor core and in the extended arithmetic unit.
Preferably, the pipeline control unit outputs to the extended arithmetic unit a first pipeline stop signal for suspending execution of an instruction therein.
Preferably, the pipeline control unit outputs to the extended arithmetic unit a pipeline flush signal for abandoning execution of an instruction outputted thereto.
Preferably, the pipeline control unit stops execution of an instruction in the processor core in accordance with a second pipeline stop signal inputted from the extended arithmetic unit, the second pipeline stop signal being for suspending execution of an instruction executed by the processor core.
Preferably, the extended arithmetic unit outputs to the processor core an arithmetic operation result invalidating signal that invalidates an execution result of an arithmetic operation executed therein.
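The effect of the first pipeline stop signal, the pipeline flush signal, the second pipeline stop signal, and the arithmetic operation result invalidating signal may be sketched as follows. The class and function names are hypothetical, and the priority of flush over stop when both are asserted is an assumption not stated in the description.

```python
# Illustrative sketch only; all names are hypothetical.

class ExtUnitStage:
    """One pipeline register inside a modelled extended arithmetic unit."""

    def __init__(self):
        self.held = None  # instruction currently held; None means empty

    def clock(self, incoming, stop=False, flush=False):
        """Advance one clock under the core's stop and flush signals."""
        if flush:
            self.held = None      # abandon the instruction (flush wins: assumption)
        elif not stop:
            self.held = incoming  # normal advance
        # When only stop is asserted, the held instruction is kept.
        return self.held


def core_next_pc(pc, ext_stop):
    """The core holds its fetch address while the unit asserts its stop signal."""
    return pc if ext_stop else pc + 1


def write_back(regs, rd, result, invalidate):
    """Write the unit's result unless the invalidating signal is asserted."""
    if not invalidate:
        regs[rd] = result
    return regs
```

For example, a stage holding `"mul"` keeps it while `stop` is asserted and becomes empty when `flush` is asserted.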
Preferably, the data memory receives from the extended arithmetic unit at least one of an address in memory access, data, a write control signal for controlling data writing, and a read control signal for controlling data reading; reads out the data from a region specified by the address and outputs the data to the extended arithmetic unit in case that data reading is carried out because the read control signal is asserted; and writes the data inputted from the extended arithmetic unit into a region specified by the address in case that data writing is carried out because the write control signal is asserted.
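The data memory interface described above may be sketched as follows. The class name `DataMemory`, the method name `access`, and the rejection of a simultaneous read and write are assumptions made for illustration; the description does not state how the two control signals interact.

```python
# Illustrative sketch only; names and the read/write exclusivity are assumptions.

class DataMemory:
    """Word-addressed memory driven by an address, data, and two control signals."""

    def __init__(self, size):
        self.cells = [0] * size

    def access(self, address, data=None, write=False, read=False):
        """Perform one access as seen from the extended arithmetic unit."""
        if write and read:
            raise ValueError("read and write asserted together (not modelled)")
        if write:
            self.cells[address] = data   # store data into the addressed region
            return None
        if read:
            return self.cells[address]   # return data from the addressed region
        return None                      # neither signal asserted: no access


mem = DataMemory(8)
mem.access(3, data=42, write=True)
value = mem.access(3, read=True)
```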
Preferably, the extended arithmetic unit includes: a plurality of arithmetic circuits; a first pipeline register for storing a processing result by an arithmetic circuit in a preceding stage at a rising of a following clock; and a second pipeline register for storing a processing result by an arithmetic circuit in a succeeding stage at the rising of the following clock.
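A two-stage arrangement of this kind may be sketched as follows. The operations chosen for the two arithmetic circuits (a multiplication followed by a clamp to the range 0 to 255) and all names are hypothetical stand-ins; only the two pipeline registers latching on the rising clock edge are taken from the description.

```python
# Illustrative sketch only; the stage operations and names are assumptions.

class TwoStageUnit:
    """Two arithmetic circuits, each followed by a pipeline register."""

    def __init__(self):
        self.reg1 = None  # first pipeline register (output of the preceding stage)
        self.reg2 = None  # second pipeline register (output of the succeeding stage)

    def rising_edge(self, a, b):
        """Latch both registers at once; pass None operands to insert a bubble."""
        # Both registers update on the same edge, so the succeeding stage must
        # consume the value reg1 held before this edge.
        next_reg2 = None if self.reg1 is None else max(0, min(255, self.reg1))
        next_reg1 = None if a is None else a * b
        self.reg1, self.reg2 = next_reg1, next_reg2
        return self.reg2


unit = TwoStageUnit()
outputs = [
    unit.rising_edge(20, 20),      # first operation enters the preceding stage
    unit.rising_edge(2, 3),        # first result leaves; second operation enters
    unit.rising_edge(None, None),  # second result leaves
]
```

In this sketch a result appears two clock edges after its operands are presented, while a new operation can be accepted on every edge.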
Still another aspect of the present invention provides a processor core connected to an extended arithmetic unit for processing a particular instruction to an exterior thereof, comprising: an instruction memory for storing an instruction to be executed; an instruction decode unit for reading out an instruction from the instruction memory to decode the instruction, in case that the instruction decoded is an extended arithmetic unit control instruction that should be executed by the extended arithmetic unit connected to the exterior of the processor core, the instruction decode unit also outputting at least an instruction code of the extended arithmetic unit control instruction to the extended arithmetic unit; a register file for retaining arithmetic operation data of an arithmetic operation that should be executed by the instruction decoded, and in case that the arithmetic operation data is data for the extended arithmetic unit control instruction, the register file also outputting the arithmetic operation data to the extended arithmetic unit and storing a result of an arithmetic operation executed in the extended arithmetic unit.
Preferably, in case that the instruction decoded is the extended arithmetic unit control instruction, the instruction decode unit outputs to the extended arithmetic unit at least an instruction code that specifies an action involved in an arithmetic operation by the extended arithmetic unit and an instruction valid signal that indicates the instruction code is valid.
Other features and advantages of the present invention will become apparent from the following description taken in conjunction with the accompanying drawings.