1. Field of the Invention
The present invention relates to a processor core that makes efficient use of an external extended arithmetic unit, and to a processor incorporating the same. More particularly, it relates to a technique for improving processing efficiency in a processor core that has an external extended arithmetic unit execute a part of an arithmetic operation instruction.
2. Description of the Background Art
FIG. 1 is a block diagram showing an example of a basic arrangement of a conventional processor.
The processor includes a processor core 70 and a data memory 80. The processor core 70 comprises a fetch PC (FPC) 701 for fetching an instruction, an address incrementer 702 for incrementing an address in an instruction memory, an instruction memory 703 for storing an instruction, an instruction register 704 for retaining an instruction read out from the instruction memory 703, a register file 705, a decode circuit 706 for decoding an instruction, a source (1) register 707 and a source (2) register 708, an execute stage control register 710, an arithmetic circuit 711 for executing an arithmetic operation, a pipeline control circuit 712 for controlling pipeline processing in the processor, a memory stage data register 713, a memory stage control register 715, a register write stage data register 716, a register write stage control register 717, and an instruction validating register 718.
The processor core 70 executes instructions using, for example, a 5-stage pipeline structure. Specifically, the pipeline stages in the processor core 70 are a pipeline stage 1 (instruction fetch stage), a pipeline stage 2 (register read stage), a pipeline stage 3 (execute stage), a pipeline stage 4 (memory stage), and a pipeline stage 5 (register write stage).
The instruction fetch stage is a stage where an instruction is read out from the instruction memory 703. The register read stage is a stage where the register specified by the instruction read out in the instruction fetch stage is read out from the register file 705 while the instruction is decoded by the decode circuit 706. The execute stage is a stage where the instruction is executed by the arithmetic circuit 711 in accordance with the register value read out in the register read stage and the decode information of the instruction. The memory stage is a stage where the data memory 80 is accessed when the instruction is a memory load or store instruction. The register write stage is a stage where the execution result of the execute stage, or the load data in the case of a memory load instruction, is written into the register file 705.
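The overlap of the five stages described above can be illustrated with a minimal sketch. It assumes an ideal pipeline with no stalls or hazards, in which each instruction advances one stage per cycle. The stage abbreviations (IF, RR, EX, MEM, WB) and the function names are hypothetical labels chosen for illustration and do not appear in the description.

```python
# Hypothetical abbreviations for the five stages named in the text:
# instruction fetch, register read, execute, memory, register write.
STAGES = ["IF", "RR", "EX", "MEM", "WB"]


def pipeline_schedule(instructions):
    """Return {cycle: {stage: instruction}} for an ideal, stall-free pipeline.

    Assumption: instruction i enters the instruction fetch stage in
    cycle i + 1 and moves to the next stage on every following cycle.
    """
    schedule = {}
    for index, instruction in enumerate(instructions):
        for depth, stage in enumerate(STAGES):
            cycle = index + depth + 1
            schedule.setdefault(cycle, {})[stage] = instruction
    return schedule


def total_cycles(instruction_count):
    """Cycles needed to drain the pipeline under the same assumption."""
    if instruction_count == 0:
        return 0
    return len(STAGES) + instruction_count - 1


if __name__ == "__main__":
    program = ["add", "load", "store"]
    for cycle, occupancy in sorted(pipeline_schedule(program).items()):
        print(cycle, occupancy)
```

Under this assumption, the third cycle has the first instruction in the execute stage while the second is in the register read stage and the third is being fetched.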
However, the conventional processor core merely executes pre-defined instructions, and is not provided with an extension function for connecting an arithmetic unit to its exterior. In other words, because the conventional processor is not provided with a mechanism or an interface signal for connecting an arithmetic unit to its exterior efficiently, the system performance cannot be improved by efficiently connecting an arithmetic unit suited to the application system to the processor core.
Integrating in advance an arithmetic unit suited to the application system, for example, a product-sum arithmetic circuit, into the processor core may eliminate the above problem. However, not all application systems use the product-sum arithmetic circuit. Hence, incorporating the product-sum arithmetic circuit in every processor core may produce useless hardware, thereby increasing the cost unnecessarily.
FIG. 2 is a block diagram showing a second arrangement of the conventional processor.
The processor of the arrangement shown in FIG. 2 is connected to an external coprocessor. The coprocessor receives an instruction directed to the coprocessor from the processor core, and executes the same.
The coprocessor includes a coprocessor register file 705b, a coprocessor source (1) register 707b and a coprocessor source (2) register 708b, a coprocessor arithmetic circuit 711b, and a pipeline register 720b.
The coprocessor executes the arithmetic operation specified by the instruction directed to the coprocessor by reading out the value of each register in the coprocessor register file 705b specified by that instruction, and using the read-out values as input data to the coprocessor arithmetic circuit 711b.
With the processor core of the arrangement shown in FIG. 2, the function can be extended by connecting the coprocessor to its exterior, but the data used in the arithmetic operation carried out by the coprocessor arithmetic circuit 711b is still limited to the content of the coprocessor register file 705b in the coprocessor. For this reason, the coprocessor arithmetic circuit 711b cannot execute an arithmetic operation by directly using the content of the register file 705 in the processor core.
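The consequence of this limitation can be sketched as follows. The sketch assumes a hypothetical transfer operation that copies a value between the register file of the processor core and the coprocessor register file; the description does not specify how such transfers are performed, and all class names, function names, and register indices below are illustrative assumptions.

```python
class RegisterFile:
    """A plain array of registers, standing in for a register file."""

    def __init__(self, size=8):
        self.values = [0] * size


class Coprocessor:
    """Toy coprocessor whose arithmetic circuit sees only its own registers."""

    def __init__(self):
        self.registers = RegisterFile()

    def multiply(self, dest, src1, src2):
        # Operands come only from the coprocessor register file.
        regs = self.registers.values
        regs[dest] = regs[src1] * regs[src2]


def multiply_core_registers(core_regs, coprocessor, dest, src1, src2):
    """Multiply two core registers via the coprocessor.

    Assumption: each copy between register files costs one hypothetical
    transfer instruction. Returns the number of instructions issued.
    """
    issued = 0
    cop = coprocessor.registers.values
    cop[0] = core_regs.values[src1]   # transfer operand 1 to the coprocessor
    issued += 1
    cop[1] = core_regs.values[src2]   # transfer operand 2 to the coprocessor
    issued += 1
    coprocessor.multiply(2, 0, 1)     # the arithmetic operation itself
    issued += 1
    core_regs.values[dest] = cop[2]   # transfer the result back to the core
    issued += 1
    return issued


if __name__ == "__main__":
    core = RegisterFile()
    core.values[1], core.values[2] = 6, 7
    count = multiply_core_registers(core, Coprocessor(), dest=3, src1=1, src2=2)
    print(core.values[3], count)
```

In this toy model, a single arithmetic operation on values held in the processor core is accompanied by three transfers, which is the kind of overhead that direct access to the register file of the processor core would avoid.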
As discussed above, the conventional processor core is not provided with an interface function for connecting an arithmetic unit to its exterior efficiently, and there has been a need for an extension function that allows an arithmetic operation to be executed efficiently by an external arithmetic unit connected to the processor core.