1. Technical Field
The present invention relates to a semiconductor device having a main processor and a coprocessor for data processing, and more particularly to a semiconductor device having a main program memory and a coprocessor program memory for storing different types of instructions.
2. Discussion of Related Art
In a data processing device having a CPU or a main processor, a microprocessor distinct from the main processor, called a coprocessor, is used to perform specified functions that the CPU cannot perform or cannot perform as well and/or as quickly. For example, an FPU (floating point unit) is a special purpose coprocessor used mainly to perform floating-point operations, vector operations and scalar operations. While a CPU can perform these functions, the coprocessor can perform them faster. In such a case, the CPU is used instead to fetch instructions, input or output data, and control program sequences. The CPU actually fetches all instructions and operands for both the main processor and the coprocessor. The coprocessor performs the operations specified in the instruction fetched and decoded by the CPU. If the instruction is a data input/output operation, the CPU controls the coprocessor to input or output the data. It is not uncommon for the CPU even to control coprocessor pipelining. That is, the CPU fetches CPU and coprocessor instructions while the coprocessor performs coprocessor operations.
In conventional data processing devices, it may be difficult to pipeline, i.e., overlap execution of, both CPU instructions and coprocessor instructions, because the coprocessor receives its instructions only after the CPU fetches and decodes the coprocessor instructions. There are two conventional ways of configuring a pipeline in the data processor. In the first way, the CPU and the coprocessor use the same pipeline. In the second way, the CPU and the coprocessor use different, respective pipelines. In the first way, the coprocessor receives and decodes an instruction transferred from the CPU. This coprocessor decoding cycle is time consuming and is likely to be a critical timing path of the pipeline processing. In the second way, although the critical path problem of the first way is less severe, the need to maintain precise interrupts in cases of interrupt or exception operations requires burdensome management by the CPU. In addition, the data bus width and instruction bit length of a coprocessor are usually wider and longer, respectively, than those of the CPU. Therefore, if the CPU fetches a coprocessor instruction, a cycle loss (stall) of the pipeline processing may occur because the CPU has to repeat the coprocessor instruction fetch cycle several times until one instruction is completely fetched by the CPU. Consequently, this increases the complexity of pipeline control and interrupt operations.
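By way of illustration only, the cycle loss described above can be estimated as the number of bus transfers needed to bring in one instruction over a narrower bus. The following is a minimal sketch; the function name `fetch_cycles` and the 16-bit and 32-bit widths are hypothetical values chosen for the example and are not taken from any particular device.

```python
def fetch_cycles(instruction_bits: int, bus_bits: int) -> int:
    """Return the number of bus transfers needed to fetch one instruction.

    Uses ceiling division, since a partial word still costs a full transfer.
    """
    return -(-instruction_bits // bus_bits)


# Hypothetical widths, assumed for illustration only.
CPU_BUS_BITS = 16
COPROCESSOR_INSTRUCTION_BITS = 32

cycles = fetch_cycles(COPROCESSOR_INSTRUCTION_BITS, CPU_BUS_BITS)
# Every transfer beyond the first is a fetch cycle in which the pipeline waits.
extra_fetch_cycles = cycles - 1
```

Under these assumed widths, each coprocessor instruction fetched by the CPU costs one additional fetch cycle compared with a CPU instruction.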
Accordingly, a need exists for a data processor which efficiently and quickly fetches CPU and coprocessor instructions. A need also exists for a data processing system having a tightly coupled main processor and coprocessor for executing different types of instructions in a single pipeline stream.
A data processing device having a main processor and a coprocessor is provided wherein the coprocessor has its own data bus to fetch coprocessor instructions. According to an aspect of the present invention, a coprocessor program memory is used to store coprocessor instructions and a predecoder is provided for predecoding an instruction fetched by the main processor.
It is an object of the present invention to provide a data processing device wherein the coprocessor is capable of fetching its own instruction through its own data bus and memory, preferably within the same instruction fetch cycle of the main processor.
According to an aspect of the present invention, a semiconductor device having a main processor and a coprocessor is provided for data processing, comprising a main program memory for storing main processor instructions and a first portion of coprocessor instructions; a coprocessor program memory for storing a second portion of coprocessor instructions; and a predecoder for predecoding at least one bit of each instruction fetched from the main program memory and for generating an active coprocessor control signal upon predecoding a coprocessor type instruction, wherein the second portion of coprocessor instructions is fetched directly from the coprocessor program memory and said first portion and said second portion of coprocessor instructions are processed by the coprocessor upon receipt of the active coprocessor control signal.
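The predecoding step can be pictured with a short software model. This is only a sketch under stated assumptions: the text requires that at least one bit be predecoded but does not say which one, so the 16-bit word width and the use of the most significant bit as the coprocessor flag are hypothetical, as is the name `predecode`.

```python
# Assumed for illustration: a 16-bit word from the main program memory whose
# most significant bit marks a coprocessor type instruction.
WORD_BITS = 16
COPROCESSOR_FLAG_MASK = 1 << (WORD_BITS - 1)


def predecode(word: int) -> bool:
    """Model the coprocessor control signal: True (active) for a
    coprocessor type instruction, False (inactive) otherwise."""
    return bool(word & COPROCESSOR_FLAG_MASK)
```

Because only the flag bit is examined, the signal in this model is available without a full decode of the instruction.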
The active coprocessor control signal and the main processor are preferably synchronized to a system clock. The main processor instructions are m bits and the coprocessor instructions are m+n bits, the n bits being stored in the coprocessor program memory, wherein said m bits are fetched from the main program memory by the main processor and sent to the coprocessor after buffering by an instruction fetch buffer of said main processor. The m bits are preferably sent through an instruction register in said main processor before being forwarded to the coprocessor.
According to another embodiment of the present invention, the m bits are forwarded directly to the coprocessor from the main program memory and the n bits are forwarded directly to the coprocessor from the coprocessor program memory.
A method of data processing in a semiconductor device having a main processor and a coprocessor is also provided, said main processor for executing m bit instructions, said coprocessor for executing m+n bit coprocessor instructions, the method comprising the steps of: fetching by the main processor an m-bit instruction from a main program memory addressed by a program address; and fetching by the coprocessor an n-bit instruction from a coprocessor program memory addressed by the program address upon decoding a predefined coprocessor code by the main processor.
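The two fetching steps of the method can be sketched as follows, with both memories indexed by the same program address. The model is illustrative only: the memories are plain Python lists, the 16-bit word width and the flag position are assumptions, and `fetch` and `COPROCESSOR_CODE_MASK` are hypothetical names.

```python
from typing import List, Optional, Tuple

# Assumed for illustration: the top bit of a 16-bit word is the predefined
# coprocessor code.
COPROCESSOR_CODE_MASK = 1 << 15


def fetch(
    program_address: int,
    main_program_memory: List[int],
    coprocessor_program_memory: List[int],
) -> Tuple[int, Optional[int]]:
    """Fetch the m-bit word and, if the coprocessor code is present, the
    n-bit word stored at the same program address."""
    m_word = main_program_memory[program_address]
    n_word = None
    if m_word & COPROCESSOR_CODE_MASK:
        # The coprocessor reads its own memory at the same address.
        n_word = coprocessor_program_memory[program_address]
    return m_word, n_word


# Example contents, invented for the sketch.
main_memory = [0x1234, 0x8001]
coprocessor_memory = [0x0000, 0xBEEF]
```

In this model, `fetch(0, main_memory, coprocessor_memory)` yields only the m-bit word, while `fetch(1, main_memory, coprocessor_memory)` yields both words, since address 1 holds a word carrying the assumed coprocessor code.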
According to a preferred embodiment of the present invention, the step of decoding a coprocessor code is performed by a predecoder of the main processor, the predecoder decoding at least one bit of said m bits allocated for signalling a coprocessor operation. The fetching steps by the main processor and the coprocessor are preferably synchronized to a system clock, and the fetching by the main processor and the coprocessor occurs within a system clock cycle.
Preferably, the m-bit instruction fetched from the main program memory is forwarded to said coprocessor to form a coprocessor instruction of m+n bits.
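One way to picture the formation of the m+n bit coprocessor instruction is as a concatenation of the two fetched fields. The sketch below assumes m = n = 16 and places the m-bit field in the upper bits; the text does not specify the widths or the field order, so both are assumptions, and `form_coprocessor_instruction` is a hypothetical name.

```python
def form_coprocessor_instruction(
    m_word: int, n_word: int, m_bits: int = 16, n_bits: int = 16
) -> int:
    """Combine an m-bit field and an n-bit field into one (m+n)-bit value.

    Assumption: the m-bit field occupies the upper bits of the result.
    """
    m_mask = (1 << m_bits) - 1
    n_mask = (1 << n_bits) - 1
    return ((m_word & m_mask) << n_bits) | (n_word & n_mask)


# Example values, invented for the sketch.
instruction = form_coprocessor_instruction(0x8001, 0xBEEF)
```

With the assumed widths and ordering, the example yields the 32-bit value 0x8001BEEF.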