1. Field of the Invention
This invention relates to a mechanism to manage the result returned from a translator co-processor to a central processor in a pipelined architecture. Particularly, the invention relates to a system and method including a recovery unit (R-Unit or result unit) utilized for managing communications between co-processors and pipelined central processors.
2. Description of the Related Art
High performance processors are typically pipelined. Pipelining is a method of processing that allows for fast concurrent processing of data. This is accomplished by overlapping operations, with each stage passing its information on to the next stage. Most processor instructions have to go through the same basic sequence: the instruction must be fetched, it must be decoded, it must have its operands fetched (which may first require address generation), it must be executed, and its results must be put away. With pipelining, several program instructions are, at any given time, in various stages of being fetched, decoded, or executed. Pipelining improves system execution time by ensuring that the microprocessor does not have to wait for instructions. When the processor completes execution of one instruction, the next instruction is ready to be performed.
However, pipelining is not without inefficiency. The flow of instructions through the pipeline may stall. For example, when an instruction N modifies a register which a subsequent instruction N+2 needs for calculating the address of N+2's operands, the instruction N+2 is delayed until the instruction N is finished modifying the register. In other words, the instruction N+2 may progress to the address generation stage, but must be delayed at that point until the instruction N modifies the register that the N+2 instruction needs. Only after N is finished can N+2 continue in the pipeline. Another common stall occurs during the operand fetch stage of the pipeline, when the processor may not yet know whether a particular operand access is allowed or whether it should be denied (and an exception reported) based on architectural restrictions. The corresponding instruction stalls until the operand access permission is determined. This stall period while checking for exceptions is referred to as “exception_pending.” Further, there are situations where instructions at various stages of the processor pipeline may need to be terminated and discarded due to an error in a branch instruction stream.
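The register dependency stall described above can be illustrated with a small timing model. This is only a sketch under simplifying assumptions that are not taken from the specification: every stage takes one cycle, instructions proceed strictly in order, and the function name `completion_cycles` and the `agi_deps` parameter are hypothetical names chosen for illustration.

```python
# Stage names follow the basic sequence described in the text.
STAGES = ["fetch", "decode", "addr_gen", "operand_fetch", "execute", "put_away"]


def completion_cycles(count, agi_deps=None):
    """Return table[i][s]: the cycle in which instruction i completes stage s.

    agi_deps maps a consumer instruction index to the earlier producer
    instruction whose put_away must finish before the consumer can complete
    address generation (assumed model, one cycle per stage).
    """
    agi_deps = agi_deps or {}
    addr_gen = STAGES.index("addr_gen")
    put_away = STAGES.index("put_away")
    table = []
    for i in range(count):
        row = []
        for s in range(len(STAGES)):
            earliest = 1
            if s > 0:
                # An instruction must finish its previous stage first.
                earliest = max(earliest, row[s - 1] + 1)
            if i > 0:
                # In-order flow: the prior instruction clears the stage first.
                earliest = max(earliest, table[i - 1][s] + 1)
            if s == addr_gen and i in agi_deps:
                # Stall at address generation until the producer has put
                # away the register value the consumer needs.
                earliest = max(earliest, table[agi_deps[i]][put_away] + 1)
            row.append(earliest)
        table.append(row)
    return table


if __name__ == "__main__":
    # Index 0 is instruction N, index 2 is instruction N+2.
    for i, row in enumerate(completion_cycles(4, {2: 0})):
        print(f"N+{i}", dict(zip(STAGES, row)))
```

In this model, instruction N+2 completes address generation in cycle 7 instead of cycle 5, and the instruction behind it is delayed by the same two cycles.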
A common situation where this occurs is for incorrectly predicted branches. For example, when an instruction stream is being executed and a branch instruction Nb is encountered, the next instruction N+1 may come from more than one place. If the branch for Nb is not taken, then instruction N+1 is the next sequential instruction in the instruction stream. If the branch for Nb is taken, then instruction N+1 is the instruction at the branch location instructed by Nb (this is referred to as a “branch target”). However, when branch instruction Nb is at the decode stage of the pipeline, there is no definitive way to determine whether the branch is supposed to be taken or not taken. So, the instruction fetching logic “predicts” whether instruction N+1 should be the next sequential instruction, or whether the branch target should be next. The “predicted” instruction N+1 then enters the pipeline following the branch instruction Nb. Until the moment when the branch instruction Nb is evaluated, which happens at the execution stage of the pipeline, the processor cannot determine with certainty whether the “predicted” instruction N+1 was the correct instruction. When the branch instruction Nb reaches the execution stage of the pipeline, it is evaluated as to whether the branch should have been taken or should not have been taken. If the instruction N+1 was not predicted correctly, then a “branch wrong” exists. A branch wrong requires that instructions N+1 and later be terminated and removed from the pipeline. Next, the instruction fetching logic backs up to where the decision was made and fetches a different instruction N+1 down the other path. Therefore, the instructions that enter the pipeline following a branch instruction are “conditional” until the branch is resolved as taken or is resolved as not taken. This may also be referred to as a “conditional path” or “speculative execution.”
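The handling of a branch wrong described above can be sketched as follows. The function `resolve_branch` and its `depth` parameter (the number of conditional instructions that entered the pipeline behind Nb before it reached the execution stage) are hypothetical and serve only to illustrate the flush and refetch behavior; they do not describe any particular hardware.

```python
def resolve_branch(fallthrough, target, predicted_taken, actually_taken, depth=3):
    """Model the resolution of one branch instruction Nb.

    fallthrough: instructions on the next-sequential path.
    target: instructions starting at the branch target.
    Returns (flushed, in_pipeline): the conditional instructions that were
    discarded, and the instructions in the pipeline after resolution.
    """
    predicted_path = target if predicted_taken else fallthrough
    correct_path = target if actually_taken else fallthrough

    # Instructions fetched behind Nb are conditional until Nb executes.
    conditional = predicted_path[:depth]

    if predicted_taken == actually_taken:
        # Prediction was right: nothing is discarded.
        return [], conditional

    # Branch wrong: terminate N+1 and later, then refetch down the other path.
    return conditional, correct_path[:depth]


if __name__ == "__main__":
    sequential = ["N+1", "N+2", "N+3"]
    branch_target = ["T", "T+1", "T+2"]
    print(resolve_branch(sequential, branch_target,
                         predicted_taken=False, actually_taken=True))
```

With a not-taken prediction and a taken outcome, all three conditional instructions are flushed and the fetch restarts at the branch target.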
The resultant stalling of the flow of instructions, the need to terminate and discard instructions, and the occurrence of incorrectly predicted branches produce system inefficiency. A system which uses a translator co-processor with a central processor in a pipelined configuration will encounter such inefficiencies. Accordingly, there remains a need for improving pipeline processing of certain types of instructions.
An exemplary embodiment of the invention is a method and system for managing a result returned from a translator co-processor to a recovery unit of a central processor. The computer system has a pipelined computer processor and a pipelined central processor, which executes an instruction set in a hardware controlled execution unit and executes an instruction set in a milli-mode architected state with a millicode sequence of instructions in the hardware controlled execution unit. The central processor initiates a request to the translator co-processor a cycle after decode of a perform translator operation instruction in the millicode sequence. The translator co-processor processes the perform translator operation instruction to generate a perform translator operation result. The translator co-processor returns the result to a recovery unit of the central processor. The recovery unit stores the perform translator operation result in a system register. The request for the perform translator operation result by the central processor is interlocked by a hardware interlock of the recovery unit until the translator co-processor returns the perform translator operation result. The mechanism allows the recovery unit to maintain the correct perform translator operation result with speculative execution and instruction level retry recovery throughout the duration of the perform translator operation.
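The interlock on the perform translator operation result can be pictured with a minimal software model. This is a sketch only: the class `RecoveryUnit`, its method names, the `simulate` helper, and the latency parameter are assumptions made for illustration, and the model covers only the interlock, not the speculative execution or retry recovery behavior.

```python
class RecoveryUnit:
    """Toy model of the recovery unit's interlocked system register."""

    def __init__(self):
        self.system_register = None
        self.pending = False  # True while the co-processor is still working

    def request_issued(self):
        # The central processor has sent the request to the co-processor.
        self.pending = True
        self.system_register = None

    def result_returned(self, result):
        # The co-processor returns its result; store it and drop the interlock.
        self.system_register = result
        self.pending = False

    def try_read(self):
        """Return (stalled, value); a read stalls while a result is pending."""
        if self.pending:
            return True, None
        return False, self.system_register


def simulate(latency, result):
    """Count the cycles a reader stalls when the result takes `latency` cycles."""
    unit = RecoveryUnit()
    unit.request_issued()
    stalls = 0
    cycle = 0
    while True:
        cycle += 1
        if unit.pending and cycle >= latency:
            unit.result_returned(result)
        stalled, value = unit.try_read()
        if not stalled:
            return stalls, value
        stalls += 1


if __name__ == "__main__":
    print(simulate(latency=4, result=0xABCD))
```

In this model the reader never observes the register before the co-processor has written it; the read is simply held for as many cycles as the result is outstanding.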