The technical field relates to digital computer systems and fetching instructions. More particularly, it relates to methods and an apparatus for fetching instructions from a computer memory in a mixed architecture.
In the field of computer architecture, a single chip may process instructions from multiple instruction sets. In such mixed architectures, the processor hardware is designed and optimized for executing instructions from one instruction set, generally referred to as the native instruction set, while emulating other instruction sets by translating the emulated instructions into operations understood by the native hardware. For example, the IA-64 architecture supports two instruction sets: the IA-32 (or x86) variable-length instruction set and the fixed-length enhanced mode (EM) instruction set. When executing the IA-32 instruction set, the central processing unit (CPU) is said to be in IA-32 mode. When executing EM instructions, the CPU is said to be in EM mode. Native EM instructions are executed by the main execution hardware of the CPU in EM mode. However, the variable-length IA-32 instructions are processed by the IA-32 (or x86) engine and broken down into native EM mode instructions for execution in the core pipeline of the machine. In x86 mode, it is desirable to retrieve instructions from the IA-64 memory subsystem into an x86 engine. To accomplish this, the x86 execution engine must interface with the EM pipeline, because the memory subsystem is tightly coupled to the EM pipeline. The x86 hardware support exists primarily to support legacy software. For this reason, it is desirable that the x86 engine not slow the processing of native instructions in the EM pipeline.
Existing methods of fetching instructions, such as those methods previously implemented in IA-64 architecture, use dual pipelines (the EM pipeline and the x86 pipeline) to process instructions. In these methods, the x86 engine simply sends a fetch address to the EM fetch engine, which accesses the memory subsystem and returns a line of instructions for depositing to a macroinstruction queue (MIQ) in the x86 engine. While both pipelines are synchronized to process the same set of addresses, they operate independently such that the x86 engine sends a new fetch address in each clock cycle, and the EM fetch engine retrieves a new line of instructions in each clock cycle.
In the presence of pipeline stalls (for example, due to a cache miss), the pipelines could go out of synchronization. This is because, given the physical separation of the x86 engine and the EM fetch engine, it takes one complete clock cycle to transmit information between these pipelines. In the case of a stall, it is not possible to report the stall to the x86 engine in the same cycle that the fetch engine sees it. That is, the x86 engine would not notice the stall in the EM pipeline until at least one clock cycle after it occurred. Meanwhile, the x86 pipeline continues to advance the fetch address as though no stall had occurred. The x86 pipeline and the EM pipeline become unsynchronized and will process different instructions in corresponding pipeline stages. This requires a complicated stall recovery means to get the pipelines back into synchronization.
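The loss of synchronization described above can be illustrated with a toy model. The following is only a sketch under simplifying assumptions, not a model of the actual hardware: each pipeline is reduced to a single address counter, the function name `simulate` and the parameter `signal_delay` are hypothetical, and the one-cycle transmission latency is modeled by letting the x86 side react to a stall one cycle late.

```python
def simulate(stall_cycles, total_cycles, signal_delay=1):
    """Return the (x86_address, em_address) pair held at the start of each cycle.

    stall_cycles: set of cycle numbers in which the EM pipeline stalls.
    signal_delay: cycles needed for the stall to reach the x86 engine
                  (assumed to be 1, per the one-cycle transmission latency).
    """
    x86_address = 0
    em_address = 0
    history = []
    for cycle in range(total_cycles):
        history.append((x86_address, em_address))
        em_stalled = cycle in stall_cycles
        # The x86 engine only learns of a stall signal_delay cycles after it occurred.
        x86_sees_stall = (cycle - signal_delay) in stall_cycles
        if not em_stalled:
            em_address += 1
        if not x86_sees_stall:
            x86_address += 1
    return history


# A single stall in cycle 2 leaves the two counters disagreeing in cycle 3.
for cycle, (x86_address, em_address) in enumerate(simulate({2}, 6)):
    marker = "" if x86_address == em_address else "  <-- out of sync"
    print(cycle, x86_address, em_address, marker)
```

In this toy model the counters realign once the delayed stall arrives, but the address the x86 side advanced past during the stalled cycle was never serviced, which is the kind of condition a real design would need recovery logic to handle.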
Another stall-related problem with existing methods of processing instructions is that there may not be enough room to write a line of returning instructions to the MIQ. That is, existing methods and apparatuses may try to write a new line of instructions to the MIQ even though the MIQ is full of unprocessed entries. One prior art method introduces a new stall to recover from this oversubscription to the MIQ. The detection and signaling of this new stall is cumbersome and, combined with the earlier fetch-related stalls, requires complicated hardware to handle.
What is needed is a means of interfacing the hardware of a CPU that processes both native instructions and emulated instructions. In particular, what is needed is a method for retrieving instructions of one instruction set architecture (ISA) from the memory of a different, native ISA, while avoiding the problems associated with pipeline stalls and the complexities inherent to the dual, synchronous pipeline system.
A method of interfacing hardware in a processor capable of implementing more than one instruction set, such as a native instruction set and an emulated instruction set, is described. In particular, an engine responsible for fetching native instructions from a memory subsystem is interfaced with an engine that processes emulated instructions. This is achieved using a handshake protocol, whereby the x86 engine sends an explicit fetch request signal to the EM fetch engine along with a fetch address. The EM fetch engine then accesses the memory subsystem and retrieves a line of instructions for subsequent decode and execution. The EM fetch engine sends this line of instructions to the x86 engine along with an explicit fetch complete signal. The EM fetch engine also includes a fetch address queue capable of holding the fetch addresses before they are processed by the EM fetch engine. The fetch requests are processed such that more than one fetch request may be pending at the same time. If a pending fetch request is canceled due to a pipeline flush, then the fetch address queue is cleared and the pending fetch requests are canceled. The system also prevents macroinstruction queue (MIQ)-related stalls by using a speculative write pointer to control the issuance of fetch requests, thereby preventing the MIQ from becoming oversubscribed.
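The handshake described above can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the described hardware: the class name `EMFetchEngine`, its method names, the queue depth of 4, and the use of a dictionary as the memory subsystem are all hypothetical. The fetch request and fetch complete signals are modeled as boolean return values.

```python
from collections import deque


class EMFetchEngine:
    """Toy model of a fetch engine with a fetch address queue (names are hypothetical)."""

    def __init__(self, memory, queue_depth=4):
        self.memory = memory              # assumed: maps fetch address -> line of instructions
        self.queue_depth = queue_depth    # assumed depth; the source does not give one
        self.fetch_address_queue = deque()

    def fetch_request(self, address):
        """Explicit fetch request plus fetch address; returns False if the queue is full."""
        if len(self.fetch_address_queue) >= self.queue_depth:
            return False
        self.fetch_address_queue.append(address)
        return True

    def step(self):
        """Service the oldest pending request; returns (fetch_complete, line)."""
        if not self.fetch_address_queue:
            return (False, None)
        address = self.fetch_address_queue.popleft()
        return (True, self.memory[address])

    def flush(self):
        """Pipeline flush: clear the queue, canceling every pending fetch request."""
        self.fetch_address_queue.clear()


engine = EMFetchEngine({0x00: "line A", 0x10: "line B", 0x20: "line C"})
engine.fetch_request(0x00)
engine.fetch_request(0x10)        # two requests are now pending at once
print(engine.step())              # oldest request completes first
engine.flush()                    # the remaining pending request is canceled
print(engine.step())              # nothing left to complete
```

Because each line is returned together with an explicit completion indication, the requester in this sketch never has to assume that a line arrives a fixed number of cycles after the address was sent.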
A computer system capable of processing instructions from more than one instruction set is described. The system includes an engine that fetches native instructions from a memory subsystem (such as an EM fetch engine) and an engine that processes emulated instructions (such as an x86 engine). The EM fetch engine interfaces with the memory subsystem and the x86 engine by using a handshake protocol. The x86 engine sends an explicit fetch request signal to the EM fetch engine along with a fetch address. The EM fetch engine then accesses the memory subsystem and retrieves a line of instructions. The EM fetch engine sends this line of instructions to the x86 engine along with an explicit fetch complete signal. The EM fetch engine also includes a fetch address queue capable of holding the fetch addresses before they are processed by the EM fetch engine. The fetch requests are processed such that more than one fetch request may be pending at the same time. If a pending fetch request is canceled due to a pipeline flush, then the fetch address queue is cleared and the pending fetch requests are canceled. The system also prevents macroinstruction queue (MIQ)-related stalls by using a speculative write pointer to control the issuance of fetch requests, thereby preventing the MIQ from becoming oversubscribed.
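One way a speculative write pointer could gate fetch requests is sketched below. The text does not specify the pointer arithmetic, so everything here is an assumption made for illustration: the class name `MacroinstructionQueue`, the method names, the circular-buffer layout, and the rule that the speculative pointer advances when a request is issued while the actual write pointer advances only when the line returns.

```python
class MacroinstructionQueue:
    """Toy MIQ with a speculative write pointer (hypothetical design, for illustration)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = [None] * capacity
        self.read_ptr = 0         # next entry to be consumed
        self.write_ptr = 0        # next entry to be written by a returning line
        self.spec_write_ptr = 0   # write_ptr plus the number of requests still in flight

    def issue_fetch(self):
        """Reserve an entry for a new fetch request; refuse if every entry is spoken for."""
        if self.spec_write_ptr - self.read_ptr >= self.capacity:
            return False
        self.spec_write_ptr += 1
        return True

    def fetch_complete(self, line):
        """Write a returning line into the entry reserved when its request was issued."""
        if self.write_ptr >= self.spec_write_ptr:
            raise RuntimeError("line returned without a matching fetch request")
        self.entries[self.write_ptr % self.capacity] = line
        self.write_ptr += 1

    def consume(self):
        """Remove and return the oldest line, or None if the queue is empty."""
        if self.read_ptr == self.write_ptr:
            return None
        line = self.entries[self.read_ptr % self.capacity]
        self.read_ptr += 1
        return line

    def flush(self):
        """Assumed flush behavior: drop reservations for canceled in-flight requests."""
        self.spec_write_ptr = self.write_ptr


miq = MacroinstructionQueue(capacity=2)
print(miq.issue_fetch(), miq.issue_fetch(), miq.issue_fetch())  # third request is held back
miq.fetch_complete("line A")
miq.consume()                 # freeing an entry allows another request
print(miq.issue_fetch())
```

Under these assumptions a returning line always finds a reserved entry, so no separate stall is needed to recover from a full queue; the cost is that requests are withheld earlier than they would be if only the actual write pointer were checked.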