The invention relates to computer processors and processing systems. More particularly, the invention relates to enforcement of a non-speculative processing policy for execution of certain groups of instructions in a speculative processing environment.
Speculative processing (also referred to as Speculative Access or Speculative Addressing) is a technique that is utilized by nearly all modern high performance processors to improve system performance. In a processor system that implements the speculative processing technique, the CPU, when it is not busy, makes a "guess" of, or speculates, the next instruction (or a sequence of instructions) that is likely to be executed, and actually initiates the execution of the guessed instruction in advance, which includes, inter alia, reading, or fetching, the instruction from storage.
Speculation is the result of two primary causes. The first cause is that modern branch prediction techniques allow processors to make fairly accurate guesses as to the outcome of a branch instruction before said branch instruction finishes executing. That is, before a branch instruction completes, the processor will attempt to guess whether the next instruction to execute will be from the branch target or from the next sequential instruction following the branch, and the processor will immediately begin speculatively fetching instructions from that predicted path. If the processor predicted incorrectly, the instructions fetched after the branch will be "flushed" out of the pipeline and the processor will begin fetching from the correct path. This is termed a "branch misprediction".
The second cause of speculation is that processors typically implement an interruption mechanism. An interrupt, in general, causes a change in control flow in response to a condition detected by the processor hardware that requires assistance, typically provided by a software interrupt handler. An example of an interruption would be a TLB miss fault, which requires a piece of software called the TLB miss fault handler to be executed. Interruptions cannot be predicted when an instruction is fetched; therefore, all instructions after the instruction causing the interruption will have been fetched speculatively and will need to be flushed (similar to a branch misprediction).
The prediction is generally accurate most of the time, so that the speculative processing does improve the performance of the computer system. Even when the prediction was inaccurate, the result of the execution of the extra instructions is simply discarded, and, for the most part, no harm was done, i.e., the CPU would have been idle in any event.
Unfortunately, however, there are some types of operations that, if processed speculatively, may result in a catastrophic problem. For example, I/O operations, i.e., reading from a peripheral device, e.g., a hard disk, a sound card, a keyboard or a display or the like, must not be processed in a speculative manner. This is because the I/O devices and the CPU typically communicate via a buffer. That is, the data being exchanged between the CPU and an I/O device is temporarily stored in a buffer, typically in a first-in-first-out (FIFO) device, and is lost after being read once.
For example, a hard disk controller would place data in an interface buffer for transmittal to the CPU. Once the CPU reads the data, the hard disk controller assumes that the CPU has received and properly used the data, and starts to fill the buffer with newer data. Thus, if the disk read was speculatively processed when the CPU neither needed nor was ready to use the data at that time, the data is lost, and would not be available when the CPU actually needs it at a later time. This can lead to a catastrophic error, e.g., a missing data block.
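The data loss described above can be illustrated with a minimal sketch. The class InterfaceBuffer and the function run are hypothetical names introduced only for this illustration, and the FIFO is modeled in software rather than as a hardware device.

```python
from collections import deque


class InterfaceBuffer:
    """Hypothetical FIFO between a device controller and the CPU."""

    def __init__(self, blocks):
        self._fifo = deque(blocks)

    def read(self):
        # A read is destructive: the entry leaves the FIFO and the
        # controller is free to overwrite it with newer data.
        return self._fifo.popleft()


def run(speculate):
    """Return the two blocks the program actually consumes."""
    buf = InterfaceBuffer(["block0", "block1", "block2"])
    if speculate:
        # A speculative read whose result is discarded when the
        # pipeline is later flushed; the FIFO entry is gone anyway.
        buf.read()
    # The reads the program really needs.
    return [buf.read(), buf.read()]


print(run(speculate=False))  # ['block0', 'block1']
print(run(speculate=True))   # ['block1', 'block2']
```

In the speculative case the program never sees block0, which corresponds to the missing data block mentioned above.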
Therefore, certain instructions, e.g., instructions referencing addresses mapped to I/O devices, are marked "non-speculative instructions", and are prevented from being speculatively executed. However, almost all modern computer systems also utilize a pipelining technique, and thus fetch instructions many clock cycles before the instructions are actually executed. A fetched instruction in a pipeline system does not always get executed. For example, the pipeline may be flushed after an interrupt, or the program flow may branch to another instruction, before the fetched instruction reaches the execution stage.
When a non-speculative instruction is fetched as described above, the result may be similar to that of speculatively processing the non-speculative instruction. This is because when an instruction is fetched, the cache system snoops the bus, and if the address being referenced is not found in the cache, it would signal a cache miss condition. The cache miss initiates a transfer of data from the memory and/or I/O devices. Thus, a mere fetching of an instruction may result in an emptying of an I/O interface buffer. When the I/O instruction is ultimately not executed in the pipeline for, e.g., the reasons described above, the data that was in the I/O interface buffer is lost. Thus, in a pipelined system, it is critical to ensure that a non-speculative instruction does not cause a cache miss to occur.
A typical way in which a non-speculative instruction is prevented from being fetched is to halt the fetching of instructions into the pipeline altogether, until the instruction immediately preceding the non-speculative instruction is executed, thus ensuring that the non-speculative instruction will be executed, i.e., no speculative processing occurs.
For example, as shown in FIG. 1, instructions are fetched by the fetch engine 101 into the execution pipeline 105 from a hierarchy of memory, e.g., the cache 103, memory 104 and/or a hard disk (not shown) and the like, in a manner well known to those familiar with pipelined processor architecture. The Translation Lookaside Buffer (TLB) 102 contains a subset of page table entries (PTEs), which are typically stored in the main memory. The PTEs allow a translation from a virtual address to the corresponding physical address. The TLB, which is a smaller and faster memory than the main memory, acts similarly to a cache memory with regard to the PTEs, and thus speeds up the address translation process.
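The role of the TLB as a small cache of PTEs may be sketched as follows. This is only an illustrative software model: the 4096-byte page size, the two-entry capacity, the oldest-first replacement, and the names TLB and translate are all assumptions made for the sketch and are not taken from FIG. 1.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class TLB:
    """Hypothetical TLB model caching a subset of page table entries."""

    def __init__(self, page_table, capacity=2):
        # page_table maps virtual page number -> physical frame number
        # and stands in for the PTEs held in main memory.
        self.page_table = page_table
        self.capacity = capacity
        self.entries = {}  # dicts preserve insertion order
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.entries:
            # TLB miss: fetch the PTE from the (slower) page table.
            # An unmapped page raises KeyError in this sketch.
            self.misses += 1
            pte = self.page_table[vpn]
            if len(self.entries) >= self.capacity:
                # Evict the oldest entry (assumed replacement policy).
                self.entries.pop(next(iter(self.entries)))
            self.entries[vpn] = pte
        return self.entries[vpn] * PAGE_SIZE + offset


tlb = TLB({0: 5, 1: 9})
first = tlb.translate(4100)   # miss: PTE loaded from the page table
second = tlb.translate(4100)  # hit: served from the TLB
```

Only the first translation of a given page consults the page table; the repeated translation is served from the TLB, which is the speed-up described above.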
As previously mentioned, the non-speculative instructions are identifiable by, e.g., having one or more memory pages marked as non-speculative (e.g., the non-speculative memory block 107), and monitoring any access to those marked locations. When the next instruction to be fetched is identified as a non-speculative instruction, the fetch engine 101 halts before fetching the non-speculative instruction, and holds the subject non-speculative instruction at a stage of the pipeline 105 at which no cache miss due to the non-speculative instruction can occur. Then, the fetch engine 101 fills the pipeline 105 with "bubbles", which may be any inconsequential instructions, e.g., No-Op (No Operation) or the like. When the restart logic 106 detects the retirement of the instruction immediately preceding the subject non-speculative instruction from the pipeline 105, the subject non-speculative instruction is guaranteed to be executed next (i.e., the execution of the non-speculative instruction is now guaranteed to be non-speculative). Accordingly, the restart logic 106 sends a restart signal to the fetch engine 101 to restart the fetching, at which point the subject non-speculative instruction enters and proceeds through the pipeline 105 to be eventually executed.
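The halt-and-restart behavior described above may be sketched as a simple cycle-by-cycle simulation. The sketch makes several simplifying assumptions that are not part of FIG. 1: the pipeline is a fixed-depth shift register, instructions retire in order from the last stage, the restart takes effect in the same cycle as the retirement, and non-speculative instructions are identified by their index in the program. The names simulate and BUBBLE are hypothetical.

```python
from collections import deque

BUBBLE = "nop"  # inconsequential instruction used to fill the pipeline


def simulate(program, non_speculative, depth=3):
    """Return (retired instructions, sequence of items fetched)."""
    pipeline = deque([None] * depth)  # None marks an empty stage
    retired = []
    trace = []
    pc = 0
    while len(retired) < len(program):
        # The oldest item leaves the last stage of the pipeline.
        done = pipeline.pop()
        if done is not None and done != BUBBLE:
            retired.append(done)
        entering = None
        if pc < len(program):
            # Halt condition: the next instruction is non-speculative
            # and the instruction just before it has not yet retired.
            halted = pc in non_speculative and len(retired) < pc
            if halted:
                entering = BUBBLE
            else:
                entering = program[pc]
                pc += 1
        pipeline.appendleft(entering)
        if entering is not None:
            trace.append(entering)
    return retired, trace


retired, trace = simulate(["add", "io_load", "sub"], non_speculative={1})
print(trace)  # ['add', 'nop', 'nop', 'io_load', 'sub']
```

In this model the non-speculative instruction costs depth - 1 bubble cycles each time it is encountered, in addition to the halt and restart hardware that the sketch represents by the halted test.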
While the above described method does prevent a catastrophic error resulting from a fetching of a non-speculative instruction, it does so at a significant expense, namely the extra hardware to implement the halt condition and/or the restart logic 106.
Thus, what is needed is an efficient mechanism to ensure that no non-speculative instructions are fetched before a guaranteed execution thereof without increasing the system complexity.
A method and an apparatus for ensuring that a non-speculative instruction does not cause a cache miss condition that may cause a catastrophic error are described. More particularly, a method of, and an apparatus for, ensuring fetching of a non-speculative instruction after execution thereof is guaranteed in a processor system having a pipeline comprises injecting a micro-fault into the pipeline in place of the non-speculative instruction, the micro-fault having encoded therein an associated address, the micro-fault causing a re-direction of an instruction flow in the pipeline to the associated address when the micro-fault is executed in the pipeline.
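One possible reading of the micro-fault mechanism is sketched below as a software simulation. The sketch rests on assumptions that the summary above does not state: that the associated address is the address of the replaced non-speculative instruction, that executing the micro-fault flushes the younger instructions in the pipeline, and that the fetch immediately following the re-direction is allowed to bring in the real instruction. The names MicroFault, run and released are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroFault:
    address: int  # address encoded in the micro-fault


def run(program, non_speculative, depth=3):
    """Return (executed instruction names, addresses actually fetched)."""
    pipeline = deque([None] * depth)
    executed, fetched = [], []
    pc = 0
    released = None  # address whose real fetch is now guaranteed (assumed)
    while len(executed) < len(program):
        item = pipeline.pop()  # item reaching the execute stage
        if isinstance(item, MicroFault):
            # Executing the micro-fault: flush younger instructions and
            # re-direct the instruction flow to the encoded address.
            pipeline = deque([None] * (depth - 1))
            pc = item.address
            released = item.address
        elif item is not None:
            executed.append(program[item])
        entering = None
        if pc < len(program):
            if pc in non_speculative and pc != released:
                # Inject a micro-fault in place of the instruction.
                entering = MicroFault(pc)
            else:
                entering = pc
                fetched.append(pc)
                released = None
            pc += 1
        pipeline.appendleft(entering)
    return executed, fetched


executed, fetched = run(["add", "io_load", "sub"], non_speculative={1})
print(executed)  # ['add', 'io_load', 'sub']
```

Under these assumptions the non-speculative instruction at address 1 is fetched exactly once, and only after the micro-fault has reached execution, while the speculatively fetched instruction behind it is flushed and fetched again.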