1. Field of the Invention
The invention relates to decoding instructions, and more specifically, to identifying boundaries between variable length instructions.
2. Description of Related Art
Computers process information by executing a sequence of instructions, which may be supplied from a computer program written in a particular format and sequence designed to direct the computer to perform a particular sequence of operations. Most computer programs are written in high level languages such as FORTRAN or "C," which are not directly executable by the computer processor. These high level instructions are translated into macroinstructions, having a format that can be decoded and executed within the processor.
Macroinstructions may be stored in data blocks having a predefined length in a computer memory element, such as main memory or an instruction cache. Macroinstructions are fetched from the memory elements and then supplied sequentially to one or more decoders, in which each macroinstruction is decoded into one or more micro-operations having a form that is executable by an execution unit in the processor.
Macroinstructions, such as instructions in the Intel iA32 instruction set, may have variable lengths. For example, one instruction may be two bytes long, the next four bytes long, the next three bytes long, and so on. The Intel iA32 instruction set is described in detail in the Intel Architecture Software Developer's Manual, 1997, available from Intel Corporation, the entire contents of which are incorporated by reference herein. Pipelined processors define multiple stages for processing a macroinstruction. To decode a macroinstruction, its length must be calculated; however, the length is available only during the decoding operation, not before. The start of the following instruction is then determined based on the length information. Thus, a considerable amount of processing is required to determine the start of the next instruction.
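By way of illustration only, the serial dependency described above may be sketched in Python as follows. The length rule used here (the first byte of each instruction holds its total length) is a hypothetical toy encoding chosen for brevity and is not the iA32 encoding; the function names are likewise assumptions.

```python
def toy_length(code, start):
    """Hypothetical rule: the first byte of an instruction is its total length."""
    return code[start]


def instruction_starts(code):
    """Walk the byte stream serially and return the start offset of each instruction.

    The start of instruction N+1 is known only after the length of
    instruction N has been computed, so the loop cannot skip ahead.
    """
    starts = []
    pos = 0
    while pos < len(code):
        starts.append(pos)
        length = toy_length(code, pos)
        if length < 1:
            raise ValueError("instruction length must be at least one byte")
        pos += length
    return starts


# Three instructions of two, four, and three bytes, as in the example above.
code = bytes([2, 0xAA, 4, 0xBB, 0xCC, 0xDD, 3, 0xEE, 0xFF])
print(instruction_starts(code))  # [0, 2, 6]
```

Each iteration depends on the length found in the previous one, which is the property that makes the work difficult to fit into a single pipeline stage.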
All of this processing may not fit within a single processing (pipeline) stage of a high frequency computer. To make the process of marking instruction boundaries amenable to pipelining, end byte markers that indicate the end of a given instruction are calculated as packets of instruction bytes flow through the pipeline. This marking is done even before the actual instruction decoding takes place. Hence, steering to the next instruction becomes a function of the end byte markers, rather than depending on decoding the instruction.
A prior art process for decoding macroinstructions is illustrated in FIG. 1. In block 10, a block of instructions is fetched from the memory element. Instruction boundaries, which are defined as the locations between adjoining macroinstructions in the instruction code, are marked in block 12. For example, an end byte marker may be set to a logically high state if its associated byte is the last byte of an instruction (the end byte), and set to a logically low state if the associated byte is not an end byte. After the instruction boundaries are marked, the instructions are rotated, or aligned, in block 14 based on the end byte markers so that each decoder receives the bytes of an instruction beginning with its first byte. The macroinstructions are then decoded into micro-operations, also referred to as micro-ops or uops, in block 16.
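The marking and aligning steps of blocks 12 and 14 may be sketched in Python as follows. This is a minimal software illustration under stated assumptions, not the hardware of FIG. 1: instruction lengths are assumed to be already known, and the names mark_end_bytes and align are hypothetical.

```python
def mark_end_bytes(lengths):
    """Return one marker per byte: 1 for the last byte of an instruction, else 0."""
    markers = []
    for length in lengths:
        markers.extend([0] * (length - 1) + [1])
    return markers


def align(code, markers):
    """Split the byte stream at end byte markers so that each piece
    begins with the first byte of an instruction."""
    instructions = []
    start = 0
    for i, is_end in enumerate(markers):
        if is_end:
            instructions.append(code[start:i + 1])
            start = i + 1
    return instructions


code = bytes(range(9))                # nine placeholder instruction bytes
markers = mark_end_bytes([2, 4, 3])   # assumed lengths of three instructions
print(markers)                        # [0, 1, 0, 0, 0, 1, 0, 0, 1]
print([list(chunk) for chunk in align(code, markers)])
# [[0, 1], [2, 3, 4, 5], [6, 7, 8]]
```

Once the markers exist, the align step needs only the markers and the bytes, which is why steering can proceed without decoding the instructions themselves.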
In known instruction decoding systems, the instruction length decode logic identifies and marks end bytes for an instruction packet having a predetermined number of instruction bytes. Providing end byte markers to mark instruction boundaries is well known in the art, and has been implemented in several computer systems. As an example, FIG. 2 illustrates two stages of an instruction decode pipeline for a prior art processor, such as the Intel® Pentium® Pro processor. The Pentium Pro system marks end bytes for an instruction packet containing eight instruction bytes during each clock cycle. In the instruction boundary marking stage 20, end bytes 22 for the eight instruction bytes b0-b7 are marked during a first clock cycle, then passed on to the align stage 24 for rotation during the next clock cycle. Instruction bytes b2 and b6 are marked as end bytes in FIG. 2. Thus, one instruction ends with byte b2 and the following instruction begins with byte b3, and another instruction ends with byte b6 with the following instruction beginning with byte b7.
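The relationship between the end byte markers of FIG. 2 and the start of each following instruction may be illustrated with a short Python sketch. The bit mask representation (bit i corresponding to byte bi) and the name following_starts are assumptions made for illustration; the sketch does not describe the actual Pentium Pro circuitry.

```python
PACKET_BYTES = 8  # eight instruction bytes b0-b7 per packet


def following_starts(end_mask):
    """Given an end byte mask (bit i set when byte bi is an end byte), return
    the in-packet positions where a following instruction begins, and whether
    the next packet begins a new instruction at its b0."""
    shifted = end_mask << 1  # an instruction starts one byte after each end byte
    starts_in_packet = [i for i in range(PACKET_BYTES) if (shifted >> i) & 1]
    next_packet_starts_at_b0 = bool((shifted >> PACKET_BYTES) & 1)
    return starts_in_packet, next_packet_starts_at_b0


# End bytes at b2 and b6, as marked in FIG. 2.
end_mask = (1 << 2) | (1 << 6)
print(following_starts(end_mask))  # ([3, 7], False)
```

The result matches the text: the instructions following the end bytes begin at b3 and b7. Whether b0 itself begins an instruction depends on the preceding packet, which the sketch does not model.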
As processor frequency increases, however, it becomes impossible to mark all of the predetermined number of bytes and pass them to the next stage during a single clock cycle. Changing the size of the instruction packet flowing through the processor pipeline most likely would require substantial system redesign, thereby degrading system performance. Thus, a need exists for a method and device for marking instruction boundaries in high frequency machines without degrading performance.