1. Field of the Invention
This invention relates to microprocessor architecture and, more particularly, to a superscalar variable length instruction decode mechanism.
2. Description of the Related Art
In various systems, the front end of a processor core may include a mechanism for length decoding and identifying boundaries of variable length instructions. In some designs, the length decode mechanism identifies the end of each of the variable length instructions and then stores endbits at the end of each instruction to identify the instruction boundaries. The variable length instructions and corresponding endbits are usually stored in the instruction cache (L1 cache) of the processor core to await further processing. The endbits may also be stored in the L2 cache of the processor. Storing endbits, along with the instructions, in the instruction cache and L2 cache may consume a significant amount of storage. The correspondingly larger instruction cache and L2 cache that may be needed may increase the die area and cost of the processor.
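By way of illustration only, the endbit marking described above may be sketched in software as follows. The sketch assumes one endbit per instruction byte and uses a hypothetical toy encoding, not the x86 encoding, in which the low two bits of the first byte of an instruction give its length minus one. The names toy_length and mark_endbits are illustrative assumptions and do not appear in any particular design.

```python
def toy_length(first_byte: int) -> int:
    """Hypothetical toy encoding (not x86): the low two bits of the
    first byte give the instruction length minus one (1 to 4 bytes)."""
    return (first_byte & 0b11) + 1


def mark_endbits(code: bytes) -> list[int]:
    """Return one endbit per byte; a 1 marks the last byte of an instruction.

    Assumes decoding starts at offset 0, which is a known instruction start.
    """
    endbits = [0] * len(code)
    pos = 0
    while pos < len(code):
        end = pos + toy_length(code[pos]) - 1
        if end >= len(code):
            break  # instruction runs past the buffer; leave it unmarked
        endbits[end] = 1
        pos = end + 1
    return endbits


# Three instructions of lengths 2, 1 and 3 bytes.
example = bytes([0x01, 0xAA, 0x00, 0x02, 0xBB, 0xCC])
print(mark_endbits(example))  # [0, 1, 1, 0, 0, 1]
```

Under the one-endbit-per-byte assumption, each stored instruction byte carries one additional bit, that is, 12.5% additional storage in each cache that holds the endbits.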
In these systems, after obtaining the variable length instructions and the corresponding endbits, the instruction decode unit of the processor core typically confirms that the endbits correctly identify the boundaries of the variable length instructions using a length decode unit. For x86 architectures, the instructions can have a wide range of lengths (e.g., 1-15 bytes) and the instructions can start/end on any byte boundary (or alignment). Therefore, it is especially important in x86 architectures to verify that the endbits correctly identify the instruction boundaries, because if the same instruction bytes are decoded from two different start positions, two entirely different sets of instructions may be obtained.
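The sensitivity to the start position noted above may be illustrated with a minimal sketch. It again assumes a hypothetical toy encoding, not the x86 encoding, in which the low two bits of the first byte of an instruction give its length minus one; the names toy_length and boundaries are illustrative assumptions.

```python
def toy_length(first_byte: int) -> int:
    """Hypothetical toy encoding (not x86): the low two bits of the
    first byte give the instruction length minus one (1 to 4 bytes)."""
    return (first_byte & 0b11) + 1


def boundaries(code: bytes, start: int) -> list[tuple[int, int]]:
    """Decode sequentially from `start`; return (offset, length) pairs."""
    found = []
    pos = start
    while pos < len(code):
        length = toy_length(code[pos])
        if pos + length > len(code):
            break  # incomplete trailing instruction
        found.append((pos, length))
        pos += length
    return found


code = bytes([0x01, 0x02, 0x00, 0x03, 0x00, 0x00, 0x01, 0x00])
print(boundaries(code, 0))  # [(0, 2), (2, 1), (3, 4), (7, 1)]
print(boundaries(code, 1))  # [(1, 3), (4, 1), (5, 1), (6, 2)]
```

The same eight bytes yield two unrelated sets of instruction boundaries depending on whether decoding starts at byte 0 or byte 1, which is why an incorrect endbit may cause every subsequent instruction to be misidentified.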
During the verification process, if the endbits are determined to be correct, the instruction decode unit usually begins to fully decode the instructions and eventually dispatches the decoded instructions to the execution unit. If the endbits are determined to be corrupted, the instruction decode unit typically decodes the instructions at a much slower rate since it may need to repair or recalculate the endbits to identify the instruction boundaries. In these systems, the decode mechanism may be relatively complex since it needs circuitry to confirm all of the endbits and additional circuitry to repair or recalculate corrupted endbits.
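The verify-then-repair flow described above may be sketched as follows, for illustration only. The sketch assumes one endbit per instruction byte and a hypothetical toy encoding, not the x86 encoding, in which the low two bits of the first byte give the instruction length minus one. The names toy_length, recompute_endbits and check_endbits, and the "fast" and "slow" labels, are illustrative assumptions rather than elements of any particular decode unit.

```python
def toy_length(first_byte: int) -> int:
    """Hypothetical toy encoding (not x86): the low two bits of the
    first byte give the instruction length minus one (1 to 4 bytes)."""
    return (first_byte & 0b11) + 1


def recompute_endbits(code: bytes) -> list[int]:
    """Length-decode from offset 0 and mark the last byte of each instruction."""
    endbits = [0] * len(code)
    pos = 0
    while pos < len(code):
        end = pos + toy_length(code[pos]) - 1
        if end >= len(code):
            break  # incomplete trailing instruction
        endbits[end] = 1
        pos = end + 1
    return endbits


def check_endbits(code: bytes, stored: list[int]) -> tuple[str, list[int]]:
    """Confirm stored endbits; fall back to the recomputed ones if they differ.

    Returns ("fast", stored) when the stored endbits are confirmed, or
    ("slow", repaired) when they had to be recalculated.
    """
    expected = recompute_endbits(code)
    if stored == expected:
        return "fast", stored
    return "slow", expected


code = bytes([0x01, 0xAA, 0x00, 0x02, 0xBB, 0xCC])
print(check_endbits(code, [0, 1, 1, 0, 0, 1]))  # ('fast', [0, 1, 1, 0, 0, 1])
print(check_endbits(code, [1, 0, 1, 0, 0, 1]))  # ('slow', [0, 1, 1, 0, 0, 1])
```

Even in this simplified form, the checking path performs a full length decode in order to confirm the stored endbits, which mirrors the observation that such a mechanism needs both confirmation and repair logic.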