1. Field of the Invention
This invention is related to the field of microprocessors and, more particularly, to address calculation mechanisms within microprocessors.
2. Description of the Relevant Art
Superscalar microprocessors achieve high performance by simultaneously executing multiple instructions during a clock cycle and by specifying the shortest possible clock cycle consistent with the design. As used herein, the term "clock cycle" refers to an interval of time during which the pipeline stages of a microprocessor perform their intended functions. Storage devices (e.g. registers or arrays) capture their values in response to a clock signal defining the clock cycle. For example, storage devices may capture a value in response to a rising or falling edge of the clock signal.
Microprocessor designers often design their products in accordance with the x86 microprocessor architecture in order to take advantage of its widespread acceptance in the computer industry. Because the x86 microprocessor architecture is pervasive, many computer programs are written in accordance with the architecture. x86-compatible microprocessors may execute these computer programs, thereby becoming more attractive to computer system designers who desire x86-capable computer systems. Such computer systems are often well received within the industry due to the wide range of available computer programs.
The x86 microprocessor architecture includes an address translation mechanism. Address translation involves the mapping of an address created by the microprocessor to an address actually used to access memory. Address translation mechanisms are employed for many reasons. For example, the address translation mechanism may be used to define certain microprocessor-created addresses as not presently stored within the main memory. Data corresponding to addresses which are not stored within main memory may be stored on a disk drive. When such an address is accessed, the corresponding data may be swapped with data currently stored in main memory. The address translation for the data swapped onto the disk drive is invalidated and an address translation is defined for the data swapped into memory. In this manner, the computer program may access an address space larger than the main memory can support. Additionally, address translation mechanisms are used to protect the data used by one program from access and modification by another program executing within the same computer system. Different areas of main memory are allocated to each program, and the translations for each program are constrained such that any address created by one program does not translate to a memory location allocated to another program. Many other reasons for employing address translation mechanisms are well known.
The x86 address translation mechanism includes two levels. A first level, referred to as segmentation, takes a logical address generated according to instruction operands and produces a linear address. The second level, referred to as paging, translates the linear address to a physical address (i.e. the address actually used to access memory). The linear address is equal to the physical address in cases where the paging mechanism is disabled.
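By way of illustration only, the two translation levels may be modeled as a composition of two functions. The following is a minimal sketch and not a description of any particular hardware: the names `segmentation`, `paging`, and `translate` are hypothetical, the 32-bit address width and 4 KiB page size are assumptions, and the flat dictionary used as a page table is a simplification of the actual paging structures.

```python
PAGE_SIZE = 4096      # assumption: 4 KiB pages
MASK32 = 0xFFFFFFFF   # assumption: 32-bit addresses


def segmentation(logical, segment_base):
    """First level: logical address -> linear address."""
    return (segment_base + logical) & MASK32


def paging(linear, page_table):
    """Second level: linear address -> physical address.

    page_table is a hypothetical flat dict mapping a linear page number to a
    physical page number. None models a disabled paging mechanism.
    """
    if page_table is None:
        return linear  # paging disabled: physical address equals linear address
    page, offset = divmod(linear, PAGE_SIZE)
    # A missing entry raises KeyError, standing in for an untranslated page.
    return page_table[page] * PAGE_SIZE + offset


def translate(logical, segment_base, page_table=None):
    """Apply segmentation, then paging, to a logical address."""
    return paging(segmentation(logical, segment_base), page_table)
```

For example, `translate(0x10, 0x2000)` models the case in which paging is disabled, so the result is the linear address itself, while passing a page table such as `{2: 7}` remaps linear page 2 to physical page 7.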
For a particular data access to memory, the logical address comprises the result of adding certain operands defined by the instruction. As used herein, the term "operand" refers to an input value operated upon by an instruction. Operands are referred to as register operands if the value is stored in a register within the microprocessor. Conversely, operands are referred to as memory operands if the value is stored in a memory location. The memory location is identified by forming a data address. In the x86 microprocessor architecture, an instruction may form the logical data address of a memory operand using up to two register values and up to one displacement value. The displacement is a value encoded into a particular field of the instruction, and is intended for use in forming the logical data address. The register values used to form the logical data address are also referred to herein as register operands.
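The formation of a logical data address from up to two register values and up to one displacement may be sketched as a simple sum. The function name `logical_address` and its parameter names are hypothetical, and the 32-bit wraparound is an assumption about the addressing mode in use. Any scaling of a register value is omitted for brevity.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit addressing mode


def logical_address(base=0, index=0, displacement=0):
    """Sum up to two register values and one displacement.

    Operands that an instruction does not specify default to zero. The sum
    wraps at 32 bits under the assumed addressing mode.
    """
    return (base + index + displacement) & MASK32
```

For example, `logical_address(0x1000, 0x20, 0x4)` yields `0x1024`, and an instruction specifying only a displacement would call `logical_address(displacement=0x4)`.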
Upon generating the logical address, the linear address may be generated. A set of segment registers and associated "shadow registers" store segmentation translation information. The segment selectors are accessible via instructions, while the shadow registers are accessible only to microprocessor hardware. As used herein, the term "segment registers" will be used to refer to the segment registers and associated shadow registers. Each instruction accesses a particular segment register by default when forming linear addresses. Additionally, an instruction may specify a segment register other than the default via an instruction prefix defined in the x86 microprocessor architecture.
Generally speaking, segmentation translation information includes a segment base address, a segment limit, and segment access information. The segment base address is the linear address defined for a logical address having the arithmetic value of zero. Linear addresses within the segment have an arithmetic value which is greater than or equal to the segment base address. The segment limit defines the largest logical address which is within the segment. Logical addresses larger than the segment limit result in an exception being generated by the microprocessor. The segment access information indicates if the segment is present in memory, the type of segment (i.e. code or data, and various subtypes), the addressing mode of the segment, etc. The linear address corresponding to a particular logical address is the result of adding the segment base address to the logical address. Additional information regarding the x86 address translation mechanism may be found in the publication: "PC Magazine Programmer's Technical Reference: The Processor and Coprocessor" by Hummel, Ziff-Davis Press, Emeryville, Calif., 1992. This publication is incorporated herein by reference in its entirety.
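The relationship between the segment base address, the segment limit, and the linear address may be illustrated with the following minimal sketch. The names `Segment`, `SegmentLimitViolation`, and `linear_address` are hypothetical, the 32-bit address width is an assumption, and the model considers only a single-byte access to a segment whose valid logical addresses run from zero up to the segment limit.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF  # assumption: 32-bit addresses


class SegmentLimitViolation(Exception):
    """Models the exception generated for a logical address beyond the limit."""


@dataclass(frozen=True)
class Segment:
    base: int   # linear address corresponding to logical address zero
    limit: int  # largest logical address within the segment


def linear_address(segment, logical):
    """Check the logical address against the limit, then add the segment base."""
    if logical > segment.limit:
        raise SegmentLimitViolation(
            f"logical address {logical:#x} exceeds limit {segment.limit:#x}"
        )
    return (segment.base + logical) & MASK32
```

With `Segment(base=0x10000, limit=0xFFF)`, a logical address of zero maps to the segment base `0x10000`, while a logical address of `0x1000` exceeds the segment limit and raises the exception.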
As used herein, the term "exception" refers to an interruption in the execution of an instruction code sequence. The exception is typically reported to a centralized handling mechanism which determines an appropriate response to the exception. Some exceptions (such as branch misprediction, for example) may be handled by the microprocessor hardware. The hardware performs corrective actions and then restarts instruction execution. Other exceptions may cause a microcode routine within the microprocessor to be executed. The microcode routine corrects the problem corresponding to the exception. Instruction execution may subsequently be restarted at the instruction causing the exception or at another instruction subsequent to the instruction, dependent upon the corrective actions taken. A third type of exception causes execution of special instruction code stored at an address defined for the exception. The special instruction code determines the reason for the exception and any corrective actions. The third type of exception is architecturally defined, such that software may be written to handle the exception. Upon execution of a particular instruction (a return instruction), instruction execution is typically restarted at the instruction which caused the exception. Segment limit violations are an example of the third type of exception. The method selected for handling a particular exception within a microprocessor depends upon the relative frequency at which the exception occurs and the associated performance impact of handling the exception in each of the different manners.
Unfortunately, the generation of a logical address involving up to three operands followed by generation of a linear address from the logical address leads to significant latency in data accesses. Because the logical address may depend upon registers, it is typically not generated until the associated instruction arrives in a functional unit. Generating the logical address typically requires one to two clock cycles (depending upon the number of operands), followed by a linear address generation requiring yet another clock cycle. Delays in address generation result in delays in receiving the accessed data. Instructions dependent upon the accessed data are thereby delayed from executing as well. A mechanism for decreasing the latency involved in generating a linear address is therefore desired.
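The latency described above may be tallied with a small model. This is a sketch under a stated assumption: that logical address generation takes one clock cycle for one or two operands and two clock cycles for three operands. The text states only that one to two clock cycles are required depending upon the number of operands, so the exact mapping, like the function name `address_generation_cycles`, is hypothetical.

```python
def address_generation_cycles(num_operands):
    """Estimate the clock cycles needed to produce a linear address.

    Assumption: one or two operands need one cycle for the logical address,
    and three operands need two cycles. Linear address generation then adds
    one further cycle.
    """
    if not 1 <= num_operands <= 3:
        raise ValueError("an address is formed from one to three operands")
    logical_cycles = 1 if num_operands <= 2 else 2
    linear_cycles = 1
    return logical_cycles + linear_cycles
```

Under this model, a data access using two register values and a displacement spends three clock cycles on address generation before the access itself can begin.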