1. Field of the Invention
This invention relates in general to the field of memory segments, and more particularly to an improved method and apparatus for loading a stack segment register.
2. Description of the Related Art
Segmentation is a memory management and protection mechanism that separates memory within a computer system into a number of different memory segments and supervises access to each of the segments. Segmentation is used to give computer programs their own independent and protected memory spaces. For example, the separate memory spaces can prevent one program from writing into a memory space that is in use by another program. In addition, segmentation provides access control to each memory segment to prohibit programs from accessing memory segments for which they are not authorized.
To reference a memory segment, a processor instruction designates a segment register. The segment register holds a "selector", which acts as an index into a descriptor table. The descriptor table is an array of "descriptors", each of which contains information relating to a particular memory segment. A descriptor includes a limit field that defines the size of the memory segment, a base address for the segment, and other access or control information. When a selector is loaded into a segment register, the descriptor that it references is also loaded into a "hidden" portion of the segment register. Future accesses to memory locations within the memory segment are then made by designating the segment register within a processor instruction, and providing an offset from the base address of the memory segment.
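The selector/descriptor relationship described above can be sketched in a few lines of Python. This is an illustrative model only, not the processor's actual mechanism: the names `Descriptor`, `SegmentRegister`, `load_segment` and `linear_address` are hypothetical, the selector is treated as a plain array index, and the limit is assumed to be the highest valid offset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Descriptor:
    base: int    # base address of the memory segment
    limit: int   # assumed here to be the highest valid offset
    access: int  # access/control information (left opaque in this sketch)


@dataclass
class SegmentRegister:
    selector: int = 0
    hidden: Optional[Descriptor] = None  # "hidden" copy of the descriptor


def load_segment(reg: SegmentRegister, selector: int,
                 table: List[Descriptor]) -> None:
    """Load the selector and cache the descriptor it references."""
    reg.selector = selector
    reg.hidden = table[selector]  # simplification: selector is a plain index


def linear_address(reg: SegmentRegister, offset: int) -> int:
    """Form an address as base + offset, checked against the limit."""
    if reg.hidden is None:
        raise RuntimeError("segment register has not been loaded")
    if offset > reg.hidden.limit:
        raise ValueError("offset exceeds segment limit")
    return reg.hidden.base + offset


table = [Descriptor(0, 0, 0), Descriptor(0x1000, 0xFF, 0)]
ss = SegmentRegister()
load_segment(ss, 1, table)
address = linear_address(ss, 0x10)  # 0x1000 + 0x10
```

Caching the descriptor in the register at load time is what lets later accesses skip the descriptor table lookup.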
One particular memory segment that is commonly used is a stack segment. A stack segment, or simply a "stack", is an area of memory that is used by a processor to temporarily store information, and then provide the information back to the processor in reverse order. Most often, information is "pushed" onto the stack, and later "popped" off the stack in a last-in/first-out (LIFO) order.
Within an x86 processor, a stack segment is designated by loading a stack segment register (SS) with a selector/descriptor combination that defines the stack segment. The base address, limit, and access/control information for the stack segment are later referenced by using the SS segment register within a processor instruction (or by using a processor instruction that references the SS register implicitly). A second register, the ESP register, is used to temporarily store an offset from the base address.
An example of an instruction that utilizes the "stack" memory segment is PUSH. This instruction decrements the stack pointer, contained in the ESP register, and then places the designated operand into the memory location defined by adding the base address for the stack (i.e., the base address stored in the descriptor loaded in the hidden portion of the SS register) to the stack pointer (loaded in the ESP register). To retrieve this operand off of the top of the stack, a similar instruction POP is used. This instruction retrieves the operand located at the memory address defined by adding the base address to the stack pointer. After the memory retrieval is performed, the ESP register is incremented.
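As a rough illustration of the PUSH and POP behavior just described, the sketch below models a 32-bit stack. The class name `Stack`, the 4-byte operand size, and the dictionary used as memory (one whole operand stored per address) are assumptions made for brevity, not details from the text.

```python
class Stack:
    """Toy model of a stack segment addressed as base + ESP."""

    def __init__(self, base: int, esp: int) -> None:
        self.base = base   # base address from the SS descriptor
        self.esp = esp     # stack pointer (offset from the base)
        self.memory = {}   # sparse memory: address -> stored operand

    def push(self, value: int, size: int = 4) -> None:
        # PUSH: decrement the stack pointer first, then store the operand.
        self.esp = (self.esp - size) & 0xFFFFFFFF
        self.memory[self.base + self.esp] = value

    def pop(self, size: int = 4) -> int:
        # POP: read the operand first, then increment the stack pointer.
        value = self.memory[self.base + self.esp]
        self.esp = (self.esp + size) & 0xFFFFFFFF
        return value


stack = Stack(base=0x1000, esp=0x100)
stack.push(1)
stack.push(2)
esp_after_pushes = stack.esp
first_popped = stack.pop()   # last value pushed comes off first (LIFO)
second_popped = stack.pop()
```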
As mentioned above, within a stack segment descriptor, there exists a base address, a limit, and certain access attributes that define the memory segment. One of the access attributes is called the D/B bit. If the segment defined by the descriptor is a code segment, the bit is called the D bit, and indicates the default length for operands and effective addresses. In x86 processors, the default length can be set to either 32-bit operands or 16-bit operands. In a data segment, this bit is called the B bit, and it controls whether stack operations utilize a 32-bit ESP register, or a 16-bit SP register. If the bit is set, i.e., B=1, then pushes, pops and calls use a 32-bit ESP register. If, on the other hand, the B-bit is cleared, B=0, then the 16-bit SP register is used.
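The effect of the B bit on a stack-pointer update can be illustrated with a small sketch. The function names are hypothetical, and the sketch assumes that when B=0 only the low 16 bits (SP) are updated while the upper half of ESP is left unchanged.

```python
def decrement_stack_pointer(esp: int, size: int, b_bit: bool) -> int:
    """Return the new ESP value after a push of `size` bytes."""
    if b_bit:
        # B=1: the full 32-bit ESP register is used.
        return (esp - size) & 0xFFFFFFFF
    # B=0: only the 16-bit SP register is used and wraps at 64 KB;
    # the upper 16 bits of ESP are assumed to be preserved.
    sp = (esp - size) & 0xFFFF
    return (esp & 0xFFFF0000) | sp


def stack_offset(esp: int, b_bit: bool) -> int:
    """Offset actually added to the stack base for a memory access."""
    return esp if b_bit else esp & 0xFFFF


start = 0x00010000
esp_32 = decrement_stack_pointer(start, 4, b_bit=True)
esp_16 = decrement_stack_pointer(start, 4, b_bit=False)
offset_16 = stack_offset(esp_16, b_bit=False)
```

The same starting value yields different stack pointers and different memory offsets depending on the B bit, which is why the bit must be known before a stack instruction can be carried out correctly.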
From the above, it should be clear that before any stack operation occurs, whether a push, pop or call, the SS register must first be loaded with the selector/descriptor combination which defines the stack segment of interest, and the contents of the B-bit must be known, so that the processor can determine whether the stack address size is to be 32-bit or 16-bit before the operation is executed. This is well known in the art.
However, a problem exists in requiring the processor to know the bit size for stack operations prior to execution of a stack instruction. This is especially true when a stack instruction, such as a push, pop or call, is executed immediately following an instruction which loads the SS segment register (e.g., LSS). The problem is that when a processor loads the SS segment register with the selector/descriptor, the content of the B-bit is not known until the load instruction is completed. If the instruction following the load is a stack instruction, it cannot begin execution until the processor knows the contents of the B-bit, because it does not know whether the stack operation is a 16-bit or 32-bit operation. Thus, it cannot begin execution until the load instruction has completed. This situation is unfortunate in modern processors where multiple instructions are in various stages of execution at any one time. Requiring a second instruction to await the complete execution of a first instruction, before it can begin, adds costly processing delay to the processor, affecting the overall program speed of the processor.
For example, in a Pentium processor it typically takes 3 processor clock cycles out of the processor pipeline to execute a segment register load instruction. However, when loading the SS stack segment register, the Pentium processor requires at least 7 processor clock cycles. It is believed that the extra 4 clock cycles required by the LSS instruction are for the purpose of preventing any following instruction from beginning execution until the LSS has completed execution, i.e., until the contents of the B-bit have been determined and provided to the processor.
However, adding 4 no-operation (NOP) clock cycles to an instruction that really takes 3 clock cycles to execute, just to ensure that a following stack instruction doesn't begin with an incorrect bit size (16 or 32), adds unnecessary delay to stack segment loads, as well as to stack operations which follow stack segment loads. Moreover, in much of the software that is written today, it is presumed that stack operations are 32-bit. But, just to ensure compatibility with older 16-bit designs, every stack segment load operation incurs this unnecessary delay.
What is needed is an apparatus and method that reduces or eliminates the delays associated with stack segment loads, while still ensuring that both 16-bit and 32-bit stack operations can be executed. More specifically, what is needed is a method and apparatus which allows a stack instruction, such as a push, pop or call, to begin execution without waiting for the completion of a stack segment load operation, but which still allows changes in the B-bit to immediately affect following stack instructions.
To address the above-detailed deficiencies, it is an object of the present invention to provide an improved method and apparatus for loading a stack segment register.
More specifically, it is an object of the present invention to improve the load time of a load stack segment macro instruction by translating a second instruction following the load stack segment instruction, and providing the translated instruction immediately behind the load stack segment instruction, without any intermediate "holes" or "bubbles" in the pipeline. Then, the translated second instruction is tracked, before execution, to determine whether the load stack segment instruction modifies the bit size for stack operations from 32-bit to 16-bit. If a modification is made by the load stack segment instruction, the intermediate translated instructions are ignored, and the second instruction is retranslated utilizing the new bit size for stack operations.
Accordingly, in the attainment of the aforementioned objects, it is a feature of the present invention to provide an apparatus for loading a stack segment register within a pipeline processor. The apparatus includes a translator, a register file, stack address size logic, stack size tracking logic, and determination logic. The translator is located within a translate stage of the pipeline processor, and translates a first macro instruction into a first sequence of micro instructions, and a second macro instruction into a second sequence of micro instructions. The register file is connected to the translator, and further includes a stack segment register. The stack segment register stores a descriptor which indicates whether stack operations are of a first bit size or of a second bit size. The stack address size logic is connected to the register file, and to the translator, and provides an indicator to the translator that current stack operations are either of the first bit size or of the second bit size. The stack size tracking logic is connected to the translator, and associated with the second sequence of micro instructions, following the first sequence of micro instructions in the pipeline processor. Associated, in this context, indicates that the tracking logic follows or tracks the micro instructions down the pipeline, in processing stages after the translate stage. The tracking logic tracks whether stack operations were of the first bit size or of the second bit size at the time the second sequence of micro instructions was translated.
The determination logic is connected to the stack size tracking logic, and to the stack address size logic, and determines whether the current stack operations and the tracked stack operations are both of the first bit size or of the second bit size, and if not of the same size, provides a signal to the processor to ignore the second sequence of micro instructions, and to retranslate the second macro instruction using the indicator from the stack address size logic.
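The comparison performed by the determination logic can be pictured with the minimal sketch below. It is a software analogy only; the names `TrackedMicroInstruction` and `determination_logic`, and the use of string results in place of a hardware signal, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackedMicroInstruction:
    opcode: str
    translated_sas: int  # stack size (16 or 32) recorded at translate time


def determination_logic(tracked: TrackedMicroInstruction,
                        current_sas: int) -> str:
    """Compare the tracked stack size with the current stack size."""
    if tracked.translated_sas == current_sas:
        return "continue"     # sizes agree: let the micro instruction proceed
    return "retranslate"      # sizes differ: ignore it and retranslate


push_uop = TrackedMicroInstruction(opcode="push", translated_sas=32)
unchanged = determination_logic(push_uop, current_sas=32)
changed = determination_logic(push_uop, current_sas=16)
```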
An advantage of the present invention is that intermediate NOPs which are customarily inserted between the load stack macro instruction and the second macro instruction are eliminated, thus improving the effective load time for loading a stack segment register.
In another aspect, it is a feature of the present invention to provide a tracking mechanism, within a pipeline microprocessor, utilized for micro instructions that follow a stack segment register load. The tracking mechanism includes a plurality of pipeline stages, a stack address size signal line, a plurality of instruction registers, a plurality of stack address size bits, and logic circuitry. The plurality of pipeline stages within the pipeline microprocessor further includes a translate stage that translates macro instructions into micro instruction sequences. The stack address size signal line transmits an indicator signal to the translate stage, from a source thereof, which indicates whether stack operations are to utilize a 16-bit or a 32-bit stack address size (SAS). Each of the plurality of instruction registers is connected to one of the plurality of pipeline stages such that each one of the plurality of pipeline stages has an associated one of the plurality of instruction registers. The plurality of instruction registers temporarily store micro instructions as they proceed through the pipeline microprocessor. The plurality of stack address size bits are connected to one of the plurality of pipeline stages, and associated with one of the plurality of instruction registers, and temporarily store the value of the SAS indicator at the time the micro instruction in the associated instruction register was translated.
The logic circuitry is connected to the stack address size signal line, and to at least one of the plurality of stack address size bits, to determine whether the SAS indicator, stored in the plurality of stack address size bits, is the same as the SAS signal on the stack address size signal line, and provides a signal to the pipeline microprocessor to continue processing micro instructions within the plurality of pipeline stages, or to discontinue processing the micro instructions within the plurality of pipeline stages, and to retranslate the micro instructions whose associated stack address size bit is not the same as the SAS signal on the stack address size signal line.
In yet another aspect, it is a feature of the present invention to provide a method for loading a stack segment register within a pipeline microprocessor. The method includes providing a stack address size (SAS) signal which indicates the current bit size for stack operations, translating a load stack segment macro instruction into a first sequence of micro instructions, providing the first sequence of micro instructions to pipeline stages within the pipeline microprocessor, translating a second macro instruction into a second sequence of micro instructions, storing a stack address size (SAS) indicator along with at least one of the micro instructions within the second sequence, the indicator indicating the bit size for stack operations at the time the second macro instruction was translated, monitoring the SAS signal, and the SAS indicator, to determine whether, prior to executing the micro instructions within the second sequence, the bit size for stack operations has changed since the micro instructions were translated, and if prior to executing the micro instructions within the second sequence, the bit size for stack operations has changed, then disabling execution of the micro instructions within the second sequence, and retranslating the second macro instruction.
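The steps of this method can be walked through with a toy simulation. It is a simplified sketch, not the claimed implementation: all macro instructions are translated up front and queued back to back to stand in for the pipeline, and the names `translate` and `run`, the dictionary layout, and the `new_sizes` argument (the stack size each LSS establishes) are hypothetical.

```python
def translate(macro, sas):
    """Translate stage: tag the micro instruction with the SAS in effect."""
    return [{"macro": macro, "sas": sas}]


def run(program, sas, new_sizes):
    """Execute macro instructions; return (trace, retranslation count)."""
    sizes = iter(new_sizes)
    # Translate everything with the SAS known at translate time, with no
    # bubbles inserted after a load stack segment instruction.
    queue = [uop for macro in program for uop in translate(macro, sas)]
    trace, retranslations = [], 0
    while queue:
        uop = queue.pop(0)
        if uop["sas"] != sas:
            # The stack size changed after translation: discard this micro
            # instruction and those behind it, then retranslate them.
            stale = [uop] + queue
            queue = [u for s in stale for u in translate(s["macro"], sas)]
            retranslations += 1
            continue
        if uop["macro"] == "LSS":
            sas = next(sizes)  # the load establishes the new stack size
        trace.append((uop["macro"], uop["sas"]))
    return trace, retranslations


# The load switches the stack from 32-bit to 16-bit: PUSH is retranslated.
changed_trace, changed_count = run(["LSS", "PUSH"], 32, [16])
# The load leaves the stack at 32-bit: PUSH proceeds with no penalty.
same_trace, same_count = run(["LSS", "PUSH"], 32, [32])
```

In the common case where the stack size does not change, the following instruction proceeds without delay; the retranslation cost is paid only when the load actually changes the size.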
An advantage of the present invention is that instructions which follow a load stack segment instruction need not be delayed for the purpose of ensuring that stack operation size is not changed by the preceding instruction after they are translated but before they are executed.