A computer system can generally be divided into three basic blocks: a central processing unit (CPU), memory, and an input/output (I/O) interface. These blocks are interconnected by one or more buses. Typically, an input device, such as a keyboard, mouse, or disk drive, is used to input data and/or computer programs to the computer system through the computer's I/O interface. The computer programs instruct the computer system as to how the data should be processed. These instructions and data are usually stored in memory. The CPU retrieves the data stored in the memory and processes the data according to the instructions. The results can be stored back into memory or output via the I/O interface to a printer, video monitor, speaker, etc. For example, a user can enter characters by typing on a keyboard. With the help of a word-processing software program, the resulting document can be formatted, spell-checked, cut-and-pasted, and otherwise manipulated. The document is usually displayed on a computer screen while it is being drafted. The finished draft can be printed out and saved electronically onto a disk drive for subsequent retrieval.
When a computer program changes a value stored in memory, it is performing a "store" operation. When the computer retrieves an instruction or data from memory, it is performing a "load" operation. Each of these load and store operations requires an address that specifies a location in memory. In a store operation, the address specifies a location in memory that is available for storing the data. In a load operation, the address specifies the location in memory where the desired instruction or data resides.
Typically, in a modern computer system, there are several different varieties of addressing reflecting different levels of abstraction. For example, in the Intel x86 architecture, there are the logical, linear, and physical addresses.
The logical address is specified in the assembly language or machine code program, and consists of a selector and an offset. The offset is formed by adding together three components: base, scaled index, and displacement. The logical address space is, therefore, segmented.
The logical address (consisting of a segment selector and an offset) is transformed into a flat linear address by adding the segment base that corresponds to the segment selector to the offset.
Both the logical and linear address spaces may be larger than the amount of physical memory in a system. A technique called virtual memory is used to translate the linear address into a physical address used to address a limited amount of physical memory. The limited amount of physical memory is extended by secondary storage, such as a hard disk drive.
As mentioned above, a logical address consists of a segment:offset pair, and the offset is often itself calculated via a formula such as
    base-register + index-register * scale + immediate.
Correspondingly, this implies that the linear address is
    segment-base + base-register + index-register * scale + immediate.
These formulae and the encodings used to represent them in the instruction stream are called "addressing modes."
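The two formulae above can be sketched in a few lines of Python. This is only an illustration: the function names and the 32-bit wrap-around mask are assumptions made for the example, not part of any particular hardware description.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit address arithmetic that wraps around


def effective_offset(base, index, scale, immediate):
    """Offset = base-register + index-register * scale + immediate."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + immediate) & MASK32


def linear_address(segment_base, base, index, scale, immediate):
    """Linear address = segment-base + offset."""
    offset = effective_offset(base, index, scale, immediate)
    return (segment_base + offset) & MASK32


# Hypothetical register contents, chosen only for illustration.
offset = effective_offset(base=0x1000, index=3, scale=4, immediate=8)
linear = linear_address(0x20000, base=0x1000, index=3, scale=4, immediate=8)
```

With these hypothetical values, the offset is 0x1000 + 3 * 4 + 8, and the linear address is that offset plus the segment base 0x20000.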
Addressing modes are motivated by several considerations. First, addressing modes sometimes permit programs to be smaller (e.g., by reducing address size). Instead of placing a 32-bit logical offset constant in an instruction, they sometimes permit a smaller, 1-byte specification of a register, for example. (Note that this is not always true.) Second, addressing modes permit programs and subroutines to be written when the addresses of data are not known in advance. The address can be calculated from input and placed in a register. Third, addressing modes permit some frequent calculations to be encoded within the memory reference instruction, rather than requiring separate instructions.
The base registers are generally used by compilers to point to the start of the local variables or arrays. In addition, index registers are used to access the elements of an array or a string of characters. Furthermore, the index register's value can be multiplied by a scale factor (e.g., 1, 2, 4, 8, etc.) which is useful in accessing arrays or similar structures. Lastly, a displacement is added for calculating the final effective address.
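As a rough illustration of the roles described above, the following sketch models memory as a byte array and accesses a small array of 4-byte integers among the local variables. The frame base, the array's displacement, and the helper names are all hypothetical values chosen for the example.

```python
import struct

memory = bytearray(64)  # toy memory; size chosen only for the example

FRAME_BASE = 16  # hypothetical base-register value: start of local variables
ARRAY_DISP = 8   # hypothetical displacement of the array within the locals
ELEM_SIZE = 4    # scale factor: each element is a 4-byte integer


def element_address(base, index, scale, displacement):
    """Effective address of one array element: base + index * scale + displacement."""
    return base + index * scale + displacement


def store_u32(address, value):
    """Store a 32-bit little-endian value at the given address."""
    struct.pack_into("<I", memory, address, value)


def load_u32(address):
    """Load a 32-bit little-endian value from the given address."""
    return struct.unpack_from("<I", memory, address)[0]


# Fill the array, then read back element 2 using the index register's role.
for i, value in enumerate([10, 20, 30, 40]):
    store_u32(element_address(FRAME_BASE, i, ELEM_SIZE, ARRAY_DISP), value)

third = load_u32(element_address(FRAME_BASE, 2, ELEM_SIZE, ARRAY_DISP))
```

Only the index changes from one element to the next; the base, scale, and displacement stay fixed, which is why a single addressing mode can serve the whole array.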
Memory can be divided into one or more variable length segments, which can be swapped to disk or shared between programs. Memory can also be organized into one or more "pages". Segmentation and paging are complementary. Segmentation is useful to application programmers for organizing memory in logical modules, whereas pages are useful to the system programmer for managing the physical memory of a system. Typically, a segmentation unit is used to translate the logical address space into a 32-bit linear address space. A paging unit is then used to translate this linear address space into a physical address space. It is this physical address that appears on a microprocessor chip's address pins.
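The two translation stages described above can be sketched as follows. This is a simplified model under stated assumptions: pages are taken to be 4 KiB, the page table is a single flat dictionary rather than a multi-level hardware structure, and all names and table contents are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages
MASK32 = 0xFFFFFFFF


def segment_translate(segment_bases, selector, offset):
    """Segmentation stage: logical (selector, offset) -> 32-bit linear address."""
    return (segment_bases[selector] + offset) & MASK32


def page_translate(page_table, linear):
    """Paging stage: linear address -> physical address, or a page fault."""
    page, page_offset = divmod(linear, PAGE_SIZE)
    if page not in page_table:
        # In a real system the page would be brought in from secondary storage.
        raise LookupError("page fault: page %#x not resident" % page)
    return page_table[page] * PAGE_SIZE + page_offset


# Hypothetical tables, for illustration only.
segment_bases = {"data": 0x00400000}
page_table = {0x400: 0x12}  # linear page 0x400 maps to physical frame 0x12

linear = segment_translate(segment_bases, "data", 0x123)
physical = page_translate(page_table, linear)
```

The offset within the page passes through the paging stage unchanged; only the page number is replaced by a physical frame number.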
In many CPUs, an Address Generation Unit (AGU) is used to perform the address calculations. The AGU is also responsible for handling all segment operations and for controlling accesses to all control/test registers. In the past, AGUs handled segments having relatively small widths (e.g., 16 bits). Over time, advances in microprocessor technology led to wider and wider segments. At first, in the Intel™ 8086 architecture, a segment selector was used to build the descriptor. Then, in the Intel™ 286 architecture, a 16-bit protected mode used the selector to access memory for the 48-bit descriptor. Later, the Intel™ 386 architecture, which had a 32-bit protected mode, used the selector to access memory for the 64-bit descriptor. In order to maintain backwards compatibility with previous software, these additional bits were written into separate positions by the operating system. However, these non-contiguous segments are rather difficult for the hardware to process.
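To make the difficulty concrete, the sketch below reassembles the base and limit from a 64-bit descriptor in which those fields are split across separate bit positions. The bit positions follow the commonly documented 386 segment descriptor layout, which is an assumption brought in for the example and is not spelled out in the text above; the function names are hypothetical.

```python
def descriptor_base(desc):
    """Gather the 32-bit segment base from its two separate fields."""
    low = (desc >> 16) & 0xFFFFFF  # base bits 23..0 live in descriptor bits 39..16
    high = (desc >> 56) & 0xFF     # base bits 31..24 live in descriptor bits 63..56
    return (high << 24) | low


def descriptor_limit(desc):
    """Gather the 20-bit segment limit from its two separate fields."""
    low = desc & 0xFFFF            # limit bits 15..0 live in descriptor bits 15..0
    high = (desc >> 48) & 0xF      # limit bits 19..16 live in descriptor bits 51..48
    return (high << 16) | low


# Hypothetical descriptor with base 0x12345678 and limit 0xABCDE.
desc = (0x12 << 56) | (0xA << 48) | (0x345678 << 16) | 0xBCDE
base = descriptor_base(desc)
limit = descriptor_limit(desc)
```

Each field needs two shifts, two masks, and a merge before it can be used in an address calculation, which hints at why scattered fields are awkward to handle directly in hardware.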
Another problem associated with prior art AGU designs is that as segment widths increased, the width of the buses needed to increase correspondingly. However, increasing the bus width is expensive, as it consumes a great deal of silicon area (i.e., die size). Increasing the die size means that fewer dies can be made from a given wafer. This directly translates into higher production costs.
Yet another problem with prior art AGU designs is that they are slow. One factor slowing down the address generation process is the use of two different descriptor tables. A selector value gives the offset into one of these descriptor tables. Hence, a typical AGU first reads the selector, determines which of the two tables to access, and only then reads the descriptor value from that table. This process is rather slow and cumbersome.
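The three dependent steps can be sketched as below. The selector bit layout used here (a table-indicator bit at bit 2 and an index in bits 15..3) follows the commonly documented x86 selector format, and the names `gdt` and `ldt` for the two tables are likewise assumptions brought in for the example; the table contents are hypothetical.

```python
def read_descriptor(selector, gdt, ldt):
    """Read the selector, pick one of the two tables, then fetch the descriptor."""
    index = selector >> 3            # step 1: decode the entry index from the selector
    use_ldt = (selector >> 2) & 1    # step 2: the table-indicator bit picks the table
    table = ldt if use_ldt else gdt
    return table[index]              # step 3: read the descriptor from that table


# Hypothetical table contents, for illustration only.
gdt = [0x0, 0x00CF9A000000FFFF]
ldt = [0x1111]

from_gdt = read_descriptor(0x08, gdt, ldt)  # index 1, table indicator 0
from_ldt = read_descriptor(0x04, gdt, ldt)  # index 0, table indicator 1
```

Each step depends on the result of the previous one, so none of them can begin early; that serial dependency is the slowness described above.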
Furthermore, the next generation of microprocessors is incorporating out-of-order processing. In other words, instructions are not necessarily executed in the same sequence implied by the source program. In addition, the source code is also being processed speculatively. Speculation is the technique of guessing which way the program will proceed, and performing the execution down that path. This implies that there exists a method of correcting erroneous speculations.
Therefore, there is a need in the prior art for an AGU design that accommodates out-of-order and speculative processing. It would be preferable if such an AGU were also backwards compatible. It would also be preferable for such an AGU to have a limited bus width without sacrificing efficiency, and to be fast and flexible.