1. Field of the Invention
This invention relates in general to the field of data processing in computers, and more particularly to an apparatus and method for reducing the number of micro instructions and commensurate clock cycles that are required to access misaligned memory operands.
2. Description of the Related Art
A present day microprocessor has an instruction path, or pipeline, that is divided into stages, with each stage dedicated to performing a specific type of function. A first stage fetches program instructions from instruction memory and places them in a macro instruction queue. A following stage decodes these macro instructions into associated micro instructions. Each of the associated micro instructions directs the microprocessor to perform certain tasks in one or more of the subsequent stages. Together, the tasks specified by each of the associated micro instructions accomplish an overall operation that is directed by a corresponding macro instruction. Decoded micro instructions are executed in sequence by and proceed through successive stages of the microprocessor in synchronization with a microprocessor clock signal.
In addition to prescribing an operation to be performed, many macro instructions also specify operands to be used in carrying out the prescribed operation. For example, a division macro instruction will specify both a dividend operand and a divisor operand. Most instruction sets today give a programmer a wide variety of options for prescribing operands, one such option being the option to specify an operand that is located at an address in memory.
To access a memory operand, a microprocessor must first generate an address corresponding to the operand's location in memory. Then this address is issued to memory over an address bus as part of a task to either load or store the operand. Memory devices, in turn, access the operand at the prescribed address. For load operations, the operand is provided from the memory devices to the microprocessor over a data bus. For store operations, the operand is written from the microprocessor to the memory devices over the data bus.
Early microprocessor designs provided a rudimentary 8-bit data bus. Hence, accessing an operand larger than one byte required the microprocessor to execute multiple access micro instructions. And because bus accesses are significantly slower than transfers within a microprocessor, this approach was deemed early on to be deficient because it took too long to retrieve or store memory operands.
Many improvements have been made since that time to speed up the access of memory operands. One such development, multiple-byte data buses, now allows multiple-byte memory operands to be accessed within a single load/store cycle. A significant number of today's microprocessors have a 64-bit data bus, thus allowing access of eight bytes (i.e., a quadword) in parallel. Furthermore, all accesses are made according to the full width of the data bus. Hence, for a 64-bit data bus, all accesses over the bus are made to 8-byte address ranges, regardless of the size of the prescribed memory operand. In fact, for a 64-bit data bus architecture, the lower three bits of an address are not even provided on the address bus because the lower three bits are used to address individual bytes within a given 8-byte address range.
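The address split described above can be illustrated with a minimal sketch, assuming a 64-bit (8-byte) data bus. The names BUS_BYTES and split_address are hypothetical and chosen only for illustration; they do not come from any particular design.

```python
from typing import Tuple

BUS_BYTES = 8  # assumption: 64-bit data bus, so every bus access covers 8 bytes


def split_address(addr: int) -> Tuple[int, int]:
    """Split a byte address into (bus_address, byte_offset).

    bus_address is the part driven on the address bus (it selects an
    8-byte address range); byte_offset is the lower three bits, which
    select one byte within that range.
    """
    bus_address = addr >> 3               # drop the lower three bits
    byte_offset = addr & (BUS_BYTES - 1)  # keep only the lower three bits
    return bus_address, byte_offset
```

For example, split_address(0x1005) yields (0x200, 5): the byte lies at offset 5 within the 8-byte range that begins at address 0x1000.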
When a multiple-byte operand is fully contained within the address range corresponding to the size of the data bus, then only one load/store cycle is required to access the operand. But when only part of the operand is contained within the address range and another part of the operand is contained within a previous/next address range, then more than one cycle is required to accomplish the access. An operand that only partially resides within the address range is referred to as a misaligned operand.
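As a minimal sketch of this definition, assuming an 8-byte data bus and an operand no larger than the bus width, the following check compares the 8-byte address range holding the first byte of the operand with the range holding its last byte. The function name is_misaligned is hypothetical.

```python
BUS_BYTES = 8  # assumption: 64-bit data bus


def is_misaligned(addr: int, size: int) -> bool:
    """Return True if an operand of `size` bytes at `addr` spans two
    sequential bus-width address ranges (assumes 1 <= size <= BUS_BYTES)."""
    first_range = addr // BUS_BYTES               # range holding the first byte
    last_range = (addr + size - 1) // BUS_BYTES   # range holding the last byte
    return first_range != last_range
```

Under these assumptions, a 4-byte operand at address 0x1004 is aligned because it occupies bytes 0x1004 through 0x1007 of a single range, whereas the same operand at address 0x1006 is misaligned because its last two bytes fall in the next range.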
To access a misaligned memory operand spanning two sequential address ranges requires three instructions: a first access instruction to access a first part of the misaligned operand within a first of the two sequential address ranges; a second access instruction to access a second part of the misaligned operand within the second of the two sequential address ranges; and an additional instruction to ensure that both parts of the misaligned operand can be retrieved without incurring an error.
This additional instruction is called an access tickle instruction, or more briefly, a tickle instruction. The tickle instruction must be executed when accessing a misaligned operand because memory paging schemes may allow access to the first part of the misaligned operand while precluding access to the second part of the misaligned operand, thus permitting a partial, and inaccurate, load or store of the misaligned operand. Memory paging schemes divide the memory address space up into discrete blocks and allow a programmer to assign access privileges to each block, or page. The most common page size seen today is a 4 kB page. In addition, paging schemes allow a programmer to provide a map, in memory, from each block of memory addresses generated by the microprocessor to a corresponding block of physical addresses within the computer system. This mapping feature allows an application program to simulate a large address space by using a small amount of memory and some disk space.
Paging is a powerful feature within present day computer systems. Yet, when paging is employed, each time an instruction directs access to a particular generated address, or virtual address, two actions must occur. First, the virtual address must be mapped to a corresponding physical address within a specific memory page. Second, access privileges to the memory page must be validated. If access is denied to the memory page, then the directed access operation is aborted.
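The two actions can be sketched as follows. This is a deliberately simplified model, assuming 4 kB pages, a flat dictionary in place of a real page table, and a single "writable" flag in place of a full set of access privileges. The names translate, AccessDenied, and page_table are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 kB pages


class AccessDenied(Exception):
    """Raised when the directed access must be aborted."""


def translate(page_table: dict, vaddr: int, write: bool) -> int:
    """Map a virtual address to a physical address and validate access.

    page_table maps a virtual page number to a (physical_frame, writable)
    pair; this flat layout is an illustrative assumption only.
    """
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    entry = page_table.get(vpn)          # first action: map virtual to physical
    if entry is None:
        raise AccessDenied(f"no mapping for virtual page {vpn:#x}")
    frame, writable = entry
    if write and not writable:           # second action: validate privileges
        raise AccessDenied(f"write not permitted to virtual page {vpn:#x}")
    return frame * PAGE_SIZE + offset
```

With a table such as {1: (7, True), 2: (9, False)}, a read of virtual address 0x1010 maps to physical address 0x7010, while a write to virtual address 0x2004 raises AccessDenied and the access is aborted.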
As part of a misaligned access instruction sequence, a present day microprocessor first executes the tickle instruction to verify access to a memory page containing the second part of the misaligned operand, but it does not access the second part. Hence, the memory page containing the second part is "tickled." If access is denied to the memory page containing the second part, then the access operation can be aborted prior to fetching the first part of the operand. If access is allowed, then it is known that access to the memory page containing the second part will be allowed prior to execution of an instruction directing access to the first part.
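The conventional three-instruction sequence can be sketched as a list of symbolic micro instructions. The sketch assumes an 8-byte data bus and an operand that is already known to be misaligned; the tuple encoding and the name conventional_sequence are hypothetical and serve only to show the ordering.

```python
BUS_BYTES = 8  # assumption: 64-bit data bus


def conventional_sequence(addr: int, size: int) -> list:
    """Return the tickle-first sequence for a misaligned operand.

    Each entry is ("tickle", address) or ("access", address, byte_count).
    Assumes the operand at `addr` really does span two sequential
    8-byte address ranges.
    """
    second_start = (addr // BUS_BYTES + 1) * BUS_BYTES  # start of the second range
    first_len = second_start - addr                     # bytes in the first part
    return [
        ("tickle", second_start),                  # validate the page only, move no data
        ("access", addr, first_len),               # first part of the operand
        ("access", second_start, size - first_len) # second part of the operand
    ]
```

For a 4-byte operand at address 0x1006, the sequence is a tickle at 0x1008, followed by a 2-byte access at 0x1006 and a 2-byte access at 0x1008.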
Tickle instructions are required to perform misaligned memory accesses because a present day microprocessor does not provide the capability to determine, prior to address translation and access validation, whether or not the two parts of a misaligned operand will reside in two separate memory pages having different access privileges.
The present inventors, however, have observed that less than one percent of all misaligned operands actually reside in two different memory pages and hence, a significant number of pipeline cycles are being wasted in the execution of non-essential tickle instructions.
Therefore, what is needed is an apparatus in a pipeline microprocessor for accessing a misaligned memory operand that eliminates generation of a tickle instruction when both parts of the operand lie within the same memory page.
In addition, what is needed is a misaligned memory operand access apparatus that determines, prior to generation of a tickle instruction, whether or not a misaligned memory operand spans more than a single memory page.
Furthermore, what is needed is a method in a pipeline microprocessor for accessing misaligned memory operands that precludes generation of a tickle instruction when the misaligned operands do not cross a memory page boundary.
Accordingly, it is a feature of the present invention to provide an apparatus in a microprocessor for accessing a misaligned memory operand. The apparatus includes page boundary evaluation logic and address logic. The page boundary evaluation logic evaluates an address corresponding to the misaligned memory operand, and determines whether or not access to the misaligned memory operand is within a single memory page. The address logic is coupled to the page boundary evaluation logic. The address logic eliminates an access tickle instruction when access to the misaligned memory operand is within the single memory page.
In another aspect, it is a feature of the present invention to provide a microprocessor apparatus for accessing a misaligned memory operand. The microprocessor apparatus has address logic and page limit logic. The address logic generates access instructions to access the misaligned memory operand. The access instructions include an access tickle instruction, a first access instruction, and a second access instruction. The access tickle instruction only directs page access validation of a second part of the misaligned memory operand. The first access instruction directs page access validation of and access to a first part of the misaligned memory operand. The second access instruction directs page access validation of and access to the second part of the misaligned memory operand. The page limit logic is coupled to the address logic. The page limit logic determines, prior to generation of the access tickle instruction, whether or not the misaligned memory operand spans two memory pages. If the misaligned memory operand resides within a single memory page, the page limit logic directs the address logic to preclude generation of the access tickle instruction.
In yet another aspect, it is a feature of the present invention to provide a data entity load/store apparatus in a microprocessor. The data entity load/store apparatus includes address stage logic and data/ALU stage logic. The address stage logic provides instructions to load/store data entities to/from a memory. The address stage logic has load/store instruction generation logic and a page boundary evaluator. The load/store instruction generation logic generates a specific instruction sequence to load/store a misaligned data entity. The page boundary evaluator is coupled to the load/store instruction generation logic. The page boundary evaluator indicates conditions such that a tickle instruction is eliminated from within the specific instruction sequence. The data/ALU stage logic is coupled to the address stage logic. The data/ALU stage logic executes the specific instruction sequence to load/store said misaligned data entity. If the tickle instruction is provided, then the data/ALU stage logic executes the tickle instruction by validating access to a first memory page without loading/storing a first portion of the misaligned data entity from/to the first memory page.
In a further aspect, it is a feature of the present invention to provide a method in a microprocessor for accessing a misaligned memory operand. The method includes evaluating an address corresponding to the misaligned memory operand to determine whether or not the misaligned memory operand is within a single memory page; and, if the misaligned memory operand is within the single memory page, providing an instruction sequence to load a first part and a second part of the misaligned memory operand from the single memory page, where the instruction sequence does not perform a stand-alone tickle operation.
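The page boundary evaluation common to the aspects above can be illustrated with a minimal software sketch. It is not the claimed apparatus, which is hardware logic; it only models the decision, assuming 4 kB pages, an 8-byte data bus, and an operand already known to be misaligned. The names within_single_page and access_sequence, and the tuple encoding of instructions, are hypothetical.

```python
PAGE_SIZE = 4096  # assumption: 4 kB pages
BUS_BYTES = 8     # assumption: 64-bit data bus


def within_single_page(addr: int, size: int) -> bool:
    """True if the first and last bytes of the operand lie in the same page."""
    return addr // PAGE_SIZE == (addr + size - 1) // PAGE_SIZE


def access_sequence(addr: int, size: int) -> list:
    """Build the access sequence for a misaligned operand, omitting the
    tickle when the operand does not cross a page boundary."""
    second_start = (addr // BUS_BYTES + 1) * BUS_BYTES  # start of the second range
    first_len = second_start - addr                     # bytes in the first part
    sequence = []
    if not within_single_page(addr, size):
        # Only an operand spanning two pages needs the stand-alone validation.
        sequence.append(("tickle", second_start))
    sequence.append(("access", addr, first_len))
    sequence.append(("access", second_start, size - first_len))
    return sequence
```

Under these assumptions, a 4-byte operand at address 0x1006 yields only the two access instructions, whereas the same operand at address 0x1FFE, which straddles the page boundary at 0x2000, retains the leading tickle.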