1. Field of the Invention
This invention is related to the field of processors and, more particularly, to mechanisms for handling misalignment of load/store memory operations in processors.
2. Description of the Related Art
Processors generally include support for memory operations to facilitate transfer of data between the processors and memory to which the processors may be coupled. As used herein, a memory operation is an operation specifying a transfer of data between a processor and a main memory (although the transfer may be completed in cache). Load memory operations specify a transfer of data from memory to the processor, and store memory operations specify a transfer of data from the processor to memory. Memory operations may be an implicit part of an instruction which includes a memory operation, or may be explicit load/store instructions. Load memory operations may be more succinctly referred to herein as “loads”. Similarly, store memory operations may be more succinctly referred to as “stores”.
A given memory operation may specify the transfer of multiple bytes beginning at a memory address calculated during execution of the memory operation. For example, 16 bit (2 byte), 32 bit (4 byte), and 64 bit (8 byte) transfers are common in addition to an 8 bit (1 byte) transfer. The address may be calculated by adding one or more address operands specified by the memory operation to generate a virtual address, which may be translated through an address translation mechanism to a physical address of a memory location within the memory. Typically, the address may identify any byte as the first byte to be transferred, and the additional bytes of the multiple byte transfer are contiguous to the first byte.
Unfortunately, since any byte may be identified as the first byte, a given memory operation may be misaligned. At an architectural level, a memory operation having an address A and accessing N bytes may be defined to be misaligned if A mod N is not equal to zero. However, a particular processor may define misalignment more loosely. Generally, a particular processor may define a memory operation to be misaligned if the memory operation requires additional execution resources (as compared to an aligned memory operation) to complete the access to the N bytes operated upon by the memory operation. For example, a processor may implement a cache having cache lines. If one or more of the N bytes operated upon by the memory operation are in one cache line and the remaining bytes are in another cache line, two cache lines are accessed to complete the memory operation, as opposed to one cache line if the N bytes are included within one cache line. Such an implementation may define misalignment to mean that a cache line boundary is crossed by the N bytes (one or more of the N bytes are on one side of the cache line boundary, and the remaining bytes are on the other side of the cache line boundary). Other implementations may employ multiple banks within the cache, and each cache line may be spread out among the banks. Such an implementation may define misalignment to mean that a bank boundary is crossed by the N bytes. Other implementations may define misalignment differently.
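The misalignment definitions above can be compared with a minimal sketch. The function names, the 64-byte cache line, and the 8-byte bank width are illustrative assumptions and are not taken from this description; a real processor performs these checks in hardware.

```python
# Illustrative sketch only: the names and sizes below are assumptions.
CACHE_LINE_BYTES = 64  # assumed cache line size
BANK_BYTES = 8         # assumed width of one cache bank


def is_architecturally_misaligned(addr: int, size: int) -> bool:
    """Architectural definition: address A, N bytes, misaligned if A mod N != 0."""
    return addr % size != 0


def crosses_boundary(addr: int, size: int, block_bytes: int) -> bool:
    """Looser definition: the N bytes span more than one block.

    The block may be a cache line or a bank, depending on the implementation.
    """
    first_block = addr // block_bytes
    last_block = (addr + size - 1) // block_bytes
    return first_block != last_block


# A 4-byte access at 0x1002 is architecturally misaligned but stays
# inside one cache line and one bank.
a = 0x1002
print(is_architecturally_misaligned(a, 4))        # True
print(crosses_boundary(a, 4, CACHE_LINE_BYTES))   # False
print(crosses_boundary(a, 4, BANK_BYTES))         # False

# A 4-byte access at 0x103E straddles two cache lines.
print(crosses_boundary(0x103E, 4, CACHE_LINE_BYTES))  # True
```

Under these assumptions, an implementation using the boundary-crossing definition would treat the access at 0x1002 as aligned, since it needs no additional execution resources, even though it is misaligned at the architectural level.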
As indicated above, misaligned memory operations may require more execution resources to complete than aligned memory operations require. However, the misaligned memory operations must be executed correctly to comply with the instruction set architecture of the processor. Accordingly, a mechanism for handling misaligned memory operations is desired.
It is noted that loads, stores, and other instructions or instruction operations may be referred to herein as being older or younger than other instructions or instruction operations. A first instruction is older than a second instruction if the first instruction precedes the second instruction in program order (i.e. the order of the instructions in the program being executed). A first instruction is younger than a second instruction if the first instruction is subsequent to the second instruction in program order. Additionally, the term “execution resource” generally refers to a piece of hardware used during the execution of an instruction. If one instruction is using an execution resource, another instruction is precluded from concurrent use of that execution resource.
The problems outlined above are in large part solved by a processor as described herein. The processor includes execution resources for handling a first memory operation and a concurrent second memory operation. If one of the memory operations is misaligned, the processor may allocate the execution resources for the other memory operation to that memory operation. Advantageously, additional execution resources for handling misalignment may be eliminated. Instead, a small amount of hardware may be included to detect the misalignment and allocate the execution resources for the other memory operation. Additionally, in one embodiment, the power consumed when executing misaligned memory operations may be substantially the same as executing non-misaligned memory operations since additional execution resources are not added to support misaligned memory operations. For example, additional cache reads may not be performed if the execution resources to be allocated include a cache port.
In one embodiment, the older memory operation proceeds if misalignment is detected. The younger memory operation is retried and may be reexecuted at a later time. If the older memory operation is misaligned, the execution resources provided for the younger operation may be allocated to the older memory operation. If only the younger memory operation is misaligned, the younger memory operation may be the older memory operation during a subsequent reexecution and may thus be allocated the execution resources to allow the memory operation to complete.
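The retry policy of this embodiment can be modeled in software as a small decision function. This is a sketch under stated assumptions: the `Outcome` record and the `arbitrate` function are hypothetical names introduced for illustration, and the actual mechanism is a hardware circuit rather than code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Hypothetical record of what happens to two concurrent memory operations."""
    older_proceeds: bool
    younger_proceeds: bool
    younger_retried: bool
    older_gets_younger_resources: bool


def arbitrate(older_misaligned: bool, younger_misaligned: bool) -> Outcome:
    """Model of the policy: on any misalignment the older operation proceeds
    and the younger operation is retried."""
    if not older_misaligned and not younger_misaligned:
        # No misalignment: both operations use their own execution resources.
        return Outcome(True, True, False, False)
    # Misalignment detected: the younger operation is retried. Its execution
    # resources go to the older operation only if the older one is misaligned.
    return Outcome(True, False, True, older_misaligned)


# Only the younger operation is misaligned: it is retried now.
first_pass = arbitrate(older_misaligned=False, younger_misaligned=True)
print(first_pass.younger_retried)                # True

# On re-execution it may be the older operation and is then allocated
# the resources of whichever operation executes alongside it.
second_pass = arbitrate(older_misaligned=True, younger_misaligned=False)
print(second_pass.older_gets_younger_resources)  # True
```

The sketch shows why no dedicated misalignment resources are needed in this model: a misaligned operation eventually becomes the older of the pair and borrows the resources that the younger operation would otherwise have used.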
Broadly speaking, a processor is contemplated. The processor includes a first address generation unit (AGU) and a misalignment circuit. The first AGU is configured to generate a first misalign signal indicative of whether or not a first memory operation is misaligned. Coupled to receive the first misalign signal, the misalignment circuit is configured to allocate at least one execution resource corresponding to a second memory operation concurrently executable with the first memory operation to the first memory operation in response to the first misalign signal. Additionally, a computer system is contemplated including the processor and an input/output (I/O) device configured to communicate between the computer system and another computer system to which the I/O device is couplable.
Furthermore, a method is contemplated. A first memory operation is executed, wherein the execution includes determining that the first memory operation is misaligned. At least one execution resource corresponding to a second memory operation is allocated to the first memory operation responsive to determining that the first memory operation is misaligned. The second memory operation is concurrently executable with the first memory operation.
Moreover, a processor is contemplated. The processor comprises a first AGU, a second AGU, and a misalignment circuit. The first AGU is configured to generate a first misalign signal indicative of whether or not a first memory operation is misaligned. Similarly, the second AGU is configured to generate a second misalign signal indicative of whether or not a second memory operation is misaligned. Coupled to receive the first misalign signal and the second misalign signal, the misalignment circuit is configured to signal a retry of one of the first memory operation and the second memory operation in response to at least one of the first misalign signal and the second misalign signal indicating misalignment.