1. Technical Field
The present invention relates generally to microprocessors and, in particular, to a method and apparatus for aligning memory write data in a microprocessor.
2. Description of Related Art
Contemporary microprocessors usually support the processing of data of multiple bit widths. In such microprocessors, one or more data types are preferred in that they are supported by appropriately sized hardware primitives, such as registers, arithmetic logic units (ALUs), and memory queues.
Other data types are supported only to a limited degree, in that such data can be read from and written to memory at their natural alignment boundaries. During a read, this usually requires realigning the data within the processor registers and data paths so that a single data item of less than full width is placed at the least significant position, optionally with zero or sign extension. Likewise, when a sub-width data item is written to an address on its natural alignment boundary, it must be realigned within the processor to that boundary.
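The realignment described above can be illustrated with a minimal sketch in C, assuming a 32-bit full-width word and sub-width items of 1 or 2 bytes. The function names (extract_zero_ext, extract_sign_ext, align_for_write) and the shift-based interface are hypothetical and serve only as an illustration; they do not describe any particular processor.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering the low `width_bytes` bytes of a 32-bit word. */
static uint32_t width_mask(unsigned width_bytes)
{
    assert(width_bytes == 1 || width_bytes == 2 || width_bytes == 4);
    return (width_bytes == 4) ? 0xFFFFFFFFu
                              : ((1u << (8u * width_bytes)) - 1u);
}

/* Read path: move a sub-width item found at bit position `shift_bits`
   down to the least significant position, zero-extended. */
static uint32_t extract_zero_ext(uint32_t word, unsigned shift_bits,
                                 unsigned width_bytes)
{
    assert(shift_bits + 8u * width_bytes <= 32u);
    return (word >> shift_bits) & width_mask(width_bytes);
}

/* Read path with sign extension: the top bit of the item is replicated
   into the upper bits. The final conversion to int32_t relies on the
   usual two's-complement behavior of common compilers. */
static int32_t extract_sign_ext(uint32_t word, unsigned shift_bits,
                                unsigned width_bytes)
{
    uint32_t v = extract_zero_ext(word, shift_bits, width_bytes);
    uint32_t sign = 1u << (8u * width_bytes - 1u);
    return (int32_t)((v ^ sign) - sign);
}

/* Write path: move a sub-width item from the least significant position
   up to the bit position of its natural alignment boundary. */
static uint32_t align_for_write(uint32_t value, unsigned shift_bits,
                                unsigned width_bytes)
{
    assert(shift_bits + 8u * width_bytes <= 32u);
    return (value & width_mask(width_bytes)) << shift_bits;
}
```

In this sketch the caller supplies the bit position of the item; how that position follows from the address depends on the byte order of the machine.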
It should also be noted that data in modern computers is organized in one of two formats, or in some combination of the two. In the “big-endian” format, the high-order bit, byte, or other unit of information is located at the lower-numbered unit address; in the “little-endian” format, the high-order bit, byte, or other unit of information is located at the higher-numbered unit address.
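As an illustration of the two byte orders at byte granularity, the following C sketch stores the same 32-bit value into a 4-byte buffer in each format. The function names store_be32 and store_le32 are hypothetical and are used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Big-endian: the high-order byte goes to the lowest-numbered address. */
static void store_be32(uint8_t *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Little-endian: the high-order byte goes to the highest-numbered address. */
static void store_le32(uint8_t *out, uint32_t v)
{
    assert(out != NULL);
    out[0] = (uint8_t)v;
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}
```

For the value 0x11223344, the byte at address 0 is 0x11 in the big-endian layout and 0x44 in the little-endian layout.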
To date, microprocessors have usually included a rotator/alignment network before the memory interface for writing data, and after the memory stage for reading data. This is undesirable for at least the following reasons. One such reason concerns data flow considerations. That is, if the standard rotator is to be used, then the data flow from the rotator to the memory interface is complex and can be slow. In particular, the rotator is aligned at the ALU track pitch, which differs from the memory track pitch. Therefore, it is often necessary to introduce a second and possibly a third rotator, before (for writing) and after (for reading) the memory access. This requires additional hardware and also increases the latency of memory operations, including those that do not need alignment operations (usually data in one of the preferred data formats corresponding to a natural machine processing width), since bypassing the rotators is often not practical.
Attempts have been made to reduce the complexity and resultant latency of memory access in the MIPS-X processor prototype from Stanford University, and the initial Alpha processor specification by Digital Equipment Corporation.
The Stanford MIPS processor is described by J. Hennessy, in “VLSI Processor Architecture”, IEEE Transactions on Computers, Vol. C-33, No. 12, pp. 1221–46, December 1984. This processor uses byte insert (IC) and extract (XC) operations to manipulate bytes, but otherwise only supports word addressing.
The approach used in these processors was to support only preferred data width memory operations, in conjunction with explicit memory alignment operations. Both of these architecture specifications have only had limited success with this approach, prompting the addition of sub-word memory operations to later generations of the processors.
The usual alignment networks are endian-specific and adding endian-independence usually requires additional logic in the alignment network. Processor implementors can either decide to support both big-endian and little-endian modes at the cost of high complexity, or only support one mode at the cost of sacrificing compatibility with a significant number of processors not having the selected endianness.
Explicit software-based alignment does not suffer from this defect, as both little-endian and big-endian configurations can be supported by the appropriate software sequences.
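A minimal C sketch of such a software sequence is given below, assuming a 32-bit word, naturally aligned items of 1, 2, or 4 bytes, and a word value as it would appear in a processor register after a full-width load. The names byte_shift and merge_subword and the big_endian flag are hypothetical; the sketch illustrates the general idea of explicit software alignment and is not the method of any processor discussed here.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position of an item within the word, chosen by byte order.
   byte_offset is the item's address modulo 4. */
static unsigned byte_shift(unsigned byte_offset, unsigned width_bytes,
                           int big_endian)
{
    assert(width_bytes == 1 || width_bytes == 2 || width_bytes == 4);
    assert(byte_offset % width_bytes == 0);   /* natural alignment */
    assert(byte_offset + width_bytes <= 4);
    return big_endian ? 8u * (4u - width_bytes - byte_offset)
                      : 8u * byte_offset;
}

/* Read-modify-write merge: replace the item at byte_offset in old_word
   with the low width_bytes bytes of value; other bytes are unchanged. */
static uint32_t merge_subword(uint32_t old_word, uint32_t value,
                              unsigned byte_offset, unsigned width_bytes,
                              int big_endian)
{
    unsigned shift = byte_shift(byte_offset, width_bytes, big_endian);
    uint32_t mask = (width_bytes == 4) ? 0xFFFFFFFFu
                                       : ((1u << (8u * width_bytes)) - 1u);
    return (old_word & ~(mask << shift)) | ((value & mask) << shift);
}
```

Only the shift computation depends on the byte order, so the same masking and merging steps serve both configurations. The merged word would then be written back with a full-width store.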
Accordingly, it would be desirable and highly advantageous to have a method and apparatus that supports software-based alignment of memory accesses, so as to reduce microprocessor implementation complexity, support both big-endian and little-endian configurations, and reduce the penalty for using software-based alignment of memory-write data found in previous processors.