The present application relates generally to data processing, and more specifically, to processor architecture. Binary data is organized in memory as 8-bit units called “bytes,” while the registers implemented by a processor may be larger than a single byte. The terms “endian” and “endianness” refer to how bytes of a multi-byte element are ordered within memory as data is moved between registers and memory.
Individual bytes of a multi-byte element are generally stored in consecutive memory addresses (e.g., 4 consecutive addresses for a 32-bit element). A big-endian processor stores the most significant byte of the multi-byte element in the lowest address of the consecutive range, and stores the least significant byte in the highest address. In contrast, a little-endian processor stores the least significant byte in the lowest address. Put another way, bytes of increasing numeric significance are stored to increasing memory addresses by a little-endian processor, while a big-endian processor stores decreasing numeric significance with increasing memory addresses.
Consider, as an example, the 4-byte element “0A 0B 0C 0D” and a memory range with offsets 0-3. A big-endian processor places the first byte (“0A”) in offset 0, the second byte (“0B”) in offset 1, the third byte (“0C”) in offset 2, and the last byte (“0D”) in the last offset, 3. A little-endian processor uses the reverse order, placing the first byte (“0A”) in offset 3, the second byte (“0B”) in offset 2, the third byte (“0C”) in offset 1, and the last byte (“0D”) in the first offset, 0.
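The layout in this example can be reproduced with a short Python sketch. It relies only on the standard `int.to_bytes` method; the variable names are illustrative and are not part of any processor described here.

```python
# The 4-byte element "0A 0B 0C 0D" as a 32-bit integer (0A is most significant).
value = 0x0A0B0C0D

# Memory image at offsets 0-3 as each kind of processor would store it.
big_endian_memory = value.to_bytes(4, byteorder="big")
little_endian_memory = value.to_bytes(4, byteorder="little")

for offset in range(4):
    print(
        f"offset {offset}: "
        f"big-endian {big_endian_memory[offset]:02X}, "
        f"little-endian {little_endian_memory[offset]:02X}"
    )
```

Reading either image back with the matching byte order (for instance `int.from_bytes(little_endian_memory, byteorder="little")`) recovers the same numeric value; only the placement of bytes in memory differs.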
A conventional processor that supports both big-endian and little-endian byte-ordering uses a mode indication that directs the processor to perform all memory operations in accordance with either a big-endian mode or a little-endian mode. That is, when in big-endian mode, the conventional processor uses big-endian byte-ordering when transferring data between the processor and memory, and when in little-endian mode, uses little-endian byte-ordering when transferring data between the processor and memory. This implementation works well when accessing single scalar values, but not when accessing vectors.
A vector is defined as a collection of scalar values, also referred to as vector elements. Vector elements can be bytes, halfwords (2 bytes), words (4 bytes), doublewords (8 bytes), and larger. Vectors are addressed in memory by the address of the first element in the vector. In this context, “vectors” are numbered sequences of individual, distinctly addressable elements, regardless of whether these elements are stored in a vector register, a general purpose register, a floating point register, a vector-scalar register, or another register type.
Conventional processors that support both big-endian and little-endian byte-ordering are generally implemented either as a big-endian-based system with added support for little-endian byte-ordering, or as a little-endian-based system with added support for big-endian byte-ordering. That is, processing of vector data is conventionally performed with a left-to-right element ordering for big-endian-based systems and a right-to-left element ordering for little-endian-based systems. When loading little-endian data on a big-endian-based system, or loading big-endian data on a little-endian-based system, the byte-ordering of each vector element is reversed as needed. However, as a side effect, the ordering of vector elements in the register is also reversed.
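The side effect can be illustrated with a simplified Python model. The sketch assumes a 16-byte register holding four 32-bit elements and a load that byte-reverses the entire register image; the function `load_with_full_byte_reverse` is hypothetical and does not correspond to any particular instruction.

```python
import struct

def load_with_full_byte_reverse(memory: bytes) -> bytes:
    """Hypothetical load: reverse every byte of the register image."""
    return memory[::-1]

elements = [0x00000001, 0x00000002, 0x00000003, 0x00000004]

# Memory image produced by little-endian code: four 32-bit words,
# each stored least significant byte first.
memory = struct.pack("<4I", *elements)

register = load_with_full_byte_reverse(memory)

# A big-endian-based datapath reads the register left to right
# as big-endian words.
seen_by_big_endian_core = list(struct.unpack(">4I", register))

print(seen_by_big_endian_core)
```

In this model every element value is recovered correctly, because the bytes within each word end up in the order the big-endian datapath expects, but the elements appear in the opposite order from the one in which they were stored.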
While most vector operations are insensitive to the ordering of elements in a register, there are classes of vector operations that depend on the ordering of the elements in a vector and will not produce correct results when the vector elements are presented in reverse order. Examples of such operations include, but are not limited to, permute operations, string processing operations, and cryptographic processing operations, which operate on arrays of byte-sized elements and depend on the ordering of the vector elements, as well as any other operations that make reference to a natural ordering of elements in memory.
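The distinction between order-insensitive and order-dependent operations can be sketched in Python as follows. Both functions are illustrative stand-ins, not actual vector instructions: `elementwise_add` represents an operation that treats each element position independently, and `index_of_first_zero` represents a string-style operation that scans for a terminator.

```python
def elementwise_add(a, b):
    """Order-insensitive: each result depends only on the same position."""
    return [x + y for x, y in zip(a, b)]

def index_of_first_zero(byte_elements):
    """Order-dependent: position of the first zero byte, or -1 if none."""
    for i, value in enumerate(byte_elements):
        if value == 0:
            return i
    return -1

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# Reversing both inputs merely reverses the output; no information is lost.
sum_forward = elementwise_add(a, b)
sum_reversed = elementwise_add(a[::-1], b[::-1])

# Eight byte-sized elements containing a terminated string.
text = list(b"abc\x00xyz\x00")

# The scan result changes when the elements arrive in reverse order.
terminator_forward = index_of_first_zero(text)
terminator_reversed = index_of_first_zero(text[::-1])

print(sum_forward, sum_reversed)
print(terminator_forward, terminator_reversed)
```

For the addition, the reversed result is simply the forward result in reverse, so storing it back with the same reversal restores the expected memory image. For the terminator scan, the reversed input reports a different position, which is the kind of incorrect result described above.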
The conventional solution is to implement storage accesses that byte-reverse the data transferred between the processor and memory whenever the endian mode is opposite to the base implementation endianness of the system. This solution also reverses the ordering of vector elements, thus creating problems for the types of vector operations that depend on element ordering. To date, those skilled in the art have not been able to resolve this problem, as demonstrated by the omission of string operations from the little-endian specification of the Power architecture when bi-endian support was first introduced to the industry.