Two types of data format are commonly supported within the computing industry, namely big-endian and little-endian. In little-endian format, an address for a data value identifies the least significant byte of the addressed data value, and hence in little-endian notation byte [0] is used to denote the least significant byte of the data value. In big-endian format, the address for a data value identifies the most significant byte of the addressed data value, and accordingly byte [0] is used in big-endian notation to identify the most significant byte of the data value.
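By way of illustration only, the two conventions can be sketched in a few lines of Python. The helper name `byte_at_address` and the example value 0x11223344 are assumptions introduced for this sketch and are not taken from any particular apparatus; the sketch simply shows which byte is denoted byte [0] in each format.

```python
def byte_at_address(value: int, size: int, byteorder: str) -> int:
    """Return byte [0], i.e. the byte located at the data value's address.

    `byteorder` is "little" or "big"; `size` is the data value size in bytes.
    (Hypothetical helper, for illustration only.)
    """
    return value.to_bytes(size, byteorder=byteorder)[0]


value = 0x11223344  # arbitrary 32-bit example value (assumption)

# Little-endian: byte [0] is the least significant byte.
print(hex(byte_at_address(value, 4, "little")))  # 0x44

# Big-endian: byte [0] is the most significant byte.
print(hex(byte_at_address(value, 4, "big")))     # 0x11
```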
A data value can be considered as consisting of a number of data elements, where a data element is the basic unit of addressable data. Hence, typically, a data element will be a byte of data, and the data value will consist of a plurality of bytes, e.g. four bytes for a 32-bit data value, eight bytes for a 64-bit data value, etc. When swapping the endianness of a data value, the ordering of the constituent data elements (e.g. bytes) is reversed. Hence, if a big-endian 32-bit data value consists of the bytes ABCD then the swapping of the endianness of that data value will result in the equivalent little-endian data value DCBA.
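The ABCD to DCBA example above can be expressed as a minimal sketch, assuming the data elements are bytes and the data value is held as a Python `bytes` object. The function name `swap_endianness` is hypothetical and chosen purely for illustration.

```python
def swap_endianness(data_value: bytes) -> bytes:
    """Reverse the ordering of the constituent bytes of a single data value."""
    return data_value[::-1]


big_endian_value = b"ABCD"  # the 32-bit example from the text
little_endian_value = swap_endianness(big_endian_value)
print(little_endian_value)  # b'DCBA'

# Applying the swap a second time restores the original ordering.
print(swap_endianness(little_endian_value))  # b'ABCD'
```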
Within any particular data processing apparatus, there may be some circuitry which handles data in one endian format, whilst other circuitry within the data processing apparatus handles data in a different endian format. Assuming data values are to be shared between such circuitry, then mechanisms need to be provided for converting data values from one endian format to the other endian format. Considering as an example a processor incorporating a processor core, the processor will typically be coupled via a bus interconnect with a number of other devices. The processor core will typically be arranged to apply operations to data in one particular endian format, and hence by way of example may be arranged to apply operations to data in little-endian format. If such a processor is to be arranged to share data with another device of the data processing apparatus that operates on data using big-endian format, then an endian conversion operation needs to be performed on data as it is read into, and written out of, the processor core.
An added complexity is that the data values operated upon within a data processing apparatus can be of various different sizes, and the exact re-ordering required when performing endian conversion will be dependent on the size of the data values being handled at the time. The action of re-ordering the constituent data elements (e.g. bytes) of data values in order to perform endian conversion is often referred to as “swizzling” the data elements, and the circuitry provided for performing such re-ordering is often referred to as swizzle circuitry. Due to the need to cater for data values of various different sizes, the swizzle circuitry becomes complex, and for example will include many multiplexers that add propagation delay to the data path. For certain paths where such swizzle circuitry is required, the complexity of the swizzle circuitry can lead to those paths becoming a critical path within the data processing apparatus, thereby limiting the speed at which the data processing apparatus can be run. For example, considering a processor having a processor core coupled to a level one data cache, if the processor core operates on data values in little-endian format, but the processor interfaces with big-endian devices within the data processing apparatus, such swizzle circuitry may be required in the path over which data read from the level one cache is returned to the processor core. Such a path may represent a critical timing path, with the complexity of the swizzle circuitry contributing to the delay in that path.
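The dependence of the re-ordering on data value size, and the resulting multiplexer fan-in, can be illustrated with a behavioural sketch. The 8-byte data path width, the set of supported sizes (1, 2, 4 and 8 bytes) and the names `swizzle` and `lane_sources` are all assumptions made for this example; it models the re-ordering in software and is not a description of any actual swizzle circuitry.

```python
def swizzle(bus_data: bytes, value_size: int) -> bytes:
    """Reverse the byte order within each data value carried on the data path.

    `value_size` is the size in bytes of each data value (assumed to divide
    the data path width exactly).
    """
    if value_size <= 0 or len(bus_data) % value_size != 0:
        raise ValueError("value_size must evenly divide the data path width")
    out = bytearray()
    for start in range(0, len(bus_data), value_size):
        out += bus_data[start:start + value_size][::-1]
    return bytes(out)


def lane_sources(bus_width: int, value_sizes) -> list:
    """For each output byte lane, list the input lanes it may need to select.

    The length of each list approximates the number of inputs of the
    multiplexer that a direct implementation would need on that lane.
    """
    sources = []
    for lane in range(bus_width):
        candidates = set()
        for size in value_sizes:
            start = (lane // size) * size            # first lane of this value
            candidates.add(start + (size - 1 - lane % size))
        sources.append(sorted(candidates))
    return sources


data = bytes(range(8))              # input lanes 0..7 (assumed 64-bit path)
print(list(swizzle(data, 2)))       # [1, 0, 3, 2, 5, 4, 7, 6]
print(list(swizzle(data, 4)))       # [3, 2, 1, 0, 7, 6, 5, 4]
print(list(swizzle(data, 8)))       # [7, 6, 5, 4, 3, 2, 1, 0]
print(lane_sources(8, (1, 2, 4, 8))[0])  # [0, 1, 3, 7]
```

Under these assumptions, output lane 0 must be able to select from four different input lanes, one per supported data value size, which gives a sense of why the number of multiplexer inputs grows as more sizes are catered for.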
Accordingly, it would be desirable to provide improved swizzle circuitry which can be used to alleviate the timing pressure on such critical paths, whilst still supporting swizzle operations on data values of various different sizes.