A microprocessor's word length denotes the length (in bits or bytes) of its basic working unit of data. For example, a 32-bit microprocessor has a nominal word length of 32 bits (4 bytes). With a uniform-length instruction set, instructions commonly are stored in memory on natural word boundaries. Some microprocessors, however, use variable-length instructions, such as a mix of 32-bit and 16-bit instructions, or a mix of 64-bit and 32-bit instructions. Support for shorter-length instructions offers, in some cases, legacy compatibility, and provides an opportunity for a smaller instruction memory footprint, at least for applications that can make use of the shorter instructions.
However, realizing the memory savings requires storing the variable-length instructions on non-natural boundaries. Memory in which instructions are not necessarily stored on natural word boundaries may be considered as non-aligned memory, while memory in which instructions are stored on natural word boundaries may be considered as aligned memory. As one example where non-aligned memory may be used, the ARM v7 family of microprocessors supports word and half-word instructions, and allows 4-byte instructions to be stored across 4-byte boundaries.
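As a non-limiting illustration of the memory savings described above, the following minimal sketch compares the storage needed for a mixed stream of half-word and word instructions when each instruction is padded to a natural boundary versus when the instructions are packed back to back. The function name `footprint`, the example stream, and the reading of "natural boundary" as a multiple of the instruction's own size are assumptions made only for this sketch.

```python
def footprint(sizes, packed):
    """Return the bytes needed to store instructions of the given byte sizes.

    packed=False: each instruction starts on a multiple of its own size
                  (the assumed meaning of "natural boundary" in this sketch).
    packed=True:  instructions are stored back to back with no padding.
    """
    addr = 0
    for size in sizes:
        if not packed and addr % size:
            # Pad up to the next natural boundary for this instruction.
            addr += size - addr % size
        addr += size
    return addr


# Hypothetical instruction stream: sizes in bytes (2 = half-word, 4 = word).
stream = [2, 4, 2, 4, 2, 4]
aligned_bytes = footprint(stream, packed=False)
packed_bytes = footprint(stream, packed=True)
print(aligned_bytes, packed_bytes)
```

In the packed layout of this hypothetical stream, the word-length instructions begin at byte addresses 2, 8, and 14, so some of them straddle 4-byte boundaries, which is the non-aligned storage the passage refers to.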
While the use of non-aligned memory for instruction storage is space-efficient, the lower-latency instruction buffers that are used to enhance instruction execution performance commonly use natural word alignment. For example, cache memories often are organized into cache lines that buffer word-aligned segments (lines) of external memory, which may be main memory, or a higher level of cache memory.
Reading non-aligned instructions into aligned cache lines means that boundary locations in the cache line may or may not include complete instruction words. That is, word-length instructions may cross cache boundaries. Inter-line boundaries, i.e., the break over from one cache line to the next, represent one type of cache boundary, while intra-line boundaries, such as word-aligned segment boundaries within each cache line, represent another type of cache boundary. Segment boundaries may arise from the use of word-aligned read ports that are less than the full cache line width.
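The two types of cache boundary described above can be sketched as a simple address check. The function name `boundary_crossed`, the 32-byte line size, and the 8-byte segment (read port) size are hypothetical values chosen for illustration; they are not taken from any particular cache design.

```python
def boundary_crossed(addr, size, line_bytes=32, segment_bytes=8):
    """Classify which cache boundary, if any, an instruction crosses.

    addr and size are in bytes. Returns "inter-line", "intra-line", or None.
    Assumes segment_bytes divides line_bytes evenly.
    """
    first_byte = addr
    last_byte = addr + size - 1
    if first_byte // line_bytes != last_byte // line_bytes:
        # The instruction starts in one cache line and ends in the next.
        return "inter-line"
    if first_byte // segment_bytes != last_byte // segment_bytes:
        # Same line, but the instruction spans two read-port segments.
        return "intra-line"
    return None


print(boundary_crossed(30, 4))  # bytes 30..33 span two lines
print(boundary_crossed(6, 4))   # bytes 6..9 span two segments of one line
print(boundary_crossed(8, 4))   # bytes 8..11 sit inside one segment
```

The inter-line check is made first because, when the segment size divides the line size, every line boundary is also a segment boundary.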
Retrieving cross-boundary instructions from conventional caches requires two accesses: a first access to read out the instruction data in advance of the boundary position, and a second access to read the instruction data after the boundary position. The second access retrieves the trailing (post-boundary) portion of the border-crossing instruction. The prevalence of misaligned instructions in cache memory therefore negatively influences overall caching performance, because of the extra cache reads required for retrieving the trailing portions of misaligned, cross-boundary instructions.
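The two-access behavior described above can be modeled, under simplifying assumptions, by splitting an instruction fetch at the boundary position. In this sketch the function name `reads_for`, the 8-byte boundary spacing, and the example stream are hypothetical, and each instruction is assumed to be no longer than the boundary spacing, so it crosses at most one boundary.

```python
def reads_for(addr, size, boundary_bytes=8):
    """Return the (start, length) reads needed to fetch one instruction.

    One read if the instruction lies within a single aligned region,
    two reads (leading and trailing portions) if it crosses a boundary.
    Assumes size <= boundary_bytes.
    """
    next_boundary = (addr // boundary_bytes + 1) * boundary_bytes
    if addr + size <= next_boundary:
        return [(addr, size)]
    lead = next_boundary - addr
    # First access: pre-boundary portion; second access: trailing portion.
    return [(addr, lead), (next_boundary, size - lead)]


# Hypothetical packed stream of (address, size) pairs, sizes in bytes.
stream = [(0, 2), (2, 4), (6, 4), (10, 2), (12, 4)]
total_reads = sum(len(reads_for(addr, size)) for addr, size in stream)
print(total_reads, "reads for", len(stream), "instructions")
```

Only the instruction at address 6 crosses a boundary in this hypothetical stream, so the model counts one extra read beyond the one-read-per-instruction minimum.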