1. Field of the Invention
The present invention relates generally to the field of programmable computer processors, and more particularly to application specific instruction sets.
2. Description of the Prior Art
Load and store instructions are a mechanism by which processors read data from and write data to memory or cache. Memories in computer processing systems have a fixed word width, but a data element to be stored in the memory often differs in width from the memory word. For example, a 24-bit data element might be stored in a memory having 32-bit words, leaving eight bits of unused memory capacity. Storing one data element per word therefore wastes memory capacity, whereas packing data elements densely leaves some elements unaligned within the memory, so that processing them may require multiple memory accesses. Alternatively, a processor may operate on multiple smaller elements at once, such as two sequential 16-bit elements, which may necessitate accessing some bytes from one memory word and the rest from the next memory word.
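The arithmetic behind the 24-bit example can be illustrated with a short sketch in C. The function names padding_bits and spans_two_words are hypothetical and chosen only for illustration; the sketch assumes that elements are packed back to back starting at bit 0 of the first word and that an element is no wider than a word.

```c
#include <stdbool.h>
#include <stddef.h>

/* Bits left unused when a single element occupies a whole memory word.
   Assumes elem_bits <= word_bits. */
static unsigned padding_bits(unsigned elem_bits, unsigned word_bits)
{
    return word_bits - elem_bits;
}

/* True when element number `index` of a densely packed array straddles
   a word boundary, i.e. its first and last bits fall in different words. */
static bool spans_two_words(size_t index, unsigned elem_bits, unsigned word_bits)
{
    size_t first_bit = index * elem_bits;
    size_t last_bit  = first_bit + elem_bits - 1;
    return first_bit / word_bits != last_bit / word_bits;
}
```

Under these assumptions, a 24-bit element stored alone in a 32-bit word leaves eight bits unused, while in a densely packed array the second and third 24-bit elements each straddle a word boundary.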
Load/store instructions can be divided into aligned and unaligned types. With aligned load/store, a processor reads from or writes to memory in multi-byte ‘chunks’ beginning at addresses that are multiples of the ‘chunk’ size. For example, aligned loading of a 128-bit word requires loading sixteen 8-bit bytes from an address that is a multiple of sixteen. Aligned instructions require a relatively uncomplicated hardware structure, so most processors support aligned load/store. Aligned data never presents the problem of accessing multiple words.
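The alignment condition described above can be expressed as a simple address test. The following is a minimal sketch in C; the helper names is_aligned and align_down are hypothetical, and the sketch assumes the chunk size is a nonzero power of two, as is the case for a sixteen-byte (128-bit) word.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when addr is a multiple of chunk_size.
   Assumes chunk_size is a nonzero power of two (e.g. 16 for a 128-bit word). */
static bool is_aligned(uintptr_t addr, size_t chunk_size)
{
    return (addr & (uintptr_t)(chunk_size - 1)) == 0;
}

/* Rounds addr down to the start of the chunk that contains it. */
static uintptr_t align_down(uintptr_t addr, size_t chunk_size)
{
    return addr & ~(uintptr_t)(chunk_size - 1);
}
```

For a sixteen-byte chunk, address 0x1000 passes the test and address 0x1004 does not; rounding 0x1007 down yields 0x1000.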
Unaligned load/store instructions do not necessarily access memory at aligned addresses. Unaligned instructions are easier to use in programming because they impose fewer memory allocation restrictions and addressing limitations. Unaligned instructions are advantageous for algorithms that process data elements smaller than the memory word size, since data elements can be densely packed into memory and more than one data element can be accessed at a time. However, most processors do not support unaligned load/store instructions because they are more complicated to implement in silicon hardware.
Unaligned memory accesses can cross memory boundaries, requiring access from multiple words, and can require accessing partial words. Some implementations for processing unaligned data perform sequential accesses to process a data element that spans multiple data words, but this takes more than one memory cycle. Memory accesses to misaligned addresses can be emulated by multiple aligned accesses, but this method is slow. Attempting to perform an unaligned memory access in a system that does not support it may generate ‘bus errors’ or may abort the program.
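The emulation of a misaligned access by multiple aligned accesses can be sketched in C as follows. This is an illustrative model only, not a description of any particular processor: the function name load32_unaligned is hypothetical, memory is modeled as an array of 32-bit words addressed by byte, and byte 0 of each word is assumed to occupy the least significant bits (little-endian ordering).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: mem is an array of 32-bit words, byte_addr is a byte
   address into it, and byte 0 of each word sits in the least significant
   bits (little-endian). Returns the 32-bit value starting at byte_addr. */
static uint32_t load32_unaligned(const uint32_t *mem, size_t byte_addr)
{
    size_t   word  = byte_addr / 4;               /* first aligned word  */
    unsigned shift = (unsigned)(byte_addr % 4) * 8;

    uint32_t lo = mem[word];                      /* first aligned access */
    if (shift == 0)
        return lo;                                /* aligned: one access  */

    uint32_t hi = mem[word + 1];                  /* second aligned access */
    return (lo >> shift) | (hi << (32 - shift));  /* merge the two parts   */
}
```

In this sketch an aligned address costs one word read, while any other address costs two word reads plus shifting and merging, which reflects why such emulation is slower than a single aligned access.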
Accordingly, a method is needed to provide unaligned load/store instructions for a processor that normally supports only aligned load/store instructions.