It is often desirable for a processor to support concurrent, independent data loads and stores (e.g. from memory to registers and vice versa). A known solution is to use multiple address units so that load addresses and store addresses can be calculated (and hence used for load and store operations) in parallel. However, adding address units increases the physical size of the processor (e.g. in terms of silicon area), which in turn increases its cost. Furthermore, each extra address unit requires additional instruction information to control it, which results in increased instruction decode logic and increased storage requirements for the instructions (e.g. more code RAM is required). This further increases the silicon area required for the processor.
The embodiments described below are provided by way of example only and are not limiting of implementations which solve any or all of the disadvantages of known processors and known methods of loading and storing data from/to memory.