The present invention relates generally to data processing systems and methods and, more particularly, to systems and methods for loading data from memory.
Microprocessors, including general purpose microprocessors and digital signal processors (DSPs), are ubiquitous in today's society. Many different types of products incorporate microprocessors, including personal computers, toys and cars, to name just a few. At a fundamental level, microprocessors perform functions by executing instructions. The set of instructions to which a particular microprocessor is designed to respond is called its instruction set. Microprocessor instruction sets include commands to the microprocessor to, e.g., load values into registers, store values from registers to locations in main memory, add, subtract, multiply or divide values stored in registers or other storage locations, compare values, shift register contents and perform a variety of control operations, including testing and setting bits, outputting data to a port, pushing data to a stack and popping stack contents. Microprocessor instructions are typically expressed in mnemonic form. For example, one common microprocessor instruction is ADD, which is the mnemonic for an addition operation, for example, to add the contents of one register to the contents of another register and place the result in an accumulator or other register.
Another common microprocessor instruction is the load operation, the mnemonic for which is typically LOAD, which instructs the microprocessor to load data from a memory device, e.g., a cache memory. Of particular interest for the present discussion are so-called “indirect loads” wherein a first load instruction is used to obtain an address (or a value used to create an address) for a second load instruction. Indirect loads can be written in programs as, for example:
        load b=[a]
        add c=b+d
        load v=[c]
where a and c are addresses, b and v are registers, and d is an integer that serves as an offset or displacement from base address b. The displacement may be zero, in which case the add instruction is not needed. Executing this program code results in the value at virtual address a being loaded into register b, that value being added to value d to create c, which in turn is used as a new address that is accessed to load register v. A more detailed discussion of conventional indirect loads is provided below with respect to FIG. 1. However, what is significant to note initially is that indirect loads may take a long time to execute, particularly if the addresses being accessed are not currently stored in a cache.
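By way of illustration only, the three-instruction sequence above may be modeled in C roughly as follows. This is a minimal sketch, not part of any described embodiment: the function names load_word and indirect_load are hypothetical, and the sketch assumes that an address and a loaded value are both pointer-sized integers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read one pointer-sized word from memory at addr. */
static uintptr_t load_word(uintptr_t addr) {
    uintptr_t value;
    memcpy(&value, (const void *)addr, sizeof value);
    return value;
}

/* Models the sequence: load b=[a]; add c=b+d; load v=[c]. */
static uintptr_t indirect_load(uintptr_t a, ptrdiff_t d) {
    uintptr_t b = load_word(a);       /* first load: b=[a] */
    uintptr_t c = b + (uintptr_t)d;   /* add: c=b+d (not needed when d is zero) */
    return load_word(c);              /* second load: v=[c] */
}
```

The second call to load_word cannot begin until the first has returned b, which is why a cache miss on either access lengthens the whole sequence.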
Indirect loads occur in programs for several reasons, including reference to a piece of data v through a structure, in which case ‘a’ is the address of a pointer to the structure that contains different fields. The field of interest in this example is located d bytes after the beginning of the structure, and is itself a pointer to v. Another source of indirect loads in programs is linkage tables (also known as “global offset tables”). Data accessed through linkage tables require two loads, one to the linkage table off the global pointer register and one to the address returned by the first load. In addition, the two issues can be compounded, that is, a program may need to access data through a structure that itself needs to be accessed through a linkage table. The impact on memory traffic, and therefore performance, is significant because at least two loads are needed where as few as one is needed conceptually. In addition, these access patterns usually have little locality. For instance, distant entries in the linkage table are often needed at the same time, which means that the two linkage table offsets are large enough for the two entries not to sit in the same line at any cache level. This implies additional latency associated with fetching at least some of the information needed to complete the indirect load from main memory.
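The sources of indirect loads described above may be sketched in C as follows. The sketch is illustrative only and rests on several assumptions: the type record and the functions load_field, load_global and load_global_field are hypothetical names, the linkage table is modeled as a plain array of addresses indexed by slot rather than by byte offset, and the field at displacement d is taken to hold the value that ends up in v.

```c
#include <stddef.h>

/* Hypothetical structure; the field of interest sits at
 * d = offsetof(struct record, field) bytes from the start. */
struct record {
    int id;
    int field;
};

/* Structure case: a is the address of a pointer to the structure. */
static int load_field(struct record *const *a) {
    const struct record *b = *a;   /* first load: b=[a] */
    return b->field;               /* c=b+d, then second load: v=[c] */
}

/* Linkage table case: gp models the global pointer register. */
static int load_global(const int *const *gp, size_t slot) {
    const int *addr = gp[slot];    /* first load: table entry off gp */
    return *addr;                  /* second load: the data itself */
}

/* Compounded case: the pointer to the structure is itself
 * reached through the linkage table, giving three loads. */
static int load_global_field(struct record *const *const *gp, size_t slot) {
    struct record *const *a = gp[slot];  /* load from the linkage table */
    return load_field(a);                /* two further dependent loads */
}
```

In each function every load depends on the result of the previous one, so the accesses cannot overlap, and entries that are far apart in the table are unlikely to share a cache line.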
Accordingly, it would be desirable to provide techniques and devices which avoid latency associated with indirect loads.