1. Background Field
The present invention relates to processing units and in particular to load schedulers.
2. Relevant Background
Processors, such as microprocessors, digital signal processors, and microcontrollers, are generally divided into many subsystems, such as a memory system, processing units, and load store units. A load store unit transfers data between the processing units and the memory system. Specifically, the load store unit reads (i.e., loads) data from the memory system and writes (i.e., stores) data to the memory system.
FIG. 1 shows a simplified block diagram of a load store unit 110 coupled to a memory system 140. Load store unit 110 includes an instruction decoder 111, a load scheduler 113, a load pipeline 115, a store scheduler 117, and a store pipeline 119. In some processors, instruction decoder 111 may be part of another subsystem. Instruction decoder 111 decodes the program instructions and sends load instructions to load scheduler 113 and store instructions to store scheduler 117. Other types of instructions are sent to appropriate execution units, such as a floating point execution unit or an integer execution unit. In most systems with multiple processing units, each processing unit includes a separate load store unit.
Load scheduler 113 schedules the load instructions and issues the load instructions to load pipeline 115 for execution. Load pipeline 115 executes the load instructions and reads the requested data from memory system 140. Similarly, store scheduler 117 schedules the store instructions and issues the store instructions to store pipeline 119 for execution. Store pipeline 119 executes the store instructions and stores the data from the store instructions into memory system 140.
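The data flow of FIG. 1 can be illustrated with a minimal software sketch. This is only a behavioral model under stated assumptions, not the hardware described here: the class name LoadStoreUnit, the tuple encoding of instructions, and the use of a dictionary to stand in for memory system 140 are all hypothetical choices made for illustration, and both schedulers are modeled as simple in-order queues.

```python
from collections import deque


class LoadStoreUnit:
    """Toy model of load store unit 110 (all names here are illustrative)."""

    def __init__(self, memory):
        self.memory = memory        # dict of address -> value, stands in for memory system 140
        self.load_queue = deque()   # stands in for load scheduler 113 (in-order here)
        self.store_queue = deque()  # stands in for store scheduler 117 (in-order here)
        self.other = []             # instructions destined for other execution units

    def decode(self, instr):
        """Model of instruction decoder 111: route by instruction type."""
        op = instr[0]
        if op == "load":
            self.load_queue.append(instr)
        elif op == "store":
            self.store_queue.append(instr)
        else:
            self.other.append(instr)

    def run_load(self):
        """Model of load pipeline 115: read the requested data from memory."""
        _, addr = self.load_queue.popleft()
        return self.memory.get(addr, 0)

    def run_store(self):
        """Model of store pipeline 119: write the store data into memory."""
        _, addr, value = self.store_queue.popleft()
        self.memory[addr] = value


# Example: a store followed by a load of the same address, plus an
# unrelated instruction that is routed elsewhere.
lsu = LoadStoreUnit({})
lsu.decode(("store", 16, 7))
lsu.decode(("load", 16))
lsu.decode(("add", 1, 2))
lsu.run_store()
loaded = lsu.run_load()
```

In this sketch the store is executed before the load, so the load observes the stored value; the sketch says nothing about timing or about how a real scheduler chooses which instruction to issue next.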
While the simplest way to issue load instructions is to issue them in order, greater performance may be achieved by issuing load instructions out of order. For example, if load scheduler 113 receives load instruction L_1, followed by load instruction L_2, followed by load instruction L_3, and load instruction L_1 has unresolved dependencies, load scheduler 113 may issue load instruction L_2 prior to load instruction L_1 rather than stalling and waiting for the dependencies of load instruction L_1 to resolve. Furthermore, load scheduler 113 may also issue load instruction L_3 while waiting for the dependencies of load instruction L_1 to resolve. However, various hazards may occur when load instructions are issued out of order. For example, if load instruction L_1 and load instruction L_2 (which should come after load instruction L_1) are to the same memory location, load instruction L_2 is issued before load instruction L_1, and a store instruction modifies the memory location after execution of load instruction L_2 and before the execution of load instruction L_1, then load instruction L_2 retrieves the old data while load instruction L_1, which is earlier in program order, retrieves the new data, so the data retrieved by load instruction L_1 and load instruction L_2 may be inaccurate.
Issuing load instructions out of order is particularly complicated in systems having multiple processing units because store instructions from different processing units may change the data required by the load instructions. Typically, load store units that support out of order execution of load instructions require an extensive tracking system to monitor all load and store instructions to detect hazards caused by instructions that were issued out of order. The out of order instructions that have hazards are then flushed and reissued to eliminate the hazards. To support unlimited issuing of out of order load instructions, the tracking system would require extensive overhead, which may negate the performance benefits of issuing load instructions out of order.
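The out of order issue example and the resulting hazard can be made concrete with a short sketch. It is a simplified illustration under assumptions, not the tracking system of any particular load store unit: the Load record, the functions issue_order and find_hazards, the addresses, and the event encoding are all hypothetical. The sketch tracks every executed load, which is exactly the kind of unbounded bookkeeping the paragraph above identifies as costly.

```python
from dataclasses import dataclass


@dataclass
class Load:
    name: str    # label such as "L_1"
    seq: int     # position in program order (lower = older)
    addr: int    # memory location read by the load
    ready: bool  # True when all dependencies are resolved


def issue_order(loads):
    """Issue ready loads first (oldest first), then loads that had to wait."""
    ready = sorted((l for l in loads if l.ready), key=lambda l: l.seq)
    waiting = sorted((l for l in loads if not l.ready), key=lambda l: l.seq)
    return ready + waiting


def find_hazards(events):
    """Scan events in execution order and report loads to flush and reissue.

    events is a list of ("load", Load) or ("store", addr) tuples.
    A younger load is reported when it executed before an older load to the
    same address and a store to that address executed in between.
    """
    executed = []  # [load, stale] pairs; stale = a later store hit its address
    hazards = []
    for kind, payload in events:
        if kind == "store":
            for entry in executed:
                if entry[0].addr == payload:
                    entry[1] = True
        else:
            for earlier, stale in executed:
                if stale and earlier.addr == payload.addr and earlier.seq > payload.seq:
                    hazards.append(earlier.name)
            executed.append([payload, False])
    return hazards


# The three loads from the example: L_1 is stalled, L_1 and L_2 share an address.
l1 = Load("L_1", 1, 0x40, ready=False)
l2 = Load("L_2", 2, 0x40, ready=True)
l3 = Load("L_3", 3, 0x80, ready=True)
order = [l.name for l in issue_order([l1, l2, l3])]

# A store to 0x40 lands between the execution of L_2 and L_1.
flagged = find_hazards([("load", l2), ("load", l3), ("store", 0x40), ("load", l1)])
```

With these inputs, order is L_2, L_3, L_1, and flagged contains only L_2, the load that would be flushed and reissued. Without the intervening store, no load is flagged, so the out of order issue is harmless in that case.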
Hence, there is a need for a method and system that support issuing load instructions out of order while detecting and resolving any hazards caused by the out of order load instructions without using extensive resources.