The present disclosure generally relates to data processing systems, and more specifically, to techniques for performing a flush and restore of a distributed history buffer in a processing unit.
High-performance processors used in data processing systems today may be capable of “superscalar” operation and may have “pipelined” elements. Such processors may include multiple execution/processing slices that are able to operate in parallel to process multiple instructions in a single processing cycle. Each execution slice may include a register file and a history buffer that hold the youngest and oldest copies, respectively, of architected register data. Each instruction that is fetched may be tagged with a multi-bit instruction tag. Once the instructions are fetched and tagged, the instructions may be executed (e.g., by an execution unit) to generate results, which are also tagged. A Results (or Writeback) Bus, one per execution slice, feeds all slices with the resultant instruction finish data. Thus, any individual history buffer generally includes one write port per Results/Writeback Bus.
In traditional processors, the history buffer is typically a centralized component of the processing unit, such that it can back up the data when a new instruction is dispatched and the target register has to be saved into the backup register file. However, such centralized components may not be feasible for processors that include multiple execution/processing slices. For example, in processors with a large number of processing slices, the number of ports needed for such a centralized history buffer can be extensive, leading to a large number of wires between the distributed execution units.
In addition, including numerous write ports on a history buffer can be expensive to implement in the circuit. For example, as the number of ports associated with the history buffer increases, the circuit area of the history buffer in the processing unit can grow rapidly. This, in turn, forces a trade-off in the number of history buffer entries that can be supported in a given circuit area. For example, smaller history buffers generally fill up faster and can impact performance, stalling the dispatch of new instructions until older instructions are retired and free up history buffer entries. On the other hand, larger history buffers are generally expensive to implement and lead to larger circuit size.
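The dispatch-stall behavior described above can be illustrated with a minimal software sketch. It is a simplified behavioral model, not a description of any particular circuit: the names `ExecutionSlice`, `HistoryEntry`, `dispatch`, and `retire` are hypothetical, instruction tags are assumed to increase monotonically, and the result is assumed to be written to the register file at dispatch rather than over a Results/Writeback Bus.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HistoryEntry:
    evictor_itag: int  # tag of the instruction that overwrote the register
    reg: str           # architected register that was overwritten
    old_value: int     # value the register held before the overwrite


class ExecutionSlice:
    """Hypothetical model of one slice: a register file plus a bounded history buffer."""

    def __init__(self, hb_capacity: int, registers: List[str]) -> None:
        self.hb_capacity = hb_capacity
        self.register_file: Dict[str, int] = {r: 0 for r in registers}
        self.history_buffer: List[HistoryEntry] = []

    def dispatch(self, itag: int, target_reg: str, new_value: int) -> bool:
        """Back up the target register, then overwrite it.

        Returns False (a dispatch stall) when the history buffer is full.
        """
        if len(self.history_buffer) >= self.hb_capacity:
            return False
        old_value = self.register_file[target_reg]
        self.history_buffer.append(HistoryEntry(itag, target_reg, old_value))
        # Simplification: the result is written immediately at dispatch.
        self.register_file[target_reg] = new_value
        return True

    def retire(self, itag: int) -> None:
        """Free the entries whose evicting instruction has retired (tag <= itag)."""
        self.history_buffer = [
            e for e in self.history_buffer if e.evictor_itag > itag
        ]
```

With a capacity of two entries, a third dispatch stalls until an older instruction retires and releases its entry, which is the performance cost of a small history buffer noted above.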
To address the limitations associated with centralized history buffers, some processing units may use a distributed history buffer design. In a distributed history buffer design, the history buffer may include multiple distributed levels to provide support for the main line execution of instructions in the processing unit. The use of distributed history buffers, however, has prompted new issues to emerge as areas of concern. One such issue relates to recovery operations for restoring the registers in the register file to the proper states.
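To make the recovery problem concrete, the following sketch shows one generic way a flush could restore register state from history buffer entries. It is an illustrative assumption, not the technique of the present disclosure: the function name `flush_and_restore`, the tuple layout of an entry, and the use of monotonically increasing instruction tags (no wraparound) are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical entry layout: (evictor_itag, register, old_value), where
# evictor_itag is the tag of the instruction that overwrote the register.
HistoryEntry = Tuple[int, str, int]


def flush_and_restore(
    register_file: Dict[str, int],
    history_buffer: List[HistoryEntry],
    flush_itag: int,
) -> List[HistoryEntry]:
    """Undo register writes made by instructions with tag >= flush_itag.

    Mutates register_file in place and returns the surviving entries
    in oldest-to-youngest order.
    """
    kept: List[HistoryEntry] = []
    # Walk from youngest to oldest so that, when a register was overwritten
    # several times by flushed instructions, the oldest saved value is
    # written last and is the one that remains.
    for entry in sorted(history_buffer, key=lambda e: e[0], reverse=True):
        evictor_itag, reg, old_value = entry
        if evictor_itag >= flush_itag:
            register_file[reg] = old_value
        else:
            kept.append(entry)
    kept.reverse()
    return kept


# Instruction 1 wrote r1 (old value 0), instruction 2 wrote r2 (old value 0),
# and instruction 3 wrote r1 again (old value 10).
regs = {"r1": 30, "r2": 20}
entries = [(1, "r1", 0), (2, "r2", 0), (3, "r1", 10)]
remaining = flush_and_restore(regs, entries, flush_itag=2)
print(regs)       # {'r1': 10, 'r2': 0}
print(remaining)  # [(1, 'r1', 0)]
```

In a distributed design, the saved values may be spread across several history buffer levels, so a restore of this kind would have to gather entries from each level, which is one reason recovery becomes an area of concern.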