1. Technical Field
The present invention generally relates to processors and in particular to a technique for enhancing operations within a processor.
2. Description of the Related Art
A processor is a digital device that executes instructions specified by a computer program. A typical computer system includes a processor coupled to a system memory that stores program instructions and data to be processed by the program instructions. High-level processor instruction execution may be broken down into three main tasks: (1) loading data into the upper level cache from memory or an input/output (I/O) device; (2) performing arithmetic operations on the loaded data; and (3) storing the results out to memory via a lower level cache, or to an I/O device.
Of the three main tasks for processor instruction execution, storing (or writing) the data to the memory (or I/O device) is the most flexible with regard to the latency of completing the task. Therefore, when a request to access the upper level cache for loading and a request to access the upper level cache for storing occur simultaneously, the loading operation is typically chosen to proceed before the storing operation. If multiple requests are made to load data, a request to store data to the cache may be made on consecutive processor execution cycles without success. The most common method of handling a store that must wait for access to the cache is to utilize a store queue (STQ). An STQ holds the data to be stored while waiting to access the cache.
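The load-over-store arbitration described above may be illustrated with a minimal sketch. It assumes a single cache port, at most one new store per cycle, and a first-in, first-out STQ; the function name `simulate` and the tuple layout of each cycle are hypothetical and are not taken from any particular processor design.

```python
from collections import deque


def simulate(cycles):
    """Model one cache port shared by loads and stores.

    cycles: list of (load_requested, new_store) tuples, one per cycle.
            new_store is a value to be stored, or None if no store arrives.
    Returns (retired, pending): retired is a list of (cycle, store) pairs,
    pending is whatever is still waiting in the STQ at the end.
    """
    stq = deque()   # stores waiting for access to the cache
    retired = []
    for cycle, (load_requested, new_store) in enumerate(cycles):
        if new_store is not None:
            stq.append(new_store)      # hold the store data in the STQ
        if load_requested:
            continue                   # the load wins the port this cycle
        if stq:
            retired.append((cycle, stq.popleft()))  # port is free: retire one store
    return retired, list(stq)


# Two back-to-back loads delay store "A"; store "B" waits behind a later load.
trace = [(True, "A"), (True, "B"), (False, None), (True, None), (False, None)]
retired, pending = simulate(trace)
print(retired, pending)
```

In this trace, store "A" arrives in cycle 0 but cannot retire until cycle 2, which is the kind of repeated, unsuccessful store request that the STQ is intended to absorb.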
Some STQs allow more recently processed data to be written (or stored) to the cache before data that has been waiting longer to be written to the cache. The retiring of younger data (i.e., writing the data into the cache) before older data is known as out-of-order (OoO) operation. OoO STQs may introduce data integrity problems, also known as store ordering hazards. For example, in a store ordering hazard, a younger store to a given address may be retired before an older store to the same address, so that the older data overwrites the younger data in the cache. The data integrity problems resulting from the OoO STQ may result in a violation of the sequential execution model that is standard in processor architecture.
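A store ordering hazard of the kind described above may be reproduced in a few lines. The sketch below is illustrative only: the cache is modeled as a dictionary keyed by address, and the helper name `retire_in` and the address `0x100` are assumptions made for the example.

```python
def retire_in(order, stores):
    """Write stores into a fresh cache in the given retirement order.

    stores: list of (address, value) pairs in program order (index 0 is oldest).
    order:  list of indices into stores giving the order of retirement.
    """
    cache = {}
    for index in order:
        address, value = stores[index]
        cache[address] = value   # a later retirement overwrites an earlier one
    return cache


# Two stores to the same address; "new" is the younger store in program order.
stores = [(0x100, "old"), (0x100, "new")]

in_order = retire_in([0, 1], stores)       # older store retires first
out_of_order = retire_in([1, 0], stores)   # younger store retires first

print(in_order[0x100])      # new
print(out_of_order[0x100])  # old (stale data: the hazard)
```

When the stores retire in program order the cache ends with the younger value, as the sequential execution model requires. When the younger store retires first, the older store overwrites it and the cache is left holding stale data.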
Dependency vectors are a method of processing data stores that addresses the problems of an OoO STQ. Although dependency vectors are able to fully and concurrently handle multiple synchronizing operations within an OoO STQ, dependency vectors do not scale well to larger STQs (e.g., those requiring dependency vectors of more than sixteen entries). This lack of scalability when using dependency vectors in large STQs increases the area and power costs of the processor more than is desired.
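The following sketch shows one plausible form a dependency vector scheme might take; it is not presented as the specific prior-art implementation. It assumes that each STQ entry carries one bit per STQ entry, that a bit is set when an older store to the same address is still pending, and that an entry may retire only when its vector is all zeros. The class name `DependencyStq` and its methods are hypothetical. Under these assumptions the total number of vector bits is the square of the number of entries, which is one way to picture the area and power cost growing with STQ size.

```python
class DependencyStq:
    """Out-of-order store queue guarded by per-entry dependency bit vectors."""

    def __init__(self, size):
        self.size = size
        self.entries = [None] * size   # (address, value), or None if the slot is free
        self.deps = [0] * size         # bit i set: must wait for the entry in slot i

    def allocate(self, address, value):
        slot = self.entries.index(None)   # raises ValueError if the queue is full
        vector = 0
        for i, entry in enumerate(self.entries):
            if entry is not None and entry[0] == address:
                vector |= 1 << i          # older pending store to the same address
        self.entries[slot] = (address, value)
        self.deps[slot] = vector
        return slot

    def can_retire(self, slot):
        return self.entries[slot] is not None and self.deps[slot] == 0

    def retire(self, slot, cache):
        if not self.can_retire(slot):
            raise RuntimeError("entry still depends on an older store")
        address, value = self.entries[slot]
        cache[address] = value
        self.entries[slot] = None
        clear = ~(1 << slot)
        self.deps = [vector & clear for vector in self.deps]  # release dependents

    def vector_bits(self):
        return self.size * self.size   # one size-bit vector per entry


cache = {}
stq = DependencyStq(4)
older = stq.allocate(0x100, "old")
younger = stq.allocate(0x100, "new")
other = stq.allocate(0x200, "x")

stq.retire(other, cache)     # unrelated address: may retire out of order
stq.retire(older, cache)     # clears the younger store's dependency bit
stq.retire(younger, cache)   # now safe to retire
print(cache[0x100], DependencyStq(16).vector_bits(), DependencyStq(64).vector_bits())
```

In this model, the store to the unrelated address retires ahead of both older stores, while the younger store to `0x100` is held until the older one has written the cache. Growing the queue from 16 to 64 entries raises the vector storage from 256 to 4096 bits under the stated assumptions.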