In a pipelined processor system, factors such as cache misses, control hazards, and data hazards may stall the pipeline and degrade the performance of the processor system.
Generally, cache misses are divided into three categories: compulsory, conflict, and capacity. A set-associative cache structure with an increased number of ways may be used to reduce conflict misses. However, it may be difficult to increase the number of ways beyond a certain level due to power consumption and speed constraints. For example, a multi-way set-associative cache structure may require all ways of the set addressed by the same index to be read and compared at the same time. The conventional pre-fetching cache technique is often able to solve the cache miss problem for some conflict misses and capacity misses at a certain cost, but is not effective in reducing compulsory misses. Further, new cache structures, such as victim cache, trace cache, and pre-fetching cache, may be able to mitigate the cache miss problem to a certain extent. However, as the speed gap between processor and memory grows wider, cache misses have become the most serious bottleneck in improving the performance of modern processors.
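The trade-off between conflict misses and the cost of comparing all ways can be illustrated with a minimal sketch. The class `SetAssociativeCache`, the helper `count_misses`, the LRU replacement policy, and the chosen sizes are all assumptions made for illustration; they are not taken from any particular processor.

```python
class SetAssociativeCache:
    """Toy N-way set-associative cache that tracks tags only (hypothetical model)."""

    def __init__(self, num_sets, num_ways, line_size):
        self.num_sets = num_sets
        self.num_ways = num_ways
        self.line_size = line_size
        # Each set lists its resident tags, least recently used first.
        self.sets = [[] for _ in range(num_sets)]
        self.tag_compares = 0

    def access(self, address):
        """Return True on a hit; on a miss, fill the line and return False."""
        block = address // self.line_size
        index = block % self.num_sets
        tag = block // self.num_sets
        ways = self.sets[index]
        # Every way of the indexed set is read and compared on each access;
        # this counter stands in for the associated power and timing cost.
        self.tag_compares += self.num_ways
        if tag in ways:
            ways.remove(tag)
            ways.append(tag)  # mark as most recently used
            return True
        if len(ways) == self.num_ways:
            ways.pop(0)  # evict the least recently used tag (assumed policy)
        ways.append(tag)
        return False


def count_misses(cache, addresses):
    """Run an address trace through the cache and count the misses."""
    return sum(0 if cache.access(a) else 1 for a in addresses)


# Two addresses that map to the same set, accessed alternately.
trace = [0, 64, 0, 64, 0, 64]
direct_mapped = SetAssociativeCache(num_sets=4, num_ways=1, line_size=16)
two_way = SetAssociativeCache(num_sets=2, num_ways=2, line_size=16)
direct_mapped_misses = count_misses(direct_mapped, trace)
two_way_misses = count_misses(two_way, trace)
```

With the same total capacity of four lines, the direct-mapped configuration misses on every access in this trace, while the two-way configuration misses only on the first access to each line. In this model the two-way cache performs twice as many tag comparisons, which is the cost that limits how far the number of ways can be raised.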
Control hazards caused by executing branch instructions are another major cause of pipeline performance loss. When processing a branch instruction, a conventional processor has no way of knowing in advance which instruction will be executed next after the branch instruction. Such information may not be available until the branch instruction is executed, or at least until a transition signal and a branch target instruction address are generated during execution of the branch instruction.
In addition, techniques such as branch target buffer and trace cache may be used to predict whether a branch transition will occur and to directly obtain the branch target address when the same branch instruction is executed again. However, such techniques often make predictions based on the processor's past execution results. Thus, it is impossible to predict whether a branch transition will occur and to obtain the branch target address when the branch instruction is executed for the first time. Even when the branch instruction is executed again, prediction errors may still cause performance loss. Further, cache misses due to branch transitions also cause performance loss in conventional processor systems.
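The limitation of history-based prediction can be shown with a small sketch. The class `BranchTargetBuffer`, the function `count_mispredictions`, the "predict the last outcome" policy, and the addresses in the trace are hypothetical and chosen only for illustration.

```python
class BranchTargetBuffer:
    """Toy branch target buffer keyed by branch address (hypothetical model)."""

    def __init__(self):
        # branch address -> (taken on last execution, target on last execution)
        self.entries = {}

    def predict(self, pc):
        """Return (predicted_taken, predicted_target).

        A branch that has never executed has no entry, so it falls back
        to "not taken" with an unknown target.
        """
        return self.entries.get(pc, (False, None))

    def update(self, pc, taken, target):
        """Record the actual outcome once the branch has executed."""
        self.entries[pc] = (taken, target if taken else None)


def count_mispredictions(btb, trace):
    """trace is a list of (branch address, actually taken, actual target)."""
    mispredictions = 0
    for pc, taken, target in trace:
        predicted_taken, predicted_target = btb.predict(pc)
        correct = predicted_taken == taken and (
            not taken or predicted_target == target
        )
        if not correct:
            mispredictions += 1
        btb.update(pc, taken, target)
    return mispredictions


# A loop-closing branch at 0x40: taken three times, then falls through.
loop_trace = [(0x40, True, 0x10)] * 3 + [(0x40, False, None)]
loop_mispredictions = count_mispredictions(BranchTargetBuffer(), loop_trace)
```

In this trace the first execution is mispredicted because no history exists yet, and the final execution is mispredicted because the history no longer matches the actual outcome, for two mispredictions out of four executions.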
Data hazards are often caused by a read-after-write (RAW) dependency between instructions. For two adjacent or closely located instructions, when the target register of the preceding instruction is the same as a source register of the succeeding instruction, the succeeding instruction is not able to obtain the correct operand from the register until the result of the preceding instruction is written into the register. The pipeline may be paused by inserting bubbles until the correct operand can be read from the register. A bypass technique may be used to alleviate the data hazard problem to a certain extent. In this case, a plurality of bypass paths may be added in the processor. When the result of the preceding instruction is generated, the result is directly sent to a bypass path, and the succeeding instruction obtains the correct operand from the bypass path instead of from the register. However, the bypass technique does not solve all data hazard problems.
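The effect of bypass paths on RAW stalls can be sketched as follows. The model assumes a textbook five-stage in-order pipeline (fetch, decode, execute, memory, write-back) in which the register file is written in the first half of a cycle and read in the second half; these assumptions, the `Instr` tuple, and the function `count_stalls` are illustrative and not taken from the text above.

```python
from collections import namedtuple

# kind is "alu" or "load"; dest is a register name; sources is a tuple of names.
Instr = namedtuple("Instr", ["kind", "dest", "sources"])


def count_stalls(program, forwarding):
    """Count bubble cycles caused by RAW dependencies (hypothetical model)."""
    earliest_use = {}  # register -> earliest decode cycle for a dependent reader
    cycle = 0          # decode cycle of the previous instruction
    stalls = 0
    for instr in program:
        natural = cycle + 1  # decode cycle if no bubble is needed
        needed = max(
            [earliest_use.get(reg, 0) for reg in instr.sources] + [natural]
        )
        stalls += needed - natural
        cycle = needed
        if instr.dest is not None:
            if not forwarding:
                delay = 3  # reader must wait for the producer's write-back
            elif instr.kind == "load":
                delay = 2  # loaded data exists only after the memory stage
            else:
                delay = 1  # ALU result is bypassed straight to the next execute
            earliest_use[instr.dest] = cycle + delay
    return stalls


program = [
    Instr("load", "r1", ("r0",)),      # r1 <- memory[r0]
    Instr("alu", "r2", ("r1", "r1")),  # r2 <- r1 + r1  (RAW on r1)
    Instr("alu", "r3", ("r2", "r1")),  # r3 <- r2 - r1  (RAW on r2 and r1)
]
stalls_without_bypass = count_stalls(program, forwarding=False)
stalls_with_bypass = count_stalls(program, forwarding=True)
```

Under these assumptions the sequence needs four bubble cycles without bypass paths and one with them. The remaining bubble comes from the load followed immediately by its consumer, which is one case that bypassing alone does not remove.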