(a) Field of the Invention
The invention relates to a pre-fetch control method and, more particularly, to a pre-fetch control method in conjunction with a combined access control.
(b) Description of the Related Art
A typical architecture of a computer includes a processor core, an L1 I-cache, an L1 D-cache, a pre-fetch cache unit, and a main memory. When the processor core reads instructions from the L1 I-cache, the pre-fetch cache unit, and the main memory, the L1 I-cache has a higher access priority than the pre-fetch cache unit, and the pre-fetch cache unit has a higher access priority than the main memory. Hence, the performance of a computer is enhanced when the hit rates of the L1 I-cache and the pre-fetch cache unit are increased.
FIG. 1 shows a schematic diagram illustrating the request procedure of a processor core, where a requested instruction is missed in the L1 I-cache but is found in the pre-fetch buffer. The steps of the request procedure are listed below.
Step 1: The processor core 11 sends out a read request to the L1 I-cache 12 for a requested instruction.
Step 2: The requested instruction is missed in the L1 I-cache 12, and thus the L1 I-cache 12 sends out a new request to the pre-fetch cache unit 13.
Step 3: The pre-fetch cache unit 13 finds the requested instruction in an instruction pre-fetch buffer (IPB) 14.
Step 4: The pre-fetch cache unit 13 transmits the requested instruction to the L1 I-cache 12.
Step 5: The L1 I-cache 12 transmits the requested instruction to the processor core 11.
FIG. 2 shows a schematic diagram illustrating the request procedure of a processor core, where a requested instruction is missed in both the L1 I-cache and the pre-fetch buffer. The steps of the request procedure are listed below.
Step 1: The processor core 11 sends out a read request to the L1 I-cache 12 for a requested instruction.
Step 2: The requested instruction is missed in the L1 I-cache 12, and thus the L1 I-cache 12 sends out a new request to the pre-fetch cache unit 13.
Step 3: The requested instruction is also missed in the instruction pre-fetch buffer 14, and thus the pre-fetch cache unit 13 sends out a new request to a bus interface unit (BIU) 15.
Step 4: The pre-fetch cache unit 13 receives the requested instruction transmitted from the bus interface unit 15 and sends it to the L1 I-cache 12.
Step 5: The L1 I-cache 12 transmits the requested instruction to the processor core 11.
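The two request procedures of FIG. 1 and FIG. 2 can be sketched in software as a lookup that falls through from the L1 I-cache to the instruction pre-fetch buffer and finally to the bus interface unit. The following is only an illustrative model under stated assumptions, not the circuit of the figures: the class name FetchPath, the use of dictionaries keyed by address, and the choice to keep a copy of the instruction in the L1 I-cache after a miss are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FetchPath:
    """Toy model of the instruction fetch path (all names are hypothetical)."""
    l1_icache: dict = field(default_factory=dict)    # L1 I-cache 12
    ipb: dict = field(default_factory=dict)          # instruction pre-fetch buffer 14
    main_memory: dict = field(default_factory=dict)  # reached through the BIU 15

    def fetch(self, address):
        """Return (instruction, source) for a read request from the core."""
        # Steps 1-2: the core asks the L1 I-cache first.
        if address in self.l1_icache:
            return self.l1_icache[address], "L1 I-cache"

        # Step 3: on an L1 miss, the pre-fetch cache unit checks the IPB.
        if address in self.ipb:
            instruction, source = self.ipb[address], "IPB"   # FIG. 1 case
        else:
            # FIG. 2 case: the request goes out through the bus interface unit.
            instruction, source = self.main_memory[address], "BIU"

        # Step 4: the instruction is passed to the L1 I-cache.
        # Assumption: the L1 I-cache keeps a copy for later requests.
        self.l1_icache[address] = instruction

        # Step 5: the L1 I-cache hands the instruction to the core.
        return instruction, source


path = FetchPath(ipb={0x10: "add"}, main_memory={0x10: "add", 0x20: "sub"})
first = path.fetch(0x10)    # served from the IPB (FIG. 1)
second = path.fetch(0x10)   # now served from the L1 I-cache
third = path.fetch(0x20)    # served through the BIU (FIG. 2)
```

In this sketch the second request for the same address is served by the L1 I-cache only because of the assumed fill in Step 4; a real design may use a different fill policy.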
Typically, the processor core 11 acquires an instruction from the L1 I-cache 12 several times faster than from the pre-fetch cache unit 13, and acquires an instruction from the pre-fetch cache unit 13 several times faster than from the bus interface unit 15. Consequently, acquiring an instruction from the L1 I-cache 12 is tens of times faster than acquiring it from the bus interface unit 15. Hence, the system performance is enhanced when the hit rates of the L1 I-cache 12 and the pre-fetch cache unit 13 are increased. However, more complicated pre-fetch algorithms and circuits are needed to increase the hit rates, which results in higher cost and power consumption. Further, pre-fetching tasks may occupy the system bus for a long time, which degrades the system performance. Nevertheless, prior-art references generally focus on how to improve the hit rate and fail to discuss the other factors that affect the overall system performance, such as cost, power consumption, and bus occupancy.
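The relationship between the hit rates and the fetch time can be made concrete with a small expected-value calculation. The sketch below is illustrative only: the function name and the latencies of 1, 5, and 25 cycles are hypothetical values chosen so that each level is several times slower than the one before it, and the pre-fetch hit rate is taken to be conditional on an L1 I-cache miss. Cost, power consumption, and bus occupancy are not modeled.

```python
def average_fetch_time(l1_hit_rate, ipb_hit_rate,
                       t_l1=1.0, t_ipb=5.0, t_biu=25.0):
    """Expected cycles per instruction fetch.

    l1_hit_rate  -- fraction of requests served by the L1 I-cache
    ipb_hit_rate -- fraction of L1 misses served by the pre-fetch buffer
    t_l1, t_ipb, t_biu -- hypothetical total latencies (in cycles) when the
                          instruction is served from that level
    """
    for rate in (l1_hit_rate, ipb_hit_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("hit rates must lie in [0, 1]")

    l1_miss = 1.0 - l1_hit_rate
    return (l1_hit_rate * t_l1
            + l1_miss * ipb_hit_rate * t_ipb
            + l1_miss * (1.0 - ipb_hit_rate) * t_biu)


# Same L1 hit rate, two different pre-fetch hit rates.
low_prefetch = average_fetch_time(0.9, 0.5)
high_prefetch = average_fetch_time(0.9, 0.9)
```

With these assumed numbers, raising the pre-fetch hit rate from 0.5 to 0.9 lowers the expected fetch time from about 2.4 to about 1.6 cycles, which is the benefit that hit-rate-oriented designs pursue.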