1) Field of the Invention
The present invention relates to a system controller, a speculative fetching method, and an information processing apparatus for executing speculative fetching in a memory before determining whether data requested by a memory fetch request is in a cache by searching tag information of the cache.
2) Description of the Related Art
Operating frequencies of large scale integration (LSI) circuits have improved noticeably in recent years, and the time taken to access the memory has become relatively long in comparison with the processing time in the LSI. According to one solution, a system controller (SC) receives a memory fetch request from a central processing unit (CPU) or an input/output (I/O) controller (IOP), and, before determining whether the requested data is stored in a cache, performs speculative fetching in which a request for memory fetching is sent to a memory controller (MAC) (see, for example, Japanese Patent Application Laid-open Publication No. 2000-29786 and Japanese Patent Application Laid-open Publication No. 2001-167077).
The SC holds information relating to all CPUs (hereinafter, “tag information”) such as the addresses of data stored in the cache of each CPU, update status, and the like. In response to the memory fetch request, the SC determines whether the requested data is in the cache by searching the tag information. Rather than accessing the memory only after finding that the requested data is not in the cache, the SC accesses the memory speculatively, at the stage before that determination is made and at the same time as it searches the tag information.
This speculative fetching allows memory access to start early, and, when the data is not in the cache, shortens the memory access waiting time (hereinafter, “latency”).
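The latency benefit described above can be illustrated with a toy timing model. This is only a sketch: the function name, its parameters, and the cycle counts are hypothetical and are not taken from the cited publications. It assumes that the tag search and the memory access overlap completely when speculative fetching is used.

```python
def fetch_latency(tag_search, memory_access, move_out, in_cache, speculative):
    """Toy model: cycles until the requested data is available.

    All parameters are illustrative assumptions, not measured values.
    """
    if in_cache:
        # The data comes from the owning CPU's cache once the tag search hits;
        # any speculatively fetched data is discarded.
        return tag_search + move_out
    if speculative:
        # The memory access was started together with the tag search.
        return max(tag_search, memory_access)
    # Without speculation, the memory access starts only after the miss is known.
    return tag_search + memory_access


# Cache miss: speculation hides the tag search behind the memory access.
print(fetch_latency(10, 100, 40, in_cache=False, speculative=False))  # 110
print(fetch_latency(10, 100, 40, in_cache=False, speculative=True))   # 100
```

In this model the saving on a miss equals the shorter of the two overlapped operations, which is why the benefit appears only when the data is not in the cache.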
In speculative fetching, when the data requested by the memory fetch request is in the cache of the CPU, the SC requests the CPU that holds the requested data to move it out, transfers the move-out data to the apparatus that is the source of the request, and discards the response data that is speculatively fetched.
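The handling of a hit and a miss described above can be sketched as follows. The class, its fields, and the dictionary-based tag layout are hypothetical simplifications made for illustration; a real SC performs these steps in hardware and in parallel.

```python
from dataclasses import dataclass


@dataclass
class SystemController:
    tags: dict        # address -> id of the CPU caching the line (assumed layout)
    memory: dict      # address -> data held in main memory
    cpu_caches: dict  # cpu id -> {address: data}
    discarded: int = 0

    def fetch(self, address):
        # The speculative fetch is issued before the tag search result is known.
        speculative_data = self.memory.get(address)
        owner = self.tags.get(address)
        if owner is None:
            # Miss in every CPU cache: the speculative data is used.
            return speculative_data, "memory"
        # Hit: the owning CPU moves the data out and the speculative
        # response data is discarded.
        self.discarded += 1
        return self.cpu_caches[owner][address], "move-out"


sc = SystemController(
    tags={0x40: 1},
    memory={0x40: "stale", 0x80: "from-memory"},
    cpu_caches={1: {0x40: "fresh"}},
)
print(sc.fetch(0x80))  # ('from-memory', 'memory')
print(sc.fetch(0x40))  # ('fresh', 'move-out')
```

The `discarded` counter stands in for the buffer and bus resources that the discarded speculative fetch consumed.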
When the speculatively fetched response data is discarded, the hardware resources consumed by the speculative fetch, such as a buffer and a bus, may have delayed processes other than the memory fetch request, because those processes could have been executed if the speculative fetch had not been performed. Speculative fetching therefore has the drawback that it sometimes results in poor latency, since other processing is delayed.
When the bus is used to send the speculative fetch response data first, move-out data from other CPUs must wait before using the same bus, leading to a problem that speculative fetching actually makes the memory access latency worse.
Owing to the characteristics of most programs, memory fetch requests tend to be issued to addresses that are relatively close to each other and to be concentrated within a short time period. Since speculative fetching is activated by memory fetch requests, speculative fetches are also liable to be generated in concentration. Therefore, the load tends to concentrate on one SC among a plurality of SCs whose addresses are interleaved, while the loads of the other SCs are light, a situation in which the speculative fetching mechanism issues speculative fetches even more frequently.
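The load concentration described above can be illustrated with a small sketch. The number of SCs, the interleave granularity, and the address pattern below are assumptions chosen for illustration; the text does not specify them.

```python
from collections import Counter


def sc_for_address(address, num_scs=4, interleave_bytes=4096):
    """Map an address to an SC, assuming block-wise address interleaving."""
    return (address // interleave_bytes) % num_scs


def load_per_sc(addresses, num_scs=4, interleave_bytes=4096):
    """Count how many requests each SC receives."""
    counts = Counter(
        sc_for_address(a, num_scs, interleave_bytes) for a in addresses
    )
    return [counts.get(i, 0) for i in range(num_scs)]


# A burst of 16 requests to nearby addresses (64-byte steps within 1 KiB).
burst = [0x10000 + 64 * i for i in range(16)]
print(load_per_sc(burst))  # [16, 0, 0, 0]
```

Under these assumptions every request in the burst, and every speculative fetch it triggers, lands on the same SC while the other three remain idle.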
Whether to use the speculative fetch response data is determined based on the result of searching the caches of all the CPUs. Therefore, until the cache search result is known, the speculative fetch response data must be held somewhere in the system, where it is queued to await the cache search result.
Accordingly, when the speculative fetch response data arrives before the cache search result, and the improvement in memory latency achieved by speculative fetching is to be utilized fully, the best place to queue the data is the SC that is nearest to the apparatus that issued the memory fetch request (hereinafter, “terminal SC”).
In transferring the response data to the request source apparatus, the terminal SC does not require hardware resources on the transfer path of the response data, and can therefore transfer the speculative fetch response data to the request source apparatus immediately, once the cache search result confirms that the response data is to be used.
However, when the terminal SC is the queue, even after it has been decided to discard the speculative fetch response data based on the cache search result, the speculative fetch response data must be transferred to the terminal SC. This results in a problem that, when there is a high load on the transfer path to the terminal SC, hardware resources are further wasted by speculative fetch response data that is to be discarded, further increasing the load.
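The queuing behavior at the terminal SC, and the wasted transfers it causes, can be sketched as follows. The class and method names are hypothetical, and the model counts only transfers into the terminal SC; it is an illustration of the problem, not an implementation from the cited art.

```python
class TerminalQueue:
    """Toy model of a terminal SC that queues speculative response data."""

    def __init__(self):
        self.pending = {}     # request id -> speculative response data
        self.results = {}     # request id -> True if a CPU cache holds the data
        self.transferred = 0  # responses sent over the path to the terminal SC
        self.wasted = 0       # transferred responses that were later discarded

    def on_response_data(self, req_id, data):
        # The data always travels to the terminal SC, even if it will be discarded.
        self.transferred += 1
        self.pending[req_id] = data
        return self._resolve(req_id)

    def on_search_result(self, req_id, cache_hit):
        self.results[req_id] = cache_hit
        return self._resolve(req_id)

    def _resolve(self, req_id):
        # Returns the data to forward, or None while waiting or when discarding.
        if req_id not in self.pending or req_id not in self.results:
            return None
        data = self.pending.pop(req_id)
        if self.results.pop(req_id):
            self.wasted += 1  # cache hit: the speculative data is discarded
            return None
        return data


q = TerminalQueue()
q.on_response_data(1, "line-1")        # data arrives first and waits
print(q.on_search_result(1, False))    # line-1
q.on_search_result(2, True)            # hit is already known ...
q.on_response_data(2, "line-2")        # ... yet the data is still transferred
print(q.transferred, q.wasted)         # 2 1
```

Request 2 shows the problem: the discard decision exists before the data arrives, but the transfer path to the terminal SC is consumed anyway.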