This invention relates to a cache device, particularly to a cache device connected between at least one processing unit and a main memory shared by a plurality of the processing units.
In a conventional cache device, when a load request to a main memory results in a cache miss, block data including the data indicated by the load request is fetched from the main memory. The block data returned from the main memory is written into the cache memory, and at the same time the desired data is sent back to the processing unit which issued the load request. In addition, to increase the handling capacity of the cache, the device is provided with a non-blocking hardware mechanism by virtue of which the cache can manage, even when a cache miss occurs, up to two subsequent load requests.
Incidentally, a conventional information processing system, to increase its information processing capacity, is so constructed as to have its main memory shared by a plurality of processors, and with such a system it often occurs that load requests from different processors must be handled at the same time. In such situations, namely, when processing requests from different processors compete with each other over the main memory, block load data moved from the main memory to the processor is not sent back in sequential order, but irregularly, with non-constant intervals between adjacent block data. To cope with such a situation, block load data is handled in smaller basic data units.
On the other hand, in the conventional cache device, a single path serves both as a reply data path, through which target data contained in the block data delivered by the main memory is returned to a general purpose register, and as a read path, through which the required target data is read from the cache when the cache contains it, that is, when a cache hit occurs. Accordingly, while the cache device returns the target data contained in the block load data delivered by the main memory to the processing unit which issued the load request, reading data from the data array becomes impossible because the shared path is occupied. Thus, subsequent instructions to load data from the main memory are inhibited from accessing the cache, and hence handling of subsequent instructions is interrupted, which results in a lowered handling capacity of the cache.
For example, the data processing system described in Japanese Unexamined Patent Publication No. 7-69863 incorporates a non-blocking hardware mechanism which, even when successive load requests to a main memory encounter cache misses, ensures successive accesses to the memory. In this data processing system, a single path serves both as a reply data path, through which target data contained in block data delivered by the main memory is returned to a general purpose register, and as a read path, through which the required data is read from the cache when the cache contains it, that is, when a cache hit occurs. Accordingly, while the cache device returns the target data contained in block load data delivered by the main memory to the processing unit which issued the load request, reading data from the data array becomes impossible because the shared path is occupied. Thus, subsequent instructions to load data from the main memory are blocked from accessing the cache, and hence handling of subsequent instructions is interrupted, which results in a lowered handling capacity of the cache.
FIG. 3 is a block diagram illustrating one example of the above-described conventional cache devices. In the figure, an instruction control unit 2 handles instructions one after another in order under the command of a program counter, and when a given instruction to be handled concerns fetching data from a main memory, it registers the address of main memory to be accessed into an EA (Effective Address) register 11.
The address to be registered into EA register 11 is constituted of three kinds of address data: one is an intra-block address which indicates the address within block data to be fetched, a second is a cache index address which determines the access address of cache using block data as basic units, and the third is a tag address which uses the cache capacity as a basic unit and employs an address exceeding the cache capacity as a search address.
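The three-part address described above can be illustrated by the following minimal sketch. It is not taken from the conventional device itself: the block size of 64 bytes (eight 8-byte units), the figure of 256 blocks, and the names `split_address`, `BLOCK_BYTES` and `NUM_LINES` are assumptions chosen only for illustration.

```python
# Hypothetical parameters, assumed for illustration only.
BLOCK_BYTES = 64   # assumed block size: eight 8-byte basic data units
NUM_LINES = 256    # assumed number of blocks held by the cache

OFFSET_BITS = BLOCK_BYTES.bit_length() - 1  # bits of the intra-block address
INDEX_BITS = NUM_LINES.bit_length() - 1     # bits of the cache index address


def split_address(addr):
    """Split an effective address into (tag, cache index, intra-block address)."""
    offset = addr & (BLOCK_BYTES - 1)                 # position within the block
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)   # block-granular cache address
    tag = addr >> (OFFSET_BITS + INDEX_BITS)          # part exceeding the cache capacity
    return tag, index, offset
```

Under these assumed sizes, the low 6 bits select a byte within the block, the next 8 bits select the cache entry, and the remaining upper bits form the tag address used for the search.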
The cache index address of EA register 11 is used as an index with which the address array 12 is referred to. The tag address read from the address array 12 is compared by a comparator 13 with the corresponding tag address of EA register 11. When the comparison finds the two identical, namely, when a cache hit occurs, the desired data is in the cache. When the two are not identical, namely, when a cache miss occurs, the desired data is not in the cache and must be fetched from the main memory.
When a cache hit occurs, that is, when the tag data read from the address array 12 is identical to the corresponding tag data of EA register 11, the hit result is delivered to a hit/miss register 14 for registration. At the same time, the cache index address of EA register 11 is registered into EA1 register 16.
The corresponding data of data array 22 is read with reference to the address of EA1 register 16, and a selector 23a selects the desired data from the data read from the data array 22, depending on the hit data provided by the hit/miss register 14, and places it in a register A 24. Then, the selected data is written through a register B 5 into a general purpose register 7a.
In the event of a cache miss, that is, when the tag data read from the address array 12 is not identical to the corresponding tag data of EA register 11, the miss result is delivered to the hit/miss register 14 for registration. A start signal to fetch data from the main memory, generated as a result of the miss, is delivered to an address control unit 3. Further, the addresses of the data to be fetched as block data from the main memory are delivered from EA register 11 to the address control unit 3. The address control unit 3 converts the logical addresses into physical addresses, fetches block data from the main memory 4 and delivers it to a cache device 1a.
Within the cache device 1a, the selector 15a selects the write addresses delivered through a signal line 31, and the data received by a reply register 17 is written into the data array 22.
Incidentally, when a store instruction is delivered to the main memory to change given data, and the original data is in the cache, it is necessary to update the data in the cache as well as that in the main memory. In such a case, the selector 15a selects the write address of EA register 11, the selector 21a selects the corresponding write data of signal line 32, and the write data is written into the data array 22.
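The hit/miss decision made with the address array and the comparator can be sketched as follows. This is a simplified software model, not the circuit of FIG. 3: a direct-mapped organization, the entry count of 256, the use of `None` for an entry holding no tag, and the names `lookup` and `register_tag` are all assumptions made for illustration.

```python
NUM_LINES = 256  # assumed number of entries in the address array

# Model of the address array: one stored tag per cache index.
# None marks an entry that holds no tag yet (an assumption of this sketch).
address_array = [None] * NUM_LINES


def lookup(tag, index):
    """Return True on a cache hit, False on a cache miss.

    The cache index selects one entry of the address array; the tag read
    from that entry is compared with the tag of the requested address.
    """
    return address_array[index] == tag


def register_tag(tag, index):
    """Record the tag of a block placed in the cache after a miss."""
    address_array[index] = tag
```

For example, a first access with a given tag and index misses; once the block has been fetched and `register_tag` has been called, the same access hits, while an access with a different tag at the same index misses.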
The main memory 4 is connected to other information processing units, and, when access requests arrive from different units, competition for processing arises over the main memory. As a result, block load data returned from the main memory to the cache device arrives in an irregular order.
To put it more specifically, because the aforementioned block load data is controlled in terms of basic data units (e.g., eight bytes), no restrictions are imposed on the order in which the block load data is returned to the cache. Block load data returned in response to a request resulting in a cache miss arrives as a cluster of eight reply data, and this cluster contains the desired data (to be referred to as target data hereinafter) to be written into the general purpose register 7a. When the target data is returned from the main memory 4, it is received temporarily by the reply register 17, and written via registers A 24 and B 5 into the general purpose register 7a.
With a conventional cache device, even when returning block data from the main memory to the cache requires a long time, the cache busy signal is continuously asserted from the time when the cache receives the first reply unit data to the time when it receives the last reply unit data, arresting the handling of subsequent cache access requests in order to avoid the difficulties involved in handling them.
FIG. 4 is a timing chart representing the operation of the cache device 1a described above. At timing 1, a request for data fetch from the main memory is dispatched by the instruction control unit 2; at timing 2 it is judged to encounter a cache miss, and the cache miss is registered into the hit/miss register 14. To resolve the cache miss, data must be fetched anew from the main memory. To this end, the address control unit 3 converts the logical addresses into physical addresses, and block load data is fetched from the corresponding addresses of the main memory 4.
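The handling of reply data arriving in arbitrary order can be sketched as follows. This is an illustrative model under stated assumptions: each reply is assumed to carry the position of its unit within the block (standing in for the write address), units are numbered from zero, and the names `fill_block`, `UNIT_BYTES` and `UNITS_PER_BLOCK` are hypothetical.

```python
UNIT_BYTES = 8        # basic data unit (eight bytes, as in the example above)
UNITS_PER_BLOCK = 8   # a block returns as a cluster of eight reply data


def fill_block(replies, target_unit):
    """Assemble a block from replies that may arrive in any order.

    replies: iterable of (unit_index, data) pairs in arrival order, where
             data is UNIT_BYTES long (the pairing is an assumption here).
    target_unit: index of the unit holding the target data.
    Returns (block, target_data); target_data is None if it never arrived.
    """
    block = bytearray(UNIT_BYTES * UNITS_PER_BLOCK)
    target_data = None
    for unit_index, data in replies:
        start = unit_index * UNIT_BYTES
        block[start:start + UNIT_BYTES] = data   # write the unit into its slot
        if unit_index == target_unit:
            target_data = data                   # forward toward the register
    return bytes(block), target_data
```

Because each unit is written to its own slot, the assembled block is the same whatever the arrival order; only the time at which the target data becomes available changes.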
Over at the main memory 4, competition with requests from other information processing units occurs, and thus the block load data return with their order disturbed. More specifically, data a3 returns at timing 9, data a2 at timing 10, data a5 at timing 13, data a1 at timing 17, data a6 at timing 18, data a4 at timing 22, data a7 at timing 23 and data a8 at timing 26.
As the target data, data a1, returns at timing 17, it is registered to the register A 24 at timing 18, to the register B 5 at timing 19, and to the general purpose register 7a at timing 20.
In the above sequence of events, as the block load data starts to arrive from timing 9 onward, the cache busy signal is continuously asserted from timing 9 until the whole of the block load data has been received by the cache. In this particular example, the cache busy signal is released when timing 26 has passed.
On the other hand, a subsequent request for fetch of data from the main memory 4 arises at timing 12. Because the cache busy signal remains active from timing 9 to timing 26, subsequent requests are not accepted, and no request is executed until timing 27, when the cache busy signal is released. At timing 27, a cache search is performed for one of the subsequent requests for data fetch from the main memory, and when a cache hit is found, the target data is read from the data array, and at timing 31 that data is written into the general purpose register 7a.
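The cost of the cache busy period in this example can be worked out with a short sketch. The reply timings are those given above for FIG. 4; the function names `busy_window`, `earliest_service` and `stall` are hypothetical, and the model assumes only that a request arriving while the busy signal is asserted is served at the first timing after the signal is released.

```python
# Reply timings from the FIG. 4 example described above.
REPLY_TIMING = {"a3": 9, "a2": 10, "a5": 13, "a1": 17,
                "a6": 18, "a4": 22, "a7": 23, "a8": 26}


def busy_window(reply_timing):
    """Cache busy is asserted from the first reply to the last reply."""
    timings = reply_timing.values()
    return min(timings), max(timings)


def earliest_service(request_timing, reply_timing):
    """Timing at which a subsequent request can start its cache search."""
    start, end = busy_window(reply_timing)
    if start <= request_timing <= end:
        return end + 1          # held until the busy signal is released
    return request_timing


def stall(request_timing, reply_timing):
    """Number of timings a subsequent request is held back."""
    return earliest_service(request_timing, reply_timing) - request_timing
```

With these figures the busy window spans timings 9 to 26, so the request arising at timing 12 starts its cache search at timing 27, a delay of 15 timings, even though the target data a1 itself already arrived at timing 17.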
The conventional cache device whose operation proceeds as described above has the following problems. In a cache device incorporating a non-blocking mechanism which can manage subsequent memory access instructions without ignoring them even when a cache miss occurs, the desired data (the target data) contained in block data returned from the main memory in response to the cache miss must be written into the general purpose register 7a. During this process, however, because the cache incorporates a non-blocking mechanism, at least one subsequent data fetch instruction may access the cache. At the timing when the subsequent data fetch instruction accesses the cache, the target data delivered from the main memory 4 and the data selected in response to the subsequent fetch instruction may compete for the path to the general purpose register. To avoid this conflict, cache access by any subsequent data fetch instruction is inhibited in such a case, and the registration of the target data in the block data fetched from the main memory is allowed to take precedence. Consequently, while the target data of the block load data is being returned to the cache, access to the cache by subsequent instructions is inhibited, which leads to a lowered efficiency of memory access management.
The reason why such inefficient memory access results lies in the fact that the cache is so constructed that a single path acts both as a data path for data returned from the main memory and as an access path when a cache hit occurs. Further, when block data returned from the main memory is directly stored in the cache, it may happen that the cache busy signal is kept active for a long time, and in the meantime execution of subsequent instructions for fetch of data from the main memory is halted. This is done to prevent those subsequent instructions from accessing the cache, because data may arrive at the cache at any time from the main memory.
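The competition for the single path to the general purpose register can be pictured with the following minimal sketch. It is only a software analogy of the precedence rule described above; the function name `select_path_source` and the string labels are hypothetical.

```python
def select_path_source(reply_target_ready, hit_read_pending):
    """Decide which source drives the shared path in one timing.

    reply_target_ready: target data from the main memory is waiting to be
                        returned to the general purpose register.
    hit_read_pending:   a subsequent fetch instruction has hit in the cache
                        and wants to read the data array.
    Returns "reply", "hit_read", or None when the path is idle.
    """
    if reply_target_ready:
        return "reply"      # target data takes precedence; cache access is inhibited
    if hit_read_pending:
        return "hit_read"
    return None
```

When both inputs are true, only the reply data is served, which corresponds to the subsequent instruction being held back even though its data is already in the cache.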