1. Field of the Invention
The present invention relates to a cache memory and a cache system, and more specifically to a cache memory and a cache system for realizing quick access, and particularly to an improvement of the system buses thereof.
2. Description of the Related Art
A buffer storage (hereinafter referred to as a cache memory) has appeared as an effective means for realizing quick access to main memory to meet the needs of a high-speed modern microprocessor. A microprocessor requires the main memory to have a relatively large capacity so as to store programs and data. Performance of a system using a microprocessor is largely influenced by access time to main memory. Therefore it is desirable to shorten memory access time. However, recently, the need to shorten memory access time has outpaced the improvement in performance of the DRAM (Dynamic Random Access Memory) chips generally used in main memory.
In addition to the limited performance of the DRAM chips, the enlargement and diversification of systems, such as multiprocessor systems consisting of a plurality of microprocessors, require complicated bus control, and hence it becomes difficult to shorten the memory access time.
In order to solve the above-mentioned problems, one may use a hierarchical memory. A cache memory system is one way of constructing such a hierarchical memory. This cache memory system includes a quick-access cache memory and the main memory. The cache memory is provided between the microprocessor and the main memory so that, by accessing the cache memory, access equivalent to quick access of the main memory is realized.
FIG. 1 is a view schematically showing a configuration of a conventional cache system. In FIG. 1, the cache memory system comprises a data processor 1 consisting of a microprocessor, a cache memory 2 for buffering the memory access of the data processor 1, a main memory 3 as the main memory of the cache memory system, and a bus driver circuit 4 for controlling the bus connections between the data processor 1, the cache memory 2 and the main memory 3.
The cache memory 2 stores a part (copy) of the memory contents of the main memory 3. When a copy of the data requested by a read access from the data processor 1 is not stored in the cache memory 2, the cache memory 2 generates a cache-miss signal 5 and gives it to the bus driver circuit 4. Buses 10a and 10b each include a data bus, an address bus and a control bus for transmitting control signals (read/write instruction signals etc.). The operation is described in the following.
The cache memory system is one in which, responsive to requests from the data processor 1, data of a frequently used area of the main memory 3 is stored in the cache memory 2, which is a quick-access buffer storage, and upon a request from the data processor 1, the requested data is rapidly read from or written to the cache memory 2 instead of the main memory 3.
The cache memory 2 does not store fixed data; rather, the area of the main memory 3 held in the cache memory 2 changes responsive to requests from the data processor 1. Data which is fetched from the main memory 3 and stored in the cache memory 2 responsive to a request from the data processor 1 has a good chance of being accessed again for a while thereafter. Thus, once the data of the main memory 3 is stored in the cache memory 2, the effect of the cache memory as a quick-access memory is exhibited and no-wait memory access by the data processor 1 is realized.
When the data processor 1 requests data, it first accesses the cache memory 2 and tries to read the data. At this time, if the requested area of the main memory 3 has not been accessed before, the data has never been transferred from the main memory 3 to the cache memory 2. There is thus no data to access in the cache memory 2. A case where the requested data is not present in the cache memory 2 is called a cache miss, while a case where the requested data is present in the cache memory 2 is called a cache hit. Since a cache miss has occurred, the cache memory 2 generates a cache-miss signal 5 and gives it to the bus driver circuit 4. Responding to the cache-miss signal 5, the bus driver circuit 4 connects the bus 10a and the bus 10b. By this bus connection, the bus driver circuit 4 passes address information and a read instruction signal from the cache memory 2 to the main memory 3 via the bus 10b.
The main memory 3, in accordance with the signal and address given via the bus driver circuit 4, reads the requested data and gives it to the data processor 1 under control of the bus driver circuit 4. At this time, a block including the requested data and having a predetermined size is stored in the corresponding area of the cache memory 2. In some cases, the data requested by the data processor 1 is taken in by the data processor 1 in parallel with the writing of the data block into the cache memory 2; in other cases, the requested data is given to the data processor 1 at the end, after the desired data block has been written into the cache memory 2. Furthermore, there is an operating configuration wherein the requested data is first given to the data processor 1, and then the data block to which this requested data belongs is written into the cache memory 2. At this time, since the data processor 1 takes in only one word of data while the cache memory 2 writes the whole block including the requested data, the data processor 1 is held in a waiting state under control of the bus driver circuit 4 while the data is written into the cache memory 2.
When the data processor 1 receives the requested data in units of blocks, the data is transferred in parallel and simultaneously to the data processor 1 and the cache memory 2. That is, the bus driver circuit 4, responding to the cache-miss signal 5, judges that the data block including the requested data must be sent from the main memory 3, and controls the transfer of this data block (the block is usually determined by a tag address). When the above-mentioned operation has been repeated and a certain amount of data has been cached in the cache memory 2 (that is, a copy of data of the main memory 3 is stored in the cache memory 2), the probability that the requested data is stored in the cache memory 2 when the data processor 1 accesses it is high. In this case, the requested data is outputted from the cache memory 2 and transmitted to the data processor 1. At this time, since the cache-miss signal 5 is not generated, the bus driver circuit 4 separates the bus 10a and the bus 10b.
By accessing the quick-access cache memory 2 at the time of a cache hit in this manner, the data processor 1 can access memory rapidly. The data processor 1 is thereby able to operate without any deterioration of its processing speed, and the cache system itself can operate rapidly.
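The read path described above (probe the cache memory, and on a cache miss fetch the whole block from the main memory) can be sketched roughly as follows. This is only an illustrative model, not the circuit of FIG. 1: the class name `SimpleCache`, the 4-word block size and the word-addressed list standing in for the main memory 3 are all assumptions, and capacity limits and block replacement are not modeled.

```python
BLOCK_WORDS = 4  # assumed block size in words (the source only says "predetermined size")


class SimpleCache:
    """Toy stand-in for cache memory 2 placed in front of main memory 3."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # list of words, stands in for main memory 3
        self.blocks = {}                # block number -> list of cached words

    def read(self, address):
        """Return (data, hit) for a word address."""
        block_no, offset = divmod(address, BLOCK_WORDS)
        if block_no in self.blocks:
            # Cache hit: the word is served from the cache alone.
            return self.blocks[block_no][offset], True
        # Cache miss: copy the whole block containing the word from main
        # memory into the cache, then return the requested word.
        start = block_no * BLOCK_WORDS
        self.blocks[block_no] = self.main_memory[start:start + BLOCK_WORDS]
        return self.blocks[block_no][offset], False


main_memory = list(range(100, 116))  # 16 words holding the values 100..115
cache = SimpleCache(main_memory)
first = cache.read(5)   # first access to block 1: cache miss
second = cache.read(6)  # same block, now cached: cache hit
```

Here the first access to word 5 misses and loads words 4 to 7, so the following access to word 6 is served from the cache without touching the main memory.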
The aforementioned operation is similar in the case where the data processor 1 writes a processing result into the main memory 3: when the data of the address requested to be accessed by the data processor 1 is held in the cache memory 2, the content requested to be accessed is rewritten by new data. At this time, the content of the main memory 3 is also rewritten by the new data. As systems for rewriting data of the main memory 3, there is a write-back system in which, only upon occurrence of a cache miss, the data block in the main memory 3 including the address to be rewritten is transferred to the cache memory 2, and the data is then rewritten only in the cache memory 2. There is also a write-through system in which, at the time of a cache hit, both the data in the cache memory 2 at the address requested to be rewritten and the data in the main memory 3 are rewritten, and at the time of a cache miss, the data is written only to the requested address in the main memory 3.
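The difference between the two rewriting systems can be illustrated with a minimal sketch. The function names, the dictionary used as a cache and the 4-word block size are assumptions made for illustration; the later copying of rewritten blocks from the cache back to the main memory, which a write-back system eventually needs, is not described in this section and is not modeled.

```python
BLOCK_WORDS = 4  # assumed block size in words


def write_through(cache, memory, address, value):
    """Write-through: main memory is always written; the cache only on a hit."""
    block_no, offset = divmod(address, BLOCK_WORDS)
    if block_no in cache:
        cache[block_no][offset] = value  # cache hit: rewrite the cached copy too
    memory[address] = value              # hit or miss: rewrite main memory


def write_back(cache, memory, address, value):
    """Write-back: on a miss the block is fetched; only the cache is rewritten."""
    block_no, offset = divmod(address, BLOCK_WORDS)
    if block_no not in cache:
        start = block_no * BLOCK_WORDS
        cache[block_no] = memory[start:start + BLOCK_WORDS]  # fetch block on miss
    cache[block_no][offset] = value      # main memory is left unchanged here


# Both examples start from an empty cache, so each write is a cache miss.
memory_wt, cache_wt = [0] * 8, {}
write_through(cache_wt, memory_wt, 5, 9)

memory_wb, cache_wb = [0] * 8, {}
write_back(cache_wb, memory_wb, 5, 9)
```

After these calls, the write-through example has updated only the main memory, while the write-back example has loaded block 1 into the cache and updated only the cached copy.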
As stated above, by providing a buffering quick-access cache memory 2 between the main memory 3 and the data processor 1, and accessing the cache memory 2, the data processor 1 can access data rapidly, and as a result, quick processing can be executed.
However, when the address space required by the data processor 1 becomes larger, the capacity of the main memory 3 inevitably increases as well. In this case, when only one cache memory 2 of small capacity is used, the cached portion of the main memory 3 becomes relatively smaller and the cache hit ratio decreases, so that the effect of the high-speed cache memory system is spoiled. Although the cache memory capacity may be increased, the quick-access cache memory is expensive, and increasing its capacity results in a high system cost.
In order to solve such problems, one may consider adopting a multi-cache system, which constitutes a large-capacity cache system by using a plurality of small-capacity cache memories, with the individual cache memories caching different address areas of the main memory 3.
FIG. 2 is a block diagram showing an example of the configuration of the above-mentioned multi-cache system. In FIG. 2, the multi-cache system includes, in addition to the data processor 1, the main memory 3 and the bus driver circuit 4, first and second cache memories 2a and 2b whose address areas in the main memory 3 to be cached are different, a first logic circuit 7 for generating a cache-miss signal, and a second logic circuit 8 for generating a non-cachable area signal.
The first cache memory 2a generates the first cache-miss signal 5a and the first non-cachable area signal 9a. The second cache memory 2b generates the second cache-miss signal 5b and the second non-cachable area signal 9b.
The first logic circuit 7 consists of an "OR" circuit which receives the first and second cache-miss signals 5a, 5b, generates a third cache-miss signal 5c indicating that the cache miss occurs in the cache memory 2 (the cache memory consisting of the first and second cache memories 2a, 2b) when either the first cache-miss signal 5a or second cache-miss signal 5b is generated, and gives it to the bus driver circuit 4.
The second logic circuit 8 consists of an "AND" circuit which receives the first non-cachable area signal 9a and the second non-cachable area signal 9b, generates a third non-cachable area signal 9c indicating that the non-cachable area of the cache memory 2 is accessed, when both the first and second non-cachable area signals 9a and 9b are generated, and gives it to the bus driver circuit 4.
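The two combining circuits can be expressed as simple Boolean functions. This is a minimal sketch in which each signal is modeled as a Python `bool` that is `True` while the signal is generated; the function names are assumptions chosen for illustration.

```python
def third_cache_miss(miss_5a: bool, miss_5b: bool) -> bool:
    """First logic circuit 7 (OR): 5c is generated if either cache misses."""
    return miss_5a or miss_5b


def third_non_cachable(nc_9a: bool, nc_9b: bool) -> bool:
    """Second logic circuit 8 (AND): 9c is generated only if both caches
    report that the address lies outside their own cachable areas."""
    return nc_9a and nc_9b


# The address belongs to cache 2a, which misses; cache 2b reports non-cachable.
signal_5c = third_cache_miss(True, False)
signal_9c = third_non_cachable(False, True)
```

In this example the third cache-miss signal 5c is generated while the third non-cachable area signal 9c is not, which corresponds to a miss in a cachable area.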
The first and second cache memories 2a and 2b cache address areas as shown in FIG. 3.
In FIG. 3, the main memory 3 includes four address areas A, B, C, and D. The address area A is the area in which the two most significant bits of the address are "00", the address area B is the area in which they are "01", the address area C is the area in which they are "10", and the address area D is the area in which they are "11". These address areas may be either physical spaces or logical spaces.
The first cache memory 2a caches data in the address area B and the second cache memory 2b caches data in the address area D. For judging whether an address is in its own cachable area or not, each of the first and second cache memories 2a and 2b includes a comparing circuit, which compares the two most significant bits of the address with the values of the two most significant bits of the address area allocated to that cache memory, and detects whether they match. When the address requested to be accessed is outside the area allocated to the cache memory, its comparing circuit generates the non-cachable area signal 9a or 9b.
Next, the operation of the prior art is described.
The first cache memory 2a caches data when the two most significant bits of the address are "01", and the second cache memory 2b caches data when the two most significant bits of the address are "11". That is, in a 32-bit address space, the address space from address 40000000H to 7FFFFFFFH (H represents hexadecimal notation) is the cache space for the first cache memory 2a, and the address space from address C0000000H to FFFFFFFFH is the cache space for the second cache memory 2b.
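The mapping from the two most significant bits to the address areas of FIG. 3, and the resulting ranges in a 32-bit address space, can be checked with a short sketch. The helper names `top_two_bits`, `area_of` and `area_range` are assumptions introduced for illustration only.

```python
ADDRESS_BITS = 32
AREA_NAMES = {0b00: "A", 0b01: "B", 0b10: "C", 0b11: "D"}


def top_two_bits(address: int) -> int:
    """Extract the two most significant bits of a 32-bit address."""
    return (address >> (ADDRESS_BITS - 2)) & 0b11


def area_of(address: int) -> str:
    """Return the name of the address area (A to D) containing the address."""
    return AREA_NAMES[top_two_bits(address)]


def area_range(bits: int) -> tuple:
    """Return the (first, last) address of the area selected by the two bits."""
    start = bits << (ADDRESS_BITS - 2)
    return start, start + (1 << (ADDRESS_BITS - 2)) - 1


range_2a = area_range(0b01)  # cache space of the first cache memory 2a
range_2b = area_range(0b11)  # cache space of the second cache memory 2b
```

Each area covers one quarter of the 32-bit address space, so the two cache memories together cover half of it; areas A and C remain non-cachable.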
In such an environment, when the data processor 1 accesses the address 40000000H, this address is given to the first and second cache memories 2a and 2b in parallel. Since this address is in its own cachable area, the first cache memory 2a does not generate the non-cachable area signal 9a. Having judged that the address is in its own caching area, the first cache memory 2a then checks whether it holds a copy of the data at the accessed address. When there is no copy, a cache miss is judged to have occurred and the cache-miss signal 5a is generated.
The first logic circuit 7 generates the third cache-miss signal 5c responding to the first cache-miss signal 5a and gives it to the bus driver circuit 4. Meanwhile, since the first non-cachable area signal 9a is not generated, the second logic circuit 8 does not generate the third non-cachable area signal 9c.
Responding to the states of the third cache-miss signal 5c and the third non-cachable area signal 9c, the bus driver circuit 4 judges that an address in the cachable area has been accessed but a cache miss has occurred in the cache memory 2, and connects the bus 10a and the bus 10b. Responding to the third cache-miss signal 5c, the bus driver circuit 4 accesses the corresponding address data of the main memory 3, transfers a block including the data of the accessed address to the first cache memory 2a, and transmits the data requested to be accessed to the data processor 1. The data transfer to the data processor 1 may be made after or before transferring the data block to the first cache memory 2a, or the data block may be transferred to the data processor 1.
When the second cache memory 2b generates the cache-miss signal 5b, the operation is similar to that described above: the second cache memory 2b and the main memory 3 are connected to transfer the data block.
When the data processor 1 accesses the address 80000000H, each of the first and second cache memories 2a and 2b compares the accessed address with the address area allocated to itself, and judges whether the address requested to be accessed is in its cachable area or not. In this case, since the address 80000000H is in the non-cachable area for both the first and second cache memories 2a and 2b, the first and second non-cachable area signals 9a and 9b are generated respectively.
When the first and second non-cachable area signals 9a and 9b are generated, in the first and second cache memories 2a and 2b, it is not necessary to judge whether there is a corresponding address copy therein or not, and the cache-miss signals 5a and 5b are not generated. Thus, in this case, the third non-cachable area signal 9c is generated by the second logic circuit 8, while the third cache-miss signal 5c is not generated.
Responding to the generated non-cachable area signal 9c, the bus driver circuit 4 connects the bus 10a and the bus 10b and merely transfers the data of the address 80000000H requested by the data processor 1 to the data processor 1. Since the cache-miss signal 5c is not generated at this time, the bus driver circuit 4 merely controls the transfer of one-word data such that the block transferring operation to the cache memory 2a or 2b does not occur (only when the data block is not received by the data processor).
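The two accesses walked through above (a miss at 40000000H and a non-cachable access at 80000000H) can be traced with a small behavioral sketch. It is not a model of the actual hardware: the class `AreaCache`, the function `bus_action`, the returned strings and the 16-byte block size are assumptions, and the cache contents are reduced to a set of block numbers.

```python
BLOCK_BYTES = 16  # assumed block size; the source does not give one


class AreaCache:
    """Toy cache memory that caches one address area, chosen by two bits."""

    def __init__(self, area_bits):
        self.area_bits = area_bits
        self.cached_blocks = set()

    def probe(self, address):
        """Return (non_cachable, miss) for a 32-bit address."""
        if (address >> 30) & 0b11 != self.area_bits:
            return True, False  # outside own area: no hit/miss check is made
        return False, (address // BLOCK_BYTES) not in self.cached_blocks


def bus_action(address, cache_2a, cache_2b):
    """Combine the signals as in FIG. 2 and report what the bus driver does."""
    nc_9a, miss_5a = cache_2a.probe(address)
    nc_9b, miss_5b = cache_2b.probe(address)
    miss_5c = miss_5a or miss_5b  # first logic circuit 7 (OR)
    nc_9c = nc_9a and nc_9b       # second logic circuit 8 (AND)
    if nc_9c:
        return "single-word transfer"  # buses connected, no block fill
    if miss_5c:
        # The block is written into whichever cache reported the miss.
        for cache, miss in ((cache_2a, miss_5a), (cache_2b, miss_5b)):
            if miss:
                cache.cached_blocks.add(address // BLOCK_BYTES)
        return "block transfer"
    return "cache hit"  # buses 10a and 10b stay separated


cache_2a = AreaCache(0b01)  # address area B
cache_2b = AreaCache(0b11)  # address area D
first = bus_action(0x40000000, cache_2a, cache_2b)
second = bus_action(0x40000000, cache_2a, cache_2b)
third = bus_action(0x80000000, cache_2a, cache_2b)
```

The first access to 40000000H misses and triggers a block transfer, the repeated access hits, and the access to 80000000H falls outside both cachable areas and results in a single-word transfer.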
The above-mentioned cache judging operation is performed similarly at the time of writing data.
By providing a plurality of cache memories and allocating independent cachable areas to the cache memories as mentioned hereinabove, the cache memory capacity can be increased using small-capacity cache memories, and the cache hit ratio and system performance can be improved.
As mentioned above, for example, by utilizing the difference between a locality of access of the program instruction sequence and a locality of data access and by allocating respective address areas to the cache memories, the cache hit ratio in a large-capacity address space can be improved using the small-capacity cache memories.
However, each of the cache memories can only judge whether a given address is in its own non-cachable area or not; it cannot judge whether the address is in the overall cachable area of the cache memory system, nor can it know which address areas are allocated to the other cache memories. Thus, when designing the system, the address areas must be allocated to the cache memories such that the cachable areas of the respective cache memories do not overlap. To this end, it is necessary to analyze the data arrangement structure in the address space of the main memory and allocate the most efficient address areas, resulting in a long analyzing time.
For judging whether a given address is in its own cachable area or not, the cache memory has to compare addresses internally. Hence, when the number of cache memories increases, the number of address bits to be compared naturally increases. It takes a long time (for example 27 ns) to judge by this comparison that an address is cachable, and the judgment of whether there is a cache hit is made only after the cachable judgment is complete, so that the quick accessibility of the cache memory is degraded.
The bus driver circuit 4 must wait for all the non-cachable area signals to pass through the one-stage logic circuit (second logic circuit 8) before it can judge that all of the cache memories are generating the non-cachable area signals. It therefore takes a long time for the bus driver circuit 4 to connect the data processor 1 and the main memory 3 in response to the output of the second logic circuit 8, and hence the high-speed operability of the system is deteriorated.
In the conventional multi-cache system having the configuration shown in FIG. 2 as described above, the cachable areas are allocated to the cache memories, and access between the data processor 1 and the main memory 3 is controlled by the bus driver circuit 4 on the basis of the cachable judging results from all the cache memories. Consequently, the wait time required when the data processor 1 accesses the main memory 3 is usually longer than in a single cache system, and the quick accessibility of the cache memory system is spoiled.