1. Field of the Invention
The present invention relates to a cache memory in which invalidating processing or copy-back processing is performed to invalidate data of only a data block of a specified address or to write back the data to a main memory.
2. Description of Related Art
A cache memory and a main memory are used in a large scale integration circuit (hereinafter called an LSI) having a central processing unit (hereinafter called a CPU). That is, data of some areas of the main memory is written in data blocks of the cache memory. Also, a bus master that cannot access the cache memory, such as a direct memory access controller (hereinafter called a DMA controller), is used for a direct memory access. In a case where such a bus master gains access to the main memory to perform a DMA transfer between the main memory and another main memory or a memory other than the cache memory, invalidating processing and/or copy-back processing is first performed for the cache memory according to software (a program) executed in the CPU, to maintain the coherency between the data of those areas of the main memory and the data of the data blocks of the cache memory. Thereafter, the DMA transfer is performed under control of the bus master.
Because the DMA transfer changes data of the main memory to updated data, the data of each data block of the cache memory then differs from the updated data of the corresponding area of the main memory. Therefore, the data of the cache memory is invalidated in the invalidating processing.
Also, in a case where updated data transmitted from the CPU through a data bus is written in a remarked data block of the cache memory, the updated data stored in the remarked data block of the cache memory differs from old data stored in a corresponding area of the main memory. When the bus master gains access to the main memory to read out data of the main memory in the DMA transfer, it is required to change the old data of the main memory to the updated data before the DMA transfer. Therefore, the copy-back processing is performed for the cache memory and the main memory to write back the updated data of the remarked data block of the cache memory to the corresponding area of the main memory.
FIG. 8 is a block diagram of a conventional cache memory. In FIG. 8, a conventional cache memory has a memory access control unit 1, a tag memory 2 and a data memory 3. The tag memory 2 and the data memory 3 are respectively operated in synchronization with a clock signal. The memory access control unit 1 controls the tag memory 2 and the data memory 3 by sending a plurality of control signals (memory enable signals, write enable signals, address input signals and data input signals) to the tag memory 2 and the data memory 3. In the data memory 3, a plurality of data blocks are placed to store pieces of cached data sent from a main memory (not shown). In the tag memory 2, pieces of tag information are stored in a plurality of entries. Each data block placed in the data memory 3 is specified according to the tag information stored in the corresponding tag block of the tag memory 2. A memory enable signal S10 is input to the tag memory 2 to control an access operation to the tag memory 2. A write enable signal S11 is input to the tag memory 2 to control the writing of data to the tag memory 2. An address input signal S12 is input to the tag memory 2 to specify an address of a specific tag block. A data input signal S13 indicating data is input to the tag memory 2 to write tag information in the specific tag block of the tag memory 2 specified by the address input signal S12. A memory enable signal S15 is input to the data memory 3 to control an access operation to the data memory 3. A write enable signal S16 is input to the data memory 3 to control the writing of data to the data memory 3. An address input signal S17 indicating the same address as that indicated by the address input signal S12 is input to the data memory 3 to specify an address of a specific data block. 
A data input signal S18 indicating data output from the main memory and the CPU is input to the data memory 3 to write the data in a data block of the data memory 3 specified by the address input signal S17. Circuits relating to the access operation performed under the control of a CPU (not shown) are omitted in FIG. 8.
In a read operation for the conventional cache memory, a data output signal S14 indicating tag information is output from the tag memory 2. A data output signal S19 indicating output data is output from the data memory 3.
The tag information of each tag block of the tag memory 2 is output with the cached data of the corresponding data block of the data memory 3. This tag information includes a tag address indicating a part of an address of an area of the main memory corresponding to the tag block of the tag memory 2, and a combined address obtained by combining the tag address and the address indicated by the address input signal S12 indicates the address of the area of the main memory corresponding to the data block of the data memory 3. Also, in a case where a request of the invalidating processing or the copy-back processing is sent to the memory access control unit 1, the invalidating processing or the copy-back processing is performed in the conventional cache memory.
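The combination of the tag address with the address indicated by the address input signal S12 can be illustrated with a minimal sketch. The bit widths below (INDEX_BITS, OFFSET_BITS) and the function names are assumptions chosen only for illustration; the description above does not state them.

```python
# Hypothetical geometry, not taken from the description above.
INDEX_BITS = 6   # assumption: 64 tag blocks addressed by S12
OFFSET_BITS = 4  # assumption: 16-byte data blocks


def split_address(address: int) -> tuple[int, int]:
    """Split a main-memory address into (tag address, S12 index)."""
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag_address = address >> (INDEX_BITS + OFFSET_BITS)
    return tag_address, index


def combined_address(tag_address: int, index: int) -> int:
    """Rebuild the base address of the main-memory area for one data block."""
    if not 0 <= index < (1 << INDEX_BITS):
        raise ValueError("index does not fit in INDEX_BITS")
    return (tag_address << (INDEX_BITS + OFFSET_BITS)) | (index << OFFSET_BITS)
```

For example, under these assumed widths, splitting the address 0x12345 and recombining the two parts yields 0x12340, the base address of the area that contains it.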
The tag information stored in each tag block of the tag memory 2 has a valid bit and a dirty bit. The valid bit of the tag information indicates whether cached data stored in the corresponding data block of the data memory 3 is valid or invalid. The dirty bit of the tag information indicates whether or not cached data of the corresponding data block of the data memory 3 differs from original data in the main memory and whether or not it is required to write back the cached data to the main memory.
Next, an operation of the invalidating processing will be described below.
FIG. 9 is a timing chart of the invalidating processing performed for all areas of the conventional cache memory shown in FIG. 8.
In FIG. 9, in a case where the memory enable signal S10 set to a high level and the write enable signal S11 set to a high level are output from the memory access control unit 1 to the tag memory 2, the write access is performed for the tag memory 2 in synchronization with a clock signal S30. Also, in a case where the memory enable signal S10 set to the high level and the write enable signal S11 set to a low level are output from the memory access control unit 1 to the tag memory 2, the read access is performed for the tag memory 2 in synchronization with the clock signal S30.
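The relation between the two control signals and the resulting access can be written as a small truth table. This is only a sketch: the function name is hypothetical, and the "idle" result for a low memory enable signal is an assumption inferred from the copy-back description, in which the memory enable signals are set to the low level while the read access is suspended.

```python
def tag_access(memory_enable: int, write_enable: int) -> str:
    """Decode S10 (memory enable) and S11 (write enable) into an access type."""
    if not memory_enable:
        return "idle"  # assumption: no access is performed while S10 is low
    return "write" if write_enable else "read"
```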
When the CPU recognizes that the invalidating processing for the conventional cache memory is needed, an invalidating processing request signal S31 set to a high level is transmitted from the CPU to the memory access control unit 1 according to software (a program) executed in the CPU.
In response to the invalidating processing request signal S31 of the high level, the memory access control unit 1 controls the tag memory 2 and the data memory 3. In detail, the memory enable signal S10 set to the high level and the write enable signal S11 set to the high level are input to the tag memory 2. Also, an address input signal S12 indicating a top address “0” of the tag memory 2 is input to the tag memory 2 with a data input signal S13 indicating a valid bit set to “0” in synchronization with the clock signal S30.
In the tag memory 2, the valid bit set to “0” is written in a tag block of the tag memory 2 specified by the address input signal S12. The valid bit set to “0” indicates that the cached data of the corresponding data block of the data memory 3 is invalid. Also, a valid bit set to “1” indicates that the cached data of the corresponding data block of the data memory 3 is valid.
Thereafter, in the memory access control unit 1, the address input signal S12 is incremented to specify a next address of the tag memory 2 corresponding to a next data block of the data memory 3, and a valid bit set to “0” is written in a next tag block of the next address of the tag memory 2 in the same manner. This writing operation is performed for all addresses of the tag memory 2. When the writing operation of the valid bit for all addresses ranging from the top address “0” to a final address “N” is completed, all tag blocks of the tag memory 2 have the valid bit set to “0”, and the invalidating processing is completed.
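The address sweep described above can be modeled in a few lines. This is a simplified sketch, assuming one tag write per clock cycle and representing the tag memory as a plain list of valid bits; the function name is hypothetical.

```python
def invalidate_all(valid_bits: list[int]) -> int:
    """Write valid bit 0 to every tag block; return the number of write cycles."""
    cycles = 0
    for address in range(len(valid_bits)):  # S12 runs from 0 to the final address N
        valid_bits[address] = 0             # S13 carries valid bit 0, S10 and S11 high
        cycles += 1                         # assumption: one write per clock cycle
    return cycles
```

The sketch makes the cost visible: the number of write cycles equals the number of tag blocks, regardless of how many blocks actually hold data affected by the DMA transfer.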
Next, an operation of the copy-back processing will be described below.
FIG. 10 is a timing chart of the copy-back processing performed for all areas of the conventional cache memory shown in FIG. 8.
In FIG. 10, a copy-back processing request signal S32 is output from the CPU to the cache memory. In a case where data of the cache memory is written back to the main memory, a busy signal S33 is set to a high level. Also, the busy signal S33 is set to a low level in a case where the writing-back of data to the main memory is completed. The busy signal S33 set to the high level is transmitted from the CPU to the memory access control unit 1 to temporarily stop the read access performed under the control of the memory access control unit 1 during the writing-back to the main memory. A dirty bit S34 set to a high level is included in the data output signal S14. The dirty bit S34 of the high level indicates that the writing-back of data to the main memory is needed due to the difference between data of the cache memory and data of the main memory. This difference occurs when data transmitted from the CPU is written in the cache memory.
When the CPU recognizes the necessity of the copy-back processing, a copy-back processing request signal S32 set to the high level is transmitted from the CPU to the memory access control unit 1 according to software (a program) executed in the CPU. In response to the copy-back processing request signal S32 of the high level, the memory access control unit 1 inputs memory enable signals S10 and S15 set to the high level to the tag memory 2 and the data memory 3 respectively, and inputs address input signals S12 and S17 indicating the top address “0” of the tag memory 2 and the top address “0” of the data memory 3 to the tag memory 2 and the data memory 3 respectively.
Thereafter, the valid bit and the dirty bit S34 are output as a data output signal S14 from the tag block of the tag memory 2 specified by the address input signal S12. Also, data is output as a data output signal S19 from a data block of the data memory 3 indicated by the address input signal S17. In a case where the valid bit set to “1” (or high level) and the dirty bit set to “1” (or high level) are output, because the data output from the data memory 3 is valid, the writing-back of the data from the data memory 3 to the main memory is needed. In contrast, in a case where the valid bit set to “0” (or low level) or the dirty bit set to “0” (or low level) is output, the writing-back of the data from the data memory 3 to the main memory is not needed.
In a case where the valid bit set to “0” or the dirty bit set to “0” included in the data output signal S14 is received in the CPU, the address input signals S12 and S17 are incremented by the memory access control unit 1 so as to indicate a next address of the tag memory 2 and a next address of the data memory 3 respectively, and no other signal is changed. In contrast, in a case where the valid bit set to “1” and the dirty bit set to “1” are received in the CPU, the dirty bit set to “1” is no longer needed, because the data is written back in the following clock cycles as described later. Therefore, in the memory access control unit 1, the write enable signal S11 is set to the high level, the address input signals S12 and S17 indicating the same addresses of the tag memory 2 and the data memory 3 are again input to the tag memory 2 and the data memory 3 respectively, and the dirty bit set to “0” is written in the tag block specified by the address input signal S12.
When the valid bit set to “1” and the dirty bit set to “1” are received in the CPU, the CPU judges that the writing-back of the data to the main memory is needed, and the data output from the data memory 3 is written back to the main memory in the following clock cycles. During the writing-back of the data, the CPU sets the busy signal S33 to the high level, and the memory access control unit 1 sets both the memory enable signals S10 and S15 to the low level. Though the address input signals S12 and S17 are incremented by the memory access control unit 1 in response to the leading edge of the busy signal S33, the incrementing of the address input signals S12 and S17 is stopped during both the time period in which the busy signal S33 is at the high level and one clock cycle after that time period.
When the writing-back of the data to the main memory is completed, the busy signal S33 is set to the low level by the CPU, the memory enable signals S10 and S15 are again set to the high level together, the read access to the tag memory 2 and the data memory 3 is restarted by using the address input signals S12 and S17 already incremented, and the read operation for a next address of the tag memory 2 and a next address of the data memory 3 is performed in the same manner. Thereafter, tag information and data are read out from the tag memory 2 and the data memory 3 one after another. When the writing-back of data of the final address “N” is completed, the copy-back processing is completed.
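The copy-back sweep described above can be summarized as follows. This is a behavioral sketch only, not a cycle-accurate model: the TagBlock class, the dictionary used for the main memory, and the index_bits parameter are all assumptions introduced for illustration, and the busy signal S33 is represented simply by the fact that the loop does not advance until the write-back of a block has finished.

```python
from dataclasses import dataclass


@dataclass
class TagBlock:
    tag_address: int  # upper part of the main-memory address
    valid: bool       # valid bit
    dirty: bool       # dirty bit (S34)


def copy_back_all(tags: list[TagBlock], data: list[str],
                  main_memory: dict[int, str], index_bits: int) -> list[int]:
    """Sweep addresses 0..N and write back every valid, dirty data block.

    Returns the list of cache addresses that were written back.
    """
    written_back = []
    for index, (tag, block) in enumerate(zip(tags, data)):  # S12 and S17 in step
        if tag.valid and tag.dirty:
            tag.dirty = False  # dirty bit 0 is written to the same tag block
            # Hypothetical block-granular address: tag address combined with index.
            block_address = (tag.tag_address << index_bits) | index
            main_memory[block_address] = block  # write-back (busy signal high)
            written_back.append(index)
        # Otherwise only the address is incremented.
    return written_back
```

Every address from the top address to the final address is read, even though only the valid and dirty blocks cause a write-back.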
The invalidating processing and the copy-back processing described above are performed for all areas of the cache memory. Also, as disclosed in patent literature (pp. 3-8, FIG. 1 and FIG. 2 of Published Unexamined Japanese Patent Application No. 2001-134490), the invalidating processing and the copy-back processing may be performed for a specified entry or a plurality of specified entries of the cache memory, or for a data block or a plurality of data blocks of the cache memory corresponding to an address or a plurality of addresses.
Also, in a case where a bus master such as a DMA controller gains access to specific areas of a main memory to perform a DMA transfer for the specific areas of the main memory, the invalidating processing and the copy-back processing are performed before the DMA transfer.
However, because the conventional cache memory has the above-described configuration, when the bus master gains access to a part of the areas of the main memory, the invalidating processing and the copy-back processing are inevitably performed for all areas of the conventional cache memory, even though that part of the main memory does not correspond to all areas of the conventional cache memory. In detail, in a case where a remarked data block of the data memory 3 does not correspond to any of the areas of the main memory accessed by the bus master, the invalidating of data of the remarked data block of the data memory 3 is not needed. However, the data of the remarked data block is invalidated anyway. Therefore, when the CPU gains access to the conventional cache memory to read out data from the remarked data block of the data memory 3, there is a high probability that the CPU cannot get the data from the conventional cache memory. In other words, a cache miss occurs with high probability. In a case where the cache miss occurs, a cache replacement is performed to write data of the area of the main memory corresponding to the remarked data block into the data memory 3. Therefore, because the cache replacement is performed many times for the conventional cache memory, a problem has arisen that processing time is required for the many cache replacements.
Also, in a case where a remarked data block of the data memory 3 does not correspond to any of the areas of the main memory accessed by the bus master, the writing-back of data of the remarked data block of the data memory 3 to the main memory is not needed. However, in a case where the valid bit of “1” and the dirty bit of “1” are set in the tag block of the tag memory 2 corresponding to the remarked data block of the data memory 3, the data of the remarked data block is written back to the main memory in the copy-back processing even though the writing-back is not needed. Therefore, another problem has arisen that processing time is wasted on writing back data that does not need to be written back.
Also, even in a case where the invalidating processing and the copy-back processing are performed in the conventional cache memory by specifying each entry of the conventional cache memory according to a software architecture, the invalidating processing and the copy-back processing are performed for all data blocks of the specified entries. Therefore, the same problems occur. Also, in a case where the invalidating processing and the copy-back processing are performed while specifying each address of the conventional cache memory by using a software architecture, it is required to specify each address, for which the invalidating processing and the copy-back processing are needed, by using the software architecture. Therefore, another problem has arisen that the processing time is increased as a size of an area requiring the invalidating processing or the copy-back processing is enlarged.
In the patent literature, it is disclosed that the invalidating processing or the copy-back processing is performed only for areas of a main memory needing the invalidating processing or the copy-back processing by sending a request from the CPU to the main memory only once. In detail, in a case where the invalidating processing or the copy-back processing for the main memory and a cache memory is needed, the areas of the main memory to be processed are first specified, addresses from a top address to a final address in the specified areas of the main memory are specified one after another while incrementing the specified address, and the invalidating processing or the copy-back processing is performed for the data block of the cache memory corresponding to each specified address of the main memory.
However, there is a case where the address size of the areas of the main memory to be processed according to the invalidating processing or the copy-back processing is considerably larger than the size of the cache memory. For example, a cache memory having an address size of 1 KB is used for a main memory, and the areas of the main memory to be processed according to the invalidating processing have an address size of 1 MB. In this case, an examined address of the main memory is incremented through the address area of 1 MB, and it is examined whether or not data of each examined address of the main memory is cached in a data block of the cache memory. If data of one examined address of the main memory is cached in a data block of the cache memory, the invalidating processing must be performed for the cached data of that data block. Each time it is examined whether or not data of one examined address of the main memory is cached in the cache memory, the CPU must gain access to the cache memory. Therefore, a problem has arisen that it takes a lot of processing time to perform the invalidating processing for a main memory area of a large size and the cache memory.
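To give a sense of scale for the 1 KB and 1 MB example, the following sketch counts the cache accesses needed by the address-by-address examination. The 16-byte block size and the assumption of one examination per block-sized step are illustrative only; the description above does not state the granularity of the examined address.

```python
KB = 1024
MB = 1024 * KB


def lookup_counts(cache_bytes: int, region_bytes: int,
                  block_bytes: int) -> tuple[int, int]:
    """Return (data blocks in the cache, examinations needed for the region)."""
    cache_blocks = cache_bytes // block_bytes
    region_lookups = region_bytes // block_bytes  # one cache access per examined address
    return cache_blocks, region_lookups
```

Under these assumptions, lookup_counts(1 * KB, 1 * MB, 16) gives 64 data blocks in the cache against 65536 examinations of the main-memory area, so the sweep accesses the cache 1024 times more often than there are blocks that could possibly hold the data.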