The present invention relates to a cache memory apparatus having a plurality of cache memories and a computer readable recording medium on which a program for controlling the cache memory is recorded thereon. More particularly, this invention relates to a cache memory apparatus being capable of avoiding system down caused by occurrence of a parity error and a computer readable recording medium on which a program for controlling the cache memory is recorded thereon.
With the popularization of personal computers, still higher performance is demanded. Accordingly, systems each comprising a cache memory apparatus constituted by a plurality of cache memories to achieve high-speed access are widely used. On the other hand, since the amount of processing performed by a computer is increasing, emphasis must be placed on improving the reliability of such systems. It is also desired that the system continue to operate without stopping even if a minor failure occurs.
FIG. 28 is a block diagram showing the configuration of a conventional cache memory apparatus. The cache memory apparatus shown in FIG. 28 comprises a multiple cache memory (primary cache memory 13 and secondary cache memory 14) to eliminate a difference in the processing speed between a CPU (Central Processing Unit) 11 and a main memory device 16. The CPU 11 accesses the primary cache memory 13, the secondary cache memory 14, or the main memory device 16 to read/write data. The main memory device 16 is a hard disk drive for example. The main memory device 16 has, as characteristic features, a large capacity and an access time which is longer than that of the primary cache memory 13 or the secondary cache memory 14. All the data that is used by the CPU 11 is stored in the main memory device 16.
The primary cache memory 13 and the secondary cache memory 14 are SRAMs (Static Random Access Memories) for example, and have, as a characteristic feature, an access time which is shorter than that of the main memory device 16. The primary cache memory 13 also has, as a characteristic feature, an access time which is shorter than that of the secondary cache memory 14. More specifically, of the primary cache memory 13, the secondary cache memory 14, and the main memory device 16, the primary cache memory 13 has the shortest access time, the secondary cache memory 14 has an access time which is longer than that of the primary cache memory 13, and the main memory device 16 has the longest access time. In addition, with respect to memory capacity, the memory capacity of the main memory device 16 is the largest, the memory capacity of the secondary cache memory 14 is the second largest, and the memory capacity of the primary cache memory 13 is the smallest.
Data transmission between a CPU and a cache memory (main memory device) is generally performed in units of lines. Several methods are available which allow data on the main memory device to correspond to lines in the cache memory. As a typical method, the following set associative method is known. That is, the main memory device and the cache memory are divided into a plurality of sets (set of lines: called way), and data on the main memory device can be placed on determined lines in the way. The set associative method including N ways is called an N-way set associative method. A method in which a cache memory is handled as one way is called a direct mapping method (or one-way set associative method).
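The correspondence between addresses and lines described above can be sketched as follows. This is a minimal illustration, not the circuit of FIG. 30: the line size, the number of lines per way, and the names `split_address` and `SetAssociativeCache` are assumptions introduced here for the example.

```python
LINE_SIZE = 64   # bytes per line (assumed value)
NUM_SETS = 128   # lines per way (assumed value)


def split_address(addr):
    """Split an address into (tag, index, offset) for a set associative cache."""
    offset = addr % LINE_SIZE
    index = (addr // LINE_SIZE) % NUM_SETS
    tag = addr // (LINE_SIZE * NUM_SETS)
    return tag, index, offset


class SetAssociativeCache:
    """N-way set associative tag store; num_ways == 1 gives direct mapping."""

    def __init__(self, num_ways):
        # One list of tags per way; every way has the same number of lines.
        self.ways = [[None] * NUM_SETS for _ in range(num_ways)]

    def fill(self, addr, way):
        """Register the line containing addr in the given way."""
        tag, index, _ = split_address(addr)
        self.ways[way][index] = tag

    def lookup(self, addr):
        """Return the way that holds addr, or None on a cache miss."""
        tag, index, _ = split_address(addr)
        for way, lines in enumerate(self.ways):
            if lines[index] == tag:
                return way
        return None
```

In this sketch an address can be placed only on the line selected by its index, but in any of the N ways, which is the sense in which data is placed on determined lines in a way.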
The primary cache memory 13 stores a part of the data stored in the main memory device 16, and is a memory using a 4-way set associative method as shown in FIG. 30. As shown in FIG. 30, the primary cache memory 13 is constituted by a primary tag RAM 13a for holding addresses A, B, C, and D or the like of data a, b, c, and d or the like, and a primary data RAM 13b for holding the data a, b, c, and d or the like. The primary tag RAM 13a and the primary data RAM 13b are divided into a plurality of ways to be managed. The way of the primary tag RAM 13a and the way of the primary data RAM 13b correspond to each other in a one-to-one relationship. For example, the address A held in a unit (to be referred to as an entry) constituting way 0 in the primary tag RAM 13a and the data a held in the entry of way 0 in the primary data RAM 13b correspond to each other in a one-to-one relationship.
The secondary cache memory 14 is a memory for storing part of data held in the main memory device 16. This secondary cache memory 14 uses a direct mapping method. As shown in FIG. 30, the secondary cache memory 14 is constituted by a secondary tag RAM 14a for holding tag information consisting of an address ADR, an INCL bit, and a way WAY, and a secondary data RAM 14b for holding the real data. The address ADR represents the address of data held in the secondary data RAM 14b. The INCL bit represents whether corresponding data is held in the primary data RAM 13b or not. The INCL bit is “1” if the data is held in the primary data RAM 13b, and it is “0” if no data is held in the primary data RAM 13b. The way WAY represents the number of a way in the primary cache memory 13 in which the corresponding data is held.
Referring again to FIG. 28, a primary cache access control device 12 controls access from the CPU 11 to the primary cache memory 13, and compares the address of data to be read with an address of the primary tag RAM 13a according to a read request from the CPU 11. When the addresses are equal to each other (this state is called cache hit), the primary cache access control device 12 performs control or the like to read the data corresponding to the address. On the other hand, when the addresses are not equal to each other (this state is called cache miss), the primary cache access control device 12 performs control to access the secondary cache memory 14. A secondary cache access control device 15 controls access from the CPU 11 to the secondary cache memory 14, and compares the address of data to be read with an address of the secondary tag RAM 14a. In case of cache hit, the secondary cache access control device 15 performs control or the like to read the data corresponding to the address. In case of cache miss, the secondary cache access control device 15 performs control or the like to access the main memory device 16.
Operation of a conventional cache memory apparatus will be explained below with reference to FIG. 28, FIG. 30, and the flow chart shown in FIG. 29. In step SA1 shown in FIG. 29, the primary cache access control device 12 checks whether a read request is generated by the CPU 11. If the check result is “No”, this check is repeated. The read request is a request that data should be read from the primary cache memory 13, the secondary cache memory 14 or the main memory device 16.
For example, when a read request for requesting that the data e of the address E shown in FIG. 30 should be read is generated by the CPU 11, the primary cache access control device 12 sets the check result in step SA1 to “Yes”. Then, in step SA2, the primary cache access control device 12 accesses the primary cache memory 13 (r1 in FIG. 30) to compare the address E with an address held in the primary tag RAM 13a.
In step SA3, the primary cache access control device 12 checks whether the address E is present in the primary tag RAM 13a or not, i.e., whether cache hit is established or not. In this case, since the address E is not present in the primary tag RAM 13a, the primary cache access control device 12 determines cache miss (r2 in FIG. 30) and sets the check result in step SA3 to “No”. If the check result in step SA3 is “Yes”, in step SA4, the CPU 11 reads the data corresponding to the address E from the primary data RAM 13b.
In step SA5, the secondary cache access control device 15 accesses the secondary cache memory 14 (r1 in FIG. 30). In step SA6, the secondary cache access control device 15 checks whether an INCL bit of “1” exists in the secondary tag RAM 14a. In this case, since the INCL bit of the address D is “1”, the secondary cache access control device 15 sets the check result in step SA6 to “Yes” and then shifts the process to step SA7. When the INCL bit of “1” does not exist in the secondary tag RAM 14a, the secondary cache access control device 15 considers the check result in step SA6 as “No”.
Since the INCL bit and the way are “1” and “3” respectively, with respect to the address D in the secondary cache memory 14, it is understood that the latest data d of the address D is held in way 3 in the primary data RAM 13b. Data d′ corresponding to the address D in the secondary data RAM 14b is older than the data d, and it is the data that must be updated.
In step SA7, the secondary cache access control device 15 writes back the data of the corresponding address in the primary cache memory 13 to the secondary cache memory 14. More specifically, in this case, the secondary cache access control device 15 writes back the data d (latest data) existing in the primary data RAM 13b to a region corresponding to the address D in the secondary data RAM 14b with reference to the address D existing in way 3 of the primary tag RAM 13a (r3 in FIG. 30). In this manner, the data d′ (old data) corresponding to the address D in the secondary data RAM 14b is updated to the data d (latest data).
In step SA8, the secondary cache access control device 15 checks whether an address E exists in the secondary tag RAM 14a or not, i.e., whether cache hit is established or not. In this case, since the address E exists in the secondary tag RAM 14a, the secondary cache access control device 15 considers the check result in step SA8 as “Yes”. In the next step SA9, the CPU 11 reads data e corresponding to the address E from the secondary data RAM 14b. The secondary cache access control device 15 moves the read data e and the address E of the data e from the secondary cache memory 14 into the primary cache memory 13 (r4 in FIG. 30).
When some data exists in the secondary cache memory 14, and that data does not exist in the primary cache memory 13, the data is moved from the secondary cache memory 14 into the primary cache memory 13 to shorten an access time for the data next time. In this case, the secondary cache access control device 15 updates the address D of way 3 in the primary tag RAM 13a to the address E in the secondary tag RAM 14a. Similarly, the secondary cache access control device 15 updates the data d of way 3 in the primary data RAM 13b to the data e in the secondary data RAM 14b. 
On the other hand, if the check result in step SA8 is “No”, the CPU 11 accesses the main memory device 16 in step SA10. In step SA11, the CPU 11 reads the data corresponding to the address E from the main memory device 16. In this manner, in the conventional cache memory apparatus, when data corresponding to a certain address is to be read, the CPU 11 accesses the primary cache memory 13, then the secondary cache memory 14, and finally the main memory device 16.
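The read flow of steps SA2 to SA11 can be summarized in a simplified sketch. It is only an illustration under stated assumptions: the primary cache is reduced to one entry per way, the way to be replaced is passed in as the hypothetical parameter `victim_way` (the text uses way 3), and the names `SecondaryEntry` and `read` do not appear in the original.

```python
from dataclasses import dataclass


@dataclass
class SecondaryEntry:
    data: str
    incl: int = 0   # INCL bit: 1 if the line is also held in the primary cache
    way: int = -1   # WAY: primary way holding the line when incl == 1


def read(addr, primary, secondary, main_memory, victim_way):
    """Return (data, source) for a read request, following FIG. 29 loosely.

    primary     : list of (address, data) pairs, one per way
    secondary   : dict mapping address -> SecondaryEntry
    main_memory : dict mapping address -> data
    """
    # SA2 to SA4: compare the address with every way of the primary tag RAM.
    for tag, data in primary:
        if tag == addr:
            return data, "primary"

    # SA5 to SA7: if the line in the victim way is marked INCL = 1,
    # write its latest data back to the secondary cache.
    old_tag, old_data = primary[victim_way]
    old = secondary.get(old_tag)
    if old is not None and old.incl == 1:
        old.data = old_data

    # SA8 and SA9: on a secondary hit, read the data and move it
    # into the primary cache in place of the victim line.
    hit = secondary.get(addr)
    if hit is not None:
        if old is not None:
            old.incl, old.way = 0, -1
        primary[victim_way] = (addr, hit.data)
        hit.incl, hit.way = 1, victim_way
        return hit.data, "secondary"

    # SA10 and SA11: otherwise read from the main memory device.
    return main_memory[addr], "main"
```

With the contents of FIG. 30 (addresses A to D in ways 0 to 3, and address D registered in the secondary cache with INCL = 1 and WAY = 3), a read of address E writes data d back over the old data d′ and then replaces way 3 with address E and data e. Note that the write-back step depends on reading the address D correctly from the primary tag RAM.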
As described above, in the conventional cache memory apparatus, as shown in FIG. 30, at a certain timing, the data d (latest data) held in the primary cache memory 13 (primary data RAM 13b) is written back to the secondary cache memory 14 (secondary data RAM 14b), so that the data d′ (old data) is updated to the latest data. In this case, the data d is written back to the region of the address D in the secondary cache memory 14 with reference to the address D of the data d in the primary tag RAM 13a.
Recently, in answer to the demand for reduction in size and high-density storage, integrated circuits each having a high degree of integration have been frequently used as the memory elements in the primary cache memory 13 and the secondary cache memory 14. In such an integrated circuit having a high degree of integration, since the constituent parts of the circuit are minute, it is known that, in addition to a hardware error which is a fixed failure such as disconnection of the circuit itself, a failure called a software error occurs. This software error is a phenomenon in which bits are inverted at random due to a small radiation from a very minute radiation source contained in the package of the integrated circuit that envelops the memory elements.
Consequently, in the case of write back, when the software error occurs in the address D of the primary tag RAM 13a shown in FIG. 30, the address to which the data is to be written back becomes unknown. Therefore, the data d cannot be written back to the region of the address D in the secondary cache memory 14. In this case, a serious failure that causes immediate system down occurs, and the reliability of the apparatus is degraded. In particular, since the degree of integration of circuits is increasing every year, the probability of occurrence of software errors is also increasing. An effective solution to the problem is earnestly desired.
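How an inverted bit in a tag entry can be detected as a parity error may be illustrated as follows. The text does not specify the parity scheme, so this sketch assumes a single even-parity bit stored alongside each tag; the function names are hypothetical.

```python
def parity_bit(value):
    """Even-parity bit: 1 if the value contains an odd number of 1 bits."""
    return bin(value).count("1") % 2


def store(tag):
    """Return the tag together with the parity bit computed when it is written."""
    return tag, parity_bit(tag)


def has_parity_error(tag, stored_parity):
    """True if the tag read back no longer matches its stored parity bit."""
    return parity_bit(tag) != stored_parity
```

A single inverted bit changes the number of 1 bits by one and is therefore detected. Two inverted bits in the same tag cancel out and are not detected by a single parity bit. Parity detects the error but does not indicate which bit is wrong, so the original address cannot be recovered from the tag entry itself.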
It is an object of this invention to provide a cache memory apparatus capable of avoiding system down and capable of improving the reliability of the apparatus, and a computer readable recording medium on which a program for controlling the cache memory is recorded thereon.
The cache memory apparatus according to one aspect of the present invention comprises a primary cache memory having at least one way, each way having at least one entry; an error detection unit which detects an error in an entry of the way; a secondary cache memory which holds data, registration position information and status information of data in said primary cache memory; a replace prohibition unit which, when an error is detected in an entry of a way by said error detection unit, prohibits that particular way from being replaced; a write back unit which, when an error is detected in an entry of a way by said error detection unit, writes back the data held in that particular entry of the way in said primary cache memory to an entry of said secondary cache memory; a release unit which releases the prohibition of replacement of that particular way of said primary cache memory upon completion of the write back operation by said write back unit; and a write unit which, when the entry of said secondary cache memory is accessed, writes the data which is written back in the entry in said primary cache memory.
According to the above invention, when a parity error occurs in the entry of the primary cache memory, the way including the entry in which the error is detected is prohibited by the replace prohibition unit from being replaced, and the data held in the entry is written back to the entry of the secondary cache memory by the write back unit. In this manner, data obtained before the parity error occurs is held in the secondary cache memory. Upon completion of the write back operation, the prohibition of replacement is released by the release unit, and the data which is written back is written in the entry of the primary cache memory by the write unit, so that the status before the parity error occurs is set.
Thus, when a parity error occurs, after the data is written back from the primary cache memory to the secondary cache memory, the data is written from the secondary cache memory into the primary cache memory. Thus, even if a parity error occurs, the data can be normally read from the secondary cache memory. Therefore, according to the present invention of the first aspect, even if a parity error occurs in the primary cache memory, system down is avoided, and the reliability of the apparatus is improved.
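The sequence of replace prohibition, write back, release, and later rewriting can be sketched as below. This is one possible reading of the summary, not the claimed circuit: it assumes the error is in the primary tag, that the data in the entry is still intact, and that the address is recovered from the registration position information (way and entry number) held in the secondary cache. The names `L2Entry`, `recover_from_tag_error` and `refill_on_access` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class L2Entry:
    data: str
    incl: bool = False  # status information: line is also in the primary cache
    way: int = -1       # registration position information: primary way
    index: int = -1     # registration position information: entry in that way


def recover_from_tag_error(l1_data, l2, locked_ways, way, index):
    """Handle an error detected in entry `index` of `way` in the primary tag.

    Returns the address recovered from the secondary cache, or None.
    """
    locked_ways.add(way)  # replace prohibition: the way is not replaced
    try:
        for addr, entry in l2.items():
            # The secondary cache records where each line sits in the primary
            # cache, so the address can be found without the corrupted tag.
            if entry.incl and entry.way == way and entry.index == index:
                entry.data = l1_data[way][index]  # write back
                entry.incl = False
                l1_data[way][index] = None        # primary entry now empty
                return addr
        return None
    finally:
        locked_ways.discard(way)  # release upon completion of the write back


def refill_on_access(l1_tag, l1_data, l2, addr, way, index):
    """When the secondary entry is accessed, write the data back into the primary."""
    l1_tag[way][index] = addr
    l1_data[way][index] = l2[addr].data
    l2[addr].incl, l2[addr].way, l2[addr].index = True, way, index
```

In this sketch the data is never lost: it is held in the secondary cache from the moment of the write back, and the primary entry returns to its state before the error the next time the secondary entry is accessed.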
The cache memory apparatus according to another aspect of the present invention comprises a primary cache memory having at least one way, each way having at least one entry; an error detection unit which detects an error in an entry of the way; a secondary cache memory which holds data, registration position information and status information of data in said primary cache memory; a replace prohibition unit which, when an error is detected in an entry of a way by said error detection unit, prohibits that particular entry of the way from being replaced; a write back unit which, when an error is detected in an entry of a way by said error detection unit, writes back the data held in that particular entry of the way in said primary cache memory to an entry of said secondary cache memory; a release unit which releases the prohibition of replacement of that particular entry of the way in said primary cache memory upon completion of the write back operation by said write back unit; and a write unit which, when the entry of said secondary cache memory is accessed, writes the data which is written back in the entry in said primary cache memory.
According to the above invention, when a parity error occurs in the entry of the primary cache memory, the entry in which the error is detected is prohibited by the replace prohibition unit from being replaced, and the data held in the entry is written back to the entry of the secondary cache memory by the write back unit. In this manner, data obtained before the parity error occurs is held in the secondary cache memory. Upon completion of the write back operation, the prohibition of replacement is released by the release unit, and the data which is written back is written in the entry of the primary cache memory by the write unit, so that the status before the parity error occurs is set.
Thus, when a parity error occurs, after the data is written back from the primary cache memory to the secondary cache memory, the data is written from the secondary cache memory into the primary cache memory. Therefore, even if a parity error occurs in the primary cache memory, system down is avoided, and the reliability of the apparatus is improved. Further, since an object to be prohibited from being replaced is narrowed to the entry, another entry which can be used in this way is not prohibited from being accessed.
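The benefit of narrowing the prohibition from a way to a single entry can be shown with a small sketch that lists which ways remain available for replacement at a given entry number. The function name and its parameters are assumptions made for this illustration.

```python
def replaceable_ways(num_ways, index, locked_ways=(), locked_entries=()):
    """Return the ways whose entry `index` may currently be replaced.

    locked_ways    : ways prohibited as a whole (way-level prohibition)
    locked_entries : (way, index) pairs prohibited individually
    """
    return [
        way
        for way in range(num_ways)
        if way not in locked_ways and (way, index) not in locked_entries
    ]
```

For a 4-way cache with an error in entry 0 of way 3, way-level prohibition removes way 3 from the candidates for every entry number, whereas entry-level prohibition removes it only for entry 0 and leaves the other entries of way 3 usable.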
Further, a write back operation is performed at the moment an error is detected by the error detection unit. Therefore, the period of time extending from when a parity error occurs to when the status before the parity error is restored in the primary cache memory can be shortened.
Further, a write back operation is performed at any timing after the error is detected by the error detection unit. Therefore, when the parity error occurs, the parity error does not adversely affect access to another entry pending.
The cache memory apparatus according to still another aspect of the present invention comprises a primary cache memory having a plurality of entries; an auxiliary memory having a plurality of entries whose bit fields are equal to those of entries in said primary cache memory; an error detection unit which detects an error in an entry of said primary cache memory; a secondary cache memory which holds data, registration position information and status information of data in said primary cache memory; an auxiliary memory selection unit which, when an error is detected in an entry of said primary cache memory, makes a corresponding entry in said auxiliary memory valid in place of the entry of said primary cache memory in which the error has occurred; a write back unit which, when an error is detected in an entry of said primary cache memory, writes back the data held in that particular entry of said primary cache memory to an entry of said secondary cache memory; and a write unit which writes the data which is written back in an entry in said auxiliary memory upon completion of the write back operation by said write back unit.
According to the present invention of the above aspect, when an error in the entry of the primary cache memory is detected by the error detection unit, an auxiliary memory is selected by the auxiliary memory selection unit in place of the entry. The data held in the entry is written back to the entry of the secondary cache memory by the write back unit. In this manner, data obtained before the parity error occurs is held in the secondary cache memory. Upon completion of the write back operation, the data which is written back is written in the entry in the auxiliary memory by the write unit.
Thus, when a parity error occurs in the entry of the primary cache memory, the auxiliary memory is used as a backup in place of the entry. Therefore, the cache memory apparatus can be operated as if no parity error occurs.
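The use of the auxiliary memory as a backup can be sketched as follows. This is an illustrative model only: the class name `PrimaryWithAuxiliary`, the per-entry selection flag, and the assumption that the address of the failed entry has already been recovered (for example from the registration position information in the secondary cache) are all introduced here.

```python
class PrimaryWithAuxiliary:
    """Primary cache entries with an auxiliary memory of the same shape."""

    def __init__(self, num_entries):
        self.main = [None] * num_entries      # normal entries
        self.aux = [None] * num_entries       # auxiliary entries, same bit fields
        self.use_aux = [False] * num_entries  # which entries are switched over

    def read(self, index):
        """Read from whichever entry is currently valid for this index."""
        return self.aux[index] if self.use_aux[index] else self.main[index]

    def write(self, index, value):
        """Write to whichever entry is currently valid for this index."""
        if self.use_aux[index]:
            self.aux[index] = value
        else:
            self.main[index] = value

    def handle_error(self, index, addr, secondary):
        """Switch a failed entry to the auxiliary memory and restore its data."""
        self.use_aux[index] = True           # auxiliary memory selection
        secondary[addr] = self.main[index]   # write back to the secondary cache
        self.aux[index] = secondary[addr]    # write into the auxiliary entry
```

After `handle_error`, reads and writes for that entry number go to the auxiliary entry, so callers of `read` and `write` see the same behavior as before the error.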
The present invention according to still another aspect provides a computer readable recording medium on which a program for controlling a cache memory is recorded thereon, which program causes a computer to execute an error detection step of detecting an error in an entry of a way of a primary cache memory, which primary cache memory having at least one way, and which way having at least one entry; a replace prohibition step of, when an error is detected in an entry of a way in the error detection step, prohibiting that particular way from being replaced; a write back step of, when an error is detected in an entry of a way in the error detection step, writing back the data held in that particular entry of the way in said primary cache memory to an entry of a secondary cache memory, which secondary cache memory holds data, registration position information and status information of data in said primary cache memory; a release step of releasing the prohibition of replacement of that particular way of said primary cache memory upon completion of the write back operation in the write back step; and a write step of, when the entry of said secondary cache memory is accessed, writing the data which is written back in the entry in said primary cache memory.
According to the above invention, when a parity error occurs in the entry of the primary cache memory, the way including the entry in which the error is detected is prohibited by the replace prohibition step from being replaced, and the data held in the entry is written back to the entry of the secondary cache memory by the write back step. In this manner, data obtained before the parity error occurs is held in the secondary cache memory. Upon completion of the write back operation, the prohibition of replacement is released in the release step, and the data which is written back is written in the entry of the primary cache memory in the write step, so that a status before the parity error occurs is set.
Thus, when a parity error occurs, data is written back from the primary cache memory to the secondary cache memory and then written from the secondary cache memory in the primary cache memory. For this reason, even if the parity error occurs, the data can be normally read from the secondary cache memory. Therefore, even if a parity error occurs in the primary cache memory, system down is avoided, and the reliability of the apparatus is improved.
The present invention according to still another aspect provides a computer readable recording medium on which a program for controlling a cache memory is recorded thereon, which program causes a computer to execute an error detection step of detecting an error in an entry of a way of a primary cache memory, which primary cache memory having at least one way, and which way having at least one entry; a replace prohibition step of, when an error is detected in an entry of a way in the error detection step, prohibiting that particular entry from being replaced; a write back step of, when an error is detected in an entry of a way in the error detection step, writing back the data held in that particular entry of the way in said primary cache memory to an entry of a secondary cache memory, which secondary cache memory holds data, registration position information and status information of data in said primary cache memory; a release step of releasing the prohibition of replacement of that particular entry of said primary cache memory upon completion of the write back operation in the write back step; and a write step of, when the entry of said secondary cache memory is accessed, writing the data which is written back in the entry in said primary cache memory.
According to the above invention, when a parity error occurs in the entry of the primary cache memory, the entry in which the error is detected is prohibited by the replace prohibition step from being replaced, and the data held in the entry is written back to the entry of the secondary cache memory by the write back step. In this manner, data obtained before the parity error occurs is held in the secondary cache memory. Upon completion of the write back operation, the prohibition of replacement is released by the release step, and the data which is written back is written in the entry of the primary cache memory by the write step, so that a status before the parity error occurs is set.
Thus, when a parity error occurs, data is written back from the primary cache memory to the secondary cache memory and then written from the secondary cache memory into the primary cache memory. For this reason, even if the parity error occurs in the primary cache memory, system down is avoided, and the reliability of the apparatus is improved. Further, since the object to be prohibited from being replaced is narrowed to the entry, another entry which can be used in this way is not prohibited from being accessed.
The present invention according to still another aspect provides a computer readable recording medium on which a program for controlling a cache memory is recorded thereon, which program causes a computer to execute an error detection step of detecting an error in an entry of a primary cache memory, which primary cache memory having a plurality of entries; an auxiliary memory selection step of, when an error is detected in the error detection step, making an auxiliary memory valid in place of the entry of said primary cache memory in which the error has occurred, which auxiliary memory having a plurality of entries whose bit fields are equal to those of entries in said primary cache memory; a write back step of, when an error is detected in the error detection step, writing back the data held in that particular entry in said primary cache memory to an entry of a secondary cache memory, which secondary cache memory holds data, registration position information and status information of data in said primary cache memory; and a write step of writing the data which is written back in an entry in said auxiliary memory upon completion of the write back operation in the write back step.
According to the above invention, when an error in the entry of the primary cache memory is detected in the error detection step, an auxiliary memory is selected in the auxiliary memory selection step in place of the entry. The data held in the entry is written back to the entry of the secondary cache memory in the write back step. In this manner, data obtained before the parity error occurs is held in the secondary cache memory. Upon completion of the write back operation, the data which is written back is written in the entry in the auxiliary memory in the write step.
Thus, when a parity error occurs in the entry of the primary cache memory, the auxiliary memory is used as a backup in place of the entry. Therefore, the cache memory apparatus can be operated as if no parity error occurs.
The cache memory apparatus according to still another aspect of the present invention comprises a primary cache memory having at least one way; an error detection unit for detecting errors in entries constituting the way of the primary cache memory; a secondary cache memory for storing registration position information and status information of the data in the primary cache memory; an access prohibition unit for prohibiting an access to the primary cache memory when an error is detected by the error detection unit; a write-back unit for accessing every entry of the secondary cache memory in which the data corresponding to an entry where an error occurs may be present when the error is detected and writing back the data stored in a concerned entry of the primary cache memory to an entry of the secondary cache memory in accordance with the registration position information and status information; a restoration unit for restoring an entry in which an error is detected to a state free from error after the write-back is completed when the status information for the entry in which the above error is detected is an error and invalid; and a release unit for releasing prohibition of an access to the primary cache memory after the write-back is completed.
According to the above invention, when an error occurs in an entry of the primary cache memory, an access to the primary cache memory is prohibited. Therefore, an access does not occur after the error occurs. In this case, though the data in an entry in which the error occurs is lost, the registration position information and status information of the data in the primary cache memory and data are stored in the secondary cache memory. Then, every entry of the secondary cache memory in which the data corresponding to the entry in which an error occurs may be present is accessed and thereafter, data is written back from the primary cache memory to the secondary cache memory.
Then, when the status information for the entry in which an error is detected is an error and invalid, the entry in which an error is detected after write-back is completed is restored to a state free from error. Then, after the write-back is completed, prohibition of an access to the primary cache memory is released by the release unit.
Thus, according to the above invention, when an error occurs in an entry of the primary cache memory, an access to the primary cache memory is prohibited and the entry is restored to a state free from error by the restoration unit after writing back. Therefore, it is possible to avoid a trouble of detecting another error while eliminating the above error and moreover, avoid system down and improve the reliability of the apparatus.
The cache memory apparatus according to still another aspect of the present invention comprises a primary cache memory having at least one way; a multi-hit error detection unit for detecting multi-hit errors in entries constituting the way of the primary cache memory; a secondary cache memory for storing registration position information and status information of the data in the primary cache memory; an access prohibition unit for prohibiting an access to the primary cache memory when a multi-hit error is detected by the multi-hit error detection unit; a write-back unit for accessing every entry of the secondary cache memory in which the data corresponding to an entry where a multi-hit error occurs may be present when the multi-hit error is detected and writing back the data stored in a concerned entry of the primary cache memory to an entry of the secondary cache memory in accordance with the registration position information and status information; a restoration unit for restoring an entry in which a multi-hit error is detected to a state free from multi-hit error after the write-back is completed when the status information for the entry in which the multi-hit error is detected is a multi-hit error and invalid; and a release unit for releasing prohibition of an access to the primary cache memory after the write-back is completed.
Thus, according to the above invention, when a multi-hit error occurs in an entry of the primary cache memory, an access to the primary cache memory is prohibited and write-back is performed so as to restore the entry to a state free from multi-hit error by the restoration unit. Therefore, it is possible to avoid a trouble of detecting another error while eliminating the above multi-hit errors, avoid system down, and improve the reliability of the apparatus.
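The text does not define a multi-hit error at this point. Assuming the usual meaning, namely that the same address matches in more than one way at the same entry number, detection can be sketched as follows. The function names and the layout of `tag_ram` (one list of tags per way) are assumptions for this illustration.

```python
def matching_ways(tag_ram, index, tag):
    """Return every way whose entry `index` holds `tag`."""
    return [way for way, entries in enumerate(tag_ram) if entries[index] == tag]


def classify_lookup(tag_ram, index, tag):
    """Classify a tag comparison as a miss, a hit, or a multi-hit error."""
    hits = matching_ways(tag_ram, index, tag)
    if not hits:
        return "miss"
    if len(hits) == 1:
        return "hit"
    # A correctly operating cache registers an address in at most one way,
    # so more than one match indicates a corrupted tag entry.
    return "multi-hit error"
```

A multi-hit can arise, for example, when an inverted bit turns one tag into a copy of another tag registered at the same entry number in a different way.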