For example, data handled in a system that includes a processor such as a central processing unit (CPU) and a main memory such as a dual inline memory module (DIMM) is generally protected using a code such as an error correcting code (ECC). In a case where a 1-bit error occurs in the data, the error can be corrected using the code. However, it is difficult to correct an error of 2 bits or more using the code, and an error of 2 bits or more is left as an uncorrectable error (hereinafter referred to as a UE).
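The difference between a correctable 1-bit error and an uncorrectable error of 2 bits can be illustrated with a minimal sketch. It assumes an extended Hamming (8,4) code, chosen here only for brevity; the text does not state which ECC the system uses, and the function names `encode` and `decode` are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits).

    Index 0 holds the overall parity bit; indices 1..7 are the usual
    Hamming positions, with parity bits at positions 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1 bits even
    return code


def decode(code):
    """Return (status, nibble); nibble is None when the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the positions of all set bits
    overall = sum(code) % 2  # 0 when overall parity still holds

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flipped bits: treat as a single-bit error.
        # The syndrome gives its position (0 means the overall parity bit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Parity holds but the syndrome is nonzero: a 2-bit error,
        # which this code can detect but not correct (a UE).
        return "uncorrectable", None

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble
```

Flipping any one bit of `encode(n)` yields `("corrected", n)` from `decode`, whereas flipping any two bits yields `("uncorrectable", None)`, which corresponds to the UE described above.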
The memory such as the DIMM is provided with an ECC area for storing an ECC in addition to a data area for storing data, and a memory controller can directly write information such as the ECC in the ECC area. Then, in a case where a UE occurs in write data from the processor to the memory, the processor writes a special code in the ECC area through the memory controller, so that the error which has occurred in the write data can be clearly distinguished from other errors as a Marked-UE.
For example, a special syndrome (0x7f) indicating a 3-bit error is written in the ECC area as a special code indicating the Marked-UE, so that it is distinguished from a normal ECC. Further, in the data area corresponding to the ECC area, information (an error identification (ID) and error occurrence cause information) which can be used to specify the place where the Marked-UE occurred is also stored.
Accordingly, when data is read from the memory, if the special code is found in the ECC area corresponding to the read data, the processor can identify the data as data having the UE. Therefore, in a case where data that the processor does not need has the UE, the system can continue operating as it is. On the other hand, in a case where the data having the UE is read, the data is identified as data having the UE, so that the processor can take measures against the UE.
In addition, in a case where a data area having the Marked-UE is overwritten in the memory with data having no Marked-UE, the data area no longer has the Marked-UE, so that the system continues operating.
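The Marked-UE handling described above (marking on write, detection on read, and clearing by overwrite) can be sketched as a small model. This is an illustrative sketch only: the class `DimmModel`, its methods, and the placeholder `toy_ecc` are hypothetical names, and `toy_ecc` is not a real ECC. It is deliberately limited to 6 bits so that it can never collide with the special code 0x7f taken from the text.

```python
MARKED_UE = 0x7F  # special code indicating the Marked-UE (value from the text)


def toy_ecc(data):
    """Placeholder check code (NOT a real ECC): 6-bit XOR fold of the data.

    The result is always in 0x00..0x3F, so it never equals MARKED_UE.
    """
    ecc = 0
    while data:
        ecc ^= data & 0x3F
        data >>= 6
    return ecc


class DimmModel:
    """Toy memory with a data area and a directly writable ECC area."""

    def __init__(self):
        self.data_area = {}
        self.ecc_area = {}

    def write(self, addr, data):
        # Normal write: store the data together with its check code.
        self.data_area[addr] = data
        self.ecc_area[addr] = toy_ecc(data)

    def write_marked_ue(self, addr, error_id):
        # A UE occurred upstream: store the special code in the ECC area
        # and the error ID in the corresponding data area.
        self.data_area[addr] = error_id
        self.ecc_area[addr] = MARKED_UE

    def read(self, addr):
        # The reader recognizes the Marked-UE from the ECC area alone.
        if self.ecc_area[addr] == MARKED_UE:
            return ("marked_ue", self.data_area[addr])
        return ("ok", self.data_area[addr])
```

In this model, `write_marked_ue(0x100, 42)` followed by `read(0x100)` reports a Marked-UE with error ID 42, and a later `write(0x100, 1234)` returns the same address to the normal state.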
In recent years, a hybrid memory cube (HMC), in which dynamic random access memory (DRAM) chips having a three-dimensional structure are connected using through-silicon vias (TSVs), has been proposed as a memory for the purpose of increasing processing speed. As described above, the processor (memory controller) can directly write any information other than the ECC, such as the Marked-UE, in the ECC area of a memory such as the DIMM. In contrast, the ECC is automatically written in the ECC area of a memory such as the HMC, so that the processor (memory controller) is not allowed to directly write any information other than the ECC, such as the Marked-UE.
As described above, in a case where information on an error (for example, the UE) that occurred on the upstream side (the processor side) at the time of writing data in the memory cannot be stored, the processor is not able to identify whether the UE has occurred in the data in the memory. Therefore, when an error occurs, it is conceivable that the entire page having the error is cut off from the system by an operating system (OS).
However, in a case where the entire page is cut off from the system, there is a problem in that, when the error occurs during a high-criticality process, the high-criticality process inevitably has to be stopped until the cutting-off process ends.
Further, since the area having the error is cut off only in units of pages, the range that must be cut off widens when a large page is employed. Therefore, there is a problem in that the range of the error-free area cut off together with the area having the error widens, and the use efficiency of the data area of the memory drops.
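The loss of use efficiency can be made concrete with a short calculation. The sketch below is illustrative only: the function name `collateral_loss`, the 64-byte faulty region, and the page sizes are assumptions and do not come from the text.

```python
def collateral_loss(page_size, faulty_bytes):
    """Return (error-free bytes cut off, fraction of the page that was error-free)."""
    lost_good = page_size - faulty_bytes
    return lost_good, lost_good / page_size


# Assumed example: one 64-byte faulty region, cut off in units of one page.
for name, size in [("4 KiB", 4 * 1024), ("2 MiB", 2 * 1024**2), ("1 GiB", 1024**3)]:
    lost, fraction = collateral_loss(size, 64)
    print(f"{name} page: {lost} error-free bytes cut off ({fraction:.6%} of the page)")
```

Under these assumptions, the faulty region stays at 64 bytes while the error-free area cut off together with it grows in proportion to the page size.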