1. Field of the Invention
The present invention relates to a data processing apparatus and method for handling corrupted data values.
2. Description of the Prior Art
In a known data processing apparatus, generally 1, as illustrated in FIG. 1A, there is provided a processor core 10 arranged to process instructions received from a memory 20 via a bus 30. Data required by the processor core 10 for processing those instructions may also be retrieved from the memory 20 via the bus 30. A peripheral device 40 may also be coupled to the bus 30.
The processor core 10 is illustrated in more detail in FIG. 2A, with more detail of the interfaces being illustrated in FIG. 2B. Typically, the processor core 10 comprises a processor 12, a cache 14 and a bus interface unit (BIU) 16. The cache 14 is provided for storing data values (which may be data and/or instructions) retrieved from the memory 20 so that they are subsequently readily accessible by the processor 12. The cache 14 will store the data value associated with a memory address until it is overwritten by a data value for a new memory address required by the processor 12. The data value is stored in cache 14 using either physical or virtual memory addresses. Should the data value in the cache 14 have been altered then it is usual to ensure that the altered data value is re-written to the memory 20, either at the time the data is altered or when the data value in the cache 14 is overwritten.
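The two options described above for returning an altered data value to the memory 20 (at the time of alteration, or when the cache entry is overwritten) can be illustrated with a minimal sketch. The class name SimpleCache, the write_through flag and the evict method are hypothetical names chosen for illustration only and are not taken from the apparatus described; the memory is modelled as a plain dictionary.

```python
class SimpleCache:
    """Toy cache in front of a dict-backed memory (illustrative assumption only)."""

    def __init__(self, memory, write_through):
        self.memory = memory              # backing store: address -> value
        self.write_through = write_through
        self.lines = {}                   # address -> (value, dirty flag)

    def write(self, address, value):
        if self.write_through:
            # Option 1: update the memory at the time the data is altered.
            self.memory[address] = value
            self.lines[address] = (value, False)
        else:
            # Option 2: mark the entry dirty and defer the memory update.
            self.lines[address] = (value, True)

    def evict(self, address):
        # Called when the entry is about to be overwritten by a new address.
        value, dirty = self.lines.pop(address)
        if dirty:
            self.memory[address] = value  # write the altered value back


memory = {0x100: 1}
cache = SimpleCache(memory, write_through=False)
cache.write(0x100, 7)
stale = memory[0x100]   # memory still holds the old value at this point
cache.evict(0x100)
fresh = memory[0x100]   # the altered value has now reached the memory
```

In the deferred case the memory 20 temporarily holds an out-of-date value, which is why the altered value must be written back before the cache entry is reused.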
The BIU 16 is used to retrieve and store data values between the cache 14, the memory 20 and the peripheral device 40. For example, should there be a cache miss when accessing the cache 14, the BIU 16 will initiate a read from the memory 20. The memory 20 then outputs the data value at the address specified. The BIU 16 will then pass the data value to the cache 14, where it can be stored and also read by the processor 12. Subsequently, that data value can readily be accessed directly from the cache 14 by the processor 12. Between the processor 12 and the cache 14 are provided an address bus (CACHE_ADD) over which the address associated with a data value is passed, a read data bus (CACHE_Rdata) over which data values read from the cache 14 are passed, a write data bus (CACHE_Wdata) over which data values to be stored in the cache 14 are passed and a command bus (CACHE_CMD) over which instructions are provided to the cache 14. Between the cache 14 and the BIU 16 are provided an address bus (BIU_ADD) over which the address associated with a data value is passed, a read data bus (BIU_Rdata) over which data values read from the BIU 16 are passed, a write data bus (BIU_Wdata) over which data values read from the cache 14 are passed and a command bus (BIU_CMD) over which the operation to be performed by the BIU 16 is provided.
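The cache-miss sequence described above can be sketched as follows. This is a behavioural illustration under simplifying assumptions, not a model of the actual bus signals: the classes Memory, BusInterfaceUnit and Cache, and the reads counter, are hypothetical names introduced here.

```python
class Memory:
    """Stands in for the memory 20; counts how often it is read."""

    def __init__(self, contents):
        self.contents = dict(contents)
        self.reads = 0

    def read(self, address):
        self.reads += 1
        return self.contents[address]


class BusInterfaceUnit:
    """Stands in for the BIU 16; forwards reads to the memory."""

    def __init__(self, memory):
        self.memory = memory

    def read(self, address):
        return self.memory.read(address)


class Cache:
    """Stands in for the cache 14; fills an entry on a miss."""

    def __init__(self, biu):
        self.biu = biu
        self.lines = {}

    def read(self, address):
        if address in self.lines:
            return self.lines[address]    # hit: served directly from the cache
        value = self.biu.read(address)    # miss: the BIU initiates a memory read
        self.lines[address] = value       # store the value for later accesses
        return value


memory = Memory({0x40: 0xAB})
cache = Cache(BusInterfaceUnit(memory))
first = cache.read(0x40)    # miss, fetched through the BIU
second = cache.read(0x40)   # hit, no further memory access
```

Only the first access reaches the memory; the second is satisfied from the cache, which is the behaviour that makes cache accesses the fast path.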
The memory 20 and the cache 14 typically store each element or bit of a data value in a memory cell. An example configuration of a so-called “static” memory cell, which is typically employed with the cache 14, is illustrated in FIG. 1B. It will be appreciated that the arrangement shown in FIG. 1B is generic and that the detailed implementation of such static memory may vary slightly. Each memory cell typically includes two inverters 300, 310 and a tristate gate 320. To write a data value into the cell, a WRITE_ENABLE signal is asserted on the tristate gate 320, which drives the inverse of the value of the WRITE_DATA signal onto the node 330. The weak feedback provided by the inverter 310 forms a positive feedback loop to hold the value on the node 330 constant, thereby storing the value of the WRITE_DATA signal in the cell. The feedback provided by the inverter 310 is weak enough to be overridden by the tristate gate 320 when writing a data value. The stored value can then be read by sensing the value of the READ_DATA signal.
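The write and hold behaviour of such a cell can be approximated by a simple behavioural model. This is a sketch only: it assumes that READ_DATA is the output of the inverter 300 driven from the node 330, which the text does not state explicitly, and the class name StaticCell and its step method are hypothetical.

```python
class StaticCell:
    """Behavioural model of one static memory cell (logic levels 0 and 1)."""

    def __init__(self):
        self.node = 1  # arbitrary power-up state of the node 330

    def step(self, write_enable, write_data):
        if write_enable:
            # The tristate gate 320 overrides the weak feedback and drives
            # the inverse of WRITE_DATA onto the node 330.
            self.node = 1 - write_data
        # Otherwise the feedback loop holds the node at its present value.

    @property
    def read_data(self):
        # Assumption: the inverter 300 inverts the node to give READ_DATA.
        return 1 - self.node


cell = StaticCell()
cell.step(write_enable=1, write_data=1)   # store a 1
held = cell.read_data
cell.step(write_enable=0, write_data=0)   # WRITE_DATA changes, gate is off
still_held = cell.read_data               # the stored value is unchanged
```

The model has no notion of feedback strength, so it cannot show a soft error occurring; it only shows that the node retains its value while WRITE_ENABLE is deasserted.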
A problem exists in that external environmental factors can affect the feedback provided by the inverter 310, which may cause the value stored on the node 330 to change state. Such external factors include the incidence of a gamma ray on, for example, an active part of the inverter 310. Also, electromagnetic radiation in proximity to the memory device can result in a change in state. Similarly, fluctuations in the voltage supplied to the memory device can cause a change in state. In each of these illustrative situations, an external influence can lead to a corruption in the data values stored in the memory. Such corruption in a data value is often referred to as a “soft error”. It will be appreciated that whilst the occurrence of such soft errors can be statistically low, increasing the density of a memory results in an increased likelihood that a soft error will occur.
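The relationship between memory size and the likelihood of a soft error can be illustrated numerically. The sketch below assumes that each bit is upset independently with the same small probability over some interval; the per-bit probability and the two sizes used are hypothetical figures chosen only to show the trend, not measured values.

```python
def prob_at_least_one_upset(bits, p_per_bit):
    """Probability that at least one of `bits` independent cells is upset."""
    return 1.0 - (1.0 - p_per_bit) ** bits


P_PER_BIT = 1e-9                 # hypothetical per-bit upset probability

small_bits = 32 * 1024 * 8       # e.g. a 32 KiB array
large_bits = 64 * 1024 * 1024 * 8  # e.g. a 64 MiB array

p_small = prob_at_least_one_upset(small_bits, P_PER_BIT)
p_large = prob_at_least_one_upset(large_bits, P_PER_BIT)
```

Under these assumptions the larger array is always more likely to contain at least one corrupted bit, since the probability grows with the number of cells.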
To address the problem of soft errors, error detection and correction techniques are known. One such error detection technique is a parity check, whilst error correction coding is an example of an error correction technique. In the parity check technique, the number of bits in a data value set to, for example, ‘1’ is calculated and, if the result is an odd number, then a parity bit appended to the data value is set to, for example, ‘1’; conversely, if the result is an even number, then the parity bit appended to the data value is cleared to, for example, ‘0’. By using this technique it is possible to detect with increased confidence whether or not a data value has been corrupted.
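The parity scheme described above can be sketched as follows. The function names parity_bit and check are hypothetical; the convention follows the text, with the parity bit set to 1 when the number of 1 bits in the data value is odd.

```python
def parity_bit(value):
    """Return 1 if `value` has an odd number of 1 bits, otherwise 0."""
    return bin(value).count("1") % 2


def check(value, stored_parity):
    """Return True if the value still matches the parity bit stored with it."""
    return parity_bit(value) == stored_parity


original = 0b10110010                 # four bits set, so the parity bit is 0
stored_parity = parity_bit(original)

one_flip = original ^ 0b00001000      # a single bit changes state
two_flips = original ^ 0b00001001     # two bits change state

single_detected = not check(one_flip, stored_parity)
double_detected = not check(two_flips, stored_parity)
```

A single corrupted bit changes the parity and is detected, whereas two corrupted bits restore the original parity and pass the check. This is why the technique increases confidence that a data value is uncorrupted without guaranteeing it.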
It will be appreciated that whilst the parity check technique increases the likelihood of detecting a corruption, a predetermined period is required to perform the parity check. Error correction coding typically takes even more time to perform than the parity check. In typical data processing apparatus, memory accesses become a critical path within the processor, ultimately limiting the speed at which the processor can be run. Hence, any increase in the time taken to access a data value, such as to perform a parity check, can result in the speed at which the processor can be run being reduced.
In view of the above, because the cache 14 is typically of a relatively low density when compared with the memory 20, the likelihood of a soft error occurring is also lower. Hence, traditional caches 14 did not employ such a parity check technique. This is because it is undesirable to adversely affect the normal performance of the cache 14 in order to deal with a statistically infrequent event. In contrast, however, the memory 20 is typically of a relatively high density and so the likelihood of a soft error occurring is increased. Hence, it is known to employ a parity check in the memory 20. Implementing the parity check is possible because the speed at which the data values are accessed in the memory 20 is typically relatively slow in comparison to accesses to the cache 14. Accordingly, although the access time to the memory 20 is increased, there is typically little or no overall performance reduction between a memory 20 which employs a parity check and one which does not.
In certain applications, such as in so-called “embedded systems”, the processor core 10 provides data values to the peripheral device 40 and that peripheral device 40 is a so-called safety-critical or fault-intolerant device such as an airbag controller, a brake system controller or the like. The operation of safety-critical devices needs to be well understood and predictable in order to gain the appropriate certification for the use of that device. Because the operation of the peripheral device 40 is dependent upon the data values provided by the processor core 10, when the peripheral device 40 is a safety-critical device, an increased assurance is required that no persistent change in the state of the peripheral device 40 can occur based on corrupted data values. Such changes in the state of the peripheral device 40 could result in, for example, the inappropriate activation of an air bag actuator or a brake servo, or the erroneous transmission of confidential information, such as information relating to a credit card transaction.
With each generation of processor core 10, the density of the cache 14 typically increases. Hence, a problem exists in that the likelihood of a soft error occurring in the cache 14 also increases. If the processor core 10 is to provide data values to a safety-critical device, then assurance is required that no persistent change in the state of the device can occur based on corrupted data values caused by, for example, a soft error. However, as mentioned above, accesses to the cache 14 are a critical path in the performance of the processor core 10. Accordingly, any increase in the time taken to access a data value from the cache 14 will reduce the performance of the processor core 10. Hence, whilst a parity check could be performed in the cache 14, as increasingly occurs in the L1 cache of a desktop processor system, because the likelihood of a soft error occurring is still relatively low in comparison to the likelihood of an error not occurring, it is undesirable to affect the overall performance of the processor core 10 to deal with such infrequently-occurring events.
Accordingly, it is desired to provide an improved technique for handling corrupted data values which can retain the performance of the processor core during normal operation whilst ensuring that corrupted data values do not cause a change in the state of a peripheral device.