There are memory systems whose data contents are safeguarded via an EDC code in such a way that a number of redundant bits are additionally stored under the address of the data word. These bits are called control word bits, or K-bits for short, and arise through the formation of the parity sum over particular parts of the data word, which is standardly called EDC coding (“EDC” stands for Error Detection Code). During the reading out of the memory word, the sub-parities are again formed, and are compared with the likewise read-out allocated K-bits. If all K-bits are equal, it is concluded that the read-out data word is free of errors. In the case of a non-equality, the type of error is inferred from the pattern of the non-agreement, which is called the syndrome pattern.
The K-bit positions that do not agree are called syndromes. Particular syndrome patterns are decoded, and in this way the falsified bit position in the data word is determined and, if necessary, corrected by inversion.
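The sequence described above (formation of the K-bits, comparison, decoding of the syndrome, correction by inversion) can be illustrated by the following minimal Python sketch. The code table used here is an assumption made purely for illustration and is not the table of FIG. 1: each of the 32 data bits is assigned a distinct pattern of three K-bits. The names COLUMNS, form_k_bits and check_and_correct are likewise hypothetical.

```python
from itertools import combinations

DATA_BITS = 32
CHECK_BITS = 8

# Hypothetical code table (NOT the table of FIG. 1): COLUMNS[n] is a bit mask
# of the three K-bits in whose parity sum data bit n is included.
# All 32 masks are distinct and have odd weight.
COLUMNS = [sum(1 << k for k in combo)
           for combo in combinations(range(CHECK_BITS), 3)][:DATA_BITS]


def form_k_bits(data: int) -> int:
    """Form the 8 K-bits as parity sums over the data bits each K-bit covers."""
    k = 0
    for n in range(DATA_BITS):
        if (data >> n) & 1:
            k ^= COLUMNS[n]  # toggles every K-bit that includes data bit n
    return k


def check_and_correct(data: int, k_stored: int):
    """Compare re-formed and stored K-bits; return (status, data)."""
    syndrome = form_k_bits(data) ^ k_stored  # 1 = K-bit positions that disagree
    if syndrome == 0:
        return "ok", data
    if syndrome in COLUMNS:
        # The syndrome pattern identifies one data bit: correct it by inversion.
        return "corrected data bit", data ^ (1 << COLUMNS.index(syndrome))
    if bin(syndrome).count("1") == 1:
        # Exactly one K-bit disagrees: the K-bit itself was falsified.
        return "corrected K-bit", data
    return "multi-bit error", data


word = 0xDEADBEEF
k = form_k_bits(word)
print(check_and_correct(word, k))
print(check_and_correct(word ^ (1 << 17), k))
```

In this sketch the syndrome pattern of a falsified data bit is simply the table entry of that bit, which is why a table lookup suffices for the decoding.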
The formation of the K-bits (EDC coding), the comparison, the decoding of the syndromes, as well as the correction and, if necessary, the alerting of a higher-order control unit, are currently standardly carried out with the aid of special controller modules, also referred to as EDC controllers in the following.
FIG. 1 shows, on the basis of what is called an EDC code table, over which bit positions of a data word the K-bits are formed in an EDC controller.
In FIG. 1, the character “X” means that the allocated data bit N (00 ≤ N ≤ 31) is included in the parity formation for the check bit C (C0 ≤ C ≤ C7). The character “0” next to the lines of the lower half of the memory word means that the associated C-bit is equal to 1 when the number of “1s” included in the parity formation in the entire useful bit part is odd. The character “E” next to the lines of the lower half of the memory word means that the associated C-bit is equal to 0 when the number of “1s” included in the parity formation in the entire bit part is odd. The two last-named statements thus relate to both halves of the memory word.
FIG. 1 presumes data words that comprise 32 data bits. Eight control bits C0, C1, C2 . . . C7 are allocated to these data bits, which control bits are respectively formed by parity formation over particular bit positions of a data word. The entire memory word, i.e. the useful word (address or data) plus the control word, thus comprises 40 bits. These are organized in DRAM memory modules with a cell width of four bits.
On the basis of the control bits formed according to the EDC code tables, one-bit errors can be recognized with certainty, and lead to odd-numbered syndrome patterns. In addition, a correction of one-bit errors can be carried out, since an unambiguous syndrome pattern is fixedly allocated to each error bit position within a useful word. This syndrome pattern can be decoded and thus used for the correction of the errored bit.
Finally, multi-bit errors can be recognized. Double bit errors always lead for example to an even-numbered syndrome pattern not equal to 0, and are thus recognized with certainty as multi-bit errors. The additional even-numbered multi-bit errors likewise always lead to even-numbered syndrome patterns, whereby the zero syndrome arises with a probability of 1/128, since at this ECC width (number of K-bits) there are a total of 128 even-numbered syndrome patterns. Thus, these errors lead immediately to a multi-bit error alert, with a probability of 99.2%.
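The statements on one-bit and double bit errors can be checked exhaustively for a small example. The following Python sketch assumes a hypothetical code table (not the table of FIG. 1) in which every data bit is included in exactly three K-bits; the names data_cols, k_cols and weight are illustrative only.

```python
from itertools import combinations

CHECK_BITS = 8

# Hypothetical table: syndrome pattern caused by a single falsified bit.
# 32 data bits with three K-bits each, plus the 8 K-bits themselves.
data_cols = [sum(1 << k for k in combo)
             for combo in combinations(range(CHECK_BITS), 3)][:32]
k_cols = [1 << k for k in range(CHECK_BITS)]
all_cols = data_cols + k_cols  # one entry per bit of the 40-bit memory word


def weight(pattern: int) -> int:
    """Number of K-bits that disagree in a syndrome pattern."""
    return bin(pattern).count("1")


# One-bit errors: 40 distinct patterns, each with an odd number of K-bits.
singles_unique_and_odd = (len(set(all_cols)) == 40
                          and all(weight(s) % 2 == 1 for s in all_cols))

# Double bit errors: the syndrome is the XOR of two one-bit patterns.
doubles = [a ^ b for a, b in combinations(all_cols, 2)]
doubles_even_nonzero = all(s != 0 and weight(s) % 2 == 0 for s in doubles)

print(singles_unique_and_odd, doubles_even_nonzero, len(doubles))
```

The check relies only on the patterns of one-bit errors being distinct and odd-numbered; the XOR of two such patterns is then necessarily even-numbered and not equal to 0.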
Odd-numbered multi-bit errors lead to odd-numbered syndrome patterns, whereby the syndrome patterns of 1-bit errors can also arise. Thus, these errors are recognized immediately as multi-bit errors, with a probability of 68.75%. This number arose as follows:
Given an 8-bit ECC width, there are a total of 128 odd-numbered syndrome patterns. Of these 128 patterns, 40 are reserved for 1-bit errors. There thus remain 128 − 40 = 88 patterns for odd-numbered multi-bit errors. The probability that one of these patterns is hit in an arbitrary odd-numbered multi-bit error is thus 88/128 = 68.75%.
In sum, it results that arbitrary multi-bit errors are alerted immediately as multi-bit errors with a probability of 215/256=84%. The even-numbered multi-bit errors, which cause the zero syndrome in 1 of 128 cases, have hereby also been taken into account. This number in turn results as follows:
Given an 8-bit ECC width, there are a total of 256 syndrome patterns. Of these 256 patterns, 40 are reserved for 1-bit errors, and one pattern is the null syndrome pattern. There thus remain 256 − 40 − 1 = 215 patterns for multi-bit errors.
The probability that one of these patterns is hit given an arbitrary multi-bit error is thus 215/256=84%.
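The three probabilities named above can be recomputed with the following short Python sketch. It rests on the same assumption the text makes implicitly, namely that a multi-bit error hits every syndrome pattern of the relevant class with equal probability; the variable names are illustrative only.

```python
from fractions import Fraction

K_BITS = 8       # ECC width (number of K-bits)
WORD_BITS = 40   # 32 data bits + 8 K-bits, one syndrome pattern reserved per bit

total = 2 ** K_BITS        # 256 syndrome patterns in all
even = odd = total // 2    # 128 even-numbered and 128 odd-numbered patterns

# Even-numbered multi-bit errors: only the zero syndrome goes unnoticed.
p_even = Fraction(even - 1, even)

# Odd-numbered multi-bit errors: the 40 one-bit patterns are mistaken for 1-bit errors.
p_odd = Fraction(odd - WORD_BITS, odd)

# Arbitrary multi-bit errors: 40 one-bit patterns and the zero syndrome are excluded.
p_any = Fraction(total - WORD_BITS - 1, total)

for name, p in (("even", p_even), ("odd", p_odd), ("any", p_any)):
    print(f"{name}: {p} = {float(p):.2%}")
```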
FIG. 2 shows, on the basis of what is called an EDC code table, over which bit positions of an address word the K-bits are formed in an EDC controller. For the explanation of the representation in FIG. 2, the same holds as for FIG. 1.
If an error is present in the controlling of the memory units (e.g. memory modules) that are controlled in common, i.e. in parallel, in the context of a memory access, syndrome patterns can result that mimic a correctable one-bit error, and thus are not recognized as errors of the controlling. Other errors are also conceivable, e.g. failure of the write pulse, that cannot be recognized at all via the EDC controller.
The problem named can be reduced considerably if the memory units (i.e. memory modules) that are activated in common during the reading are supplied by several control signals of the same type that originate from self-contained control units. In this case, only the failure of one of these signals need be reckoned with, whereby e.g. data and control bits of different memory words can be mixed with one another during the reading out. However, despite this measure, designated measure A) for short in the following, it is still possible, though with low probability, that one-bit errors or even freedom from error are mimicked.
The last-named problem can, however, be prevented by suitable partitioning of the data and control bits to the memory units in connection with the associated choice of the EDC code (see FIG. 1). From FIG. 1, it can be seen that a segment of the control word that is not stored together with that segment of the data word in which there is a one-bit falsification (e.g. the data word segment DWT1 with the control word segment KWT1) can respectively contribute only an even number of K-bits to the syndrome pattern. On the other hand, a segment of the control word that is stored together with that segment of the data word (e.g. the segment DWT1 with KWT2) can contribute only an odd number of K-bits to the syndrome pattern. However, the latter case cannot take place given errors that arise from the false controlling (addressing) of a memory unit. Thus, given a false controlling, only even-numbered syndrome patterns can arise.
The suitable partitioning of the data and control bits to the memory units in connection with the associated selection of the EDC code is designated as measure B) for short in the following.
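The effect of measure B) can be illustrated by the following Python sketch. Both the partitioning and the code table are assumptions made for illustration and are not taken from FIG. 1: memory unit 1 is assumed to store data bits 0 to 15 (DWT1) together with K-bits 4 to 7 (KWT2), and memory unit 2 to store data bits 16 to 31 (DWT2) together with K-bits 0 to 3 (KWT1). Each data bit is included in one K-bit of the segment stored with it and in two K-bits of the other segment. The sketch then simulates a false addressing of memory unit 1 and examines the resulting syndrome patterns.

```python
import random
from itertools import combinations

# Hypothetical partitioning (an assumption, not FIG. 1).
DWT1, DWT2 = 0x0000FFFF, 0xFFFF0000   # data word segments
KWT1, KWT2 = 0x0F, 0xF0               # control word segments


def columns(co_stored_shift: int, other_shift: int) -> list:
    """16 table entries: one K-bit in the co-stored segment, two in the other."""
    cols = [(1 << (co_stored_shift + i))
            | (1 << (other_shift + a)) | (1 << (other_shift + b))
            for i in range(4) for a, b in combinations(range(4), 2)]
    return cols[:16]


# Data bits 0..15 are stored with KWT2, data bits 16..31 with KWT1.
COLUMNS = columns(4, 0) + columns(0, 4)


def weight(pattern: int) -> int:
    return bin(pattern).count("1")


def form_k_bits(data: int) -> int:
    k = 0
    for n in range(32):
        if (data >> n) & 1:
            k ^= COLUMNS[n]
    return k


# Measure B): every data bit contributes an even number of K-bits to the
# control word segment that is NOT stored together with it.
measure_b_holds = all(
    weight(COLUMNS[n] & (KWT1 if n < 16 else KWT2)) % 2 == 0 for n in range(32))


def read_with_misaddressed_unit1(word_a, word_b):
    """Unit 2 delivers its part of word_a, unit 1 wrongly delivers word_b."""
    (da, ka), (db, kb) = word_a, word_b
    return (db & DWT1) | (da & DWT2), (kb & KWT2) | (ka & KWT1)


rng = random.Random(0)
syndromes = []
for _ in range(1000):
    da, db = rng.getrandbits(32), rng.getrandbits(32)
    data, k = read_with_misaddressed_unit1((da, form_k_bits(da)),
                                           (db, form_k_bits(db)))
    syndromes.append(form_k_bits(data) ^ k)

all_even = all(weight(s) % 2 == 0 for s in syndromes)
print(measure_b_holds, all_even)
```

Under these assumptions the mixed memory word never produces an odd-numbered syndrome pattern, so that it cannot be taken for a correctable one-bit error.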
The general formation rule for the cited partitioning given more than two memory medium units is explained in more detail in the German patent application P 35 28 902.3-31 (SAG-internal GR 84 P 1995).
Apart from the cited errors, multi-bit errors can also occur within the memory system during the transfer of the memory words between the memory and the memory control unit, which multi-bit errors can be falsely recognized as one-bit errors by the EDC controllers in the memory or, respectively, in the memory control unit.
The underlying aim of the invention is to improve the recognizability of the last-named multi-bit errors.
By means of the inventive partial cross-connection of the doubled line paths between the memory (CMYM) and the memory control unit (CMYC), the recognizability of multi-bit errors is improved considerably.
In general terms the present invention is a memory system having the following elements. A memory stores memory words that respectively have a data word and a control word. The memory has two memory units. A segment of the data word, together with a segment of the control word, is respectively stored in each memory unit. The memory has two error monitoring means that carry out an error monitoring of the memory word on the basis of the control word. A memory control unit controls the memory and likewise has two error monitoring means, for error monitoring between the memory control unit and the memory. A doubled line structure connects the memory control unit and the memory for the doubled transfer of the memory words between the memory control unit and the memory. The doubled line structure between the memory and the memory control unit is partially cross-connected, such that one of the two segments of the data word is cross-connected.
The named error monitoring means carry out the error monitoring such that they produce a control word according to a particular formation rule, using a coding means, from the memory word to be monitored, compare the bits of this control word (K-bits) with the K-bits contained in the memory word, and, given inequality, infer the type of error from the pattern of the equal and unequal K-bits, called the syndrome pattern. The named formation rule is selected such that given a one-bit error, the named comparison yields an odd number of unequal K-bits, whereby an even number of unequal K-bits respectively contribute to the odd number from those segments of the control word that are not stored together with that segment of the data word in which the one-bit falsification is present. This embodiment has the advantage that the recognizability of multi-bit errors is further improved substantially.
The named segment-by-segment partitioning of the data bits and K-bits to the two memory units of the memory system (given a predetermined EDC code) is selected such that an even number of K-bits are involved in an odd syndrome pattern based on a one-bit error, which K-bits are not stored together with that segment of the data word in which the one-bit falsification is present. This embodiment has the effect that the recognizability of multi-bit errors is ensured with almost one hundred percent reliability.