1. Field of the Invention
The invention relates to a memory module containing a plurality of memory devices of the RAM type. As is known, the acronym RAM refers to read/write memories with direct and random access to the memory cells (random access memory). A preferred, but not exclusive, field of application of the invention is memory arrangements comprising dynamic RAM devices (so-called DRAM devices), such as are customary as main memories in computers.
2. Description of the Related Art
Memory devices usually have a data port with a plurality L of parallel data terminals (data pins) in order to input and output in each case a group of L useful bits in parallel form. In the case of the memory devices that are customary at the present time, the number L is preferably an integral power of 2 and is defined by corresponding configuration of the device; customary configurations are the ×4 device (L=4), the ×8 device (L=8) and the ×16 device (L=16). A controller usually serves as the source of the data to be input (write data) and the sink of the data that are output (read data). The controller also supplies the control and address bits in order to control the operation of the memory device and to select those memory cells of the memory device to which the useful bits that are input are to be written and from which the data to be output are to be read.
The transfer of the L-bit groups between the data terminals of the memory device and the controller is effected in a clock-controlled manner via a bundle of L parallel lines. In the case of “single data rate” (SDR) operation, the clock cycle of this transfer is equal to the memory clock cycle, that is to say that precisely L bits are simultaneously written or read at L selected memory cells with each memory clock cycle. In the case of m-fold data rate operation, the data transfer between the memory device and the controller is effected with a clock rate which is twice as fast as the clock rate of the memory accesses (m=2, double data rate DDR) or four times as fast (m=4, DDR2) or eight times as fast (m=8, DDR3). In these cases, during each memory clock cycle, m different L-cell groups (in m different areas of the devices) are addressed in parallel for an access in order to write or read m L-bit words in parallel. By contrast, the external transfer of the words is effected serially with an m-fold memory clock rate, a prefetch register being used for the parallel/serial conversion during reading and for the serial/parallel conversion during writing in order to collect the m L-bit words of each access.
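The relationship between the lane width L, the data rate factor m and the two clock rates can be illustrated by a minimal sketch. The function names and the use of MHz as the unit are assumptions made purely for illustration; the sketch follows the simplified model of the preceding paragraph and does not describe any particular device.

```python
def transfer_parameters(lane_width, m, memory_clock_mhz):
    """Return (bits moved per memory clock cycle, transfer clock rate in MHz).

    lane_width       -- number L of data terminals of the device
    m                -- data rate factor (1 = SDR, 2 = DDR, 4 = DDR2, 8 = DDR3)
    memory_clock_mhz -- clock rate of the memory accesses (illustrative unit)
    """
    bits_per_memory_cycle = m * lane_width      # m words of L bits, accessed in parallel
    transfer_clock_mhz = m * memory_clock_mhz   # external transfer at m-fold clock rate
    return bits_per_memory_cycle, transfer_clock_mhz


def read_access(prefetched_words):
    """Parallel/serial conversion during reading.

    The m words of one access are collected in a prefetch register and then
    emitted one word per transfer clock cycle.
    """
    prefetch_register = list(prefetched_words)  # filled within one memory clock cycle
    for word in prefetch_register:
        yield word
```

For SDR operation (m=1) the sketch reduces to precisely L bits per memory clock cycle, in agreement with the description above.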
In order to realize RAM data memories having a high storage capacity and/or having a high data throughput, it is customary for a plurality of memory devices of identical type which are in each case integrated on a chip, have the same storage capacity and are designed or set for the same × configuration (that is to say also the same number L of data terminals) to be combined to form a module on a circuit board. Modules are generally organized in such a way that in each case K devices of the module are accessed simultaneously in parallel operation in order, during each access, to write in or read out a data word comprising L*K parallel bits (the symbol * here and hereinafter represents the multiplication sign; an oblique stroke / stands for division). Each group of K devices which are in each case accessed simultaneously in parallel operation is also referred to as a “rank”. A memory module may comprise a plurality R of such ranks or just a single rank (R=1).
During operation, the module is connected to a single memory controller, which transmits the data to be written and receives the data read out and additionally transmits control bits for the memory operation. Said control bits comprise command and setting bits for controlling the operating states of the memory devices and selection bits for selecting the memory devices that are respectively to be addressed within the module and for addressing the memory cells within the respectively selected devices. For transferring the L*K-bit data words between memory module and controller, provision is usually made of a data bus having L*K parallel lines which fan out on the circuit board of the module into K so-called “lanes”, each of which comprises L parallel lines and is connected to a respectively assigned memory device in each rank of the module. The number L is therefore also referred to as the lane width.
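The fan-out of the data bus into lanes can be sketched as follows. The assignment of bit positions to lanes (lane k carrying bits k*L to k*L+L-1 of the data word) is an assumption chosen for illustration only; the actual wiring of a module is not specified here.

```python
def split_into_lanes(word, lane_width, num_devices):
    """Split an (L*K)-bit data word into K lanes of L bits each.

    Assumed ordering: lane 0 carries the least significant L bits.
    """
    mask = (1 << lane_width) - 1
    return [(word >> (lane_width * k)) & mask for k in range(num_devices)]


def merge_lanes(lanes, lane_width):
    """Reassemble the data word from the K lanes (inverse of split_into_lanes)."""
    word = 0
    for k, lane in enumerate(lanes):
        word |= lane << (lane_width * k)
    return word
```

With L=8 and K=8, a 64-bit word is distributed over eight lanes, each of which leads to one memory device of the selected rank.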
Bit errors can occur both during the transfer of the data between module and controller and during the storage of the data in the module, with the result that the so-called “integrity” of the data is not always ensured. Through suitable coding of the data words, it is possible to detect such errors with a certain probability and, if desired, also to correct them. Every coding of this type consists, in principle, in adding to the actual “useful bits”, which describe the useful information of a data word, one or a plurality of “check bits”, which are calculated from the useful bits according to a chosen algorithm.
For memory modules in which an improvement of the data integrity is desirable, what has become preferred in the meantime is an error correction code (ECC) in the manner of a Hamming code, in which each code word comprises 72 bits, of which 64 bits form the useful bits and 8 bits form the check bits, that is to say N=64 and P=8 (“64+8” code). This code and the Hamming algorithm that is usually taken as a basis permit not only the detection but also the correction of the occurrence of a single bit error within the code word. If precisely two bit errors occur within a code word, then this circumstance can be detected with certainty, although without the possibility of correcting these errors (by contrast, the occurrence of more than two errors within a code word is not detected with certainty). It has been shown that the probability of the occurrence of more than one bit error per 72-bit code word is negligibly low in the case of present-day memory technology, with the result that the abovementioned 64+8 Hamming code suffices in practice. However, an error correction algorithm can also be devised such that an error arising from the failure of an entire memory device can be corrected in the read-out code word.
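The behavior of a 64+8 code of this kind can be illustrated by a minimal sketch of a textbook extended Hamming code. The bit layout used here (check bits at positions 1, 2, 4, 8, 16, 32 and 64, an overall parity bit at position 0) and all function names are assumptions for illustration; memory controllers in practice may use a different parity check matrix.

```python
PARITY_POS = [1, 2, 4, 8, 16, 32, 64]                         # 7 Hamming check bits
DATA_POS = [p for p in range(1, 72) if p not in PARITY_POS]   # 64 useful-bit positions


def encode(data_bits):
    """Encode 64 useful bits (list of 0/1) into a 72-bit code word."""
    assert len(data_bits) == 64
    word = [0] * 72
    for pos, bit in zip(DATA_POS, data_bits):
        word[pos] = bit
    for p in PARITY_POS:
        # Check bit p makes the parity even over all positions whose index contains bit p.
        word[p] = sum(word[i] for i in range(1, 72) if i & p) % 2
    word[0] = sum(word) % 2   # eighth check bit: overall parity of positions 1..71
    return word


def decode(word):
    """Return (useful bits, status) with status in
    'ok', 'corrected', 'double_error', 'uncorrectable'."""
    word = list(word)
    syndrome = 0
    for i in range(1, 72):
        if word[i]:
            syndrome ^= i          # XOR of the indices of all set bits
    overall = sum(word) % 2        # 0 for a valid code word
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of errors: treated as a single error at position `syndrome`
        # (syndrome 0 means the overall parity bit itself is wrong).
        if syndrome < 72:
            word[syndrome] ^= 1
            status = "corrected"
        else:
            status = "uncorrectable"   # more than two errors, index out of range
    else:
        status = "double_error"    # even parity but nonzero syndrome
    return [word[p] for p in DATA_POS], status
```

In this sketch a single flipped bit is corrected and exactly two flipped bits are reported as a double error, whereas three or more errors may be miscorrected, which corresponds to the limits described above.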
Various schemes are known for the storage of ECC data (that is to say data which are coded with an error correction code) in a memory module, the common feature of said schemes consisting in the fact that each code word is divided into the same number of identically sized blocks as there are memory devices contained in each rank. In the case of the preferred 64+8 code, this may be effected e.g. by dividing the total of 72 bits into eight 9-bit blocks, each of which is stored in one of eight memory devices within a rank. In this case, the memory devices have to be configured as ×9 devices. This otherwise unusual configuration requires special fabrication of the devices. In this case, the addressing is effected as in the case of an ×8 device, except that a group of 9 memory cells is selected per address and 9 data pins are used at the data terminal in order to access the addressed group via 9 data lines. The bits of each ECC code word are divided in such a way that, in each device, in each case eight useful bits and one check bit are stored at an addressed 9-cell group. This has the disadvantage, however, that in the event of failure of an entire device, a check bit always fails as well. Consequently, the abovementioned technique which permits error correction in the event of failure of an entire device is made significantly more difficult or even becomes impossible.
Furthermore, ×9 devices are not advantageous if they are also intended to be used for forming a module for data operation without ECC. A manufacturer of memory modules would like to be able to supply both ECC modules and non-ECC modules, depending on the current demand. In this case, it is important for the manufacturer for stockkeeping reasons that the memory devices stocked by the manufacturer are all of the same type. If this stock consists of ×9 devices and if a non-ECC module is intended to be equipped therewith, then memory space is wasted. This will be illustrated on the basis of the exemplary case in which the non-ECC module is intended to be designed for storing 64-bit words, divided into eight 8-bit blocks for a rank of eight memory devices. In this case, the ninth data line at each ×9 device is left unutilized, and accordingly one cell remains unused in each addressed 9-cell group.
For these and other reasons, in practice an alternative scheme is preferably employed in which useful bits and check bits are stored in separate memory devices. In this case, in each rank, in addition to the plurality KN of memory devices of identical type which serve for storing the useful bits, in each case a number KP of additional devices of the same design and size and also having the same × configuration are provided for storing the check bits. The homogeneity of all the devices is desirable, to be precise for the economic reasons already discussed above and for reasons of compatibility with regard to the address structure. For the same reasons, it is not only desirable but practically essential for the number L to be a power of 2 with an integral exponent ≧2 (that is to say L=4, 8, . . . ), since ×1 and ×2 devices do not correspond to the conventional memory technology, that is to say are not customary commercially and would also be disadvantageous owing to the low data throughput.
In order that when using an error correction code containing N useful bits and P check bits, all of the available storage capacity in a module constructed according to the scheme described above is utilized fully, the following conditions must consequently be met:
(a) L must be a power of 2 with an integral exponent ≧2;
(b) N/L must be a natural number;
(c) P/L must be a natural number;
(d) KN=N/L;
(e) KP=P/L.
In the case of the preferred 64+8 code, that is to say for N=64 and P=8, L can consequently only be equal to 8 or equal to 4. KN=8 and KP=1 thus result when using ×8 devices. KN=16 and KP=2 would result when using ×4 devices. In these cases, the data bus between the module and the controller comprises N+P=72 parallel conductor tracks, 64 conductor tracks being dedicated for transferring the 64 useful bits of each code word between the controller and the useful bit memory devices of the respectively selected rank. The remaining 8 conductor tracks are dedicated for transferring the 8 check bits of the code word between the controller and the check bit memory device(s). The address bits for the selection of the memory cells within the devices of the respectively selected rank are identical for all these devices.
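Conditions (a) to (e) can be checked mechanically. The following minimal sketch enumerates the admissible lane widths for a given code; the function name and the upper bound `max_width` (set to 16, the widest customary configuration mentioned above) are assumptions for illustration.

```python
def valid_configurations(n_useful, n_check, max_width=16):
    """List all (L, KN, KP) that satisfy conditions (a) to (e).

    n_useful  -- number N of useful bits per code word
    n_check   -- number P of check bits per code word
    max_width -- largest lane width considered (illustrative assumption)
    """
    configs = []
    width = 4                                   # (a): smallest power of 2 with exponent >= 2
    while width <= max_width:
        # (b) and (c): N/L and P/L must both be natural numbers
        if n_useful % width == 0 and n_check % width == 0:
            configs.append({
                "L": width,
                "KN": n_useful // width,        # (d)
                "KP": n_check // width,         # (e)
            })
        width *= 2
    return configs
```

For N=64 and P=8 the sketch yields exactly the two cases named above: ×4 devices with KN=16 and KP=2, and ×8 devices with KN=8 and KP=1. The ×16 configuration is excluded because P/L would not be a natural number.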
A module of the type described above may optionally also be used for data storage without an error correction code. In this case, with each clock cycle only 64 useful bits are transferred via the dedicated 64 useful bit conductor tracks of the bus between the controller and the memory devices of the selected rank. The remaining 8 conductor tracks of the bus and also the check bit memory device(s) in each rank then remain unused.
There are cases in which it becomes problematic to meet all the abovementioned conditions (a) to (e). One such case exists for example if the clock rate of the data transfer between controller and module is so high that a differential, that is to say two-core, line has to be used in the bus and in the lanes for each bit stream. This situation can arise particularly in the case of multiple data rate operation.
In order to provide a two-core line for each bit stream in the bus, the number of conductor tracks in the bus could be doubled, but this is often undesirable, inter alia for space reasons. Consequently, the only solution that remains is to reduce the effective bus width to half. That is to say that instead of N+P parallel bits, only (N+P)/2 parallel bits can be transferred on the N+P conductor tracks present. Each (N+P)-bit code word of the error correction code therefore has to be divided into two successive parts. In the case of the preferred 64+8 code, this means that each partial code word contains 36 parallel bits, namely 32 useful bits and 4 check bits.
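The division of a code word into two successive partial code words can be sketched as follows. Which useful bits and which check bits are assigned to the first and to the second part is not specified above; the simple halving used here, and the function name, are assumptions for illustration.

```python
def split_code_word(useful_bits, check_bits):
    """Divide an (N+P)-bit code word into two successive (N+P)/2-bit parts.

    Each part receives N/2 useful bits and P/2 check bits
    (assumed assignment: first halves, then second halves).
    """
    assert len(useful_bits) % 2 == 0 and len(check_bits) % 2 == 0
    half_n = len(useful_bits) // 2
    half_p = len(check_bits) // 2
    first = useful_bits[:half_n] + check_bits[:half_p]
    second = useful_bits[half_n:] + check_bits[half_p:]
    return first, second
```

For the 64+8 code, each part returned by the sketch contains 36 bits, namely 32 useful bits followed by 4 check bits.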
For this case, the abovementioned conditions (a) to (e) could be met only if all the devices are configured as ×4 devices, the number KN of useful bit memory devices per rank is chosen to be equal to 8 and a ninth device of identical type is provided as a check bit device for each rank. However, the higher the data rate and thus the prefetch m, the higher, too, the instantaneous current consumption of a memory device per access becomes, so that it is desirable for the number of memory devices that are to be addressed simultaneously to be kept smaller.
One alternative is still to use ×8 devices, but to reduce the number KN of useful bit memory devices per rank to ½*N/L (that is to say to N/(2*L)). For the case of the preferred 64+8 code, this means that KN=4, that is to say is smaller by half than what is required by condition (d) above. This has the consequence, however, that the additional check bit memory device in the rank has only half as many bits to store as each useful bit memory device. If all the devices are intended to be identical to one another, which is expedient for the reasons mentioned above, then the available total storage capacity of the module is not completely utilized, which is uneconomic. Although this disadvantage could be eliminated by using a memory device having half the capacity (“half-dense memory device”) for check bit storage in each rank, said memory device is often not readily available or else belongs to an older technology generation, the products of which cannot readily be combined with the current generation (e.g. for reasons of the supply voltage). Another conceivable solution would be to omit the additional check bit memory device entirely and to configure all four remaining devices as ×9 devices. However, this would entail the disadvantages described further above in conjunction with the ×9 configuration.
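The loss of storage capacity in this arrangement can be quantified by a minimal sketch. It assumes that all devices of a rank are identical and are addressed identically, so that the utilized fraction of the cells equals the ratio of the bits stored per transfer to the total lane capacity of the rank; the function name is hypothetical.

```python
from fractions import Fraction


def rank_utilization(useful_per_transfer, check_per_transfer, lane_width):
    """Return (KN, KP, utilized fraction of the rank's storage capacity).

    useful_per_transfer -- useful bits transferred in parallel (e.g. N/2 = 32)
    check_per_transfer  -- check bits transferred in parallel (e.g. P/2 = 4)
    lane_width          -- number L of data terminals per device
    """
    kn = useful_per_transfer // lane_width
    kp = -(-check_per_transfer // lane_width)     # ceiling division
    stored = useful_per_transfer + check_per_transfer
    capacity = (kn + kp) * lane_width             # bits the rank could take per transfer
    return kn, kp, Fraction(stored, capacity)
```

For 32 useful bits and 4 check bits per transfer with ×8 devices, the sketch gives KN=4, KP=1 and a utilization of 9/10: the check bit device is only half used. With ×4 devices the same partial code word gives KN=8, KP=1 and full utilization.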
The situation described above is only one example for illustrating the problems which might arise if the number KN of useful bit memory devices per rank were less than the ratio N/L for any reasons. Such a situation might arise not only on account of a desirable two-core constitution of the bus lines, but also e.g. if use were made of an error correction code in the case of which the ratio N/P is not equal to L.