1. Field of the Invention
The invention relates to a multiprocessor computer system, comprising n synchronously controlled parallel-operating computer modules, each of which is localized in its own fault isolation area. Each computer module comprises a processor module; a data channel connected to a data connection of the processor module; a reducing encoder connected to the data channel in order to form a code symbol from a data word received, so that the relevant encoders form, on the basis of a data word comprising k data symbols, a code word consisting of n code symbols, where k+1 ≤ n < 3k, of a code incorporating a simultaneous correction capability in at least two code symbols; a memory module comprising a first data input which is connected to a first data output of the associated reducing encoder, and a second data output; and a data word reconstruction module which is connected, via an interconnection network, to the relevant second data outputs of the memory modules of the various computer modules in order to receive a relevant code symbol of a code word from each computer module and to reconstruct a data word therefrom, the data word reconstruction module comprising a third data output which is connected to said data connection and to said data channel, said data channel also comprising a second data connection for external data input/output.
2. Description of the Prior Art
Such a computer system is disclosed in the previous U.S. Pat. No. 4,512,020 issued Apr. 16, 1985, assigned to the assignee of the present application. For such a multiprocessor computer system a comparatively small total memory capacity suffices for a comparatively high processor capacity, for example in comparison with a total triplication of processor and memory capacity distributed between a corresponding number of fault isolation areas. In accordance with the previous Patent Application, similarly to when using the total triplication, the total circuit of one such fault isolation area may exhibit an arbitrary data error without the operation of the multiprocessor computer system as a whole being impeded. The computer system in accordance with the previous Patent Application has several modes of operation. In one of these modes an arbitrary symbol error can be corrected (provided that it is known which symbol is incorrect) plus one single bit error. In another mode, two arbitrary one-bit errors can be corrected. Several codes which are capable of correcting several bit errors are known per se, for example, the "Fire" codes; the error location while using the last-mentioned codes may be completely arbitrary.
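The memory saving relative to total triplication can be illustrated with simple arithmetic. The Python sketch below is illustrative only: the function name and the example values n = 4, k = 2 are assumptions chosen for this example. It assumes that each of the n computer modules stores one code symbol per data word, as described for the reducing encoder, and that total triplication stores all k data symbols three times.

```python
def stored_symbols(n_modules: int, k_symbols: int) -> dict:
    """Symbols stored per data word under the two schemes.

    Hypothetical helper: 'coded' assumes one code symbol per module,
    'triplicated' assumes three full copies of the k-symbol data word.
    """
    # The bound k+1 <= n < 3k is the one stated in the Field of the Invention.
    if not (k_symbols + 1 <= n_modules < 3 * k_symbols):
        raise ValueError("expected k+1 <= n < 3k")
    return {"coded": n_modules, "triplicated": 3 * k_symbols}


# Illustrative values only (n = 4 modules, k = 2 data symbols per word).
print(stored_symbols(4, 2))  # {'coded': 4, 'triplicated': 6}
```

Because n < 3k, the coded arrangement always stores fewer symbols per data word than triplication under these assumptions.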
The error correction capability according to said Patent Application is an extension of that disclosed in the previous U.S. Pat. No. 4,402,045 assigned to the assignee of the present application. The latter offers several redundancy levels which can be used to implement the data input/output. A high degree of redundancy with ample correction of errors is achieved by multiplying the connections for data input/output in the same way as the multiplication of the computer modules in the computer system itself. The relevant peripheral apparatus may then be of a multiple construction. On the other hand, the peripheral apparatus may also be singular without redundancy. Intermediate redundancy levels can also be implemented. These different redundancy levels can also be incorporated in the multiprocessor computer system disclosed in said U.S. Patent Application Ser. No. 416,992. In many cases it is necessary to implement an input/output memory, for example for buffering and reformatting the data. The memory capacity is then usually comparatively small when considered as a cost factor in comparison with the other cost factors which would occur if a code word comprising n symbols with associated data reconstruction sectors, data interconnections and the like were to be formed. This is also applicable if, in addition to the input/output memory, an input/output processor module and possibly further components associated therewith are required.
Consequently, the input/output data is received in non-coded form (as viewed in relation to the error correction code). It is therefore an object of the invention to ensure that the input/output data is not processed during the input/output process in such a way that a bit error could be converted in the reducing encoder into a multibit symbol error. Such a multibit symbol error might be correctable in many cases, but should another bit error occur in the same code word, the error correction capability of the code might easily be insufficient. The object in accordance with the invention is achieved in that each computer module also comprises a second data channel which is connected in series with said second data connection and which comprises a third data connection to the environment; an input/output memory module which is connected, at least when the generator matrix [G_i] of the associated reducing encoder maps a data bit on more than one code bit, to the second data channel by way of a second, non-reducing encoder and a third data input and a decoder which is associated with the second encoder, the following relations existing between the generator matrix [G_i] of a reducing encoder, at least in as far as this encoder maps a data bit on more than one code bit, the generator matrix [G_i^*] of the second encoder, and the generator matrix [G_i^{*-1}] of the decoder:
[G_i^*]·[G_i^{*-1}] = [I], the identity matrix;
[F] = [G_i]·[G_i^{*-1}],
in which each column of [F], written as consisting of bits, contains at the most one "1" and for the remainder exclusively "zeros", each row of [F] containing at least one "1", so that in the relevant computer module a bit of a data word encoded in the input/output memory is mapped on at the most one bit of the code symbol which can be formed from the data word. In as far as a reducing encoder maps a data bit on "zero" code bits, an error in the relevant data bits will not be passed on to the relevant memory module. In as far as the mapping is performed on one code bit in the reducing encoder, a bit error in the input/output memory will be passed on as only a single bit error in the relevant memory module, even when no special steps are taken in the second encoder/decoder. In both cases the generator matrix of the second encoder may have the properties of an identity matrix (multiplied or not by a transposition matrix which modifies the sequence of the bits) for the relevant data bit. In as far as the reducing encoder maps a data bit on more than one code bit, the relevant generator matrices must satisfy more severe requirements.
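The condition on [F] can be checked mechanically. The following Python sketch is illustrative only: the 2×4 reducing-encoder matrix G_I, the 4×4 second-encoder matrix G_STAR, and all function names are assumptions chosen for this example and are not taken from the specification. Matrices are assumed to act on column vectors of bits over GF(2), so that rows index output bits and columns index input bits.

```python
def gf2_matmul(a, b):
    """Multiply two bit matrices (lists of rows) over GF(2)."""
    return [[sum(x & y for x, y in zip(row, col)) % 2 for col in zip(*b)]
            for row in a]


def gf2_inverse(m):
    """Invert a square bit matrix over GF(2) by Gauss-Jordan elimination."""
    n = len(m)
    # Augment with the identity matrix: [m | I].
    aug = [list(row) + [int(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def satisfies_f_condition(f):
    """Each column holds at most one 1; each row holds at least one 1."""
    cols_ok = all(sum(col) <= 1 for col in zip(*f))
    rows_ok = all(sum(row) >= 1 for row in f)
    return cols_ok and rows_ok


# Hypothetical reducing encoder: data bit 2 feeds both code bits.
G_I = [[1, 0, 1, 0],
       [0, 1, 1, 1]]
# Hypothetical second (non-reducing) encoder: its first two rows repeat G_I,
# the remaining rows are chosen so that the matrix is invertible.
G_STAR = [[1, 0, 1, 0],
          [0, 1, 1, 1],
          [0, 0, 1, 0],
          [0, 0, 0, 1]]

G_STAR_INV = gf2_inverse(G_STAR)        # decoder matrix
F = gf2_matmul(G_I, G_STAR_INV)         # decoder followed by reducing encoder

print(F)                                # [[1, 0, 0, 0], [0, 1, 0, 0]]
print(satisfies_f_condition(F))         # True
print(satisfies_f_condition(G_I))       # False: identity second encoder fails
```

In this example an identity second encoder would give [F] = [G_i], whose third column contains two "1"s, so a single bit error in the input/output memory could disturb two code bits. Storing the word in the input/output memory through G_STAR removes that coupling.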
Preferably, the error correction capability of the code allows for at least one arbitrary error vector in at least one code symbol. In conjunction with the "erasure" mode of said U.S. Pat. No. 4,512,020 the idea of the invention offers a very attractive implementation.
Preferably, a data word reconstruction module has at least two selectively activatable modes of operation each with a different correction capability. The flexibility in dealing with different error causes is thus further enhanced.
Preferably, each row of the matrix [F] contains exactly one "1". In that case many errors of the input/output memory do not become manifest in the relevant memory module of the main memory and the error probability in the latter memory is minimized.
Preferably, each column of the matrix [F] contains exactly one "1". All errors of the input/output memory then become manifest in the relevant memory module of the main memory so that the input/output memory can be readily tested.
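The difference between the two preferred forms of [F] can be seen by flipping one input/output memory bit at a time. The Python sketch below is illustrative only: both example matrices and the function names are assumptions made for this example. It relies on the fact that, over GF(2), the effect of a single-bit error in position j on the code symbol is column j of [F].

```python
def code_bit_errors(f, flipped_bit):
    """Number of code-symbol bits disturbed when one I/O-memory bit flips.

    A single-bit error vector e_j maps to F·e_j, which is column j of F,
    so the count is the number of 1s in that column.
    """
    return sum(row[flipped_bit] for row in f)


def error_profile(f):
    """Disturbed code bits for each possible single-bit memory error."""
    return [code_bit_errors(f, j) for j in range(len(f[0]))]


# Hypothetical [F] with exactly one 1 per row.
F_ROWS_EXACT = [[1, 0, 0, 0],
                [0, 1, 0, 0]]
# Hypothetical [F] with exactly one 1 per column.
F_COLS_EXACT = [[1, 0, 1, 0],
                [0, 1, 0, 1]]

print(error_profile(F_ROWS_EXACT))  # [1, 1, 0, 0]
print(error_profile(F_COLS_EXACT))  # [1, 1, 1, 1]
```

With one "1" per row, errors in the last two memory bits never reach the main memory; with one "1" per column, every single-bit memory error appears as exactly one code bit error, which is what makes the input/output memory testable through the main memory.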
The invention also relates to a computer module for use in a multiprocessor computer system of the kind described in which the reducing encoder maps at least one data bit on at least two code bits, the combination formed by the decoder and the reducing encoder mapping each bit from the input/output memory on at the most one bit in the memory module. Using such a computer module, an error-tolerant multiprocessor computer system can be readily formed.