This application is based on and claims the priority under 35 U.S.C. §119 of German Patent Application 198 15 263.9, filed on Apr. 4, 1998, the entire disclosure of which is incorporated herein by reference.
The invention relates to an apparatus for the fault tolerant execution of programs and particularly digital computer programs, by the parallel operation of plural processor units operating as redundancy units. The invention further relates to a method of operating such an apparatus, and to the processor units included in the apparatus.
An apparatus or system of the above mentioned general type is disclosed in German Patent Publication 4,401,168. The disclosed apparatus includes a plurality of redundancy units, and can include various different types of serial bus interfaces, for example for a field bus or a MIL-1553 bus.
The degree of redundancy of the known apparatus can be selected variably as desired by means of simple on and off switching of the respective selected redundancy units. In this context, the parallel control can be operated independently of the process interface. The cited reference also describes how the redundancy units are to be interconnected with each other so that the apparatus can achieve a fault tolerant execution of application programs or user programs.
The above described known apparatus suffers several disadvantages. First of all, the maximum desired degree of redundancy must be known or determined before construction of the apparatus, so that the circuit arrangement is properly embodied to achieve the desired maximum degree of redundancy. Therefore, the production of the redundancy units cannot be carried out independently of the specific end use for which the apparatus is to be utilized. It is a significant disadvantage that the redundancy units must be designed, constructed and arranged specifically dependent on the particular application.
Secondly, significant technical efforts and costs are necessary for integrating different types of serial bus interfaces or process interfaces into the known apparatus. As a result, the finished apparatus or circuit arrangement necessarily has a high cost, especially because no standardized interfaces are available for the connection of such buses.
Moreover, difficulties can arise during maintenance procedures, if one or more of the redundancy units need to be switched off or deactivated for testing, reprogramming or the like. Such problems also affect the ability of the apparatus to carry out a standby mode of operation, for example when the apparatus is used in an automation system.
In view of the above it is an object of the invention to provide an apparatus or circuit arrangement for fault tolerant execution of programs, of the general type initially discussed above, which is considerably more flexible, variable, and adaptable in its end use configuration and embodiment, so that it may be selectively used for various different applications without requiring specialized fabrication or assembly thereof. It is a further object of the invention to provide a method of adaptably or flexibly operating such an apparatus, and to provide components making up such an apparatus. The invention further aims to avoid or overcome the disadvantages of the prior art, and to achieve additional advantages, as apparent from the present specification.
The above objects have been achieved according to the invention in a circuit arrangement for fault tolerant execution of programs, comprising a plurality of arithmetic logic units or processor elements arranged as redundancy units in a processor pool, a data line for transmitting data among the processor elements, a clock line for the forced or compelled synchronization of the several processor elements of the arrangement, and a reset line for selectively switching on and off or activating and deactivating each one of the processor elements. The data line, the clock line, and the reset line are respectively embodied as respective first, second and third cross-strapping interconnections. Each one of the processor elements comprises a respective microprocessor controller for controlling the functions of the respective processor element in such a manner that the overall circuit arrangement can be selectively operated using only one or a selected plurality of the available processor elements in parallel, as needed. In this context, a selected number of the processor elements, corresponding to the desired degree of redundancy as prescribed by the respective input coding or program, can be connected, activated and operated in parallel to each other in the circuit arrangement via the data line, the clock line, and the reset line. Moreover, the circuit arrangement comprises a simplex or multiplex serial field bus and a plurality of serial bus controllers respectively allocated to the plural processor elements, wherein the processor elements are connected respectively by the allocated serial bus controllers to the serial field bus.
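The selection of a degree of redundancy from a pool of identical processor elements can be illustrated with a minimal sketch. It is not the claimed circuit: the class names ProcessorElement and ProcessorPool and the method set_redundancy are hypothetical, and the reset line is reduced to a simple active flag per element.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessorElement:
    """Hypothetical stand-in for one redundancy unit in the pool."""
    identifier: int
    active: bool = False  # assumption: the reset line is modeled as a flag


@dataclass
class ProcessorPool:
    """Hypothetical model of the processor pool."""
    elements: list = field(default_factory=list)

    def set_redundancy(self, degree: int) -> None:
        """Activate the first `degree` elements and deactivate the rest."""
        if not 1 <= degree <= len(self.elements):
            raise ValueError("degree must be between 1 and the pool size")
        for index, element in enumerate(self.elements):
            element.active = index < degree

    def active_ids(self) -> list:
        """Return the identifiers of the currently active elements."""
        return [e.identifier for e in self.elements if e.active]


pool = ProcessorPool([ProcessorElement(i) for i in range(4)])
pool.set_redundancy(3)   # redundant operation with three elements
print(pool.active_ids())
pool.set_redundancy(1)   # simplex operation with a single element
print(pool.active_ids())
```

In this sketch the pool itself is never rebuilt when the degree changes, which mirrors the idea that one generic arrangement serves both the redundant and the simplex configuration.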
It is possible to utilize the inventive circuit arrangement generically for many different and independent applications, without requiring special technical adaptations or alterations of the circuit arrangement. As a result, the present circuit arrangement can be manufactured generically or uniformly in a mass production process, to produce a large piece count of identical circuit arrangements. In contrast, the prior art circuit arrangements are typically assembled essentially individually from discrete components to achieve different circuit configurations in small piece counts for particular applications. The present circuit arrangements, on the other hand, can be economically manufactured as highly or large-scale integrated circuit components incorporating all the necessary sub-components, whereby the present fault tolerant processor can be manufactured in a similar manner and with a similar format and package as is presently typical for a simple non-parallel processor unit.
The invention substantially reduces the effort, expense and complexity involved in utilizing the circuit arrangement in particular applications. For example, the fault tolerant processor described in German Patent Publication 4,401,168 requires four independent computer boxes or cases that are equipped with up to three VME (Versa Module Europe) boards for a particular application in a space flight project. In contrast, using the inventive circuit arrangement, such a fault tolerant computer can be realized on a single VME board. The total effort, complexity and expense of hardware components that is now necessary in the context of the invention is only a small portion (namely only a few highly integrated modules) of the total hardware effort, expense and complexity that was necessary for utilizing conventional processors as described above.
Another advantage achieved according to the invention, in contrast to the prior art, is that a fault tolerant application program or user program can be executed without modification in the present circuit arrangement both in a redundant fault tolerant operating mode using a plurality of processor elements and in a simplex or non-fault-tolerant mode using only a single processor element.
While the available degree of redundancy of the circuit arrangement operating in the simplex mode is no greater than that of a simple processor, the operation of the present circuit arrangement in the simplex mode still offers considerable advantages or simplifications to the user. For example, the first time the user or application program is executed, the simplex operation of the present circuit arrangement will at first mask all errors or faults that arise through the voting process.
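The text refers to a voting process without specifying its rule. The following minimal sketch assumes a simple majority vote over the outputs of the active processor elements. The function name vote and the dictionary layout are hypothetical and are not taken from the specification.

```python
from collections import Counter


def vote(results):
    """Return (majority value, identifiers of deviating elements).

    `results` maps a hypothetical element identifier to that element's
    output. A strict majority is assumed as the voting rule.
    """
    if not results:
        raise ValueError("at least one active processor element is required")
    counts = Counter(results.values())
    value, hits = counts.most_common(1)[0]
    if hits * 2 <= len(results):
        raise RuntimeError("no majority among the active processor elements")
    deviating = sorted(pe for pe, out in results.items() if out != value)
    return value, deviating


# Redundant operation: element 2 deviates and is outvoted.
print(vote({0: 42, 1: 42, 2: 7}))
# Simplex operation: the single result passes through unchanged.
print(vote({0: 42}))
```

With a single active element the vote degenerates to passing that element's output through, so the calling code is the same in the redundant and simplex modes.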
Furthermore, the simplex operation of the present circuit arrangement simplifies and economizes the standby operation of an automation system, because the circuit arrangement can be seamlessly switched to simplex operation without interruption, and the power consumption for the simplex operation will be only one n-th of the original power consumption for operating the circuit arrangement with n parallel processor elements. This advantage is especially considerable and important in space flight applications where the available power is strictly limited. Once a spacecraft has reached a secure orbit, the operator of the spacecraft can switch off all but one of the processor elements in the circuit arrangement, in order to conserve energy without interrupting the user functions being carried out. Thus, a single circuit arrangement provides full fault tolerant parallel redundancy in one operating mode, and also provides power conserving non-redundant operation in another mode without requiring complicated switch-over processes.
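The power relation stated above can be checked with a short calculation. The sketch assumes that all n processor elements draw equal power and that deactivated elements draw none. The function name simplex_power and the wattage used are purely illustrative.

```python
def simplex_power(redundant_power_w: float, n: int) -> float:
    """Power drawn in simplex mode, given the power drawn by n elements.

    Assumes equal power per element and zero draw for deactivated elements.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return redundant_power_w / n


# Illustrative figures only: four elements drawing 12 W in total.
print(simplex_power(12.0, 4))
```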
Another special advantage of the inventive arrangement is achieved in the case of maintenance of an automation system. In this context, the automation system can be operated with the processor circuit arrangement in the simplex operation mode for a limited period of time, while the other processor elements are switched to an inactive waiting condition so that they can be tested or so that the user software loaded into these processor elements can be exchanged or updated. Thus, the automation system can continue operating without interruption, even while carrying out hardware testing or software exchanges or upgrades of the redundant processor elements in an inactive or maintenance mode.
According to further details of the invention, an external memory is preferably respectively allocated to each one of the processor elements. Moreover, the circuit arrangement can further comprise at least one application processor, which is connected to at least one processor element via a multiprocessor bus such as a VME bus.
An arithmetic logic unit or processor element suitable for use in the inventive circuit arrangement includes at least one microprocessor controller and is embodied to form a so-called processor pool element (PPE) module, wherein the microprocessor controller controls the functions of the respective processor element and is adapted to carry out a data comparison and a data exchange with other similar processor elements interconnected to each other. This PPE module can be embodied as a multichip module, a hybrid circuit, a piggyback module, a so-called “system on a chip”, or any other similar known miniaturized processor module.
The PPE module further preferably comprises an electrically erasable and programmable read-only memory (such as an EEPROM) for storing appropriate programs for controlling the processor element. The PPE module may further comprise a standardized parallel bus interface including data and control lines for connecting the respective processor element to at least one serial bus.
Moreover, the PPE module preferably comprises a control unit for fault handling, as well as a first cross-strapping interface allocated to the fault handling controller to achieve a serial interconnection with other processor elements in the circuit arrangement. The PPE module may also comprise a fault tolerant clock signal generator circuit for achieving the compelled or despotic synchronization of the respective PPE module with the other processor elements of the circuit arrangement, as well as a second cross-strapping interface allocated to this clock signal generator circuit to provide a serial interconnection with the other processor elements. The PPE module preferably comprises a fault tolerant reset control circuit as well as a third cross-strapping interface allocated to the reset control circuit for achieving a remote reset control. Finally, the PPE module is preferably further provided with a circuit that allocates a characteristic identifier to the respective processor element.