This invention relates to computer engineering, and particularly to a method for ensuring the reliable operation of program computing means.
The reliable operation of any computing means that runs software remains a pressing problem. Numerous solutions addressing it are already known.
Thus, RU 2011216 C1 (15 Apr. 1994) describes an apparatus for managing a control computing machine which, on proceeding to interrupt processing, manages a timing chart portion that is common to external and internal interrupts.
RU 2050588 C1 (20 Dec. 1995) describes a method for managing and debugging real-time programs and an apparatus for implementing it. This method has four modes for localizing an error, each of which compares the address of some cell with an address set by toggle switches.
RU 2066877 C1 (20 Sep. 1996) describes an apparatus for managing an electronic computer, which detects control-altering errors by comparing the actual addresses with admissible ones.
RU 2094842 C1 (27 Oct. 1997) discloses another apparatus for managing a control computer, which checks the correctness of addressing the modules of said computer, of switching the sequence of interrupt service routines, and of proceeding to a new linear program part.
RU 2001118437 A1 (10 Jun. 2003) describes a method for sharing the time of the central processing unit between tasks in computerized systems for controlling technological processes using a planning management file. In this method, which is based on allocating priorities for processing the jobs, a cyclic job-switching sequence is assigned in accordance with the job ranking defined at the design stage.
U.S. Pat. No. 5,966,530 A (12 Oct. 1999) discloses a method for restoring machine states at instruction boundaries. In this method, each instruction, at the moment of its issue, is assigned an identifying tag associated with a location in memory. The data at this location is updated in response to changes in the instruction activity status.
U.S. Pat. No. 6,374,364 B1 (16 Apr. 2002) describes a fault-tolerant computing system using instruction counting, wherein an interrupt occurs after a predetermined number of instructions have been executed.
US 2002/0178209 A1 (28 Nov. 2002) discloses a method for determining the load of a computing element, wherein the program is subdivided into several tasks, and the time intervals between interrupts are selected such that at least one task is started and completed during each time interval.
All these known methods provide some increase in the operational reliability of program computing means; however, each of them is directed at solving only some particular task.
The closest analogue of the claimed invention is described in U.S. Pat. No. 5,911,040 A (8 Jun. 1999). The computing system disclosed in this document is fault tolerant in that, upon detecting an error while running the program, it returns to the previous checkpoint and restarts the program from that checkpoint, the checkpoint set being held in the processor memory. However, this method does not ensure the required reliability either, since it does not distinguish between types of failures (errors) and hence cannot correct those failures (errors) in different ways depending on their type.
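For illustration only, the general checkpoint-and-restart principle referred to above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the cited patent: the names `CheckpointedRunner` and `run_with_rollback`, the `max_retries` limit, and the representation of the program as a list of step functions acting on a dictionary are all hypothetical. The sketch also shows the limitation noted above: every error, whatever its type, is handled by the same rollback.

```python
import copy


class CheckpointedRunner:
    """Holds the program state and an in-memory set of checkpoints (hypothetical helper)."""

    def __init__(self, state):
        self.state = state
        self.checkpoints = []  # checkpoint set held in memory

    def checkpoint(self):
        # Save an independent copy of the current state.
        self.checkpoints.append(copy.deepcopy(self.state))

    def rollback(self):
        # Return to the most recent checkpoint, discarding partial changes.
        self.state = copy.deepcopy(self.checkpoints[-1])


def run_with_rollback(steps, state, max_retries=3):
    """Run each step; on any error, roll back and restart from the checkpoint.

    Assumption: the retry limit is added only so that the sketch terminates.
    Note that the error type is never inspected: all failures are treated alike.
    """
    runner = CheckpointedRunner(state)
    for step in steps:
        runner.checkpoint()
        for _attempt in range(max_retries + 1):
            try:
                step(runner.state)
                break  # step succeeded, move on to the next one
            except Exception:
                runner.rollback()  # same reaction for every kind of error
        else:
            raise RuntimeError("step kept failing after rollback")
    return runner.state
```

In this sketch a transient failure is cured by the restart, whereas a permanent failure merely repeats until the retry limit is reached, because the rollback logic has no means of telling the two apart.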