1. Field of the Invention
The present invention relates to a concurrent write duplexing device with extension of memory bus in a tightly coupled fault tolerance system, and more particularly, to a concurrent write duplexing device with extension of memory bus which maintains memory data consistency within processor modules operating in a duplexed manner in a failure sensing control system.
2. Discussion of Related Art
Generally, every system produced by human technology carries the possibility of trouble caused by designer mistakes, component failures, and the like. If such trouble occurs in a system used for a purpose in which trouble must be prevented, such as medical equipment, flight control systems, satellites, weapon systems, and switching systems, normal operation becomes impossible, which results in a serious problem. A fault tolerance system is a system that does not stop at the system level and is constructed to operate as designed regardless of hardware failures or software errors. So that it can continue to operate when trouble occurs, a fault tolerance system fundamentally includes a redundancy module which can back up a system function, and its implementation varies in accordance with the number and type of the additional redundancy modules.
When a fault is encountered in a switching system, it can be repaired at a proper time by an operator. Therefore, the switching system does not need the large amount of hardware redundancy required in medical equipment, flight control systems, satellites, and weapon systems. Typically, the switching system is composed of a module which executes a system function and a standby module which backs up the system function, that is, it is embodied in a duplexing manner. The switching system, which must operate with high reliability and availability, supports a fault tolerance function for some important parts in the duplexing manner. A control part, as one of the important parts in the switching system, operates an active module on one side and a standby module on the other side by using the same type of processor module. In a preferred embodiment of the present invention, the memory data of the active module and the standby module are kept the same, and if a fault is produced in the active module, the standby module receives only the state information of the active module and changes its own state to that of the active module, so that the system can be operated without stopping at the system level. It is therefore important that data consistency between the two modules be maintained. To this end, a concurrent write method is embodied in the preferred embodiment of the present invention. With the concurrent write method applied in the fault tolerance system, a memory write operation in the active module is also carried out in the standby module so that the two modules constantly hold the same memory contents. If any fault then occurs in the active module, the standby module takes over the system function of the defective active module and continues that function at the system level regardless of the fault in the active module.
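The concurrent write and takeover behavior described above can be illustrated with a minimal software sketch. It is not the disclosed hardware; the names `Module`, `concurrent_write`, and `fail_over` are hypothetical, and each main memory is modeled as a simple address-to-value dictionary.

```python
class Module:
    """Hypothetical model of one processor module with its main memory."""

    def __init__(self, name, active=False):
        self.name = name
        self.active = active
        self.memory = {}  # address -> value


def concurrent_write(active, standby, addr, value):
    """Apply the same memory write to both modules (assumed to always succeed)."""
    active.memory[addr] = value
    standby.memory[addr] = value


def fail_over(active, standby):
    """On a fault in the active module, the standby changes its state to active.

    No memory is copied here: the standby already holds the same contents
    because every write was performed concurrently.
    """
    active.active = False
    standby.active = True
    return standby
```

In this sketch, the module returned by `fail_over` can continue the system function immediately because its memory already equals that of the failed module.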
Examples of conventional duplexing devices in which the concurrent write method is embodied are a duplexing data channel matching device using a tightly coupled data transmission method and a duplexing data channel matching device using a decoupled data transmission method. Since these devices are achieved by extension of a system bus, and most of their hardware must be altered whenever the central processing unit (CPU), and thus the system bus, is changed, they exhibit low compatibility and require a long hardware development period. As CPU performance improves, the conventional devices become impractical because of the increased system bus clock, and they do not ensure reliable operation with currently commercial high performance microprocessors running at hundreds of MHz. In addition, since the duplexing data channel matching device using the coupled data transmission method must receive answer signals from the two modules before proceeding to the next operation, the device exhibits serious performance deterioration. Meanwhile, the duplexing data channel matching device using the decoupled data transmission method separates the memory write operation in the active module from the concurrent write operation by using a first-in first-out (FIFO) buffer, thereby solving the performance deterioration of the coupled device; however, recovery from faults in such a device is complicated, and the device also exhibits a high probability of fault occurrence.
FIG. 1 is a block diagram illustrating a data transmission channel where a coupled data transmission system which does not separate a memory write operation and a concurrent write operation in an active module is employed, in a duplexing device in which a conventional concurrent write method is embodied.
As shown in FIG. 1, an active module 10a and a standby module 10b each include a central processing unit (CPU) 11, a main memory 12, a data transmission channel 13 and an input/output (I/O) matching device 14.
The solid line in FIG. 1 indicates the memory write operation which the CPU 11 or the I/O matching device 14 of the active module 10a executes on the main memory 12, and the dotted line indicates the answer signal which informs the CPU 11 or the I/O matching device 14 that specific data has been stored in the main memory 12 of each of the active module 10a and the standby module 10b. For the duplexing operation, the main memory write operation in the active module 10a is extended through the data transmission channel 13 to a local bus in the standby module 10b and is thereby transmitted to the main memory 12 in the standby module 10b, such that the data in the main memories 12 of the active module 10a and the standby module 10b are constantly kept the same. In the data transmission channel where the coupled data transmission method is employed, however, a write operation to a specific region of the main memory 12 is complete only when an answer signal indicating write completion has been received from the main memory 12 of the standby module 10b, so that no following operation can be executed before the answer signal is received from the standby module 10b.
Accordingly, the overhead of waiting for the answer signal from the standby module deteriorates system performance in the duplexing device in which the conventional coupled transmission system is employed, such that the data transmission channel using the conventional coupled transmission system cannot be suitably employed in a system having a high performance processor.
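The waiting overhead of the coupled method can be illustrated with a simple timing model. This is only a sketch under stated assumptions: the function names and the nanosecond figures are hypothetical, and the local write and the write forwarded to the standby module are assumed to start at the same time.

```python
def coupled_write_latency(local_ns, standby_ns, channel_ns):
    """Time until the writer may start its next operation in the coupled method.

    Assumption: the write travels over the channel (channel_ns), is stored in
    the standby memory (standby_ns), and the answer signal travels back
    (channel_ns). The writer waits for both answer signals.
    """
    standby_answer_ns = channel_ns + standby_ns + channel_ns
    return max(local_ns, standby_answer_ns)


def local_write_latency(local_ns):
    """Time for a write that only waits for the local main memory."""
    return local_ns


def coupled_overhead(local_ns, standby_ns, channel_ns):
    """Extra waiting time per write caused by the standby answer signal."""
    return coupled_write_latency(local_ns, standby_ns, channel_ns) - local_ns
```

With hypothetical values of 60 ns for each memory and 40 ns per channel crossing, every coupled write in this model waits 140 ns instead of 60 ns, and the gap grows with channel delay regardless of how fast the processor is.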
FIGS. 2A and 2B are block diagrams illustrating a data transmission channel which employs a conventional decoupled data transmission system designed to minimize the performance deterioration of the system of FIG. 1, where the main memory write operation within the active module and the concurrent write, that is, the main memory write operation within the standby module through the data transmission channel, are separated and operate independently.
As shown in the figures, an active module 20a and a standby module 20b each include a central processing unit (CPU) 21, a main memory 22, an input/output (I/O) bus matching device 23, a SCSI, Ethernet, and miscellaneous I/O matching device 24, and a high speed data transmission channel 25. Further, a buffer for separating the operations of the two modules is disposed in the interior of each of the modules. However, in the conventional decoupled data transmission system, various troubles may arise from the hardware complexity and the increased number of electronic parts required to separate the operations of the two modules. In addition, since the operation in which a fault occurs has already been completed by the CPU 21, it is difficult to locate and recover the part in which the fault has occurred. Also, in the case where trouble occurs while the duplexed system is being separated, it is impossible to recover from the fault. Meanwhile, the decoupled data transmission system, in which the memory write operations of the CPU are monitored and the monitored data is stored in the buffer, is not desirable because the time available for sensing the corresponding operation is shortened by the high speed system bus. Specifically, the system cannot be embodied with a high performance microprocessor having a high speed system bus.
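The decoupling by means of a buffer, and the resulting difficulty of fault recovery, can be illustrated with a minimal sketch. It assumes a single FIFO between the two modules and models each main memory as a dictionary; the class `DecoupledPair` and its methods are hypothetical names, not elements of the conventional device.

```python
from collections import deque


class DecoupledPair:
    """Hypothetical model of an active/standby pair decoupled by a FIFO."""

    def __init__(self):
        self.active_mem = {}
        self.standby_mem = {}
        self.fifo = deque()  # writes completed locally but not yet mirrored

    def write(self, addr, value):
        """The active module completes its write at once and queues a copy."""
        self.active_mem[addr] = value
        self.fifo.append((addr, value))

    def drain(self, max_entries=None):
        """Apply queued writes to the standby memory; return how many were applied."""
        applied = 0
        while self.fifo and (max_entries is None or applied < max_entries):
            addr, value = self.fifo.popleft()
            self.standby_mem[addr] = value
            applied += 1
        return applied

    def pending(self):
        return len(self.fifo)

    def consistent(self):
        return self.active_mem == self.standby_mem
```

While `pending()` is non-zero, the CPU already regards the queued writes as complete although the standby memory does not yet hold them. A fault in that window is what makes locating and recovering the affected data difficult in this model.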
Current commercial high performance microprocessors provide performance of several hundred MIPS (million instructions per second) based upon clocks of hundreds of MHz, and use a system bus clock of 100 MHz or higher, which is expected to increase further, to relieve the bottleneck generated in the system bus. However, since the conventional duplexing fault tolerance system using the concurrent write is embodied by extending the system bus, there remains a problem that the system cannot be embodied appropriately as the system bus clock increases. Also, since the hardware associated with the duplexing device must be altered whenever the CPU is upgraded or changed, a long period of time is required to develop new hardware and software.
Accordingly, the present invention is directed to a concurrent write duplexing device with extension of memory bus in a tightly coupled fault tolerance system that substantially obviates one or more of the problems due to limitations and disadvantages of the related art.
An object of the invention is to provide a concurrent write duplexing device with extension of memory bus in a tightly coupled fault tolerance system which extends the memory bus between a memory controller and a memory, the memory bus being slower than the system bus and independent of a change of CPU, and which connects the extended bus to a duplexing data channel.
Preferably, the duplexing device of the present invention in which the memory bus is extended is achieved with a minimum of hardware and should meet the following basic requirements:
1) a memory switch for connecting a data channel which keeps the data of the active and standby processor modules consistent with each other,
2) a memory switch control function for setting the memory switch direction when performing memory read, write, and concurrent write,
3) a function for setting active/standby operation modes and determining a channel hang-up mode, and
4) a minimal amount of hardware, to minimize the probability of fault occurrence caused by an increased number of hardware components.
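Requirements 2) and 3) above can be pictured as a small decision function. The following is only an illustrative sketch: the text above does not define the switch directions, so the routing rules, the `Mode` and `Op` enumerations, and the target names `local_memory` and `duplex_channel` are all assumptions made for this example.

```python
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


class Op(Enum):
    READ = "read"
    WRITE = "write"
    CONCURRENT_WRITE = "concurrent_write"


def memory_switch_targets(mode, op, channel_up=True):
    """Return the targets the memory switch connects for one memory operation.

    Assumed rules (not taken from the text above):
    - reads and plain writes go to the local memory only;
    - a concurrent write on the active module also drives the duplex channel;
    - when the channel is hung up, a concurrent write falls back to local only.
    """
    targets = {"local_memory"}
    if op is Op.CONCURRENT_WRITE and mode is Mode.ACTIVE and channel_up:
        targets.add("duplex_channel")
    return frozenset(targets)
```

Under these assumptions, only a concurrent write issued by the active module with the channel available reaches the standby side; every other combination stays within the local module.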
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.