The present invention relates in general to computer and telecommunication networks and, more particularly, to file synchronization for fault tolerant telecommunication networks.
With the advent of increasingly sophisticated telecommunication services, telecommunication networks are increasingly distributed. For example, rather than having telecommunication services performed by a centralized computer having multiple processors, these telecommunication services are increasingly performed by a distributed network of computers and servers, in which each such computer or server generally contains a single or central processor, such as an Intel Pentium class processor. The advantages of such a distributed network of computers and servers, typically connected to each other via a high-speed bus, Ethernet, or fiber optic cable, include cost effectiveness and the capability for incremental network growth.
A particular difficulty with such distributed, networked computers concerns fault tolerance, such that if one computer becomes disabled, another computer on the network may immediately take over all the functions previously performed by the disabled computer, with minimal disruption of service. For example, if the primary computer providing telecommunication services (referred to as the active application processor ("active AP"), also known as a distinguished application processor) should become disabled (crash), then, to avoid an interruption of service, a fault tolerant system may provide for a secondary computer (referred to as a standby application processor ("standby AP")) to immediately assume the performance of all services previously provided by the active AP. In order for the standby AP to immediately come online with minimal disruption of service, as if no fault or other major event occurred, the standby AP preferably should have access to identical information and be in synchrony with the active AP.
In the prior art, attempts to maintain such synchrony have typically involved the standby AP copying files from the active AP. Such a copying process is typically very time-consuming, taking minutes to copy gigabit-sized files, and necessitating an intervening loss or disruption of service. As a consequence, a need remains to reduce any such delay or interruption dramatically, to avoid service interruptions lasting longer than a few seconds.
Other prior art systems, while providing synchrony, typically do not allow such computers to operate autonomously, but only in lockstep. Other fault tolerant systems do not provide for a standby mechanism, but merely provide a disk array (RAID) for automatic backing up of information stored to a disk. Other systems require additional hardware for redundant clustering of computers, and are platform dependent.
As a consequence, a need remains for an apparatus, method and system to provide information synchrony in a fault tolerant network. Such synchrony should occur within a very small time frame, such as seconds, to avoid service interruptions. In addition, such an apparatus, method and system should not require any additional hardware, should be platform independent, and should be application independent, with such fault tolerance occurring transparently to the user and to the application.
In accordance with the present invention, an apparatus, method and system are provided for file synchronization for a fault tolerant network, in which the fault tolerant network generally includes an active network entity, such as a telecommunication server, and a standby network entity to assume the functionality of the active network entity in the event of a failure of the active network entity. The apparatus, method and system of the present invention provide such information synchrony within a very small time frame, such as seconds, to avoid service interruptions in the event that the active network entity fails and the standby network entity becomes active. In addition, the apparatus, method and system of the present invention do not require any additional hardware and are platform and application independent, with such fault tolerance occurring transparently to the user and to the application.
The method of the present invention begins with accessing a file within the active network entity, such as through a read or write request of any network application. A file access request within the active network entity is generated and transmitted to the standby network entity, which also performs the file access request. The standby network entity then generates and transmits a file access confirmation to the active network entity. The active network entity then determines whether the file access request of the active network entity has a corresponding file access confirmation from the standby network entity. When the file access request has the corresponding file access confirmation, indicating that the files are in synchrony between the active and standby network entities, the active network entity then deletes the file access request and the corresponding file access confirmation from memory. When the file access request does not have the corresponding file access confirmation, however, indicating a lack of synchrony, the active network entity then generates an error message and transfers the file access request to an error log, for subsequent use. Such subsequent use may include generating an alarm condition and transferring the standby network entity to an active status.
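By way of illustration only, the request and confirmation matching described above may be sketched as follows. This is a minimal, in-memory sketch and not the claimed implementation: the names FileAccessRequest, StandbyEntity, ActiveEntity, access, perform and reconcile are hypothetical, the network transmission is replaced by a direct method call, only write requests change any stored data, and a confirmation is assumed to be simply the identifier of the request that was performed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class FileAccessRequest:
    """Hypothetical record of one file access on the active entity."""
    request_id: int
    operation: str          # "open", "read", "write" or "close"
    path: str
    data: bytes = b""


class StandbyEntity:
    """Assumed stand-in for the standby network entity."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}
        self.reachable = True   # set to False to simulate a lost confirmation

    def perform(self, request: FileAccessRequest) -> Optional[int]:
        """Repeat the file access and return a confirmation (the request id)."""
        if not self.reachable:
            return None
        if request.operation == "write":
            self.files[request.path] = self.files.get(request.path, b"") + request.data
        return request.request_id


class ActiveEntity:
    """Assumed stand-in for the active network entity."""

    def __init__(self, standby: StandbyEntity) -> None:
        self.standby = standby
        self.files: Dict[str, bytes] = {}
        self.pending: Dict[int, FileAccessRequest] = {}
        self.confirmations: Set[int] = set()
        self.error_log: List[FileAccessRequest] = []
        self._next_id = 0

    def access(self, operation: str, path: str, data: bytes = b"") -> FileAccessRequest:
        """Perform the access locally, then forward the request to the standby."""
        self._next_id += 1
        request = FileAccessRequest(self._next_id, operation, path, data)
        if operation == "write":
            self.files[path] = self.files.get(path, b"") + data
        self.pending[request.request_id] = request
        confirmation = self.standby.perform(request)
        if confirmation is not None:
            self.confirmations.add(confirmation)
        return request

    def reconcile(self) -> List[str]:
        """Match requests to confirmations; log any request left unconfirmed."""
        errors: List[str] = []
        for request_id, request in list(self.pending.items()):
            del self.pending[request_id]
            if request_id in self.confirmations:
                # In synchrony: delete both the request and its confirmation.
                self.confirmations.discard(request_id)
            else:
                # Lack of synchrony: keep the request in the error log.
                self.error_log.append(request)
                errors.append(
                    f"no confirmation for request {request_id} "
                    f"({request.operation} {request.path})"
                )
        return errors


standby = StandbyEntity()
active = ActiveEntity(standby)
active.access("write", "calls.log", b"call-1\n")
messages = active.reconcile()   # empty when every request was confirmed
```

In this sketch, a request that remains in error_log after reconcile is what a supervising process could inspect for the subsequent use mentioned above, such as raising an alarm condition.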
As indicated above, this methodology is transparent to and independent of the network application. The methodology is also independent of an operating platform within the active and standby network entities. The various file access requests typically include a read request, a write request, an open request, and a close request, and may be invoked through any type of network application.