In a computer information system, to ensure the security and stability of an information service, two or more service processing systems having the same functions need to be established, so that functional disaster recovery can be implemented among them. That is, when a problem occurs in one service processing system, another service processing system can be used to provide the service to the outside, so that the security and stability of the externally oriented service are ensured. Disaster recovery is an important component of system high-availability technology: the impact of the external environment or of an emergency on the system needs to be taken into consideration in advance, so that the system does not lose data or become unable to provide a service when a disaster occurs. A disaster here refers to an event that makes it impossible to provide a normal service, such as a machine hardware fault, a network fault, a program crash, or an overload caused by an emergency.
Currently, the industry generally implements a disaster recovery solution through the composition and service architecture of a computer system.
FIG. 1 is a schematic diagram of an architecture of an Internet service system in the prior art. Referring to FIG. 1, the Internet service system is a specific application field of computer information systems, and in this architecture, all service nodes are peers. For example, there are three peer service nodes 101, 102, and 103 in FIG. 1, each of which simultaneously provides the processing logic of all services to the outside (it is assumed that the services are classified into classes A, B, and C), and these peer service nodes form a service cluster. The system architecture in FIG. 1 is currently adopted by many websites, and its disaster recovery principle is as follows: after a client initiates a service request of any category, the service request is randomly allocated to one of the service nodes in the service cluster through a load balancing system on the Transmission Control Protocol (TCP) layer of the system, and that service node responds to the service request. When a disaster event, for example, a hardware fault, occurs in a service node, the service request is allocated to the other normally operating service nodes for responding.
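The allocation behavior described for FIG. 1 can be illustrated with a minimal Python sketch. It is not taken from any actual load balancing system: the names `ServiceNode`, `healthy`, and `dispatch` are hypothetical, and an in-process random choice stands in for the TCP-layer load balancer.

```python
import random


class ServiceNode:
    """A peer node that holds the processing logic of every service class."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.healthy = True  # hypothetical health flag, set False on a fault

    def handle(self, category):
        return f"node {self.node_id} handled class {category}"


def dispatch(nodes, category, rng=random):
    """Randomly allocate a request to one normally operating peer node."""
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        raise RuntimeError("no normally operating service node is available")
    return rng.choice(healthy).handle(category)


# Peer nodes 101, 102, and 103 as in FIG. 1.
cluster = [ServiceNode(101), ServiceNode(102), ServiceNode(103)]
cluster[0].healthy = False      # simulate a hardware fault on node 101
print(dispatch(cluster, "A"))   # served by node 102 or node 103
```

Because every node serves every class, any surviving node can absorb a request, but all classes also share the same pool of remaining capacity.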
FIG. 2 is a schematic diagram of another architecture of an Internet service system in the prior art. Referring to FIG. 2, in this architecture, three service clusters are divided according to service category, all of the service nodes in each service cluster provide only one fixed category of service, and the service nodes within each service cluster are peers. For example, in FIG. 2, a service cluster 201 provides a class A service, a service cluster 202 provides a class B service, and a service cluster 203 provides a class C service. Taking the service cluster 201 as an example, each of the service nodes 211, 212, and 213 therein has the processing logic for the class A service only, and the address of the service cluster corresponding to each category of service is set in a client. After the client initiates a service request of the class A service, the class A service request is sent to the service cluster 201, the service request is then randomly allocated to one of the service nodes in the service cluster 201 through a load balancing system on the TCP layer, and that service node responds to the service request. When a disaster event, for example, a hardware fault, occurs in a service node in the service cluster 201, the class A service request is allocated to the other normally operating service nodes within the service cluster 201 for responding. Currently, the system architecture in FIG. 2 is generally adopted by many Internet game service systems.
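The category-based routing described for FIG. 2 can be sketched in Python as follows. This is an illustration under assumptions rather than a real implementation: the `CLUSTERS` table and the `route` function are hypothetical, the node identifiers for clusters 202 and 203 are placeholders that do not appear in the figure description, and a random choice stands in for the TCP-layer load balancer.

```python
import random

# Service category -> cluster, as a client-side address table might hold it.
# Cluster 201 with nodes 211-213 follows FIG. 2; the node IDs listed for
# clusters 202 and 203 are hypothetical placeholders.
CLUSTERS = {
    "A": {"cluster_id": 201, "nodes": {211: True, 212: True, 213: True}},
    "B": {"cluster_id": 202, "nodes": {221: True, 222: True, 223: True}},
    "C": {"cluster_id": 203, "nodes": {231: True, 232: True, 233: True}},
}


def route(category, clusters=CLUSTERS, rng=random):
    """Send a request to the cluster for its category, then pick a node.

    Only nodes inside the matching cluster are candidates; a node in
    another cluster cannot take over, because it lacks the logic.
    """
    cluster = clusters[category]
    healthy = [nid for nid, ok in cluster["nodes"].items() if ok]
    if not healthy:
        raise RuntimeError(f"class {category} service is unavailable")
    return cluster["cluster_id"], rng.choice(healthy)


CLUSTERS["A"]["nodes"][212] = False   # simulate a fault on node 212
print(route("A"))                     # cluster 201, node 211 or node 213
```

The sketch makes the isolation visible: a failure in cluster 201 never affects class B or class C requests, but class A requests also cannot borrow capacity from clusters 202 and 203.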
However, the above-mentioned prior art has the following technical problem: the entire computer information system is poor in robustness. For example, in the architecture illustrated in FIG. 1, if the number of faulty service nodes in the service cluster reaches the point at which the actual load of the system is greater than the load that can be borne by the normally operating service nodes, the system is overloaded and is therefore entirely unavailable. In the architecture illustrated in FIG. 2, though services of different categories are separately processed by different service clusters, each of the service clusters has the same problem of poor robustness as the architecture illustrated in FIG. 1. That is, if the number of faulty service nodes in a service cluster reaches the point at which the actual load of the service cluster is greater than the load that can be borne by the normally operating service nodes, the service cluster is overloaded and is therefore entirely unavailable, and thus the category of service corresponding to the unavailable service cluster cannot be provided to the outside.
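The overload condition described above can be made concrete with a short Python sketch. The figures used (three nodes, a capacity of 100 load units per node, an actual load of 180 units) are assumed purely for illustration, as is the simplification that every node has the same capacity; the function names are hypothetical.

```python
import math


def is_overloaded(actual_load, healthy_nodes, capacity_per_node):
    """True when the load exceeds what the healthy nodes can bear."""
    return actual_load > healthy_nodes * capacity_per_node


def tolerable_failures(actual_load, total_nodes, capacity_per_node):
    """Largest number of failed nodes before the cluster is overloaded.

    Returns 0 when the cluster cannot lose any node (including the case
    where it is already overloaded with every node healthy).
    """
    needed = math.ceil(actual_load / capacity_per_node)
    return max(0, total_nodes - needed)


# Assumed illustrative figures, not taken from the source.
LOAD, NODES, CAPACITY = 180, 3, 100

for failed in range(NODES + 1):
    print(failed, is_overloaded(LOAD, NODES - failed, CAPACITY))

print(tolerable_failures(LOAD, NODES, CAPACITY))
```

With these assumed numbers the cluster tolerates one failed node; a second failure leaves 100 units of capacity for 180 units of load, so the cluster as a whole becomes unavailable. The same calculation applies per cluster in the FIG. 2 architecture.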