In the field of communication, it is known to provide entities of different functionality in a communication system that may comprise one or more networks. An entity within the meaning of the present application and claims is a device or a plurality of devices for providing a particular functionality, e.g. a single unit or node, or a collection of units or nodes that act together. One known type of entity is a gateway entity that acts as a gate between one or more entities on one side and one or more entities on the other side. For example, gateway entities can be provided at the transition between two different networks, for allowing communication between the entities of the two networks. Another example of a gateway entity is in a context where a plurality of application components are provided in a redundant structure (also referred to as a high availability system or fault tolerant system), connected to a gateway entity that offers entities of an outside network access to the application components, where one application component can take over when another fails.
A basic problem with such gateway entities is their complexity. Namely, the gateway entities are designed to process messages being sent between the entities on either side, such that the gateway entities must be able to understand the protocol being used for the messages.
For example, in the case of a gateway entity between two different networks, the gateway entity should be arranged to implement every protocol used for messages passed by the gateway entity.
In redundant systems, further problems occur in the gateways.
Due to the complexity of telecommunication networks, failures can have various causes. They may originate in the network components themselves, e.g. in the hardware or in the software running on those components, but may also be triggered by environmental effects on the network components. In either case, users may be prevented from receiving the offered services.
Service retainability is key to the success of these networks. It means that even during a failure, when a backup process takes over to provide continuous service, there is no or only minimal impact on the user receiving the service.
Offering telecom grade high availability in general entails hardware and software components that are designed to have or support high availability functions. These kinds of platforms have telecom grade operating systems and specifically written applications to make use of high availability functions. The design and implementation process of such operating systems and applications is long and expensive.
There are many high availability solutions, which can be categorized as stateful or stateless depending on whether the state of the application is preserved during failover, so that the application can continue smoothly, or has to be re-built after failover. Stateful solutions are more appropriate for smooth services, because re-building states may take a considerable amount of time, possibly leading to service disruption or degradation.
For example, a stateful high availability system may comprise one or more primary and backup components and additional mechanisms to ensure that states of the primary components are replicated to the backup components. The most widely known system uses the hot-standby or 1+1 redundancy scheme, in which a primary and a backup component work in a mated pair relationship and the primary component serves all requests, while the backup component waits to take over when the primary component goes down due to a failure. During normal operation, application states from the primary component are periodically copied to the backup component so that an up-to-date version is available in case the primary component stops. The failover process changes the role of the backup component to that of the primary component until the failed primary component has recovered.
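The hot-standby scheme described above can be illustrated with a minimal sketch. All names here (`Component`, `HotStandbyPair`, `replicate`, `failover`) are hypothetical and chosen only for illustration. The sketch assumes in-memory state and a full-copy checkpoint, and it leaves out fault detection and the resilience of the replication mechanism itself.

```python
import copy


class Component:
    """Hypothetical application component holding in-memory state."""

    def __init__(self, name):
        self.name = name
        self.state = {}
        self.alive = True

    def handle(self, key, value):
        if not self.alive:
            raise RuntimeError(f"{self.name} is down")
        self.state[key] = value
        return f"{self.name} stored {key}"


class HotStandbyPair:
    """1+1 pair: the primary serves all requests, the backup waits."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup

    def replicate(self):
        # Periodic checkpoint: copy the primary's state to the backup.
        self.backup.state = copy.deepcopy(self.primary.state)

    def failover(self):
        # The backup takes over the primary role.
        self.primary, self.backup = self.backup, self.primary

    def request(self, key, value):
        try:
            return self.primary.handle(key, value)
        except RuntimeError:
            self.failover()
            return self.primary.handle(key, value)


def demo():
    pair = HotStandbyPair(Component("A"), Component("B"))
    pair.request("call-1", "active")
    pair.replicate()                    # call-1 reaches the backup
    pair.request("call-2", "active")    # written after the last checkpoint
    pair.primary.alive = False          # simulated failure of A
    reply = pair.request("call-3", "active")
    return reply, sorted(pair.primary.state)
```

In this sketch, state written after the last checkpoint (`call-2`) is not available on the new primary, which is one reason the replicated states need to be kept up-to-date.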
Another example is a fault tolerant system, in which two or more identical components work in parallel with the same input data. The outputs of these components are compared to identify whether there is a faulty one among them. These dual or multiple modular redundant systems have redundant hardware setups and are inherently stateful, as each component processes the same data in exact parallelism with the other components. Therefore, if one component fails, it can be disabled, while the others can provide the service immediately, as there is no switchover time or other delay in this simple failover mechanism.
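The output comparison in a modular redundant system can be sketched as follows. Majority voting is used here as one possible comparison rule; it is an assumption of this sketch, not something the text prescribes. The names `vote`, `run_redundant`, `healthy` and `broken` are hypothetical, and replica outputs are assumed to be hashable values.

```python
from collections import Counter


def vote(outputs):
    """Return (majority value, positions of disagreeing outputs)."""
    value, count = Counter(outputs).most_common(1)[0]
    if count <= len(outputs) // 2:
        # E.g. two replicas that disagree: a fault is detected,
        # but the comparison alone cannot tell which output is correct.
        raise ValueError("no majority among replica outputs")
    faulty = [i for i, out in enumerate(outputs) if out != value]
    return value, faulty


def run_redundant(replicas, data, disabled=frozenset()):
    """Feed the same input to all active replicas and compare outputs."""
    active = [i for i in range(len(replicas)) if i not in disabled]
    outputs = [replicas[i](data) for i in active]
    value, faulty_positions = vote(outputs)
    return value, [active[p] for p in faulty_positions]


def healthy(x):
    return x * 2


def broken(x):
    return x * 2 + 1  # simulated faulty component


value, faulty = run_redundant([healthy, broken, healthy], 21)
# The faulty replica can now be disabled; the others keep serving.
value_after, faulty_after = run_redundant(
    [healthy, broken, healthy], 21, disabled=frozenset(faulty)
)
```

With only two replicas, a mismatch raises an error in this sketch: the comparison shows that a fault exists but cannot identify the faulty component without further information.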
However, the approaches discussed above that follow the primary-backup principle have to apply additional mechanisms to be able to replicate states in the backup component. State replication imposes several requirements, e.g. states need to be consistent and up-to-date in corresponding components, and the mechanism that is responsible for moving states has to be resilient in itself. Therefore, for state replication, both the operating system and the application have to be designed to support this feature. The operating system has to implement and handle resilient databases and processes, manage their failover, and also support various high availability functions, such as fault detection and failover control logic. Further, applications have to be specifically coded to assist the state replication. This implies a long and expensive design and implementation process and a complex system with a long time to market cycle. Moreover, neither the operating systems nor the applications are portable among different platforms.
Furthermore, dual or multiple modular redundant systems require exact parallel processing in each component, which generally is extremely hard to achieve. Not only do instructions have to be processed simultaneously, in terms of clock cycles, in different processors, but randomness is also introduced by both the operating system (e.g., port selection, interrupts, task scheduling) and the applications (e.g., adding random fields), and this randomness has to be controlled to achieve exact parallel operation. Moreover, the systems require an additional mechanism to decide on the correct value when there is a comparison mismatch at the outputs. This support comes from the board (hardware) itself or from software components, which are proprietary extensions that are unknown in detail. To fulfill these requirements, special hardware and operating systems are usually needed, rendering fault tolerant systems very costly.
EP 1 599 099 A1 describes improvements in message-based communications. It provides a method of communicating information between an intermediate element and a source element in a message-based communication system in which request messages are sent from the source element and, in response, a corresponding response message is sent from a destination element. An exchange of messages is known as a transaction in SIP, wherein a transaction comprises all messages from the first request message up to a final response message. The proxy is a stateful proxy that determines whether the received message is part of the current transaction, e.g. by matching a transaction identifier for the current message with the information stored in a transaction context. Further, another proxy is provided, which is stateless, i.e., it does not maintain a transaction context and thus does not require a context storage means.
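The transaction matching performed by a stateful proxy can be sketched as below. This is a simplified illustration and not the method of the cited document: the class `StatefulProxy`, its methods, and the use of a plain string as transaction identifier are all assumptions of this sketch.

```python
class StatefulProxy:
    """Keeps a transaction context per transaction identifier."""

    def __init__(self):
        # Context storage: transaction id -> messages seen so far.
        self.contexts = {}

    def on_request(self, txn_id, message):
        # The first request message opens a new transaction context.
        self.contexts[txn_id] = [message]

    def on_message(self, txn_id, message, final=False):
        """Return True if the message belongs to a known transaction."""
        context = self.contexts.get(txn_id)
        if context is None:
            return False
        context.append(message)
        if final:
            # A final response closes the transaction.
            del self.contexts[txn_id]
        return True


proxy = StatefulProxy()
proxy.on_request("txn-1", "REQUEST")
matched_provisional = proxy.on_message("txn-1", "PROVISIONAL")
matched_final = proxy.on_message("txn-1", "FINAL", final=True)
matched_late = proxy.on_message("txn-1", "RETRANSMISSION")
matched_unknown = proxy.on_message("txn-2", "FINAL", final=True)
```

A stateless proxy, by contrast, would forward each message without the `contexts` storage and therefore could not make this determination.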
U.S. Pat. No. 6,360,270 B1 describes hybrid and prediction admission control strategies for a server. It describes an admission control system for a server, including an admission controller that receives a stream of messages from one or more clients targeted for the server. The admission controller relays the messages to the server in a stream that corresponds to a number of sessions underway between the clients and the server. The admission controller processes individual ones of the arriving messages based upon the indications provided by the resource monitor and a determination of whether the arriving messages correspond to a session already underway with the server. For example, a transaction list identifies any session.
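One way to combine the two inputs named above, the resource monitor indication and the session check, is sketched below. The policy shown (always relay messages of sessions already underway, refuse new sessions while the server is overloaded) is an assumption made for illustration and is not taken from the cited patent. `AdmissionController`, `admit` and the `is_overloaded` callable are hypothetical names.

```python
class AdmissionController:
    """Decides per message whether to relay it to the server."""

    def __init__(self, is_overloaded):
        # Callable standing in for the resource monitor indication.
        self.is_overloaded = is_overloaded
        # Stands in for the transaction list identifying sessions.
        self.sessions = set()

    def admit(self, session_id):
        if session_id in self.sessions:
            return True            # session already underway
        if self.is_overloaded():
            return False           # do not start new sessions under load
        self.sessions.add(session_id)
        return True


load = {"busy": False}
controller = AdmissionController(lambda: load["busy"])

first = controller.admit("s1")       # new session, server not busy
load["busy"] = True
ongoing = controller.admit("s1")     # existing session, still relayed
refused = controller.admit("s2")     # new session while overloaded
```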