Load balancing has been integral to providing high availability and scalability to web-based and non-web-based applications. However, the type of protocol used in network communications between client devices and servers affects the ability to handle load balancing effectively. HTTP, for example, is synchronous and stateless, whereas other protocols such as Diameter, RADIUS, and Session Initiation Protocol (SIP) are asynchronous and do not adhere to a single request-reply communication sequence. The use of such asynchronous protocols makes load balancing difficult, because most load balancing systems are designed to operate in a synchronous messaging environment where a single request is made and responded to before another is processed.
Asynchronous protocols such as Diameter and SIP also maintain a one-to-one (1:1) relationship in which there is always a matching reply for every request. However, unlike traditional web-based protocols, they do not need to maintain a strict synchronous exchange. In other words, in an asynchronous protocol multiple requests may be sent before any reply is received. As a result, load balancing systems built around traditional protocols such as HTTP or TCP cannot handle these exchanges, because they cannot process more than one request at a time and are limited to load balancing on a per-connection basis.
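By way of illustration only, the asynchronous exchange described above can be sketched as a table of outstanding requests keyed by a per-request identifier. The class name `PendingRequests`, its methods, and the integer identifiers are hypothetical and are not taken from any particular protocol or implementation; the sketch assumes only that each reply carries the identifier of the request it answers.

```python
class PendingRequests:
    """Tracks requests that have been sent but not yet answered.

    Hypothetical helper: assumes every reply echoes the identifier
    of the request it belongs to.
    """

    def __init__(self):
        self._pending = {}

    def send(self, request_id, request):
        # Several requests may be outstanding at once; nothing waits for a reply.
        self._pending[request_id] = request

    def receive(self, request_id, reply):
        # Match the reply to its request regardless of arrival order.
        request = self._pending.pop(request_id)
        return request, reply

    def outstanding(self):
        return len(self._pending)
```

In such a sketch, three requests may be sent back to back and their replies matched in any order, which preserves the 1:1 relationship between requests and replies without requiring a strict request-then-reply sequence.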
In typical systems, load balancing is accomplished at Layer 4 (TCP) on a per-session or per-connection basis. All requests received over the same session are load balanced to the same server. When communications are complete, the session is terminated. This behavior is not acceptable for some protocols, particularly those associated with service provider and telecommunications implementations that utilize the SIP, Diameter, Lightweight Directory Access Protocol (LDAP), and RADIUS protocols. These protocols carry communications over longer-lived sessions, in which individual communications may need to be processed by different servers. Consequently, traditional load balancing mechanisms are incapable of supporting the scalability and availability requirements of such protocols.
In a typical synchronous request-reply protocol, such as HTTP, each request can be directed to a specific server based on a variety of parameters, such as the content or request type. This behavior is also desirable in message-oriented communication, but it is typically more difficult to support for the SIP, RADIUS, and Diameter protocols due to the need to scale intermediaries to open and maintain multiple connections to different servers. As stated above, traditional TCP-based load balancing maintains a 1:1 ratio between requests and server-side connections. However, in a message-oriented protocol such as SIP or Diameter, there may be a need to maintain a one-to-many (1:N) ratio between requests and server-side connections. One way to solve the challenges associated with scaling message-oriented protocols such as SIP and Diameter is to extract individual messages out of a single, shared TCP connection.
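As a minimal sketch of extracting individual messages from a single, shared TCP connection, the following example frames Diameter messages out of a received byte buffer and assigns each message to a server. It relies on the Diameter base protocol header layout, in which the header is 20 bytes long and bytes 1 through 3 carry the total message length. The function names `split_messages` and `distribute`, the server labels, and the round-robin policy are assumptions made for illustration; a real system might instead select a server from the message content or request type.

```python
from itertools import cycle

DIAMETER_HEADER_LEN = 20  # fixed header size in the Diameter base protocol


def split_messages(buffer: bytes):
    """Return (complete_messages, leftover_bytes) for a TCP receive buffer."""
    messages = []
    offset = 0
    while len(buffer) - offset >= DIAMETER_HEADER_LEN:
        # Byte 0 is the version; bytes 1-3 hold the total message length
        # (header included) as a big-endian integer.
        length = int.from_bytes(buffer[offset + 1:offset + 4], "big")
        if length < DIAMETER_HEADER_LEN:
            raise ValueError("invalid Diameter message length")
        if len(buffer) - offset < length:
            break  # the rest of this message has not arrived yet
        messages.append(buffer[offset:offset + length])
        offset += length
    return messages, buffer[offset:]


def distribute(messages, servers):
    """Pair each message with a server (round-robin is an assumed policy)."""
    rotation = cycle(servers)
    return [(next(rotation), message) for message in messages]
```

Any bytes returned as leftover would be kept and prepended to the next read from the connection, so that a message spanning two reads is still delivered whole. Because each message is paired with a server independently, messages arriving over one client-side connection can be spread across several server-side connections, giving the 1:N ratio described above.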
What is needed is a system and method that is configured to inspect application layer data to split out individual messages from a connection-oriented protocol and distribute them appropriately to different servers using a connection-less protocol.