A load balancer distributes client connections and requests across several servers. This distribution is intended to balance the load so that no server receives a disproportionate share of the load while other servers go underutilized.
The load balancer is disposed at a network point of ingress. The network point of ingress is typically a common address of a point-of-presence (PoP) at which content and services hosted or served by the set of servers can be accessed by clients.
Persistent request distribution is one manner by which the load balancer can distribute requests across the servers. With persistent request distribution, the load balancer distributes requests for the same subset of content or services to the same servers. Each server is therefore tasked with serving a specific subset of the content or services that are hosted or otherwise accessible from the PoP in which the servers operate.
Persistent request distribution involves the load balancer receiving and inspecting client object requests. This typically includes inspecting the Uniform Resource Locator (URL) of the object request in order to identify the content or service being requested. The load balancer can perform a hash on the URL or other request parameters to identify which of the servers is tasked with delivering the requested content or service. The Cache Array Routing Protocol (CARP) is one such persistent request distribution scheme.
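The URL-hashing step described above can be sketched as follows. This is a minimal illustration using highest-random-weight (rendezvous) hashing, which is similar in spirit to CARP but does not reproduce the CARP hash function. The server names, the `pick_server` helper, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib


def _score(server: str, url: str) -> int:
    """Combine a server name and a URL into a deterministic 64-bit score."""
    digest = hashlib.sha256(f"{server}|{url}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_server(url: str, servers: list[str]) -> str:
    """Return the server tasked with `url`: the one with the highest score.

    The same URL always maps to the same server while the server list is
    unchanged, which is the property persistent distribution relies on.
    """
    if not servers:
        raise ValueError("no servers available")
    return max(servers, key=lambda server: _score(server, url))


if __name__ == "__main__":
    # Hypothetical server names and URLs, for illustration only.
    pop_servers = ["server-a", "server-b", "server-c", "server-d"]
    for requested in ("https://example.com/a.js", "https://example.com/b.png"):
        print(requested, "->", pick_server(requested, pop_servers))
```

With this kind of scheme, removing a server that was not selected for a given URL leaves that URL's assignment unchanged, so only the content tasked to the removed server needs to be redistributed.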
The growing use of secure connections has caused many existing persistent request distribution schemes to fail. With secure connections, such as Secure Sockets Layer (SSL) or Transport Layer Security (TLS) connections, the client object request is encrypted. The load balancer is unable to inspect the request without establishing the secure connection with the client and performing computationally expensive decryption operations. This can create a bottleneck if the load balancer is the termination point for connections of the PoP and performs decryption for all connections and requests directed to the set of servers.
A further issue is transferring a secure connection from the load balancer to one of the servers so that the server may respond to the client request over the secure connection. Without secure connections, the packets could simply be forwarded to the client either through or around the load balancer. With secure connections, the packets served by the server have to be encrypted using the encryption parameters for the secure connection. If the secure connection is established with the load balancer, the server has to pass the content or services to the load balancer so that the load balancer can encrypt the objects before they are sent to the client over the secure connection. Here again, the load balancer becomes a bottleneck. Alternatively, the load balancer could engage in time- and resource-intensive operations to hand off the secure connection to the server. This becomes infeasible as the number of secure connections increases. If the load balancer were to forgo establishing the secure connection with the client, it would be unable to receive or inspect the encrypted object request, and would therefore be unable to perform a persistent request distribution.
The shift from HyperText Transfer Protocol (HTTP) version 1 to HTTP/2 has also caused many existing persistent request distribution schemes to fail, though for different reasons. HTTP/2 allows multiple object requests for different content or services to be passed over the same connection. The requested content or services may be served from different servers. Since there is one connection over which the requests are sent, the load balancer is limited to sending the requests to one server. The receiving server can be overloaded if it receives too many such requests over a short period of time. Alternatively, the load balancer can perform a repeated hand-off and hand-back of the connection so that each incoming request over that connection is distributed to a different server. As noted above, each such connection hand-off or hand-back is both time-consuming and resource-intensive for both the load balancer and the servers.
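To make the single-connection limitation concrete, the following sketch counts how many requests multiplexed over one connection reach a server that is not tasked with them when the whole connection is pinned to a single server. The modulo-hash mapping, the `tasked_server` and `misrouted_on_pinned_connection` helpers, and the server names are hypothetical and stand in for whatever persistent mapping the PoP actually uses.

```python
import hashlib


def tasked_server(url: str, servers: list[str]) -> str:
    """Hypothetical persistent mapping: hash the URL, pick a server by modulo."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def misrouted_on_pinned_connection(
    urls: list[str], servers: list[str], pinned: str
) -> int:
    """Count requests whose tasked server differs from the pinned server.

    All requests on the connection are delivered to `pinned`, so every
    request counted here lands on a server not tasked with serving it.
    """
    return sum(1 for url in urls if tasked_server(url, servers) != pinned)


if __name__ == "__main__":
    # Hypothetical servers and a burst of requests sharing one connection.
    pop_servers = ["server-a", "server-b", "server-c"]
    stream_urls = [f"https://example.com/assets/{i}.js" for i in range(12)]
    pinned_to = tasked_server(stream_urls[0], pop_servers)
    count = misrouted_on_pinned_connection(stream_urls, pop_servers, pinned_to)
    print(f"{count} of {len(stream_urls)} requests reach a non-tasked server")
```

Pinning the connection to the server tasked with the first request is only one possible policy. Under any pinning policy, every request whose tasked server differs from the pinned server is misrouted.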
Losing persistent request distribution can lead to extensive cache pollution. Cache pollution occurs when the same content or services are inefficiently cached by multiple servers, thereby reducing the aggregate cache footprint of the set of servers. Losing persistent request distribution also leads to significant intra-PoP cross traffic. When a first server of a set of servers operating in a PoP receives an object request for content or services that it has not cached and is not tasked with serving, that first server will attempt to retrieve the content or service from a second server of the set of servers that is tasked with serving that content or service. This intra-PoP retrieval is faster than if the first server were to retrieve the content or service from a remote origin server outside the PoP. However, the intra-PoP retrieval consumes server bandwidth that would otherwise be used in responding to client requests and serving content and services to the requesting clients.
In the worst-case scenario, half of all bandwidth in the PoP could be lost to this cross retrieval of content and services. Such a loss in bandwidth leads to a significant degradation in the performance of the set of servers and their ability to respond to incoming object requests.
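The worst-case figure above can be reproduced with a simplified model. The sketch assumes that all responses are the same size, that nothing is cached on a non-tasked server, and that each misrouted request triggers exactly one full intra-PoP transfer before the response is served to the client. The function names and the uniform-random distribution model are assumptions made for illustration.

```python
def cross_traffic_share(misrouted_fraction: float) -> float:
    """Share of all bytes moved in the PoP that is intra-PoP cross traffic.

    Each request costs one unit of client-facing transfer; a misrouted
    request costs one additional unit of server-to-server transfer.
    """
    if not 0.0 <= misrouted_fraction <= 1.0:
        raise ValueError("misrouted_fraction must be between 0 and 1")
    return misrouted_fraction / (1.0 + misrouted_fraction)


def misrouted_fraction_uniform(num_servers: int) -> float:
    """Fraction of requests reaching a non-tasked server when requests are
    spread uniformly at random and each object has exactly one tasked server.
    """
    if num_servers < 1:
        raise ValueError("num_servers must be at least 1")
    return (num_servers - 1) / num_servers


if __name__ == "__main__":
    # Worst case: every request is misrouted, so half of all bytes moved
    # are cross traffic.
    print(cross_traffic_share(1.0))
    # Uniform random distribution across a hypothetical four-server PoP.
    print(cross_traffic_share(misrouted_fraction_uniform(4)))
```

Under these assumptions the cross-traffic share approaches one half as the number of servers grows, because nearly every request then lands on a server that is not tasked with it.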
Accordingly, there is a need to minimize the cache pollution and amount of intra-PoP cross traffic that results when losing the ability to persistently distribute specific object requests to specific servers. There is therefore a need to preserve or adapt persistent request distribution for at least some subset of the set of encrypted requests arriving over secure connections. There is also a need to preserve or adapt persistent request distribution for at least some subset of the set of multiple requests arriving over the same single connection without creating a bottleneck at the load balancer or PoP point of ingress.