Network data transfer facilitates much of the distribution of digital content today. Through the Internet and other such networks, computers, electronic devices, and other network-enabled appliances receive news, music, videos, games, data, communications, etc. from any number of content providers located throughout the world. As the number of clients (i.e., content requestors) increases and as the size of the content being distributed increases, so too does the amount of resources needed to distribute such content. Consequently, content providers have turned to server farms and Content Delivery Networks (CDNs) to provide the necessary resources to accommodate the increasing demands of content requesting clients.
FIG. 1 is an exemplary server farm architecture 105 used by content providers and CDNs. More specifically, the architecture 105 is representative of a particular point-of-presence (POP) of a content provider or a CDN which may have many such POPs geographically distributed. The architecture 105 includes a core router 110, multiple directors 120, and multiple servers 130.
The core router 110 attaches the POP to an external network such as the Internet and routes Internet Protocol (IP) datagrams into and out of the POP. In many instances, the core router 110 distributes incoming IP datagrams to the directors 120 based on a hash of the source address contained in the datagrams. The core router 110 is a device available from a number of vendors including but not limited to Cisco Systems, Juniper Networks, and Brocade.
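The source-address hashing described above can be sketched as follows. This is a minimal illustration only: the function name pick_director, the director labels, and the choice of SHA-256 are assumptions made for the example and do not describe how any particular vendor's router computes its hash.

```python
import hashlib
import ipaddress


def pick_director(source_ip: str, directors: list[str]) -> str:
    """Map a datagram's source address to one director via a stable hash.

    Hypothetical helper: the hash function and the modulo mapping are
    illustrative assumptions, not a description of a real router.
    """
    if not directors:
        raise ValueError("at least one director is required")
    # Use the packed (binary) form so IPv4 and IPv6 are handled alike.
    packed = ipaddress.ip_address(source_ip).packed
    digest = hashlib.sha256(packed).digest()
    index = int.from_bytes(digest[:8], "big") % len(directors)
    return directors[index]


directors = ["director-a", "director-b", "director-c"]  # hypothetical labels
first = pick_director("203.0.113.7", directors)
again = pick_director("203.0.113.7", directors)
```

Because the mapping depends only on the source address, datagrams from the same client reach the same director for as long as the set of directors is unchanged.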
The directors 120 perform load-balancing functionality to distribute load to the servers 130. When selecting which server of the set of servers 130 to distribute load to, the directors 120 may use any load-balancing algorithm, such as a round-robin distribution algorithm or more complicated algorithms that take into account the status of each server in the set of servers 130. The directors 120 include commercially available load-balancing equipment that is often built using Intel® servers running Linux. The load-balancing functionality may be implemented using a Linux kernel module known as Linux Virtual Server (LVS).
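As a point of reference, a round-robin distribution algorithm of the kind mentioned above can be sketched in a few lines. The class name RoundRobinBalancer and the server labels are hypothetical, and the sketch is not intended to reflect how LVS implements its schedulers.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order (illustrative sketch)."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the sequence endlessly in its original order.
        self._rotation = cycle(servers)

    def next_server(self):
        """Return the next server in the rotation."""
        return next(self._rotation)


balancer = RoundRobinBalancer(["server-1", "server-2", "server-3"])
picks = [balancer.next_server() for _ in range(5)]
```

Round-robin ignores both server status and the content being requested, which is why the more complicated algorithms mentioned above exist.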
The servers 130 host content from one or more content providers. The hosted content can be mirrored across each server of the set of servers 130 or can be segmented such that each server of the set of servers 130 is responsible for distributing unique content. The servers 130 include large amounts of persistent storage in the form of solid-state drives and traditional disk drives.
While the server farm architecture 105 of FIG. 1 and other similar architectures can be scaled to meet increased demand, inherent architectural shortcomings result in inefficient usage of resources and poor scalability. FIGS. 2 and 3 illustrate two common methods of operating a typical server farm architecture and the shortcomings, particularly those affecting scalability, associated with each method.
In FIG. 2, a director 210 establishes a network connection 220 with a client 230 in order to receive a content request from the client 230. The network connection 220 may include a Transmission Control Protocol (TCP) connection and the content request may include a HyperText Transfer Protocol (HTTP) request that is sent over the TCP connection. The content request identifies the particular content being requested by the client 230. Upon receiving the content request, the director 210 makes an intelligent routing decision to determine which server of the set of servers is responsible for hosting the requested content. In this method of operation, the same content need only be cached or hosted at a single server and all requests for that content are served from that server, thereby maximizing the storage utilization of the servers. Moreover, this creates strong locality of reference in the server farm and increases performance substantially over essentially random routing algorithms.
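One simple way to realize such a content-aware routing decision is to hash the requested URL onto the set of servers, so that every request for the same content resolves to the same server. The sketch below assumes this hashing approach purely for illustration; the text above does not specify how the director 210 makes its determination, and the name responsible_server, the server labels, and the URLs are hypothetical.

```python
import hashlib


def responsible_server(url: str, servers: list[str]) -> str:
    """Return the single server assumed to host the content at `url`.

    Illustrative assumption: responsibility is assigned by hashing the URL.
    """
    if not servers:
        raise ValueError("at least one server is required")
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["server-1", "server-2", "server-3"]  # hypothetical labels
url = "http://example.com/videos/clip.mp4"      # hypothetical request
chosen = responsible_server(url, servers)
chosen_again = responsible_server(url, servers)
```

Under this scheme each item is cached at exactly one server, which is the storage-utilization and locality benefit described above. A plain modulo mapping reassigns most content when the number of servers changes, so it should be read as a sketch rather than a recommended design.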
In FIG. 2, the director 210 determines that server 240 is responsible for hosting the requested content. The director 210 then establishes a second network connection 250 with the selected server 240 in order to forward the content request to the selected server 240. Upon receiving the content request, the selected server 240 distributes the requested content to the client 230. The selected server 240 may retrieve the requested content from an origin server when the content has not been previously cached in its local storage or the selected server 240 may distribute the content from its local storage when the content has been previously stored or cached to the server's storage. The requested content is passed from the server 240 to the client 230 through each of the established network connections 220 and 250.
In this method of operation, resources of the director 210 are unnecessarily consumed (i) by maintaining at least two network connections (see 220 and 250 of FIG. 2) for each client or for each content request and (ii) by having the requested content be forwarded through the director 210 in order to reach the client 230. This consumes critical resources of the director 210 including processing cycles, memory, and network bandwidth. As a result, the director 210 is limited in the number of content requests that it can handle as its resources are also being consumed maintaining network connections and forwarding content back to the client. This further degrades the overall performance within the server farm as the internal passage of content between the server and the director occupies intra-POP bandwidth that is otherwise needed by other directors in routing content requests to the servers. This also increases the cost of operating the server farm as each director is capable of handling fewer incoming requests and additional bandwidth is needed to handle the intra-POP traffic.
FIG. 3 illustrates an alternative method of operating a server farm. In this figure, the director 310 performs basic load-balancing functionality to distribute load across the set of servers. Specifically, the director 310 does not terminate a network connection with the client 320. As a result, the director 310 does not receive and does not inspect the content request from the client 320, and the director 310 is therefore unable to base its load-balancing decision on which server of the set of servers is responsible for hosting the requested content. Rather, the load-balancing decision is based on other factors, for example which server is least loaded or which server has the fewest active network connections.
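A decision based on the fewest active network connections, one of the factors named above, can be sketched as follows. The function name least_connections and the connection counts are hypothetical values chosen for the example.

```python
def least_connections(active: dict[str, int]) -> str:
    """Return the server with the fewest active connections.

    On a tie, the server listed first in `active` is returned.
    """
    if not active:
        raise ValueError("at least one server is required")
    return min(active, key=active.get)


# Hypothetical snapshot of active connections per server.
active = {"server-1": 12, "server-2": 4, "server-3": 9}
choice = least_connections(active)
```

Note that the decision uses only connection counts. Nothing about the requested content enters into it, which is the limitation described above.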
In this figure, the director 310 forwards packets from the client 320 to the server 340. The server 340 establishes a first network connection 330 with the client in order to receive the content request from the client 320. The server 340 then performs a routing procedure to identify which server of the set of servers is responsible for hosting the requested content. As noted above, by ensuring that each server uniquely hosts content, usage of the storage resources of the set of servers is maximized since the same content is not redundantly stored at multiple servers.
When the server 340 is responsible for hosting the requested content, the requested content is passed through the network connection 330 to the client 320. However, it is likely that the server 340 does not host the requested content. In this figure, the server 340 identifies server 350 as the appropriate server for hosting the requested content. Therefore, a second network connection 360 is established between the server 340 and the server 350. The content request is forwarded from the server 340 to the server 350. A proxy-HTTP connection may facilitate the forwarding of the content request over the network connection 360. The server 350 will attempt to satisfy the request from cache. If the requested content is not present in cache, the server 350 retrieves the content from an origin server using a third network connection (not shown). The server 350 forwards the requested content through the network connection 360 to the server 340, which then forwards the requested content outside the server farm to the client 320.
In this method of operation, resources of the directors are no longer consumed in forwarding content from the servers to the clients. This is because the first network connection 330 is established between the server 340 and the client 320 and the content can be passed through this connection 330 using direct server return, direct routing, or IP forwarding techniques in a manner that avoids the director 310 as a hop. Moreover, other resources of the directors are freed as the directors no longer have to maintain multiple network connections. Accordingly, resources of the directors are fully dedicated to performing load-balancing functionality. However, this method of operation requires that a second level of load-balancing be introduced at the servers so that the content request can be forwarded to the appropriate server that is responsible for hosting the requested content. Therefore, when the server selected by the directors is not responsible for hosting the requested content, the requested content will still pass through the server farm's internal network twice (e.g., passing content from the server 350 to the server 340 before passing to the client). This requires the server farm operator to incur high monetary and operational costs to maintain at least a 2-to-1 ratio of internal-to-external bandwidth capacity. Furthermore, resources of the servers are unnecessarily consumed in maintaining network connections amongst other servers of the set of servers.
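The 2-to-1 figure can be motivated with a rough calculation. The sketch below assumes, for illustration only, that each content item is hosted by exactly one of N servers and that the director is equally likely to select any server, so the selected server is not the hosting server with probability (N - 1) / N. Each unit of content crosses the internal network once on its way out, plus once more whenever a server-to-server transfer is needed. The function name and the uniform-selection model are assumptions, not figures taken from the text above.

```python
def internal_to_external_ratio(num_servers: int) -> float:
    """Estimate internal traffic per unit of external traffic.

    Assumes unique hosting and uniform server selection (illustrative model).
    """
    if num_servers < 1:
        raise ValueError("num_servers must be at least 1")
    # Probability that the selected server does not host the content.
    miss_probability = (num_servers - 1) / num_servers
    # One internal crossing always, plus one more on every miss.
    return 1.0 + miss_probability


ratio_single = internal_to_external_ratio(1)
ratio_four = internal_to_external_ratio(4)
ratio_hundred = internal_to_external_ratio(100)
```

Under these assumptions the ratio is 1.0 for a single server, 1.75 for four servers, and approaches 2 as the number of servers grows, which is consistent with provisioning for a 2-to-1 ratio.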
Accordingly, there is a need to reduce resource usage in the server farm in order to improve the scalability of the server farm. More specifically, there is a need for a server farm or CDN architecture and operational method that intelligently routes user content requests to the appropriate hosting server without the need to redundantly forward the requested content within the server farm and without the need to maintain multiple network connections for each content request.