With the proliferation of internet-connected devices, internet traffic continues to grow, and content delivery networks (CDNs) keep increasing in scale. Load balancing has therefore become an important means of controlling costs.
The main function of a CDN is to transmit content from the source station to the client terminal in the shortest possible time. The basic idea of a CDN is to avoid bottlenecks and links in the network that may affect the speed and stability of data transmission, so that content can be delivered faster and more reliably. By deploying a content delivery network formed by edge node servers at various locations in the network, a user's request can be redirected to the best edge node closest to the user based on comprehensive information including network traffic, the load at each edge node, the distance from an edge node to the user, and response time. In such a system, a global scheduling layer is added to the existing network infrastructure, and the content provided by the source station is delivered to the network edge closest to the user, so that the user can obtain the desired content nearby, network congestion is relieved, and the response time when the user accesses a website is reduced. This alleviates problems such as small bandwidth at the exit of the source station, a high volume of user access, non-uniformly distributed network outlets, complex carrier networks, and slow site access caused by the small bandwidth of the user's network.
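The edge node selection described above can be illustrated with a minimal Python sketch. It assumes the scheduler already has per-node load, distance, and response-time figures, and it combines them with a simple weighted sum. The `EdgeNode` type, the weights, and the node names are hypothetical and serve only as an illustration; an actual CDN may weigh these factors very differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EdgeNode:
    name: str
    load: float          # fraction of capacity in use, 0.0 to 1.0
    distance_km: float   # estimated distance from the node to the user
    response_ms: float   # recently measured response time


# Hypothetical weights; a real scheduler would tune or replace these.
W_LOAD = 100.0
W_DISTANCE = 0.05
W_RESPONSE = 0.5


def score(node: EdgeNode) -> float:
    """Combine load, distance, and response time; lower is better."""
    return (W_LOAD * node.load
            + W_DISTANCE * node.distance_km
            + W_RESPONSE * node.response_ms)


def pick_edge_node(nodes: List[EdgeNode]) -> EdgeNode:
    """Return the node with the lowest combined score."""
    return min(nodes, key=score)


nodes = [
    EdgeNode("edge-near-busy", load=0.95, distance_km=20, response_ms=80),
    EdgeNode("edge-near-idle", load=0.30, distance_km=50, response_ms=25),
    EdgeNode("edge-far-idle", load=0.10, distance_km=1500, response_ms=120),
]
best = pick_edge_node(nodes)
print(best.name)
```

With these illustrative figures, the nearest node is passed over because it is heavily loaded, and the lightly loaded node that is far away is passed over because of its distance and response time.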
The load balancing techniques in a conventional CDN mainly include domain name system (DNS) scheduling and hypertext transfer protocol (HTTP) scheduling. DNS scheduling is mainly used for load balancing of images and dynamic site acceleration content. HTTP scheduling is based on the 30x redirect response codes of HTTP and is mainly used for media streaming applications. Because media streaming accounts for a significant portion of the CDN business, HTTP scheduling is an important means for the CDN to control load balancing across servers and computer rooms.
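As a rough illustration of HTTP scheduling, the following Python sketch shows a scheduling node that serves no content and answers every GET request with a 302 redirect. It uses only the standard-library `http.server` module. The host name `edge1.example.com`, the helper `redirect_location`, and the port are assumptions made for the example, and the choice of target server is hard-coded here rather than computed from load data.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical edge server that the scheduler has chosen for this request.
TARGET_HOST = "edge1.example.com"


def redirect_location(path: str, target_host: str = TARGET_HOST) -> str:
    """Build the URL that the client is redirected to."""
    return f"http://{target_host}{path}"


class SchedulerHandler(BaseHTTPRequestHandler):
    """Scheduling node: serves no content, only issues redirects."""

    def do_GET(self):
        self.send_response(302)  # temporary redirect, one of the 30x codes
        self.send_header("Location", redirect_location(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the scheduler on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), SchedulerHandler).serve_forever()
```

Calling `serve()` and requesting `http://127.0.0.1:8080/video/a.flv` would return a 302 response whose `Location` header points at the same path on the chosen edge server; the client then fetches the media stream from that server directly.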
HTTP scheduling includes a central mode and an edge mode. In the central mode, users' requests are directed through the DNS to several groups of scheduling computers. These scheduling computers do not serve content. Instead, scheduling proceeds in two parts. First, a decision-making computer analyzes the bandwidth data of all the computer rooms and server clusters in the CDN and adjusts the traffic allocated to the server clusters in a timely manner; the scheduling computers then execute this scheduling so that the load across the server clusters is balanced. Second, the scheduling computers perform hash calculations on the content of each request and select the corresponding server, which reduces content redundancy among the server clusters and among the servers within a cluster. In the edge mode, users' requests are distributed through the DNS to an edge server cluster. The scheduling computer in that edge server cluster checks the state of the cluster and of its computer room, and decides whether to serve the request locally or to redirect it to a backup server cluster, thereby achieving load balancing.
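The two decisions described above can be sketched in Python as follows. The first function shows one simple way to map a request to a server by hashing its URL, so that the same content is always requested from the same server. The second shows an edge-mode choice between local service and redirection. The function names, the MD5-with-modulo scheme, and the 0.8 load threshold are all assumptions for illustration; the text does not specify which hash or which criteria are actually used.

```python
import hashlib
from typing import List


def pick_server_by_hash(request_url: str, servers: List[str]) -> str:
    """Map a request URL to one server so repeated requests for the
    same content land on the same machine."""
    digest = hashlib.md5(request_url.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Hypothetical limit: above this fraction of capacity, stop serving locally.
LOAD_THRESHOLD = 0.8


def edge_decision(cluster_load: float, room_load: float,
                  backup_cluster: str) -> str:
    """Serve locally if both the cluster and its computer room have
    headroom; otherwise redirect to the backup cluster."""
    if cluster_load < LOAD_THRESHOLD and room_load < LOAD_THRESHOLD:
        return "serve-local"
    return f"redirect:{backup_cluster}"


servers = ["srv-a", "srv-b", "srv-c"]
chosen = pick_server_by_hash("/video/a.flv", servers)
print(chosen, edge_decision(0.5, 0.6, "backup-1"))
```

Plain modulo hashing remaps most URLs whenever the server list changes; consistent hashing is a common alternative when servers are added or removed frequently.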