Embodiments of the invention relate generally to information retrieval in a network and, more particularly, to hosting and distributing content on a content delivery network such as the Internet.
The World Wide Web is the Internet's content retrieval system. In the Web environment, client systems effect transactions to Web servers using the Hypertext Transfer Protocol (HTTP), which provides clients with access to files (e.g., text, graphics, images, sound, video, etc.) using a standard page description language, for example, Hypertext Markup Language (HTML). A network path to a server is identified by a so-called Uniform Resource Locator (URL) having a special syntax for defining a network connection. Use of a Web browser at a client (end user) system involves specification of a link via the URL. In response, the client system makes a request to the server (sometimes referred to as a “Web site”) identified in the link and, in return, receives content from the Web server. The Web server can return different content data object types, such as .gif and jpeg files (graphics), .mpeg files (video), .wav files (audio) and the like.
As the Web server content provided by World Wide Web has continued to grow over time, so too has the number of users demanding access to such content. Unfortunately, the ever-increasing number of end users requesting Web content from Web sites has resulted in serious bandwidth and latency issues, which manifest themselves in delay to the end user.
To address these problems, many networking product and service providers have developed solutions that distribute Web site content across the network in some manner. One class of solutions involves replicating Web servers at multiple locations and directing traffic (by modifying the URL and forwarding, or using HTTP re-direct) to the “best” server based on a predefined selection policy, e.g., load balancing, network topology. Another class of solutions distributes content strategically and/or geographically, and often uses some type of centralized or hierarchical Domain Name System (DNS)-based site selection. The distributed sites include servers that perform reverse proxy with (or without) caching. One such technique routes traffic to a content distribution site nearest the requestor by modifying URLs in the top-level Web page. Other DNS techniques use a round robin traffic distribution to distribute load to the content sites, but do not take into account the location of the requester relative to those content sites.