Content Delivery Service Providers (CDSP), such as Akamai Technologies, Inc. of Cambridge, Mass., Speedera Networks, Inc. of Santa Clara, Calif., among others, distribute content from origin sites to cache servers on the edge of the network, and deliver content to the users from these edge servers (referred to as Content Delivery Servers (CDSs)). The distribution mechanism may be based on push technologies such as multicasting the data to all the edge servers through terrestrial or satellite links, and/or pull technologies such as those used by proxies. The goal is to decrease the latency of user access to the objects by delivering the objects from an edge server closest to the user.
A content delivery network (CDN) comprises a set of CDSs dispersed across the Internet, as well as a domain name server (DNS) infrastructure, which is used to route user requests to the nearest CDS. The DNS requests sent from the user browser need to be directed to the DNS of the CDSP. One technique is for the CDSP to take over the DNS functionality of the origin site so as to become the “authoritative DNS” for the origin site. This approach is easy to implement, but its drawback is that all the objects from the domain that has been taken over need to be served from the content delivery servers. This technique will not work, for example, if all the html pages must be served from the origin site while all the images (e.g., gifs and jpgs) are served from the CDSs.
In a second approach, termed “uniform resource locator (URL) rewrite,” the authoritative DNS functionality still stays with the origin site's DNS. Any top level page requested by a user will be served from the origin server. However, before the page is served, some or all of the embedded links found in the top level page are rewritten to point to the CDSP DNS, so that requests to embedded objects can be redirected by the CDSP DNS to the closest CDS.
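The URL rewrite step described above can be illustrated with a minimal sketch. It assumes a hypothetical CDSP hostname (`cdn.cdsp.example`) and a hypothetical origin hostname, and it uses a regular expression over `<img>` tags only as a simplification; a production rewriter would parse the HTML properly.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Hypothetical CDSP hostname; not taken from the text.
CDSP_HOST = "cdn.cdsp.example"

# Matches the src="..." attribute of <img> tags (a simplification of real HTML parsing).
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def rewrite_embedded_links(html, origin_host):
    """Point embedded image URLs hosted on origin_host at the CDSP host."""

    def _swap(match):
        parts = urlsplit(match.group(2))
        if parts.netloc != origin_host:
            # Leave relative links and links to other hosts untouched.
            return match.group(0)
        rewritten = urlunsplit(
            (parts.scheme, CDSP_HOST, parts.path, parts.query, parts.fragment)
        )
        return match.group(1) + rewritten + match.group(3)

    return IMG_SRC.sub(_swap, html)
```

In this sketch the top level page and its hyperlinks keep the origin hostname, so only the embedded images trigger a lookup at the CDSP DNS.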
The main objective of performing user specific request redirection is to provide service differentiation based on user specific information. User specific request redirection refers to the process of redirecting requests to the same set of embedded objects in a top level page, arriving from different users, to different CDSs, based on user and/or other information. User specific request redirection is useful for providing quality-of-service (QOS) differentiation, elimination of inaccuracies due to DNS based redirection, and performance enhancement.
With user specific redirection, it becomes possible for the content provider to instruct the CDSP to provide different service levels to different users. For example, if two users carrying cookies “priority=high” and “priority=low” access a top-level page, the CDSP could provide the requested page with some or all URLs rewritten in different ways, so that the service level provided to these two users, when they access the embedded objects in the top-level page, is different. The CDSP may provide service differentiation by using DNS hierarchy, URL rewriting, or a combination of both.
Service differentiation using DNS hierarchy redirects the embedded object URLs to two different CDSP DNS hierarchies. One DNS hierarchy resolves the DNS query to one of a set of fast servers close to the user, and the other DNS hierarchy resolves the query to one of a set of slower servers, or a server that is on a site with limited bandwidth. In addition, requests from two different users could be redirected to two different service provider networks. For example, a high priority user may be directed to www.cdsp1.com DNS hierarchy, which provides fast service but costs more, while the low priority user may be directed to www.cdsp2.com, which provides slower service but is cheaper.
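As a rough illustration of hierarchy-based differentiation, the sketch below rewrites the hostname of an embedded object URL according to a user priority. The two hostnames are the ones used in the example above; the priority labels, the fallback to the cheaper hierarchy, and the function name are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Hostnames come from the example in the text; the priority labels are assumed.
HIERARCHY_BY_PRIORITY = {"high": "www.cdsp1.com", "low": "www.cdsp2.com"}


def redirect_to_hierarchy(url, priority):
    """Rewrite the URL's host so that it resolves through the chosen DNS hierarchy."""
    # Unknown priorities fall back to the cheaper hierarchy (an assumption).
    host = HIERARCHY_BY_PRIORITY.get(priority, HIERARCHY_BY_PRIORITY["low"])
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```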
Alternatively, service differentiation may be provided through URL prefixes. When URL rewriting is used to redirect user requests, different prefixes can be added to the URL path to indicate to the CDS the service level that the user should receive. For example, if embedded objects are prefixed with level 1 when the top level page is served to one user, and with level 2 when it is served to another user, then when both requests arrive at the CDS, the request with the level 1 prefix should receive better service than the request with the level 2 prefix.
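The prefix scheme can be sketched as follows, assuming the service level is derived from a `priority` cookie as in the earlier example. The prefix strings (`level1`, `level2`), the default for users without the cookie, and the CDS hostname passed in are hypothetical choices, not details given in the text.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit, urlunsplit

# Hypothetical mapping from the "priority" cookie value to a URL path prefix.
PREFIX_BY_PRIORITY = {"high": "level1", "low": "level2"}
DEFAULT_PREFIX = "level2"  # assumed default when the cookie is absent or unknown


def service_prefix(cookie_header):
    """Derive the service-level prefix from a raw Cookie header value."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("priority")
    value = morsel.value if morsel is not None else ""
    return PREFIX_BY_PRIORITY.get(value, DEFAULT_PREFIX)


def rewrite_for_user(url, cookie_header, cds_host):
    """Rewrite an embedded object URL with a per-user service-level prefix."""
    parts = urlsplit(url)
    path = "/" + service_prefix(cookie_header) + parts.path
    return urlunsplit((parts.scheme, cds_host, path, parts.query, parts.fragment))
```

Because the prefix travels in the URL path, the CDS can read the service level directly from the request without consulting the DNS.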
Of the two redirection mechanisms discussed above, the authoritative DNS is not flexible enough to provide user specific request redirection because DNS requests do not carry any user specific information. Although the URL rewrite is a possible option for this purpose, it is not deemed a viable option for user specific redirection in practice. The URL rewrite process involves rewriting the embedded links inside an html page to point to the CDSP DNS. For the most part, embedded objects that are automatically retrieved by the browser are images that are large and will benefit from being served from the edge. During the content delivery process, the request to the top level page is sent to the content provider's origin server. DNS requests to resolve the rewritten URLs inside the top level page will then be sent to the CDSP's DNS. The CDSP DNS server returns the IP address for the CDS which is “closest” to the user. The requests for these objects are then sent to this CDS, where the objects are retrieved.
Normally, the process of URL rewrite is performed statically using a commercially available software tool, where pages that need to be rewritten are parsed and the embedded URLs are rewritten. This technique does not provide the flexibility to rewrite the same page in multiple different ways. For example, static URL rewriting does not provide different versions of the rewritten page to different users, in order to personalize downloads of embedded objects based on information found in user requests (which is what is needed for user specific redirection). Thus, static URL rewriting is not a valid option for user specific redirection.
CDSP companies have also partnered with caching and switching vendors to provide what is referred to as “dynamic URL rewrite.” In this technique, a reverse proxy cache or a load balancing switch is placed in front of the content provider's servers to perform URL rewrite on the objects. The transformations that are required to be performed are downloaded into the device. When a user request is received at the device for the first time, the request is sent to a server and the object is retrieved. This object is then URL-rewritten at the proxy/switch and then sent to the user.
There are two obvious ways in which dynamic URL rewrite may be used to implement user specific redirection. One technique is that every time a user request is received, the html page is parsed and the embedded URLs are transformed appropriately based on the requesting user. For example, if the transformation is performed based on IP addresses, each URL will be transformed to point to the IP address of the best server for that user. The first technique for dynamic URL rewrite may not be practical as the html page needs to be parsed every time a user request is received.
A second technique is that all the embedded URLs in the page are rewritten a priori in all possible ways in which it needs to be delivered to the users and all these copies are cached. When a user request is received, the appropriate page is delivered. The second technique may only be possible if the number of ways in which URL rewrite needs to be performed on a page is small. However, if the number of ways in which URL rewrite is performed is large, then the second technique is not viable.
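The a priori approach amounts to keeping one pre-rewritten copy of the page per rewrite target. The sketch below assumes, for illustration only, that the targets are server IP addresses and that a plain string substitution stands in for the rewrite; the function name and addresses are hypothetical.

```python
from typing import Dict, List


def prebuild_variants(page: str, origin_host: str, server_ips: List[str]) -> Dict[str, str]:
    """Cache one rewritten copy of the page for every possible target server."""
    # A plain substitution stands in for a real URL rewrite (an assumption).
    return {ip: page.replace(origin_host, ip) for ip in server_ips}


page = '<img src="http://www.origin.example/img/a.gif">'
variants = prebuild_variants(
    page, "www.origin.example", ["192.0.2.10", "192.0.2.20", "192.0.2.30"]
)
# The number of cached copies grows linearly with the number of targets.
copies_per_page = len(variants)
```

With three targets the cache holds three copies of every page, which is why the approach stops being practical as the number of targets grows.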
Specifically, URL rewrite may be performed by using the IP address or domain name of the server. If the transformation is performed using IP addresses, then the number of ways in which URL rewrite can be performed on a page equals the number of server IP addresses. However, this will become impractical even for a reasonably large number of servers. Providing IP address resolution using the DNS of the CDSP also has certain drawbacks, such as source IP address inaccuracies and inaccuracies due to DNS caching.
For example, when a DNS request is received, the CDSP DNS checks the source IP address on the request, and based on this, returns the IP address of the CDS “closest” to the source IP address. This decision is made based on the assumption that the source IP address on the request is either the IP address of the user or one “close” to the user. But this may not be the case in practice. The source address on the DNS request is the IP address of the entity that sends the DNS request to the CDSP DNS. Normally, this is the local DNS server on the user site. Depending on how DNS requests are forwarded, it could also be a DNS server further along the hierarchy from the local DNS. Therefore, the selected CDS is “closest” to the entity that sends the DNS request, but not necessarily “closest” to the user. Server selection based on local DNS server IP addresses can result in a non-optimal server selection, since users are frequently distant from their local DNS servers.
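The source address inaccuracy can be shown with a toy model. The sketch assumes hypothetical one-dimensional "network positions" for two CDSs, a user, and the user's local DNS server; real systems would rely on measured latency or topology rather than a single coordinate.

```python
# Hypothetical one-dimensional network positions (not from the text).
CDS_POSITIONS = {"cds-east": 10, "cds-west": 90}


def closest_cds(position):
    """Return the CDS whose position is nearest to the given position."""
    return min(CDS_POSITIONS, key=lambda name: abs(CDS_POSITIONS[name] - position))


user_position = 85
local_dns_position = 20  # the only address the CDSP DNS actually sees

chosen = closest_cds(local_dns_position)  # what DNS-based selection returns
ideal = closest_cds(user_position)        # what the user would have needed
```

In this model the CDSP DNS picks the CDS nearest the resolver, which differs from the CDS nearest the user whenever the two sit far apart.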
Inaccuracy due to DNS caching may occur when the CDSP DNS returns the IP address of the “closest” CDS. This IP address is cached by the browser and subsequently used to resolve domain names to IP addresses locally. This means that subsequent DNS queries to the same domain name will not be sent to the CDSP DNS until the cached information is flushed. A non-optimal CDS may be used for this period of time if the network conditions change. Similarly, the local DNS, or one of the DNS servers upstream towards the CDSP DNS, could also cache DNS information. This type of DNS caching may lead to inaccurate server selection. One way to address this issue is to specify a DNS timeout (i.e., time-to-live (TTL)) that is very small. There are, however, two problems with this approach. One problem is that the DNS caches do not need to obey the timeouts.
A second problem is that it is difficult to select this timeout. The timeout needs to be small enough so that dynamic server selection is possible. On the other hand, a DNS timeout value that is too small will lead to very frequent DNS lookups at the CDSP DNS server. In addition to the drawbacks discussed above, DNS requests themselves add to the response time when content is retrieved from a CDN. DNS requests for rewritten URLs account for a significant overhead and clearly reduce the benefits of having content replicated at the network edge. For example, studies have shown that the user response time to access a top-level page (where a user enters the URL in a browser or clicks on a link) needs to be within three to four seconds; otherwise, the user may stop the download of the top-level page.
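The DNS caching behavior and the timeout trade-off discussed above can be sketched with a small caching resolver. The class name, hostname, addresses, and the explicit `now` argument (used in place of a real clock) are all assumptions made for illustration; the sketch also assumes a cache that obeys the TTL, which, as noted, real caches need not do.

```python
class CachingResolver:
    """Toy DNS cache that honors a fixed TTL and counts upstream lookups."""

    def __init__(self, authoritative, ttl_seconds):
        self.authoritative = authoritative  # callable: name -> IP address
        self.ttl = ttl_seconds
        self.cache = {}  # name -> (ip, expiry time)
        self.upstream_lookups = 0

    def resolve(self, name, now):
        entry = self.cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # served from cache, possibly stale
        self.upstream_lookups += 1
        ip = self.authoritative(name)
        self.cache[name] = (ip, now + self.ttl)
        return ip


state = {"best": "192.0.2.10"}
resolver = CachingResolver(lambda name: state["best"], ttl_seconds=60)

first = resolver.resolve("img.cdsp.example", now=0)
state["best"] = "192.0.2.20"  # network conditions change
stale = resolver.resolve("img.cdsp.example", now=30)  # still within the TTL
fresh = resolver.resolve("img.cdsp.example", now=60)  # TTL expired, re-queried
```

A shorter TTL narrows the window in which the stale address is served, at the cost of more lookups reaching the CDSP DNS.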
Accordingly, neither authoritative DNS nor URL rewrite is a viable technique for providing user specific redirection. Therefore, there is a need in the art for a method and apparatus that enables efficient implementation of user specific redirection in a packet switched network.