A Content Delivery Network is a system of content-providing servers (or more generally, computers) generally operating under common control, the servers/computers each containing copies of items of data that one or more Content Provider organizations may wish to be able to provide (on request) to their existing or potential clients. The CDN computers are placed at various points in a data network so as to allow clients to access and obtain data content they require from a computer nearby (in the network), rather than all clients needing to access a single central server of the Content Provider organization in question.
Content Delivery Networks (henceforth CDNs) are increasingly being used by Content Provider organizations wishing to distribute their content to clients (sometimes referred to as “eyeballs”). The client devices can be as simple as web browsers, or may involve applications such as Internet Protocol television (IPTV) clients and set-top boxes.
The Content Provider organization is motivated in several ways. Firstly, using a CDN removes or reduces the requirement to host content on its own servers and to ensure that these servers can offer the capacity required by numerous clients. Secondly, the Content Provider reduces the capacity it requires from network service providers to connect its servers. Lastly, the CDN can provide the clients with a better experience of the content. In this regard the CDN provides multiple content hosting sites (known as “surrogates”) nearer to the client in the network. It does this by maintaining a large number of distributed surrogates, and selecting the one that is expected to provide the best performance (in terms of network location and load on the server). For reasons that will be explained below, CDNs often make a poor choice of surrogate, because they often cannot tell where the client actually is in the network.
Many of today's CDNs use the Domain Name System (DNS) to decide which surrogate should serve the client. To do so, the content is identified by a Uniform Resource Locator (URL) which contains a domain name registered to the CDN. This can be because a link inside a website has been re-written to incorporate the CDN domain name, or simply because the CDN is operated by the content provider itself (for example Google/YouTube).
An example of a process for enabling a client device to obtain data from a Content Provider via a suitably-selected surrogate of a CDN will now be described with reference to FIG. 1.
In FIG. 1, a client device 11 contacts its DNS resolver 12 (stage 1, indicated by the number “1” in a circle) with a DNS query in respect of data from a Content Provider from which the client intends to obtain data. The DNS resolver 12 is typically located in the network of, or operated under the control of, the client's Internet Service Provider (ISP), but may instead be provided by a third party. Two examples of such third-party DNS services are “Google DNS” and “Open DNS”. It should be noted that the practice of using non-ISP provided DNS resolvers is growing as users become disillusioned with ISP DNS performance or DNS hijacking practices (where queries for mistyped domain names are diverted, maliciously or otherwise, to marketing or other pages in order to generate “click-through” or sales revenue).
If the Content Provider organization intends the data item in question to be obtained from a CDN, it will generally have arranged for the DNS query from the client device to contain an indication at least of a domain name registered to the CDN. It should be noted that in receiving the DNS query from the client device, the DNS resolver 12 will (generally) also become aware (if it is not already aware) of the IP address of the client device 11. The IP address, which is effectively a routing identifier relating to the location in the network of the client device, enabling data to be routed thereto via an IP network, is generally needed by the DNS resolver for it to be able to provide a response to the client device once it has obtained the required information.
Before providing such a response, and in order to obtain the required information, the DNS resolver 12 identifies the appropriate CDN from the domain name information it has received, then interacts with a domain authoritative server 14 of the relevant CDN (stage 2), first contacting it, then receiving information from it. For reasons that will be explained below, the domain authoritative server 14 cannot (generally) ‘see’ the location of the client 11, but instead sees and uses the IP address of the DNS resolver 12 as representative of the location of the client 11, and selects a CDN surrogate 16 on the basis thereof. Where the DNS resolver 12 is provided centrally by an ISP, this location is often less granular than the distribution of the CDN surrogates also situated within the ISP network, often leading to poor or random selection of a CDN surrogate within that network, higher network costs and a less-satisfactory user experience. Further, where the DNS resolution is provided ‘off-net’ (i.e. by a third-party such as Open DNS or Google DNS, rather than by the ISP itself), the CDN may not even be able to tell which ISP the client 11 is using, which generally makes it difficult or impossible for an appropriate local CDN surrogate 16 to be selected and used (at least on the basis of the IP address seen by the domain authoritative server 14 of the CDN), and therefore generally leads to such content requests being served from CDN nodes at major peering locations.
A further interaction (shown as a dotted arrow marked as stage 3 in FIG. 1) may also happen between the DNS resolver 12 and a localized CDN DNS resolver 15. This may happen if, for example, the result of the interaction (stage 2) between the DNS resolver 12 and the domain authoritative server 14 of the CDN is merely to identify a cluster of CDN surrogates (those in a particular country, for example), in which case the possible further interaction (stage 3) between the DNS resolver 12 and the localized CDN DNS resolver 15 may be in order to identify a particular CDN surrogate within that cluster. This may be done in order to allow for load-balancing between the CDN surrogates within the cluster in question, for example.
After stage 2 (and possibly stage 3), the DNS resolver 12 then provides a response to the client device 11 (stage 4) indicating the selected CDN surrogate 16, allowing the client device 11 to request and obtain the required content from that CDN surrogate (stage 5).
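By way of illustration only, the conventional flow of stages 1 to 4 described above can be sketched as follows. The prefixes, host names and function names are hypothetical and are not taken from FIG. 1; the sketch merely shows that the surrogate is chosen from the address of the DNS resolver, so that clients at different locations sharing one resolver receive the same answer.

```python
import ipaddress

# Hypothetical table held by the CDN's domain authoritative server:
# resolver address prefixes mapped to the surrogate considered nearest.
SURROGATES_BY_RESOLVER_PREFIX = {
    ipaddress.ip_network("192.0.2.0/24"): "surrogate-london.cdn.example",
    ipaddress.ip_network("198.51.100.0/24"): "surrogate-manchester.cdn.example",
}
# Used when the resolver address is not recognised (e.g. an off-net resolver).
DEFAULT_SURROGATE = "surrogate-peering.cdn.example"


def authoritative_select(source_ip: str) -> str:
    """Stage 2: select a surrogate from the source address of the DNS query."""
    addr = ipaddress.ip_address(source_ip)
    for prefix, surrogate in SURROGATES_BY_RESOLVER_PREFIX.items():
        if addr in prefix:
            return surrogate
    return DEFAULT_SURROGATE


def resolve(client_ip: str, resolver_ip: str, domain: str) -> str:
    """Stages 1 to 4: the resolver knows client_ip but does not forward it.

    The authoritative server only ever sees resolver_ip.
    """
    return authoritative_select(resolver_ip)


if __name__ == "__main__":
    # Two clients in different places, one central ISP resolver: same surrogate.
    print(resolve("10.1.0.5", "192.0.2.53", "video.cdn.example"))
    print(resolve("10.2.0.9", "192.0.2.53", "video.cdn.example"))
    # The same client using a third-party resolver: served from a peering node.
    print(resolve("10.1.0.5", "203.0.113.8", "video.cdn.example"))
```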
ISPs may not want to highly distribute their DNS resolvers in the light of the possible cost savings obtainable by operating them centrally, such as higher and more predictable utilization, central hosting facilities and higher DNS cache hit-rates. Third-party providers do not generally have the option to deploy DNS servers/resolvers locally (within the ISP) and this would be prohibitively expensive in any case.
Further, even if the DNS resolver location is representative of client location, such techniques do not take into account any performance or preferences concerning the network used to deliver the content from the CDN surrogate to the client. A surrogate may be chosen despite the path from it (currently or generally) being under-provisioned or suffering from latency or congestion.
An IETF proposal entitled “Client IP information in DNS requests” dated 21 May 2010 (available online at http://tools.ietf.org/html/draft-vandergaast-edns-client-ip-01) proposes a modification of the DNS resolution process to allow the client's IP address to be passed through the DNS resolver. While effectively solving the inability identified above of a domain authoritative server of a CDN to determine the client device's actual location, network information would not be available to the domain authoritative server, so would still need to come from another source if account is to be taken of it in the selection of an appropriate CDN surrogate. In addition, the standards need to be set and adopted, which will take time and may ultimately fail. There is also a potential concern over the user's privacy. Although a subnet mask can be applied to the user's full IP address (effectively aggregating the behavior of users together), there is still a strict mapping between a set of users and the IP address subnet visible to the domain level DNS resolver. By analyzing traffic over time it may still be possible to group traffic that is likely to belong to the same user. Although it might be argued that the CDN will eventually see the full IP addresses and URLs that these clients are interested in (if and when they eventually obtain content from the chosen surrogate), it is possible that in future scenarios the surrogate may belong to another CDN (in situations where two or more CDNs federate together to provide service).
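The effect of applying a subnet mask to the client address, as mentioned above, can be illustrated by the following minimal sketch. The function name and the default prefix length of 24 bits are assumptions made for illustration and are not taken from the cited proposal.

```python
import ipaddress


def truncate_client_address(client_ip: str, prefix_len: int = 24) -> str:
    """Mask the host bits of client_ip, keeping only the first prefix_len bits."""
    # strict=False lets ip_network() accept an address with host bits set
    # and returns the enclosing network.
    network = ipaddress.ip_network(f"{client_ip}/{prefix_len}", strict=False)
    return str(network)


if __name__ == "__main__":
    # Two users in the same subnet become indistinguishable after masking...
    print(truncate_client_address("203.0.113.57"))
    print(truncate_client_address("203.0.113.77"))
    # ...but a given user always maps to the same subnet, which is the
    # strict mapping referred to in the text.
```

Both addresses in the usage example are reduced to the same subnet, which aggregates users together while still tying each user to one fixed, observable prefix.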
Another IETF working group, ALTO (information about which is available online from https://datatracker.ietf.org/wg/alto/charter/), has been working on solutions for ISPs to be able to share network cost information with CDN providers (see for example an Internet Draft entitled: “ALTO and Content Delivery Networks” dated 25 Oct. 2010, available online from: http://tools.ietf.org/html/draft-penno-alto-cdn-02), an aim being to enable CDN providers to make an informed choice of surrogate. This does not solve the problem of locating the client, however. It also potentially exposes confidential information from the network provider to the CDN.
Together, DNS extensions to show client IP addresses and possible developments under the ALTO project discussed above could, if used together in an appropriate manner, lead to a solution to the problem outlined above. Both proposals are at an early stage, however, and may suffer setbacks from issues of privacy or confidentiality, or simply failure to develop an appropriate standard and implement it within ISPs and CDNs.
Another approach is simply to build a highly distributed DNS resolver network within the ISP and encourage clients to use those DNS resolvers. The ISP “knows” the location of the client and can assign the client to use the nearest DNS resolver (for example using Dynamic Host Configuration Protocol (DHCP) or capability in the Remote Access Server (RAS)). However, the build-out of distributed DNS infrastructure is costly, lowers the ability to perform dynamic DNS load balancing (while running the DNS servers at high utilization), and may also lower the effectiveness of the DNS caching in each resolver.
Referring briefly to prior art patent citations, U.S. Pat. No. 7,228,359 relates to methods and apparatus for providing domain name service based on a client identifier. In particular, it describes a content distribution system which has a DNS server which is configured to provide DNS responses in response to DNS requests, and a device which interconnects between a client and the DNS server. The device includes an interface which communicates with the client, and a controller coupled to the interface. The controller can intercept a first DNS request en route from the client to the DNS server, and provide a second DNS request to the DNS server through the interface in response to interception of the first DNS request.
US patent application US 2010/0161799 relates to a system and method for obtaining content from a CDN. The method involves receiving from a first server a first DNS request including a first IP address of a first server, and a second IP address received by the first server from a first system. The method maps a correlation between the first IP address and the second IP address, and receives from the first server a second DNS request. In response to receiving the second DNS request, the method responds to the first server with a third IP address of a second server, wherein the third IP address is chosen based upon the second IP address.
According to a first aspect of the present invention, there is provided a method of enabling selection of a remote service node from a plurality of possible remote service nodes, the remote service nodes each being capable of providing a service to a user device via a data network, and each being associated with at least one service node control entity, the method comprising:
receiving from a user device a user request in respect of a service required by a user, the user request containing a first user device routing identifier relating to the location in the network of the user device, and a service indication indicative of a service provider from which the user requires service;
identifying from the service indication a service node control entity associated with the service provider;
sending to the service node control entity a service node request, the service node request containing a second user device routing identifier, the second user device routing identifier differing from the first user device routing identifier and being selected in dependence on the first user device routing identifier from a plurality of predetermined user device routing identifiers;
receiving from the service node control entity an indication of at least one remote service node capable of providing the required service to the user device; and
providing an indication of the at least one remote service node to the user device.
By differing from the first user device routing identifier, the second user device routing identifier is able to relate to or to identify a location in the network that differs from the location in the network to which the first user device routing identifier relates.
According to preferred embodiments, the predetermined user device routing identifiers each differ from the first user device routing identifier, and may thus each relate to or identify different locations in the network that differ from the location in the network to which the first user device routing identifier relates.
According to preferred embodiments, in particular those for use in relation to IP networks, the routing identifiers are IP addresses.
According to preferred embodiments, the method is performed by or under the control of a selection entity, which may comprise a suitably-enabled Domain Name System resolver module. In such embodiments, the predetermined user device routing identifiers may be routing identifiers allocated to the selection entity, the user device routing identifiers therefore relating in fact to actual or virtual locations in the network of the selection entity.
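Purely as a non-limiting sketch, the selection of the second user device routing identifier by a selection entity might look as follows where the routing identifiers are IP addresses. The prefix table, the addresses and the names `select_second_identifier` and `handle_user_request` are hypothetical, and the service node control entity is represented by a caller-supplied function rather than a real DNS exchange.

```python
import ipaddress
from typing import Callable

# Hypothetical ISP-side table: client access prefixes mapped to one of a
# set of predetermined addresses allocated to the selection entity itself.
SECOND_IDENTIFIER_BY_CLIENT_PREFIX = {
    ipaddress.ip_network("10.1.0.0/16"): "192.0.2.1",  # e.g. a "virtual" northern location
    ipaddress.ip_network("10.2.0.0/16"): "192.0.2.2",  # e.g. a "virtual" southern location
}
FALLBACK_IDENTIFIER = "192.0.2.254"


def select_second_identifier(first_identifier: str) -> str:
    """Choose a predetermined identifier in dependence on the client's own."""
    addr = ipaddress.ip_address(first_identifier)
    for prefix, second in SECOND_IDENTIFIER_BY_CLIENT_PREFIX.items():
        if addr in prefix:
            return second
    return FALLBACK_IDENTIFIER


def handle_user_request(
    first_identifier: str,
    service_indication: str,
    query_control_entity: Callable[[str, str], str],
) -> str:
    """Send the service node request using the selected second identifier.

    query_control_entity(service_indication, source_identifier) stands in
    for the exchange with the service node control entity and returns an
    indication of a remote service node, which is passed back to the client.
    """
    second = select_second_identifier(first_identifier)
    return query_control_entity(service_indication, second)


if __name__ == "__main__":
    # Stub control entity that simply reports which source address it saw.
    stub = lambda domain, source: f"node-for-{source}"
    print(handle_user_request("10.1.4.7", "video.cdn.example", stub))
    print(handle_user_request("10.2.9.9", "video.cdn.example", stub))
```

In this sketch the control entity never receives the client's own address; it sees only whichever predetermined address the selection entity chose to present.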
According to preferred embodiments, the service comprises delivery of data, via one or more CDNs, for example. Alternatively, the service may comprise a remote data processing service, or another type of networked service, such as an “online gaming” service, for example. Alternatively, the service may comprise a load balancing service, or selection of a further service node for provision of a further service such as one of the above.
According to preferred embodiments, the service indication may comprise an indication of the service node control entity. The service indication may be in the form of or include a domain name, a Uniform Resource Identifier (URI), or a Uniform Resource Locator (URL), found on and selected from a website by a user. In relation to scenarios where the service concerned involves the delivery of data via a CDN, the service indication may comprise an indication of the actual data item(s) required, and/or an indication of the identity of a content provider purporting to be capable of providing the data item(s) required, and/or an indication of a CDN purporting to be capable of delivering the data item(s) required.
According to preferred embodiments, the method may further comprise obtaining one or more indications of network conditions, the second user device routing identifier being selected in dependence on the one or more indications of network conditions (i.e. as well as in dependence on the first user device routing identifier).
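One possible way of taking indications of network conditions into account is sketched below, on the assumption that each client prefix has several candidate identifiers and that a per-candidate path cost is available to the ISP. The table, the cost values and the name `select_with_conditions` are illustrative assumptions only.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical table: each client prefix has several candidate identifiers.
CANDIDATES_BY_CLIENT_PREFIX = {
    ipaddress.ip_network("10.1.0.0/16"): ["192.0.2.1", "192.0.2.2"],
}


def select_with_conditions(
    first_identifier: str, path_cost: Dict[str, float]
) -> Optional[str]:
    """Pick the candidate with the lowest current path cost for this client.

    path_cost maps candidate identifiers to an ISP-internal cost figure
    (for example derived from latency or congestion); candidates with no
    known cost are treated as infinitely expensive.
    """
    addr = ipaddress.ip_address(first_identifier)
    for prefix, candidates in CANDIDATES_BY_CLIENT_PREFIX.items():
        if addr in prefix:
            return min(candidates, key=lambda c: path_cost.get(c, float("inf")))
    return None  # no candidates are configured for this client


if __name__ == "__main__":
    # When the path associated with 192.0.2.1 is congested, the other is chosen.
    print(select_with_conditions("10.1.4.7", {"192.0.2.1": 5.0, "192.0.2.2": 2.0}))
```

The cost figures stay within the ISP in this sketch; only the chosen identifier is exposed to the service node control entity.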
According to a second aspect of the present invention, there is provided an apparatus for enabling selection of a remote service node from a plurality of possible remote service nodes, the remote service nodes each being capable of providing a service to a user device via a data network, and each being associated with at least one service node control entity, the apparatus comprising means for performing a method according to the first aspect.
The various options and preferred embodiments referred to above in relation to the first aspect are also applicable in relation to the second aspect.
Briefly, preferred embodiments of either aspect may be used to enable a better selection of remote service node to be made for a client. In the context of the provision of data via a CDN, preferred embodiments may be used to enable a better selection of CDN surrogate to be made, by overcoming the current problem whereby CDNs often cannot tell where the client actually is in the network.
Looking at this in more detail, preferred embodiments allow an ISP essentially to “intelligently” select and use a ‘virtual’ DNS resolver for a service node selection operation, regardless of the physical location of the DNS resolver used by the client. This allows the ISP to use any pattern of DNS resolver deployments, including the use of centralized DNS clusters of shared DNS resolvers with high utilization, high availability and close load balancing. Irrespective of the actual DNS resolver location, the ISP can “pretend” that a DNS resolver exists, and is being used, anywhere in their network. This also allows extremely rapid deployments of ‘new’ (virtual) DNS resolver locations (since these do not need to be physical servers).
Methods according to preferred embodiments can be deployed immediately since they do not generally depend on or require any new standardization activity or extension to existing protocols. Beyond the ISP's DNS resolver, the DNS system may work as currently standardized and operated. Further, no information regarding network topology, costs or utilization needs to be shared with the CDN. Network cost information can be incorporated into the decision process about which virtual DNS resolver location to use for a particular client call. This allows the network cost decision to be placed firmly within the ISP domain rather than the ISP needing to rely on the CDN to react to network cost information. It also means that preferred embodiments may work immediately without requiring each and every CDN to adopt standards such as “ALTO” standards to be able to ingest network cost information. The surrogate choice of any CDN can be affected regardless of the implementation of technology within the CDN (provided they use a DNS lookup or similar operation to make the choice of surrogate), even where the CDN is not trusted with network information or to make a decision in the interests of the ISP.