Technical Field
This application relates generally to content delivery networks and to the creation and operation of a test environment to enable a content provider to test integration of their origin infrastructure with the content delivery network.
Brief Description of the Related Art
A “content delivery network” or “CDN” is often operated and managed by a service provider. The service provider typically provides the content delivery service on behalf of third party content providers. A distributed system of this type typically refers to a collection of autonomous computers linked by a network or networks, together with the software, systems, protocols and techniques designed to facilitate various services, such as content delivery or the support of outsourced site infrastructure. The infrastructure is generally used for the storage, caching, or transmission of content—such as web page objects, streaming media and applications—on behalf of such content providers or other tenants. The platform may also provide ancillary technologies used therewith including, without limitation, DNS query handling, provisioning, data monitoring and reporting, content targeting, personalization, and business intelligence. The CDN's components may be located at nodes that are publicly-routable on the Internet, within or adjacent nodes that are located in mobile networks, in or adjacent enterprise-based private networks, or in any combination thereof.
In operation, the CDN platform retrieves content from the content provider's origin infrastructure, and delivers it to requesting end-user clients. A CDN typically employs a set of proxy servers, and when a given server in the CDN receives a request for an object (e.g., an HTML document, an image file, scripts, cascading style sheets, videos, XML documents) from an end user client device, it identifies the content as being from the content provider and applies a set of configuration instructions, sometimes referred to herein as ‘metadata’. The configuration instructions are usually customized for the particular content provider, i.e., each content provider can independently specify how they want the CDN to handle such requests. In a basic reverse proxy operation, the server may check whether it has a valid (i.e., unexpired) copy of the object in its local cache. If so, it can serve the request from the cache. If not, it can issue a forward request to obtain the content from an origin server. It may check whether the content is in a cache parent (i.e., in a hierarchical caching system) before requesting the content from the origin; alternatively, the cache parent may pass the forward request on to the origin if there is another cache miss.
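By way of illustration only, the basic reverse proxy operation described above (serve an unexpired cached copy, otherwise issue a forward request) might be sketched as follows. The names `ReverseProxy`, `CacheEntry`, `fetch_from_origin` and `ttl_seconds` are hypothetical and are not taken from any particular CDN implementation; a single fixed time-to-live is assumed for simplicity.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CacheEntry:
    body: bytes
    expires_at: float  # time after which the cached copy is no longer valid


class ReverseProxy:
    """Illustrative proxy: serve from local cache if valid, else go forward."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],  # hypothetical forward-request hook
        ttl_seconds: float = 60.0,                  # assumed fixed TTL for every object
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, CacheEntry] = {}

    def handle(self, url: str) -> Tuple[bytes, str]:
        now = self._clock()
        entry = self._cache.get(url)
        # Cache hit: a copy exists and has not yet expired.
        if entry is not None and entry.expires_at > now:
            return entry.body, "HIT"
        # Cache miss (or expired copy): issue a forward request and store the result.
        body = self._fetch(url)
        self._cache[url] = CacheEntry(body, now + self._ttl)
        return body, "MISS"
```

In this sketch the clock is injectable so that expiry can be exercised without waiting; an actual server would more likely derive freshness from response headers and from the content provider's configuration.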
Given these operations, the content provider's origin infrastructure plainly must be integrated with the CDN platform. It is often necessary to conduct extensive testing of the integration before the content provider's website goes live. Further testing is likely necessary when implementing significant changes in the website, or in the origin infrastructure that hosts the website, and/or in the configuration of the CDN.
It is difficult to conduct quality testing of the integration between a content provider's origin infrastructure and a CDN platform.
Ideally, a content provider's test traffic is isolated from its production traffic, and indeed it is desirable to have many isolated test environments (referred to here as ‘sandboxes’) readily available to a content provider. Preferably the test environment is as similar as possible to an actual production environment, meaning that the test traffic runs through the actual production CDN platform, both hardware and software, albeit with a test CDN configuration applied. Moreover, a content provider likely employs multiple developers and teams. Developers should be able to run tests with the same origin hostname, but with a test origin server and CDN test configuration of the individual developer's (or individual team's) choosing.
Enterprise security layers at the content provider complicate matters. A content provider's development team is almost certainly working on an enterprise LAN behind the corporate firewall. But to test content-provider-to-CDN integration, CDN servers ought to be able to reach the developer's chosen test client and, just as important, the chosen test origin. Developers want flexibility and speed in setting up new tests, but it is not feasible to quickly and repeatedly re-configure the enterprise firewall to allow a CDN server to contact an arbitrary test client and/or test origin.
What is needed is a safe and secure way for developers working on behalf of a content provider to test new origin and CDN configurations, preferably within a production CDN server environment, while still having the ability to quickly and flexibly instantiate new sandboxes that are compatible with the content provider's enterprise security layer.
The teachings hereof address this technical problem. The teachings hereof also provide other benefits and improvements that will become apparent in view of this disclosure.
A general background on CDNs is now provided.
In a known system such as that shown in FIG. 1, a distributed computer system 100 is configured as a content delivery network (CDN) and has a set of computers 102 distributed around the Internet. Typically, most of the computers are configured as servers and located near the edge of the Internet, i.e., at or adjacent end user access networks. A network operations command center (NOCC) 104 may be used to administer and manage operations of the various machines in the system. Third party sites affiliated with content providers, such as web site 106, offload delivery of content (e.g., HTML or other markup language files, embedded page objects, streaming media, software downloads, and the like) to the distributed computer system 100 and, in particular, to the servers (which are sometimes referred to as content servers, or sometimes as “edge” servers in light of the possibility that they are near an “edge” of the Internet). Such servers may be grouped together into a point of presence (POP) 107.
Typically, content providers offload their content delivery by aliasing (e.g., by a DNS CNAME) given content provider domains or sub-domains to domains that are managed by the service provider's authoritative domain name service. End user client devices 122 that desire such content may be directed to the distributed computer system to obtain that content more reliably and efficiently. The CDN servers 102 respond to the client device requests, for example by obtaining requested content from a local cache, from another CDN server 102, from the origin server 106, or other source.
Although not shown in detail in FIG. 1, the distributed computer system may also include other infrastructure, such as a distributed data collection system 108 that collects usage and other data from the CDN servers, aggregates that data across a region or set of regions, and passes that data to other back-end systems 110, 112, 114 and 116 to facilitate monitoring, logging, alerts, billing, management and other operational and administrative functions. Distributed network agents 118 monitor the network as well as the server loads and provide network, traffic and load data to a DNS query handling mechanism 115, which is authoritative for content domains being managed by the CDN, and which acts as a request routing mechanism to direct clients to a selected CDN server 102. A distributed data transport mechanism 120 may be used to distribute control information (sometimes referred to as “metadata”) to the CDN servers.
A more detailed description of an embodiment of a CDN server 102 is now provided. A given CDN server can be implemented as a computer that comprises commodity hardware (e.g., a microprocessor with memory holding program instructions) running an operating system kernel (such as Linux or a variant) that supports one or more applications. To facilitate content delivery services, for example, given computers typically run a set of applications, such as an HTTP (web) proxy server, a name service (DNS), a local monitoring process, a distributed data collection process, and the like. The HTTP proxy server (sometimes referred to herein as an HTTP proxy for short) is a kind of web server and it typically includes a manager process for managing a local cache and delivery of content from the machine. For streaming media, the machine may include one or more media servers, as required by the supported media formats.
A CDN server 102 may be configured to provide one or more extended content delivery features, preferably on a domain-specific, content-provider-specific basis, preferably using configuration files that are distributed to the CDN servers using a configuration system. A given configuration file preferably is XML-based and includes a set of content handling rules and directives that facilitate one or more advanced content handling features. The configuration file may be delivered to the CDN server via the data transport mechanism. U.S. Pat. No. 7,240,100, the contents of which are hereby incorporated by reference, describes a useful infrastructure for delivering and managing CDN server content control information. This and other control information (again sometimes referred to as “metadata”) can be provisioned by the CDN service provider itself, or (via an extranet or the like) by the content provider customer who operates the origin server. U.S. Pat. No. 7,111,057, incorporated herein by reference, describes an architecture for purging content from the CDN.
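As a non-limiting illustration of an XML-based configuration file holding content handling rules and directives, the following sketch parses a small sample document and looks up the directives that apply to a request path. The element names (`configuration`, `rule`, `directive`), the attribute names, the `cache-ttl` directive and the first-match-wins policy are all assumptions made for this example; they do not represent the schema of any actual CDN.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Hypothetical configuration for one content provider hostname.
SAMPLE_CONFIG = """
<configuration host="www.example.com">
  <rule path-prefix="/images/">
    <directive name="cache-ttl" value="86400"/>
  </rule>
  <rule path-prefix="/">
    <directive name="cache-ttl" value="300"/>
  </rule>
</configuration>
"""

Rule = Tuple[str, Dict[str, str]]


def load_rules(xml_text: str) -> Tuple[str, List[Rule]]:
    """Parse the sample schema into (hostname, ordered list of rules)."""
    root = ET.fromstring(xml_text.strip())
    rules: List[Rule] = []
    for rule in root.findall("rule"):
        directives = {d.get("name"): d.get("value") for d in rule.findall("directive")}
        rules.append((rule.get("path-prefix"), directives))
    return root.get("host"), rules


def directives_for(rules: List[Rule], path: str) -> Dict[str, str]:
    """Return directives of the first rule whose prefix matches (assumed policy)."""
    for prefix, directives in rules:
        if path.startswith(prefix):
            return directives
    return {}
```

With the sample above, a request for `/images/logo.png` would pick up the longer time-to-live, while any other path falls through to the catch-all rule.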
Preferably, the CDN operates a DNS infrastructure to route client requests (i.e., request routing service) to a selected CDN server 102. In a typical operation, a content provider identifies a content provider domain or sub-domain that it desires to have served by the CDN. The CDN service provider associates (e.g., via a canonical name, or CNAME, or other aliasing technique) the content provider domain with a CDN hostname, and the CDN provider then provides that CDN hostname to the content provider. When a DNS query to the content provider domain or sub-domain is received at the content provider's domain name servers, those servers respond by returning the CDN hostname. That CDN hostname is then resolved through the CDN name service. To that end, the CDN domain name service returns one or more IP addresses (via consultation with the mapmaker shown in FIG. 1). The requesting client application (e.g., a web browser) then makes a content request (e.g., via HTTP or HTTPS) to a CDN server 102 associated with the IP address. The request includes a host header that includes the original content provider domain or sub-domain. Upon receipt of the request with the host header, the CDN server 102 checks its configuration file to determine whether the content domain or sub-domain requested is actually being handled by the CDN. If so, the CDN server 102 applies its content handling rules and directives for that domain or sub-domain as specified in the configuration. These content handling rules and directives may be located within an XML-based “metadata” configuration file, as described previously. Thus, the domain name or subdomain name in the request is bound to (associated with) a particular configuration file, which contains the rules, settings, etc., that the CDN server 102 should use when processing that request.
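The request routing flow just described can be sketched, purely for illustration, as two lookups: the content provider domain is aliased to a CDN hostname that resolves to CDN server addresses, and the server then uses the host header to select the bound configuration. All hostnames, addresses, file names and function names below are hypothetical, and the in-memory tables stand in for real DNS and configuration systems.

```python
from typing import Dict, List, Optional

# Hypothetical alias held at the content provider's name servers (CNAME).
CNAME_RECORDS: Dict[str, str] = {
    "www.example.com": "www.example.com.cdn.example.net",
}

# Hypothetical answers from the CDN name service for its own hostnames.
CDN_A_RECORDS: Dict[str, List[str]] = {
    "www.example.com.cdn.example.net": ["192.0.2.10", "192.0.2.11"],
}

# Hypothetical binding of content provider hostnames to configuration files.
CONFIG_BY_HOST: Dict[str, str] = {
    "www.example.com": "example-config.xml",
}


def resolve(hostname: str) -> List[str]:
    """Follow the alias to the CDN hostname, then return CDN server addresses."""
    cdn_hostname = CNAME_RECORDS.get(hostname)
    if cdn_hostname is None:
        return []  # domain is not aliased to the CDN
    return CDN_A_RECORDS.get(cdn_hostname, [])


def select_config(headers: Dict[str, str]) -> Optional[str]:
    """Pick the configuration bound to the Host header, if the CDN handles it."""
    host = headers.get("Host", "").split(":")[0].lower()  # drop any port
    return CONFIG_BY_HOST.get(host)
```

A request whose host header names a domain that is absent from the binding table yields `None` here, corresponding to the case in which the CDN server determines that the domain is not being handled by the CDN.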
A CDN may have a variety of other features and adjunct components. For example, the CDN may include a network storage subsystem (sometimes referred to as “NetStorage”) which may be located in a network datacenter accessible to the CDN servers, such as described in U.S. Pat. No. 7,472,178, the disclosure of which is incorporated herein by reference. The CDN may operate a server cache hierarchy to provide intermediate caching of customer content; one such cache hierarchy subsystem is described in U.S. Pat. No. 7,376,716, the disclosure of which is incorporated herein by reference. In a typical cache hierarchy, each client-facing server has a cache parent (or cache parent group), which may be statically or dynamically assigned. The child server goes to the cache parent to see if it has the object before going to the origin. If the parent does not have the object in cache either, then either the parent or the child server goes to origin. Some cache hierarchies have additional layers. For more information on cache hierarchies in CDNs, see U.S. Pat. No. 7,376,716 and see also Chankhunthod et al., “A Hierarchical Internet Object Cache”, Proceedings of the USENIX 1996 Annual Technical Conference, San Diego, Calif. 1996, the disclosures of both of which are incorporated herein by reference for all purposes. For information on how cache parents can be dynamically chosen (and cache hierarchies formed based on network conditions and distances), see U.S. Pat. No. 7,274,658, the disclosure of which is incorporated by reference herein for all purposes.
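The cache hierarchy lookup described above can be illustrated with the following minimal sketch, in which a child server consults its cache parent before the origin is contacted. The sketch assumes the variant in which the parent (rather than the child) issues the forward request to the origin on a miss, and it ignores expiry and dynamic parent selection; `CacheNode` and the trace labels are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List, Optional, Tuple


class CacheNode:
    """Illustrative cache server that may have a cache parent above it."""

    def __init__(
        self,
        name: str,
        parent: Optional["CacheNode"] = None,
        origin: Optional[Callable[[str], str]] = None,  # used only at the top layer
    ) -> None:
        self.name = name
        self.parent = parent
        self.origin = origin
        self.store: Dict[str, str] = {}

    def get(self, key: str) -> Tuple[str, List[str]]:
        """Return (body, trace), where trace records each layer consulted."""
        if key in self.store:
            return self.store[key], [f"{self.name}:hit"]
        if self.parent is not None:
            # Miss at this layer: ask the cache parent before any origin fetch.
            body, trace = self.parent.get(key)
        else:
            # Top of the hierarchy: assumed here to go forward to the origin.
            body, trace = self.origin(key), ["origin"]
        self.store[key] = body
        return body, [f"{self.name}:miss"] + trace
```

In this arrangement, once one child has pulled an object through the parent, a second child sharing that parent can be served from the parent's cache without a further origin request, which is the offload benefit that an intermediate caching layer is intended to provide.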
Communications between CDN servers and/or across the overlay may be enhanced or improved using techniques such as described in U.S. Pat. Nos. 6,820,133, 7,274,658, and 7,660,296, the disclosures of which are incorporated herein by reference.
For live streaming delivery, the CDN may include a live delivery subsystem, such as described in U.S. Pat. No. 7,296,082, and U.S. Publication No. 2011/0173345, as well as a transcoding system as described in U.S. Pat. No. 9,432,704, the disclosures of which are incorporated herein by reference.