Technical Field
This application relates generally to distributed data processing systems and to the delivery of content to end users over computer networks, and more particularly to the measurement and assessment of content delivery services.
Brief Description of the Related Art
Distributed computer systems are well-known in the prior art. One such distributed computer system is a “content delivery network” or “CDN” that is operated and managed by a service provider. The service provider typically provides the content delivery service on behalf of third parties. A “distributed system” of this type typically refers to a collection of autonomous computers linked by a network or networks, together with the software, systems, protocols and techniques designed to facilitate various services, such as content delivery or the support of outsourced site infrastructure. Typically, “content delivery” refers to the storage, caching, or transmission of content—such as web pages, streaming media and applications—on behalf of content providers, and ancillary technologies used therewith including, without limitation, DNS query handling, provisioning, data monitoring and reporting, content targeting, personalization, and business intelligence.
In a known system such as that shown in FIG. 1, a distributed computer system 100 is configured as a content delivery network (CDN) and is assumed to have a set of machines 102 distributed around the Internet. Typically, most of the machines are content servers located near the edge of the Internet, i.e., at or adjacent end user access networks. A network operations command center (NOCC) 104 may be used to administer and manage operations of the various machines in the system. Third party sites affiliated with content providers, such as web site 106, offload delivery of content (e.g., HTML, embedded page objects, streaming media, software downloads, and the like) to the distributed computer system 100 and, in particular, to the servers (which are sometimes referred to as proxy servers if running a proxy application as described below, or sometimes as “edge” servers in light of the possibility that they are near an “edge” of the Internet). Such servers may be grouped together into a point of presence (POP) 107.
Typically, content providers offload their content delivery by aliasing (e.g., by a DNS CNAME) given content provider domains or sub-domains to domains that are managed by the service provider's authoritative domain name service. End user client machines 122 that desire such content may be directed to the distributed computer system to obtain that content more reliably and efficiently. The servers respond to the client requests, for example by obtaining requested content from a local cache, from another content server, from the origin server 106, or other source.
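By way of illustration only, and not by way of limitation, the CNAME aliasing described above can be sketched as a lookup that follows an alias from a content provider hostname to a name managed by the service provider. The hostnames, the address, the record table, and the `resolve` function below are hypothetical and are not taken from any actual CDN name service.

```python
from typing import Dict, List, Tuple

# Hypothetical record table: name -> (record type, value).
# All names and the address are illustrative placeholders.
RECORDS: Dict[str, Tuple[str, str]] = {
    # The content provider aliases its hostname to a service-provider domain.
    "www.example.com": ("CNAME", "www.example.com.cdn-provider.example"),
    # The service provider's authoritative name service answers with a server address.
    "www.example.com.cdn-provider.example": ("A", "192.0.2.10"),
}


def resolve(name: str,
            records: Dict[str, Tuple[str, str]] = RECORDS,
            max_hops: int = 8) -> Tuple[List[str], str]:
    """Follow CNAME aliases until an address record is found.

    Returns the chain of names visited and the final address.
    Raises KeyError for an unknown name and RuntimeError if the
    alias chain exceeds max_hops.
    """
    chain = [name]
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return chain, value
        # CNAME: continue the lookup at the alias target.
        name = value
        chain.append(name)
    raise RuntimeError("CNAME chain too long")


if __name__ == "__main__":
    chain, address = resolve("www.example.com")
    print(" -> ".join(chain), "=>", address)
```

In this sketch a single static table stands in for both the content provider's zone and the service provider's zone; in practice the final answer would typically vary with the network, traffic and load data described below.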
Although not shown in detail in FIG. 1, the distributed computer system may also include other infrastructure, such as a distributed data collection system 108 that collects usage and other data from the content servers, aggregates that data across a region or set of regions, and passes that data to other back-end systems 110, 112, 114 and 116 to facilitate monitoring, logging, alerts, billing, management and other operational and administrative functions. Distributed network agents 118 monitor the network as well as the server loads and provide network, traffic and load data to a DNS query handling mechanism 115, which is authoritative for content domains being managed by the CDN. A distributed data transport mechanism 120 may be used to distribute control information (e.g., metadata to manage content, to facilitate load balancing, and the like) to the servers.
As illustrated in FIG. 2, a given machine 200 in the CDN (sometimes referred to as an “edge machine”) comprises commodity hardware (e.g., an Intel processor) 202 running an operating system kernel (such as Linux® or variant) 204 that supports one or more applications 206a-n. To facilitate content delivery services, for example, given machines typically run a set of applications, such as an HTTP proxy 207, a name server 208, a local monitoring process 210, a distributed data collection process 212, and the like. The HTTP proxy 207 (sometimes referred to herein as a global host or “ghost” application) typically includes a manager process for managing a cache and delivery of content from the machine. For streaming media, the machine typically includes one or more media servers, such as a Windows® Media Server (WMS) or Flash® 2.0 server, as required by the supported media formats.
The machine 200 shown in FIG. 2 may be configured to provide one or more extended content delivery features, preferably on a domain-specific, content-provider-specific basis, preferably using configuration files that are distributed to the content servers using a configuration system. A given configuration file preferably is XML-based and includes a set of content handling rules and directives that facilitate one or more advanced content handling features. The configuration file may be delivered to the servers via the data transport mechanism. U.S. Pat. Nos. 7,111,057 and 7,240,100 illustrate a useful infrastructure for delivering and managing CDN server content control information. This and other content server control information (sometimes referred to as “metadata”) can be provisioned by the CDN service provider itself or, via an extranet or the like, by the content provider customer who operates the origin server.
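As a non-limiting sketch, an XML-based configuration file of the general kind described above might be read by a content server as follows. The element names, attribute names, directive names, and the first-match rule selection shown here are assumptions made for illustration; they do not reflect the actual metadata format used by any CDN.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical configuration file for one content provider domain.
# The schema (configuration/rule/directive) is invented for this example.
CONFIG_XML = """
<configuration domain="www.example.com">
  <rule path-prefix="/images/">
    <directive name="cache-ttl" value="86400"/>
  </rule>
  <rule path-prefix="/">
    <directive name="cache-ttl" value="300"/>
    <directive name="compress" value="on"/>
  </rule>
</configuration>
"""


def directives_for(config_xml: str, path: str) -> Dict[str, str]:
    """Return the directives of the first rule whose prefix matches the path.

    Rules are evaluated in document order, so more specific prefixes
    are assumed to be listed before more general ones.
    """
    root = ET.fromstring(config_xml.strip())
    for rule in root.findall("rule"):
        if path.startswith(rule.get("path-prefix", "")):
            return {d.get("name"): d.get("value")
                    for d in rule.findall("directive")}
    return {}


if __name__ == "__main__":
    print(directives_for(CONFIG_XML, "/images/logo.png"))
    print(directives_for(CONFIG_XML, "/index.html"))
```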
The CDN may include a network storage subsystem for the content providers to store and originate content (sometimes referred to herein as “NetStorage”) which may be located in a network datacenter accessible to the content servers, such as described in U.S. Pat. No. 7,472,178, the disclosure of which is incorporated herein by reference. For live streaming delivery, the CDN may include a live delivery subsystem, such as described in U.S. Pat. No. 7,296,082, and U.S. Publication No. 2011/0173345, the disclosures of which are incorporated herein by reference.
As an overlay, the CDN resources may be used to facilitate wide area network (WAN) acceleration services between enterprise data centers (which may be privately managed) and third party software-as-a-service (SaaS) providers.
Given the ability to configure the CDN servers described above, a wide variety of content delivery features may be implemented in the CDN platform generally and by the CDN servers specifically. For example, a server may be configured to apply modifications to a given web page as it traverses the server (e.g., going from an origin/source to an end-user client) so as to reduce the number of requests the client has to make, to reduce the payload of the content, to accelerate client application processing/rendering, to tailor the content for a particular client device (and its capabilities), or otherwise enhance the performance and functionality of the content. A wide variety of such treatments are known in the art and often referred to as ‘front-end’ web optimizations or as ‘web content’ optimizations.
By way of example, U.S. Publication No. 2011/0314091 describes systems and methods for applying performance-enhancing modifications to web pages, and the teachings of this publication are hereby incorporated by reference herein. A dynamic image delivery system is described in U.S. Pat. No. 8,060,581, the contents of which are hereby incorporated by reference. U.S. Patent Publication No. 2012/0265853 and U.S. Patent Publication No. 2012/0259942 describe systems and methods for streaming media and for executing a byte-based interpreter in a proxy server that can be used to modify content, to add rights management information and/or watermarks and the like. The contents of all of the foregoing patent documents are hereby incorporated by reference in their entireties.
Other performance-enhancing aspects of the CDN platform relate to the ability to intelligently map end-user clients to servers, and to the ability to intelligently route and manage the transmission of content across the network. For example, the CDN may operate a cache hierarchy to provide intermediate caching of customer content; one such cache hierarchy subsystem is described in U.S. Pat. No. 7,376,716, the disclosure of which is incorporated herein by reference. A transport and routing mechanism for arbitrary data flows is described in U.S. Pat. No. 7,660,296, the disclosure of which is hereby incorporated by reference. A system and method for delivery of content using intermediate nodes to facilitate content delivery is described in U.S. Pat. No. 6,820,133, the contents of which are hereby incorporated by reference. A global hosting system that can utilize a network map is described in U.S. Pat. No. 6,108,703, the contents of which are hereby incorporated by reference.
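For purposes of illustration only, the intermediate caching provided by a cache hierarchy can be sketched as a chain of cache tiers in which a miss at an edge tier is forwarded to a parent tier before the origin is contacted. The `CacheTier` class and the tier names below are hypothetical; the sketch omits expiry, eviction and validation, and is not the subsystem described in the patents cited above.

```python
from typing import Callable, Dict, Optional, Tuple


class CacheTier:
    """One tier in a hypothetical cache hierarchy (edge -> parent -> origin)."""

    def __init__(self,
                 name: str,
                 parent: Optional["CacheTier"] = None,
                 origin_fetch: Optional[Callable[[str], bytes]] = None) -> None:
        if parent is None and origin_fetch is None:
            raise ValueError("a tier needs either a parent or an origin fetcher")
        self.name = name
        self.parent = parent
        self.origin_fetch = origin_fetch
        self.store: Dict[str, bytes] = {}

    def get(self, key: str) -> Tuple[str, bytes]:
        """Return (source, body), where source names what satisfied the request."""
        if key in self.store:
            return self.name, self.store[key]
        if self.parent is not None:
            # Miss: ask the parent tier before going to the origin.
            source, body = self.parent.get(key)
        else:
            # Top of the hierarchy: fetch from the origin server.
            source, body = "origin", self.origin_fetch(key)
        self.store[key] = body  # populate this tier for later requests
        return source, body


if __name__ == "__main__":
    parent = CacheTier("parent", origin_fetch=lambda key: b"content for " + key.encode())
    edge_1 = CacheTier("edge-1", parent=parent)
    edge_2 = CacheTier("edge-2", parent=parent)
    for tier in (edge_1, edge_2, edge_1):
        print(tier.name, "served from", tier.get("/index.html")[0])
```

In this arrangement a second edge tier requesting the same object is served by the parent tier, so the origin is contacted only once for that object.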
There are many ways to measure the performance of web pages (and of a CDN) in a general sense, be it using synthetic monitoring or so-called real-user monitoring from within a browser. However, current performance measurement approaches are limited. It would be desirable to be able to better show the effect on page performance of individual features or enhancements offered by a CDN. Furthermore, it is desirable to improve the ability to identify and address performance issues that may be affecting particular features or particular aspects of the CDN platform, or particular kinds of end-user clients (particular browsers or other client applications or particular devices) served by the CDN, or particular combinations of the foregoing. The teachings herein address such needs and offer other benefits and advantages that will become clear in view of this disclosure.