1. The Field of the Invention
The present invention generally relates to adaptive streaming video. More specifically, the present invention relates to performance measurement and management for delivery of video content using adaptive streaming video protocols.
2. The Relevant Technology
There are many digital services available today, including mobile applications and services, Internet websites, social networking services, media and information distribution services, communications services, software and application services, data services, electronic commerce services, payment services, interactive services, and other digital services. Most digital services operate in a networked environment, such as a mobile network, the Internet, a local area network, a wide area network, another network, or a combination of such networks, and depend on the interaction of at least two, and often many more than two, different computing devices that communicate over the network, for example a user computer communicating with one or more servers over the Internet. Networked environments can be public, private, or a combination of public and private. Networked communications can occur directly, for example a user computer connecting to a server, or indirectly, for example a user computer connecting to a server that in turn connects to one or more other servers, or as another example, a user computer connecting to a server that in turn forwards the connection, or the information conveyed via that connection, to another server.
When a digital service operates in a networked environment, the user's experience with the digital service, and satisfaction with that experience, often depends to a substantial extent on the availability and performance of infrastructure resources, such as data and other resources available via the network, the availability and performance of network resources (such as the connections among the devices needed for that user's digital service session), or a combination of such resources. Infrastructure resources can include digital objects that are transmitted to or from the user device or among the servers or other devices supporting the digital service; compute resources, particularly compute resources that are not part of the user device, and that therefore are located remotely and accessed over the network; storage resources; data resources; network resources that connect various other resources and facilities in a networked environment, including mobile networks, the Internet, and private networks; image rendering resources; and other resources that support the digital service. Such resources may be available in multiple instances, limited instances, or single instances; may be available to all potential users, some potential users, or only a single potential user; and may be controlled and managed by the digital service (for example, servers that are operated by the digital service that host an Internet website), may be controlled and managed by a service provider with a relationship to the digital service (for example, a content delivery network service provider that provides delivery of website objects), may be controlled and managed by a service provider with a relationship to the user (for example, a user's mobile network), or may be controlled and managed by an entity with no relationship to either the digital service or the user (for example, an intermediate Internet network over which data is routed from the digital service to the user device).
Infrastructure resources are often subject to demand extremes that are unpredictable, making these resources important to manage effectively over a variety of time periods, including over long, intermediate, short, and sometimes very short periods of time; are often interconnected with limited flexibility, sophistication, and responsiveness, especially across independently managed organizations (for example, Border Gateway Protocol [BGP], the fundamental Internet routing protocol that controls the routing and flow of data between separate Internet Autonomous Systems, takes as an input the interconnection topology of networks but not the volume of traffic flows across, or the performance of, the network links that interconnect the networks); are often operated at levels approaching capacity limits, resulting in performance variability as well as significant dislocations in the event of partial or temporary failures; and are frequently dependent on the performance and/or availability of other resources operated by one or more disparate entities. 
Additionally, a user's actual or perceived experience with a digital service is commonly a “weakest link” circumstance, where a single unavailable or poorly performing resource may degrade the entire actual or perceived user experience; for example, an HTML web page may not render properly in a browser window, or may not render at all, until the last page object to be delivered to the user device is received at the browser (which then finally enables the browser to render the full page) and as a result if one banner advertisement, embedded image, or other embedded page object is slow to arrive at the user device, the entire web page appears to the user to be slow to render; as another example, in many cases a video that begins playing properly will freeze and the user's device will display a buffering indicator if the data that makes up the video does not arrive at the device at the rate at which video playback renders the video data into video frames played out to the user, and as a result if one data packet is slow to arrive at the user device and arrives only after it is needed, video playback is interrupted and the video is not experienced normally and continuously by the user.
The variety of resources required, proliferation of interconnected resource infrastructure service providers, performance variability of infrastructure and other service providers, lack of control over necessary and/or unavoidable intermediate resource infrastructure and other service providers, the volatility and unpredictability of demand for (and utilization of) infrastructure resources, and the weakest link characteristics of many digital services, combined with the high performance expectations of users and low tolerance by users for error and delay, create a demanding technical environment within which digital services operate.
A variety of commercially available services have been developed to address various aspects of this technical environment. Cloud computing services, such as EC2 from Amazon Web Services and Cloud Servers from Rackspace, provide rapidly scalable and highly available computing infrastructure services, including virtual servers addressed through IP (Internet Protocol) addresses in the same manner as physical servers, but provisioned on demand rather than physically. Cloud storage services, such as S3 from Amazon Web Services and Orchestrate™ Cloud Storage from Limelight Networks, provide rapidly scalable and highly available data storage services, where objects are typically addressed through Uniform Resource Locators (URLs) rather than through a file access method or file system associated with a physical storage device. Content delivery networks, such as Akamai Technologies and Limelight Networks, provide content (data and media object) delivery services by assigning servers to content requests through resolution of the hostname contained in a URL, leveraging the Internet's Domain Name System (DNS) infrastructure; the IP addresses returned when hostnames are resolved are intended to identify servers that are located close to the requesting point (in network terms) and are well-performing (typically, not under severe load conditions), thereby reducing network-related delays (by reducing network distance and the number of intermediate networks) and server-related delays (by avoiding overloading servers and by increasing server resources as demand for those resources increases). Network optimization services, such as Internap and Level 3, reduce network-related delays by reducing network-induced latency and improving consistency by selecting better network routes, using private network segments, and/or classifying and prioritizing network traffic across networks or network segments, or network interconnections.
Each of these approaches has inherent limitations.
Using virtual server IP addresses provides an element of scalability and availability, but since virtual server IP addresses are generally each provisioned at a particular network location, the virtual server IP addresses may be far (in terms of network distance) from a given user computing device and/or subject to intermediate network performance; in addition, such virtual servers, like real physical servers, can become overloaded and therefore slower to respond.
Using cloud storage addresses provides an element of scalability and availability, but since cloud storage addresses are generally each provisioned at a particular network location or small group of network locations, they may be far (in terms of network distance) from a given user computing device and/or subject to intermediate network performance.
The DNS resolutions performed by content delivery networks allow for selection of servers that are ideally close, in network terms, to the user device, and that are not overloaded, but such DNS resolutions are not always accurately based on the location of the user's device, and further are generally not very granular. DNS resolution requests are not sent directly by the user's device to the content delivery network's DNS servers; rather, the user's device, typically along with thousands (or tens or hundreds of thousands) of other user devices, sends DNS requests to the local name server of the access network or mobile data network (generally, an Internet Service Provider, or ISP) that the user device is connected to, and the ISP local name server then sends the request to the content delivery network DNS servers for resolution.
As a result, content delivery network DNS servers know only the location of the ISP local name server, and not the location of the user device itself, at the time of DNS resolution; while it is often the case that the network location of the ISP local name server is a good indicator of the network location of the user device, this is not always true—in a significant percentage of cases the network location of the ISP local name server is not an accurate indicator of the network location of the user device and in many cases is misleading. Further, DNS resolutions received by the ISP local name server from the content delivery network's DNS servers are cached (stored for a specified period of time) by the ISP local name server and reused, without re-contacting the content delivery network's DNS servers, for many other DNS resolution requests sent to the ISP local name server from other user devices; as a consequence, a single content delivery network DNS resolution may be reused thousands or tens of thousands of times before the ISP local name server re-contacts the content delivery network's DNS servers. And finally, a DNS resolution request, which is an Internet standard message, incorporates just the hostname contained in a URL, not the entire URL, and specifically does not include the path portion of the URL typically needed to identify the specific resource addressed by the URL; further, DNS resolutions are performed before the entire URL (which identifies the specific resource) is transmitted by the user device to the content delivery network (this URL transmission is the actual request from the user device to the content delivery network for the resource). 
Accordingly, a content delivery network's DNS hostname resolutions generally must each accommodate many (likely thousands or tens of thousands) possible content objects, many (likely thousands or tens of thousands) prospective user requests, and unknown (and possibly changing) potential server demand, all at the moment the hostname is resolved into one or more IP addresses.
While selecting a better network route or prioritizing network traffic can improve network performance over the managed portion of the end-to-end network route, it does not manage the unmanaged portion of the end-to-end network route, nor does it address server performance, server scalability, or server overloading.
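By way of illustration only, the observation above that a DNS resolution request carries just the hostname of a URL, and not the path that identifies the specific resource, can be sketched as follows. The URLs and the function name dns_visible_part are hypothetical and are introduced solely for this example.

```python
from urllib.parse import urlsplit


def dns_visible_part(url):
    """Return the hostname, the only part of a URL a DNS request carries."""
    return urlsplit(url).hostname


# Hypothetical URLs for two different resources served under one hostname.
URLS = [
    "http://cdn.example.com/videos/movie-a/segment1.ts",
    "http://cdn.example.com/images/banner.png",
]

# Both requests look identical at DNS resolution time ...
HOSTNAMES = {dns_visible_part(u) for u in URLS}

# ... although the paths, sent only later in the HTTP request, differ.
PATHS = [urlsplit(u).path for u in URLS]

if __name__ == "__main__":
    print(HOSTNAMES)
    print(PATHS)
```

In this sketch a single hostname resolution would have to serve both a video segment and an image, which is one way to picture the lack of granularity described above.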
Within each, and across all, of these approaches, one of a group of service providers may perform best under one set of technical conditions (comprising, for example, particular user devices, particular access networks, particular access network types [such as mobile or broadband], particular network and/or geographic locations, particular object types and sizes or resource characteristics, particular object library sizes and access patterns or resource access characteristics, and particular demand scenarios), while another service provider may perform better under a second set of technical conditions. Service provider performance under various sets of technical conditions may change over time, slowly or rapidly, may fluctuate regularly or irregularly, and may be significantly impacted by unusual occurrences, which can be long-lived or short-lived, such as sudden changes in aggregate demand or localized demand, outages, network and equipment failures, and security attacks and breaches. Service provider performance can be difficult to test fairly and objectively, especially when service providers are aware of testing and/or when performance testing itself is detectable by technical methods, especially automated technical methods. Service provider performance is also difficult to monitor in actual use (as opposed to in testing use), in detail, from a user's perspective, across multiple service providers and sets of technical conditions, within a timeframe that allows for effective action.
Further complicating the technical environment, in many cases digital services elect to allocate their use of each given infrastructure service among two or more infrastructure service providers, for example, concurrently using two (or more than two) content delivery networks to service content requests, allocating a given request to one or the other content delivery network. There are both business and technical reasons for this; a digital service may increase its negotiating leverage over service providers by utilizing more than one, and at the same time gains redundancy that allows it to continue operating if one of the service providers suffers an outage. Also in many cases, resources are supplied from multiple disparate originators, for example in the case of editorial content combined with advertising content (wherein the editorial content commonly originates from a publisher and the advertising content commonly originates from a separate advertising agency, network, server, or service provider); in such cases it is common that each originator utilizes a separate, independently managed technical infrastructure. When a digital service uses multiple infrastructure service providers, or needs to supply resources to end users from multiple originators, or both, the aggregate technical performance and actual or perceived user experience is subject to a greater and more complex range of technical factors, resulting from the performance variations among the infrastructure service providers, performance variations among the multiple originators, and performance characteristics of interactions among the multiple originators. Under these conditions, the weakest link condition may exacerbate effects on the overall user experience, or the experiences of a significant portion of users.
Some of these challenges might be partially addressed through the implementation on the user's device of software that collects data associated with technical performance and then adjusts infrastructure service provider selections and infrastructure services based on that data. This approach is limited, however, by limitations on the software environments implemented on user devices by the manufacturers of user devices, and by the increased complexity and other consequences to digital service operators of increasing the amount of programming included in the digital service's user device software or applications. For example, the browsers built into most mobile devices, including the mini Safari browser built into Apple's iOS and the mobile Chrome browser built into Android-certified devices, do not support the installation of browser extensions; as a result, browser-based applications can only interact with the browser, and the software function of the browser cannot be extended. Similarly, the software programming environments available on most mobile devices (including iOS-based devices and most Android-certified devices) do not support direct, low-level interaction with the Transmission Control Protocol (TCP) handling module on the device; this means that a software application running on an iOS device, for example, cannot directly measure connection latency, TCP packet loss, or TCP packet jitter (variations in the arrival rate of TCP packets). As a result, relying on software on the user's device would require relying on software that, in the case of many devices (including most mobile devices), is limited in terms of what it can observe, measure and implement. 
Even in the less restricted environment of a desktop browser, wherein using a browser extension installed into the browser can enable lower-level interaction with desktop operating system functions, the explicit user action required to install the browser extension operates as a significant impediment to practical, widespread implementation. And finally, implementing additional technical function in an end user application or via a browser extension increases the amount of programming code in the application (or in the application combined with the extension), which in turn increases programming development scope and cost, quality assurance scope, programming and operational complexity, the risk of software failure, and time to market for new software products and new releases of existing software products.
What is needed, then, is a way to assign (where appropriate) or deliver (where appropriate) resources needed by a digital service operating in a networked environment, and to assign infrastructure service providers to resource tasks required by a digital service, that operates without software extensions added to user device apps or user device browsers, that measures and manages performance of infrastructure resources, that measures and manages performance of multiple disparate infrastructure service providers, and that as a result improves the user's actual and perceived experience with the digital service.
Adaptive bitrate streaming can improve a user's experience when streaming multimedia content, such as video, over a data network or telecommunications network. To enable adaptive streaming of a video file, typically the video is encoded into multiple separate files, sometimes referred to as renditions or variants, each of which represents the same video encoded at a different reference bitrate. These files are then divided into segments in a consistent manner across the related group of files, with each segment typically (but not necessarily) a few seconds to several seconds in duration, e.g., the first segment of each variant file comprising the first ten seconds of the video, the second segment of each variant file comprising the second ten seconds, etc. Note that while this time-based segmentation is consistent across the bitrate files, it is not necessary that each sequential segment be the same duration; for example, the first segment of each bitrate can be ten seconds in duration, the second segment of each bitrate can be five seconds in duration, and the third segment of each bitrate can be six seconds in duration, and so on. Then, during playback of the video, the video player downloads the video segment file by segment file and can shift between different reference encoding bitrates as it proceeds from one segment to another, if necessary and depending on the rate at which segment files are downloaded to the device and other performance considerations, which can be affected by network conditions, server performance, device performance or characteristics, and/or other technical issues. For example, to maintain continuous playback, the player can downshift to a lower reference encoding bitrate when the network is congested and throughput is reduced; later, if network performance improves and throughput increases, the player can upshift to a higher reference encoding bitrate. 
Note that although playback is not interrupted, the user's experience can still be affected since a downshift in reference bitrate can cause a variation in video quality that is noticeable to the user, and since a lower bitrate video file may have reduced definition and other visual characteristics.
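As a minimal sketch of the downshift and upshift behavior described above, and not a description of any particular player, a selection rule could pick the highest reference bitrate that fits within a fraction of the most recently measured throughput. The function names, the safety factor of 0.8, and the example bitrates are assumptions made for illustration.

```python
def measured_throughput_bps(segment_bytes, download_seconds):
    """Estimate throughput from the size and download time of one segment."""
    return segment_bytes * 8 / download_seconds


def choose_bitrate(available_bps, throughput_bps, safety_factor=0.8):
    """Pick the highest reference bitrate within the throughput budget.

    Falls back to the lowest available bitrate when none fits, so that
    playback can continue at reduced quality. The safety factor is an
    assumed margin, not a value taken from any specification.
    """
    budget = throughput_bps * safety_factor
    candidates = [b for b in sorted(available_bps) if b <= budget]
    return candidates[-1] if candidates else min(available_bps)


# Hypothetical reference encoding bitrates, in bits per second.
RENDITIONS = [400_000, 1_200_000, 3_000_000]

if __name__ == "__main__":
    # Congested network: downshift to the lowest rendition.
    print(choose_bitrate(RENDITIONS, 1_000_000))
    # Improved network: upshift to the highest rendition.
    print(choose_bitrate(RENDITIONS, 5_000_000))
```

Real players commonly also weigh buffer occupancy, device capability, and smoothing of throughput samples; this sketch deliberately omits those considerations.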
Generally, video file encoding can be performed in a consistent manner or in a variable manner. When a video file is encoded in a consistent manner, the encoded bit rate of the video is consistent during the video; accordingly, when a consistently encoded video is divided into segments, each segment of a given duration will be a comparable size file to other segments of the same duration. When a video file is encoded in a variable manner, the encoded bit rate of the video may vary during the video, for example when greater motion in a given sequence of video frames results in a higher bitrate in order to preserve the visual consistency of that sequence compared to other portions of the video; accordingly, when a variably encoded video is divided into segments, a segment of a given duration may be a different file size, larger or smaller, compared to one or more other segments of the same duration. Not all adaptive streaming video specifications and/or implementations support, or work properly with, variably encoded video files, but some may.
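The relationship between encoded bitrate and segment file size described above can be illustrated with simple arithmetic. The bitrates and durations below are hypothetical, and container overhead is ignored for simplicity.

```python
def segment_size_bytes(bitrate_bps, duration_s):
    """Approximate segment size: bits per second times seconds, over 8."""
    return bitrate_bps * duration_s / 8


# Consistent encoding: every ten-second segment at 2,000,000 bits per second
# is approximately the same size.
CONSISTENT = [segment_size_bytes(2_000_000, 10) for _ in range(3)]

# Variable encoding: the bitrate differs per segment (for example, higher
# during sequences with greater motion), so sizes differ for equal durations.
VARIABLE = [segment_size_bytes(b, 10) for b in (1_500_000, 3_000_000, 2_000_000)]

if __name__ == "__main__":
    print(CONSISTENT)
    print(VARIABLE)
```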
Typically, in order to obtain the video segment files for playback the video player first requests a master manifest file, sometimes also called an index file or a playlist, by issuing an HTTP GET request for the master manifest Uniform Resource Locator (URL). The master manifest is typically a text file comprising a plurality of URLs, each of which identifies a variant manifest; these URLs can be absolute or relative URLs, and are commonly relative URLs. The video player then requests some or all of the variant manifest files by issuing HTTP GET requests for the URLs of the required variant manifests. The video player may also issue HTTP HEAD requests for the URLs of some or all of the variant manifests that are not requested in full (if any); this enables the video player to confirm that a manifest file is available for later download, and to obtain information about the file contained in the response headers. Each variant manifest is typically a text file comprising a plurality of URLs, each of which identifies a video segment file; these URLs can also be absolute or relative URLs, and are commonly relative URLs. Manifest files can contain other information in addition to URLs, for example metadata and other descriptive or control information. In the case of live or linear video, as the video player proceeds through playback of the segments identified in the then-current variant manifest, it will request an updated variant manifest, which should contain additional video segment URLs; in normal operation, updated variant manifest files will continue to be requested by, and available to, the video player until a manifest file is reached that contains an endlist tag or comparable indicator that the video stream has reached its end.
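For illustration, the manifest handling described above can be sketched as follows, assuming an HTTP Live Streaming style text format in which lines beginning with "#" carry tags or other control information and the remaining non-blank lines are URLs. The sample manifests, the example hostnames, and the function names are hypothetical; other adaptive streaming protocols use different manifest formats.

```python
from urllib.parse import urljoin


def extract_urls(manifest_text, manifest_url):
    """Return the absolute URLs listed in a manifest.

    Relative URLs are resolved against the URL of the manifest itself;
    lines starting with '#' are treated as tags and skipped (an assumption
    that matches HLS-style playlists).
    """
    urls = []
    for line in manifest_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        urls.append(urljoin(manifest_url, line))
    return urls


def has_endlist(manifest_text):
    """Report whether the manifest signals the end of the stream."""
    return any(l.strip() == "#EXT-X-ENDLIST" for l in manifest_text.splitlines())


# Hypothetical master manifest with one relative and one absolute URL.
MASTER_URL = "http://video.example.com/movie/master.m3u8"
MASTER = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=400000
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1200000
http://other.example.com/high/index.m3u8
"""

# Hypothetical variant manifest for a video that has reached its end.
VARIANT = """#EXTM3U
#EXTINF:10.0,
segment1.ts
#EXTINF:5.0,
segment2.ts
#EXT-X-ENDLIST
"""

if __name__ == "__main__":
    for variant_url in extract_urls(MASTER, MASTER_URL):
        print(variant_url)
    print(has_endlist(VARIANT))
```

A player following this sketch would fetch each resolved variant URL with an HTTP GET request and, for live or linear video, would re-request the variant manifest until has_endlist reports that the stream has ended.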