The overall capacities of broadband satellites are increasing exponentially, and such capacity increases present unique challenges in the associated ground system and network designs. The goal of the system designers, system operators, and service providers is to support and provide efficient, robust, reliable and flexible services, in a shared bandwidth network environment, utilizing such high capacity satellite systems.
According to recent internet traffic studies, media streaming traffic (e.g., video streaming) makes up more than 50% of forward link bandwidth from web servers to client devices, and more than 15% of the return link bandwidth from client devices to web servers. Further, the trend is moving upwards as more and more content providers start offering media (e.g., video) streaming services. For example, recent additions include HBO, CBS and other network and content provider streaming services. When a user watches or otherwise consumes a video, if the video is stored in a local storage device (e.g., a local cache) the streaming video content is provided directly from the local storage location. Alternatively, when the video content is not resident in a local storage device, the streaming content is provided over a wide area communications network or WAN (e.g., the Internet) from a remote content server. When the video is provided to the user or client device/application via adaptive video streaming, the user client device/application (e.g., a video player application running on a client personal computer or other media device) selects a playback rate and retrieves video segments of the respective playback rate from the content server via a request/response protocol. Further, such client playback devices/applications typically buffer a certain amount of content in order to provide the content from the local buffer at a consistent rate (thereby not having to rely on a consistent delivery rate over the WAN).
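The client-side rate selection described above can be illustrated with a minimal sketch. It is not the logic of any particular player: the function name, the 10-second buffer threshold, and the 0.8 safety factor are assumptions chosen only for illustration.

```python
from typing import Sequence


def select_bitrate(
    available_kbps: Sequence[int],
    throughput_kbps: float,
    buffer_s: float,
    low_buffer_s: float = 10.0,   # assumed threshold, illustrative only
    safety: float = 0.8,          # assumed safety margin, illustrative only
) -> int:
    """Pick a playback rate from the rates the content server offers."""
    rates = sorted(available_kbps)
    # When the local buffer is nearly empty, fall back to the lowest rate
    # so that playback can continue at a consistent pace.
    if buffer_s < low_buffer_s:
        return rates[0]
    # Otherwise choose the highest rate that fits within a fraction of the
    # measured WAN throughput.
    budget = throughput_kbps * safety
    candidates = [r for r in rates if r <= budget]
    return candidates[-1] if candidates else rates[0]
```

The client would then request the video segments encoded at the returned rate via the request/response protocol, repeating the selection as the measured throughput and buffer level change.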
Typically, such content is addressed via a uniform resource locator (URL) or web address. The URL is a character string utilized as a reference to a web resource (e.g., an enterprise website or media content) that specifies the location of the resource on a computer network (e.g., the Internet) and a mechanism for retrieving it. Traditionally, with the caching of web content, such as streaming media content, the URLs have been used as cache keys for the HTTP objects in the cache. An HTTP object is a web resource transferred as an entity body in an HTTP response, and a cache key is an identifier of an HTTP object stored in the cache. In recent times, however, it has become commonplace to have more than one URL pointing to the same HTTP object. To address this issue, URL de-duplication approaches have recently been devised in order to maintain the feasibility of using the URL as a cache key for the respective HTTP objects. The main idea in URL de-duplication involves determining identifiable strings in a URL that are unique to a respective HTTP object. Such unique identifiable strings, however, may not be located in the URL, but instead may be located in HTTP request headers. In such cases, de-duplication approaches that rely on the URL alone to create a cache key may no longer work. Moreover, for security and the prevention of piracy, providers of adaptive video streaming services have started obscuring the URL structure. The obscuring of the URL structure has resulted in URLs becoming ephemeral, where identifiable strings in the URL and HTTP header have become arbitrary and not necessarily unique.
In one current de-duplication approach, URLs are used as cache keys for HTTP objects in a Squid caching proxy. See, e.g., squid-cache.org, “Squid: Optimising Web Delivery,” http://www.squid-cache.org/. As an HTTP object may be represented by more than one URL in today's Web, several techniques have been provided in Squid to de-duplicate those URLs, and to map all those URLs to a single cache key for an HTTP object. Examples of such Squid features include “Store URL Rewriting” and “Store ID.” These approaches require all URLs corresponding to the same HTTP object to include uniquely identifiable strings. Such approaches, however, as discussed above, may no longer work in the cases of intentionally obfuscated ephemeral URLs, as the identifiable strings in such URLs may not be unique. Furthermore, identifiable strings may not be located in the URLs, but instead may be included in HTTP header fields of the HTTP request.
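The mapping of several URLs to a single cache key can be sketched as follows. This is a minimal illustration of the general idea and is not Squid's helper interface: the host suffix `example-cdn.test`, the `/videoplayback` path, and the `id` and `range` query parameters are hypothetical and stand in for whatever identifiable strings a given content provider exposes.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def store_key(url: str) -> Optional[str]:
    """Map a URL to a cache key, or return None when no rule applies."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Hypothetical rule: all mirror hosts under example-cdn.test serve the
    # same objects from the /videoplayback path.
    if not host.endswith(".example-cdn.test") or parts.path != "/videoplayback":
        return None
    query = parse_qs(parts.query)
    try:
        # Hypothetical identifiable strings: "id" and "range" are assumed
        # to identify the HTTP object uniquely.
        video_id = query["id"][0]
        byte_range = query["range"][0]
    except KeyError:
        # The identifiable strings are absent from the URL (e.g., they are
        # carried in request headers or the URL is obfuscated).
        return None
    return f"example-cdn.test/videoplayback?id={video_id}&range={byte_range}"
```

Two URLs that differ only in the mirror host, the parameter order, or a per-session token map to the same key under this rule. A URL whose identifiable strings are absent, or are arbitrary and not unique, yields either no key or a misleading one, which is the limitation noted above.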
In another recent approach, a solution is proposed that employs a collision-resistant hash (SHA-256) to identify an HTTP response body at a caching proxy. See, e.g., Chris Drechsler, “Hypertext Transfer Protocol: Improved HTTP Caching,” IETF Internet-Draft, May 16, 2014 (http://tools.ietf.org/id/draft-drechsler-httpbis-improved-caching-00.txt). According to the approach, the hash is computed by the content server and is transferred as a new HTTP header field in the HTTP response. In contrast to the common HTTP caching operation, regardless of a cache hit or a cache miss, HTTP requests are sent to the content server and the caching proxy waits for an HTTP response to process. In the case of a cache miss, the HTTP response, as well as the corresponding hash carried in the HTTP response header, is stored in the caching proxy, and the response is relayed to the client. In the case of a cache hit, after receiving the whole HTTP response header, the caching proxy aborts the transfer of the HTTP response body by issuing a TCP reset (RST) to the content server. This proposed caching operation, however, exhibits deficiencies in that it does not save the round-trip time (RTT) of the HTTP request/response exchange in the case of a cache hit, and it also requires modification to the existing HTTP protocol (which would require modification of already available off-the-shelf devices and already deployed systems).
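The hit/miss handling of such a hash-keyed caching proxy can be sketched as follows. This is a simplified illustration under stated assumptions, not the operation specified in the cited draft: the header name `X-Body-Digest` is a hypothetical placeholder, the hash is assumed to be a hexadecimal SHA-256 digest, and aborting the upstream transfer is modeled simply by not reading the body.

```python
import hashlib
from typing import Callable, Dict, Tuple

# Hypothetical header name; the cited draft defines its own header field.
DIGEST_HEADER = "X-Body-Digest"


class DigestCache:
    """Cache of HTTP response bodies keyed by the server-supplied hash."""

    def __init__(self) -> None:
        self._bodies: Dict[str, bytes] = {}

    def handle_response(
        self, headers: Dict[str, str], read_body: Callable[[], bytes]
    ) -> Tuple[bytes, bool]:
        """Return (body, cache_hit) once the response header has arrived."""
        digest = headers.get(DIGEST_HEADER)
        if digest is not None and digest in self._bodies:
            # Cache hit: the body is served locally and read_body is never
            # called (a real proxy would abort the upstream transfer here).
            return self._bodies[digest], True
        # Cache miss: relay the body and store it under its hash.
        body = read_body()
        if digest is not None and hashlib.sha256(body).hexdigest() == digest:
            self._bodies[digest] = body
        return body, False
```

Note that `handle_response` can run only after the response header has been received from the content server, so the request still incurs a round trip to the server even on a cache hit.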
In a further approach, a solution is proposed that also adds a cryptographic hash to a web resource, in this case to provide an integrity check. See, e.g., Frederik Braun, Devdatta Akhawe, Joel Weinberger, Mike West, “Subresource Integrity,” W3C Editor's Draft, 2015 Jan. 8 (https://w3c.github.io/webappsec-subresource-integrity/). The hash is computed by the content server. The approach can also provide a third-party link from which to fetch the web resource. All of the information, which includes a URL to a web resource, optionally a URL pointing to a third-party site such as a CDN, the hash algorithm, the hash and the content type, is provided by the original content server to a browser. To obtain the security benefits, it is recommended that the integrity metadata be delivered via HTTPS. The approach requires the content server to provide the hash of the content. Caching based on the hash in the integrity metadata, however, can be done only when the integrity metadata is transferred over HTTP. The specification is still at an early stage at W3C, and it is not clear that content providers using obfuscated ephemeral URLs will apply the technique to enable caching.
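The integrity metadata used by this approach can be illustrated with a short sketch that computes and checks a hash string of the form `<algorithm>-<base64 digest>`, which is the format used by Subresource Integrity. The function names are hypothetical, and the sketch covers only the hash computation, not the delivery of the metadata or the browser behavior.

```python
import base64
import hashlib

# Hash algorithms accepted in integrity metadata.
ALLOWED_ALGORITHMS = ("sha256", "sha384", "sha512")


def integrity_value(content: bytes, algorithm: str = "sha256") -> str:
    """Compute an integrity string for the given resource content."""
    if algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches(content: bytes, integrity: str) -> bool:
    """Check fetched content against an integrity string."""
    algorithm, _, _ = integrity.partition("-")
    if algorithm not in ALLOWED_ALGORITHMS:
        return False
    return integrity_value(content, algorithm) == integrity
```

Because the value depends only on the content, it could in principle serve as a cache key, but only if the content server supplies it and an intermediate cache is able to read it.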
What is needed, therefore, is an approach for effective and accurate content identification for caching of adaptive video streaming that does not require modification of existing video streaming protocols, such as HTTP and HTTPS.