A media content provider or distributor may deliver various media contents to subscribers or users using different coding schemes suited for different devices, such as televisions, notebook computers, and mobile handsets. The media content provider may support a plurality of media encoders and/or decoders (codecs), media players, video frame rates, spatial resolutions, bit-rates, video formats, or combinations thereof. A media content may be converted from a source or original representation to various other representations to suit the different user devices.
A media content may comprise a media presentation description (MPD) and a plurality of segments. The MPD may be an extensible markup language (XML) file or document describing the media content, such as its various representations, uniform resource locator (URL) addresses (or, more generally, uniform resource identifiers (URIs)), and other characteristics. For example, the media content may comprise several media components (e.g. audio, video, and text), each of which may have different characteristics that are specified in the MPD. Each media component comprises a plurality of segments containing the parts of actual media content, and the segments may be stored collectively in a single file or individually in multiple files. Each segment may contain a pre-defined byte size (e.g., 1,000 bytes) or an interval of playback time (e.g., 2 or 5 seconds) of the media content.
Depending on the application, the media content may be divided into various hierarchies. For example, the media content may comprise multiple periods, where a period is a time interval relatively longer than a segment. For instance, a television program may be divided into several 5-minute-long program periods, which are separated by several 2-minute-long commercial periods. Further, a period may comprise one or multiple adaptation sets (AS). An AS may provide information about one or multiple media components and their various encoded representations. For instance, one AS may contain different bit-rates of a video component of the media content, while another AS may contain different bit-rates of an audio component of the same media content. A representation may be an encoded alternative of a media component, varying from other representations by bit-rate, resolution, number of channels, or other characteristics, or combinations thereof. Each representation comprises multiple segments, which are media content chunks in a temporal sequence. Moreover, to enable downloading a segment in multiple parts, sub-segments may sometimes be used, each having a specific duration and/or byte size. One skilled in the art will understand the various hierarchies that can be used to deliver a media content.
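The period / adaptation set / representation / segment hierarchy described above can be illustrated with a short Python sketch that walks a heavily simplified MPD. The MPD text, the identifiers (`program-1`, `v-low`, and so on), and the segment paths are hypothetical and chosen only for illustration; a real MPD carries many more attributes and may describe segments in other ways.

```python
import xml.etree.ElementTree as ET

# Namespace used by DASH MPD documents.
NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}

# Hypothetical, heavily simplified MPD: one period, two adaptation sets.
MPD_XML = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period id="program-1">
    <AdaptationSet contentType="video">
      <Representation id="v-low" bandwidth="500000">
        <SegmentList>
          <SegmentURL media="video/low/seg1.m4s"/>
          <SegmentURL media="video/low/seg2.m4s"/>
        </SegmentList>
      </Representation>
      <Representation id="v-high" bandwidth="3000000">
        <SegmentList>
          <SegmentURL media="video/high/seg1.m4s"/>
          <SegmentURL media="video/high/seg2.m4s"/>
        </SegmentList>
      </Representation>
    </AdaptationSet>
    <AdaptationSet contentType="audio">
      <Representation id="a-main" bandwidth="128000">
        <SegmentList>
          <SegmentURL media="audio/main/seg1.m4s"/>
          <SegmentURL media="audio/main/seg2.m4s"/>
        </SegmentList>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
"""


def list_representations(mpd_xml):
    """Return (period id, content type, representation id, bit-rate, segment URLs) tuples."""
    root = ET.fromstring(mpd_xml)
    rows = []
    for period in root.findall("dash:Period", NS):
        for aset in period.findall("dash:AdaptationSet", NS):
            for rep in aset.findall("dash:Representation", NS):
                urls = [
                    seg.get("media")
                    for seg in rep.findall("dash:SegmentList/dash:SegmentURL", NS)
                ]
                rows.append(
                    (
                        period.get("id"),
                        aset.get("contentType"),
                        rep.get("id"),
                        int(rep.get("bandwidth")),
                        urls,
                    )
                )
    return rows


for row in list_representations(MPD_XML):
    print(row)
```

Each printed tuple corresponds to one representation, listing the period and adaptation set it belongs to and the segment URLs that a client would request in temporal order.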
In adaptive streaming, when delivering a media content to a user device, the user device may select appropriate segments dynamically based on a variety of factors, such as network conditions, device capability, and user choice. Adaptive streaming may include various technologies or standards implemented or being developed, such as Dynamic Adaptive Streaming over Hypertext Transfer Protocol (HTTP) (DASH), HTTP Live Streaming (HLS), or Internet Information Services (IIS) Smooth Streaming. For example, the user device may select a segment with the highest quality (e.g., resolution or bit-rate) possible that can be downloaded in time for playback without causing stalling or rebuffering events in the playback. Thus, the user device may seamlessly adapt its media content playback to changing network conditions. To prevent tampering with or attacks on a media content, segments of the media content need to be protected via authentication schemes. Various attacks (e.g., replication attacks with segments from unexpected representations) may need to be prevented, even when those segments are correct in terms of source and scheduling/timing.
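The segment selection described above, in which the user device picks the highest quality that can still be downloaded in time, can be sketched as a simple throughput-based rule. This is a minimal illustration only: the function name `select_bitrate`, the `safety` margin of 0.8, and the example bit-rates are assumptions, and deployed players typically also consider buffer occupancy, device capability, and user choice.

```python
def select_bitrate(available_bps, measured_throughput_bps, safety=0.8):
    """Pick the highest bit-rate that fits within a fraction of the measured throughput.

    Falls back to the lowest available bit-rate when none fits, so that
    playback can continue at reduced quality rather than stall.
    """
    budget = measured_throughput_bps * safety
    affordable = [b for b in available_bps if b <= budget]
    return max(affordable) if affordable else min(available_bps)


# Hypothetical representations of one video component, in bits per second.
bitrates = [500_000, 1_500_000, 3_000_000]

print(select_bitrate(bitrates, 10_000_000))  # ample bandwidth
print(select_bitrate(bitrates, 2_000_000))   # moderate bandwidth
print(select_bitrate(bitrates, 400_000))     # constrained bandwidth
```

Re-evaluating this choice before each segment request is what lets playback follow changing network conditions.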
The ISO/IEC 23009-4 document entitled “Dynamic Adaptive Streaming over HTTP (DASH)—Part 4: Segment Encryption and Authentication”, published by the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) and herein incorporated by reference in its entirety, describes some mechanisms for authenticating DASH segments. However, ISO/IEC 23009-4 provides no mechanism to support authentication of segment URLs in a given MPD, when the MPD is used to provide source URL information for segments to be streamed. Segment URL authentication is important in a number of situations.
One situation in which incorrect URL mappings may be introduced occurs when a list of valid URLs is compiled by a reseller. In this case, the reseller can be an internet service provider or mobile operator, and an original DASH service provider provides DASH content, which may include some advertisements, but lets the reseller deliver the DASH content to end users. If the reseller intends to replace an original advertisement with another advertisement, the reseller need only modify the segment URLs of the original advertisement and map them to the advertisement that it intends to stream. Therefore, it is desirable to bind a URL to the content that it represents, ensuring that the original content gets delivered to the users and that it can be validated that the intended content is actually accessed by the intended client(s).
Another situation is when an original content service provider wants to restrict access to its content from MPDs that are not generated by the provider but (re)use its segment URLs in an unauthorized manner, in order to protect the copyright and prevent hotlink misuse by malicious parties.
For the case of URL authenticity, the URL signature scheme has been discussed in MPEG as part of the on-going DASH Core Experiment (CE) on URL authentication. The main idea is to sign each individual URL and create another URL pointing to the URL signature. Then, the client requests the URL signature and checks for authenticity of the URL. This scheme can ensure the authenticity, integrity, and origin of the URL, provided the signature URL has not been tampered with. Nevertheless, this scheme is subject to three types of attacks.
A tamper attack on individual URLs and their signature URLs. The attacker can replace both an authentic URL and its signature URL with another authentic URL and its corresponding signature URL, where both signatures were generated by the same creator, so that verification still succeeds.
A shuffle attack on all or parts of URLs. The attacker can change the order of URLs and the client is unable to detect that attack because each signature verification is successful.
A deletion attack on one or more URLs. The client cannot detect this attack via URL authentication.
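The three attacks listed above can be reproduced with a small Python sketch in which every URL is signed and verified on its own. As a loudly stated assumption, an HMAC with a shared demo key stands in for the digital signature that an actual URL signature scheme would use, and the URL list is hypothetical; the point is only that per-URL verification says nothing about which URLs appear in the list or in what order.

```python
import hashlib
import hmac

# Assumption: HMAC-SHA256 with a demo key stands in for a real digital signature.
SIGNING_KEY = b"demo-signing-key"


def sign_url(url):
    """Produce a per-URL signature (stand-in for the content creator's signature)."""
    return hmac.new(SIGNING_KEY, url.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_url(url, signature):
    """Check one URL against its own signature, independently of all other URLs."""
    return hmac.compare_digest(sign_url(url), signature)


def all_urls_verify(entries):
    """Client-side check: every (URL, signature) pair verifies individually."""
    return all(verify_url(url, sig) for url, sig in entries)


# Hypothetical authentic list of (URL, signature) pairs from the creator.
original = [(u, sign_url(u)) for u in ("seg1.m4s", "ad1.m4s", "seg2.m4s", "seg3.m4s")]

# Tamper attack: swap in another authentic URL/signature pair from the same creator.
other_ad = ("ad2.m4s", sign_url("ad2.m4s"))
tampered = [original[0], other_ad, original[2], original[3]]

# Shuffle attack: reorder authentic pairs.
shuffled = [original[2], original[0], original[1], original[3]]

# Deletion attack: drop an authentic pair.
deleted = [entry for entry in original if entry[0] != "ad1.m4s"]

for name, entries in (("tampered", tampered), ("shuffled", shuffled), ("deleted", deleted)):
    # Each modified list still passes per-URL verification.
    print(name, all_urls_verify(entries))
```

All three modified lists pass verification because each check covers a single URL in isolation; nothing in the verified data ties a URL to its position in the list or to the set of URLs as a whole.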
Another purpose of URL authentication is to verify authenticity of the segment to ensure the received segment is the segment that the client intended to receive. However, if the technical solution does not associate the URL with the segment, a segment tamper attack may occur. That is, the intended segment can be replaced on the segment server or during segment delivery, even though the segment is signed. For instance, the attacker can replace both segment A and its signature with segment B and its signature, and the client cannot detect this attack.
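The segment tamper attack described above can be sketched in Python as follows. As an assumption made only for illustration, an HMAC with a demo key stands in for a real segment signature, and the URLs and segment bytes are hypothetical. The second half of the sketch shows, purely as a contrast and not as the scheme of any cited document, how covering the URL together with the segment bytes would let a client notice the swap.

```python
import hashlib
import hmac

# Assumption: HMAC-SHA256 with a demo key stands in for a real digital signature.
SIGNING_KEY = b"demo-signing-key"


def sign_segment_only(data):
    """Signature over the segment bytes alone (no association with the URL)."""
    return hmac.new(SIGNING_KEY, data, hashlib.sha256).hexdigest()


def sign_url_and_segment(url, data):
    """Illustrative signature over the URL and the segment bytes together."""
    message = url.encode("utf-8") + b"\x00" + data
    return hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()


# Hypothetical segments, both legitimately signed by the same creator.
URL_A, SEGMENT_A = "ads/original.m4s", b"original advertisement bytes"
URL_B, SEGMENT_B = "ads/substitute.m4s", b"substitute advertisement bytes"

# The client requests URL_A, but the attacker serves segment B with B's signature.
received_segment = SEGMENT_B
received_signature = sign_segment_only(SEGMENT_B)

# Segment-only check: passes, so the swap goes unnoticed.
unbound_ok = hmac.compare_digest(sign_segment_only(received_segment), received_signature)

# URL-bound check: the client verifies against the URL it actually requested.
received_bound_signature = sign_url_and_segment(URL_B, SEGMENT_B)
bound_ok = hmac.compare_digest(
    sign_url_and_segment(URL_A, received_segment), received_bound_signature
)

print("segment-only check passes:", unbound_ok)
print("URL-bound check passes:", bound_ok)
```

The segment-only check succeeds even though the wrong segment was delivered, whereas the URL-bound check fails because the signature was produced for a different URL than the one requested.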
For the content access authorization case, a URL-based signature scheme was proposed in S4-130680, "Gap Analysis on DASH Authentication Use Cases," 3GPP TSG-SA4 Meeting #74, and in RFC 6983, "Models for HTTP-Adaptive-Streaming-Aware Content Distribution Network Interconnection," July 2013. In this scheme, each segment URL is signed for each client as an authorization, and the content delivery network (CDN) checks the signature before delivering the segment to the client. The scheme is secure from attack; however, it may impose a heavy workload on the signing server because the server needs to sign every URL in the MPD for each authorized user.
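The workload concern above follows from the fact that the number of signing operations grows with the product of the number of authorized users and the number of URLs in the MPD. The following Python sketch illustrates this under stated assumptions: an HMAC with a demo key stands in for the signature, the token layout (client identifier plus URL) is hypothetical and not taken from the cited documents, and the program length, segment duration, and client count are invented example figures.

```python
import hashlib
import hmac

# Assumption: HMAC-SHA256 with a demo key stands in for the signing server's signature.
SIGNING_KEY = b"demo-signing-key"


def sign_url_for_client(url, client_id):
    """Sign one segment URL for one client (hypothetical token layout)."""
    message = client_id.encode("utf-8") + b"\x00" + url.encode("utf-8")
    return hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()


def authorize_mpd(urls, client_id):
    """Signing server: one signature per URL in the MPD for this client."""
    return {url: sign_url_for_client(url, client_id) for url in urls}


def cdn_check(url, client_id, token):
    """CDN: verify the token before delivering the segment."""
    return hmac.compare_digest(sign_url_for_client(url, client_id), token)


def signing_operations(num_clients, urls_per_mpd):
    """Total signatures the signing server must produce."""
    return num_clients * urls_per_mpd


# Hypothetical figures: a 2-hour program, 2-second segments, 5 representations.
segments_per_representation = (2 * 60 * 60) // 2       # 3,600 segments
urls_per_mpd = segments_per_representation * 5         # 18,000 URLs
print(signing_operations(10_000, urls_per_mpd))        # signatures for 10,000 clients

tokens = authorize_mpd(["seg1.m4s", "seg2.m4s"], "client-42")
print(cdn_check("seg1.m4s", "client-42", tokens["seg1.m4s"]))   # authorized client
print(cdn_check("seg1.m4s", "client-99", tokens["seg1.m4s"]))   # token reused by another client
```

With these example figures the signing server would produce 180 million signatures for a single program, which illustrates why per-client, per-URL signing scales poorly.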