Field
The current disclosure relates to audio-visual (AV) content delivery, and more specifically, but not exclusively, to AV content delivery using a packet-switched network.
Description of the Related Art
Audio-visual (AV) content, in the form of digital data, may be delivered to consumers using content delivery networks (CDNs). A typical CDN uses a packet-switched network, such as the Internet, to deliver encoded AV content from an origin server to a large group of user devices via a collection of edge servers, each of which is typically located near a subgroup of the user devices. User devices include players configured to decode received digital AV content. Each edge server is connected to one or more different types of end-user devices, such as set-top boxes (STBs), desktops, laptops, tablets, smart phones, and cellular phones.
AV content items, also known as assets, are created in an original high-resolution format—such as, for example, AVCHD (Advanced Video Coding High Definition) or AVC-Intra—that typically stores information about each of the frames in the asset with only limited, if any, compression, in order to preserve a maximal amount of information. The original-format asset may be further compressed to generate a high-resolution mezzanine lossy intermediate file of a smaller size than the original-format asset. Common standards for additional compression are ProRes 422, JPEG2000, and long GOP MPEG-2 (H.222/H.262 as defined by the ITU).
The mezzanine lossy intermediate file is then commonly compressed again for end-user devices. A common final compression format is part 10 of the MPEG-4 suite, known as H.264 or AVC (Advanced Video Coding). H.264 defines an improved compression process that, compared with older standards, allows for (i) a higher-quality AV segment at the same bitrate or (ii) the same quality AV segment at a lower bitrate.
The H.264 standard is computationally more intensive than older standards and allows for encoding a more highly compressed version of an AV file. The compression uses multiple techniques that are based on the way moving pictures are structured (typically with, for example, a lot of similarity between neighboring sections of a single frame and between consecutive frames) and on the way moving pictures are perceived by humans (with, for example, greater sensitivity to changes in brightness than to changes in color; in other words, a greater sensitivity to luminance than to chromaticity). The result of this H.264 encoding can be stored in a defined standard container such as MP4 (MPEG-4 part 14) or MOV (QuickTime File Format; QuickTime is a registered trademark of Apple Inc. of Cupertino, Calif.). End-user devices typically download the final compressed H.264-encoded video and compressed audio in chunked segments of the MP4 container. The chunked segments are defined by standards such as Smooth Streaming (using ISM files), HTTP Live Streaming (HLS), and HTTP Dynamic Streaming (HDS), and are referred to in this application as transport-stream files. One common file type for transport-stream files, which are used in the user segment, is the .ts file type. A mezzanine intermediate file may be transcoded into a myriad of different corresponding transport-stream files of various quality levels. As noted above, a CDN serves different types of user devices. These devices may have different types of media-playing programs on them, and the devices may have data connections of different bandwidths to their respective edge servers. Consequently, a user device requesting an asset specifies a particular format, resolution, and/or bitrate (or bitrate range).
In order to be ready to provide the asset to a variety of devices, running a variety of client media-playing programs, at a variety of resolutions and bitrates, multiple versions of the asset are created. Each different version requires a corresponding transcoding of the mezzanine intermediate file into transport-stream files. The origin server stores these multiple versions of the asset as different transport-stream files having different encoding formats, resolutions, and/or bitrates. Multiple versions of the same asset are needed for compatibility with a wide range of user-device types, client programs, and data-transmission rates.
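By way of illustration only, the multiplication of versions described above may be sketched as follows. The sketch assumes that each version is identified by a streaming format, a vertical resolution, and a bitrate. The names Rendition, FORMATS, LADDER, and enumerate_renditions, and all numeric values, are hypothetical and are not taken from any particular CDN.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Rendition:
    """One transport-stream version of an asset (hypothetical record)."""
    fmt: str           # streaming format, e.g. "HLS"
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # target video bitrate in kbit/s


# Illustrative values only (assumptions, not from the disclosure).
FORMATS = ["HLS", "HDS", "SmoothStreaming"]
LADDER = [(1080, 6000), (720, 3000), (480, 1500), (360, 800)]


def enumerate_renditions(formats, ladder):
    """Return every combination of format and (resolution, bitrate) pair.

    Each returned Rendition stands for one separate transcoding of the
    mezzanine intermediate file that the origin server would have to store.
    """
    return [Rendition(f, h, b) for f, (h, b) in product(formats, ladder)]


renditions = enumerate_renditions(FORMATS, LADDER)
print(len(renditions))  # 3 formats x 4 ladder rungs = 12 versions per asset
```

Under these assumed values, a single asset already requires twelve stored versions, and the count grows with every additional format or ladder rung.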
When a user device requests a particular asset from its corresponding edge server, one or more compatible transport-stream files (corresponding to, for example, different bitrates) are transmitted by the origin server to the edge server for caching at the edge server. The particular versions transmitted depend on factors such as (a) the device type of the user device, (b) the user device's client program that requested the asset, and (c) the available bandwidth of the connection between the user device and the edge server. The edge server caches the entirety of the received transmission-ready version of the requested asset and streams it to the user device for presentation to a user.
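The request flow described above may be sketched, under stated assumptions, as follows. The sketch assumes that the device type and client program reduce to a set of supported formats and a maximum resolution, and that the edge server fetches from the origin only those versions that it has not yet cached. The names Version, ORIGIN_VERSIONS, compatible_versions, and EdgeServer are hypothetical, and the cached payload is a placeholder string rather than actual media data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One stored transport-stream version (hypothetical record)."""
    fmt: str
    height: int
    bitrate_kbps: int


# Hypothetical origin-server catalogue for a single asset.
ORIGIN_VERSIONS = [
    Version("HLS", 1080, 6000),
    Version("HLS", 720, 3000),
    Version("HLS", 480, 1500),
    Version("HDS", 720, 3000),
    Version("HDS", 480, 1500),
]


def compatible_versions(versions, supported_formats, max_height, bandwidth_kbps):
    """Filter by factors (a)-(c): device, client program, and bandwidth."""
    matches = [
        v for v in versions
        if v.fmt in supported_formats
        and v.height <= max_height
        and v.bitrate_kbps <= bandwidth_kbps
    ]
    # Highest bitrate first, so the best playable version leads the list.
    return sorted(matches, key=lambda v: v.bitrate_kbps, reverse=True)


class EdgeServer:
    """Caches whole versions fetched from the origin (simplified model)."""

    def __init__(self, origin_versions):
        self.origin_versions = origin_versions
        self.cache = {}
        self.origin_fetches = 0

    def request(self, asset_id, supported_formats, max_height, bandwidth_kbps):
        matches = compatible_versions(
            self.origin_versions, supported_formats, max_height, bandwidth_kbps
        )
        for v in matches:
            key = (asset_id, v)
            if key not in self.cache:
                # Placeholder for the transfer from the origin server.
                self.cache[key] = f"{asset_id}:{v.fmt}:{v.height}p:{v.bitrate_kbps}"
                self.origin_fetches += 1
        return matches


edge = EdgeServer(ORIGIN_VERSIONS)
picks = edge.request("asset-1", {"HLS"}, max_height=720, bandwidth_kbps=4000)
print([v.bitrate_kbps for v in picks])  # [3000, 1500]
```

In this simplified model, a repeated request with the same parameters is served from the edge cache and triggers no further origin transfers.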
The above-described system requires the generation, transport, and storage of a large number of versions of every asset that is to be readily available to users.