This invention generally relates to presenting content, and more specifically to generating a progressive version of digital media content, such as images and videos, using machine learning techniques.
Streaming of digital media accounts for a large portion of internet traffic, and its share is projected to grow further by 2020. However, existing approaches to digital media content compression, such as image compression, have not been able to adapt to the growing demand and the changing landscape of applications. In general, compression of digital media content seeks to identify and reduce irrelevance and redundancy in the content for compact storage and efficient transmission over a network. If the structure in an input (e.g., an image or video) can be discovered, then the input can be represented more succinctly. Hence, many compression approaches transform the input from its original representation into a different representation, e.g., by means of the discrete cosine transform (DCT), in which the spatial redundancy of the input can be more conveniently exploited by a coding scheme to attain a more compact representation. However, in existing image compression approaches deployed in practice, the mechanisms for structure exploitation are hard-coded: for instance, JPEG employs 8×8 DCTs followed by run-length encoding, and JPEG 2000 applies wavelets followed by arithmetic coding, where the wavelet kernels used in the transform are hard-coded and fixed irrespective of the scale and channel of the input data.
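By way of illustration only, the transform step described above may be sketched as follows. The sketch applies a fixed 8×8 DCT to a single block and retains only a few low-frequency coefficients; it is not the JPEG codec and is not the method of this invention. The test block, the function names, and the choice to keep a 2×2 corner of coefficients are hypothetical and are chosen only to show how a smooth input concentrates its energy in a small number of coefficients.

```python
import numpy as np

N = 8  # block size; 8x8 matches the block size named in the text


def dct_matrix(n: int = N) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix C (C @ C.T is the identity)."""
    k = np.arange(n).reshape(-1, 1)  # frequency index, one per row
    i = np.arange(n).reshape(1, -1)  # sample index, one per column
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # the constant (DC) row has its own scale
    return c


C = dct_matrix()


def dct2(block: np.ndarray) -> np.ndarray:
    """2-D DCT of a square block: transform columns, then rows."""
    return C @ block @ C.T


def idct2(coeffs: np.ndarray) -> np.ndarray:
    """Inverse 2-D DCT; C is orthonormal, so its inverse is its transpose."""
    return C.T @ coeffs @ C


def keep_low_frequencies(coeffs: np.ndarray, k: int) -> np.ndarray:
    """Zero every coefficient outside the top-left k x k corner."""
    out = np.zeros_like(coeffs)
    out[:k, :k] = coeffs[:k, :k]
    return out


# Hypothetical smooth block (a gentle brightness gradient), assumed for illustration.
x, y = np.meshgrid(np.arange(N), np.arange(N))
block = 100.0 + 4.0 * x + 2.0 * y

coeffs = dct2(block)
kept = keep_low_frequencies(coeffs, 2)   # 4 of the 64 coefficients survive
approx = idct2(kept)

energy_kept = np.sum(kept ** 2) / np.sum(coeffs ** 2)
max_err = np.max(np.abs(block - approx))
print(f"energy kept: {energy_kept:.4f}, max abs error: {max_err:.2f}")
```

In this sketch the basis is fixed in advance and does not depend on the input, which is the hard-coded behavior that the passage above attributes to existing approaches.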
Additionally, it is often desirable to send different client devices different bitrate versions of the same content, according to their respective bandwidths. Thus, a user of a client device can consume the version of the content that is best suited for that client device. However, this implies that for every target bitrate, the content must be re-encoded, and the corresponding code must be stored separately. Therefore, given the non-optimal nature of existing approaches to compression, having to re-encode the content for each target bitrate requires significant computational resources, both for generating each compressed version and for continually maintaining and/or storing each generated compressed version.