Streaming media content over a digital network involves transmuxing. Transmuxing allows the same media content to be streamed to different devices and device types with different streaming protocols. Media content is first encoded into a digital file, such as an mp4 file (i.e., MPEG-4 file). The encoded media file is then transmuxed in order to be transmitted to different platforms using the HyperText Transfer Protocol (HTTP) Live Streaming or HLS protocol, the Dynamic Adaptive Streaming over HTTP or DASH protocol, the HTTP Dynamic Streaming or HDS protocol, or other streaming protocols.
Transmuxing involves producing a manifest file and splitting the original media file (e.g., the mp4 file) into different segments that can be separately passed to a media playback application. The manifest file lists the different segments. The manifest also specifies a correspondence between the different segments and the media content playback times. The manifest file is sent to a client player requesting the media content. Based on the manifest, the client player can begin to request the segments in order to play back the media content from the beginning, or can request different segments to advance playback of the media content to particular points in time.
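The relationship between the manifest, the segments, and the playback times can be sketched as follows. This is a minimal illustration only, not the operation of any particular transmuxer: the names `Segment`, `build_segments`, `render_hls_playlist`, and `segment_for_time`, the `seg<N>.ts` file naming, and the fixed segment length are all assumptions made for the example. The rendered text uses the basic tags of an HLS media playlist, and the segments are described by time ranges only; no media data is read or split.

```python
import math
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Segment:
    uri: str         # hypothetical segment file name
    start: float     # playback time at which the segment begins, in seconds
    duration: float  # segment length, in seconds


def build_segments(total_duration, segment_length, prefix="seg"):
    """Divide a playback timeline into consecutive segment descriptors."""
    segments = []
    start = 0.0
    index = 0
    while start < total_duration:
        duration = min(segment_length, total_duration - start)
        segments.append(Segment(f"{prefix}{index}.ts", start, duration))
        start += duration
        index += 1
    return segments


def render_hls_playlist(segments):
    """Render the segment list as a basic HLS media playlist."""
    target = math.ceil(max(s.duration for s in segments))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for s in segments:
        lines.append(f"#EXTINF:{s.duration:.3f},")
        lines.append(s.uri)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


def segment_for_time(segments, t):
    """Return the segment a player would request to play from time t."""
    starts = [s.start for s in segments]
    i = bisect_right(starts, t) - 1
    return segments[max(i, 0)]


if __name__ == "__main__":
    segs = build_segments(total_duration=25.0, segment_length=10.0)
    print(render_hls_playlist(segs))
    print(segment_for_time(segs, 12.5).uri)
```

A player that wants to advance playback to 12.5 seconds would, under these assumptions, look up the segment whose time range covers that point and request it directly, without first downloading the earlier segments.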
Since the client player can begin or advance playback by requesting any segment specified in the manifest file, each of the segments in the manifest must commence with a key frame or I-frame. The key frame can be rendered using only the information in the key frame without referencing any information from other frames in the same segment or a different segment. This contrasts with P-frames or B-frames that reference information from frames before or after, and are therefore incomplete by themselves. Encoders intermix I, P, B, and other frame types to reduce media file size, thereby allowing the media file to be transferred with less data than if every frame were a key frame or I-frame.
The intermixing of I, P, B, and other frame types complicates the transmuxing operation. Prior art transmuxers typically produce segments of about equal length. The starting frame of each produced segment is unlikely to correspond to a key frame in the original media content encoding, thereby causing the prior art transmuxers to re-encode the frames falling within each of the produced segments.
FIG. 1 conceptually illustrates the transmuxing problem. The figure illustrates frames of an encoded media file 110. The figure also illustrates transmuxing the media file into a set of segments 120, 130, and 140. The starts of the segments do not necessarily align with the existing key frame placement in the media file 110. Therefore, the transmuxing operation involves re-encoding parts of the media file 110 so that each produced segment 120, 130, and 140 starts with a key frame.
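The misalignment can be made concrete with a short sketch. It assumes a constant frame rate, represents the encoded file as a string of frame types ("I", "P", "B"), and cuts segments at a fixed number of frames. The function names and the 12-frame group-of-pictures pattern are hypothetical and chosen only for illustration; they are not taken from the figure.

```python
def segment_starts(frame_count, frames_per_segment):
    """Frame indices at which equal-length segments would begin."""
    return list(range(0, frame_count, frames_per_segment))


def misaligned_starts(frame_types, frames_per_segment):
    """Segment start indices that do not fall on a key frame ("I")."""
    return [
        i
        for i in segment_starts(len(frame_types), frames_per_segment)
        if frame_types[i] != "I"
    ]


if __name__ == "__main__":
    # Hypothetical encoding: a key frame every 12 frames, repeated 3 times.
    frames = "IBBPBBPBBPBB" * 3

    # Cutting every 10 frames places three segment starts on B or P frames.
    print(misaligned_starts(frames, 10))  # [10, 20, 30]

    # Cutting every 12 frames happens to match the key frame spacing.
    print(misaligned_starts(frames, 12))  # []
```

In the first case every segment after the first begins on a frame that cannot be decoded by itself, which is the situation that leads a conventional transmuxer to re-encode. The second case is aligned only because the segment length was chosen to match the key frame spacing of this particular encoding.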
Encoding is a computationally intensive operation. The computational cost limits the number of concurrent streams that a transmuxer can support at any given time, especially when the segment re-encoding occurs dynamically in response to requests from different clients and client devices. A bigger issue is that delays from the transmuxer dynamically encoding the segments anew from the original encoding of the media file can degrade the user experience. Yet another issue with traditional transmuxing is quality loss. Encoding with the lossy codecs typically used for streaming discards information at every stage. Therefore, as the transmuxer encodes segments from the already encoded original media content file, the segments will lose quality relative to the original media content file encoding.
Accordingly, there is a need for an improved transmuxer. In particular, there is a need for a transmuxer that can dynamically generate segments in response to client requests without the need to encode or re-encode any part of the originally encoded media content or media file.