The explosion of streaming video on Internet Protocol (IP) networks has led to the development of so-called Adaptive HTTP Streaming protocols for video. While various implementations of these protocols exist, they share certain features. In particular, a video stream is broken into short files, each several seconds long, which are downloaded by a client and played sequentially to form a seamless video view. The files or ‘chunks’ may be encoded at different bitrates and resolutions (referred to as “profiles”). A playlist file is used to let the client know the various available profiles, so that it can select which chunks to download based on local conditions, such as the available download bandwidth. In a typical scenario, the client may start by downloading chunks at low resolution and low bitrate and then switch to downloading chunks from higher-bitrate profiles, giving the user a fast tune-in with subsequently improved video quality.
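The profile selection described above can be illustrated with a minimal sketch. It is not taken from any particular Adaptive HTTP Streaming implementation: the `Profile` type, the `select_profile` function and the `safety_factor` parameter are hypothetical names introduced here, and the rule shown (pick the highest bitrate that fits within a fraction of the measured bandwidth) is only one plausible client policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """One entry from the playlist (hypothetical structure)."""
    name: str
    bitrate_bps: int
    width: int
    height: int


def select_profile(profiles, measured_bandwidth_bps, safety_factor=0.8):
    """Return the highest-bitrate profile that fits the bandwidth budget.

    The budget is the measured bandwidth scaled by safety_factor (an
    assumed headroom value). If no profile fits, the lowest-bitrate
    profile is returned so that playback can still start.
    """
    if not profiles:
        raise ValueError("playlist lists no profiles")
    budget = measured_bandwidth_bps * safety_factor
    ordered = sorted(profiles, key=lambda p: p.bitrate_bps)
    chosen = ordered[0]  # fallback: lowest bitrate
    for profile in ordered:
        if profile.bitrate_bps <= budget:
            chosen = profile
    return chosen


playlist = [
    Profile("low", 400_000, 640, 360),
    Profile("mid", 1_200_000, 1280, 720),
    Profile("high", 3_000_000, 1920, 1080),
]
# A client tuning in with little bandwidth information might start low
# and re-run the selection as its bandwidth estimate improves.
first = select_profile(playlist, 300_000)
later = select_profile(playlist, 2_000_000)
```

Because the selection may change from one chunk to the next, the client relies on chunks from different profiles being interchangeable at chunk boundaries.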
In order to play back the chunks seamlessly (that is, without any video artifacts caused by errors at the chunk boundaries), it is convenient to have each chunk start with an IDR (Instantaneous Decoder Refresh) frame. IDR frames are special video frames that are encoded to be decodable independently of preceding video frames, and thus a chunk that starts with an IDR frame can be played back irrespective of what chunks were downloaded and decoded before.
In order for the client to be able to play back chunks from all of the available profiles, the following criteria should be satisfied: (a) every chunk in a particular profile must have an identifiable, corresponding chunk in each of the other available profiles; and (b) each chunk must start with an IDR frame.
Optionally: (a) the chunks from each profile can be of equal duration; and (b) the chunks can each have the same presentation time stamp (PTS).
The scheme must also be robust enough to recover from all error conditions. It is noted that there may be other conditions on the chunks, such as those involving audio, which are not pertinent to the present invention.
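The chunk criteria listed above can be expressed as a simple check. The following is a sketch under stated assumptions: the `Chunk` record, its field names and the `check_alignment` function are hypothetical, chunks are assumed to be matched across profiles by a sequence index, and the optional criteria are interpreted as applying to corresponding chunks across profiles. Time values are assumed to be in 90 kHz ticks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Metadata for one chunk (hypothetical structure)."""
    index: int             # sequence number used to match chunks across profiles
    pts: int               # presentation time stamp of the first frame (assumed 90 kHz ticks)
    duration: int          # chunk duration in the same units
    starts_with_idr: bool  # True if the first frame is an IDR frame


def check_alignment(profiles, strict=False):
    """Return a list of human-readable problems; an empty list means aligned.

    profiles maps a profile name to its list of chunks. With strict=True
    the optional criteria (equal duration, same PTS) are checked as well.
    """
    by_profile = {}
    all_indices = set()
    for name, chunks in profiles.items():
        by_profile[name] = {c.index: c for c in chunks}
        all_indices.update(by_profile[name])

    problems = []
    for idx in sorted(all_indices):
        present = []
        for name in sorted(by_profile):
            chunk = by_profile[name].get(idx)
            if chunk is None:
                # Required: every chunk has a counterpart in every profile.
                problems.append(f"{name}: chunk {idx} missing")
                continue
            if not chunk.starts_with_idr:
                # Required: every chunk starts with an IDR frame.
                problems.append(f"{name}: chunk {idx} does not start with an IDR frame")
            present.append(chunk)
        if strict and present:
            if len({c.pts for c in present}) > 1:
                problems.append(f"chunk {idx}: PTS differs across profiles")
            if len({c.duration for c in present}) > 1:
                problems.append(f"chunk {idx}: duration differs across profiles")
    return problems
```

A segmenter or monitoring tool could run such a check on the chunk metadata of each profile before publishing a playlist; the check itself does not inspect the video payload.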
The live video transcoding chain involves ingest of an encoded video bit stream, a video transcoder, and output of multiple video profiles encoded at different bitrates and resolutions. The input video streams ingested by the transcoder are already encoded. These are often streams delivered by satellite (or other means) to service providers that subsequently re-encode the video for various reasons, for example, in order to change the encoding format, resolution, or bitrate. The output of the transcoder may then be further processed by a Segmenter (sometimes called a Packager or a Fragmenter) that breaks the output profiles into chunks and makes them available for delivery to multiple clients over HTTP.
In order to guarantee continuity of service in the case of a transcoder failure, it is common to run multiple transcoders, often at different physical locations. In the case of Adaptive HTTP Streaming, it is desirable that the chunks generated from the primary transcoder and from the back-up transcoder be IDR aligned. In that case, failure of the primary transcoder will result in delivery of chunks created from the back-up transcoder. If these are exactly aligned with the chunks from the primary transcoder, the client experience will remain smooth.
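The failover behavior described above can be sketched as follows. This is an illustration only: the `Chunk` tuple and the functions `assemble_with_failover` and `is_continuous` are hypothetical, and chunk availability is modeled as a simple mapping from chunk index to chunk metadata rather than as any real delivery system.

```python
from typing import NamedTuple


class Chunk(NamedTuple):
    """Timing metadata for one chunk (hypothetical structure)."""
    pts: int       # presentation time stamp of the first frame
    duration: int  # chunk duration in the same units


def assemble_with_failover(primary, backup, indices):
    """Pick each chunk from the primary transcoder, else from the back-up.

    primary and backup map a chunk index to a Chunk. Returns a list of
    (source, chunk) pairs in delivery order.
    """
    sequence = []
    for idx in indices:
        if idx in primary:
            sequence.append(("primary", primary[idx]))
        elif idx in backup:
            sequence.append(("backup", backup[idx]))
        else:
            raise KeyError(f"chunk {idx} unavailable from either transcoder")
    return sequence


def is_continuous(sequence):
    """True if each chunk starts exactly where the previous one ended."""
    for (_, prev), (_, cur) in zip(sequence, sequence[1:]):
        if cur.pts != prev.pts + prev.duration:
            return False
    return True


# The primary transcoder fails after chunk 1; the back-up has all chunks.
primary = {0: Chunk(0, 2), 1: Chunk(2, 2)}
backup = {i: Chunk(2 * i, 2) for i in range(4)}
delivered = assemble_with_failover(primary, backup, range(4))
smooth = is_continuous(delivered)
```

When the back-up chunks are aligned with the primary chunks, the delivered sequence stays continuous across the switch. If the back-up transcoder had placed its chunk boundaries at different time stamps, `is_continuous` would report a gap or overlap at the failover point.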
The present invention provides a methodology for enabling transcoders to create IDR aligned output profiles suitable for segmenting and adaptive delivery. It also ensures that different transcoders which ingest the same input produce outputs that are all IDR aligned, so that primary and back-up transcoders can create chunks that are aligned and compatible with each other.
In the past, the creation of video output suitable for adaptive HTTP streaming from multiple encoding processes has used messaging between the encoding processes. This type of messaging has the advantage that it works for encoders as well as transcoders (that is, it works when the input is in “baseband”). However, such messaging implementations are very complicated, costly and inefficient.
It would be advantageous to provide methods and apparatus that enable the creation of video output suitable for adaptive HTTP streaming from multiple encoding processes without the need for messaging between different encoding processes. It would be further advantageous to provide such a system that works for the transcoding case, where the input is ingested in a compressed format and transcoded into a different compressed format. Still further, it would be advantageous to enable the system to provide an arbitrary number of encoding processes with synchronized output. It would also be advantageous to provide the ability to have encoders at separate locations with synchronized output.
The present invention provides methods and apparatus having the aforementioned and other advantages. Moreover, the unique combination of components/techniques disclosed herein provides various improvements over previously known structures and techniques.