The MPEG-2 transport stream format is a standard format for transmission and storage of video data, encoded audio data, and related data. The MPEG-2 transport stream format is specified in the standard known as MPEG-2 Part 1, Systems (ISO/IEC standard 13818-1 or ITU-T Rec. H.222.0). An MPEG-2 transport stream has a specified container format which encapsulates packetized elementary streams.
MPEG-2 transport streams are commonly used to broadcast audio and video content, for example, in the form of DVB (Digital Video Broadcasting) or ATSC (Advanced Television Systems Committee) TV broadcasts. It is often desirable to implement a splice between two MPEG-2 transport streams.
Although the invention is not limited to generation and/or splicing of transport streams having MPEG-2 transport stream format, typical embodiments are methods and systems for generating and/or splicing MPEG-2 transport streams. Transport streams having other formats may be generated and/or spliced in accordance with other embodiments of the invention, if each such transport stream includes frames (including I-frames) of video data and frames (including I-frames) of encoded audio data which satisfy certain properties described herein. A transport stream generated and/or spliced in accordance with a class of typical embodiments of the invention may also include metadata indicative of at least one splicing property satisfied by the transport stream.
An MPEG-2 transport stream carries (i.e., includes data indicative of) elementary streams (e.g., an elementary stream of video data output from a video encoder, and at least one corresponding elementary stream of encoded audio data output from an audio encoder) in packets. Each elementary stream is packetized by encapsulating sequential data bytes from the elementary stream in packetized elementary stream (“PES”) packets having PES packet headers. Typically, elementary stream data (output from video and audio encoders) is packetized as PES packets, the PES packets are then encapsulated inside Transport Stream (TS) packets, and the TS packets are then multiplexed to form the transport stream. Typically, each PES packet is encapsulated into a sequence of TS packets.
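The encapsulation of one PES packet into a sequence of TS packets can be illustrated with a minimal Python sketch. It assumes the fixed 188-byte TS packet with a 4-byte header defined in ISO/IEC 13818-1, and it pads the final packet with adaptation-field stuffing. The function names packetize_pes and extract_payload are hypothetical and chosen only for this illustration; the sketch omits PCR insertion, scrambling, and multiplexing of several streams, and is not a description of any particular muxer.

```python
TS_PACKET_SIZE = 188   # fixed TS packet length in bytes
TS_HEADER_SIZE = 4     # sync byte, flags + PID, control bits + continuity counter
SYNC_BYTE = 0x47


def packetize_pes(pes: bytes, pid: int, first_cc: int = 0) -> list:
    """Split one PES packet into 188-byte TS packets (illustrative only)."""
    packets = []
    cc = first_cc & 0x0F                      # 4-bit continuity counter
    offset = 0
    first = True
    max_payload = TS_PACKET_SIZE - TS_HEADER_SIZE  # 184 bytes
    while offset < len(pes):
        remaining = len(pes) - offset
        if remaining >= max_payload:
            afc = 0b01                        # payload only
            adaptation = b""
            chunk = pes[offset:offset + max_payload]
        else:
            afc = 0b11                        # adaptation field, then payload
            af_len = max_payload - remaining - 1   # excludes the length byte
            if af_len == 0:
                adaptation = bytes([0])
            else:
                # one flags byte (all zero) followed by 0xFF stuffing bytes
                adaptation = bytes([af_len, 0x00]) + b"\xff" * (af_len - 1)
            chunk = pes[offset:]
        header = bytes([
            SYNC_BYTE,
            (0x40 if first else 0x00) | ((pid >> 8) & 0x1F),  # PUSI + PID high bits
            pid & 0xFF,                                        # PID low bits
            (afc << 4) | cc,
        ])
        packets.append(header + adaptation + chunk)
        offset += len(chunk)
        cc = (cc + 1) & 0x0F
        first = False
    return packets


def extract_payload(ts_packet: bytes) -> bytes:
    """Return the payload bytes of a TS packet, skipping any adaptation field."""
    afc = (ts_packet[3] >> 4) & 0x03
    start = TS_HEADER_SIZE
    if afc & 0b10:
        start += 1 + ts_packet[4]             # length byte + adaptation field
    return ts_packet[start:] if afc & 0b01 else b""
```

Under these assumptions, a 400-byte PES packet yields three TS packets: two carrying 184 payload bytes each and a third carrying the remaining 32 bytes after stuffing. Only the first packet has the payload_unit_start_indicator bit set.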
An MPEG-2 transport stream may be indicative of one or more audio/video programs. Each single program is described by a Program Map Table (PMT) which has a unique packet identifier (PID) value, and the elementary stream(s) associated with that program has (or have) a PID listed in the PMT. For example, a transport stream may be indicative of three television programs, each program corresponding to a different television channel. In the example, each program (channel) may consist of one video elementary stream and a number of (e.g., one or two) encoded audio elementary streams, and any necessary metadata. A receiver wishing to decode a particular program (channel) must decode the payloads of each elementary stream whose PID is associated with the program.
An MPEG-2 transport stream includes Program Specific Information (PSI), typically comprising data indicative of four PSI tables: a program association table (PAT), a program map table (PMT) for each program, a conditional access table (CAT), and a network information table (NIT). The program association table lists all programs indicated by (included in) the transport stream, and each of the programs has an associated value of PID for its program map table (PMT). The PMT for a program lists each elementary stream of the program, and includes data indicative of other information regarding the program.
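The two-step lookup from the PAT to a PMT and then to the elementary stream PIDs of a program can be sketched as follows. This is a simplified model in Python, not a parser for the binary PSI sections: the tables are represented as plain dictionaries, and all PID values, the stream kinds, and the names ElementaryStream, PAT, PMTS, pids_for_program, packet_pid and select_program are assumptions made for illustration. Only the position of the 13-bit PID within the TS packet header is taken from ISO/IEC 13818-1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementaryStream:
    pid: int    # PID carried by the TS packets of this elementary stream
    kind: str   # illustrative label, e.g. "video" or "audio"


# Hypothetical PAT: program number -> PID of that program's PMT.
PAT = {1: 0x0100, 2: 0x0200, 3: 0x0300}

# Hypothetical PMTs, keyed by PMT PID: the elementary streams of each program.
PMTS = {
    0x0100: [ElementaryStream(0x0101, "video"), ElementaryStream(0x0102, "audio")],
    0x0200: [ElementaryStream(0x0201, "video"), ElementaryStream(0x0202, "audio"),
             ElementaryStream(0x0203, "audio")],
    0x0300: [ElementaryStream(0x0301, "video"), ElementaryStream(0x0302, "audio")],
}


def pids_for_program(program_number: int, pat: dict, pmts: dict) -> set:
    """Resolve a program number to the PIDs of its elementary streams."""
    pmt_pid = pat[program_number]             # step 1: PAT gives the PMT PID
    return {es.pid for es in pmts[pmt_pid]}   # step 2: PMT lists the streams


def packet_pid(ts_packet: bytes) -> int:
    """Read the 13-bit PID from bytes 1 and 2 of a TS packet header."""
    return ((ts_packet[1] & 0x1F) << 8) | ts_packet[2]


def select_program(ts_packets: list, wanted_pids: set) -> list:
    """Keep only the TS packets that belong to the wanted elementary streams."""
    return [p for p in ts_packets if packet_pid(p) in wanted_pids]
```

In this model, a receiver tuned to program 2 would call pids_for_program(2, PAT, PMTS) and then pass the resulting PID set to select_program to discard packets of the other two programs.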
An MPEG-2 transport stream includes presentation time stamp (“PTS”) values which are used to achieve synchronization of separate elementary streams (e.g., video and encoded audio streams) of a program of the transport stream. The PTS values are given in units related to a program's overall clock reference, which is also transmitted in the transport stream. All TS packets that comprise an audio or video frame (indicated by a PES packet) have the same PTS value.
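As an illustration of how a PTS value is represented, the following Python sketch encodes and decodes the 5-byte PTS field of a PES packet header and converts a PTS to seconds. It relies on two details of ISO/IEC 13818-1 that the paragraph above does not spell out: a PTS is a 33-bit count of a 90 kHz clock, and it is stored in three groups (3, 15 and 15 bits), each followed by a marker bit. The function names are hypothetical, and the sketch covers only the PTS-only case (no DTS present).

```python
PTS_CLOCK_HZ = 90_000      # PTS ticks per second
PTS_MODULUS = 1 << 33      # PTS is a 33-bit counter and wraps around


def encode_pts(pts: int) -> bytes:
    """Pack a 33-bit PTS into the 5-byte PES header field (PTS-only prefix 0010)."""
    pts %= PTS_MODULUS
    return bytes([
        (0b0010 << 4) | (((pts >> 30) & 0x07) << 1) | 1,  # PTS[32..30] + marker
        (pts >> 22) & 0xFF,                               # PTS[29..22]
        (((pts >> 15) & 0x7F) << 1) | 1,                  # PTS[21..15] + marker
        (pts >> 7) & 0xFF,                                # PTS[14..7]
        ((pts & 0x7F) << 1) | 1,                          # PTS[6..0] + marker
    ])


def decode_pts(field: bytes) -> int:
    """Recover the 33-bit PTS from the 5-byte PES header field."""
    return (
        (((field[0] >> 1) & 0x07) << 30)
        | (field[1] << 22)
        | ((field[2] >> 1) << 15)
        | (field[3] << 7)
        | (field[4] >> 1)
    )


def pts_to_seconds(pts: int) -> float:
    """Convert a PTS tick count to seconds on the 90 kHz time base."""
    return pts / PTS_CLOCK_HZ
```

Because video and audio PES packets of one program share this time base, a decoder can compare their decoded PTS values directly to decide which frames are to be presented together.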
A typical transport stream (e.g., an MPEG-2 transport stream) includes encoded audio data (typically, compressed audio data indicative of one or more channels of audio content), video data, and metadata indicative of at least one characteristic of the encoded audio (or encoded audio and video) content. Although the invention is not limited to generation of transport streams whose audio content is audio data encoded in accordance with the AC-4 format (“AC-4 encoded audio data”), typical embodiments are methods and systems for generating and/or splicing transport streams (e.g., MPEG-2 transport streams) including AC-4 encoded audio data.
The AC-4 format for encoding of audio data is well-known, and was published in April 2014 in the document entitled “ETSI TS 103 190 V1.1.1 (2014-04), Digital Audio Compression (AC-4) Standard.”
MPEG-2 transport streams are commonly used to broadcast audio and video content, for example in the form of DVB (Digital Video Broadcasting) or ATSC (Advanced Television Systems Committee) TV broadcasts. Sometimes it is desirable to implement a splice between two MPEG-2 (or other) transport streams. For example, it may be desirable for a transmitter to implement splices in a first transport stream to insert an advertisement (e.g., a segment of another stream) between two segments of the first transport stream. Conventional systems, known as transport stream splicers, are available for performing such splicing. The sophistication of conventional splicers varies, and conventional transport streams are usually generated with the assumption that splicers will be aware of and able to understand all the codecs contained in them (i.e., will be able to parse their video and encoded audio content, and metadata) in order to perform splices on them. This leaves considerable room for error in the implementation of splices, and gives rise to many interoperability problems between muxers (which perform multiplexing to generate transport streams) and splicers.
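The advertisement insertion example can be made concrete with a toy model in Python. It is a sketch of the general idea only, not of any conventional splicer and not of the embodiments described herein: frames are modeled as (PTS, label) records rather than TS packets, every frame is assumed to have the same duration, the inserted segment is assumed to be contiguous, and the names Frame and insert_segment are hypothetical. The sketch re-times the inserted segment to start at the splice point and shifts the remainder of the first stream by the length of the insertion, so that the output time line stays continuous.

```python
from dataclasses import dataclass, replace

PTS_MODULUS = 1 << 33   # PTS values wrap at 33 bits


@dataclass(frozen=True)
class Frame:
    pts: int     # presentation time stamp in 90 kHz ticks
    label: str   # stands in for the frame's payload


def insert_segment(main: list, ad: list, splice_index: int,
                   frame_duration: int) -> list:
    """Insert `ad` before main[splice_index], keeping PTS values continuous."""
    head, tail = main[:splice_index], main[splice_index:]
    if tail:
        splice_pts = tail[0].pts
    elif head:
        splice_pts = (head[-1].pts + frame_duration) % PTS_MODULUS
    else:
        splice_pts = 0
    # Re-time the inserted segment so that its first frame lands on the splice point.
    ad_shift = splice_pts - ad[0].pts if ad else 0
    retimed_ad = [replace(f, pts=(f.pts + ad_shift) % PTS_MODULUS) for f in ad]
    # Delay the rest of the main stream by the duration of the inserted segment.
    tail_shift = len(ad) * frame_duration
    retimed_tail = [replace(f, pts=(f.pts + tail_shift) % PTS_MODULUS) for f in tail]
    return head + retimed_ad + retimed_tail
```

Even in this toy model the splicer must know where frames begin and how long they last, which hints at why a real splicer that has to parse every codec in a stream is prone to the errors and interoperability problems noted above.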
It is conventional to include, in an MPEG-2 transport stream, metadata which specifically identifies available splice points (specific times at which there are opportunities to splice in a desired manner, e.g., samples or packets having specific time codes immediately before (or after) which there are opportunities to splice in the desired manner) for each program indicated by the stream. Such metadata must be inserted in the transport stream during the encoding process. However, it would be desirable to reduce the complexity of methods (and systems) for generating transport streams (e.g., MPEG-2 transport streams) by eliminating the need to identify and indicate specific available splice points in the transport streams. In accordance with typical embodiments of the present invention, transport streams are generated without the need to identify and indicate specific available splice points therein, and such that splicers having any of a variety of capabilities (including simple splicers having very limited capabilities) can splice the transport streams in a manner which guarantees that audio/video synchronization (“A/V sync”) is maintained in the resulting spliced streams without any need for modification of any encoded audio elementary stream of any of the transport streams.
Some conventional audio codecs encode audio in a format which supports the maintenance of perfect A/V sync upon the splicing of programs which include the encoded audio data. However, when it is necessary to splice transport streams (which include such encoded audio, or audio encoded in other formats), extra care needs to be taken in the preparation of each transport stream and the execution of each splice, as it is easy to damage an audio elementary stream (indicated by a transport stream) during execution of a splice. The muxer (which performs multiplexing to generate each transport stream) and the splicer downstream implicitly work together to generate a good splice, with the muxer multiplexing in some way and the splicer expecting the transport stream to have specific properties. Typical embodiments of the present invention define a set of features (of a transport stream) that facilitate splicing of the transport stream (even by very simple splicers), and typically also include, in a transport stream, metadata which communicates that feature set to splicers.