The present invention characterizes a session as any communication between two devices set up at a particular point in time to facilitate a desired operation, such as but not necessarily limited to communications established between two devices for a period of time sufficient to facilitate media/content/television streaming, file downloads, voice over Internet protocol (VoIP) communications, cellular phone calls, etc. Various protocols, specifications and standards exist to facilitate session-based communications, such as but not necessarily limited to those associated with Hypertext Transfer Protocol (HTTP) sessions (application layer), Session Initiation Protocol (SIP) sessions (session layer) and Transmission Control Protocol (TCP) sessions (circuits, connections, sockets; transport layer), the disclosures of which are hereby incorporated by reference in their entireties herein. While not necessarily intending to limit the scope and contemplation of the present invention, the presentation herein is predominantly described with respect to the administration of sessions used to facilitate a user interface (UI), referred to as UI sessions, and to facilitate media transport, referred to as media sessions, using HTTP.
Media and UI sessions are contemplated for use in any number of environments to enable navigation, second screen information, guides and any number of other applications associated with facilitating media transport, access, download, streaming, etc. One exemplary, non-limiting aspect of the present invention contemplates the use of media and UI sessions to facilitate access to television related media, such as in the manner described in the specification entitled Mapping from MPEG-2 Transport to HTML5, CL-SP-HTML5-MAP-103-140207, as published by Cable Television Laboratories, Inc. (CableLabs specification), the disclosure of which is hereby incorporated by reference in its entirety. While the present invention is not necessarily limited to MPEG-2 or HTML5 and fully contemplates its use with other transport related protocols, such as MPEG-4, HTML5, MPEG-DASH, ISOBMFF, the media and UI sessions described in the CableLabs specification with respect to MPEG-2 and HTML5 are illustrative of the sessions contemplated for administration herein.
The CableLabs specification notes that HTML5 user agents (UAs), per HTML5, may play back MPEG-2 TS media resources that contain a multiplex of video, audio, text, and private data elementary streams. Television program providers and distributors may use these streams to deliver services associated with the primary video and audio in the multiplexed stream. These services may be collectively termed “TV Services”. A common HTML representation of the tracks carrying these TV Services may be used so that the TV Services are made available to Web content in a consistent way. The CableLabs specification defines requirements for how such MPEG-2 TS elementary streams should be translated by the HTML5 UA into the equivalent HTML5 video, audio, and text track elements.
FIG. 1 illustrates a media session 10 and a UI session 12 with respect to a relationship of the translation function in the context of MPEG-2 TS media resources delivered to a Web page by a UA in a client application. The Web page providing the user interface (e.g., program guide) may not always be provided by the originator of the program content. For example, the guide may be provided by the television manufacturer or the cable or satellite TV provider, while the multiplexed streams are provided by hundreds of independent television program providers. Therefore, the Web page may not have a priori knowledge of which streams are in the programs at any given time.
The CableLabs specification uses the following terms:
    Descriptor: Structure, used to extend the definitions of programs and streams, consisting of an 8-bit tag followed by an 8-bit descriptor length and data fields.
    Elementary Stream: A generic term for a coded video, audio, or other data stream carried in a sequence of PES packets with the same stream id.
    Media Resource Timeline: Maps times (in seconds) to positions in the media resource. The origin of a timeline is its earliest defined position. The duration of a timeline is its last defined position. Identical to the media timeline defined in HTML5.
    Packet Identifier: A number that uniquely identifies an elementary stream in a program.
    Packetized Elementary Stream: An elementary stream encoded in a sequence of PES packets where each packet consists of a header followed by a number of contiguous bytes from the elementary stream.
    Program Map Table: Specifies the PID stream type, PID, and optional stream descriptors that identify the elementary streams that form each program.
    Program Stream: A generic term for an elementary stream that is part of a program.
    User Agent: The function that conforms to the HTML specification.
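By way of a non-limiting illustration, the Descriptor structure defined above (an 8-bit tag, an 8-bit length, then the data fields) may be walked as in the following minimal Python sketch. The function name parse_descriptors and the sample bytes are hypothetical and are not taken from the CableLabs specification; tag 0x0A is the ISO 639 language descriptor tag of [H.222.0], while tag 0x99 is made up for the example.

```python
from typing import List, Tuple


def parse_descriptors(buf: bytes) -> List[Tuple[int, bytes]]:
    """Split a run of back-to-back descriptors into (tag, data) pairs.

    Each descriptor is an 8-bit tag, an 8-bit length, then `length` data bytes.
    """
    out: List[Tuple[int, bytes]] = []
    pos = 0
    while pos < len(buf):
        if pos + 2 > len(buf):
            raise ValueError("truncated descriptor header")
        tag, length = buf[pos], buf[pos + 1]
        end = pos + 2 + length
        if end > len(buf):
            raise ValueError("descriptor data runs past end of buffer")
        out.append((tag, buf[pos + 2:end]))
        pos = end
    return out


# Illustrative bytes only: tag 0x0A carrying "eng" plus one audio_type byte,
# followed by a made-up tag 0x99 with an empty data field.
sample = bytes([0x0A, 0x04]) + b"eng" + bytes([0x00]) + bytes([0x99, 0x00])
descriptors = parse_descriptors(sample)
```

A Web page script receiving unparsed PMT data could apply the same tag/length walk to the descriptor loops it contains, subject to the remaining PMT syntax that this sketch does not cover.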
The CableLabs specification uses the following abbreviations:
    DASH: Dynamic adaptive streaming over HTTP.
    HTML: Hypertext markup language.
    MPEG-2 TS: MPEG-2 transport stream.
    PES: Packetized elementary stream.
    PID: Packet identifier.
    PMT: Program map table.
    UA: User agent.
The CableLabs specification defines the requirements for an HTML5 user agent (UA) to recognize and make available to Web content all elementary streams in an MPEG-2 TS media resource so that the following set of TV Services can be provided:
    Closed Captioning: Textual representation of the media resource audio dialogue intended for the hearing impaired.
    Subtitles: Alternate language textual representation of the media resource audio dialogue.
    Content Advisories: Content rating information used by parental control applications.
    Synchronized Content: Signaling messages to control the execution of a client application in a manner synchronized with the media resource playback.
    Client ad insertion: Signaling messages that convey advertisement insertion opportunities to a client application.
    Audio translations: Alternate language representation of the primary audio track.
    Audio descriptions: Audio descriptions of the video intended for the visually impaired.
The CableLabs specification applies to single program MPEG-2 transport streams, and while multi-program MPEG-2 transport streams may be out of scope for the CableLabs specification, their use is fully contemplated by the present invention.
The sub-sections noted below, numbered according to the section references included in the CableLabs specification, define requirements for how the UA is required to recognize MPEG-2 TS video, audio, and other data tracks, and how the HTML5 elements representing those tracks are to be created. HTML5 VideoTrack, AudioTrack and TextTrack elements have additional attributes, beyond those referenced in this specification, that may be set by the UA, consistent with user preferences. If a user seeks backward (in time) in a media resource and resumes playback, the UA replays tracks it has previously played. In this case, the UA may not create duplicate TextTrackCues for TextTracks in the media resource, as TextTrackCues may not be deleted from TextTracks and may still exist in the TextTrack that is being replayed. Creation of new TextTrackCues when the TextTrack is replayed may result in duplicate TextTrackCues. How the UA accomplishes this may be implementation-specific.
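Because the manner of avoiding duplicate TextTrackCues is left implementation-specific, the following Python sketch shows only one possible approach and is not drawn from the CableLabs specification. It assumes that a replayed cue arrives with exactly the same start time and payload as when it was first created, and the names Cue, TextTrack and add_cue_once are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Cue:
    start_time: float
    data: bytes


@dataclass
class TextTrack:
    cues: List[Cue] = field(default_factory=list)
    # Keys of cues already created, so a replay does not add them again.
    _seen: Set[Tuple[float, bytes]] = field(default_factory=set)

    def add_cue_once(self, start_time: float, data: bytes) -> bool:
        """Create a cue unless an identical one exists; return True if created."""
        key = (start_time, data)
        if key in self._seen:
            return False
        self._seen.add(key)
        self.cues.append(Cue(start_time, data))
        return True


track = TextTrack()
track.add_cue_once(1.0, b"\x01")
track.add_cue_once(2.0, b"\x02")
# The user seeks backward and the same packet is encountered again.
created_on_replay = track.add_cue_once(1.0, b"\x01")
```

An implementation whose timeline positions are not bit-for-bit reproducible on replay would need a tolerance on the start time rather than the exact match assumed here.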
5.1 Video, Audio and Text Track Creation
HTML5 VideoTracks, AudioTracks, and TextTracks MUST be created by the UA as defined in HTML5.
5.1.1 Program Description TextTrack
Different types of video, audio, and text elementary streams will be present in a MPEG-2 TS media resource, depending on geographical region, or service or content provider. In order that UA implementations are independent of region and provider, UAs MUST make program map table (PMT) metadata in the MPEG-2 TS [H.222.0] media resource available so that a Web page script can be used to interpret elementary stream types not recognized by the UA.
The UA MUST create a TextTrack in the media resource TextTrack List and set the TextTrack attributes using the following rules:
    1) kind=“metadata”
    2) id=“video/mp2t track-description”, i.e., the string concatenation of the MPEG-2 TS MIME type [RFC 3555] and “track-description”
    3) language=“” (empty string)
    4) mode=TextTrack DISABLED
For each PMT received in the program stream by the UA, the UA MUST create a DataCue only in the case where the PMT differs from the PMT represented by the previously created DataCue. This is in recognition of the fact that the PMT is received at least once every 140 msec but rarely changes.
For each new PMT, a UA MUST create a new DataCue in the text track as described in [HTML5] section “Text track model” with attributes set as follows:
    1) startTime is set to the current time in the media resource timeline
    2) endTime=startTime
    3) data is set to the PMT data in its unparsed binary form
    4) text is set to null
    5) pauseOnExit=false
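The program description rules above may be sketched as follows. This is a minimal Python model rather than an actual UA implementation: the class names, the on_pmt method and the placeholder byte strings standing in for PMT sections are all hypothetical, and the mode value "disabled" is assumed to correspond to TextTrack DISABLED.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataCue:
    start_time: float
    end_time: float
    data: bytes                  # PMT in its unparsed binary form
    text: Optional[str] = None   # text is set to null
    pause_on_exit: bool = False


@dataclass
class ProgramDescriptionTrack:
    kind: str = "metadata"
    id: str = "video/mp2t track-description"
    language: str = ""
    mode: str = "disabled"
    cues: List[DataCue] = field(default_factory=list)
    _last_pmt: Optional[bytes] = None

    def on_pmt(self, pmt: bytes, current_time: float) -> Optional[DataCue]:
        """Create a DataCue only when the PMT differs from the previous one."""
        if pmt == self._last_pmt:
            return None
        self._last_pmt = pmt
        cue = DataCue(start_time=current_time, end_time=current_time, data=pmt)
        self.cues.append(cue)
        return cue


pmt_track = ProgramDescriptionTrack()
pmt_track.on_pmt(b"PMT-A", 0.0)    # first PMT: cue created
pmt_track.on_pmt(b"PMT-A", 0.14)   # repeat of the same PMT: no cue
pmt_track.on_pmt(b"PMT-B", 5.0)    # changed PMT: new cue
```

Comparing the raw bytes keeps the UA from having to interpret stream types it does not recognize, which is consistent with leaving that interpretation to a Web page script.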
5.1.2 VideoTrack
For all MPEG-2 video stream types that the UA can render, the UA MUST create a new VideoTrack in the VideoTrackList of the media resource.
The UA MUST create VideoTracks in the VideoTrackList in the same order as they appear in the PMT. This is to comply with VideoTrackList creation requirements in [HTML5].
The UA MUST set the VideoTrack id attribute to the string representation of the PID in the PMT, interpreted as a decimal number, of the elementary stream represented by this track.
The UA MUST set VideoTrackList[0].VideoTrack.kind=“main”.
If the UA cannot determine the values for the VideoTrack kind and language attributes [HTML5], it MUST set them to the empty string.
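The VideoTrack rules of section 5.1.2 may be illustrated with the following Python sketch. The set of renderable stream types is an assumption made for the example (0x02 for MPEG-2 video and 0x1B for AVC video per [H.222.0]); an actual UA would substitute whatever stream types it can render. The names VideoTrack and build_video_tracks are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Assumption for illustration: the video stream types this UA can render.
RENDERABLE_VIDEO_TYPES = {0x02, 0x1B}


@dataclass
class VideoTrack:
    id: str
    kind: str = ""       # empty string when the UA cannot determine a value
    language: str = ""


def build_video_tracks(pmt_streams: Iterable[Tuple[int, int]]) -> List[VideoTrack]:
    """Build tracks from (stream_type, pid) pairs given in PMT order."""
    tracks: List[VideoTrack] = []
    for stream_type, pid in pmt_streams:
        if stream_type in RENDERABLE_VIDEO_TYPES:
            # id is the PID rendered as a decimal string.
            tracks.append(VideoTrack(id=str(pid)))
    if tracks:
        tracks[0].kind = "main"
    return tracks


# Hypothetical PMT contents: video on PID 0x31, private data on PID 0x34,
# and a second video stream on PID 0x40.
video_tracks = build_video_tracks([(0x02, 0x31), (0x81, 0x34), (0x1B, 0x40)])
```

Iterating the PMT entries in order and appending as they are encountered preserves the PMT ordering that the VideoTrackList is required to follow.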
5.1.3 AudioTrack
For all MPEG-2 audio stream types that the UA can render, the UA MUST create a new AudioTrack in the AudioTrackList of the media resource.
The UA MUST create AudioTracks in the AudioTrackList in the same order as they appear in the PMT. This is to comply with AudioTrackList creation requirements in [HTML5].
For each AudioTrack created, the UA MUST set the id attribute to the string representation of the PID in the PMT, interpreted as a decimal number, of the elementary stream represented by this track.
The UA MUST set AudioTrackList[0].AudioTrack.kind=“main”.
If the AudioTrack contains an associated service for the visually impaired [ATSC_53], the UA SHOULD set AudioTrack.kind=“main-desc” and AudioTrack.language to the BCP47 formatted [BCP 47] value of the ISO_639 language descriptor [H.222.0].
For AudioTracks that are not at AudioTrackList[0] and are a main audio service [ATSC_53], the UA SHOULD set AudioTrack.kind=“translation” and AudioTrack.language to the BCP47 formatted [BCP 47] value of the ISO_639 language descriptor [H.222.0].
If the UA cannot determine values for the AudioTrack kind and language attributes [HTML5], it MUST set them to the empty string.
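One way to combine the AudioTrack rules of section 5.1.3 is sketched below in Python. Several details are assumptions made for the example rather than requirements of the CableLabs specification: the service labels "main" and "visually-impaired", the three-entry language lookup table, the choice to let the MUST rule for AudioTrackList[0] take precedence over the SHOULD rules, and the choice to populate the language of the first track when it is known.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: a tiny ISO 639-2 to BCP 47 lookup; a real UA needs a full table.
ISO639_2_TO_BCP47 = {"eng": "en", "spa": "es", "fra": "fr"}


@dataclass
class AudioStream:
    pid: int
    service: str                  # hypothetical labels: "main", "visually-impaired"
    iso639: Optional[str] = None  # code from the ISO_639 language descriptor


@dataclass
class AudioTrack:
    id: str
    kind: str = ""
    language: str = ""


def build_audio_tracks(streams: List[AudioStream]) -> List[AudioTrack]:
    """Build tracks from renderable audio streams given in PMT order."""
    tracks: List[AudioTrack] = []
    for index, stream in enumerate(streams):
        # Unknown or missing language falls back to the empty string.
        language = ISO639_2_TO_BCP47.get(stream.iso639 or "", "")
        if index == 0:
            kind = "main"
        elif stream.service == "visually-impaired":
            kind = "main-desc"
        elif stream.service == "main":
            kind = "translation"
        else:
            kind = ""
        tracks.append(AudioTrack(id=str(stream.pid), kind=kind, language=language))
    return tracks


audio_tracks = build_audio_tracks([
    AudioStream(0x32, "main", "eng"),
    AudioStream(0x33, "main", "spa"),
    AudioStream(0x35, "visually-impaired", "eng"),
    AudioStream(0x36, "other"),
])
```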
5.1.4 Other TextTracks
For stream types 0x05 and 0x80-0xFF [H.222.0], the UA MUST create a new TextTrack in the TextTrackList of the media resource.
The UA MUST create TextTracks in the TextTrackList in the same order as they appear in the PMT. This is to comply with TextTrackList creation requirements in [HTML5].
The UA MUST set the TextTrack kind and language attributes for the elementary stream as defined in Table 1.
TABLE 1: Text Track Kind
Stream Description | Kind | language
Content advisory descriptor [SCTE_54] | metadata | . . .
Region rating table [CEA_766] | metadata | . . .
Enhanced TV messaging [EISS] | metadata | . . .
Program insertion Cue messages [SCTE_35] | metadata | . . .
Subtitles [SCTE_27] rendered by the UA | subtitle | BCP47 [BCP 47] formatted contents of the ISO639 Language Descriptor [H.222.0]
Subtitles [SCTE_27] not rendered by the UA | metadata | . . .
Any other private user elementary streams (stream_type == 0x05, 0x80-0xFF) containing private sections (payload_unit_start_indicator == 1) [H.222.0] | metadata | . . .
The UA MUST set the TextTrack id and mode attributes as follows:
    1. id=the string representation of the PID in the PMT, interpreted as a decimal number, of the elementary stream represented by this track.
    2. mode=TextTrack DISABLED
The MPEG-2 TS packets with the PID corresponding to the TextTrack contain private data packets as defined in [H.222.0]. The UA MUST create a cue containing one or more complete private data packets in the elementary stream. A cue must be created within 100 milliseconds of the detection of the first private data packet in the elementary stream.
For each private data packet in the elementary stream represented by the TextTrack, the UA MUST create a cue in the TextTrack as described in [HTML5] section “Text track model.” The type of cue depends on the value of the kind attribute. For kind == “metadata”, the UA MUST create a DataCue. The UA MUST set the following attributes for all DataCues:
    1. startTime is set to the current time in the media resource timeline.
    2. endTime is set to startTime.
    3. pauseOnExit is set to false.
    4. data is set to the contents of the unparsed private data packet.
    5. text is set to null.
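The TextTrack and DataCue rules of section 5.1.4 may be sketched in Python as follows. The function and class names are hypothetical, the flags describing whether a stream carries [SCTE_27] subtitles and whether the UA renders them are assumed to be supplied by the caller, and the metadata language is assumed here to be the empty string.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def is_private_stream_type(stream_type: int) -> bool:
    """True for stream types 0x05 and 0x80-0xFF."""
    return stream_type == 0x05 or 0x80 <= stream_type <= 0xFF


@dataclass
class DataCue:
    start_time: float
    end_time: float
    data: bytes
    text: Optional[str] = None
    pause_on_exit: bool = False


@dataclass
class TextTrack:
    id: str
    kind: str
    language: str = ""
    mode: str = "disabled"
    cues: List[DataCue] = field(default_factory=list)


def create_text_track(stream_type: int, pid: int, is_subtitle: bool = False,
                      ua_renders_subtitles: bool = False,
                      language: str = "") -> Optional[TextTrack]:
    """Apply the Table 1 kind rules; return None for other stream types."""
    if not is_private_stream_type(stream_type):
        return None
    if is_subtitle and ua_renders_subtitles:
        return TextTrack(id=str(pid), kind="subtitle", language=language)
    return TextTrack(id=str(pid), kind="metadata")


def add_private_packet(track: TextTrack, packet: bytes,
                       current_time: float) -> Optional[DataCue]:
    """Create a DataCue for a metadata track from one unparsed packet."""
    if track.kind != "metadata":
        return None
    cue = DataCue(start_time=current_time, end_time=current_time, data=packet)
    track.cues.append(cue)
    return cue


meta_track = create_text_track(0x86, 0x50)
add_private_packet(meta_track, b"\xfc\x30", 12.5)  # placeholder payload bytes
sub_track = create_text_track(0x82, 0x51, is_subtitle=True,
                              ua_renders_subtitles=True, language="es")
```

The sketch deliberately returns no cue for a subtitle track, since the cue type for rendered subtitles is outside the DataCue rules listed above.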
It is important to note that the semantics of metadata TextTrack and TextTrackCue are opaque to the UA. So, for example, if the UA does not recognize a subtitle track but creates a generic metadata text track as defined above, the UA behavior defined in [HTML5] for subtitle tracks will not occur since the UA is not aware this is a subtitle track.
It is up to a Web page script to identify the subtitle track and process the subtitle messages in the TextTrackCues in a manner appropriate for the subtitle format.
5.2 Closed Captioning
Video elementary streams contain closed captioning data as indicated by a caption_service_descriptor [ATSC_53], [SCTE_54], or closed-caption type (cc_type) in the User Data [CEA_708]. For all MPEG-2 video stream types that the UA can render, closed captioning, if present, MUST be made available by the UA as follows:
    1. Create a new TextTrack as defined in [HTML5] section “Sourcing in-band text tracks” with the track element attributes set as follows:
        a. kind=“caption”
        b. language is set to a [BCP 47]-conformant representation of the caption data language
        c. id is set to a text string of the decimal representation of the PID of the MPEG-2 video program stream containing the caption data
        d. mode=TextTrack DISABLED
5.3 Subtitles
[SCTE_27] subtitles may be made available by the UA as follows:
    1. Create a new TextTrack as defined in [HTML5] section “Sourcing in-band text tracks” with the track element attributes set as follows:
        a. kind=“subtitle”
        b. language is set to a [BCP 47]-conformant representation of the caption data language
        c. id is set to a text string of the decimal representation of the PID of the MPEG-2 video program stream containing the caption data
        d. mode=TextTrack DISABLED
References, the disclosures of which are hereby incorporated by reference in their entireties herein:
    [ATSC_53] AC-3 Audio System Characteristics, ATSC Standard A/53, Part 5:2010.
    [ATSC_65] Program and System Information for Terrestrial Broadcast and Cable (PSIP), ATSC Standard A/65.
    [BCP 47] IETF BCP 47, Tags for Identifying Languages, http://tools.ietf.org/html/bcp47.
    [CEA_708] Digital Television (DTV) Closed Captioning, Doc. CEA-708-D.
    [CEA_766] U.S. and Canadian Rating Region Tables (RRT) and Content Advisory Descriptors for Transport of Content Advisory Information Using ATSC Program and System Information Protocol (PSIP), Doc. ANSI/CEA-766-C.
    [EISS] OpenCable™ Enhanced TV Application Messaging Protocol 1.0, OC-SP-ETV-AM1.0.1-120614, Jun. 14, 2012, Cable Television Labs, Inc.
    [H.222.0] ISO/IEC 13818-1|ITU-T H.222.0 May 2006, Infrastructure of audiovisual services, Transmission multiplexing and synchronization, http://www.itu.int/rec/T-REC-h.222.0/en.
    [HTML5] HTML5, A vocabulary and associated APIs for HTML and XHTML, http://www.w3.org/TR/html5/.
    [RFC 3555] IETF RFC 3555, MIME Type Registration of RTP Payload Formats, http://tools.ietf.org/html/rfc3555.
    [SCTE_27] ANSI/SCTE 27 2011, Subtitling Methods for Broadcast Cable.
    [SCTE_35] ANSI/SCTE 35 2012, Digital Program Insertion Cueing Message for Cable.
    [SCTE_54] ANSI/SCTE 54 2006, Digital Video Service Multiplex and Transport System Standard for Cable.