The World Wide Web Consortium (W3C) is an international community where member organizations, a full-time staff, and the public work together to develop web standards. Hypertext Markup Language version 5 (HTML5) is one of the Web standards associated with the W3C. A persistent draft of the HTML5 standard is identified as http://www.w3.org/TR/2011/WD-html5-20110525/, the disclosure of which is hereby incorporated by reference in its entirety.
Section 4.8.10.12—Timed Text Tracks—of the noted HTML5 standard states:
I. 4.8.10.12.1 Text Track Model
A media element can have a group of associated text tracks, known as the media element's list of text tracks. The text tracks are sorted as follows:
1. The text tracks corresponding to track element children of the media element, in tree order.
2. Any text tracks added using the addTextTrack( ) method, in the order they were added, oldest first.
3. Any media-resource-specific text tracks (text tracks corresponding to data in the media resource), in the order defined by the media resource's format specification.
A text track consists of:
The kind of text track
This decides how the track is handled by the user agent. The kind is represented by a string. The possible strings are:
    subtitles
    captions
    descriptions
    chapters
    metadata
The kind of track can change dynamically, in the case of a text track corresponding to a track element.
A label
This is a human-readable string intended to identify the track for the user. In certain cases, the label might be generated automatically.
The label of a track can change dynamically, in the case of a text track corresponding to a track element or in the case of an automatically-generated label whose value depends on variable factors such as the user's preferred user interface language.
A language
This is a string (a BCP 47 language tag) representing the language of the text track's cues. [BCP47]
The language of a text track can change dynamically, in the case of a text track corresponding to a track element.
A Readiness State
One of the following:
Not Loaded
Indicates that the text track is known to exist (e.g. it has been declared with a track element), but its cues have not been obtained.
Loading
Indicates that the text track is loading and there have been no fatal errors encountered so far. Further cues might still be added to the track.
Loaded
Indicates that the text track has been loaded with no fatal errors. No new cues will be added to the track except if the text track corresponds to a MutableTextTrack object.
Failed to Load
Indicates that the text track was enabled, but when the user agent attempted to obtain it, this failed in some way (e.g. URL could not be resolved, network error, unknown text track format). Some or all of the cues are likely missing and will not be obtained.
The readiness state of a text track changes dynamically as the track is obtained.
A mode
One of the following:
Disabled
Indicates that the text track is not active. Other than for the purposes of exposing the track in the DOM, the user agent is ignoring the text track. No cues are active, no events are fired, and the user agent will not attempt to obtain the track's cues.
Hidden
Indicates that the text track is active, but that the user agent is not actively displaying the cues. If no attempt has yet been made to obtain the track's cues, the user agent will perform such an attempt momentarily. The user agent is maintaining a list of which cues are active, and events are being fired accordingly.
Showing
Showing by Default
Indicates that the text track is active. If no attempt has yet been made to obtain the track's cues, the user agent will perform such an attempt momentarily. The user agent is maintaining a list of which cues are active, and events are being fired accordingly. In addition, for text tracks whose kind is subtitles or captions, the cues are being displayed over the video as appropriate; for text tracks whose kind is descriptions, the user agent is making the cues available to the user in a non-visual fashion; and for text tracks whose kind is chapters, the user agent is making available to the user a mechanism by which the user can navigate to any point in the media resource by selecting a cue.
The showing by default state is used in conjunction with the default attribute on track elements to indicate that the text track was enabled due to that attribute. This allows the user agent to override the state if a later track is discovered that is more appropriate per the user's preferences.
A List of Zero or More Cues
A list of text track cues, along with rules for updating the text track rendering.
The list of cues of a text track can change dynamically, either because the text track has not yet been loaded or is still loading, or because the text track corresponds to a MutableTextTrack object, whose API allows individual cues to be added or removed dynamically.
Each text track has a corresponding TextTrack object.
The text tracks of a media element are ready if all the text tracks whose mode was not in the disabled state when the element's resource selection algorithm last started now have a text track readiness state of loaded or failed to load.
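The readiness condition above can be read as a simple predicate over the list of text tracks. The following is a minimal sketch only: the object shape (`readinessState`, `wasDisabledAtSelection`) and the string values are assumptions made for illustration and are not part of the quoted standard.

```javascript
// Hypothetical track shape: { readinessState, wasDisabledAtSelection }.
// wasDisabledAtSelection records whether the track's mode was "disabled"
// when the resource selection algorithm last started.
function textTracksAreReady(tracks) {
  return tracks
    .filter((track) => !track.wasDisabledAtSelection)
    .every(
      (track) =>
        track.readinessState === "loaded" ||
        track.readinessState === "failed to load"
    );
}

// Usage: a disabled track that never loaded does not block readiness.
const exampleTracks = [
  { readinessState: "loaded", wasDisabledAtSelection: false },
  { readinessState: "not loaded", wasDisabledAtSelection: true },
];
const exampleReady = textTracksAreReady(exampleTracks);
```

Tracks that were disabled at selection time are filtered out first, so only tracks the user agent was expected to obtain can hold the media element back.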
A text track cue is the unit of time-sensitive data in a text track, corresponding for instance for subtitles and captions to the text that appears at a particular time and disappears at another time.
Each text track cue consists of:
An Identifier
An arbitrary string.
A Start Time
A time, in seconds and fractions of a second, at which the cue becomes relevant.
An End Time
A time, in seconds and fractions of a second, at which the cue stops being relevant.
A Pause-on-exit Flag
A boolean indicating whether playback of the media resource is to pause when the cue stops being relevant.
A Writing Direction
A writing direction, either horizontal (a line extends horizontally and is positioned vertically, with consecutive lines displayed below each other), vertical growing left (a line extends vertically and is positioned horizontally, with consecutive lines displayed to the left of each other), or vertical growing right (a line extends vertically and is positioned horizontally, with consecutive lines displayed to the right of each other).
A Size
A number giving the size of the box within which the text of each line of the cue is to be aligned, to be interpreted as a percentage of the video, as defined by the writing direction.
The Text of the Cue
The raw text of the cue, and rules for its interpretation, allowing the text to be rendered and converted to a DOM fragment.
A text track cue is immutable.
Each text track cue has a corresponding TextTrackCue object, and can be associated with a particular text track. Once a text track cue is associated with a particular text track, the association is permanent.
In addition, each text track cue has two pieces of dynamic information:
The Active Flag
This flag must be initially unset. The flag is used to ensure events are fired appropriately when the cue becomes active or inactive, and to make sure the right cues are rendered.
The user agent must synchronously unset this flag whenever the text track cue is removed from its text track's text track list of cues; whenever the text track itself is removed from its media element's list of text tracks or has its text track mode changed to disabled; and whenever the media element's readyState is changed back to HAVE_NOTHING. When the flag is unset in this way for one or more cues in text tracks that were showing or showing by default prior to the relevant incident, the user agent must, after having unset the flag for all the affected cues, apply the rules for updating the text track rendering of those text tracks.
The Display State
This is used as part of the rendering model, to keep cues in a consistent position. It must initially be empty. Whenever the text track cue active flag is unset, the user agent must empty the text track cue display state.
The text track cues of a media element's text tracks are ordered relative to each other in the text track cue order, which is determined as follows: first group the cues by their text track, with the groups being sorted in the same order as their text tracks appear in the media element's list of text tracks; then, within each group, cues must be sorted by their start time, earliest first; then, any cues with the same start time must be sorted by their end time, earliest first; and finally, any cues with identical end times must be sorted in the order they were created (so e.g. for cues from a WebVTT file, that would be the order in which the cues were listed in the file).
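The ordering rule above amounts to a four-level comparison. The sketch below illustrates it under stated assumptions: the cue shape (`trackIndex`, `startTime`, `endTime`, `created`) is hypothetical, with `trackIndex` standing for the position of the cue's track in the list of text tracks and `created` for a counter that grows with creation order.

```javascript
// Compare two cues in text track cue order:
// track position, then start time, then end time, then creation order.
function compareCues(a, b) {
  return (
    a.trackIndex - b.trackIndex ||
    a.startTime - b.startTime ||
    a.endTime - b.endTime ||
    a.created - b.created
  );
}

// Returns a new array; the input list is left untouched.
function sortInCueOrder(cues) {
  return [...cues].sort(compareCues);
}

// Usage with four illustrative cues.
const orderedIds = sortInCueOrder([
  { id: "d", trackIndex: 1, startTime: 0, endTime: 1, created: 3 },
  { id: "c", trackIndex: 0, startTime: 5, endTime: 9, created: 0 },
  { id: "b", trackIndex: 0, startTime: 5, endTime: 6, created: 2 },
  { id: "a", trackIndex: 0, startTime: 5, endTime: 6, created: 1 },
]).map((cue) => cue.id);
```

Each `||` falls through to the next key only when the previous difference is zero, which mirrors the "then" clauses of the rule.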
II. 4.8.10.12.2 Sourcing In-band Text Tracks
A media-resource-specific text track is a text track that corresponds to data found in the media resource.
Rules for processing and rendering such data are defined by the relevant specifications, e.g. the specification of the video format if the media resource is a video.
When a media resource contains data that the user agent recognises and supports as being equivalent to a text track, the user agent runs the steps to expose a media-resource-specific text track with the relevant data, as follows:
1. Associate the relevant data with a new text track and its corresponding new TextTrack object. The text track is a media-resource-specific text track.
2. Set the new text track's kind, label, and language based on the semantics of the relevant data, as defined by the relevant specification.
3. Populate the new text track's list of cues with the cues parsed so far, following the guidelines for exposing cues, and begin updating it dynamically as necessary.
4. Set the new text track's readiness state to the value that most correctly describes the current state, and begin updating it dynamically as necessary.
For example, if the relevant data in the media resource has been fully parsed and completely describes the cues, then the text track would be loaded. On the other hand, if the data for the cues is interleaved with the media data, and the media resource as a whole is still being downloaded, then the loading state might be more accurate.
5. Set the new text track's mode to the mode consistent with the user's preferences and the requirements of the relevant specification for the data.
6. Leave the text track list of cues empty, and associate with it the rules for updating the text track rendering appropriate for the format in question.
7. Add the new text track to the media element's list of text tracks.
When a media element is to forget the media element's media-resource-specific text tracks, the user agent must remove from the media element's list of text tracks all the media-resource-specific text tracks.
III. 4.8.10.12.3 Sourcing Out-of-band Text Tracks
When a track element is created, it must be associated with a new text track (with its value set as defined below) and its corresponding new TextTrack object.
The text track kind is determined from the state of the element's kind attribute according to the following table; for a state given in a cell of the first column, the kind is the string given in the second column:
State           String
Subtitles       subtitles
Captions        captions
Descriptions    descriptions
Chapters        chapters
Metadata        metadata
The text track label is the element's track label.
The text track language is the element's track language, if any, or the empty string otherwise.
As the kind, label, and srclang attributes are set, changed, or removed, the text track must update accordingly, as per the definitions above.
Changes to the track URL are handled in the algorithm below.
The text track list of cues is initially empty. It is dynamically modified when the referenced file is parsed. Associated with the list are the rules for updating the text track rendering appropriate for the format in question; for WebVTT, this is the rules for updating the display of WebVTT text tracks.
When a track element's parent element changes and the new parent is a media element, then the user agent must add the track element's corresponding text track to the media element's list of text tracks.
When a track element's parent element changes and the old parent was a media element, then the user agent must remove the track element's corresponding text track from the media element's list of text tracks.
When a text track corresponding to a track element is added to a media element's list of text tracks, the user agent must set the text track mode appropriately, as determined by the following conditions:
If the text track kind is subtitles or captions and the user has indicated an interest in having a track with this text track kind, text track language, and text track label enabled, and there is no other text track in the media element's list of text tracks with a text track kind of either subtitles or captions whose text track mode is showing
If the text track kind is descriptions and the user has indicated an interest in having text descriptions with this text track language and text track label enabled, and there is no other text track in the media element's list of text tracks with a text track kind of descriptions whose text track mode is showing
If the text track kind is chapters and the text track language is one that the user agent has reason to believe is appropriate for the user, and there is no other text track in the media element's list of text tracks with a text track kind of chapters whose text track mode is showing
Let the text track mode be showing.
If there is a text track in the media element's list of text tracks whose text track mode is showing by default, the user agent must furthermore change that text track's text track mode to hidden.
If the track element has a default attribute specified, and there is no other text track in the media element's list of text tracks whose text track mode is showing or showing by default
Let the text track mode be showing by default.
Otherwise
Let the text track mode be disabled.
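The mode-selection conditions above can be sketched as a single function. This is an illustration under assumptions, not the user agent's actual logic: the track shape (`kind`, `mode`, `hasDefault`) is hypothetical, and the `userWants` callback stands in for every user-preference test in the conditions (interest in a kind, language, and label, or an appropriate chapters language).

```javascript
// Kinds that compete with each other for the "showing" mode.
const RIVAL_KINDS = {
  subtitles: ["subtitles", "captions"],
  captions: ["subtitles", "captions"],
  descriptions: ["descriptions"],
  chapters: ["chapters"],
};

// track: the track being added; others: tracks already in the list.
// userWants(track): hypothetical predicate for the user's preferences.
function chooseInitialMode(track, others, userWants) {
  const rivals = RIVAL_KINDS[track.kind]; // undefined for metadata
  if (rivals !== undefined && userWants(track)) {
    const rivalShowing = others.some(
      (t) => rivals.includes(t.kind) && t.mode === "showing"
    );
    if (!rivalShowing) {
      // A track that was only showing by default yields to this one.
      for (const t of others) {
        if (t.mode === "showing by default") t.mode = "hidden";
      }
      return "showing";
    }
  }
  const anyShowing = others.some(
    (t) => t.mode === "showing" || t.mode === "showing by default"
  );
  if (track.hasDefault && !anyShowing) return "showing by default";
  return "disabled";
}
```

The function returns the new track's mode and, as a side effect, demotes any showing-by-default track to hidden when a preferred track takes over.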
When a text track corresponding to a track element is created with text track mode set to hidden, showing, or showing by default, and when a text track corresponding to a track element is created with text track mode set to disabled and subsequently changes its text track mode to hidden, showing, or showing by default for the first time, the user agent must immediately and synchronously run the following algorithm. This algorithm interacts closely with the event loop mechanism; in particular, it has a synchronous section (which is triggered as part of the event loop algorithm). The step in that section is step 8.
1. Set the text track readiness state to loading.
2. Let URL be the track URL of the track element.
3. Asynchronously run the remaining steps, while continuing with whatever task was responsible for creating the text track or changing the text track mode.
4. Download: If URL is not the empty string, and its origin is the same as the media element's Document's origin, then fetch URL, from the media element's Document's origin, with the force same-origin flag set.
The tasks queued by the fetching algorithm on the networking task source to process the data as it is being fetched must examine the resource's Content Type metadata, once it is available, if it ever is. If no Content Type metadata is ever available, or if the type is not recognised as a text track format, then the resource's format must be assumed to be unsupported (this causes the load to fail, as described below). If a type is obtained, and represents a supported text track format, then the resource's data must be passed to the appropriate parser as it is received, with the text track list of cues being used for that parser's output.
If the fetching algorithm fails for any reason (network error, the server returns an error code, a cross-origin check fails, etc.), or if URL is the empty string or has the wrong origin as determined by the condition at the start of this step, or if the fetched resource is not in a supported format, then queue a task to first change the text track readiness state to failed to load and then fire a simple event named error at the track element; and then, once that task is queued, move on to the step below labeled monitoring.
If the fetching algorithm does not fail, then, when it completes, queue a task to first change the text track readiness state to loaded and then fire a simple event named load at the track element; and then, once that task is queued, move on to the step below labeled monitoring.
If, while the fetching algorithm is active, either:
    the track URL changes so that it is no longer equal to URL, while the text track mode is set to hidden, showing, or showing by default; or
    the text track mode changes to hidden, showing, or showing by default, while the track URL is not equal to URL
then the user agent must run the following steps:
    1. Abort the fetching algorithm.
    2. Queue a task to fire a simple event named abort at the track element.
    3. Let URL be the new track URL.
    4. Jump back to the top of the step labeled download.
Until one of the above circumstances occurs, the user agent must remain on this step.
5. Monitoring: Wait until the track URL is no longer equal to URL, at the same time as the text track mode is set to hidden, showing, or showing by default.
6. Wait until the text track readiness state is no longer set to loading.
7. Await a stable state. The synchronous section consists of the following step. (The step in the synchronous section is step 8.)
8. Set the text track readiness state to loading.
9. End the synchronous section, continuing the remaining steps asynchronously.
10. Jump to the step labeled download.
IV. 4.8.10.12.4 Text Track API
media.textTracks.length
Returns the number of text tracks associated with the media element (e.g. from track elements). This is the number of text tracks in the media element's list of text tracks.
media.textTracks [n]
Returns the TextTrack object representing the nth text track in the media element's list of text tracks.
track.track
Returns the TextTrack object representing the track element's text track.
The textTracks attribute of media elements must return an array host object for objects of type TextTrack that is fixed length and read only. The same object must be returned each time the attribute is accessed. [WEBIDL]
The array must contain the TextTrack objects of the text tracks in the media element's list of text tracks, in the same order as in the list of text tracks.
interface TextTrack {
  readonly attribute DOMString kind;
  readonly attribute DOMString label;
  readonly attribute DOMString language;
  const unsigned short NONE = 0;
  const unsigned short LOADING = 1;
  const unsigned short LOADED = 2;
  const unsigned short ERROR = 3;
  readonly attribute unsigned short readyState;
           attribute Function onload;
           attribute Function onerror;
  const unsigned short OFF = 0;
  const unsigned short HIDDEN = 1;
  const unsigned short SHOWING = 2;
           attribute unsigned short mode;
  readonly attribute TextTrackCueList cues;
  readonly attribute TextTrackCueList activeCues;
           attribute Function oncuechange;
};
TextTrack implements EventTarget;
textTrack.kind
Returns the text track kind string.
textTrack.label
Returns the text track label.
textTrack.language
Returns the text track language string.
textTrack.readyState
Returns the text track readiness state, represented by a number from the following list:
    TextTrack.NONE (0)
    The text track not loaded state.
    TextTrack.LOADING (1)
    The text track loading state.
    TextTrack.LOADED (2)
    The text track loaded state.
    TextTrack.ERROR (3)
    The text track failed to load state.
textTrack.mode
Returns the text track mode, represented by a number from the following list:
    TextTrack.OFF (0)
    The text track disabled mode.
    TextTrack.HIDDEN (1)
    The text track hidden mode.
    TextTrack.SHOWING (2)
    The text track showing and showing by default modes.
Can be set, to change the mode.
textTrack.cues
Returns the text track list of cues, as a TextTrackCueList object.
textTrack.activeCues
Returns the text track cues from the text track list of cues that are currently active (i.e. that start before the current playback position and end after it), as a TextTrackCueList object.
The kind attribute must return the text track kind of the text track that the TextTrack object represents.
The label attribute must return the text track label of the text track that the TextTrack object represents.
The language attribute must return the text track language of the text track that the TextTrack object represents.
The readyState attribute must return the numeric value corresponding to the text track readiness state of the text track that the TextTrack object represents, as defined by the following list:
NONE (numeric value 0)
The text track not loaded state.
LOADING (numeric value 1)
The text track loading state.
LOADED (numeric value 2)
The text track loaded state.
ERROR (numeric value 3)
The text track failed to load state.
The mode attribute, on getting, must return the numeric value corresponding to the text track mode of the text track that the TextTrack object represents, as defined by the following list:
OFF (numeric value 0)
The text track disabled mode.
HIDDEN (numeric value 1)
The text track hidden mode.
SHOWING (numeric value 2)
The text track showing and showing by default modes.
On setting, if the new value is not either 0, 1, or 2, the user agent must throw an INVALID_ACCESS_ERR exception. Otherwise, if the new value isn't equal to what the attribute would currently return, the new value must be processed as follows:
If the new value is 0
Set the text track mode of the text track that the TextTrack object represents to the text track disabled mode.
If the new value is 1
Set the text track mode of the text track that the TextTrack object represents to the text track hidden mode.
If the new value is 2
Set the text track mode of the text track that the TextTrack object represents to the text track showing mode.
If the mode had been showing by default, this will change it to showing, even though the value of mode would appear not to change.
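The getter and setter behaviour of the mode attribute can be sketched with a small class. This is an illustration only: the class name, the `internalMode` field, and the use of a plain Error carrying the exception name are assumptions. The quoted text says the new value is processed only when it differs from the current return value, yet also notes that setting 2 turns showing by default into showing; the sketch follows the note and always applies the assignment.

```javascript
const OFF = 0;
const HIDDEN = 1;
const SHOWING = 2;

class TrackModeSketch {
  // internalMode is one of: "disabled", "hidden", "showing", "showing by default".
  constructor(internalMode = "disabled") {
    this.internalMode = internalMode;
  }

  get mode() {
    if (this.internalMode === "disabled") return OFF;
    if (this.internalMode === "hidden") return HIDDEN;
    return SHOWING; // covers both showing and showing by default
  }

  set mode(value) {
    if (value !== OFF && value !== HIDDEN && value !== SHOWING) {
      // Stand-in for the INVALID_ACCESS_ERR exception named in the text.
      throw new Error("INVALID_ACCESS_ERR");
    }
    if (value === OFF) this.internalMode = "disabled";
    else if (value === HIDDEN) this.internalMode = "hidden";
    else this.internalMode = "showing";
  }
}

// Usage: the numeric value stays 2 while the underlying mode changes.
const modeExample = new TrackModeSketch("showing by default");
modeExample.mode = SHOWING;
```

Because two internal modes map to the single numeric value 2, a script cannot distinguish them through the attribute alone.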
If the text track mode of the text track that the TextTrack object represents is not the text track disabled mode, then the cues attribute must return a live TextTrackCueList object that represents the subset of the text track list of cues of the text track that the TextTrack object represents whose start times occur before the earliest possible position when the script started, in text track cue order. Otherwise, it must return null. When an object is returned, the same object must be returned each time.
The earliest possible position when the script started is whatever the earliest possible position was the last time the event loop reached step 1.
If the text track mode of the text track that the TextTrack object represents is not the text track disabled mode, then the activeCues attribute must return a live TextTrackCueList object that represents the subset of the text track list of cues of the text track that the TextTrack object represents whose active flag was set when the script started, in text track cue order. Otherwise, it must return null. When an object is returned, the same object must be returned each time.
A text track cue's active flag was set when the script started if its text track cue active flag was set the last time the event loop reached step 1.
interface MutableTextTrack : TextTrack {
  void addCue(in TextTrackCue cue);
  void removeCue(in TextTrackCue cue);
};
mutableTextTrack=media.addTextTrack(kind [, label [, language]])
Creates and returns a new MutableTextTrack object, which is also added to the media element's list of text tracks.
mutableTextTrack.addCue(cue)
Adds the given cue to mutableTextTrack's text track list of cues.
Raises an exception if the argument is null, associated with another text track, or already in the list of cues.
mutableTextTrack.removeCue(cue)
Removes the given cue from mutableTextTrack's text track list of cues.
Raises an exception if the argument is null, associated with another text track, or not in the list of cues.
The addTextTrack (kind, label, language) method of media elements, when invoked, must run the following steps:
1. If kind is not one of the following strings, then throw a SYNTAX_ERR exception and abort these steps:
    subtitles
    captions
    descriptions
    chapters
    metadata
2. If the label argument was omitted, let label be the empty string.
3. If the language argument was omitted, let language be the empty string.
4. Create a new text track, and set its text track kind to kind, its text track label to label, its text track language to language, its text track readiness state to the text track loaded state, its text track mode to the text track hidden mode, and its text track list of cues to an empty list.
5. Add the new text track to the media element's list of text tracks.
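The addTextTrack steps can be sketched as a standalone function. This is a minimal illustration, assuming a plain object with a `textTracks` array in place of a real media element, a plain object in place of a MutableTextTrack, and a plain Error carrying the exception name; none of these shapes come from the quoted standard.

```javascript
const TRACK_KINDS = ["subtitles", "captions", "descriptions", "chapters", "metadata"];

// media: hypothetical stand-in object of the form { textTracks: [] }.
function addTextTrackSketch(media, kind, label = "", language = "") {
  // Step 1: reject unknown kinds.
  if (!TRACK_KINDS.includes(kind)) {
    throw new Error("SYNTAX_ERR");
  }
  // Steps 2 and 3 are covered by the default parameter values.
  // Step 4: create the track in the loaded state and hidden mode.
  const track = {
    kind,
    label,
    language,
    readinessState: "loaded",
    mode: "hidden",
    cues: [],
  };
  // Step 5: append to the media element's list of text tracks.
  media.textTracks.push(track);
  return track;
}

// Usage with a stand-in media object.
const sketchMedia = { textTracks: [] };
const sketchTrack = addTextTrackSketch(sketchMedia, "metadata");
```

A track created this way starts hidden rather than disabled, so its cues can become active and fire events without being displayed.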
The addCue(cue) method of MutableTextTrack objects, when invoked, must run the following steps:
1. If cue is null, then throw an INVALID_ACCESS_ERR exception and abort these steps.
2. If the given cue is already associated with a text track other than the method's MutableTextTrack object's text track, then throw an INVALID_STATE_ERR exception and abort these steps.
3. Associate cue with the method's MutableTextTrack object's text track, if it is not currently associated with a text track.
4. If the given cue is already listed in the method's MutableTextTrack object's text track's text track list of cues, then throw an INVALID_STATE_ERR exception.
5. Add cue to the method's MutableTextTrack object's text track's text track list of cues.
The removeCue(cue) method of MutableTextTrack objects, when invoked, must run the following steps:
1. If cue is null, then throw an INVALID_ACCESS_ERR exception and abort these steps.
2. If the given cue is not associated with the method's MutableTextTrack object's text track, then throw an INVALID_STATE_ERR exception.
3. If the given cue is not currently listed in the method's MutableTextTrack object's text track's text track list of cues, then throw a NOT_FOUND_ERR exception.
4. Remove cue from the method's MutableTextTrack object's text track's text track list of cues.
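The addCue and removeCue steps can be sketched together. The class below is an illustration under assumptions: cues are plain objects with a `track` property, exceptions are plain Errors carrying the exception names from the text, and the class name is hypothetical.

```javascript
class MutableTrackSketch {
  constructor() {
    this.cues = [];
  }

  addCue(cue) {
    if (cue === null) throw new Error("INVALID_ACCESS_ERR");
    // A cue bound to a different track cannot be added here.
    if (cue.track && cue.track !== this) throw new Error("INVALID_STATE_ERR");
    // The association, once made, is never undone.
    if (!cue.track) cue.track = this;
    if (this.cues.includes(cue)) throw new Error("INVALID_STATE_ERR");
    this.cues.push(cue);
  }

  removeCue(cue) {
    if (cue === null) throw new Error("INVALID_ACCESS_ERR");
    if (cue.track !== this) throw new Error("INVALID_STATE_ERR");
    const index = this.cues.indexOf(cue);
    if (index === -1) throw new Error("NOT_FOUND_ERR");
    this.cues.splice(index, 1);
  }
}

// Usage: a removed cue stays associated with its track and can be re-added.
const sketchCueTrack = new MutableTrackSketch();
const sketchCue = { id: "dog bark", track: null };
sketchCueTrack.addCue(sketchCue);
sketchCueTrack.removeCue(sketchCue);
sketchCueTrack.addCue(sketchCue);
```

Note that removeCue does not clear `cue.track`, which reflects the statement that a cue's association with a text track is permanent.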
In this example, an audio element is used to play a specific sound-effect from a sound file containing many sound effects. A cue is used to pause the audio, so that it ends exactly at the end of the clip, even if the browser is busy running some script. If the page had relied on script to pause the audio, then the start of the next clip might be heard if the browser was not able to run the script at the exact time specified.
var sfx = new Audio('sfx.wav');
var sounds = sfx.addTextTrack('metadata');
// add sounds we care about
sounds.addCue(new TextTrackCue('dog bark', 12.783, 13.612, '', '', '', true));
sounds.addCue(new TextTrackCue('kitten mew', 13.612, 15.091, '', '', '', true));
// look the cue up through the track's TextTrackCueList, which provides getCueById
function playSound(id) {
  sfx.currentTime = sounds.cues.getCueById(id).startTime;
  sfx.play();
}
sfx.oncanplaythrough = function () {
  playSound('dog bark');
};
window.onbeforeunload = function () {
  playSound('kitten mew');
  return 'Are you sure you want to leave this awesome page?';
};
interface TextTrackCueList {
  readonly attribute unsigned long length;
  getter TextTrackCue (in unsigned long index);
  TextTrackCue getCueById(in DOMString id);
};
cuelist.length
Returns the number of cues in the list.
cuelist[index]
Returns the text track cue with index index in the list. The cues are sorted in text track cue order.
cuelist.getCueById(id)
Returns the first text track cue (in text track cue order) with text track cue identifier id.
Returns null if none of the cues have the given identifier or if the argument is the empty string.
A TextTrackCueList object represents a dynamically updating list of text track cues in a given order.
The length attribute must return the number of cues in the list represented by the TextTrackCueList object.
The supported property indices of a TextTrackCueList object at any instant are the numbers from zero to the number of cues in the list represented by the TextTrackCueList object minus one, if any. If there are no cues in the list, there are no supported property indices.
To determine the value of an indexed property for a given index index, the user agent must return the indexth text track cue in the list represented by the TextTrackCueList object.
The getCueById(id) method, when called with an argument other than the empty string, must return the first text track cue in the list represented by the TextTrackCueList object whose text track cue identifier is id, if any, or null otherwise. If the argument is the empty string, then the method must return null.
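The lookup rule for getCueById can be sketched as a function over an array of cues. This is a minimal illustration, assuming the array is already in text track cue order and that each cue is a plain object with an `id` property; the function name is hypothetical.

```javascript
// cues: array of hypothetical cue objects { id, ... } in text track cue order.
function getCueByIdSketch(cues, id) {
  // The empty string never matches, even if a cue has an empty identifier.
  if (id === "") return null;
  const found = cues.find((cue) => cue.id === id);
  return found === undefined ? null : found;
}

// Usage: the first cue with a matching identifier is returned.
const lookupCues = [
  { id: "dog bark", startTime: 12.783 },
  { id: "kitten mew", startTime: 13.612 },
  { id: "dog bark", startTime: 20.0 },
];
const firstBark = getCueByIdSketch(lookupCues, "dog bark");
```

Returning the first match matters when identifiers repeat, since identifiers are arbitrary strings and are not required to be unique.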
interface TextTrackCue {
  readonly attribute TextTrack track;
  readonly attribute DOMString id;
  readonly attribute double startTime;
  readonly attribute double endTime;
  readonly attribute boolean pauseOnExit;
  DOMString getCueAsSource( );
  DocumentFragment getCueAsHTML( );
           attribute Function onenter;
           attribute Function onexit;
};
TextTrackCue implements EventTarget;
cue.track
Returns the TextTrack object to which this text track cue belongs, if any, or null otherwise.
cue.id
Returns the text track cue identifier.
cue.startTime
Returns the text track cue start time, in seconds.
cue.endTime
Returns the text track cue end time, in seconds.
cue.pauseOnExit
Returns true if the text track cue pause-on-exit flag is set, false otherwise.
source=cue.getCueAsSource( )
Returns the text track cue text in raw unparsed form.
fragment=cue.getCueAsHTML( )
Returns the text track cue text as a DocumentFragment of HTML elements and other DOM nodes.
The track attribute must return the TextTrack object of the text track with which the text track cue that the TextTrackCue object represents is associated, if any; or null otherwise.
The id attribute must return the text track cue identifier of the text track cue that the TextTrackCue object represents.
The startTime attribute must return the text track cue start time of the text track cue that the TextTrackCue object represents, in seconds.
The endTime attribute must return the text track cue end time of the text track cue that the TextTrackCue object represents, in seconds.
The pauseOnExit attribute must return true if the text track cue pause-on-exit flag of the text track cue that the TextTrackCue object represents is set; or false otherwise.
The direction attribute must return the text track cue writing direction of the text track cue that the TextTrackCue object represents.
The getCueAsSource( ) method must return the raw text track cue text.
The getCueAsHTML( ) method must convert the text track cue text to a DocumentFragment for the media element's Document, using the appropriate rules for doing so.
V. 4.8.10.12.5 Event Definitions
The following are the event handlers that must be supported, as IDL attributes, by all objects implementing the TextTrack interface:
Event handler    Event handler event type
onload           load
onerror          error
oncuechange      cuechange
The following are the event handlers that must be supported, as IDL attributes, by all objects implementing the TextTrackCue interface:
Event handler    Event handler event type
onenter          enter
onexit           exit