The invention relates to the presentation of information on-line and more specifically to the display of multimedia presentations wherein the various media are provided from different sources and are synchronized for presentation.
The Internet and various intranets are well known communication networks for the transfer of digital data. While most of the data transmitted on these networks corresponds to text or certain computer programs, more and more of it now pertains to multimedia content such as images, audio and video. An Internet or intranet user generally requests a single-medium or multimedia presentation by using a technology called "hypertext linking" or "hyperlinking."
A hypertext document is one which is linked to other documents via hyperlinks. A hyperlink often appears in a hypertext document as a piece of highlighted text. Hyperlinks make it easy to follow cross-references between documents. The text is usually a word or phrase describing something about which a user might wish to obtain further information. When the user activates the hyperlink, typically by clicking on it using a mouse, a link command is initiated, which causes a program at the linked address to be executed. The program execution, in turn, causes the user's view to be updated to show the linked document, typically containing more information on the highlighted word or phrase. Such information may be in the form of text, audio, video, two-dimensional image or three-dimensional image. Hypertext documents with multimedia capabilities are referred to as "hypermedia documents." The regions on the screen which are active hyperlinks are called hot-links. While hypertext technology is presently most common in text and image media, it is also beginning to appear in animation, video and audio.
Nowadays, most people are familiar with the application of hypertext by using a mouse to click on a hot-link provided on a computer display of a homepage from the World Wide Web (the Web) on the Internet. Data on the Web is located via Uniform Resource Locators, or URLs. URLs comprise the draft standard for specifying an object on the Internet. Each URL specifies the access method and the location for the files. Documents on the Web are written in a simple "markup language" called HTML, which stands for Hypertext Markup Language. File formats of data on the Web are specified as MIME formats, where MIME stands for "Multipurpose Internet Mail Extensions." (Reference: on the Web at address oac.uci.edu/indiv/ehood/MIME/MIME.html). Examples of file formats on the Web are .au (probably the most common audio format), .html (HTML files), .jpg (JPEG encoded images), .mid (MIDI music format), .mpg (MPEG encoded video), and .ps (PostScript files). In addition to being encoded in .au format, audio is also encoded in wav format and stored in files labeled with the suffix .wav. Wav audio is not compressed beyond the quantization due to sampling rate and bits per sample. Radio quality audio is typically sampled at 22,050 Hz with 8 bits per channel in stereo, which gives a data rate of 43 KBps. Reasonable quality speech can be obtained at 11,025 Hz sampling, 8 bits per sample, mono, yielding a data rate of 11 KBps. MPEG provides various standards for audio compression, typically derived from 44,100 Hz stereo sampling at 16 bits per sample. MPEG audio is typically compressed to between 16 Kbps and 384 Kbps. Other standards, such as G.723 and GSM, are tailored to speech signals and compress to 5 Kbps.
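By way of illustration only, the uncompressed data rates quoted above follow from multiplying the sampling rate, the bits per sample and the number of channels. The short sketch below reproduces that arithmetic; the function name is hypothetical, and it assumes that "KBps" denotes units of 1,024 bytes per second.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels / 8


# Radio quality: 22,050 Hz, 8 bits per channel, stereo.
radio = pcm_bytes_per_second(22_050, 8, 2)    # 44100.0 bytes per second
# Speech quality: 11,025 Hz, 8 bits, mono.
speech = pcm_bytes_per_second(11_025, 8, 1)   # 11025.0 bytes per second

# Assumption: 1 KB = 1,024 bytes.
print(round(radio / 1024), round(speech / 1024))  # prints "43 11"
```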
Typical Web servers follow the HTTP protocol. When a user requests the content of a URL on a server, the entire content associated with that URL is sent to the user's client machine. Such content may consist of an html or htm document with auxiliary information attached to it, such as images and perhaps animation software. The server will commence sending the data and continue sending it until either it has completed sending all the data or it has received a message from the client to stop sending any more data. Some servers serve in streaming mode, wherein data is sent at some prescribed average data rate, say K bits every N seconds. A streaming server is serviced by a scheduling algorithm to maintain this average data rate.
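As a minimal sketch of the streaming mode described above, the following computes a send schedule that releases at most K bits every N seconds. It is an illustration under stated assumptions, not the scheduling algorithm of any particular streaming server: the function name and parameters are hypothetical, and the sketch only computes send times rather than performing network transmission.

```python
from typing import Iterator, Tuple


def paced_schedule(data: bytes, k_bits: int, n_seconds: float) -> Iterator[Tuple[float, bytes]]:
    """Yield (send_time_seconds, chunk) pairs so that at most k_bits leave every n_seconds."""
    if k_bits < 8 or n_seconds <= 0:
        raise ValueError("k_bits must be at least 8 and n_seconds must be positive")
    chunk_bytes = k_bits // 8  # whole bytes that fit in one interval
    for i, offset in enumerate(range(0, len(data), chunk_bytes)):
        yield i * n_seconds, data[offset:offset + chunk_bytes]


# Example: 10 bytes of content at 32 bits every 0.5 seconds.
for send_time, chunk in paced_schedule(b"x" * 10, k_bits=32, n_seconds=0.5):
    print(send_time, len(chunk))
```

A real server would sleep or use timers until each send time is reached; an ordinary HTTP server, by contrast, sends the data as fast as the connection allows.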
Media players for decoding and playing audio and video have been standard features on personal computers for more than a decade. Example computer media players include the QuickTime Player of Apple Computer and the Microsoft Media Player. The players typically required that all of the data for the entire presentation be resident locally on the computer before the player starts playing. Such an arrangement means that when media content is coming from some other source on the Web, the player must wait until all content is downloaded before starting to play. Newer versions of computer media players have begun to support streaming capabilities, whereby the streaming players buffer some data from outside sources on the Web and then start playing, even though much of the data has not yet arrived. In a streaming implementation, if the data rate of the incoming data is not fast enough, the player pauses when the data in its buffer is depleted, rebuffers with more data, and then resumes play.
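The buffer, play, pause and rebuffer cycle described above can be sketched as a small simulation. This is an illustrative model only: time is divided into hypothetical ticks, the threshold and rate parameters are assumptions, and no actual media player is claimed to be implemented this way.

```python
from typing import List


def simulate_playback(arrivals: List[int], play_rate: int, start_threshold: int) -> List[str]:
    """Return the player state ("buffer" or "play") for each tick.

    arrivals[i] is the amount of media data arriving during tick i;
    play_rate is the amount consumed per tick while playing;
    start_threshold is the buffer level required to start or resume playing.
    """
    buffered = 0
    playing = False
    states = []
    for arrived in arrivals:
        buffered += arrived
        if not playing and buffered >= start_threshold:
            playing = True    # enough data buffered: start or resume play
        if playing and buffered < play_rate:
            playing = False   # buffer depleted: pause and rebuffer
        if playing:
            buffered -= play_rate
            states.append("play")
        else:
            states.append("buffer")
    return states


print(simulate_playback([2, 2, 0, 0, 1, 3, 2], play_rate=2, start_threshold=4))
# ['buffer', 'play', 'play', 'buffer', 'buffer', 'play', 'play']
```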
Streaming media have found new applications. One such application is the delivery of audio presentations augmented with images or transparencies. The images are displayed at appropriate time intervals during the audio playback, as prescribed by the authors of the presentation. Various technologies have been invented to accommodate such presentations. Real Networks is using a file format called SMIL, which encapsulates all the relevant information in one file. SMIL makes certain that all the data that is required to be provided at a particular point in a presentation is already present in one file at the client at that instant, and then streams this file using a streaming server at some prescribed data rate. Microsoft's NetShow utilizes a similar scheme but with its ASF data format. All known techniques for delivery of such synchronized content utilize multiplexing of all of the content into a single file, followed by streaming that file using a streaming server. Often, however, the two requirements of a single file and a streaming server are undesirable added complexities.
What is desirable, therefore, is a system and method for enabling the presentation of time synchronous content without the requirements of creating a single file and of including a streaming server.
It is also desirable that the system and method be capable of providing a synchronous presentation even if the various files do not reside on the same server.
It is an objective of the present invention, therefore, to provide such a system and method.
The invention is concerned with the delivery of data from one or more sources, typically web servers, over a communications network such as the Web or an intranet, to end users who are typically deploying computers. The data is coded content information comprising a time synchronous, so-called "primary" medium, such as audio or video, together with various other so-called "secondary" media from the same or other sources, such as images or events to be displayed on a monitor, synchronized to appear at predetermined time points in the media presentation. For example, the data may comprise all the information required for the presentation of a lecture using audio and images of accompanying transparencies, where each transparency is displayed at an appropriate interval of time during the audio presentation. The presentation is delivered in streaming fashion, so that the end user does not have to wait for the entirety of the data to be downloaded before starting the presentation, but rather can start viewing and listening to the presentation after a relatively short transmission period.
The invention comprises a content creation tool for preparing the data in an appropriate format with appropriate auxiliary information, the format (called HotAudio file, or haf) for the data, and a player (called HotAudio player, which is the subject of a co-pending patent application Ser. No. 09/396,946) that can utilize the information in the formatted data so that the end user experience is pleasant. The auxiliary information in the formatted data is used by the player to schedule its requests from the servers on which the various images or events for the presentation reside.
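One way in which a player might use auxiliary information to schedule its requests is sketched below, purely by way of example. The record fields, the function name and the scheduling rule are all hypothetical: they are not the haf format or the HotAudio player's method, which are not detailed in this passage. The sketch assumes that each secondary item has a known size and display time, and that a fixed share of bandwidth is available for secondary data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SecondaryEvent:
    """Hypothetical auxiliary record for one secondary item (not the haf layout)."""
    url: str
    display_time_s: float  # point in the primary media at which the item must appear
    size_bytes: int


def request_deadlines(events: List[SecondaryEvent],
                      bandwidth_bytes_per_s: float,
                      margin_s: float = 0.0) -> List[Tuple[str, float]]:
    """Return (url, latest_request_time_s) pairs in display order.

    Assumed rule: an item must be requested early enough that its transfer
    finishes margin_s seconds before its display time.
    """
    deadlines = []
    for ev in sorted(events, key=lambda e: e.display_time_s):
        transfer_s = ev.size_bytes / bandwidth_bytes_per_s
        latest_start = max(0.0, ev.display_time_s - transfer_s - margin_s)
        deadlines.append((ev.url, latest_start))
    return deadlines


slides = [
    SecondaryEvent("http://example.com/slide2.jpg", 30.0, 40_000),
    SecondaryEvent("http://example.com/slide1.jpg", 5.0, 40_000),
]
print(request_deadlines(slides, bandwidth_bytes_per_s=4_000))
```

In this example each slide takes 10 seconds to transfer, so the first slide must be requested immediately and the second no later than 20 seconds into the presentation.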
Ideally, after an initial, relatively short delay comprising the initial transmission period, the presentation proceeds without interruption. In case of network congestion, as often happens on the Web, the pauses that will invariably occur are handled so as to minimize the degradation of the overall experience. For example, if secondary data for an event has not been received by the time the player needs it, the primary media playback pauses and the player stops receiving primary media data until all the necessary secondary event data has arrived. Once the necessary secondary event data has arrived, the player resumes its normal mode of operation.
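The pausing behavior described above can be illustrated with a simplified timeline calculation. The sketch is a model under stated assumptions rather than the player's actual logic: the function name is hypothetical, the arrival times are taken as given, the initial startup delay is ignored, and the suspension of primary data reception during a pause is not modeled.

```python
from typing import List, Tuple


def presentation_timeline(events: List[Tuple[float, float]]) -> Tuple[List[float], float]:
    """Compute when each secondary event is shown and the total pause time.

    events: (media_time_due, wall_clock_arrival) pairs, sorted by media_time_due.
    Returns (wall-clock display times, total seconds the primary playback paused).
    """
    stall = 0.0
    shown = []
    for due, arrived in events:
        wall_due = due + stall             # wall-clock time at which playback reaches the event
        if arrived > wall_due:
            stall += arrived - wall_due    # primary playback pauses until the data arrives
            wall_due = arrived
        shown.append(wall_due)
    return shown, stall


# Second event arrives 6 seconds late, so every later event shifts by 6 seconds.
print(presentation_timeline([(10, 5), (20, 26), (30, 31)]))
# ([10.0, 26, 36.0], 6.0)
```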
The invention is ideally suited for streaming media players that do not utilize special streaming servers. The invention does not require that the primary media data and the secondary event data be multiplexed into a single streaming file.