The field of the invention relates to Internet communication and the near-instantaneous delivery and playback of digitally encoded audio and video. Internet broadcasting or web casting allows many people to listen to radio stations or to view news programs over the internet. However, internet broadcasting or web casting has an average latency of 5-20 seconds. That is, listeners hear an internet radio station's music or talk radio program 5-20 seconds after the station starts it. This latency arises from, for example, encoding, internet transport (distribution), and decoding.
While this kind of latency may be acceptable for some applications (e.g., listening to music, talk shows, or any pre-recorded program), there are time-critical applications for which a 5-20 second delay is unacceptable. For example, real-time market updates, emergency broadcasts (fire, natural or manmade disasters), and military, police or 911 dispatches may not be able to tolerate such a delay.
One obstacle to internet broadcasting is the high cost of the encoding station, both for hardware and software. The complexity associated with setting up the encoding station, as well as the required maintenance, makes it even more difficult to establish and operate such an encoding station. Another obstacle is the lack of a standard among audio and video players. Presently, three major media players can play back digital multimedia streams: Microsoft's Windows Media™, RealNetworks' Real One™ and Apple's QuickTime Media Player™. Each of these players requires a different way of broadcasting over the internet. The variety of network protocols, routing methods and security rules governing the usage of the internet also makes internet broadcasting difficult.
One method of broadcasting over the internet is termed streaming. Microsoft®, RealNetworks®, and Apple® Computer are the three largest companies offering streaming products. However, streams from each of their systems are generally incompatible with one another. Streams encoded by Microsoft's Windows Media™ Server only work with Windows Media Player or Real One player; those encoded by RealNetworks' Real Server™ can only be played by RealPlayer™; and those encoded by Apple's QuickTime only work with the QuickTime Media Player™ or Real One player.
At nearly the same time that Microsoft, RealNetworks and Apple Computer developed their proprietary streaming systems, the Moving Picture Experts Group (MPEG), an ISO/IEC working group that sets standards for coded audio and video, released the MPEG-1 standard for encoding and compressing digital audio and video. A subset of this specification, MPEG-1 Audio Layer III (commonly referred to as MP3), quickly became the most popular compressed digital audio format because of its superior compression ratios and audio fidelity. Further contributing to the popularity of the MP3 format was the widespread availability of inexpensive (and in many cases, free) authoring and playback tools made possible by the presence of an open, published standard. Driven by overwhelming public support for the MP3 format, many media players, including RealPlayer, Windows Media Player, and QuickTime, quickly added support for the MP3 standard.
Seizing on the popularity of the MP3 audio format, On-Demand Technologies™ (“ODT”) developed the AudioEdge™ server, which simultaneously serves a single MP3 audio stream to all major players. Prior to AudioEdge™, broadcasters wishing to stream to the widest possible audience were required to encode and serve streams using multiple proprietary platforms. With AudioEdge™, one MP3 encoder and one serving platform reach all popular players. In this manner, AudioEdge™ saves bandwidth, hardware, and maintenance costs. Additionally, because AudioEdge™ supports both Windows Media (the most popular proprietary streaming media format) and MP3 (the most popular standards-based streaming media format) streams, the AudioEdge™ system eliminates the risk of technology lock-in that is associated with many proprietary platforms.
Multimedia streaming is defined as the real-time delivery and playback of digitally encoded audio and/or video. The advantages of streaming over alternative methods of distributing multimedia content over the internet are widely documented. Among the most important is that playback can begin immediately, without waiting for the complete multimedia file to be downloaded.
Two types of streaming are common today on the internet: on-demand and live. ODT AudioEdge™ delivers both live and on-demand (archived file) streams encoded in MP3 or Windows Media (WMA) format, which can be played using the major media players. Additionally, AudioEdge™ is capable of delivering both archived Apple QuickTime and RealNetworks encoded media files on-demand.
On-demand streaming delivers a prerecorded (e.g., an archived) multimedia file for playback by a single user upon request. For on-demand streaming, an archived file must be present for each user to select and view. An example of on-demand streaming would be a television station that saves each news broadcast into an archived file and makes this archived file available for streaming at a later time. Interested users would then be able to listen to and/or view this archived broadcast when it is so desired.
Live streaming involves the distribution of digitized multimedia information by one or more users as it occurs in real-time. In the above example, the same news station could augment its prerecorded archived content with live streaming, thus offering its audience the ability to watch live news broadcasts as they occur.
Live streaming involves four processes: (1) encoding, (2) splitting, (3) serving, and (4) decoding/playback. For successful live streaming, all processes must occur in real-time. Encoding involves turning the live broadcast signal into compressed digital data suitable for streaming. Splitting, an optional step, involves reproducing the original source stream for distribution to servers or other splitters. The splitting or reflecting process is typically used during the live streaming of internet broadcasts (webcasts) to many users when scalability is important.
Serving refers to the delivery of a live stream to users who wish to receive it. Often, serving and splitting functions can occur simultaneously from a single serving device. Last, decoding is the process of decompressing the encoded stream so that it can be heard and/or viewed by an end user. The decoding and playback process is typically handled by player software such as RealNetworks' Real One Player, Microsoft's Windows Media Player, or Apple's QuickTime player. All further uses of the term “streaming” refer to live streaming over the internet, and further uses of the term “server” refer to a device capable of serving and splitting live streams.
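The four processes described above can be sketched as a chain of generators. This is only an illustrative sketch under stated assumptions: zlib stands in for a real audio or video codec, the function names (encode, split, serve, decode, run_pipeline) are hypothetical, and the serving stage simply passes packets through where a real server would write them to a network connection.

```python
import itertools
import zlib
from typing import Iterable, Iterator, List


def encode(frames: Iterable[bytes]) -> Iterator[bytes]:
    """Stage 1: compress each captured frame (zlib is a stand-in codec)."""
    for frame in frames:
        yield zlib.compress(frame)


def split(stream: Iterable[bytes], copies: int) -> List[Iterator[bytes]]:
    """Stage 2 (optional): reproduce the source stream once per branch."""
    return list(itertools.tee(stream, copies))


def serve(stream: Iterable[bytes]) -> Iterator[bytes]:
    """Stage 3: deliver packets to one listener (pass-through placeholder)."""
    for packet in stream:
        yield packet


def decode(stream: Iterable[bytes]) -> Iterator[bytes]:
    """Stage 4: decompress each packet so it can be played back."""
    for packet in stream:
        yield zlib.decompress(packet)


def run_pipeline(frames: List[bytes], listeners: int) -> List[List[bytes]]:
    """Run all four stages and return what each listener would play."""
    branches = split(encode(frames), listeners)
    return [list(decode(serve(branch))) for branch in branches]


if __name__ == "__main__":
    captured = [b"frame-%d" % i for i in range(3)]
    for listener, played in enumerate(run_pipeline(captured, listeners=2)):
        print(listener, played)
```

Each listener receives an identical copy of the source frames, which is the role the splitter plays when a webcast must scale to many users.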
As noted earlier, three major software players are available; however, they are not compatible with each other. In other words, a proprietary RealNetworks-encoded audio stream can only be served by a RealNetworks server and played with the RealNetworks Real One Player. RealNetworks claims that its new Real One player, made available in late 2002, can play back Windows Media streams as well as Apple QuickTime's MPEG-4 format. In practice, however, the broadcaster would have to either choose one of the three proprietary streaming formats, knowing that certain listeners will be excluded from hearing and/or viewing the stream, or simultaneously encode and stream in all three formats.
Unfortunately, existing streaming audio and/or video technologies, although termed live, still exhibit a time delay from when an audio or video signal is encoded to when the encoded signal is decoded to produce an audio or video output signal. For person-to-person conversation, for example, this delay of as much as 20 seconds is simply unacceptable.
In general, the internet broadcasting of video and audio introduces an average latency of 5-20 seconds. That is, about 5-20 seconds elapse between the time live video and audio frames are captured and the time viewers can actually hear and view those frames. The sources of this latency for audio and video are similar, and are generally a result of encoding (e.g., video/audio capture and compression of data), delivery (e.g., splitting, serving and transport over IP), and decoding (e.g., buffering, data decompression and playback).
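The end-to-end latency described above is the sum of the three stage delays. The following sketch, assuming purely illustrative per-stage values that are not measurements from any real system, shows how such a latency budget can be totaled and broken down by stage. The class name LatencyBudget and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class LatencyBudget:
    """Per-stage delay in seconds for one live stream (illustrative only)."""
    encoding_s: float   # capture and compression
    delivery_s: float   # splitting, serving and transport over IP
    decoding_s: float   # buffering, decompression and playback

    def total(self) -> float:
        """End-to-end latency is the sum of the three stage delays."""
        return self.encoding_s + self.delivery_s + self.decoding_s

    def shares(self) -> Dict[str, float]:
        """Fraction of the total latency contributed by each stage."""
        total = self.total()
        return {
            "encoding": self.encoding_s / total,
            "delivery": self.delivery_s / total,
            "decoding": self.decoding_s / total,
        }


if __name__ == "__main__":
    # Assumed example values chosen to fall inside the 5-20 second range.
    budget = LatencyBudget(encoding_s=2.0, delivery_s=3.0, decoding_s=5.0)
    print("total seconds:", budget.total())
    print("shares:", budget.shares())
```

Breaking the total down this way makes it easier to see which stage a low-latency design would need to address first.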
Thus, there exists a need for an improved system for sending and receiving audio and video over a network, such as the internet, with minimal delay. Such a minimal delay may be one that is not perceptible to a user. Such minimal delay may also be referred to as “real-time”, “no delay” or “zero delay”.