1. Field of the Invention
The present invention relates to multimedia computer communication systems and, more particularly, to a buffering system for streaming media, such as audio/video, on the Internet.
2. Description of the Related Art
Prior to the development of Internet streaming media technologies, audio and video were formatted into files, which users needed to download to their computers before the files could be heard or viewed. Real-time, continuous media, such as that from a radio station, was not suitable for this arrangement because a file of finite size had to be created before it could be downloaded. The advent of streaming media technologies allowed users to listen to or view the files as they were being downloaded, and allowed users to “tune in” to a continuous media broadcast, or “stream”, such as from a radio station. There are two fundamental types of streaming media: (i) material that originates from a source having a real-time nature, such as a radio or TV broadcast, and (ii) material that originates from a non-real-time source, such as a disk file. An example of non-real-time material might be a piece of music stored as a disk file, or a portion of a broadcast that originally was real-time, perhaps yesterday's TV evening news, and was recorded into a disk file. For purposes of clarity within this document, streaming media of type (i) will be referred to as “broadcast” media, and streaming media of type (ii) will be referred to as “file based” media. Broadcast streaming media has as its source a system or arrangement that by definition can only be transmitted to users as fast as the material is generated; for example, a disk jockey speaking into a microphone. Broadcast streaming media is the focus of this patent application.
Since audio and video media must play out over a period of time, it is more appropriate to think of bandwidth requirements than file size. The bandwidth requirement of audio or video media refers to the data rate, in bits per second, that must be transmitted and received in order to listen to or view the material uninterrupted. Transmitting the audio or video material over a connection slower than the bandwidth requirement results in unsatisfactory viewing or listening, if viewing or listening is possible at all. The connection available to most Internet users is by dial-up modem, which has a maximum receive data rate of 56,000 bits per second. Most audio and video available on the Internet has been compressed to be listenable or viewable within the 56,000 bits per second modem bandwidth. Requirements for achieving adequate audio and video over the Internet generally consume a considerable portion of the listener's available bandwidth.
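By way of a rough illustration only, the bandwidth requirement described above can be expressed as a comparison between a stream's encoded data rate and the receive rate of the connection. The function names below are hypothetical, and the 24,000 bits per second stream rate is simply an example value; only the 56,000 bits per second modem rate is taken from the text.

```python
MODEM_RATE_BPS = 56_000  # maximum dial-up receive rate cited in the text


def can_play_uninterrupted(stream_bps: int, connection_bps: int) -> bool:
    """Return True if the connection meets the stream's bandwidth requirement."""
    return connection_bps >= stream_bps


def bandwidth_share(stream_bps: int, connection_bps: int) -> float:
    """Fraction of the connection's bandwidth that the stream consumes."""
    return stream_bps / connection_bps


if __name__ == "__main__":
    stream_bps = 24_000  # example encoded rate (an assumption for illustration)
    print(can_play_uninterrupted(stream_bps, MODEM_RATE_BPS))
    print(f"{bandwidth_share(stream_bps, MODEM_RATE_BPS):.0%} of the modem bandwidth")
```

This sketch treats the connection rate as constant; in practice the instantaneous bandwidth available to the user varies over time.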
Internet connection quality can vary rapidly over time, with two primary factors responsible for degradation of the instantaneous bandwidth actually available to the user. These factors are the quality of the user's modem connection over telephone lines, which can have periods of interference causing reduced available bandwidth, and momentary Internet congestion at various points along the route over which the user's data flows. Each of these factors can cause delays and interruptions in the transmission of data to the user. Internet data communications devices such as routers are designed to drop data “packets” if they get overloaded. For material that is not time sensitive, these dropped packets will usually be resent, and the user will eventually be presented with the material. However, since streaming media is time sensitive, dropped packets can have a significant impact on the receipt and playback of an audio or video stream. These degradations in the receipt of Internet data are very common, and prevent most users from being able to listen to or view streaming media without interruption unless some special provisions have been incorporated into the user's computer software to accommodate data transmission interruptions.
These interruptions are commonly referred to as “dropouts”, meaning that the data flow to the user has been interrupted (i.e., the audio “drops out”). Dropouts can be extremely annoying—for example, while listening to music. The current state-of-the-art solution to the problem uses a pre-buffering technique to store up enough audio or video data in the user's computer so that it can play the audio or video with a minimum of dropouts. This process requires the user to wait until enough of the media file is buffered in memory before listening or viewing can begin. The media data is delivered by a server computer which has available to it the source of the media data, such as by a connection to a radio station. When the user connects to the server via the Internet, audio/video output at the user's system is delayed while the user's buffer is filled to a predetermined level. Typical pre-buffering wait times range from 10 to 20 seconds or more, determined by the vendor providing the audio or video media. Even with this pre-buffering process, interruptions in playback still occur.
In this process, the user has a software application on the computer commonly called a “media player”. Using the features built into the media player, the user starts the audio or video stream, typically by clicking on a “start” button, and waits 10 to 20 seconds or so before the material starts playing. During this time, data is received from the source and fills the media player's buffer. The audio or video data is delivered from the source at the rate at which it is to be played out. If, for example, the user is listening to an audio stream encoded to be played out at 24,000 bits per second, the source sends the audio data at the rate of 24,000 bits per second. Provided that the user waits 10 seconds, and the receipt of the buffering data has not been interrupted, there is enough media data stored in the buffer to play for 10 seconds.
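The pre-buffering arithmetic in the preceding example may be sketched as follows. This is a minimal illustration, not a description of any particular media player; the function names are hypothetical, and uninterrupted reception during the wait is assumed.

```python
def buffered_seconds(wait_s: float, source_bps: int, playout_bps: int) -> float:
    """Seconds of playable media accumulated after waiting wait_s seconds.

    Assumes reception is uninterrupted during the wait.
    """
    bits_received = wait_s * source_bps
    return bits_received / playout_bps


def buffer_size_bytes(seconds: float, playout_bps: int) -> int:
    """Approximate memory needed to hold the given playing time (8 bits per byte)."""
    return int(seconds * playout_bps) // 8


if __name__ == "__main__":
    # The source sends at the playout rate, so 10 s of waiting yields 10 s of media.
    print(buffered_seconds(10, 24_000, 24_000))  # 10.0
    print(buffer_size_bytes(10, 24_000))         # 30000
```

Because the source rate equals the playout rate in a conventional system, the ratio is always one: each second of waiting stores exactly one second of media.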
Gaps in the receipt of audio/video data, due to Internet slowdowns, cause the buffer to deplete. Because transmission of audio/video media data to the user takes place at the rate it is played out, the user's buffer level can never be increased or replenished while it is playing. Thus, gaps in the receipt of audio/video media data inexorably cause the buffer level to decrease from its initial level. In time, extended or repeated occurrences of these gaps empty the user's buffer. The audio/video material stops playing, and the buffer must be refilled to its original predetermined level before playing of the media resumes.
By way of illustration, in a 10-second pre-buffering scenario, if the data reception stopped the instant that the media started playing, it would play for exactly 10 seconds. Once it starts playing, the media data plays out of the buffer as new media data replenishes the buffer. The incoming data rate equals the rate at which the data is played out of the user's buffer, assuming the receipt of data across the Internet is unimpeded. If there are no interruptions in the receipt of the media data for the duration of the time the user listens to or watches the material, the buffer level remains constant and there will still be 10 seconds of data stored in the media player's buffer when the user stops the player. On the other hand, if the media player encounters interruptions totaling 6 seconds while playing the material, there would be only 4 seconds of media data remaining in the buffer when the user stopped it. If data reception interruptions at any time during the playing exceed 10 seconds, the user's media player buffer becomes exhausted. There is no media data to play, and the audio or video stops; a dropout has occurred. At this point a software mechanism in the media player stops attempting to play any more of the material, and starts the buffering process again. The media player remains silent until the buffer refills, at which time the media player will once again start playing the material.
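The behavior described above can be approximated by a simple second-by-second simulation. The sketch below is an idealized model and not the implementation of any actual media player: the function name `simulate_playback` is hypothetical, time is divided into whole seconds, and each second of reception is assumed to deliver exactly one second of media.

```python
from typing import List, Tuple


def simulate_playback(prebuffer_s: int, reception: List[bool]) -> Tuple[int, int]:
    """Simulate a conventional player after its initial pre-buffering.

    reception[i] is True if data arrived during second i.
    Returns (final buffer level in seconds, number of dropouts).
    """
    level = prebuffer_s      # buffer starts at the predetermined level
    dropouts = 0
    rebuffering = False
    for received in reception:
        if rebuffering:
            # Player is silent; incoming data only refills the buffer.
            if received:
                level += 1
            if level >= prebuffer_s:
                rebuffering = False
        elif received:
            # Inflow equals outflow, so the level is unchanged.
            pass
        elif level > 0:
            # A gap in reception: playback drains one second from the buffer.
            level -= 1
        else:
            # Buffer exhausted and no data arriving: a dropout occurs.
            dropouts += 1
            rebuffering = True
    return level, dropouts


if __name__ == "__main__":
    # Gaps totaling 6 seconds leave 4 seconds in a 10-second buffer.
    print(simulate_playback(10, [True] * 20 + [False] * 6 + [True] * 20))  # (4, 0)
    # Gaps exceeding 10 seconds exhaust the buffer and force re-buffering.
    print(simulate_playback(10, [False] * 11 + [True] * 15))  # (10, 1)
```

Note that the level never rises above its starting value while playing, which reflects the point that data arriving at the playout rate cannot replenish a depleted buffer.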
Both streaming media types are handled similarly in conventional systems, and both are handled similarly by the streaming media buffering system of the present invention. The two streaming media types are readily distinguished. Broadcast streaming media has as its source a system or arrangement that by definition can only be transmitted to users as fast as the material is generated; for example, a disk jockey speaking into a microphone. File based media, on the other hand, can be transmitted to users at any data rate, since there is no inherent time element to a file residing on a computer disk. With conventional Internet streaming media systems for streaming media of either type, media data is transmitted from the server to the user at the rate at which it will be played out, regardless of the data rate capabilities of the connection between the server and the user.
Conventional streaming media systems may incorporate buffering systems for programmatic purposes. For example, the system may buffer media data at the server for the purpose of packet assembly/disassembly. Media data may also be buffered at the server to permit programming conveniences such as dealing with chunks of data of a specific size. Such server buffering of media data is not used by conventional streaming media systems to mitigate long term Internet performance degradation as described hereinafter.
The sending of audio or video files via a network is known in the art. U.S. Pat. No. 6,029,194 to Tilt describes a media server for the distribution of audio/video over networks, in which retrieved media frames are transferred to a FIFO buffer. A clock rate for a local clock is adjusted according to the fullness of the buffer. The media frames from the buffer are sent in the form of data packets over the networks in response to interrupts generated by the local clock. In this manner, the timing for the media frames is controlled by the user to assure a continuous stream of video during editing. U.S. Pat. No. 6,014,706 to Cannon et al. discloses an apparatus and method for displaying streamed digital video data on a client computer. The client computer is configured to receive the streamed digital video data from a server computer via a computer network. The streamed digital video data is transmitted from the server computer to the client computer as a stream of video frames. U.S. Pat. No. 6,002,720 to Yurt et al. discloses a system of distributing video and/or audio information wherein digital signal processing is employed to achieve high rates of data compression. U.S. Pat. No. 5,923,655 to Veschi et al. discloses a system and method for communicating audio/video data in a packet-based computer network wherein transmission of data packets through the computer network requires variable periods of transmission time. U.S. Pat. No. 5,922,048 to Emura discloses a video server apparatus having a stream control section which determines a keyframe readout interval and a keyframe playback interval that satisfy a playback speed designated by a terminal apparatus. Finally, U.S. Pat. No. 6,014,694 to Aharoni et al. discloses a system and method for adaptively transporting video over networks, including the Internet, wherein the available bandwidth varies with time.
There remains a need in the art for a method and system that afford immediate and uninterrupted listening/viewing of streaming media by the user.