Traditionally, there have been two different approaches for delivering digital audio data. In the first approach, the digital audio data is mass downloaded. More specifically, and as shown schematically in FIG. 1, one or more files corresponding to an entire audio stream 10 are transmitted one frame 12 at a time from the server to the client. Once the entire audio stream 10 has been received and reassembled to form a continuous, contiguous audio stream, it is stored in storage 14 prior to being transmitted to a media player 16.
In the second approach, streaming technology is used to deliver the digital audio data ‘just-in-time’. More specifically, and as shown schematically in FIG. 2, an entire audio stream 20 is transmitted one frame 22 at a time from the server to the client, where it is received and reassembled, in part, to provide a continuous, contiguous audio stream (i.e., a small portion of audio stream continuity is preserved). Once each frame is played by the media player 26, it is then discarded from the buffer.
Small audio streams, or audio-video streams, which for example correspond to individual songs, very short movies, and music videos, are typically transmitted using the first approach. For these smaller media streams, the delays experienced by users are generally tolerated because they are relatively short. Typically, the delays are one or two minutes and, although longer delays are possible, they tend not to exceed this.
The delivery of larger audio streams, which for example include books and radio shows, presents a problem for the user community. Whereas a single song that plays for 4 minutes may take 1 minute to download, an audio book that plays for 12 hours may take 3 to 4 hours to download. Although the throughput rate is roughly the same in both cases, users of this media complain about the hours of waiting before they can receive and use the media selected.
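The proportionality behind these figures can be checked with a short calculation. The following is a minimal sketch that assumes, purely for illustration, that download time scales linearly with playing time at a fixed throughput. The function name `download_minutes` and the ratio derived from the song example are illustrative assumptions, not measured values.

```python
def download_minutes(play_minutes: float, ratio: float) -> float:
    """Estimate download time, assuming it scales linearly with playing time."""
    return play_minutes * ratio


# Assumed ratio, taken from the song example: 1 minute of download
# for every 4 minutes of audio.
SONG_RATIO = 1 / 4

song = download_minutes(4, SONG_RATIO)        # 4-minute song
book = download_minutes(12 * 60, SONG_RATIO)  # 12-hour audio book

print(f"song: {song:.0f} min, book: {book / 60:.0f} h")
```

Under this assumption the 12-hour audio book takes about 3 hours, the lower end of the 3 to 4 hour range given above.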
While streaming technology obviates the waiting associated with mass download, any degradation experienced in the delivery of the content in real time introduces interruptions in the audio stream, breaking up the user's experience of that audio stream. Moreover, since the digital audio data is not stored, repositioning within the audio stream (e.g., using rewind or fast forward functions) interrupts the just-in-time nature of content delivery, and thus may introduce significant delays and/or be inefficient. For example, in the case of rewinding a streamed audio stream, the content associated with the new position selected in the audio stream will need to be downloaded a second time, and the future content temporarily stored in the buffer will be discarded.
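The rewind behavior described above can be illustrated with a toy model. The sketch below is not any real streaming protocol; the class `StreamingClient`, its three-frame look-ahead buffer, and the frame counts are all hypothetical and chosen only to show that a backward seek discards buffered frames and downloads earlier frames a second time.

```python
from collections import deque


class StreamingClient:
    """Toy model of just-in-time delivery with a small look-ahead buffer."""

    def __init__(self, total_frames: int, buffer_size: int = 3):
        self.total_frames = total_frames
        self.buffer_size = buffer_size
        self.buffer = deque()    # indices of frames awaiting playback
        self.position = 0        # index of the next frame to play
        self.fetch_counts = {}   # frame index -> number of times downloaded

    def _fill(self) -> None:
        # Download frames ahead of the play position until the buffer is full.
        next_index = self.position + len(self.buffer)
        while len(self.buffer) < self.buffer_size and next_index < self.total_frames:
            self.fetch_counts[next_index] = self.fetch_counts.get(next_index, 0) + 1
            self.buffer.append(next_index)
            next_index += 1

    def play_next(self) -> int:
        self._fill()
        if not self.buffer:
            raise IndexError("end of stream")
        frame = self.buffer.popleft()  # a played frame is discarded
        self.position = frame + 1
        return frame

    def seek(self, index: int) -> None:
        # Repositioning throws away the buffered future content.
        self.buffer.clear()
        self.position = index


client = StreamingClient(total_frames=10)
played = [client.play_next() for _ in range(4)]  # plays frames 0 to 3
client.seek(1)                                   # rewind to frame 1
played.append(client.play_next())                # plays frame 1 again

# Frames that had to be downloaded more than once because of the rewind.
refetched = sorted(i for i, n in client.fetch_counts.items() if n > 1)
print(played, refetched)
```

In this run, frames 1 to 3 are downloaded twice, and frames 4 and 5, which were already sitting in the buffer when the rewind occurred, are thrown away.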
In both of these existing technologies, great effort is made to reassemble the audio stream into a continuous, contiguous audio stream prior to being presented to the media player. In the case of the mass download approach, the entire audio stream is downloaded and reassembled prior to use. In streaming technologies, a very small portion of the audio stream is downloaded and reassembled prior to use, with additional content continuously delivered and already-played content continuously discarded, to maintain a very small portion of continuity in the audio stream. Notably, this reconstruction of the audio stream complicates the digital audio data delivery and increases delivery time.
In addition, in both of the existing technologies, the user has limited tracking options. For example, ‘The Godfather’ is an audio book that, as commercially released, contains 24 MP3 files that require 80 megabytes of storage and plays at normal speed for a total of almost 9 hours. In order to use these files with existing mass download technology, the user must manually keep track of which file is currently being listened to and where one is in that particular file.
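The bookkeeping that falls to the user here amounts to mapping a position in the overall audio stream to a file and an offset within that file. The following minimal sketch shows that mapping. The function `locate` is hypothetical, and the per-file durations are invented for illustration (24 equal files of 22 minutes each), since the actual file lengths are not given.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(position_s: float, durations_s: list[float]) -> tuple[int, float]:
    """Map a position in the overall stream to (file index, offset in that file)."""
    ends = list(accumulate(durations_s))  # cumulative end time of each file
    if not 0 <= position_s < ends[-1]:
        raise ValueError("position outside the audio stream")
    # First file whose end time lies beyond the requested position.
    index = bisect_right(ends, position_s)
    start = ends[index - 1] if index else 0
    return index, position_s - start


# Hypothetical durations: 24 files of 22 minutes each (8.8 hours in total).
durations = [22 * 60] * 24

file_index, offset = locate(5 * 3600, durations)  # 5 hours into the book
print(f"file {file_index + 1} of {len(durations)}, {offset} s into the file")
```

With these assumed durations, a listener who is five hours into the book is 840 seconds (14 minutes) into the fourteenth file, which is the pair of values the user would otherwise have to remember manually.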
Tracking problems also develop if the user's audio player automatically changes files, if the user is listening to multiple audio streams, and/or if the user listens to audio streams on more than one client device (e.g., if a user is listening to the audio stream at work and wants to resume play at home). It can be particularly difficult and time-consuming for the user to resume listening to an audio stream at a specific position.
In addition, tracking problems will develop during cross-media switching (e.g., between audio and text). For example, if a user is listening to an e-book in the car and later wants to start reading the e-book at home, tracking problems will develop because the granularity provided by the prior art methods is insufficient to tie the audio stream to the corresponding eText at a letter or word level. Further complications are expected when the cross-media switching is not on a single platform (e.g., a single computer), but rather is on multiple platforms (e.g., between a computer and a cell phone or e-Reader).