As the Internet continues to grow in popularity, more and more media content is being placed “on-line” and made accessible via the Internet. Examples of such media include voice, music, images, video, and 3-dimensional scenery. This media is typically stored on one or more servers. A user typically accesses the media at the server by using a client computer that has a suitably programmed browser. The user's browser can communicate the user's request for a particular type of media to the server by way of any of a number of different protocols. When the server receives a request from a user's browser, it executes the request by retrieving the requested media and transmitting it in a suitable format to the user's computer. The user's browser can then take the steps necessary (such as launching an associated player application) so that the user can experience (i.e., view or listen to) the requested media.
In the past, downloading media via a network, such as the Internet, has been a time-consuming task. This, together with the transmission bottlenecks that can occur, has led to poor user browsing experiences. More recently, efforts have been made to enhance the user's browsing experience. One such effort concerns the use of so-called “streaming multimedia”. In streaming multimedia, media content is streamed over the Internet and played at the same time. For example, an initial portion of the desired media is compressed, downloaded through the Internet, and buffered locally on the client's machine. Subsequently, when the local buffer is full, the client's machine launches a player that decompresses and plays the buffered media while continuing to download the remaining portions of the compressed media from the Internet. The streaming mechanism works well for “linear” media content such as voice, music, and video. It does not work well for media content for which random access is desired.
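The buffer-then-play behavior described above can be illustrated with a minimal sketch. It is not taken from any particular streaming product: the function name `stream_and_play`, the `buffer_threshold` parameter, and the use of `zlib` as a stand-in codec are all assumptions made for illustration, and the "network" is simply a list of pre-compressed chunks.

```python
import zlib
from collections import deque


def stream_and_play(compressed_chunks, buffer_threshold):
    """Buffer chunks until the threshold is reached, then decode one
    buffered chunk for every further chunk that arrives.

    Hypothetical helper: zlib stands in for the media codec, and
    iterating over compressed_chunks stands in for the download.
    """
    buffer = deque()
    played = []
    playing = False
    for chunk in compressed_chunks:
        buffer.append(chunk)
        if not playing and len(buffer) >= buffer_threshold:
            playing = True  # local buffer is full: launch the "player"
        if playing:
            # Play the oldest buffered chunk while the download continues.
            played.append(zlib.decompress(buffer.popleft()))
    # Download finished: play whatever is still buffered.
    while buffer:
        played.append(zlib.decompress(buffer.popleft()))
    return b"".join(played)


media = [f"frame{i};".encode() for i in range(6)]
chunks = [zlib.compress(m) for m in media]
result = stream_and_play(chunks, buffer_threshold=3)
```

Because chunks are consumed strictly in arrival order, the sketch also shows why this mechanism suits linear content: nothing in it lets the player ask for a portion of the media other than the next one.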
In many instances it is desirable to enable a user to navigate through particular media content. This gives the user an opportunity to view or experience only those portions of the media content that are of particular interest to the user. For example, a user may desire to view only one particular portion of a downloaded image. Alternatively, the user may desire to view several selected portions of an image, but not all of the portions of the image.
To meet the need for random access to media content, several different forms of media content have emerged. These forms include JPEG 2000 and compressed 3D image-based rendering (IBR) scenes (such as concentric mosaic and Lumigraph/Lightfield scenes), to name just a few.
As an example, consider what happens when a user browses a large JPEG 2000 compressed image via the Internet. The basic unit of a JPEG 2000 compressed image is a block bit stream having a certain resolution, spatial location, and quality level. The basic unit also includes an abstract layer that indexes where each basic unit of the JPEG 2000 compressed image is located. When such an image is browsed through a network, the user (i.e., the user's software) may specify to a server a particular region of interest, as well as the browsing resolution and the quality of the desired image region. The server then sends only the bit stream that corresponds to the particular image region that is specified by the user.
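The region-of-interest lookup described above can be sketched as follows. This is only an illustration under stated assumptions: `BlockKey`, `select_blocks`, the dictionary-based index, and the fixed 100-byte block size are hypothetical and do not reproduce the actual JPEG 2000 codestream syntax or any real server interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockKey:
    """Hypothetical identifier for one block bit stream."""
    resolution: int  # resolution level of the block
    col: int         # block column at that resolution
    row: int         # block row at that resolution
    quality: int     # quality layer, 0 = base layer


def select_blocks(index, resolution, region, max_quality):
    """Return the (offset, length) spans of the blocks that cover the
    requested region at the requested resolution and quality.

    region is (col_min, row_min, col_max, row_max), inclusive, in block units.
    """
    col_min, row_min, col_max, row_max = region
    spans = [
        span
        for key, span in index.items()
        if key.resolution == resolution
        and col_min <= key.col <= col_max
        and row_min <= key.row <= row_max
        and key.quality <= max_quality
    ]
    return sorted(spans)


# Toy index: a 4 x 4 grid of blocks, two quality layers, 100 bytes per block.
index = {}
offset = 0
for row in range(4):
    for col in range(4):
        for quality in range(2):
            index[BlockKey(1, col, row, quality)] = (offset, 100)
            offset += 100

# Request the central 2 x 2 blocks at base quality only.
spans = select_blocks(index, resolution=1, region=(1, 1, 2, 2), max_quality=0)
bytes_sent = sum(length for _, length in spans)
```

In this toy layout the server would send 400 of the 3,200 stored bytes, which is the saving that random access is meant to provide.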
As another example, consider what happens when browsing a compressed IBR scene, such as a 3-D walkthrough scene compressed by concentric mosaic or Lumigraph/Lightfield techniques. In this example, hundreds of photographs of a particular scene are taken from a number of different views and angles. The photos are digitized, compressed, and stored at a server location. When a user desires to browse a particular scene, the user's browsing software supplies parameters of the desired view, such as the rendered position, the camera viewing angle, and the field of view (FOV), i.e., the resolution. The scene can then be rendered by accessing rays in selected photographs. The server receives the parameters and finds the corresponding rays in the images that are digitized and stored by the server. The server then streams only the compressed image data pertaining to the desired view over the network for decoding and display on the client machine.
In each of the above examples, as with the other kinds of media content mentioned above, the amount of media data that is streamed or sent over the network can be quite large and can easily reach tens or hundreds of megabytes. Constraints on the bandwidth of the transmission medium, as well as on the client memory available to store such image data, continue to present challenges to providing a desirable user experience. Current efforts at designing applications for viewing such image data have fallen short of the goal of providing a desirable user experience. One such effort provides an application known as a “load-all-then-render” viewer, such as a baseline JPEG viewer. This type of viewer is very “unintelligent” in that it simply waits for all of the pertinent media data to be collected before performing a rendering operation. Typically, moving between scenes or within a particular scene results in a noticeably stuttered effect or multiple pauses while the relevant media data is collected or re-collected from the server. Other viewers, such as a progressive JPEG viewer, use a periodic update feature in which several waypoints for media data collection are set. When a particular waypoint is reached, the viewer renders the image data for the user. This approach has also been sub-optimal, generally for the same reasons as were mentioned for the baseline JPEG viewer.
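The difference between the two viewer styles described above can be made concrete with a small sketch. The function names, the byte-count waypoints, and the chunk sizes are assumptions chosen for illustration; neither function models an actual JPEG decoder. Each returns the byte counts at which a render would occur.

```python
def load_all_then_render(chunk_sizes):
    """Render exactly once, after every chunk has arrived."""
    received = 0
    for size in chunk_sizes:
        received += size
    return [received]  # bytes available at the single render


def periodic_update(chunk_sizes, waypoints):
    """Render each time the received byte count reaches the next waypoint."""
    renders = []
    received = 0
    pending = sorted(waypoints)
    for size in chunk_sizes:
        received += size
        crossed = False
        # A single chunk may pass more than one waypoint; render only once.
        while pending and received >= pending[0]:
            pending.pop(0)
            crossed = True
        if crossed:
            renders.append(received)
    return renders


chunk_sizes = [10] * 10  # ten chunks of 10 bytes each
single = load_all_then_render(chunk_sizes)
periodic = periodic_update(chunk_sizes, waypoints=[25, 50, 100])
```

With these inputs the first viewer renders once, at 100 bytes, and the second renders at 30, 50, and 100 bytes. In both cases the render schedule is fixed in advance and does not respond to what the user is currently trying to view, so the user still waits whenever the needed data has not yet reached the next render point.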
Accordingly, this invention arose out of concerns associated with providing improved methods and systems for randomly accessing structured media content files.