The present invention relates to a method and apparatus for serving images, even very large images, over a "thinwire" (e.g., over the Internet or any other network or application having bandwidth limitations).
The Internet, including the World Wide Web, has gained in popularity in recent years. The Internet enables clients/users to access information in ways never before possible over existing communications lines.
Often, a client/viewer desires to view and have access to relatively large images. For example, a client/viewer may wish to explore a map of a particular geographic location. The whole map, at the highest (full) level of resolution, will likely require a pixel representation larger than the viewer's screen, even with the screen in its highest resolution mode.
One response to this restriction is for an Internet server to pre-compute many smaller images of the original image. The smaller images may be lower resolution (zoomed-out) views and/or portions of the original image. Most image archives use this approach. Clearly this is a sub-optimal approach since no preselected set of views can anticipate the needs of all users.
Some map servers (see, e.g., URLs http://www.mapquest.com and http://www.MapOnUs.com) use an improved approach in which the user may zoom and pan over a large image. However, transmission over the Internet involves significant bandwidth limitations (i.e., transmission is relatively slow). Accordingly, such map servers suffer from at least three problems:
Since a brand new image is served up for each zoom or pan request, visual discontinuities in the zooming and panning result. Another reason for this is the discrete nature of the zoom/pan interface controls.
Significantly less than realtime response.
The necessarily small fixed size of the viewing window (typically about 3" × 4.5"). This does not allow much of a perspective.
To generalize, what is needed is an apparatus and method which allows realtime visualization of large scale images over a "thinwire" model of computation. To put it another way, it is desirable to optimize the model which comprises an image server and a client viewer connected by a low bandwidth line.
One approach to the problem is by means of progressive transmission. Progressive transmission involves sending a relatively low resolution version of an image and then successively transmitting better resolution versions. Because the first, low resolution version of the image requires far less data than the full resolution version, it can be viewed quickly upon transmission. In this way, the viewer is allowed to see lower resolution versions of the image while waiting for the desired resolution version. This gives the transmission the appearance of continuity. In addition, in some instances, the lower resolution version may be sufficient or may in any event exhaust the display capabilities of the viewer display device (e.g., monitor).
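The idea of progressive transmission can be sketched as follows. This is a minimal illustration only, not a method taken from any of the cited references: it assumes a grayscale image stored as a list of rows, and it builds the successive versions by averaging 2x2 pixel blocks. The names `downsample` and `progressive_levels` are hypothetical. A practical scheme would transmit only the refinement data for each level rather than a complete new image.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block (integer mean).

    Assumes the image height and width are both even.
    """
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def progressive_levels(img, levels):
    """Return versions of img ordered from coarsest to finest."""
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid[::-1]


# Illustrative 8x8 test image; the pixel values are arbitrary.
image = [[(x * y) % 256 for x in range(8)] for y in range(8)]
stream = progressive_levels(image, 2)

# Pixel count of each version in transmission order.
sizes = [len(v) * len(v[0]) for v in stream]
```

In this example the first version sent holds 4 pixels, against 64 for the full image, which is why it can be displayed almost immediately.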
Thus, R. L. White and J. W. Percival, "Compression and Progressive Transmission of Astronomical Images," SPIE Technical Conference 2199, 1994, describes a progressive transmission technique based on bit planes that is effective for astronomical data.
However, utilizing progressive transmission barely begins to solve the "thinwire" problem. A viewer zooming or panning over a large image (e.g., map) desires realtime response. This of course is not achieved if the viewer must wait for display of the desired resolution of a new quadrant or view of the map each time a zoom and pan is initiated. Progressive transmission does not achieve this realtime response when it is the higher resolution versions of the image which are desired or needed, as these are transmitted later.
The problem could be effectively solved if, in addition to varying resolution over time (i.e., progressive transmission), resolution were also varied over the physical extent of the image.
Specifically, using foveation techniques, high resolution data is transmitted at the user's gaze point but with lower resolution as one moves away from that point. The very simple rationale underlying these foveation techniques is that the human field of vision (centered at the gaze point) is limited. Most of the pixels rendered at uniform resolution are wasted for visualization purposes. In fact, it has been shown that the spatial resolution of the human eye decreases exponentially away from the center gaze point. E. L. Schwartz, "The Development of Specific Visual Projections in the Monkey and the Goldfish: Outline of a Geometric Theory of Receptotopic Structure," Journal of Theoretical Biology, 69:655-685, 1977.
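The exponential decrease of resolution away from the gaze point can be illustrated with a short sketch. The function name `resolution_at`, the Euclidean distance measure, and the default falloff constant are assumptions chosen for illustration; they are not values given in the cited work.

```python
import math


def resolution_at(px, py, gaze, max_resolution=1.0, falloff=0.01):
    """Relative resolution wanted at pixel (px, py) for a given gaze point.

    Resolution is max_resolution at the gaze point and decays
    exponentially with Euclidean distance from it. The falloff
    constant (per pixel) is an illustrative assumption.
    """
    distance = math.hypot(px - gaze[0], py - gaze[1])
    return max_resolution * math.exp(-falloff * distance)


gaze_point = (100, 100)
at_gaze = resolution_at(100, 100, gaze_point)   # full resolution
nearby = resolution_at(150, 100, gaze_point)    # 50 pixels away
far = resolution_at(200, 100, gaze_point)       # 100 pixels away
```

Under these assumptions a pixel 100 pixels from the gaze point would need only about 37% of the full resolution, so most of the data in a uniformly rendered image would be unnecessary.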
The key then is to mimic the movements and spatial resolution of the eye. If the user's gaze point can be tracked in realtime and a truly multi-foveated image transmitted (i.e., a variable resolution image mimicking the spatial resolution of the user's eye from the gaze point), all data necessary or useful to the user would be sent, and nothing more. In this way, the "thinwire" model is optimized, whatever the associated transmission capabilities and bandwidth limitations.
In practice, in part because eye tracking is imperfect, using multi-foveated images is superior to attempting display of an image portion of uniform resolution at the gaze point.
There have in fact been attempts to achieve multifoveated images in a "thinwire" environment.
F. S. Hill Jr., Sheldon Walker Jr. and Fuwen Gao, "Interactive Image Query System Using Progressive Transmission," Computer Graphics, 17(3), 1983, describes progressive transmission and a form of foveation for a browser of images in an archive. The realtime requirement does not appear to be a concern.
T. H. Reeves and J. A. Robinson, "Adaptive Foveation of MPEG Video," Proceedings of the 4th ACM International Multimedia Conference, 1996, gives a method to foveate MPEG-standard video in a thinwire environment. The MPEG standard could provide a few levels of resolution, but the authors consider only a 2-level foveation. The client/viewer can interactively specify the region of interest to the server/sender.
R. S. Wallace, P. W. Ong, B. B. Bederson and E. L. Schwartz, "Space-variant image processing," Intl. J. of Computer Vision, 13:1 (1994) 71-90, discusses space-variant images in computer vision. "Space-variant" may be regarded as synonymous with the term "multifoveated" used above. A biological motivation for such images is the complex logmap model of the transformation from the retina to the visual cortex (E. L. Schwartz, "A quantitative model of the functional architecture of human striate cortex with application to visual illusion and cortical texture analysis," Biological Cybernetics, 37 (1980) 63-76).
Philip Kortum and Wilson S. Geisler, "Implementation of a Foveated Image Coding System For Image Bandwidth Reduction," Human Vision and Electronic Imaging, SPIE Proceedings Vol. 2657, 350-360, 1996, implement a real time system for foveation-based visualization. They also noted the possibility of using foveated images to reduce bandwidth of transmission.
M. H. Gross, O. G. Staadt and R. Gatti, "Efficient triangular surface approximations using wavelets and quadtree data structures," IEEE Trans. on Visualization and Computer Graphics, 2(2), 1996, uses wavelets to produce multifoveated images.
Unfortunately, each of the above attempts is essentially based upon fixed super-pixel geometries, which amount to partitioning the visual field into regions of varying (pre-determined) sizes called super-pixels, and assigning the average value of the color in the region to the super-pixel. The smaller pixels (higher resolution) are of course intended to be at the gaze point, with progressively larger super-pixels (lower resolution) about the gaze point.
However, effective realtime visualization over a "thinwire" requires precision and flexibility. This cannot be achieved with a geometry of predetermined pixel size. What is needed is a flexible foveation technique which allows one to modify the position and shape of the basic foveal regions, the maximum resolution at the foveal region and the rate at which the resolution falls away. This will allow the "thinwire" model to be optimized.
In addition, none of the above noted references addresses the issue of providing multifoveated images that can be dynamically (incrementally) updated as a function of user input. This property is crucial to the solution of the thinwire problem, since it is essential that information be "streamed" at a rate that optimally matches the bandwidth of the network with the human capacity to absorb the visual information.
The present invention overcomes the disadvantages of the prior art by utilizing means for tracking or approximating the user's gaze point in realtime and, based on the approximation, transmitting dynamic multifoveated image(s) (i.e., a variable resolution image over its physical extent mimicking the spatial resolution of the user's eye about the approximated gaze point) updated in realtime.
"Dynamic" means that the image resolution is also varying over time. The user interface component of the present invention may provide a variety of means for the user to direct this multifoveation process in real time.
Thus, the invention addresses the model which comprises an image server and a client viewer connected by a low bandwidth line. In effect, the invention reduces the bandwidth from server to client, in exchange for a very modest increase of bandwidth from the client to the server.
Another object of the invention is that it allows realtime visualization of large scale images over a "thinwire" model of computation.
An additional advantage is the new degree of user control provided for realtime, active, visualization of images (mainly by way of foveation techniques). The invention allows the user to determine and change in realtime, via input means (for example, without limitation, a mouse pointer or eye tracking technology), the variable resolution over the space of the served up image(s).
An additional advantage is that the invention demonstrates a new standard of performance that can be achieved by large-scale image servers on the World Wide Web at current bandwidths, or at those available in the near future.
Note also, the invention has advantages over the traditional notion of progressive transmission, which has no interactivity. Instead, the progressive transmission of an image has been traditionally predetermined when the image file is prepared. The invention's use of dynamic (constantly changing in realtime based on the user's input) multifoveated images allows the user to determine how the data are progressively transmitted.
Other advantages of the invention include that it allows the creation of the first dynamic multifoveated images, and of a more general class of multifoveated images. The present invention can use wavelet technology. The flexibility of the foveation approach based on wavelets allows one to easily modify the following parameters of a multifoveated image: the position and shape of the basic foveal region(s), the maximum resolution at the foveal region(s), and the rate at which the resolution falls away. Wavelets can be replaced by any multiresolution pyramid scheme, but wavelet-based approaches appear to be preferred, as they are more flexible and have the best compression properties.
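How a wavelet representation lends itself to foveation can be sketched in one dimension. The example below is an illustrative assumption, not the invention's actual scheme: it uses a simple Haar transform in average/difference form, and it keeps a detail coefficient only if its support lies within a reach of the fovea that doubles at each coarser level. The names `haar_forward`, `haar_inverse` and `foveate`, and the doubling rule, are hypothetical. Here `fovea` sets the position of the foveal region, while `radius` controls its extent and, through the doubling rule, how quickly resolution falls away.

```python
def haar_forward(signal):
    """Full 1D Haar decomposition; len(signal) must be a power of two.

    Returns the overall mean and a list of detail rows, finest first.
    """
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        avg = [(approx[i] + approx[i + 1]) / 2 for i in pairs]
        dif = [(approx[i] - approx[i + 1]) / 2 for i in pairs]
        details.append(dif)
        approx = avg
    return approx[0], details


def haar_inverse(mean, details):
    """Rebuild the signal from the mean and detail rows (finest first)."""
    approx = [mean]
    for dif in reversed(details):
        rebuilt = []
        for a, d in zip(approx, dif):
            rebuilt += [a + d, a - d]
        approx = rebuilt
    return approx


def foveate(signal, fovea, radius):
    """Zero detail coefficients that lie too far from the fovea index."""
    mean, details = haar_forward(signal)
    kept = []
    for level, dif in enumerate(details):
        span = 2 ** (level + 1)       # samples covered by one coefficient
        reach = radius * 2 ** level   # assumed rule: reach doubles per level
        row = []
        for i, d in enumerate(dif):
            lo, hi = i * span, (i + 1) * span - 1
            dist = 0 if lo <= fovea <= hi else min(abs(fovea - lo), abs(fovea - hi))
            row.append(d if dist <= reach else 0.0)
        kept.append(row)
    return haar_inverse(mean, kept)


signal = [3.0, 7.0, 1.0, 9.0, 4.0, 4.0, 8.0, 2.0,
          5.0, 5.0, 6.0, 0.0, 2.0, 8.0, 7.0, 1.0]
foveated = foveate(signal, fovea=0, radius=1)
```

In this sketch the samples at the fovea are reconstructed exactly, while samples farther away are replaced by progressively coarser averages. Only the retained coefficients would need to be transmitted.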
Another advantage is the present invention's use of dynamic data structures and associated algorithms. This helps optimize the "effective real time behavior" of the system. The dynamic data structures allow the use of "partial information" effectively. Here information is partial in the sense that the resolution at each pixel is only partially known. But as additional information is streamed in, the partial information can be augmented. Of course, this principle is a corollary to progressive transmission.
Another advantage is that the dynamic data structures may be well exploited by the special architecture of the client program. For example, the client program may be multi-threaded with one thread (the "manager thread") designed to manage resources (especially bandwidth resources). This manager is able to assess network congestion, and other relevant parameters, and translate any literal user request into the appropriate level of demand for the network. For example, when the user's gaze point is focused on a region of an image, this may be translated into requesting a certain amount, say, X bytes of data. But the manager can reduce this to a request over the network of (say) X/2 bytes of data if the traffic is congested, or if the user is panning very quickly.
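The manager's translation of a literal user request into network demand might look like the following sketch. The function name `effective_request`, the pan-speed units, and the threshold value are assumptions for illustration. The halving mirrors the X/2 example above and is not a prescribed policy.

```python
def effective_request(requested_bytes, congested, pan_speed,
                      fast_pan_threshold=500.0):
    """Scale a literal request of X bytes to what is actually requested.

    pan_speed is assumed to be in pixels per second, and the threshold
    is an arbitrary illustrative value. Halving follows the X/2 example
    in the text.
    """
    if congested or pan_speed > fast_pan_threshold:
        return requested_bytes // 2
    return requested_bytes


normal = effective_request(1000, congested=False, pan_speed=0.0)
under_load = effective_request(1000, congested=True, pan_speed=0.0)
fast_pan = effective_request(1000, congested=False, pan_speed=900.0)
```

In a multi-threaded client, a function of this kind would be called by the manager thread before each request is issued, using its current estimate of congestion.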
Another advantage of the present invention is that the server need send only that information which has not yet been served. This reduces communication traffic.
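One way a server could avoid resending data is to remember, per client, which pieces it has already served. The sketch below assumes the image data is divided into identifiable pieces (here, wavelet coefficients named by hypothetical string ids). The class name `ImageSession` and the id format are illustrative only.

```python
class ImageSession:
    """Per-client record of which data pieces have already been served."""

    def __init__(self):
        self.sent = set()

    def serve(self, wanted):
        """Return only the wanted pieces not yet sent, and record them."""
        fresh = [piece for piece in wanted if piece not in self.sent]
        self.sent.update(fresh)
        return fresh


session = ImageSession()
first = session.serve(["c0", "c1", "c2"])    # nothing sent yet
second = session.serve(["c1", "c2", "c3"])   # overlaps the first request
```

When the gaze point moves only slightly, successive requests overlap heavily, so the second response carries just the pieces that are new.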