This invention relates generally to the field of computer graphics and animation, and more specifically to the field of computerized virtual reality environments and the display, manipulation, and navigation of three-dimensional objects (in both two-dimensional and three-dimensional formats).
It is often desirable to be able to display and navigate a lifelike three-dimensional representation of an object or scene in a computerized environment. However, known applications directed to this purpose tend to suffer from several shortcomings. First, they tend to utilize computer-generated 3D models of objects which are then rendered on a computer monitor (or other output device) in 2D form. While the rendered models should theoretically closely resemble their real-world counterparts, they generally do not look very realistic. Realism may be enhanced by increasing their level of detail, but this results in increased processing burden and slower rendering speed. Speed is particularly problematic where viewing and navigation are to be provided in a networked environment, e.g., over the Internet/World Wide Web, largely owing to the size of the image data to be transmitted. There is great interest in the ability to allow users to view, manipulate, and navigate realistic three-dimensional objects in virtual environments over the Internet, since this would enhance personal communications, provide greater possibilities for entertainment, and allow consumers to closely inspect goods prior to engaging in e-commerce transactions. However, prior efforts to allow display, manipulation, and navigation of three-dimensional objects have largely been thwarted by the limited bandwidth available to most users.
The current invention provides methods and apparatus for review of images of three-dimensional objects in a system of networked computers, and navigation through the computerized virtual environment of the objects (i.e., in the environment in which they are displayed). The invention may be implemented in any form that is encompassed by the claims set forth at the end of this document. So that the reader may better understand the invention and some of the various forms in which it may be provided, following is a brief summary of particularly preferred versions of the invention.
A three-dimensional object may be imaged (e.g., by a digital camera) from several viewpoints distributed about the object, and the image obtained at each viewpoint may be stored in conjunction with the viewpoint's coordinates about the object. Preferably, the viewpoints are densely and evenly distributed about the object; for example, the images might be obtained from viewpoints evenly spread about the surface of a virtual sphere surrounding the object, with each viewpoint being situated no more than 30 degrees from an adjacent viewpoint (with lesser angular separation being more desirable).
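By way of illustration only, the following sketch shows one way the viewpoint coordinates might be laid out on a virtual sphere. It assumes a simple azimuth/elevation grid with a 30-degree step; the function names and the grid layout are hypothetical and are not prescribed by the invention. Such a grid keeps adjacent viewpoints within 30 degrees of one another, although it places viewpoints more densely near the poles than a truly even distribution would.

```python
import math


def viewpoint_grid(step_deg=30):
    """Map (azimuth, elevation) in degrees to a unit vector on the sphere.

    Assumed layout: rings of constant elevation spaced step_deg apart,
    with viewpoints every step_deg of azimuth on each ring.
    """
    if step_deg <= 0 or 360 % step_deg or 180 % step_deg:
        raise ValueError("step_deg must evenly divide 180 and 360")
    viewpoints = {}
    for elev in range(-90, 91, step_deg):
        # At the poles every azimuth is the same point, so keep only one.
        azimuths = [0] if abs(elev) == 90 else range(0, 360, step_deg)
        for az in azimuths:
            e, a = math.radians(elev), math.radians(az)
            viewpoints[(az, elev)] = (
                math.cos(e) * math.cos(a),
                math.cos(e) * math.sin(a),
                math.sin(e),
            )
    return viewpoints


def angular_separation_deg(u, v):
    """Great-circle angle in degrees between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    return math.degrees(math.acos(dot))
```

With a 30-degree step this grid yields 62 viewpoints (five rings of twelve plus the two poles); each image would be stored under its (azimuth, elevation) key.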
The object's image can then be transmitted for display by a client computer over a client-server network, and the user may issue commands to manipulate the object so as to accurately simulate manipulation of the actual three-dimensional object. The client computer may display the object's image from one of the viewpoints. If the user then wishes to manipulate the object, the user will issue a command to the server to index from the coordinates of the first viewpoint to the coordinates of some adjacent viewpoint(s). The images of the adjacent viewpoints will then be displayed in a sequence corresponding to the order in which the coordinates of the viewpoints are indexed. As an example, the user may "rotate" the virtual object by indexing about the coordinates of viewpoints encircling the object, and images of the viewpoints at these coordinates will be displayed to the user in succession. To the user, this may appear as an animated view of the rotating three-dimensional object, or of a rotating three-dimensional model of the object, even though the display is rendered solely from two-dimensional images.
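The indexing described above may be sketched as follows, assuming (hypothetically) that viewpoints are keyed by (azimuth, elevation) in degrees on a regular 30-degree grid and that commands are the strings "left", "right", "up", and "down". These names are illustrative assumptions only, and handling of the poles is simplified to a clamp.

```python
# Hypothetical command set: each command indexes by one grid step.
COMMAND_DELTAS = {
    "left": (-1, 0),
    "right": (1, 0),
    "up": (0, 1),
    "down": (0, -1),
}


def index_viewpoint(current, command, step_deg=30):
    """Return the coordinates of the adjacent viewpoint for a command."""
    if command not in COMMAND_DELTAS:
        raise ValueError(f"unknown command: {command}")
    d_az, d_elev = COMMAND_DELTAS[command]
    az, elev = current
    az = (az + d_az * step_deg) % 360                    # azimuth wraps around
    elev = max(-90, min(90, elev + d_elev * step_deg))   # elevation is clamped
    return (az, elev)


def rotation_sequence(start, command, steps, step_deg=30):
    """Coordinates visited, in order, when a command is repeated."""
    sequence = []
    current = start
    for _ in range(steps):
        current = index_viewpoint(current, command, step_deg)
        sequence.append(current)
    return sequence
```

Displaying the stored images for `rotation_sequence((0, 0), "right", 12)` in order would show one full orbit about the object, ending back at the starting viewpoint.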
This arrangement is advantageously implemented in either multi-user or single-user settings. In a multi-user setting, the images are preferably stored locally on the client computer, and the network is used to transmit data other than images (e.g., each user's current viewpoint coordinates, indexing commands). Thus, each user may manipulate and navigate the object on his/her computer, but can obtain the viewpoint coordinates at other client computers so as to obtain the same view as another user. Alternatively, the user may obtain the indexing commands being issued by users at other client computers so as to allow these commands to apply to the user's own display.
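As a rough illustration of the multi-user arrangement, the sketch below exchanges only viewpoint coordinates between clients, with each client looking up the corresponding image in its local store. The JSON message layout, the field names, and the function names are assumptions made for this example; the invention does not specify a wire format.

```python
import json


def encode_view_state(user_id, viewpoint, zoom=1.0):
    """Serialize a user's current view as a small message (no image data)."""
    state = {"user": user_id, "viewpoint": list(viewpoint), "zoom": zoom}
    return json.dumps(state).encode("utf-8")


def apply_view_state(message, local_images):
    """Decode another user's view and fetch the matching local image."""
    state = json.loads(message.decode("utf-8"))
    viewpoint = tuple(state["viewpoint"])
    # The image itself never crosses the network; it is read locally.
    return viewpoint, local_images[viewpoint]
```

Because each message carries only a few coordinates, it is far smaller than any image, which is the point of storing the images locally in this setting.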
In a single-user setting, the images are initially stored entirely by the server. Prior to allowing navigation, the client obtains the image of the first viewpoint from the server, and preferably also the images from some set of "neighboring" adjacent viewpoints. This allows the user to immediately index between the images of these adjacent viewpoints when navigation begins, as opposed to requiring that the user wait to index between viewpoints. As an example, prior to allowing navigation, the client might obtain the images of a set of viewpoints which encircles the object; once these images are loaded, the user will be able to view images corresponding to a full orbit about the object without having to pause while further images are transmitted from the server.
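The preloading step may be sketched as follows, under the assumption that the "neighborhood" is the ring of viewpoints encircling the object at the elevation of the first viewpoint, and that `fetch_image` is a hypothetical callable standing in for a request to the server.

```python
def neighborhood(first, step_deg=30):
    """Viewpoints encircling the object at the first viewpoint's elevation.

    The first viewpoint itself is returned first so that it loads first.
    """
    az0, elev = first
    return [((az0 + k * step_deg) % 360, elev) for k in range(360 // step_deg)]


def preload(first, fetch_image, step_deg=30):
    """Fetch the first viewpoint and its neighborhood before navigation."""
    cache = {}
    for viewpoint in neighborhood(first, step_deg):
        cache[viewpoint] = fetch_image(viewpoint)
    return cache
```

Once `preload` returns, every viewpoint in a full orbit is already in the client cache, so indexing around the object needs no further transmission.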
The invention also preferably incorporates features which allow users to zoom images in and out (i.e., to enlarge or reduce the scale of images) without significant loss of resolution or unduly long delays in image transmission. When the object is initially imaged from the various viewpoints, these images are advantageously obtained at high resolution. As a result, the images can readily be scaled down to a lower resolution level using known scaling algorithms. In the single-user mode noted earlier, the images can then initially be transmitted to the client at some default resolution level which is preferably low, since such images are more rapidly transmitted than high-resolution images. When the user then issues a command to zoom in on the image, scaling algorithms can be used at the client to enlarge the image until some threshold level of coarseness is reached (i.e., until the image appears too "grainy"). If the user then continues to issue zoom commands, the client can obtain a higher-resolution image from the server to compensate for the decreasing level of resolution. The higher-resolution image can then be subjected to scaling algorithms if the user enters further zoom commands, and images of still higher resolution can be obtained from the server if other coarseness thresholds are passed. The reverse process can be followed if the user issues commands to zoom out of the image, but this process may be faster if previously loaded (and currently cached) images are used. Since higher-resolution (and slower-transmission) images are only transmitted to the client when needed to maintain a desired level of resolution, transmission speed is enhanced.
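One possible reading of the coarseness threshold is sketched below. It assumes that the server holds each image at several widths (for example 256, 512, and 1024 pixels), that the zoom factor is expressed relative to the default (lowest) level, and that an image is deemed too grainy once it is enlarged by more than a factor `max_upscale`. The threshold value and all names are illustrative assumptions.

```python
def choose_level(zoom, level_widths, max_upscale=2.0):
    """Pick the lowest resolution level that is not enlarged past the threshold.

    level_widths lists image widths in pixels, lowest resolution first.
    """
    base = level_widths[0]
    for level, width in enumerate(level_widths):
        # Enlargement the client must apply to this level at this zoom.
        if zoom * base / width <= max_upscale:
            return level
    return len(level_widths) - 1  # highest available level


def image_for_zoom(zoom, level_widths, cache, fetch_level, max_upscale=2.0):
    """Return (level, image), fetching from the server only on a cache miss."""
    level = choose_level(zoom, level_widths, max_upscale)
    if level not in cache:
        cache[level] = fetch_level(level)
    return level, cache[level]
```

Zooming back out selects a lower level that is normally already in `cache`, so no new transmission is needed in that direction.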
The zoom process can be further enhanced if the concept of tiling is introduced. Images are divided into sets of arrayed sections or "tiles" of identical size, as by placing a uniform grid over each image; as an example, an image might be divided into four tiles by bisecting it both horizontally and vertically. If the user zooms in on an area of the image which consists of less than all of the tiles, for example, an area in one quadrant (one tile) of the image or in one half (two adjacent tiles) of the image, the server will transmit to the client only the higher-resolution versions of these particular tiles once the coarseness threshold is reached. In other words, the server will transmit only so many tiles as are necessary to render the portion of the image being zoomed.
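The tile selection may be illustrated by the following sketch, which assumes that the zoomed area is given as a rectangle in normalized image coordinates (0 to 1 on each axis) and that tiles are addressed as (row, column) pairs on a uniform grid. The function name and coordinate convention are hypothetical.

```python
import math


def tiles_for_region(x0, y0, x1, y1, cols=2, rows=2):
    """Return the (row, col) tiles overlapped by a zoomed region.

    The region is given in normalized coordinates with x0 < x1 and y0 < y1.
    """
    if not (0.0 <= x0 < x1 <= 1.0 and 0.0 <= y0 < y1 <= 1.0):
        raise ValueError("region must lie within the unit square")
    first_col = min(cols - 1, math.floor(x0 * cols))
    last_col = min(cols - 1, math.ceil(x1 * cols) - 1)
    first_row = min(rows - 1, math.floor(y0 * rows))
    last_row = min(rows - 1, math.ceil(y1 * rows) - 1)
    return [
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

With the default 2-by-2 grid, a region inside the upper-left quadrant needs one tile, a region spanning the upper half needs two, and only a region touching all four quadrants requires the whole image.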
It is preferable to also implement certain procedures when users wish to index from a first viewpoint to another viewpoint where the image of the first viewpoint is already being displayed at higher than the default resolution (i.e., where a user first zooms in on a viewpoint to obtain an enlarged view, and then wishes to index to adjacent viewpoints at the same level of enlargement). If indexing from one zoomed-in view to adjacent zoomed-in views is done by having the server transmit higher-resolution zoomed-in images (or tiles) at every indexing step, the time required for indexing may be significant. To reduce the indexing time, it is preferred that when the user issues commands to index from a first zoomed-in viewpoint to adjacent viewpoints, the images of the adjacent viewpoints are initially displayed at the default (lower) resolution level. As previously noted, these images may already be loaded in the client's memory as "neighborhood" images (thus requiring no transmission from the server); if they are not already present at the client, they still have lesser transmission time between the server and client because they are at lower resolution. When the user ceases issuing indexing commands for long enough that the image of the current viewpoint can be transmitted to the client at high resolution, the current viewpoint will then display the high-resolution image. Thus, when a user indexes from one zoomed-in viewpoint to adjacent viewpoints, the image of the first viewpoint will be displayed at high resolution and the images of intermediate viewpoints will be displayed at the (lower) default resolution. The image of the final viewpoint will first be displayed at the lower default resolution, but will then be displayed at high resolution once the high resolution image has sufficient time to load (i.e., if it can load before another indexing command is issued by the user).
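The behavior described above can be modeled with a small timing sketch. It assumes that each indexing command arrives at a known time, that a high-resolution image takes a fixed `high_res_load_time` to arrive, and that any pending high-resolution transfer is abandoned when the next indexing command is issued. The function reports the best resolution each viewpoint reaches while displayed; all names and the fixed load time are assumptions for illustration.

```python
def displayed_resolutions(index_times, viewpoints, high_res_load_time, end_time):
    """Return (viewpoint, "low" or "high") for each indexed viewpoint.

    index_times[i] is when the user indexed to viewpoints[i]; end_time is
    when observation stops. Every viewpoint is shown at low resolution at
    once and is upgraded only if the user pauses long enough.
    """
    if len(index_times) != len(viewpoints):
        raise ValueError("index_times and viewpoints must have equal length")
    result = []
    for i, (t, viewpoint) in enumerate(zip(index_times, viewpoints)):
        # Time spent on this viewpoint before the next indexing command.
        next_t = index_times[i + 1] if i + 1 < len(index_times) else end_time
        upgraded = (next_t - t) >= high_res_load_time
        result.append((viewpoint, "high" if upgraded else "low"))
    return result
```

For a quick sweep through three viewpoints followed by a pause, the two intermediate viewpoints remain at the default resolution and only the final viewpoint is upgraded.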
Additionally, the framework of the invention is well-adapted for the incorporation of optional (but advantageous) features such as automated registration of the object's images about a common center in the virtual sphere (i.e., centering of the object within all of its images), to prevent aberrations when the object is rotated or otherwise manipulated; automatic cleaning of the background of the images to isolate the object in the images; and reconstruction of a 3D model of the object by using its multiple 2D images. These features will be discussed at greater length later in this document.