1. Technical Field
The invention is related to a virtual walkthrough system and process, and more particularly to an image-based walkthrough system and process that employs pictures, panoramas, and/or concentric mosaics captured from real scenes to present a photo-realistic environment to a viewer locally and/or over a network environment.
2. Background Art
Rapid expansion of the Internet has enabled a number of interesting applications related to virtually wandering in a remote environment, e.g. online virtual tours, shopping, and games. Traditionally, a virtual environment is synthesized as a collection of 3D geometrical entities. These geometrical entities are rendered in real-time, often with the help of special purpose 3D rendering engines, to provide an interactive walkthrough experience. The Virtual Reality Modeling Language (VRML) [1] is presently a standard file format for the delivery of 3D models over the Internet. Subsequently, efforts have been made to effectively compress and progressively transmit the VRML files over the Internet [2, 3, 4, 5].
The 3D modeling and rendering approach has several main problems. First, it is very labor-intensive to construct a synthetic scene. Second, in order to achieve a real-time performance, the complexity and rendering quality are usually limited by the rendering engine. Third, the requirement of certain accelerating hardware limits the wide application of the approach.
Recently developed image-based modeling and rendering techniques [6, 7] have made it possible to simulate photo-realistic environments. The advantages of image-based rendering methods are that the cost of rendering a scene is independent of the scene complexity and that truly compelling photo-realism can be achieved, since the images can be taken directly from the real world. One of the most popular image-based rendering software packages is Apple Computer's QuickTime™ VR [7]. QuickTime™ VR has its roots in branching movies, e.g., the movie-map [8], the Digital Video Interactive (DVI) [9], and the "Virtual Museum" [10]. QuickTime™ VR uses cylindrical panoramic images to compose a virtual environment, and therefore provides users with an immersive experience. However, it only allows panoramic views at separate positions.
It is noted that in the preceding paragraphs, as well as in the remainder of this specification, the description refers to various individual publications identified by a numeric designator contained within a pair of brackets. For example, such a reference may be identified by reciting, "reference [1]" or simply "[1]". Multiple references will be identified by a pair of brackets containing more than one designator, for example, [2, 3, 4, 5]. A listing of the publications corresponding to each designator can be found at the end of the Detailed Description section.
The present invention is directed toward an image-based walkthrough system and process that employs pictures, panoramas, and/or concentric mosaics captured from real scenes to present a photo-realistic environment to a viewer. This is generally accomplished by dividing a walkthrough space that is made available to the viewer for exploring into a horizontally sectioned grid. The viewer is allowed to "walk" through the walkthrough space and view the surrounding scene from a horizontal plane in the space. In doing so, the viewer "enters" and "exits" the various cells of the grid. Each cell of the grid is assigned at least one source of image data from which an image of a part or all of the surrounding scene as viewed from that cell can be rendered. Specifically, each cell is associated with one or more pointers to sources of image data, each of which corresponds to one of the aforementioned pictures, panoramas or concentric mosaics.
In the case where the image data is a picture, its pointer will be associated with the location in the cell corresponding to the viewpoint from which the picture was captured. Similarly, a pointer to panoramic image data will be associated with the location in the cell corresponding to the center of the panorama. And when the image data represents a concentric mosaic, the pointer is associated with the region of the cell corresponding to the portion of the wandering circle of the concentric mosaic contained within the cell. It should be noted that unlike a panorama and picture, a concentric mosaic could be associated with more than one cell of the walkthrough space since its wandering circle could encompass more than one cell.
Whenever the viewer moves into one of the grid cells, the pointers associated with that cell, as well as the pointers associated with the adjacent cells (e.g., the eight neighboring cells assuming square or rectangular-shaped cells), are considered. Specifically, the distance between the current location of the viewer, and each picture viewpoint, panorama center, and nearest wandering circle point, in the considered cells is computed. If the viewer's current location is within the wandering circle of a concentric mosaic, then no action is taken to shift the viewer's position. However, if the viewer's current position is not within such a wandering circle, the viewer is slotted onto the closest of these aforementioned points.
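The proximity check described above can be illustrated with a minimal Python sketch. It is not the implementation of the invention: the `Source` record, the `grid` dictionary keyed by (column, row) cell indices, the square `cell_size`, and the function name `snap_viewer` are all hypothetical names introduced for illustration. The sketch also assumes that a viewer with no nearby source is left where it is.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical record for one source of image data."""
    kind: str            # "picture", "panorama", or "mosaic"
    x: float             # viewpoint, panorama center, or wandering-circle center
    y: float
    radius: float = 0.0  # wandering-circle radius (mosaics only)


def snap_viewer(x, y, grid, cell_size):
    """Return the position the viewer occupies after the proximity check."""
    col, row = math.floor(x / cell_size), math.floor(y / cell_size)
    # Gather sources from the current cell and its eight neighbors; a set
    # removes duplicates for mosaics registered in more than one cell.
    sources = set()
    for dc in (-1, 0, 1):
        for dr in (-1, 0, 1):
            sources.update(grid.get((col + dc, row + dr), []))

    best, best_dist = (x, y), math.inf
    for s in sources:
        d = math.hypot(x - s.x, y - s.y)
        if s.kind == "mosaic":
            if d <= s.radius:
                return (x, y)  # inside a wandering circle: no shift
            # Nearest circle point lies on the ray from the center to the viewer.
            point = (s.x + s.radius * (x - s.x) / d,
                     s.y + s.radius * (y - s.y) / d)
            d -= s.radius
        else:
            point = (s.x, s.y)
        if d < best_dist:
            best, best_dist = point, d
    return best


grid = {
    (0, 0): [Source("panorama", 1.0, 1.0)],
    (1, 0): [Source("mosaic", 6.0, 1.0, 2.0)],
}
print(snap_viewer(3.0, 1.0, grid, 4.0))  # (4.0, 1.0): nearest wandering-circle point
print(snap_viewer(5.0, 1.0, grid, 4.0))  # (5.0, 1.0): inside the circle, unchanged
```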
In general, the foregoing image-based rendering technique for providing a continuous walkthrough experience to a viewer would require a large number of images, and so the transfer of a large amount of image data between the device employed to store the data and the processor used to render the images. If the image data is stored locally, such as on a hard drive, or on a CD or DVD, which is directly accessible by the processor, then the requirement to transfer large amounts of image data is of little concern. However, walkthrough systems are often implemented in a network environment (e.g., the Internet) where the image data is stored in or directly accessible by a network server, and the processor used to render the images is located in a network client. In such a network environment the large amount of image data that needs to be transferred between the server and client is a concern as bandwidths are typically limited.
In order to overcome the bandwidth limitations in network environments, the present invention is additionally directed toward a unique image data transfer scheme that involves streaming the image data so that the viewer can move around in the virtual environment while downloading. Similar to other network streaming techniques, this new streaming technology cuts down the waiting time for the viewer. Furthermore, the viewer can interactively move in the environment, making the waiting less perceptible. In general the new transfer scheme allows the client to selectively retrieve image segments associated with the viewer's current viewpoint and viewing direction, rather than transmitting the image data in the typical frame by frame manner. Thus, the server is used to store the huge amount of image data, while the client is designed to interact with the viewer and retrieve the necessary data from the server. This selective retrieval is achieved by implementing a new client-server communication protocol. Additionally, cache strategies are designed to ensure a smooth viewing experience for the viewer by capturing the correlation between subsequent views of a scene.
In essence, the new transmission scheme characterizes pictures and panoramas similarly to a concentric mosaic, in that each is represented by a sequence of image columns. As with concentric mosaics, each image column can be of any width, but typically will be the width of one pixel, making the image column a column of pixels. To facilitate the transfer of the image data, whether it be in the form of a picture, panorama or concentric mosaic, a specialized server-side file structure is employed. This structure includes a file header, or a separate tag file, which provides descriptive information about the associated image data. It is also noted that the image data may be compressed to facilitate its efficient transfer over the network. The type of compression used should allow for random access and quick selective decoding of the image data. The aforementioned headers or tag files would include information needed to assist the client in decompressing the incoming image data.
As indicated previously, the client-server scheme according to the present invention involves an interactive client-server approach where the client selectively retrieves image segments associated with the viewer's current viewpoint and viewing direction. In other words, rather than waiting for the particular frames of a time-sequence of frames that contain the image data needed to render the desired view of the scene to arrive, the actual image data needed to render the view is requested and sent. This transmission process, dubbed spatial video streaming, is accomplished as follows. First, the client tracks the movements of the viewer within the walkthrough space. For each new position of the viewer, the coordinates of the position are provided over the network to the server. The server determines which picture viewpoint, panorama center, or wandering circle point is nearest to the coordinates provided by the client. The server then sends out the description information associated with the closest source of image data. This information includes an indicator of the type of image data file, i.e., picture, panorama, or concentric mosaic, and the header (or tag file) information associated with the image data file. It also includes the walkthrough space coordinates of the closest source. If these coordinates are different from the viewer's current location, the client "transports" the viewer to the source coordinates. As the grid cells are relatively small, this change in location will probably be imperceptible to the viewer. Next, the client allocates buffers to store the image data, and then requests the image columns that are needed to render the image corresponding to the viewing position and direction selected by the viewer. The image columns are preferably requested in order of their importance. To this end, the image columns that are needed to render the viewer's image are requested from the center of this image (which corresponds to the viewing direction) outward, in an alternating pattern.
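The center-outward alternating request order can be sketched in a few lines of Python. The function name `alternating_order` and the choice to visit the right-hand neighbor before the left-hand one are assumptions made for illustration; the description above only specifies that requests alternate outward from the center column. Column indices are treated as a simple inclusive range, with no wrap-around.

```python
def alternating_order(center, first, last):
    """List the column indices in [first, last], starting at `center` and
    alternating right and left while moving outward.

    Assumes first <= center <= last.
    """
    order = [center]
    step = 1
    while len(order) < last - first + 1:
        for column in (center + step, center - step):
            if first <= column <= last:
                order.append(column)
        step += 1
    return order


print(alternating_order(5, 3, 8))  # [5, 6, 4, 7, 3, 8]
```

Because the columns nearest the viewing direction arrive first, a partially received image is already usable at its center while the edges are still in transit.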
Whenever the server receives a request for an image column, it sends the requested column in compressed or uncompressed form to the requesting client. If the image column data is compressed, as it typically would be for transfer over a network, the client would decompress the data upon its receipt and use it to construct the viewer's image. The client then displays the image to the viewer.
It was mentioned above that the client allocates buffers to store the incoming image data. This can be accomplished using a full cache strategy, especially when the amount of image data needed to render the entire scene is small enough to be readily stored by the client. In such a case, the data associated with the whole scene is streamed to the client and stored. Of course, any requested data would be provided first and the rest sent as time permits. In this way, eventually the whole scene will be stored by the client. The client would check the stored data before requesting an image column to ensure it has not already been sent. If the required column is already in storage, it is simply retrieved. If the required column is not found in the client's memory, then the request is made to the server as described previously.
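The check-before-request behavior of the full cache strategy might look like the following Python sketch. The class name `FullCache` and the `fetch` callable, which stands in for a request to the server, are hypothetical; the background streaming of not-yet-requested data is omitted.

```python
class FullCache:
    """Keeps every image column ever received (hypothetical sketch)."""

    def __init__(self, fetch):
        self._fetch = fetch    # callable standing in for a server request
        self._columns = {}     # (source_id, column_index) -> column data

    def get(self, source_id, column_index):
        key = (source_id, column_index)
        if key not in self._columns:
            # Not yet stored locally: request it from the server once.
            self._columns[key] = self._fetch(source_id, column_index)
        return self._columns[key]


requests = []

def fake_server(source_id, column_index):
    requests.append((source_id, column_index))
    return f"{source_id}:{column_index}"

cache = FullCache(fake_server)
cache.get("pano1", 7)
cache.get("pano1", 7)   # served from local storage, no second request
print(requests)         # [('pano1', 7)]
```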
The second approach uses a partial cache scheme. This scheme is preferred when either the memory space of the client is very small, such as in mobile devices, or when the amount of the image data is so large (e.g., concentric mosaics) that transmitting and storing all of it is impractical. In the partial cache scheme, two initially equal-sized buffers are allocated in the client's memory. The initial size of these buffers is made large enough to store the data associated with the number of image columns needed to render a full 360 degree view of the scene around the viewer's current position in the walkthrough space and at the viewer's current zoom level (i.e., current lateral field of view). The reason for this is that the client continues to request image columns in the aforementioned alternating fashion even when all the columns necessary to render the current image of the scene to the viewer have been requested. Thus, in the case of panorama or concentric mosaic image data, enough image columns to render a full panoramic view of the scene surrounding the viewer's current position could eventually be transferred. This data would be entirely stored in one of the two buffers. The continuing request procedure has the advantage of providing, ahead of time, the image data that the client would need to render images of the entire scene surrounding the viewer's current location. Thus, a viewer could rotate their viewing direction about the same viewpoint in the walkthrough space and be presented images of the surrounding scene on nearly a real-time basis. The only time this would not be true is when the viewer rotates their viewing direction too quickly into a region of the scene associated with image columns that have not yet been received by the client. To minimize the occurrence of such an event, the requesting process changes any time the viewer rotates their viewing direction. Specifically, rather than employing the aforementioned alternating request procedure, the client immediately begins requesting successive image columns in the direction the viewer is rotating, and stops requesting columns in the direction opposite the viewer's rotation. Should the viewer change his or her direction of rotation, this same-direction requesting procedure would be repeated, except in the opposite rotation direction. The same-direction requesting procedure continues until all the available image columns are exhausted (such as when the image data is in the form of a picture), or when all the image columns needed to provide a 360 degree view of the scene have been requested.
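The same-direction requesting procedure can be sketched as follows for the 360 degree case. The function name `rotation_order` and its parameters are hypothetical: `edge` is the last column already held on the side toward which the viewer is rotating, `direction` is +1 or -1, `total` is the number of columns in a full 360 degree view, and `have` is the set of columns already received. Modular wrap-around is an assumption that applies to panoramas and concentric mosaics; for a picture the loop would simply stop at the last available column.

```python
def rotation_order(edge, direction, total, have):
    """List the columns still to be requested, walking from `edge` in
    `direction` (+1 or -1) around a source of `total` columns that spans
    360 degrees, and skipping columns already in `have`."""
    order = []
    for k in range(1, total):
        column = (edge + direction * k) % total
        if column not in have:
            order.append(column)
    return order


# Columns 4..6 are already held; the viewer starts rotating toward higher indices.
print(rotation_order(6, +1, 10, {4, 5, 6}))  # [7, 8, 9, 0, 1, 2, 3]
# The viewer reverses: requests now proceed the other way from column 4.
print(rotation_order(4, -1, 10, {4, 5, 6}))  # [3, 2, 1, 0, 9, 8, 7]
```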
The viewer is not limited to just rotating his or her viewing direction around the same viewpoint. Rather, the viewer can also change positions (i.e., translate within the walkthrough space), or zoom in to or out from a current view of the scene. In the case of the viewer changing viewpoints, the client-server interaction described above is simply repeated for the new location, with the exception that the second buffer comes into play. Specifically, rather than requesting the server to provide every image column needed to produce the new view of the scene to the viewer, it is recognized that some of the columns may have already been sent in connection with the previous viewpoint. Accordingly, before requesting an image column from the server, the client checks the first buffer to determine if the column has already been received. If so, it is copied to the second buffer. If not, it is requested from the server. As soon as the image columns needed to render the image of the scene for the new viewpoint have been received, the screen is refreshed and the view of the new position is displayed to the viewer. The coordination between the buffers is repeated every time the viewer moves to a new viewpoint in the walkthrough space, except that the buffers exchange roles each time. In this way, devices with limited storage capacity can still provide a realistic walkthrough experience.
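A minimal Python sketch of the two-buffer coordination follows. The class name `TwoBufferCache`, the `fetch` callable standing in for a server request, and the use of opaque column keys are assumptions for illustration; buffer sizing and display refresh are left out.

```python
class TwoBufferCache:
    """Two buffers that exchange roles on each viewpoint change (sketch)."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable standing in for a server request
        self._active = {}     # columns for the current viewpoint
        self._spare = {}      # filled for the next viewpoint, then swapped in

    def move_to(self, needed_columns):
        """Fill the spare buffer for a new viewpoint, then swap the buffers."""
        self._spare.clear()
        for key in needed_columns:
            if key in self._active:
                # Already received for the previous viewpoint: copy it over.
                self._spare[key] = self._active[key]
            else:
                self._spare[key] = self._fetch(key)
        self._active, self._spare = self._spare, self._active
        return self._active


requested = []

def fake_server(key):
    requested.append(key)
    return f"column-{key}"

cache = TwoBufferCache(fake_server)
cache.move_to([1, 2, 3])
cache.move_to([2, 3, 4])   # columns 2 and 3 are copied, only 4 is requested
print(requested)           # [1, 2, 3, 4]
```

Only the columns for the current viewpoint survive a move, which is what keeps the memory footprint bounded on small devices.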
In the case where the viewer zooms into or out of the currently depicted portion of the scene, the two-buffer scheme also comes into play. Specifically, the second buffer is used to gather the image columns needed to present the zoomed image to the viewer. However, it is noted that when the viewer's lateral field of view decreases, as when the viewer zooms in, the result is the need for more image columns to provide the same-dimensioned image to the viewer. Thus, in the case where the image data is associated with a panorama or concentric mosaic, the size of the "receiving" buffer must be increased to accommodate the additional image data that will be needed to store the entire panoramic view of the scene from the selected viewing position at the new zoom level. Once the buffer size has been increased, the process of copying and requesting the needed image columns is the same as described previously.
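The growth of the receiving buffer with zoom level can be estimated with a rough sizing rule. The sketch below assumes, purely for illustration, that image columns are spaced uniformly in angle, so that a displayed image `view_width` columns wide covering a lateral field of view of `fov_degrees` implies about `view_width * 360 / fov_degrees` columns for the full panoramic view. The function name and both parameters are hypothetical, and an actual projection may depart from uniform angular spacing.

```python
import math


def columns_for_full_view(view_width, fov_degrees):
    """Estimate the columns needed for a 360 degree view at a zoom level,
    assuming uniform angular spacing of columns (an illustrative assumption)."""
    return math.ceil(view_width * 360.0 / fov_degrees)


print(columns_for_full_view(320, 60))  # 1920
print(columns_for_full_view(320, 30))  # 3840: halving the field of view doubles the buffer
```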
Another aspect of the image column requesting process involves building up additional or "secondary" buffers of image data in an attempt to anticipate the next change in the viewer's viewpoint within the walkthrough space. Thus, assuming the client has sufficient available storage space, additional buffers could be created and filled whenever the client and server are not busy providing image data to fill the active one of the two aforementioned "primary" buffers. Each of the new buffers would contain the image columns needed to render images of the surrounding scene from a different viewpoint. Preferably, buffers associated with the image data corresponding to viewpoints immediately adjacent the current viewpoint would be filled first, with subsequent buffers being filled with image data associated with viewpoints radiating progressively out from the current viewpoint. These new buffers would be filled in the same way as the first buffer described above, with the exception that the viewing direction would be assumed to be the central viewing direction if picture data is involved, the starting direction if panorama data is involved, and the starting direction corresponding to the central viewing direction of the first image captured if a concentric mosaic is involved. As with the two-buffer scheme, the viewer may change viewpoints during the process of building up the secondary buffers. In that case, the current requesting procedure is terminated, and the requesting process to fill the active primary buffer and any additional buffers is begun again for the new viewpoint. This process would include searching the existing buffers for previously stored image columns before requesting them from the server.
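The order in which secondary buffers are filled, nearest viewpoints first and then radiating outward, can be sketched as a distance sort. The function name `prefetch_order` and the representation of viewpoints as (x, y) tuples are assumptions for illustration; Euclidean distance is one plausible reading of "radiating progressively out", not a requirement stated in the description.

```python
import math


def prefetch_order(current, viewpoints):
    """Order the other viewpoints by increasing distance from `current`,
    so that adjacent viewpoints have their secondary buffers filled first."""
    cx, cy = current
    others = [v for v in viewpoints if v != current]
    return sorted(others, key=lambda v: math.hypot(v[0] - cx, v[1] - cy))


print(prefetch_order((0, 0), [(0, 0), (3, 0), (1, 0), (0, 2)]))
# [(1, 0), (0, 2), (3, 0)]
```

If the viewer moves before the list is exhausted, the order would simply be recomputed from the new viewpoint.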
Finally, it is noted that while the foregoing image data transfer scheme was described in connection with its use with the above-described walkthrough system, it could also be implemented with other walkthrough systems. For example, the new transmission scheme would be particularly useful in conjunction with a walkthrough system employing one or several interconnected concentric mosaics.
In addition to the just described benefits, other advantages of the present invention will become apparent from the detailed description which follows hereinafter when taken in conjunction with the drawing figures which accompany