1. Field of the Invention
The present invention generally relates to graphics processing and display systems and, more particularly, to the creation and presentation of three-dimensional scenes of synthetic content stored on distributed network sources and accessed by computer network transmission. The invention further relates to methods of adaptively selecting an optimal delivery strategy for each of the clients based on available resources.
2. Background Description
Using three-dimensional graphics over networks has become an increasingly effective way to share information, visualize data, design components, and advertise products. As the number of computers in the consumer and commercial sectors with network access increases, the number of users accessing some form of three-dimensional graphics is expected to increase accordingly. For example, it has been estimated by W. Meloni in "The Web Looks Toward 3D", Computer Graphics World, 21(12), December 1998, pp. 20 et seq., that by the end of year 2001, 152.1 million personal computers (PCs) worldwide will have an Internet connection. Out of this number, approximately 52.3 million users will frequently access three-dimensional images while on the World Wide Web (WWW or the Web). This number compares to only 10 million users accessing three-dimensional Web images in 1997 out of a total of 79 million Internet users. However, the use of three-dimensional graphics over networks is not limited to consumer applications. In 1997, roughly 59% of all U.S. companies had intranet connections. By 2001 this figure is expected to jump to 80%. This transition includes three-dimensional collaboration tools for design and visualization. For instance, within the computer-aided design (CAD) community there is significant interest in applications which permit sharing on a global basis of three-dimensional models among designers, engineers, suppliers and other interested parties across a network. The capability to perform "visual collaborations" offers the promise to reduce costs and to shorten development times. Other corporate interests target the use of three-dimensional solutions to visualize data such as financial fluctuations, client accounts, and resource allocations.
As generally shown in FIG. 1, three-dimensional models and their representations are typically stored on centralized servers 100 and are accessed by clients 101 over communication networks 102. Several data-transfer technologies have been developed over the past few years to visualize three-dimensional models over networks.
At one end of the spectrum are the so-called client-side rendering methods in which the model is downloaded to the client which is entirely responsible for its rendering. FIG. 2 shows a diagram of a typical client-side rendering architecture. Upon input from a user or another application 201, the client 202 requests, via network 203 as client feedback 204, a model from the server 205. The geometry server 210 within server 205 contains the 3d geometry 211 and the scene parameters 212. In response to client feedback 204, the server 205 retrieves the model from storage 206 and delivers the 3d geometry 213 to the client 202 over the network 203. Once the model has been received by the client, the client 3d browser 208 renders it in client rendering engine 207 and displays it on the display 209. Additional client feedback may follow as the user interacts with the model displayed and more information about the model is downloaded. Such methods typically require a considerable amount of time to download and display on the client an initial meaningful representation of a complex three-dimensional model. These methods also require the existence of three-dimensional graphics capabilities on the client machines.
Alternatives to en masse downloading of a model without prior processing include storage and transmission of compressed models, as reported by G. Taubin and J. Rossignac in "Geometry Compression Through Topological Surgery", ACM Transactions on Graphics, April 1998, pp. 84-115, streaming and progressive delivery of the component geometry, as reported by G. Taubin et al. in "Progressive Forest Split Compression", ACM Proc. Siggraph '98, July 1998, pp. 123-132, H. Hoppe in "Progressive Meshes", ACM Proc. Siggraph '98, August 1996, pp. 99-108, and M. Garland and P. Heckbert in "Surface Simplification Using Quadric Error Bounds", ACM Proc. Siggraph '97, August 1997, pp. 209-216, and ordering based on visibility, as reported by D. Aliaga in "Visualization of Complex Models Using Dynamic Texture-Based Simplification", Proc. IEEE Visualization '96, October 1996, pp. 101-106, all of which are targeted towards minimizing the delay before the client is able to generate an initial display. However, producing such representations may involve significant server computing and storage resources, the downloading time remains large for complex models, and additional time may be necessary on the client to process the data received (e.g., decompression). For example, Adaptive Media's Envision 3D (see www.envision.com) combines computer graphics visibility techniques (e.g., occlusion culling as described by H. Zang et al., "Visibility Culling Using Hierarchical Occlusion Maps", ACM Proc. Siggraph '97, August 1997, pp. 77-88) with streaming to guide the downloading process by sending to the clients the visible geometry first and displaying it as it is received, rather than waiting for the entire model to be sent.
Nonetheless, determining which geometry is visible from a given viewpoint is not a trivial computation and maintaining acceptable performance remains a challenging proposition even when only visible geometry is transmitted.
At the opposite end of the spectrum are server-side rendering methods, as generally shown in FIG. 3, which place the burden of rendering a model entirely on the server; the images generated are subsequently transmitted to clients. As in the case of client-side methods, the client 301 usually initiates a request for a model. However, instead of downloading the three-dimensional model to the client 301, the model and scene description 302 stored in storage 303 is rendered on the server 304 in rendering engine 305 to produce two-dimensional static images 306, and one or more two-dimensional images 307 resulting from this rendering are transmitted over the network 308 to the client 301. Subsequently, the images 307 are displayed on display 309 of the client 301. The cycle is then repeated based on user feedback 310.
Such techniques have the advantages that they do not require any three-dimensional graphics capabilities on the part of the clients and the bandwidth requirements are significantly reduced. The tradeoffs in this case are the loss of real-time interaction with the model (i.e., images cannot be delivered to clients at interactive frame rates) and the increase in server load and hence, server response times, as the number of clients concurrently accessing the server increases. An example of a server-side-based rendering system is CATWeb (www.catia.ibm.com) which is a web browser-based application designed to provide dynamic CAD data access to users with intranet connections and graphics capabilities. Another example in this category is panoramic rendering described by W. Luken et al. in "PanoramIX: Photorealistic Multimedia 3D Scenery", IBM Research Report #RC21145, IBM T. J. Watson Research Center, 1998. A panorama is a 360 degree image of a scene around a particular viewpoint. Several panoramas can be created for different viewpoints in the scene and connected to support limited viewpoint selection.
Hybrid rendering methods described by D. Aliaga and A. Lastra in "Architectural Walkthroughs Using Portal Textures", Proc. IEEE Visualization '97, October 1997, pp. 355-362, M. Levoy in "Polygon-Assisted JPEG and MPEG Compression of Synthetic Images", ACM Proc. Siggraph '95, August 1995, pp. 21-28, and Y. Mann and D. Cohen-Or in "Selective Pixel Transmission for Navigating in Remote Virtual Environments", Proc. Eurographics '97, 16 (3), September 1997, pp. 201-206, provide a compromise approach by rendering part of a complex model on the server (usually components that are far away from the viewer or of secondary interest) and part on the client. Thus, a combination of images (possibly augmented with depth information) and geometry is delivered to the client. For example, the background of a three-dimensional scene may be rendered on the server as a panorama with depth information at each pixel. Foreground objects are delivered as geometry to the client and correctly embedded into the panorama using the depth information. The main advantage of such an approach is that the time to transmit and display on the client the server-rendered parts of the model is independent of the scene complexity, while the frame rate and the interaction with the client-rendered parts are improved. Additional processing of the image and geometry data may be done to optimize their transfer over the network. For instance, in M. Levoy, supra, image compression is applied to the two-dimensional data and model simplification and compression are performed on the three-dimensional data before they are sent to the client.
The disadvantages of hybrid rendering methods include the following: determining whether a part of a given model should be rendered on the server or on the client is usually not a trivial task; extra image information is often required to fill in occlusion errors that may occur as a result of a viewpoint change on the client; and user interaction is limited.
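The depth-based embedding of client-rendered foreground objects into a server-rendered background, as described for hybrid methods above, can be illustrated with a per-pixel depth test. The following is a minimal sketch only, not the method of any cited work. It assumes that images are row-major lists of RGB tuples, that smaller depth values are closer to the viewer, and that a foreground depth of None means the foreground object does not cover that pixel. The function name composite_with_depth and these conventions are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B); an assumed pixel format


def composite_with_depth(
    bg_color: List[List[Pixel]],
    bg_depth: List[List[float]],
    fg_color: List[List[Pixel]],
    fg_depth: List[List[Optional[float]]],
) -> List[List[Pixel]]:
    """Embed a locally rendered foreground into a background image.

    Assumption: smaller depth means closer to the viewer, and a
    foreground depth of None means "no foreground at this pixel".
    """
    out: List[List[Pixel]] = []
    for y, row in enumerate(bg_color):
        out_row: List[Pixel] = []
        for x, bg_px in enumerate(row):
            fd = fg_depth[y][x]
            # Keep the foreground pixel only where it is in front
            # of the background surface stored in the depth image.
            if fd is not None and fd < bg_depth[y][x]:
                out_row.append(fg_color[y][x])
            else:
                out_row.append(bg_px)
        out.append(out_row)
    return out
```

In this sketch a foreground pixel that lies behind the background surface is discarded, which is how an object could appear partially hidden by server-rendered scenery.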
Although the subject has been addressed by B. O. Schneider and I. Martin in "An Adaptive Framework for 3D Graphics in Networked and Mobile Environments", Proc. Workshop on Interactive Applications of Mobile Computing (IMC'98), November 1998, in general, commercial methods for delivering three-dimensional data over networks are not adaptive. They do not take into account dynamic changes in system environment conditions such as server load, client capabilities, available network bandwidth, and user constraints. In addition, the lack of standards and the increasing complexity of the models have contributed to limiting the success of existing technologies.
It is therefore an object of the present invention to provide a system and method which provides a continuous, seamless spectrum of rendering options between server-only rendering and client-only rendering.
Another object of the invention is to provide a user-controlled tradeoff between the quality (fidelity) of the rendered image and the frame rates at which the rendered image is displayed on the client.
It is yet another object of the invention to provide a system and method which provides rendering options that adaptively track a dynamic network environment.
Yet another object of this invention is to provide a system and method that uses dead reckoning techniques to avoid latency problems in a network.
According to the invention, there is provided a novel approach to the problem of seamlessly combining client-only rendering techniques with server-only rendering techniques. The approach uses a composite stream containing three distinct streams. One stream is available to send geometry from the server to the client, for local rendering if appropriate. Another stream contains video with transparent pixels that allow the client-rendered object to appear in the context of the server-rendered object. The third stream contains camera information that allows the client to render the client object in the same view as the server-rendered object.
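The composite stream described above can be pictured as one unit carrying three parts: optional geometry, a video frame with transparent pixels, and camera information. The following is a minimal sketch under stated assumptions and is not the format used by the invention. The class names CameraInfo and CompositeFrame, their fields, the use of None to mark a transparent pixel, and the function fill_transparent are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B); an assumed pixel format


@dataclass
class CameraInfo:
    """Assumed camera parameters the client needs to match the server view."""
    position: Tuple[float, float, float]
    look_at: Tuple[float, float, float]
    fov_degrees: float


@dataclass
class CompositeFrame:
    """One unit of the composite stream (hypothetical layout)."""
    video: List[List[Optional[Pixel]]]  # None marks a transparent pixel
    camera: CameraInfo                  # view used for the server rendering
    geometry_chunk: bytes = b""         # may be empty if no geometry is sent


def fill_transparent(
    video: List[List[Optional[Pixel]]],
    local_render: List[List[Pixel]],
) -> List[List[Pixel]]:
    """Fill transparent video pixels with the client's local rendering."""
    return [
        [local if px is None else px for px, local in zip(v_row, l_row)]
        for v_row, l_row in zip(video, local_render)
    ]
```

In this sketch the client would render its local object using the camera carried in the same frame, then call fill_transparent so that the locally rendered pixels appear in the context of the server-rendered video.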
The invention can satisfy a number of viewing applications. If the client does not have adequate rendering performance, the entire scene can be rendered on the server and only video needs to be sent to the client. On the other hand, if the client does have good rendering performance the server can initially render the entire scene, but begin streaming geometry for parts of the scene as the interaction progresses. As more geometry arrives at the client, the server will be doing less rendering for the scene since more of the content is available for local rendering on the client. The transfer of geometry can be terminated by the user on the client side, by some specified cut-off limit, or the entire scene can be transferred to the client. Alternatively, the local object may already exist at both the client and server, so no geometry need be sent; the server will render the scene and mark transparent the pixels due to the local object, and the client will render locally and fill them in without any geometry having been streamed from server to client. Regardless of the specific amount of geometry being transferred, the server can always dynamically alter the fidelity of the video being sent in order to trade off quality with bandwidth. Whenever the view of the scene is not changing, the video may become progressively refined.
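The progression described above, from server-only rendering toward client rendering as geometry arrives, together with the progressive refinement of video when the view is static, can be sketched as two small decision functions. This is an illustrative sketch only. The names Mode, next_mode, and next_video_quality, the byte-count bookkeeping, and the numeric quality values are assumptions introduced here and do not come from the specification.

```python
from enum import Enum
from typing import Optional, Tuple


class Mode(Enum):
    SERVER_ONLY = "server_only"  # server renders the entire scene
    HYBRID = "hybrid"            # some geometry is rendered on the client
    CLIENT_ONLY = "client_only"  # entire scene is available on the client


def next_mode(
    client_can_render: bool,
    bytes_sent: int,
    total_bytes: int,
    cutoff_bytes: Optional[int] = None,
    user_stopped: bool = False,
) -> Tuple[Mode, bool]:
    """Return (rendering mode, whether to keep streaming geometry)."""
    if not client_can_render:
        # Inadequate client performance: only video is sent.
        return Mode.SERVER_ONLY, False
    if bytes_sent >= total_bytes:
        # The whole scene has been transferred to the client.
        return Mode.CLIENT_ONLY, False
    limit_hit = cutoff_bytes is not None and bytes_sent >= cutoff_bytes
    keep_streaming = not (user_stopped or limit_hit)
    mode = Mode.HYBRID if bytes_sent > 0 else Mode.SERVER_ONLY
    return mode, keep_streaming


def next_video_quality(
    current: int,
    view_changing: bool,
    base: int = 40,
    step: int = 10,
    maximum: int = 100,
) -> int:
    """Drop to a base quality while the view changes; refine when static.

    The quality scale and default values are placeholders.
    """
    if view_changing:
        return base
    return min(maximum, current + step)
```

For example, under these assumptions a capable client that has received no geometry yet starts in SERVER_ONLY mode with streaming enabled, moves to HYBRID once some geometry has arrived, and stops receiving geometry when the user terminates the transfer or the cut-off limit is reached.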