The present invention relates to a graphic computing apparatus for drawing high-quality three-dimensional computer graphics (CG) in real time.
A system such as a game machine using real-time three-dimensional (3D) CG is required to execute a graphic process that receives and draws data called a primitive, which represents a unit shape of the surface of an object present in a 3D space, i.e., a 3D object (to be simply referred to as an object hereinafter). In order to execute this process at high speed, a graphic computing apparatus implemented as hardware is used.
In a conventional graphic computing apparatus, a planar figure called a polygon is used as a primitive of an object, and undergoes drawing to express a 3D space. More specifically, the conventional graphic computing apparatus roughly comprises three elements, i.e., a “geometry processor”, “rasterization processor”, and “frame memory”, and processes are done in a pipeline manner.
The geometry processor executes coordinate conversion and a lighting process of a polygon as a primitive in units of vertexes. The geometry processor also computes texture coordinates corresponding to vertexes as needed, but does not read any texture image itself from the frame memory. The geometry processor obtains screen coordinate values, colors, and texture coordinate values of the vertexes of a polygon as processing results and passes them to the rasterization processor.
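The per-vertex work of the geometry processor described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the 4x4 matrix convention, the Lambert diffuse lighting model, the viewport mapping, and all names (`process_vertex`, `mvp`, `light_dir`) are hypothetical and are not taken from any particular apparatus.

```python
import math

def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def process_vertex(position, normal, base_color, mvp, light_dir, width, height):
    """Transform one vertex to screen coordinates and compute its lit color."""
    x, y, z, w = mat_vec(mvp, [position[0], position[1], position[2], 1.0])
    # Perspective divide to normalized device coordinates in [-1, 1].
    ndc_x, ndc_y, ndc_z = x / w, y / w, z / w
    # Viewport mapping to pixel coordinates (assumed: y grows downward).
    screen = ((ndc_x + 1.0) * 0.5 * width, (1.0 - ndc_y) * 0.5 * height, ndc_z)
    # Assumed Lambert diffuse term: max(N . L, 0); light_dir points toward the light.
    n, l = normalize(normal), normalize(light_dir)
    diffuse = max(0.0, sum(a * b for a, b in zip(n, l)))
    color = tuple(c * diffuse for c in base_color)
    return screen, color
```

Texture coordinates, when used, would simply be carried along with each vertex; as noted above, no texture image is read at this stage.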
The rasterization processor executes a process for drawing a polygon on the frame memory in units of pixels. The color of each pixel is determined by linear interpolation of the colors assigned to the individual vertexes, using a method called smooth shading. Upon drawing, the rasterization processor hides (does not draw) any portion of an object that is occluded by another object, by means of a hidden-surface removal algorithm called Z-buffering, which uses a Z buffer reserved in the frame memory. Furthermore, the rasterization processor uses a technique called texture mapping for mapping a two-dimensional (2D) picture using a texture image stored in the frame memory upon executing a drawing process in units of pixels.
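Smooth shading and Z-buffering as described above can be illustrated with a small software rasterizer. This is a sketch under assumptions, not the circuit of any actual rasterization processor: it assumes triangular polygons, barycentric weights for the linear interpolation, sampling at pixel centers, and a convention in which a smaller z value is nearer to the viewer. The names `edge` and `draw_triangle` are hypothetical.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p) in screen space."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def draw_triangle(verts, colors, color_buf, z_buf):
    """Rasterize one triangle with per-pixel color interpolation and a depth test.

    verts: three (x, y, z) screen-space vertices; colors: three (r, g, b) tuples.
    color_buf and z_buf are lists of rows indexed [y][x].
    """
    v0, v1, v2 = verts
    area = edge(v0, v1, v2)
    if area == 0:
        return  # degenerate triangle covers no pixels
    height, width = len(z_buf), len(z_buf[0])
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            # Barycentric weights; dividing by area handles either winding order.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue  # pixel center lies outside the triangle
            z = w0 * v0[2] + w1 * v1[2] + w2 * v2[2]
            if z >= z_buf[y][x]:
                continue  # a surface drawn earlier is nearer: keep it
            z_buf[y][x] = z
            # Smooth shading: linear interpolation of the vertex colors.
            color_buf[y][x] = tuple(
                w0 * c0 + w1 * c1 + w2 * c2 for c0, c1, c2 in zip(*colors)
            )
```

Initializing every entry of `z_buf` to `float("inf")` before the first polygon is drawn lets the first surface at each pixel always pass the depth test.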
In the texture mapping process, the positions of the corresponding texture image elements in a texture image region on the frame memory are obtained in units of pixels on the basis of the texture coordinate values from the geometry processor. Color data at those positions are read from the texture image region and undergo an arithmetic process with the per-pixel colors determined by the linear interpolation mentioned above, thus determining the colors to be written in the frame memory. Conventionally, the per-pixel arithmetic sections for the texture mapping process are built into the rasterization processor as a hardware circuit, and can execute only very simple arithmetic processes.
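The per-pixel texture lookup described above can be sketched as follows. The choice of nearest-texel sampling, texture coordinates normalized to the range 0 to 1, and multiplication (modulation) as the arithmetic process are assumptions made only for illustration; the function names are hypothetical.

```python
def sample_nearest(texture, u, v):
    """Read the texel nearest to texture coordinates (u, v) in [0, 1].

    texture is a list of rows of (r, g, b) tuples, indexed [row][column].
    """
    tex_h, tex_w = len(texture), len(texture[0])
    # Clamp so that u == 1.0 or v == 1.0 still maps to the last texel.
    tx = min(max(int(u * tex_w), 0), tex_w - 1)
    ty = min(max(int(v * tex_h), 0), tex_h - 1)
    return texture[ty][tx]

def shade_pixel(texture, uv, interpolated_color):
    """Combine the texel with the interpolated vertex color by modulation."""
    texel = sample_nearest(texture, uv[0], uv[1])
    return tuple(t * c for t, c in zip(texel, interpolated_color))
```

A fixed combination of this kind is an example of the very simple per-pixel arithmetic that a hard-wired circuit can provide.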
In actual system arrangements, for example, the process of the geometry processor is implemented by a program of a CPU, the geometry processor is included in the CPU, the geometry processor and rasterization processor are formed by a single LSI, or the rasterization processor and frame memory are formed by a single LSI. In any of these arrangements, however, the process from the geometry processor to the rasterization processor is basically done by a one-way pipeline process.
On the other hand, as a more advanced 3D CG technique, a parallel type graphics architecture based on a pixel computing scheme is known. Examples of this architecture include Pixel Flow/Pixel Plane disclosed in Molnar, S. et al., “Pixel Flow: High-Speed Rendering Using Image Composition”, Computer Graphics (Proc. of SIGGRAPH '92), Vol. 26, No. 2, pp. 231–240 (reference 1), U.S. Pat. No. 4,590,465 (reference 2), U.S. Pat. No. 4,783,649 (reference 3), and the like.
This Pixel Flow/Pixel Plane is characterized in that SIMD processors assigned in units of pixels execute exchangeable programs upon rasterizing a polygon to determine colors by complicated procedural arithmetic operations in units of pixels and to write them in the frame memory, thus achieving elaborate picture expression. However, since processes must be done in units of pixels, arithmetic operations using many SIMD processors are required even to draw a large polygon which has only simple surface properties, and a large number of SIMD processors are required to implement such a process at high speed, resulting in a bulky system. Also, this technique can hardly implement displacement mapping, in which the surface position of an object is displaced.
Real-time 3D CG such as a game or the like is required to display pictures with the highest possible quality within a limited time called a frame time represented by 1/60 sec so as to display animation that moves smoothly.
The balance between high speed and high quality of image generation is the most important point for application software creators of, e.g., games and the like, and a graphic computing apparatus for real-time 3D CG is required to have an arrangement with which the application creators can freely control the speed and image quality.
However, in the conventional graphic computing apparatus, since a flexible vertex process as a procedural process in the geometry processor and a texture process in the rasterization processor using the frame memory are independently shared and expressions that can be achieved by the respective portions are fixed, the control method of the speed and image quality is limited.
As a technique required to provide higher-quality pictures than conventional ones in real-time 3D CG, techniques currently used to generate very high-quality pictures in the fields of “non-real-time 3D CG” such as movies and the like are known. These techniques include:
(1) a scheme for displaying objects such as persons, living bodies, and the like with high reality by modeling based on curved surface definition;
(2) displacement mapping for displacing the surface shape of each object;
(3) a scheme for drawing by computing shadows to make the layout of objects in a space easy to understand;
(4) image-based rendering for generating 3D CG by arithmetic operations from actually sensed images; and
(5) a non-photo-realistic rendering scheme for generating a sketch-style picture, illustration-style picture, and the like by procedural shading.
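As an illustration of scheme (2) in the list above, displacement mapping moves each surface point along its normal by a height value. The sketch below assumes unit-length normals, and the procedural height function `ripple_height` is purely hypothetical; in practice the height could equally be read from an image.

```python
import math

def displace(position, normal, height):
    """Move a surface point along its unit normal by a scalar height."""
    return tuple(p + height * n for p, n in zip(position, normal))

def ripple_height(u, v, amplitude=0.1, frequency=4.0):
    """Hypothetical procedural height field over surface parameters (u, v)."""
    phase_u = 2.0 * math.pi * frequency * u
    phase_v = 2.0 * math.pi * frequency * v
    return amplitude * math.sin(phase_u) * math.cos(phase_v)

def displace_vertices(vertices):
    """Displace a list of (position, normal, (u, v)) vertices."""
    return [displace(pos, nrm, ripple_height(*uv)) for pos, nrm, uv in vertices]
```

Because the surface position itself changes, the displacement has to be applied before screen positions are fixed, which is why it does not fit a process that only computes colors in units of pixels.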
In the field of “non-real-time CG”, the rate at which pictures are displayed on a screen is fixed, but the processing time for generating each picture to be displayed is not limited, since the pictures are obtained one by one by computations in advance. Hence, in order to implement these schemes in real-time 3D CG, a mechanism for executing graphic processes at higher speed is required.
However, in the structure of the conventional graphic computing apparatus, since the vertex process in a geometry section and the texture process in a rendering section are separated and shared by the geometry and rendering units, and possible expressions in the individual processors are fixed, elaborate, realistic pictures cannot be efficiently drawn using the aforementioned schemes.
As an example to which the aforementioned schemes in the “non-real-time CG” field can be applied, a REYES architecture proposed by Robert L. Cook et al., “The Reyes Image Rendering Architecture”, Computer Graphics (Proc. of SIGGRAPH '87), Vol. 21, No. 4, pp. 95–102 (reference 4) is known. This architecture is implemented by software, and is commercially available as “PHOTOREALISTIC RENDERMAN” software from Pixar Animation Studios, USA. This architecture divides an input primitive into polygons called micropolygons equal to or smaller than the pixel size, and programmably executes elaborate processes including displacement mapping in units of vertexes of micropolygons.
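The division of a primitive into micropolygons described above can be sketched for a simple case. This is not the procedure of reference 4 itself: it assumes a four-sided bilinear patch already given in screen space, and it bounds the edge length of each micropolygon (rather than its area) by one pixel. The names `dice_rates` and `dice` are hypothetical.

```python
import math

def dice_rates(corners, max_edge=1.0):
    """Choose grid counts so that each cell edge is at most max_edge pixels.

    corners: screen-space (x, y) points ordered p00, p10, p11, p01.
    """
    p00, p10, p11, p01 = corners

    def dist(a, b):
        return math.hypot(b[0] - a[0], b[1] - a[1])

    nu = max(1, math.ceil(max(dist(p00, p10), dist(p01, p11)) / max_edge))
    nv = max(1, math.ceil(max(dist(p00, p01), dist(p10, p11)) / max_edge))
    return nu, nv

def dice(corners, max_edge=1.0):
    """Return the (nv + 1) x (nu + 1) grid of micropolygon vertices."""
    p00, p10, p11, p01 = corners
    nu, nv = dice_rates(corners, max_edge)
    grid = []
    for j in range(nv + 1):
        v = j / nv
        row = []
        for i in range(nu + 1):
            u = i / nu
            # Bilinear interpolation of the four corner positions.
            row.append(tuple(
                (1 - u) * (1 - v) * a + u * (1 - v) * b + u * v * c + (1 - u) * v * d
                for a, b, c, d in zip(p00, p10, p11, p01)
            ))
        grid.append(row)
    return grid
```

For a patch covering 4 by 3 pixels, this yields a grid of 4 by 3 micropolygons, i.e., 5 by 4 vertexes, each of which could then undergo a programmable process such as displacement.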
However, this REYES architecture attaches importance to the creation of very high-quality pictures. Hence, this architecture requires a long time for arithmetic operations since it is not devised to shorten the drawing time, which is strictly required in real-time 3D CG, and is not suitable for real-time hardware. In particular, since all primitives are basically processed by dividing them into small micropolygons equal to or smaller than the pixel size, a huge number of micropolygons are generated (for example, in the example described in reference 4, the number of micropolygons is 6.8 million), resulting in poor adaptability to real-time hardware.
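The scale of the problem can be made concrete with a short calculation. It assumes, purely for illustration, that a scene generating the 6.8 million micropolygons quoted above had to be redrawn in every frame at the frame time of 1/60 sec mentioned earlier; the function name is hypothetical.

```python
def required_rate(primitives_per_frame, frames_per_second):
    """Primitives that must be processed per second to sustain the frame rate."""
    return primitives_per_frame * frames_per_second

# Figure quoted above from reference 4; the 60 frames/sec rate is an assumption.
micropolygons_per_frame = 6_800_000
print(required_rate(micropolygons_per_frame, 60))  # prints 408000000
```

Under these assumptions, about 408 million micropolygons would have to be processed every second.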