The present application relates to graphics rendering hardware for computer graphics and animation systems, and particularly to clipping of such graphics in preparation for rendering.
Computer Graphics and Rendering
Modern computer systems normally manipulate graphical objects as high-level entities. For example, a solid body may be described as a collection of triangles with specified vertices, or a straight line segment may be described by listing its two endpoints with three-dimensional or two-dimensional coordinates. Such high-level descriptions are a necessary basis for high-level geometric manipulations, and also have the advantage of providing a compact format which does not consume memory space unnecessarily.
Such higher-level representations are very convenient for performing the many required computations. For example, ray-tracing or other lighting calculations may be performed, and a projective transformation can be used to reduce a three-dimensional scene to its two-dimensional appearance from a given viewpoint. However, when an image containing graphical objects is to be displayed, a very low-level description is needed. For example, in a conventional CRT display, a “flying spot” is moved across the screen (one line at a time), and the beam from each of three electron guns is switched to a desired level of intensity as the flying spot passes each pixel location. Thus at some point the image model must be translated into a data set which can be used by a conventional display. This operation is known as “rendering.”
The graphics-processing system typically interfaces to the display controller through a “frame store” or “frame buffer” of special two-port memory, which can be written to randomly by the graphics processing system, but also provides the synchronous data output needed by the video output driver. (Digital-to-analog conversion is also provided after the frame buffer.) Such a frame buffer is usually implemented using SDRAM memory chips (or sometimes with SGRAM or VRAM). This interface relieves the graphics-processing system of most of the burden of synchronization for video output. Nevertheless, the amounts of data which must be moved around are very sizable, and the computational and data-transfer burden of placing the correct data into the frame buffer can still be very large.
Even if the computational operations required are quite simple, they must be performed repeatedly on a large number of datapoints. For example, in a typical high-end configuration, a display of 1280×1024 elements may need to be refreshed at 72 Hz, with a color resolution of 24 bits per pixel. If blending is desired, additional bits (e.g. another 8 bits per pixel) will be required to store an “alpha” or transparency value for each pixel. This implies manipulation of more than 3 billion bits per second, without allowing for any of the actual computations being performed. Thus it may be seen that this is an environment with unique data manipulation requirements.
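The arithmetic behind this figure can be checked with a short sketch. It assumes that every pixel is read out once per refresh and that the quoted total corresponds to 24 color bits plus 8 alpha bits per pixel; the function and constant names are illustrative only.

```python
# Frame-buffer refresh bandwidth for the display configuration quoted above.
WIDTH, HEIGHT = 1280, 1024   # display elements (pixels)
REFRESH_HZ = 72              # refresh rate
COLOR_BITS = 24              # color resolution per pixel
ALPHA_BITS = 8               # optional alpha (transparency) value per pixel


def refresh_bits_per_second(width, height, refresh_hz, bits_per_pixel):
    """Bits touched per second if every pixel is visited once per refresh."""
    return width * height * refresh_hz * bits_per_pixel


color_only = refresh_bits_per_second(WIDTH, HEIGHT, REFRESH_HZ, COLOR_BITS)
with_alpha = refresh_bits_per_second(
    WIDTH, HEIGHT, REFRESH_HZ, COLOR_BITS + ALPHA_BITS
)

print(color_only)  # 2264924160
print(with_alpha)  # 3019898880
```

Under these assumptions the color data alone accounts for roughly 2.26 billion bits per second, and the total passes 3 billion only once the alpha bits are included.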
If the display is unchanging, no demand is placed on the rendering operations. However, some common operations (such as zooming or rotation) will require every object in the image space to be re-rendered. Slow rendering will make the rotation or zoom appear jerky. This is highly undesirable. Thus efficient rendering is an essential step in translating an image representation into the correct pixel values. This is particularly true in animation applications, where newly rendered updates to a computer graphics display must be generated at regular intervals.
The rendering requirements of three-dimensional graphics are particularly heavy. One reason for this is that, even after the three-dimensional model has been translated to a two-dimensional model, some computational tasks may be bequeathed to the rendering process. (For example, color values will need to be interpolated across a triangle or other geometric structure constituting the objects of a graphic.) These computational tasks tend to burden the rendering process. Another reason is that since three-dimensional graphics are much more lifelike, users are more likely to demand a fully rendered image. (By contrast, in the two-dimensional images created e.g. by a GUI or simple game, users will learn not to expect all areas of the scene to be active or filled with information.)
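As an illustration of the kind of interpolation work left to the renderer, the following minimal sketch linearly interpolates a color along one horizontal span of pixels. It assumes the colors at the two ends of the span are already known; the names lerp_color and shade_span are hypothetical and do not come from any particular system.

```python
def lerp_color(c0, c1, t):
    """Linearly interpolate between two RGB tuples; t=0 gives c0, t=1 gives c1."""
    return tuple(a + (b - a) * t for a, b in zip(c0, c1))


def shade_span(x_start, x_end, c_start, c_end):
    """Return (x, color) for every pixel of a horizontal span, ends inclusive."""
    n = x_end - x_start
    if n == 0:
        # Single-pixel span: nothing to interpolate.
        return [(x_start, c_start)]
    return [
        (x_start + i, lerp_color(c_start, c_end, i / n))
        for i in range(n + 1)
    ]


# Red at x=10 fading to blue at x=14.
span = shade_span(10, 14, (255, 0, 0), (0, 0, 255))
```

Even this simple case costs one interpolation per color channel per pixel, which suggests why such tasks add up across a full frame.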
FIG. 6 is a very high-level view of processes performed in a 3D graphics computer system. A three-dimensional image which is defined in some fixed 3D coordinate system (a “world” coordinate system) is transformed into a viewing volume (determined by a view position and direction), and the parts of the image which fall outside the viewing volume are discarded. The visible portion of the image volume is then projected onto a viewing plane, in accordance with the familiar rules of perspective. This produces a two-dimensional image, which is now mapped into device coordinates.
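The sequence just described can be sketched for individual points as follows. This is a simplified illustration under stated assumptions, not the process of FIG. 6 itself: the viewer is assumed to sit at the origin looking along +z, the viewing plane is at z = focal_length, the visible region of that plane is [-1, 1] in both directions, and whole points are discarded rather than polygons being clipped. All names and default values are hypothetical.

```python
def in_view_volume(point, near=0.1, far=100.0, focal_length=1.0):
    """True if a view-space point lies inside the assumed viewing volume."""
    x, y, z = point
    if not (near <= z <= far):
        return False
    u, v = focal_length * x / z, focal_length * y / z
    return abs(u) <= 1.0 and abs(v) <= 1.0


def project_point(point, focal_length=1.0):
    """Perspective-project a view-space point onto the plane z = focal_length."""
    x, y, z = point
    if z <= 0:
        return None  # at or behind the viewer: no valid projection
    return (focal_length * x / z, focal_length * y / z)


def to_device(plane_xy, width, height):
    """Map plane coordinates in [-1, 1] to pixel coordinates, y growing downward."""
    u, v = plane_xy
    px = (u + 1.0) * 0.5 * (width - 1)
    py = (1.0 - v) * 0.5 * (height - 1)
    return (px, py)


points = [(1.0, 1.0, 2.0), (3.0, 0.0, 2.0), (0.0, 0.0, -1.0)]
visible = [p for p in points if in_view_volume(p)]
pixels = [to_device(project_point(p), 1280, 1024) for p in visible]
```

Only the first of the three sample points survives the volume test; the second projects outside the visible region and the third lies behind the viewer.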
3D Graphics
Three-dimensional graphics (3D graphics) refers to the practice of presenting a scene or image on a two-dimensional screen in such a way that it appears three-dimensional. To do so, much care must be taken to accurately display surface textures, lighting, shadowing, and other characteristics. Displaying a 3D graphics image is much more difficult than displaying a traditional 2D image.
3D Graphics Requirements
3D graphics takes a great deal of computer processing power and memory. One of the performance measures for 3D games is frame rate, expressed in frames-per-second (fps), meaning the number of times each second an image can be redrawn to convey a sense of motion.
3D Graphics Concepts
3D graphics are spatial data represented in polygonal form with an associated set of characteristics, such as light, color, shade, texture, etc. The 3D graphics pipeline consists of two major stages, or subsystems, referred to as geometry and rendering. The geometry stage is responsible for managing all polygon activities and for converting 3D spatial data into pixels. The rendering stage is responsible for managing all memory and pixel activities. It renders data from the geometry stage into the final composition of the 3D image for painting on the CRT screen.
Before considering how a scene is broken down to allow the computer to reconstruct it, one has to start with a scene which consists of shapes. The modeling process creates this information. Designers use specialized 3D graphics software tools, such as 3D Studio, to build polygonal models destined to be manipulated by a computer.
3D Graphics Pipeline
The first stage of the pipeline involves translating the model from its native coordinate space, or model coordinates, to the coordinate space of the application, or world coordinates. At this stage, the separate and perhaps unrelated coordinate systems defining objects in a scene are combined in a single coordinate system referred to as world space (World Space Co-ordinates). Translating objects into world space may involve clipping, or discarding elements of an object that fall outside the viewport or display window.
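A minimal sketch of this first stage, assuming each object is placed in world space by a uniform scale followed by a translation (real systems generally use a full transformation matrix). The function name model_to_world and the sample triangle are hypothetical.

```python
def model_to_world(vertices, position, scale=1.0):
    """Place model-space vertices in world space: scale uniformly, then translate."""
    px, py, pz = position
    return [
        (x * scale + px, y * scale + py, z * scale + pz)
        for x, y, z in vertices
    ]


# Hypothetical triangle defined in its own model coordinates.
triangle = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

# Two instances of the same model end up in one shared world coordinate system.
world = (
    model_to_world(triangle, position=(10.0, 0.0, 5.0))
    + model_to_world(triangle, position=(-4.0, 2.0, 8.0), scale=2.0)
)
```

The model itself is stored once; only the placement parameters differ between the two instances.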
Interactive 3D graphics seek to convey an illusion of movement by changing the scene in response to the user's input. The technical term for changing the database of geometry that defines objects in a scene is transformation. The operations involve moving an object in the X, Y, or Z direction, rotating it in relation to the viewer (camera), or scaling it to change the size. (The X coordinate is moving left to right; Y is moving from top to bottom; Z is moving from “in front” to behind.)
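The three operations named above are commonly expressed as 4×4 matrices acting on homogeneous coordinates, so that a sequence of operations collapses into one matrix product. The sketch below assumes that representation and column-vector convention; only rotation about the Z axis is shown, its sign convention is an assumption, and all function names are illustrative.

```python
import math


def translation(tx, ty, tz):
    """Matrix that moves a point by (tx, ty, tz)."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def scaling(sx, sy, sz):
    """Matrix that scales each axis independently."""
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]


def rotation_z(angle):
    """Matrix that rotates by `angle` radians about the Z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def matmul(a, b):
    """Product of two 4x4 matrices; the result applies b first, then a."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def apply(m, point):
    """Apply a 4x4 matrix to a 3D point treated as (x, y, z, 1)."""
    v = (point[0], point[1], point[2], 1.0)
    out = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return tuple(out[:3])


# Scale by 2, then rotate 90 degrees about Z, then move 5 units along X.
m = matmul(translation(5, 0, 0), matmul(rotation_z(math.pi / 2), scaling(2, 2, 2)))
p = apply(m, (1.0, 0.0, 0.0))  # approximately (5, 2, 0)
```

Because the combined matrix is computed once and then applied to every vertex, the per-vertex cost does not grow with the number of operations chained together.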
When any change in the orientation or position of the camera is desired, every object in a scene must be recalculated relative to the new view. As can be imagined, a fast-paced game needing to maintain a high frame rate will demand a great deal of geometry processing. As scene complexity increases (more polygons) the computational requirements increase as well.
The setup stage is the point in the 3D pipeline where the host CPU typically hands off processing tasks to the hardware accelerator. Setup is the last stage before rasterization, or drawing, and can consume considerable processing time. The computational demands of the setup process depend on the number of parameters required to define each polygon as well as the needs of the pixel drawing engine.
Background: Barycentric Coordinates
Given a frame in three-dimensional World Space, a local coordinate system can be defined with respect to the frame. When given a set of points in three-dimensional space, a local coordinate system can also be constructed. These types of coordinate systems are called barycentric coordinates.
Barycentric coordinates are a method of introducing coordinates into an affine space. If the coordinates sum to one, they represent a point; if the coordinates sum to zero, they represent a vector. Consider a set of points P0, P1, . . . , Pn and consider all affine combinations that can be taken from these points, that is, all points P that can be written as α0P0+α1P1+ . . . +αnPn for some α0+α1+ . . . +αn=1. This set of points forms an affine space, and the coordinates (α0, α1, . . . , αn) are called the barycentric coordinates of the points of the space.
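The rule that coordinates summing to one give a point, while coordinates summing to zero give a vector, can be illustrated with a small sketch. The function name combine and the numerical tolerance are assumptions made for the example.

```python
def combine(points, weights, tol=1e-9):
    """Evaluate sum(w_i * P_i) and report whether it is a point or a vector.

    Weights summing to 1 describe a point; weights summing to 0 describe
    a vector. Any other sum is rejected as not being a valid combination.
    """
    if len(points) != len(weights):
        raise ValueError("need exactly one weight per point")
    total = sum(weights)
    if abs(total - 1.0) <= tol:
        kind = "point"
    elif abs(total) <= tol:
        kind = "vector"
    else:
        raise ValueError("weights must sum to 1 (point) or 0 (vector)")
    dim = len(points[0])
    coords = tuple(
        sum(w * p[i] for w, p in zip(weights, points)) for i in range(dim)
    )
    return kind, coords


P = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
kind_a, a = combine(P, (0.25, 0.25, 0.5))  # a point: (1.0, 2.0)
kind_b, b = combine(P, (-1.0, 1.0, 0.0))   # the vector from P0 to P1: (4.0, 0.0)
```

The second call shows the vector case: the weights -1 and 1 simply subtract P0 from P1.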
These coordinate systems are quite useful, and used extensively, when working with polygons such as triangles.
Point in a Triangle
FIG. 4 depicts a triangle object. Consider three points P1, P2, P3 in a plane. If α1, α2, and α3 are scalars such that α1+α2+α3=1, then the point P defined by P=α1P1+α2P2+α3P3 is a point on the plane of the triangle formed by P1, P2, P3. The point is within the triangle ΔP1P2P3 if 0≦α1, α2, α3≦1. If any of the α's is less than zero or greater than one, the point P is outside the triangle. If any of the α's is 0, P is on one of the lines joining the vertices of the triangle.
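The test described above can be sketched for the two-dimensional case as follows. The α values are obtained by solving the 2×2 linear system implied by P=α1P1+α2P2+α3P3 with α1+α2+α3=1; the function names are hypothetical, and points lying exactly on an edge are counted as inside.

```python
def barycentric(p, p1, p2, p3):
    """Return (a1, a2, a3) with a1+a2+a3 = 1 and p = a1*p1 + a2*p2 + a3*p3."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, p1, p2, p3
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")
    a1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    a2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    a3 = 1.0 - a1 - a2
    return a1, a2, a3


def point_in_triangle(p, p1, p2, p3):
    """True if every barycentric coordinate of p lies in [0, 1]."""
    return all(0.0 <= a <= 1.0 for a in barycentric(p, p1, p2, p3))


P1, P2, P3 = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)

print(barycentric((1.0, 1.0), P1, P2, P3))        # (0.5, 0.25, 0.25)
print(point_in_triangle((1.0, 1.0), P1, P2, P3))  # True
print(point_in_triangle((5.0, 5.0), P1, P2, P3))  # False
```

Because the three coordinates sum to one, checking that none is negative would be sufficient; the sketch tests both bounds only to mirror the condition as stated in the text.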