The present invention solves a long-standing problem in low-cost, real-time computer graphics. A critical component of real-time computer graphics involves the use of special purpose hardware to implement fast, efficient texture calculation. The hardware must also control the Texture Memory read process, which includes reading up to 8 input samples to produce one pixel of output. Eight independent memory units may be used to supply the stream of data, but in a low-cost implementation only a single Dynamic RAM (DRAM) memory bank is available. When using only a single-bank DRAM, the actual pattern of memory access causes significant delays, due both to the numerous read operations and to the semi-random pattern of access.
This invention solves the memory access problem via a novel cache memory, and a method for its use, allowing maximum texture calculation rate while using a low-cost single bank DRAM hardware implementation.
Texture Generator Controls Reading Texels from a MIP Map Stored in DRAM
The Texture Generator subsystem calculates the memory addresses needed to control reading pre-stored values from Texture Memory. The Texture Memory contains digitized images or synthesized images, each consisting of a two-dimensional matrix of sample values (Texels). To support a proper, non-aliased sampling process, each two-dimensional image is stored along with additional representations of the image which contain successively lower resolution versions of the original image. An original image of 256 by 256 samples, for example, is stored along with a representation of this same image which is digitized with only 128 by 128 samples. This two to one reduction in resolution in each sample axis is fully supported, with versions of the original image extending all the way down to a nearly final 2 Texel by 2 Texel representation, finishing with a single 1 by 1 representation of the image (a single Texel). This pre-filtering technique (involving re-sampling and storing prior to the real-time texture sampling process) has been named the MIP MAP storage technique.
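The two to one reduction described above can be illustrated with a short sketch. The function name `mip_chain` and the restriction to square, power-of-two images are assumptions made for illustration only; the source does not specify how the levels are laid out in memory.

```python
def mip_chain(size):
    """Return the square level resolutions from `size` down to 1.

    Hypothetical helper: assumes a square image whose side is a power of two.
    """
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a positive power of two")
    levels = []
    while size >= 1:
        levels.append(size)   # store this level's side length
        size //= 2            # halve the resolution in each sample axis
    return levels


levels = mip_chain(256)
total_texels = sum(n * n for n in levels)
print(levels)        # [256, 128, 64, 32, 16, 8, 4, 2, 1]
print(total_texels)  # 87381
```

For a 256 by 256 original, the full set of levels holds 87,381 Texels, roughly one third more than the original image alone.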
Two Levels of Texture Detail Sampled and Combined
Producing a single texture sample from a MIP MAP which has been stored in texture memory requires two sets of sampling operations which are then properly combined to produce the single output. Since multiple Levels-Of-Detail are stored in memory (for each source image), one set of sampling operations is performed using one of the Level-Of-Detail representations, and the second set of sampling operations is performed in the next lower Level-Of-Detail. This can also be described as selecting the two pre-filtered representations of the original image which happen to straddle the precise sampling resolution that is desired. For example, between the LOD 4 representation of the image and the LOD 3 representation of the image we may wish to sample at essentially a LOD 3.5 level of pre-filtering. This desired result is achieved by sampling at LOD 4, sampling at LOD 3, and then averaging the results to give a result which looks appropriate as an approximation of LOD 3.5. To allow the desired LOD to vary in this case from 3.0 up to 4.0, the fractional component of desired LOD is used to control a linear interpolation between the LOD 3 and LOD 4 results. All such TriLinear MIP Mapped results are computed by blending between two independent occurrences of a LOD sampling operation.
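The blend between two Levels-Of-Detail can be sketched as follows. The name `blend_lods` and the callable `sample_at` (standing in for one complete LOD sampling operation) are hypothetical; this is an illustration of the linear interpolation described above, not the hardware implementation.

```python
import math


def blend_lods(sample_at, lod):
    """Blend two per-LOD samples using the fractional part of `lod`.

    `sample_at(level)` is a hypothetical callable returning the result of
    one LOD sampling operation at an integer level.
    """
    lower = math.floor(lod)        # e.g. 3 for a desired LOD of 3.5
    frac = lod - lower             # e.g. 0.5
    a = sample_at(lower)           # first LOD sampling operation
    b = sample_at(lower + 1)       # second LOD sampling operation
    return (1.0 - frac) * a + frac * b


# Usage with made-up sample values for LOD 3 and LOD 4.
samples = {3: 100.0, 4: 200.0}
print(blend_lods(samples.__getitem__, 3.5))   # 150.0
print(blend_lods(samples.__getitem__, 3.25))  # 125.0
```

At a fractional component of 0.5 the interpolation reduces to the simple average mentioned in the text.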
Three or Four Texels Read to Produce a Sample at One LOD
Each LOD sampling operation must produce a result which gives a consistent, continuous appearance from one sampling operation to the next. The sample point is calculated for neighboring pixels in turn by picking the screen location at the center of a pixel and extending a ray from the eye point, through the center of the pixel, into scene space, until it strikes a polygon which has the texture image mapped onto its surface. The exact location within a MIP Map level where the ray strikes the two-dimensional image is then calculated. This precise location is used to control an interpolation process, interpolating between the discrete samples (the Texels) which surround the precise sampling location. To produce a continuous result from one sample to the next, the four Texels within the two-dimensional matrix whose centers are nearest to the sample point are selected, and these four are used in the two-dimensional interpolation process.
As an option this same process can also be used whereby the nearest three Texel centers are used, with three Texels going into the two-dimensional interpolation process. Both three and four input interpolation is described here, since either is acceptable, and the extra miscellaneous logic needed to implement the three sample approach sometimes is avoided in favor of the more simple four input hardware implementation. For simplicity the four input interpolation approach will be used in the following description.
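A minimal sketch of the four input interpolation follows. It assumes that Texel centers sit at half-integer coordinates, that the image is indexed as `texels[row][col]`, and that coordinates are clamped at the image border; none of these conventions is stated in the source, and the name `bilinear_sample` is hypothetical.

```python
import math


def bilinear_sample(texels, s, t):
    """Interpolate the four Texels whose centers surround (s, t).

    Assumptions: texels[row][col], Texel (col, row) centered at
    (col + 0.5, row + 0.5), edge coordinates clamped.
    """
    height, width = len(texels), len(texels[0])
    x, y = s - 0.5, t - 0.5              # shift so centers fall on integers
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0              # fractional position between centers

    def fetch(col, row):
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        return texels[row][col]

    upper = (1.0 - fx) * fetch(x0, y0) + fx * fetch(x0 + 1, y0)
    lower = (1.0 - fx) * fetch(x0, y0 + 1) + fx * fetch(x0 + 1, y0 + 1)
    return (1.0 - fy) * upper + fy * lower


image = [[0.0, 10.0],
         [20.0, 30.0]]
print(bilinear_sample(image, 1.0, 1.0))  # 15.0, midway between all four centers
print(bilinear_sample(image, 0.5, 0.5))  # 0.0, exactly on the first Texel center
```

Because the weights vary smoothly with the sample location, neighboring sample points produce smoothly varying results, which is the continuity requirement described above.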
Polygon-Pixel Color Calculation Including Texture
Rendering a Computer Graphics Image typically involves computing pixel brightness for a single polygon at a time, progressing across the pixels of the display, storing the results in a Frame Buffer. Other hardware components manage the process which identifies the pixels to be processed in turn. The set of pixels fed downstream for processing all lie within the area of the single polygon being processed (in this example Feature sequential rendering approach). More specifically, when the center of the pixel happens to lie within the area of the polygon, the pixel is identified as a valid pixel for the following color calculation process.
Calculating the color of a Polygon-Pixel occurrence includes calculating the smoothly varying inherent color of a polygon which is interpolated from the color of vertices, and calculating the Texture result for the pixel, followed by combining Color and Texture, or simply using the Texture results. The Texture result may be a color or monochrome result, and may optionally include translucency which results from the Texture calculation. The option to combine polygon color and texture or alternatively to simply apply the Texture Color as the final results is a choice specified during modeling of the computer graphics scene.
Texture Use Creates Realistic Graphics
Scenes modeled exclusively using color specified at vertices of polygons, with simple interpolation of color across polygons, but with no Texture, appear unrealistic and unnatural to the human eye. Simple color shading of polygons (absence of Texture) also gives imagery that is hard to interpret while in motion, since ground surfaces tend to include smoothly varying color which fails to give the needed Stimulus Gradient that the human visual system expects. As a result, with simple color shading, a real-time graphics display device fails to give the needed visual cues when simulating real-world scenarios, and the user is left unable to determine position simply by reference to the graphics display.
Texture is added to a scene by specifying a relationship between a digitized image (or synthesized image) and its placement on a flat polygon, similar to the way wallpaper (with an image on its surface) is applied to a flat wall. The Texture Image must be properly translated, rotated, and scaled, and this is specified during modeling, as the relationship between image and polygon are set during the off-line 3D scene composition steps (3D Database Modeling).
The image that is applied to a polygon during Texturing typically contains recognizable features along with subtle brightness variations which clearly remind the human visual system of certain consistent looking materials such as Grass, or Bricks, or Road surface. These consistent brightness or color variations appear across the surface of a polygon with a correct perspective orientation, and this gives the user all of the Stimulus Gradient needed to navigate realistically through a simulated world.
In addition, scenes modeled with textured polygons take on a Realism that is striking, due to the way in which surfaces imitate real-world surfaces. We expect to see subtle brightness variations on surfaces, caused in the real-world by irregular surfaces, imperfections, or even caused by dirt or normal wear and tear. Scenes computed without Texture in contrast all have a wholly artificial appearance--with objects appearing to be made out of perfect materials with no surface irregularities--a condition that does not occur in the real world. Applying Texture to surfaces in a scene therefore creates a realism that adds dramatically to any simulated world.
Extreme Demand for Texture Calculation
Texturing adds enormously to the effectiveness of real-time graphics and so most or all Polygon-Pixel occurrences generated during the rendering process must be processed through the Texture Calculation, which includes the above mentioned need to access eight semi-random memory locations to produce the single Polygon-Pixel result. To sustain real-time updating of Frame Buffer contents, a Polygon-Pixel completion rate of 30 million completed results per second (or more) may be required. This completion rate demand is dictated by the need to calculate more than 1/4 million pixels, with an average Polygon coverage of 4 Polygons covering or touching each Pixel (typical), while completing the full scene at a 30 Frame per Second completion rate. In the absence of a sophisticated Texture Memory access technique, such a device would require 240 million semi-random Texel read operations per second, demanded from a single bank of DRAM. Typical DRAM currently supplies 12 million purely random read operations per second, so the Memory Read demands associated with Texture Generation will severely limit graphics performance in the absence of a dramatic improvement in the Texture Memory access concept. The present invention supplies a potential 20 to 1 improvement in supplying the needed Texels to a Texture Generator, while relying on the existing cost-effective DRAM technology.
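The figures quoted above can be checked with a few lines of arithmetic. The variable names are illustrative, and the pixel count is taken as exactly 250,000 although the text says "more than 1/4 million".

```python
# Figures taken from the text; names are illustrative only.
pixels_per_frame = 250_000              # "more than 1/4 million pixels"
polygons_per_pixel = 4                  # typical average Polygon coverage
frames_per_second = 30
texels_per_result = 8                   # eight Texel reads per Polygon-Pixel
dram_random_reads_per_second = 12_000_000

polygon_pixels_per_second = (
    pixels_per_frame * polygons_per_pixel * frames_per_second
)
texel_reads_per_second = polygon_pixels_per_second * texels_per_result
shortfall = texel_reads_per_second / dram_random_reads_per_second

print(polygon_pixels_per_second)  # 30000000
print(texel_reads_per_second)     # 240000000
print(shortfall)                  # 20.0
```

The ratio of demanded reads to available random reads is 20 to 1, which matches the potential improvement stated for the invention.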
Problem with Simple Parallel Texel Storage
FIGS. 1A and 1B show the problem associated with reading Texels from off-chip DRAM. FIG. 1A shows the four Texels needed to calculate a proper smooth sample point 110, given four input Texels. An attempt to organize memory storage to include four Texels within a single word of storage can be shown to work for FIG. 1A, but does not help when processing FIG. 1B. FIG. 1B shows the need to read four groupings of four Texels in order to supply the proper inputs to the Texture calculation.
Each square in the FIG. 1A grid represents a texel. The dotted line rectangle is drawn to illustrate the four neighboring texels Top Right ("TR") 111, Bottom Right ("BR") 112, Bottom Left ("BL") 113 and Top Left ("TL") 114 used in the interpolation calculation to generate the texture for the pixel corresponding to sample point 110. In this case storing texels 111-114 in a single word would improve the memory access efficiency for the interpolation calculation of sample point 110. However, such a memory grouping would not solve inefficiencies for the interpolation calculation of sample point 130 illustrated in FIG. 1B. FIG. 1B illustrates a second precisely calculated sample point 130. The texels are labeled to indicate how the texels would be stored in memory, using the memory organization approach of FIG. 1A. Each square group of texels comprising texels labeled TR, BR, BL, and TL would be stored in a memory word. The dotted line rectangle in FIG. 1B shows the four texels 131-134 that would be used in the interpolation calculation for sample point 130. Texels 131-134 are each stored as part of a separate memory word. To retrieve texels 131-134, reading memory a word at a time therefore would involve reading four different memory words. FIGS. 1A and 1B illustrate how a fixed grouping of four neighboring texels, stored together in one word for example, would not supply the desired set of four texels with a single read operation. In some cases, as shown in FIG. 1B, four groups of four texels would be needed from texture memory to supply the proper texels for interpolation.
The problem with reading three or four neighboring texels is that the geometry of the situation forces multiple random reads from graphics memory, not the more desirable single random access, followed by several sequential accesses within the same DRAM page. Thus a simple clustering of four neighboring texels does not eliminate the need for three or four random accesses potentially needed to supply a single modulation calculation.
Any conceivable small grouping of Texels still requires one, two, or possibly four read operations from DRAM to supply the inputs needed for a single LOD calculation. Any large grouping of Texels would fail to fit within a single word of DRAM and would require multiple read operations to fetch a Texel (thereby defeating the purpose of large groupings of Texels). A simple grouping of Texels in external DRAM fails to achieve the ultimate speed goal.
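The effect of a fixed four-Texel grouping, as in FIGS. 1A and 1B, can be illustrated by counting how many memory words a two by two interpolation footprint touches. The sketch assumes each word holds one aligned two by two block of Texels; the function name `words_touched` is hypothetical.

```python
def words_touched(col, row):
    """Count the memory words covered by a 2x2 Texel footprint.

    (col, row) is the footprint's Texel with the lowest coordinates.
    Assumption: each word stores one aligned 2x2 block of Texels.
    """
    words = {
        (c // 2, r // 2)                 # index of the word holding Texel (c, r)
        for c in (col, col + 1)
        for r in (row, row + 1)
    }
    return len(words)


print(words_touched(0, 0))  # 1: footprint coincides with one word (FIG. 1A)
print(words_touched(1, 0))  # 2: footprint straddles two words
print(words_touched(1, 1))  # 4: footprint straddles four words (FIG. 1B)

counts = [words_touched(c, r) for c in (0, 1) for r in (0, 1)]
print(sum(counts) / len(counts))  # 2.25 words on average over the four alignments
```

Under these assumptions only one footprint alignment in four is served by a single read, which is why the fixed grouping does not reach the speed goal.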
Use of a MIP Map to Solve the Undersampling Problem
The multi-resolution storage inherent in a MIP Map is needed to avoid undersampling during the Texture calculation process. A single Level-Of-Detail image applied in perspective on a Polygon can easily lead to undersampling, since pixel centers that are close together on an output display can impinge upon precise texture sample locations that are far distant in Texture Space. Any consistent signal must be sampled at reasonably close intervals (sampled at better than twice the frequency of the signal). When this reasonably close sampling rule is not followed, then successive samples will fail to convey the information that is available in the signal, and instead the result will be the appearance of noise. Undersampling a signal gives noise.
When applying Texture to surfaces we wish to display an image in perspective, on a polygon. The undersampling mistake described here causes noise to be displayed instead of an image. This is a serious undesirable side effect of viewing images in perspective. The onset of noise is gradual: pure noise is preceded by an odd combination of some signal and some noise, which gives undesirable visual artifacts called Moire Patterns. These wholly distract a viewer by moving in unpredictable ways across Textured surfaces during simulated motion. The MIP Map technique, if properly applied, solves the noise problem, eliminating pure noise and eliminating the possibility of Moire Patterns appearing in the Texture Generator output.
Noise results because texture samples for neighboring pixels sample the texture image with spacing between samples that is too far apart as compared to the spacing of the Texel grid. The solution includes storing a more coarsely sampled version of the same image and using the more coarse representation when the Texture sample points become too widely spaced. As described above, the typical MIP Map actually contains multiple LOD representations, and the essential step of avoiding undersampling requires a per sample selection of the proper LOD to use for the current pixel. Typically this LOD value calculation is performed for each pixel, giving a precise LOD value (including a fractional component of LOD as described above) which is used to control the MIP Map sampling operation.
Texture Level-Of-Detail Calculation
Above we described how a precise LOD number can be used to control interpolation between two LOD samples. Here we stress that this number must be calculated per pixel (per sample) to account for the way in which the perspective view creates a complex variation in Texture spacing from one pixel to the next. The ideal calculation to determine the proper sampling LOD involves taking the Gradient of the Texture Equations, and evaluating this Gradient equation at each pixel center.
Given the two independent texture axes, we would actually take the worst case of the Gradient of the S axis and the Gradient of the T axis. This worst case of the two numbers is the single result, the single LOD value used to control the MIP Map sampling process at a pixel.
A practical alternative to evaluating the Gradient of S and T involves taking the difference of S and T as measured across the width and height of one pixel. The gradient of S can be approximated via use of the difference of S, sampled one pixel distant in both the horizontal and vertical directions. In practice the Square Root of the Sum of the Squares of these differences is used as the approximation to the Gradient of S at a point. The same approximation is applied to the independent sampling of T at these same locations, and then the worst case of approximated gradients is used to select the LOD for processing a single pixel. This differencing and approximation is repeated for each Polygon-Pixel and used to control sampling the MIP Map. One such LOD value is calculated and used for each Polygon-Pixel sample operation.
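The differencing approach can be sketched as below. The Square Root of the Sum of the Squares and the worst-case selection follow the text; the final conversion from a gradient magnitude to an LOD number (a base-2 logarithm, with level 0 as the full-resolution image and negative values clamped to 0) is an assumption, since the source does not state that mapping. The function names are hypothetical, and S and T are taken to be in Texel units of the full-resolution image.

```python
import math


def approx_gradient(d_horizontal, d_vertical):
    """Square Root of the Sum of the Squares of the two differences."""
    return math.hypot(d_horizontal, d_vertical)


def select_lod(s, t, s_right, t_right, s_down, t_down):
    """Approximate the per-pixel LOD from one-pixel differences of S and T.

    (s, t) is the texture coordinate at the pixel; the *_right and *_down
    values are the coordinates one pixel away horizontally and vertically.
    """
    grad_s = approx_gradient(s_right - s, s_down - s)
    grad_t = approx_gradient(t_right - t, t_down - t)
    worst = max(grad_s, grad_t)          # worst case of the two axes
    if worst <= 1.0:
        return 0.0                       # assumption: clamp to the finest level
    return math.log2(worst)              # assumption: log2 maps spacing to LOD


# S advances 4 Texels per pixel horizontally, T advances 2 Texels vertically.
print(select_lod(10.0, 10.0, 14.0, 10.0, 10.0, 12.0))  # 2.0
```

The fractional part of the returned value would then control the interpolation between the two neighboring Levels-Of-Detail.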