Specifically, the present invention discloses an improved method and apparatus for per pixel MIP mapping and trilinear filtering.
Multimedia graphics are typically generated by treating an image as a collection of small, independently controlled dots (or pixels) arranged on a screen or cathode ray tube. A computer graphic image is typically composed of a number of objects rendered onto one background image, wherein each object comprises multiple pixels. Pixels, or 'picture elements', may be viewed as the smallest resolvable area of a screen image. The area of a pixel is usually rectangular in shape. Each pixel in a monochrome image has its own brightness, from 0 for black to the maximum value (e.g. 255 for an eight-bit pixel) for white. In a color image, each pixel has its own brightness and color, usually represented as a triple of red, green and blue intensities. During rendering, an object may be combined with previously generated objects using compositing techniques, wherein compositing is the combining of multiple images by overlaying or blending the images. In a composited image, the value of each pixel is computed from the component images.
Three-dimensional (3D) computer graphics generally refers to graphics environments that are rich in color, texture, correct point of view and shadowing. Typical 3D graphics systems generally implement a range of techniques to allow computer graphics developers to create better and more realistic graphics environments. A subset of these techniques is described in further detail below.
The building block of any 3D scene is a polygon. A polygon is a flat shape that is generated using rendered pixels. Triangles, for example, are frequently used to create a variety of shapes. The polygon may be rendered using pixels having a single color resulting in a flat look, or using pixels with shading applied, resulting in a gradation of color so that it appears darker with distance or based upon scene lighting.
In composing the triangles that form the images, each vertex or coordinate has a corresponding color value from a particular color model. A color model is a specification of a 3D color coordinate system and a visible subset in the coordinate system within which all colors in a particular color gamut lie, wherein a color gamut is a subset of all visible chromaticities. For example, the red (R), green (G), blue (B) color model (RGB) is the unit cube subset of the 3D Cartesian coordinate system. The purpose of a color model is to allow convenient specification of colors within some color gamut. The RGB primaries are additive primaries in that the individual contributions of each primary are added together to yield the resultant pixel. The color value of each pixel in a composited multimedia image is computed from the component images in some fashion.
Texture mapping is a technique that allows a 3D developer to create impressive scenes that appear realistic and detailed by scaling and mapping a bitmap image file onto a polygon. Instead of simply shading a polygon red, for example, the use of texture mapping allows a polygon to look like a realistic brick wall. As a technique to display images in a sufficiently realistic manner that represent complex three-dimensional objects, texture mapping involves mapping a source image, referred to as a texture, onto a surface of a three-dimensional object, and thereafter mapping the textured three-dimensional object to the two-dimensional graphics display screen to display the resulting image. Surface detail attributes that are commonly texture mapped include, for example, color, specular reflection, transparency, shadows, and surface irregularities.
Texture mapping may include applying one or more texture map elements of a texture to each pixel of the displayed portion of the object to which the texture is being mapped. (Where pixel is short for 'picture element', texture map element is shortened to 'texel'.) The location of each texel in a texture map may be defined by two or more spatial coordinates and a homogeneous texture effect parameter. For each pixel, the corresponding texel or texels that map to the pixel are accessed from the texture map via the texel coordinates associated with that pixel. To represent the textured object on the display screen, the corresponding texel is incorporated into the final R, G, B values generated for the pixel. Note that each pixel in an object primitive may not map in a one-to-one correspondence with a single texel in the texture map for every view of the object.
Texture mapping systems typically store data in memory where that data represents a texture associated with the object being rendered. As indicated above, a pixel may map to multiple texels. If it is necessary for the texture mapping system to read a large number of texels that map to a pixel from memory to generate an average value, then a large number of memory reads and the averaging of many texel values would be required. This would undesirably consume time and degrade system performance.
The Latin phrase multum in parvo translates to "much in little", as in the compression of much into little space. Multum in parvo (MIP) mapping is a technique that is used to improve the visual quality of texture mapping while optimizing performance. The technique works by having multiple texture maps for each texture, each rendered at a different resolution. Different texture maps are then used to represent the image at various distances. In other words, MIP mapping includes creating a series of MIP maps for each texture map and storing in memory the MIP maps of each texture map associated with the object being rendered. A set of MIP maps for a texture map includes a base map that corresponds directly to the texture map as well as a series of related filtered maps, where each successive map is reduced in size by a factor in each of the texture map dimensions. In a sense, each MIP map represents a different resolution of the texture map. Bilinear filtering may also be used to improve the visual quality of texture mapping. Bilinear filtering uses the four surrounding texels from a texture map to more precisely calculate the value of any given pixel in 3D space. Texels are dots within a texture map, while pixels refer to dots on the screen.
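The MIP map series and the four-texel bilinear lookup described above can be illustrated with a minimal Python sketch. The sketch rests on assumptions that are not stated in the text: a square, power-of-two, single-channel texture held as a list of rows, a reduction factor of two per level, a 2x2 box filter for the reduction, and clamping at the texture edges. The function names build_mip_chain and bilinear_sample are hypothetical.

```python
def build_mip_chain(base):
    """Return [base, half-size, quarter-size, ...] down to a 1x1 map.

    Assumption: each level halves both dimensions using a 2x2 box filter.
    """
    chain = [base]
    while len(chain[-1]) > 1:
        src = chain[-1]
        n = len(src) // 2
        chain.append([
            [(src[2 * y][2 * x] + src[2 * y][2 * x + 1]
              + src[2 * y + 1][2 * x] + src[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(n)]
            for y in range(n)
        ])
    return chain


def bilinear_sample(tex, u, v):
    """Blend the four texels surrounding (u, v), given in texel units."""
    size = len(tex)
    # Clamp the coordinates so all four reads stay inside the map.
    u = min(max(u, 0.0), size - 1.0)
    v = min(max(v, 0.0), size - 1.0)
    x0, y0 = int(u), int(v)
    x1, y1 = min(x0 + 1, size - 1), min(y0 + 1, size - 1)
    fx, fy = u - x0, v - y0
    # Interpolate along x on the two rows, then along y between the rows.
    top = tex[y0][x0] * (1.0 - fx) + tex[y0][x1] * fx
    bottom = tex[y1][x0] * (1.0 - fx) + tex[y1][x1] * fx
    return top * (1.0 - fy) + bottom * fy
```

Under these assumptions a 4x4 base map yields three maps (4x4, 2x2 and 1x1), and each bilinear lookup costs four texel reads from a single map.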
Trilinear filtering is a refined filtering technique that takes filtering into the third dimension. With trilinear filtering, the resulting pixel is averaged from the four surrounding texels from each of the two nearest MIP maps. Trilinear filtering results in an improved visual quality of texture mapping, but requires eight memory reads per pixel, instead of the four memory reads for bilinear filtering, as well as a calculation to determine which MIP maps to read from. Performing this calculation accurately is very expensive. The calculation comprises computing a Level of Detail (LOD) wherein
$$\mathrm{Rho} = \max\left(\sqrt{\left(\frac{du}{dx}\right)^2 + \left(\frac{dv}{dx}\right)^2},\ \sqrt{\left(\frac{du}{dy}\right)^2 + \left(\frac{dv}{dy}\right)^2}\right),$$
and
$$\mathrm{LOD} = \log_2 \mathrm{Rho}.$$
When simplifying to avoid taking a square root, the equations become,
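$$\mathrm{Rho}' = (\mathrm{Rho})^2 = \max\left[\left(\frac{du}{dx}\right)^2 + \left(\frac{dv}{dx}\right)^2,\ \left(\frac{du}{dy}\right)^2 + \left(\frac{dv}{dy}\right)^2\right],$$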
and
$$\mathrm{LOD} = \tfrac{1}{2} \log_2 \mathrm{Rho}'.$$
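The two forms of the LOD calculation can be compared with a short Python sketch. The function names are hypothetical, and the sketch assumes that at least one gradient is nonzero so that the logarithm is defined. The helper trilinear_levels is likewise an assumption about how an LOD value might be turned into the two nearest MIP maps and a blend weight; the text does not specify that step.

```python
import math


def lod_with_sqrt(du_dx, dv_dx, du_dy, dv_dy):
    """LOD = log2(Rho), with Rho the larger of the two gradient lengths."""
    rho = max(math.hypot(du_dx, dv_dx), math.hypot(du_dy, dv_dy))
    return math.log2(rho)


def lod_without_sqrt(du_dx, dv_dx, du_dy, dv_dy):
    """LOD = 1/2 * log2(Rho'), with Rho' = Rho squared (no square root)."""
    rho_sq = max(du_dx ** 2 + dv_dx ** 2, du_dy ** 2 + dv_dy ** 2)
    return 0.5 * math.log2(rho_sq)


def trilinear_levels(lod, num_levels):
    """Return (lower level, upper level, blend weight toward the upper level).

    Assumption: level 0 is the base map and the LOD is clamped to the
    range of available levels.
    """
    lod = min(max(lod, 0.0), num_levels - 1.0)
    lower = int(lod)  # floor, since lod is non-negative here
    upper = min(lower + 1, num_levels - 1)
    return lower, upper, lod - lower
```

Both functions return the same value because the logarithm of a square is twice the logarithm, and the maximum of two non-negative numbers is preserved by squaring. Only the second form avoids the square root.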
To accurately calculate Rho′ at each pixel, multipliers and adders are used to calculate du/dx, dv/dx, du/dy, and dv/dy. Additional multipliers and adders are used to calculate the square of each of these values. In a system with a tremendous amount of processing capability, the cost of performing four additional memory reads may not limit trilinear filtering. In an environment with less processing power, such as a personal computing environment, however, trilinear filtering may not be implemented without affecting performance. An improved, cost-effective method of performing trilinear filtering that does not affect performance is therefore extremely desirable.
A method and apparatus for per pixel MIP mapping and trilinear filtering are provided in which the performance of trilinear filtering is improved by reducing the number of computations performed in rendering graphics, namely by computing certain terms only at the beginning of each scanline. In one embodiment, a scanline gradient is calculated once at the beginning of each scanline for each of two texture values with respect to the x-coordinate of the scanline. Following the scanline gradient calculations at the beginning of each scanline, a pixel gradient is calculated for each pixel of the scanline with respect to the y-coordinate of the scanline. The sum of the squares of the scanline gradients is compared with the sum of the squares of the pixel gradients, and the larger of the two quantities is selected to be a maximum Rho constant term for the corresponding pixel. The maximum Rho constant is used to calculate a Level of Detail (LOD) for each pixel of the scanline. The LOD value for each pixel is used to select a texture map for rendering the corresponding pixel.
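One way such a per-scanline saving can arise is sketched below in Python. The sketch is an illustration under stated assumptions, not the disclosed implementation: it assumes perspective-correct texturing in which u = U/Q and v = V/Q, where U, V and Q are each linear in the screen coordinates x and y. Under that assumption the numerator of du/dx, namely U.dx*Q - Q.dx*U, does not depend on x, so it can be computed once per scanline, while the numerator of du/dy must be evaluated per pixel. The common denominator Q squared is folded into the logarithm. The names Plane and scanline_lods are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Plane:
    """Hypothetical linear screen-space function f(x, y) = dx*x + dy*y + c."""
    dx: float
    dy: float
    c: float

    def at(self, x, y):
        return self.dx * x + self.dy * y + self.c


def scanline_lods(U, V, Q, y, x_start, x_end):
    """Return one LOD per pixel for x in [x_start, x_end) on scanline y.

    Assumes Q > 0 and a nonzero gradient at every pixel.
    """
    # Scanline gradient terms (x-direction numerators): constant along the
    # scanline, so they are computed once before the pixel loop.
    sx_u = U.dx * Q.at(0, y) - Q.dx * U.at(0, y)
    sx_v = V.dx * Q.at(0, y) - Q.dx * V.at(0, y)
    scan_sq = sx_u ** 2 + sx_v ** 2

    lods = []
    for x in range(x_start, x_end):
        q = Q.at(x, y)
        # Pixel gradient terms (y-direction numerators): vary with x.
        px_u = U.dy * q - Q.dy * U.at(x, y)
        px_v = V.dy * q - Q.dy * V.at(x, y)
        rho_term = max(scan_sq, px_u ** 2 + px_v ** 2)
        # Rho' = rho_term / q**4, so LOD = 1/2*log2(rho_term) - 2*log2(q).
        lods.append(0.5 * math.log2(rho_term) - 2.0 * math.log2(q))
    return lods
```

In this sketch the two multiplications and the squaring for the x-direction terms are hoisted out of the pixel loop, which is the kind of saving the summary describes.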
In an alternate embodiment, a scanline gradient is calculated once at the beginning of each scanline for each of two texture values. Following the scanline gradient calculations at the beginning of each scanline, a pixel gradient is calculated for each of two texture values for a first pixel of the scanline with respect to the y-coordinate of the scanline. Derivatives of the pixel gradients are calculated, and the pixel gradients for subsequent pixels are found using these derivatives, thereby eliminating the full calculation of a pixel gradient at each pixel. The sum of the squares of the scanline gradients is compared with the sum of the squares of the pixel gradients, and the larger of the two quantities is selected to be a maximum Rho constant term for the corresponding pixel. The maximum Rho constant is used to calculate a LOD, and the LOD value for each pixel is used to select a texture map for rendering the corresponding pixel.