Computer graphics systems are commonly used to display graphical representations of objects on a two-dimensional computer display screen. Current computer graphics systems can provide highly detailed representations and are used in a variety of applications.
In typical computer graphics systems, a certain amount of memory is allocated to the graphics display. That available memory, for example frame buffer memory, is partitioned into a series of sections depending upon the display requirement, such as, for example, 640×480 bytes or 800×600 bytes. In this example, the display configuration can include a display buffer, a Z (Depth) buffer, a composition buffer, a video buffer, and a plurality of texture buffers.
The aforementioned display systems generally provide a fixed amount of frame buffer memory for the display, as mentioned above. However, they allot differing amounts of memory to different buffers, depending on the display configuration desired. In the simplest implementation, frame buffer memory is typically accessed in a linear fashion, whereby the memory is partitioned into a linear stream with each memory location, or byte, treated as equivalent to any other.
To illustrate, refer now to FIG. 1, which illustratively shows a 2 megabyte (MB) section of memory 11 partitioned into sections. The first 786 Kbytes of memory section 11 are arranged in a 1024×768 byte matrix allocated to a display buffer 12. Next, a 640×480 byte section of memory 11 is depicted as allocated to a Z buffer 14 and represented as 308 Kbytes in memory 11. Next, another 640×480 byte section represented as 308 Kbytes is allocated to a composition buffer 16. Similarly, video buffer 17, and texture buffers 18, 19, and 21 are allocated sections of memory 11.
In the memory section 11 depicted in FIG. 1, each buffer, i.e., display buffer 12, Z buffer 14, composition buffer 16, etc., has a particular byte stride (BS) associated therewith. For example, the byte stride of display buffer 12 is 1024 while the byte stride of composition buffer 16 is 640.
Referring to display buffer 12, the first 1024 bytes of memory make up the first line of display buffer 12. The 1025th byte of display buffer 12 begins the second line, and so on, with each byte of memory considered equivalent to any other. Section 12 is considered to have a stride of 1024 bytes. To define a particular memory location, i.e., to transform an x,y address to a linear address (LA), the following formula can be used: (x, y) → Linear Address (LA) = y · Byte Stride (BS) + x + Base Address (BA).
The base address is the starting location for the particular section of allocated memory. The base address for section 12 is "0", while the base address for section 16 is 1093632, that is, 786432 + 307200 = 1093632. For example, section 16 is a total of 307200 bytes with a dimension of 640 bytes × 480 bytes. The "x" dimension is 640 and the "y" dimension is 480. Assuming the memory location desired to be accessed is pixel (3,2), corresponding to "x" dimension 3 and "y" dimension 2, plugging the numbers into the formula gives LA = 2 ("y" dimension) · 640 (byte stride) + 3 ("x" dimension) + 1093632 (base address) = 1094915. Therefore, an x,y location of 3,2 translates into a linear address of 1094915, meaning that byte 1094915 is the byte that would be addressed if one desired to reach the particular byte represented by pixel 3,2 in section 16.
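The address calculation above can be sketched in a few lines of Python. This is only an illustration of the formula, not part of any described system; the function and variable names are assumptions introduced here, and the buffer sizes are those given for FIG. 1.

```python
# Buffer sizes taken from the FIG. 1 description; all names are illustrative.
DISPLAY_BYTES = 1024 * 768   # display buffer 12: 786432 bytes
Z_BYTES = 640 * 480          # Z buffer 14: 307200 bytes


def linear_address(x, y, byte_stride, base_address):
    """Map an (x, y) location to a linear byte address: LA = y * BS + x + BA."""
    return y * byte_stride + x + base_address


# Composition buffer 16 begins immediately after the display and Z buffers.
composition_base = DISPLAY_BYTES + Z_BYTES

la = linear_address(x=3, y=2, byte_stride=640, base_address=composition_base)
print(composition_base, la)  # 1093632 1094915
```

The same function applies to any buffer in the partition; only the byte stride and base address change from one buffer to the next.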
In actuality, however, each byte of memory is not equivalent. Because of the underlying memory architecture, it is usually the case that some memory accesses are less costly than others. The underlying memory is accessed using a (row, column) address, and the address is specified on a multiplexed address bus. If two accesses have the same row address, it is not necessary to specify the row address for the second access, because it was already supplied for the first access. If the second access has a different row address, then both the row and column address must be specified.
In computer graphics, there is a high degree of 2-dimensional locality to the accesses to the memory. That is, given a first access, the next access is likely to be close in either the X or Y dimension.
Designers of high performance computer graphics devices take advantage of these two facts when designing memory controllers. They will ensure that the memory is organized in such a way that subsequent accesses in both the X and Y direction will have a high likelihood of having the same row address (same page), thereby allowing subsequent accesses at a lower cost.
Referring now to FIG. 2A, the aforementioned concept is illustrated. FIG. 2A is a schematic view illustrating the position of primitives and pixels with respect to the memory organization discussed above. When graphical images are displayed on a screen, the building blocks of all images are called primitives. Primitives are typically triangular or square in shape.
Primitive 33 is shown as residing completely within page 31a, which represents a single page of memory. There is a high probability that successive pixels, e.g., pixel 34, drawn after first pixel 32, will fall on the same page of memory (31a), thereby requiring fewer system resources to draw pixel 34 than when memory is arranged and accessed as in FIG. 2B.
Primitive 39 is depicted as overlapping pages 31b and 31d. Assume that pixel 37 is drawn first. Even though primitive 39 overlaps pages 31b and 31d of memory, there is still a high probability that a successive pixel, 36, will be drawn to the same page, 31b, as pixel 37.
Referring now to FIG. 2B, shown are the positions of primitives and pixels on multiple pages of the memory of FIG. 1. FIG. 2B illustrates 29 rows, or pages, of memory 11. Each page of memory 11 contains 1024 bytes of memory organized linearly. Primitives 33, 38 and 39 are shown to illustrate the concept of high resource mapping. Each primitive is made up of a plurality of pixels. Assuming that pixel 32 in primitive 33 is drawn first, the next pixel drawn (35) is likely to be drawn to a different page, or line, of memory when the memory is linearly organized as in FIG. 2B. In FIG. 2B, because the memory is organized linearly, a successive pixel drawn to each primitive is likely to be on a different page, thus requiring increased system resources to draw.
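The difference between the two organizations can be illustrated with a rough page-change count. The sketch below is not taken from the figures: the 32 × 32 byte tile shape, the one-byte pixels, the 8 × 8 block standing in for a primitive, and all function names are assumptions chosen only so that one tile fills one 1024-byte page.

```python
PAGE_BYTES = 1024         # page size stated for FIG. 2B
STRIDE = 1024             # byte stride of the display buffer
TILE_W = TILE_H = 32      # ASSUMPTION: one 32 x 32 byte tile fills one page


def linear_page(x, y):
    """Page holding (x, y) when memory is laid out line by line."""
    return (y * STRIDE + x) // PAGE_BYTES


def tiled_page(x, y):
    """Page holding (x, y) when each page covers a rectangular tile (assumed layout)."""
    tiles_per_row = STRIDE // TILE_W
    return (y // TILE_H) * tiles_per_row + (x // TILE_W)


def page_changes(pixels, page_of):
    """Count how often consecutive accesses land on a different page."""
    pages = [page_of(x, y) for x, y in pixels]
    return sum(1 for a, b in zip(pages, pages[1:]) if a != b)


# Hypothetical primitive: an 8 x 8 block of one-byte pixels drawn row by row.
block = [(x, y) for y in range(100, 108) for x in range(100, 108)]

print(page_changes(block, linear_page))  # 7
print(page_changes(block, tiled_page))   # 0
```

Under these assumptions, every new scan line of the block forces a page change in the linear layout, while the whole block stays within a single page in the tiled layout. A primitive that straddles a tile boundary would still incur some page changes, as with primitive 39.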
The high performance memory access mechanism as depicted in FIG. 2A, however, requires that a graphics memory controller know the (X, Y) address of the particular location in memory desired to be accessed. The linear access mechanism described earlier provides only a single linear address, not an (X, Y) address.
This arrangement would not be a problem except that industry personal computer (PC) standards, such as the DIRECT X® application programming interface (API) (DIRECT X is a registered trademark of Microsoft Corporation of Redmond, Wash.), require the use of a memory access mechanism that is linear byte based. That is, the PC market requires a base address and a byte stride value in order to access memory. The base address is the location in memory that denotes the start of a particular section, and the byte stride is the number of bytes arranged linearly, or in the x direction, in a particular section of memory.
One way to comply is to assign a single stride to the entire frame buffer memory and arrange the pages in rectangular fashion.
Referring now to FIG. 3, shown are the buffers of FIG. 1 arranged in a rectangular 2-dimensional array. Rectangular memory section 41 illustrates a compromise solution that treats the whole memory as one large 2D array 41 with a single byte stride of 1024 bytes. With this solution, the pages of memory can be preconfigured to take advantage of the 2D locality of rendered primitives. Flexibility requirements are met in that any rectangle (e.g., display buffer 12, Z (depth) buffer 14, composition buffer 16, etc.) can be carved out of the single large rectangle to be used for any purpose. However, the disadvantage of this solution is that some areas, or sections, of the memory can become unusable. These unusable sections are illustrated as area 42. For example, there may be physically enough memory in area 42 to allocate an additional buffer, but the available memory locations are scattered throughout the memory and cannot be encompassed in a rectangle of the desired size with any amount of reorganization. Furthermore, this arrangement sacrifices the linear allocation required by an API such as the DIRECT X® API.
Referring now to FIGS. 4A and 4B, shown are schematic views of a 640 byte wide buffer 43 and an 800 byte wide buffer 47. FIGS. 4A and 4B illustrate the concept that a linear address of a particular memory location will correspond to a different line and column location in buffers having different byte stride values. Specifically, as shown in FIG. 4A, linear address location 2600 will correspond to line 4, column 40 in the 640 byte wide (byte stride=640) buffer 43, while linear address 2600 will correspond to line 3, column 200 in the 800 byte wide (byte stride=800) buffer 47 shown in FIG. 4B.
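The line and column values quoted for FIGS. 4A and 4B follow from inverting the linear address formula with an integer division. The short Python sketch below shows that arithmetic; the function name is an assumption introduced for illustration, and lines and columns are counted from zero, which matches the figures' values.

```python
def line_and_column(linear_address, byte_stride, base_address=0):
    """Invert LA = y * BS + x + BA, returning (line, column), both zero-based."""
    return divmod(linear_address - base_address, byte_stride)


print(line_and_column(2600, 640))  # (4, 40)
print(line_and_column(2600, 800))  # (3, 200)
```

The same linear address therefore cannot be converted to an (X, Y) location without knowing the byte stride of the buffer it falls in.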
Because the API requires that the appearance of linear access to memory be maintained, it would be desirable to provide the appearance of linear memory addressing while taking advantage of the 2-dimensionality of graphics rendering systems using rectangular memory addressing.