1. Field of the Invention
The present invention relates, in general, to a method and system to be utilized in data processing systems. In particular, the present invention relates to a method and system to be utilized in data processing systems wherein virtual memory address to physical memory address translation is done. Yet still more particularly, the present invention relates to a method and system to be utilized in data processing systems, wherein virtual memory address to physical memory address translation is done, such as data processing systems utilizing the Accelerated Graphics Port (AGP) interface standard.
2. Description of the Related Art
Data processing systems are systems that manipulate, process, and store data and are well known within the art. Personal computer systems, and their associated subsystems, constitute well known species of data processing systems. Personal computer systems in general and IBM compatible personal computer systems in particular have attained widespread use for providing computer power to many segments of today's modern society. A personal computer system can usually be defined as a desk top, floor standing, or portable microcomputer that includes a system unit including but not limited to a system processor and associated volatile and non-volatile memory, a display device, a keyboard, one or more diskette drives, one or more fixed disk storage devices, and one or more data buses for communications between devices. One of the distinguishing characteristics of these systems is the use of a system board to electrically connect these components together. These personal computer systems are information handling systems which are designed primarily to give independent computing power to a single user (or a relatively small group of users in the case of personal computers which serve as computer server systems) and are inexpensively priced for purchase by individuals or small businesses.
A computer system or data-processing system typically includes a system bus. Attached to the system bus are various devices that may communicate locally with each other over the system bus. For example, a typical computer system includes a system bus to which a central processing unit (CPU) is attached and over which the CPU communicates directly with a system memory that is also attached to the system bus.
In addition, the computer system may include a peripheral bus for connecting certain highly integrated peripheral components to the CPU. One such peripheral bus is known as the Peripheral Component Interconnect (PCI) bus. Under the PCI bus standard, peripheral components can directly connect to a PCI bus without the need for glue logic. Thus, PCI is designed to provide a bus standard on which high-performance peripheral devices, such as graphics devices and hard disk drives, can be coupled to the CPU, thereby permitting these high-performance peripheral devices to avoid the general access latency and the bandwidth constraints that would have occurred if these peripheral devices were connected to a low speed peripheral bus. Details on the PCI local bus standard can be obtained from the PCI Bus Specification, Revision 2.1, from the PCI Special Interest Group, which is hereby incorporated by reference in its entirety.
Relatively recently, techniques for rendering three-dimensional (3D) continuous-animation graphics have been implemented within PCs which, as will be explained below, have exposed limitations in the originally high performance of the PCI bus. For example, the AGP interface standard has been developed both to (1) reduce the load on PCI bus systems, and (2) extend the capabilities of systems to include the ability to provide 3D continuous-animation graphics with a level of quality previously found only on high-end computer workstations. The AGP interface standard is defined by the following document: Intel Corporation, Accelerated Graphics Port Interface Specification, Revision 1.0 (Jul. 31, 1996), which is hereby incorporated by reference in its entirety.
The AGP interface standard is specifically targeted to improve the efficiency of 3D continuous-animation graphics applications which utilize a technique known in the art as "texturing." Consequently, as background for understanding the data processing systems utilizing the AGP interface standard, it is helpful to have a brief overview of the data processing needs of 3D continuous-animation graphics applications which utilize texturing, how they degrade the performance of PCI local bus systems, and how the AGP interface standard remedies this degradation of performance.
The display device of a computing system displays data in two dimensions (2D). In order to create a 3D continuous-animation graphical display, it is first necessary to create an object such that when the object is presented on the 2D display device, the object will be perceived by a human viewer as a 3D object. There are two basic ways in which this can be done. The first way is to use color and shading techniques to trick the human visual system into perceiving 3D objects on the 2D display device (essentially the same technique used by human artists when creating what appear to be 3D landscapes consisting of trees, rocks, streams, etc., on 2D canvases). This is a very powerful technique and creates superior 3D realism. The second way is to use mutually perpendicular lines (e.g., the well-known x, y, z coordinate system) to create geometric objects which will be interpreted by the human visual system as denoting 3D (essentially the same technique used by human architects to create the illusion of 3D in perspective view architectural drawings). However, the 3D illusion created by the use of mutually perpendicular lines is generally perceived to be inferior to that produced by the coloring and shading techniques.
Subsequent to creating a 3D object, the object must be animated. Animation is the creation of the illusion of continuous motion by the rapid sequential presentation of discrete images, or frames, upon the 2D display device. Animated 3D computer graphics are generated by taking advantage of a well known physiological property of the human visual system, which is that if a person is shown, within one second, a sequence of 15 discrete snapshots of a continuous motion, where each snapshot was taken at 1/15 second intervals, the brain will integrate the sequence together such that the person will "see," or perceive, continuous motion. However, due to person-to-person variations in physiology, it has been found empirically that a presentation of 20 images per second is generally the minimum rate at which the majority of people will perceive continuous motion without flicker, with 30 images per second tending to be accepted as the optimal presentation speed.
The difficulty with 3D continuous-animation computer graphics is that while the color and shading techniques (which are typically accomplished via bit-mapped images) produce superior 3D realism, such techniques are not easy for a computer to translate through geometric space for the creation of the continuously varying sequential images necessary to produce the animation effect. On the other hand, the geometric shapes produced via the use of mutually perpendicular lines allow for easy computer manipulation in three dimensions, which allows the creation of the sequential images necessary to produce the animation effect, but such geometric shapes result in inferior 3D realism. Recent 3D continuous-animation computer graphics techniques take advantage of both of the foregoing noted 3D techniques via the use of a middle ground approach known in the art as "texturing."
In the use of texturing, the gross, overall structures of an object are denoted by a 3D geometric shape which is used to do geometric translation in three space, while the finer details of each side of the 3D object are denoted by bit-mapped images (known in the art as "textures") which accomplish the color and shading techniques. Each time a new image of an object is needed for animation, the geometric representation is pulled from computer memory into a CPU, and the appropriate translations calculated. Thereafter, the translated geometric representation is cached and the appropriate bit-mapped images are pulled from computer memory into the CPU and transformed as appropriate to the new geometric translations so as to give the correct appearance from the viewpoint of the display device, the new geometric position, and any lighting sources and/or other objects that may be present within the image to be presented. Thereafter, a device known as the graphics controller, which is responsible for creating and presenting frames (one complete computer screen) of data, retrieves both the translated geometric object data and transformed texture data, "paints" the surfaces of the geometric object with the texture data, and places the resultant object into frame buffer memory (a storage device local to the graphics controller wherein each individual frame is built before it is sent to the 2D display device). It is to be understood that the foregoing noted series of translations/transformations is done for each animated object to be displayed.
It is primarily the technique of texturing which has exposed the performance limitations of PCI bus systems. It has been found that when an attempt is made to implement a 3D continuous-animation computer graphics application wherein texturing is utilized within PCI bus systems, the texturing data results in effective monopolization of the PCI bus by the application, unless expensive memory is added to the graphics controller. That is, texturing using the PCI bus is possible. However, due to PCI bandwidth limitations, the textures must fit into the memory directly connected to the graphics card. Since there is a direct correlation between the size of textures and the realism of the scene, quality can only be achieved by adding memory to the graphics card/controller. It was this realization that prompted the development of the AGP interface specification: with the AGP interface standard, texture size can be increased using available system memory. The AGP interface standard is intended to remedy the exposed limitations of the PCI local bus systems by providing extended capabilities to PCI bus systems for performing 3D continuous-animation computer graphics, as will become clear in the following detailed description.
The AGP interface standard dictates supported functions and interface standards for AGP-enabled devices; it does not dictate the internal device logic whereby the supported functions are to be implemented, but rather leaves the internal details to the discretion of system designers.
One of the primary extended capabilities provided by the AGP interface standard is that the AGP interface standard gives graphics applications the ability to access system memory without utilizing the PCI bus, thereby alleviating the limitations of PCI bus systems. The AGP interface standard provides a new separate pathway for data transfer between the graphics controller and memory. This pathway is also used for CPU to graphics controller data flow. The AGP interface standard also provides a re-mapping function for addresses that fall in a defined range. This capability enables graphics devices and the CPU to utilize a contiguous address range for the memory allocated by the operating system for graphics data and textures.
As mentioned, the AGP interface standard does not dictate the internal logic of AGP devices. One such device where the internal logic is not dictated is the Graphics Address Re-Mapping Table (GART) system. The GART system is to provide virtual memory address to physical memory address translation in order to allow devices such as AGP-enabled graphics controllers to treat AGP memory as if it were one contiguous area in memory, when in fact it may consist of many discontiguous areas of physical memory.
While the internal logic of the GART system is not defined under the AGP standard, it is likely that the GART system will be implemented by most manufacturers as some variant of the Intel Corporation's x86 CPU Paging Mechanism, which is a 32-bit virtual to physical address translation accomplished through two levels of look up tables: one (the directory) provides a pointer to the location of the base of the appropriate page table, the other (the page table) provides a pointer to the base of a 4 kbyte contiguous physical memory location corresponding to a 4 kbyte page, from which the offset can be used to reach the desired location within the page.
Referring now to FIG. 1, shown is that in the x86 CPU Paging Mechanism the translation is accomplished as follows: (1) 32-bit Virtual Memory Linear Address 100 is decomposed into directory index field 102, page table index field 104, and byte offset field 106; (2) the uppermost 10 bits of 32-bit Virtual Memory Linear Address 100 are used with directory base 101 to form directory index 103, an index into the first or Directory Level table 108 (these 10 bits select one of the 1024 4-byte page directory entries from this first level table whose base is stored in a CPU register); (3) the contents of the 4-byte page directory entry at directory index 103 include a pointer 110 to the base 112 of the level 2 or page table 114; (4) the next 10 bits (page table index field 104) of 32-bit Virtual Memory Linear Address 100 are used with page table base 112 to form page table index 116 to one of the 1024 4-byte entries in page table 114; (5) the contents of the 4-byte entry at page table index 116 include a physical memory pointer 118 to base 120 of 4 kbyte memory page 122; and (6) the lower 12 bits 106 of 32-bit Virtual Memory Linear Address 100 are used with page base address 120 of 4 kbyte memory page 122 to form physical memory address 126, wherein is contained specific byte of data 128.
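By way of illustration only, the six steps described above can be sketched in software as follows. This is a minimal sketch and not the hardware mechanism itself: the function names, the use of dictionaries to stand in for the directory and page tables, and the example addresses are all assumptions introduced here for clarity.

```python
# Illustrative sketch of a two-level, x86-style translation.
# All names and table contents are hypothetical.

def split_linear_address(addr):
    """Decompose a 32-bit linear address into its three fields."""
    directory_index = (addr >> 22) & 0x3FF  # uppermost 10 bits
    table_index = (addr >> 12) & 0x3FF      # next 10 bits
    offset = addr & 0xFFF                   # lower 12 bits (byte within 4 kbyte page)
    return directory_index, table_index, offset


def page_table_walk(addr, directory, page_tables):
    """Translate a linear address by reading the directory, then the page table.

    `directory` maps a directory index to a page table identifier;
    `page_tables` maps that identifier to {page table index: page base address}.
    """
    directory_index, table_index, offset = split_linear_address(addr)
    table_id = directory[directory_index]            # first-level lookup
    page_base = page_tables[table_id][table_index]   # second-level lookup
    return page_base + offset                        # base of page plus byte offset


# Hypothetical tables: directory entry 1 points at one page table,
# whose entry 3 points at a 4 kbyte page based at 0x0009A000.
directory = {1: "pt_a"}
page_tables = {"pt_a": {3: 0x0009A000}}
physical = page_table_walk(0x00403004, directory, page_tables)
```

In this sketch, each translation costs two table reads before the data itself can be accessed, which is the overhead that caching of translations seeks to avoid.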
After the foregoing page table lookup has been done once for a particular 32-bit Virtual Memory Linear Address 100, the additional cycles required to read the directory and page table entries from memory are typically eliminated by caching within the CPU the referenced translation data. A special cache called a translation look aside buffer (TLB) is used for this purpose. However, design constraints on the TLB restrict the number of entries in the TLB, and thus limit the total amount of virtual memory which can be accessed without incurring the penalty of a page table walk (i.e., the term of art for the FIG. 1 process of reading the page directory and then the page table in order to translate a virtual address to a physical address). These page table walks represent processing overhead, and it is desired to minimize them.
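The effect of a limited number of TLB entries can be illustrated with the following minimal sketch. It assumes a fixed-capacity cache keyed by virtual page number with least-recently-used replacement; the replacement policy, the class name, and the `walk` callback are assumptions made here for illustration and are not taken from any particular CPU design.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # 4 kbyte pages


class SimpleTLB:
    """Hypothetical per-page translation cache with LRU replacement."""

    def __init__(self, capacity, walk):
        self.capacity = capacity      # maximum number of cached translations
        self.walk = walk              # callable: virtual page number -> page base
        self.entries = OrderedDict()  # virtual page number -> page base
        self.walks = 0                # number of page table walks performed

    def translate(self, addr):
        vpn = addr >> PAGE_SHIFT
        offset = addr & 0xFFF
        if vpn in self.entries:
            self.entries.move_to_end(vpn)         # hit: mark as recently used
        else:
            self.walks += 1                       # miss: pay for a page table walk
            self.entries[vpn] = self.walk(vpn)
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used entry
        return self.entries[vpn] + offset
```

With one entry per 4 kbyte page, the amount of virtual memory reachable without a walk is at most the number of entries multiplied by 4 kbytes, so a working set spanning more pages than the cache holds incurs repeated walks.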
If some variant of the x86 CPU Paging Mechanism is utilized to implement the virtual to physical address translation within the GART system, the same processing overhead represented by the page table walks will show itself. Given the computational and memory intensive nature of 3D continuous-animation computer graphics as described above, it is important that such processing overhead be effectively minimized. It is therefore apparent that a need exists for a method and system which will substantially minimize the processing overhead associated with page table walks taken by any virtual memory address to physical memory address translation mechanism employed within the GART system.
It has been discovered that a method and system can be produced which will substantially minimize the processing overhead associated with page table walks, especially in the context of data processing systems utilizing the Accelerated Graphics Port (AGP) interface standard. In the method and system, a request to access a first virtual memory address, correspondent to a first physical memory location resident within a first page of physical memory, is received. In response to the request to access the first virtual memory address, a Graphics Translation Look Aside Buffer entry is created. In response to a request to access a second virtual memory address, correspondent to a second physical memory address resident within a second physical memory area non-overlapping with the first physical memory page, the second physical memory location is accessed via the Graphics Translation Look Aside Buffer entry. The Graphics Translation Look Aside Buffer entry is constructed such that it translates a number of virtual memory addresses corresponding to a number of physical memory addresses. The construction of the Graphics Translation Look Aside Buffer entry is achieved by translating the first virtual memory address to a first physical memory address, obtaining a size of a contiguous graphics physical memory block containing the first physical memory address, and associating a range of virtual memory addresses with the obtained contiguous graphics physical memory block.
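The three construction steps summarized above can be sketched as follows, purely for illustration. The entry layout, the `walk` and `find_block` callables, and the assumption that the virtual range maps linearly onto the contiguous physical block are all hypothetical and are introduced here only to make the idea concrete; they do not limit how the entry may actually be implemented.

```python
from dataclasses import dataclass


@dataclass
class GtlbEntry:
    """Hypothetical entry covering a whole contiguous block, not a single page."""
    virt_base: int  # first virtual address covered by the entry
    phys_base: int  # base of the contiguous physical block
    size: int       # size of the block in bytes

    def covers(self, virt_addr):
        return self.virt_base <= virt_addr < self.virt_base + self.size

    def translate(self, virt_addr):
        # Assumes a linear mapping across the block.
        return self.phys_base + (virt_addr - self.virt_base)


def build_entry(virt_addr, walk, find_block):
    """Create an entry in response to a first access.

    `walk` translates one virtual address (e.g., via a page table walk);
    `find_block` returns (base, size) of the contiguous physical block
    containing a given physical address. Both are assumed helpers.
    """
    phys_addr = walk(virt_addr)                       # translate the first address
    block_base, block_size = find_block(phys_addr)    # obtain the block size
    virt_base = virt_addr - (phys_addr - block_base)  # associate a virtual range
    return GtlbEntry(virt_base, block_base, block_size)


# Hypothetical example: a 16 kbyte contiguous block spanning four 4 kbyte pages.
walk = lambda v: v - 0x10000000 + 0x00400000
find_block = lambda p: (0x00400000, 0x4000)
entry = build_entry(0x10001010, walk, find_block)
```

In this sketch, a later access to 0x10003008 lies in a different 4 kbyte page from the first access, yet it is covered by the same entry and so is translated without a further page table walk.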
The foregoing summary is illustrative and is intended to be in no way limiting. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.