1. Field of the Invention
The present invention relates, in general, to a method and system to be utilized in data processing systems. In particular, the present invention relates to a method and system to be utilized in data processing systems wherein a faster device communicates with a slower device, such as the non-limiting example of data processing systems wherein the Accelerated Graphics Port (AGP) interface standard is utilized.
2. Description of the Related Art
Data processing systems are systems that manipulate, process, and store data and are notorious within the art. Personal computer systems, and their associated subsystems, constitute well known species of data processing systems. Personal computer systems in general and IBM compatible personal computer systems in particular have attained widespread use for providing computer power to many segments of today's modern society. A personal computer system can usually be defined as a desk top, floor standing, or portable microcomputer that includes a system unit including but not limited to a system processor and associated volatile and non-volatile memory, a display device, a keyboard, one or more diskette drives, one or more fixed disk storage devices, and one or more data buses for communications between devices. One of the distinguishing characteristics of these systems is the use of a system board to electrically connect these components together. These personal computer systems are information handling systems which are designed primarily to give independent computing power to a single user (or a relatively small group of users in the case of personal computers which serve as computer server systems) and are inexpensively priced for purchase by individuals or small businesses.
A computer system or data-processing system typically includes a system bus. Attached to the system bus are various devices that may communicate locally with each other over the system bus. For example, a typical computer system includes a system bus to which a central processing unit (CPU) is attached and over which the CPU communicates directly with a system memory that is also attached to the system bus.
In addition, the computer system may include a peripheral bus for connecting certain highly integrated peripheral components to the CPU. One such peripheral bus is known as the Peripheral Component Interconnect (PCI) bus. Under the PCI bus standard, peripheral components can directly connect to a PCI bus without the need for glue logic. Thus, PCI is designed to provide a bus standard on which high-performance peripheral devices, such as graphics devices and hard disk drives, can be coupled to the CPU, thereby permitting these high-performance peripheral devices to avoid the general access latency and the band-width constraints that would have occurred if these peripheral devices were connected to a low speed peripheral bus. Details on the PCI local bus standard can be obtained under the PCI Bus Specification, Revision 2.1, from the PCI Special Interest Group, which is hereby incorporated by reference in its entirety.
Relatively recently, techniques for rendering three-dimensional (3D) continuous-animation graphics have been implemented within PCs which, as will be explained below, have exposed limitations in the originally high performance of the PCI bus. The AGP interface standard has been developed to both (1) reduce the load on the PCI bus systems, and (2) extend the capabilities of systems to include the ability to provide 3D continuous-animation graphics with a level of quality previously found only on high-end computer workstations. The AGP interface standard is defined by the following document: Intel Corporation, Accelerated Graphics Port Interface Specification, Revision 1.0 (Jul. 31, 1996), which is hereby incorporated by reference in its entirety.
The AGP interface standard is specifically targeted to improve the efficiency of 3D continuous-animation graphics applications which utilize a technique know in the art as "texturing." Consequently, as background for understanding the data processing systems utilizing the AGP interface standard, it is helpful to have a brief overview of the data processing needs of 3D continuous animation graphics applications which utilize texturing, how they degrade the performance of PCI local bus systems, and how the AGP interface standard remedy this degradation of performance.
The display device of a computing system displays data in two-dimensions (2D). In order to create a 3D continuous animation graphical display, it is first necessary to create an object such that when the object is presented on the 2D display device, the object will be perceived by a human viewer as a 3D object. There are two basic ways in which this can be done. The first way is to use color and shading techniques to trick the human visual system into perceiving 3D objects on the 2D display device (essentially the same technique used by human artists when creating what appear to be 3D landscapes consisting of trees, rocks, streams, etc., on 2D canvases). This is a very powerful technique and creates superior 3D realism. The second way is to use mutually perpendicular lines (e.g., the well-known x, y, z coordinate system) to create geometric objects which will be interpreted by the human visual system as denoting 3D (essentially the same technique used by human architects to create the illusion of 3D in perspective view architectural drawings). However, the 3D illusion created by the use of mutually perpendicular lines is generally perceived to be inferior to that produced by the coloring and shading techniques.
Subsequent to creating a 3D object, the object must be animated. Animation is the creation of the illusion of continuous motion by the rapid sequential presentation of discrete images, or frames, upon the 2D display device. Animated 3D computer graphics are generated by taking advantage of a well know physiological property of the human visual system which is that if a person is shown a sequence of 15 discrete snapshots of a continuous motion, where each snapshot was taken in 1/15 second intervals, within one second, the brain will integrate the sequence together such that the person will "see," or perceive, continuous motion. However, due to person-to-person variations in physiology, it has been found empirically that a presentation of 20 images per second is generally the minimum rate at which the majority of people will perceive continuous motion without flicker, with 30 images per second tending to be the accepted as the optimal presentation speed.
The difficulty with 3D continuous animation computer graphics is that while the color and shading techniques (which are typically accomplished via bit-mapped images) produce superior 3D realism, such techniques are not easy for a computer to translate through geometric space for the creation of continuously varying sequential images necessary to produce the animation effect. On the other hand, the geometric shapes produced via the use of mutually perpendicular lines allow for easy computer manipulation in three dimensions, which allows the creation of sequential images necessary to produce the animation effect, but such geometric shapes result in inferior 3D realism. Recent 3D continuous-animation computer graphics techniques take advantage of both of the foregoing noted 3D techniques via the use of a middle ground approach known in the art "texturing."
In the use of texturing, the gross, overall structures of an object are denoted by a 3D geometric shape which is used to do geometric translation in three space, while the finer details of each side of the 3D object are denoted by bit mapped images (known in the art as "textures") which accomplish the color and shading techniques. Each time a new image of an object is needed for animation, the geometric representation is pulled from computer memory into a CPU, and the appropriate translations calculated. Thereafter, the translated geometric representation is cached and the appropriate bit-mapped images are pulled from computer memory into the CPU and transformed as appropriate to the new geometric translations so as to give the correct appearance from the viewpoint of the display device, the new geometric position, and any lighting sources and/or other objects that may be present within the image to be presented. Thereafter, a device known as the graphics controller, which is responsible for creating and presenting frames (one complete computer screen) of data, retrieves both the translated geometric object data and transformed texture data, "paints" the surfaces of the geometric object with the texture data, and places the resultant object into frame buffer memory (a storage device local to the graphics controller wherein each individual frame is built before it is sent to the 2D display device). It is to be understood that the foregoing noted series of translations/transformations is done for each animated object to be displayed.
It is primarily the technique of texturing which has exposed the performance limitations of PCI bus systems. It has been found that when an attempt is made to implement 3D continuous-animation computer graphics application wherein texturing is utilized within PCI bus systems, the texturing data results in effective monopolization of the PCI bus by the application, unless expensive memory is added to the graphics controller. That is, texturing using the PCI bus is possible. However, due to PCI bandwidth limitations, the textures must fit into the memory directly connected to the graphics card. Since there is a direct correlation between the size of textures and the realism of the scene, quality can only be achieved by adding memory to the graphics card/controller. It was this realization that prompted the development of the AGP interface specification: with the AGP interface standard, texture size can be increased using available system memory. The AGP interface standard is intended to remedy the exposed limitations of the PCI local bus systems by providing extended capabilities to PCI bus systems for performing 3D continuous-animation computer graphics, as will become clear in the following detailed description.
The AGP interface standard accomplishes the foregoing via a rather indirect process. Under the AGP interface standard, a CPU independently processes the geometric and texturing data associated with each object to be displayed in a scene. Subsequent to processing the geometric and texturing data, the CPU writes the geometric and texturing data back into system memory. Thereafter, the CPU informs a graphics processor that the information is ready, and the graphics processor retrieves the information from the system memory.
It may seem as if it would be more efficient to have the CPU write the processed geometric and texturing data directly to the graphics processor, thereby avoiding the intermediate steps of writing and retrieving data from system memory. Such is not the case under the AGP standard. Under the AGP standard, serious inefficiencies are introduced when attempt is made to write data directly to an AGP device.
It has been noted that the normal AGP mode of operation is for the CPU to write processed data to system memory and thereafter to direct an AGP device to read the processed data from system memory. This is typically done because the theoretical peak efficiency of data transmission to an AGP device from system memory, via AGP interconnect through an AGP capable Northbridge, is 533 Mbytes/sec at a bus speed of 133 MHz data transfer rate (a bus speed of 66 MHz, but utilizing both rising and failing clock edges). In contrast, the theoretical peak efficiency of data transmission from the CPU writing directly to the AGP device, via AGP interconnect through an AGP capable Northbridge, is 266 Mbytes/sec at a bus speed of 66 MHz.
In actuality the practicable data transmission rate from the CPU directly to the AGP device is much lower than that noted. There are multiple reasons for this, but one of the most significant is that under the AGP standard the CPU writing directly to an AGP device uses PCI protocol. This means that the pipelined operation of the AGP interconnect is not available for CPU to AGP device direct data transmission; rather, the CPU is reduced to using PCI burst mode as its most efficient tactic for data transfer.
When the CPU writes directly to the AGP device, it writes into a temporary storage location, or "buffer," contained within the AGP device. This buffer is generally known as the command queue buffer. Because the CPU is writing to the AGP device via the utilization of PCI protocol, the CPU must "poll" (ask) the AGP device regarding the AGP device's available storage prior to the CPU writing data to the AGP device. Such polling results in significant inefficiencies, on multiple levels, a few of which will now be detailed.
A first inefficiency arises due to the fact that in order to do such polling, the CPU must communicate with the AGP device over two buses: the CPU bus connecting the CPU to the Northbridge, and the AGP interconnect bus connecting the Northbridge to the AGP device. A second inefficiency arises due to the fact that when the AGP answers back, both of the foregoing buses must be "turned around"--reconfigured so that communication is now flowing from the AGP device back to the CPU--which introduces time inefficiency. A third inefficiency arises due to the fact that the CPU is task based, so if the AGP answers back that its command queue is full, the CPU will "spin," or just cycle without doing any useful computation, until the AGP device indicates that command queue space has become available. Yet a fourth inefficiency arises from the fact that when the command queue becomes available, both buses must again be turned around such that the CPU can transfer data to the AGP device.
The foregoing noted inefficiencies arise directly from the AGP interface standard itself. It is undeniable that the AGP interface standard is highly useful and that AGP compliant devices are highly desirable. However, it is likewise clear that inefficiencies exist and arise from the AGP standard defining the acceptable manner of direct CPU to AGP device data communication. It is therefore apparent that a need exists in the art for a method and system which will substantially conform to the established AGP interface standards, yet also substantially minimize the computational inefficiencies associated with writing data directly from a CPU to an AGP compliant device.