The present invention relates generally to graphics display systems and methods, and in particular to methods and systems for interfacing a graphics system with a weakly ordered bus interface.
In many modern computer or computerized systems, the primary interface for providing information to a user is a display. Information generated by the computer system's processing unit is conveyed to the user by means of the display. As computer systems have evolved and become ever more capable, increased emphasis has been placed on the quality of the images displayed, including both the quantity of data displayed and the rate at which data on the display may change. In order to increase the quality of the images displayed without overburdening the computer system's processing unit, graphics devices, such as graphics accelerators, are often used to perform much of the work required to render a given image.
In general, the processing unit, or CPU, determines what graphical images should be displayed on a display. The graphics device, on the other hand, largely determines how to display the image. The CPU executes, or runs, a software program such as a driver. The driver creates drawing commands to be provided to the graphics device. An example of a command is a command to draw a triangle. Along with the drawing command, the driver provides information, i.e. parameters, that pertains to the command. For example, the parameters associated with the command to draw a triangle could be the vertices of the triangle, and the red, green, blue and intensity values of the triangle. The graphics device then renders the display by manipulating the drawing command and parameters to produce a visual representation, such as a rendered triangle, on the display.
As visual representations have become more realistic, the drawing commands and parameters have become more complex. For example, the drawing command for rendering a three-dimensional triangle requires numerous parameters for designating three-dimensional spatial locations for each vertex, as well as additional shading or texturing information. The large number of required parameters for commands, and the rate at which commands must be propagated from the CPU to the graphics device, place ever increasing demands on the CPU, the graphics device, and also the communication interface between the CPU and the graphics device.
The CPU generally communicates with the graphics device over a bus. In the past, and in many existing systems, the CPU and graphics device communicated over a general system bus. Due to the high volume of information required to produce high quality images, particularly dynamically textured three-dimensional images, use of the general system bus inordinately taxes system resources and, at times, provides insufficient support of data requirements for graphics displays. Accordingly, a dedicated CPU to graphics device bus is provided by some systems. A dedicated CPU to graphics device bus reduces the number of devices competing for the use of the bus and thereby provides more bandwidth or space on the bus for transferring information to the graphics device. The dedicated bus, however, does not reduce the large amount of information required to be communicated to the graphics device. Therefore, with the tremendous amount of information being communicated to the graphics device, the importance of efficient bus utilization becomes even greater.
One way to reduce the amount of data transferred from the CPU to the graphics device is to efficiently encode the command to indicate the meaning of parameters pertaining to the command. This effectively increases bus bandwidth as less than the complete set of possible parameters need be sent from the CPU to the graphics device every time the CPU issues a new command. Instead, the graphics device determines from the command the meaning of parameters provided with the command. This implies, however, that the parameters bear some predefined relationship to the command. An example of such a relationship is that the parameters be received by the graphics device in a predefined sequential order after receipt of a command.
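By way of illustration only, the encoded-command idea described above may be sketched as follows. The opcodes, parameter names and layouts are hypothetical and do not come from any particular graphics device; the sketch assumes only that each command word implies a fixed, ordered list of parameters that follow it.

```python
# Hypothetical opcodes and parameter layouts, chosen only for illustration.
PARAM_LAYOUT = {
    0x01: ("x0", "y0", "x1", "y1"),              # assumed "draw line" command
    0x02: ("x0", "y0", "x1", "y1", "x2", "y2"),  # assumed "draw triangle" command
}


def decode(words):
    """Split a flat word stream into (opcode, {name: value}) records.

    The opcode alone tells the receiver how many words follow and what
    each one means, so no per-parameter tag has to cross the bus.
    """
    records = []
    i = 0
    while i < len(words):
        opcode = words[i]
        names = PARAM_LAYOUT[opcode]
        values = words[i + 1 : i + 1 + len(names)]
        if len(values) != len(names):
            raise ValueError("truncated parameter list")
        records.append((opcode, dict(zip(names, values))))
        i += 1 + len(names)
    return records


# One line command followed by one triangle command, in sending order.
stream = [0x01, 0, 0, 10, 10, 0x02, 0, 0, 5, 0, 0, 5]
commands = decode(stream)
```

The decoder is correct only if the words arrive in the order in which they were sent, which is the predefined relationship referred to above.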
Use of write combining (WC) technology is another way to increase bus bandwidth. An example of WC technology is found in the Intel AGP chipset 440 BX. Write combining combines individual writes to a bus into an aggregation, with each aggregation being formed of many individual writes. The order of the individual writes, however, is not necessarily maintained.
Thus, if the bus is transmitting information sequentially loaded into an area in memory, the order in which the information is received may not be the order in which it was sequentially loaded into memory. In other words, when using WC the CPU or the bus controller does not necessarily transfer information in the same order as the information is prepared by software executing on the CPU, such as the graphics driver. Instead the CPU or bus controller may transmit information as it sees fit to increase bus efficiency.
The use of both WC technology and encoded commands to increase bus bandwidth is therefore problematic. WC technology requires that at times information be transmitted in an order possibly different than that otherwise expected by the sender. The use of encoded commands, on the other hand, requires that received parameters be in a predefined order with respect to a command. Accordingly, methods and systems which overcome the obstacles of using both write combining technology and encoded commands are desirable.
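The conflict may be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any actual chipset: the function `write_combine` is hypothetical, and it reverses each aggregation merely as a deterministic stand-in for an order that is not guaranteed.

```python
def write_combine(writes, burst=4):
    """Group writes into aggregations and emit each one in reversed order.

    Reversal is only a deterministic stand-in for "order not guaranteed";
    real hardware may choose any order within an aggregation.
    """
    delivered = []
    for start in range(0, len(writes), burst):
        delivered.extend(reversed(writes[start:start + burst]))
    return delivered


# (address, data) pairs as the driver wrote them: a command, then its parameters.
sent = [(0, "DRAW_TRIANGLE"), (1, "v0"), (2, "v1"), (3, "v2")]
received = write_combine(sent)

# A receiver that trusts arrival order takes the first item as the command.
naive_command = received[0][1]
```

In this example the naive receiver would treat the parameter "v2" as a command, although every item still carries the address to which it was originally written.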
The present invention provides systems and methods of increasing effective bus bandwidth through utilization of command encoding in a weakly ordered bus interface. According to the present invention, items of data are written sequentially into sequentially ordered areas of a memory. The ordered areas of memory thereby form command registers, with the registers being identifiable by address locations. Thus, each item of data is placed in an area of the memory having an associated address. The items of data and their associated addresses are transmitted over a bus. The items of data and the associated addresses are received from the bus, and the addresses for each item of data are examined. The items of data are then placed in a storing buffer based on the address associated with each item. The placement of the items in the storing buffer is accomplished in the same order as the items of data were originally written to the sequentially ordered areas of the memory.
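The address-based placement described above may be sketched as follows. The base address, the window size and the function name `place_by_address` are assumptions made for illustration, and addresses are treated as word indices for simplicity.

```python
def place_by_address(received, base, size):
    """Store each item at the slot given by its address offset from base.

    Arrival order is ignored; only the address decides the slot, so the
    buffer ends up in the order in which the items were originally written.
    """
    buffer = [None] * size
    for address, data in received:
        slot = address - base
        if not 0 <= slot < size:
            raise ValueError(f"address {address:#x} outside register window")
        buffer[slot] = data
    return buffer


BASE = 0x1000  # hypothetical base address of the command registers

# Items arrive out of order, each paired with its original address.
arrival = [(0x1002, "v1"), (0x1000, "DRAW_TRIANGLE"), (0x1003, "v2"), (0x1001, "v0")]
restored = place_by_address(arrival, BASE, 4)
```

Once the buffer is filled in this manner, a reader that walks the slots in ascending order sees the command first and its parameters in their predefined sequence.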
The items of data comprise commands and parameters, with each parameter, and generally multiple parameters, pertaining to a command. The method further comprises examining the items of data to determine whether the items of data are commands or parameters, as well as determining which parameters pertain to which command. The storing buffer may comprise a plurality of storage buffers. Each of the storage buffers has two arrays associated with the storage buffer. The first array is a read availability status array. The read availability status array indicates whether locations in the storage buffer contain data available for reading. The second array is a direct read status array. The direct read status array indicates for each storage location within the storage buffer whether data has been directly read out to a command interpreter, which may be a burst command interface. The direct read status array is useful in that the storage buffers may be read out in bulk to a temporary storage facility, which may include a graphics device memory, in order to avoid overflowing the command storage buffers as well as to increase system efficiency. However, there may be times when less than all of the command storage buffers have data available for reading, but some of the storage buffer data should be read. Thus data also needs to be provided directly to the command interpreter, instead of only after being stored in the temporary memory. If data is read directly to the burst command interface, however, then the corresponding slots in the other storage buffers should also be provided directly to the command interpreter.
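One possible arrangement of a single storage buffer and its two status arrays is sketched below. The class name, the method names and the bulk-read policy (a bulk read waits until every slot holds data and skips slots already read directly) are assumptions introduced for illustration; the text above does not fix these details, and the coordination of corresponding slots across several storage buffers is not modeled.

```python
class StorageBuffer:
    """A storage buffer with a read availability array and a direct read array."""

    def __init__(self, size):
        self.data = [None] * size
        self.read_available = [False] * size  # slot holds data ready for reading
        self.direct_read = [False] * size     # slot already sent to the interpreter

    def write(self, slot, value):
        self.data[slot] = value
        self.read_available[slot] = True

    def read_direct(self, slot):
        """Hand one slot straight to the command interpreter and record that."""
        if not self.read_available[slot]:
            raise LookupError(f"slot {slot} has no data available")
        self.direct_read[slot] = True
        return self.data[slot]

    def drain_bulk(self):
        """Return (slot, value) pairs for bulk transfer to temporary storage.

        Assumed policy: return None unless every slot is available, and skip
        slots that were already read directly so they are not delivered twice.
        """
        if not all(self.read_available):
            return None
        out = [(i, v) for i, v in enumerate(self.data) if not self.direct_read[i]]
        size = len(self.data)
        self.data = [None] * size
        self.read_available = [False] * size
        self.direct_read = [False] * size
        return out


buf = StorageBuffer(4)
buf.write(0, "CMD")
buf.write(1, "p0")
first = buf.read_direct(0)   # partial data is needed now, so it is read directly
early = buf.drain_bulk()     # buffer not yet full, so nothing is drained
buf.write(2, "p1")
buf.write(3, "p2")
drained = buf.drain_bulk()   # slot 0 is skipped because it was read directly
```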
In one embodiment, data received over the bus is split into a data portion and an address portion, and routed to a FIFO. The data portion is routed to the FIFO via multiplexors using portions of the address associated with the data to determine the portion of the FIFO in which the data should be placed. The FIFO serves as an elastic memory such that data may be clocked into the FIFO at a bus clock rate, and clocked out of the FIFO at a graphics device clock rate. After data is output from the FIFO the data is once again multiplexed, or routed using a router, into a slot in the storage buffers, the slot selected based on portions of the address information associated with the data item. The data in the storage buffers is then provided to a command interpreter for further processing by the graphics device.
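The selection of a storage buffer slot from portions of the address may be sketched as follows. The field widths and the choice of which address bits select the buffer and which select the slot are hypothetical; the FIFO and its two clock domains are not modeled.

```python
BUFFER_BITS = 2  # hypothetical: four storage buffers
SLOT_BITS = 3    # hypothetical: eight slots per storage buffer


def route(address):
    """Split an address into (buffer index, slot index) bit fields."""
    slot = address & ((1 << SLOT_BITS) - 1)
    buffer_index = (address >> SLOT_BITS) & ((1 << BUFFER_BITS) - 1)
    return buffer_index, slot


def store(buffers, address, data):
    """Place a data item in the slot selected by its address portions."""
    buffer_index, slot = route(address)
    buffers[buffer_index][slot] = data


buffers = [[None] * (1 << SLOT_BITS) for _ in range(1 << BUFFER_BITS)]

# Items in arbitrary arrival order; the upper bits pick the buffer, the lower the slot.
for address, data in [(0b01_010, "p1"), (0b00_000, "CMD"), (0b11_111, "last")]:
    store(buffers, address, data)
```

Because the destination depends only on the address bits, the same routing result is obtained regardless of the order in which the items leave the FIFO.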
Many of the attendant features of this invention will be more readily appreciated as the same becomes better understood by reference to the following detailed description considered in connection with the accompanying drawings.