1. Field of the Invention
The present invention relates to the field of computer displays, and in particular to a method and apparatus for hardware acceleration of graphical fill in display systems.
Sun, Sun Microsystems, the Sun logo, Solaris and Java are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries.
2. Background Art
A typical graphics system has a controller and display memory. The controller executes graphics commands, including writing data to and reading data from the display memory. Some graphics systems use synchronous graphics random access memory (SGRAM) for display memory. SGRAM contains logic to perform some commands internally. For example, SGRAM executes a command to fill a region of memory with a constant by first storing one or two colors from a command source. The SGRAM then accepts command words consisting of individual bits (i.e., part of a bit-mask) that select between the stored colors, and writes one or the other of the colors to each location as instructed by the bit-mask.
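By way of illustration only, the bit-mask selection described above may be modeled in software as follows. The function name block_write, the list-based memory, and the color constants are hypothetical and are not taken from any SGRAM data sheet; the sketch shows only the selection between two stored colors.

```python
BLACK, WHITE = 0x000000, 0xFFFFFF  # example stored colors (assumed values)


def block_write(memory, start, mask_bits, color0, color1):
    """Write color1 where the mask bit is 1 and color0 where it is 0."""
    for offset, bit in enumerate(mask_bits):
        memory[start + offset] = color1 if bit else color0


# Eight memory locations holding an arbitrary previous value.
memory = [0x123456] * 8
# One command word (4 mask bits) selects a color for locations 2..5.
block_write(memory, 2, [1, 0, 0, 1], BLACK, WHITE)
```

Only the mask bits travel with each command word; the color values themselves are stored once, before the fill begins.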
SGRAM is more expensive than dynamic random access memory (DRAM). As a result, many graphics systems use DRAM instead of SGRAM to reduce cost. However, since DRAM lacks the logic to perform some commands internally, graphics systems using DRAM are typically slower than graphics systems using SGRAM. This problem can be better understood by a discussion of display systems in a multi-tier application architecture.
Multi-Tier Application Architecture
In the multi-tier application architecture, a client communicates requests to a server for data, software and services, for example, and the server responds to the requests. The server's response may entail communication with a database management system for the storage and retrieval of data.
The multi-tier architecture includes at least a database tier that includes a database server, an application tier that includes an application server and application logic (i.e., software application programs, functions, etc.), and a client tier. The database server responds to application requests received from the client. The application server forwards data requests to the database server.
The client tier typically consists of a computer system that provides a graphic user interface (GUI) generated by a client, such as a browser or other user interface application. Conventional browsers include Internet Explorer and Netscape Navigator, among others. The client generates a display from, for example, a specification of GUI elements (e.g., a file containing input, form, and text elements defined using the Hypertext Markup Language (HTML)) and/or from an applet (i.e., a program such as a program written using the Java™ programming language, or other platform independent programming language, that runs when it is loaded by the browser).
Further application functionality is provided by application logic managed by the application server in the application tier. The apportionment of application functionality between the client tier and the application tier is dependent upon whether a "thin client" or "thick client" topology is desired. In a thin client topology, the client tier (i.e., the end user's computer) is used primarily to display output and obtain input, while the computing takes place in other tiers (i.e., away from the thin client). A thick client topology, on the other hand, uses a more conventional general purpose computer having processing, memory, and data storage abilities. The database tier contains the data that is accessed by the application logic in the application tier. A database server in the database tier manages the data, its structure and the operations that can be performed on the data and/or its structure.
The application server can include applications such as a corporation's scheduling, accounting, personnel and payroll applications, for example. The application server also manages requests for the applications that are stored therein. The application server can also manage the storage and dissemination of production versions of application logic. The database server manages the database(s) that manage data for applications. The database server responds to requests to access the scheduling, accounting, personnel and payroll applications' data, for example.
A connection is used to transmit data between the client tier and the application tier, and may also be used to transfer the application logic to the client tier. The client tier can communicate with the application tier via, for example, a Remote Method Invocation (RMI) application programming interface (API) available from Sun Microsystems™. The RMI API provides the ability to invoke methods, or software modules, that reside on another computer system. Parameters are packaged and unpackaged for transmittal to and from the client tier. The connection between the application server and the database server represents the transmission of requests for data and the responses to such requests from applications that reside in the application server.
Elements of the client tier, application tier and database tier (e.g., the client, the application server, and the database server) may execute within a single computer. However, in a typical system, elements of the client tier, application tier and database tier may execute within separate computers interconnected over a network such as a LAN (local area network) or WAN (wide area network).
Display Systems
Display systems in the multi-tier application architecture are used to arrange display information for presentation to a user on a display device (e.g., a monitor). Typically, a display system comprises display memory and a display controller in the client tier. The display memory is frequently DRAM and contains pixel color information for each pixel of the display device. The display controller updates the data in the display memory and retrieves data from the display memory to send to the display device.
Typically, thousands, millions or even billions of color value possibilities are available for storage in each display memory location. Frequently, however, the same one or two values are found in many locations. For example, displaying a graphic of a stop sign results in a large number of the display memory locations storing the same value for red. In another example, a text window having a black background and white text results in a large number of display memory locations storing either the value for black or the value for white.
Individually reading and writing each display memory location having the same value is inefficient. SGRAM is sometimes used to speed up graphical fills of one or two colors. The SGRAM uses stored color values, bit-masks and internal logic to perform graphical fills more quickly (typically 2 to 10 times more quickly) than graphical fills using DRAM in typical graphics systems. However, SGRAM is significantly more expensive than DRAM.
Embodiments of the present invention are directed to a method and apparatus for hardware acceleration of graphical fill in display systems. In one embodiment of the present invention, a bit-mask is maintained. In one embodiment, the bit-mask, termed the "filled color bitmap", has one bit for each pixel of the display data. In another embodiment, a register, termed the "filled color register", capable of storing a single color value is maintained.
In one embodiment, all values in the filled color bitmap are initialized to 0. When a write command is executed to fill a portion of the display memory with the same value that is stored in the filled color register, the value is not written to display memory. Instead, the bits in the filled color bitmap corresponding to the portion of display memory are set equal to 1. Similarly, when a write command is executed to fill a portion of the display memory with a different value from the value stored in the filled color register, the value is written to display memory and the bits in the filled color bitmap corresponding to the portion of display memory are set equal to 0.
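By way of illustration only, the write behavior described above may be modeled in software as follows. The class name FilledColorDisplay, its attribute names, and the memory_writes counter are hypothetical and serve only to make the skipped display memory writes visible; an actual embodiment would implement this logic in the graphics controller hardware.

```python
RED, BLUE = 0xFF0000, 0x0000FF  # example color values (assumed)


class FilledColorDisplay:
    def __init__(self, num_pixels, fill_color):
        self.display_memory = [0] * num_pixels  # models the (slow) display memory
        self.bitmap = [0] * num_pixels          # filled color bitmap, initialized to 0
        self.fill_register = fill_color         # filled color register
        self.memory_writes = 0                  # counts actual display memory writes

    def fill(self, start, count, color):
        """Fill `count` pixels beginning at `start` with `color`."""
        for i in range(start, start + count):
            if color == self.fill_register:
                # Matching color: mark the pixel and skip the memory write.
                self.bitmap[i] = 1
            else:
                # Different color: write it through and clear the mark.
                self.display_memory[i] = color
                self.memory_writes += 1
                self.bitmap[i] = 0


display = FilledColorDisplay(8, RED)
display.fill(0, 6, RED)   # handled entirely in the bitmap
display.fill(2, 2, BLUE)  # written to display memory, bits cleared
```

In this sketch the six-pixel red fill causes no display memory writes at all; only the two blue pixels reach display memory.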
In one embodiment, display data is read in parallel from the display memory and the filled color bitmap. When a bit in the filled color bitmap is 1, the value stored in the filled color register is used for the corresponding pixel rather than the value in display memory. When a bit in the filled color bitmap is 0, the value stored in display memory is used for the corresponding pixel.
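By way of illustration only, the per-pixel selection performed on read may be sketched as follows. The function name resolve_pixels and the list representation are hypothetical; the parallel hardware read is modeled here by iterating over both lists together.

```python
def resolve_pixels(display_memory, bitmap, fill_register):
    """Return the color actually sent to the display for each pixel."""
    return [
        fill_register if bit else stored
        for stored, bit in zip(display_memory, bitmap)
    ]


display_memory = [10, 20, 30, 40]  # stored values (stale where the bit is 1)
bitmap = [0, 1, 1, 0]              # filled color bitmap
pixels = resolve_pixels(display_memory, bitmap, fill_register=99)
```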
In one embodiment, the filled color bitmap is physically smaller than the display memory used to store graphics data. The filled color bitmap uses fewer bits to represent each pixel than does the display memory. In one embodiment, the filled color bitmap is placed inside the graphics controller chip. Since the filled color bitmap is on-chip, it can be updated and read very quickly. The quicker reads and writes improve the speed of the display system without incurring the cost associated with the use of SGRAM.
In another embodiment, the filled color bitmap is on a separate chip. Since the filled color bitmap is smaller than the display memory, it can be updated and scanned more quickly than the display memory. The quicker reads and writes improve the speed of the display system without incurring the cost associated with the use of SGRAM.
In yet another embodiment, the filled color bitmap is located in the same chip as the display memory. The filled color bitmap is located in its own partition of the display memory chip. Since the filled color bitmap is smaller than the display memory, it can be updated and scanned more quickly than the display memory. The quicker reads and writes improve the speed of the display system without incurring the cost associated with the use of SGRAM.
In a further embodiment, the filled color bitmap and the display memory both comprise one or more dynamic random access memories (DRAMs).
In one embodiment, color values are written from the filled color bitmap to the display memory as a background task in the graphics chip. When the graphics controller is not busy executing other commands, it scans the filled color bitmap. When a 1 bit is found in the filled color bitmap, the value in the filled color register is written to the display memory. After the write is done, the bit in the filled color bitmap is set to 0. When all bits in the filled color bitmap are 0, the filled color register is available to store another value to enable a fast fill with a different color.
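By way of illustration only, the background write-back may be sketched as follows. The function name flush_one and the choice to write back one pixel per idle step are assumptions made for the example; the description above does not specify how much work is done per idle period.

```python
BLUE, GREEN = 0x0000FF, 0x00FF00  # example color values (assumed)


def flush_one(display_memory, bitmap, fill_register):
    """Write back at most one pending pixel; return True if work was done."""
    for i, bit in enumerate(bitmap):
        if bit:
            display_memory[i] = fill_register  # commit the filled color
            bitmap[i] = 0                      # clear the bit after the write
            return True
    return False  # no 1 bits remain


display_memory = [BLUE] * 6
bitmap = [0, 1, 1, 0, 0, 1]

steps = 0
while flush_one(display_memory, bitmap, GREEN):  # one call per idle step
    steps += 1

# With every bit cleared, the register may be reloaded with a new color.
register_free = not any(bitmap)
```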
In one embodiment, Min-X, Max-X, Min-Y and Max-Y registers are maintained. The registers record the area of the screen containing 1s in the filled color bitmap. When the area containing 1s is less than the total area of the screen, less of the filled color bitmap must be scanned. Thus, the graphics controller is able to scan for 1s in the filled color bitmap more quickly.
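By way of illustration only, the bounding registers may be sketched as follows. The class name BoundedBitmap and the convention of representing an empty area by a minimum greater than the maximum are assumptions. In this simplified sketch the bounds only grow when bits are set; shrinking them when bits are cleared is not shown.

```python
class BoundedBitmap:
    def __init__(self, width, height):
        self.bits = [[0] * width for _ in range(height)]
        # Empty area: each minimum starts above its maximum.
        self.min_x, self.max_x = width, -1
        self.min_y, self.max_y = height, -1

    def set_bit(self, x, y):
        """Set one bit and widen the Min/Max registers to include it."""
        self.bits[y][x] = 1
        self.min_x, self.max_x = min(self.min_x, x), max(self.max_x, x)
        self.min_y, self.max_y = min(self.min_y, y), max(self.max_y, y)

    def scan(self):
        """Return (pixels whose bit is 1, number of locations examined)."""
        found, examined = [], 0
        for y in range(self.min_y, self.max_y + 1):
            for x in range(self.min_x, self.max_x + 1):
                examined += 1
                if self.bits[y][x]:
                    found.append((x, y))
        return found, examined


bm = BoundedBitmap(16, 16)
bm.set_bit(3, 4)
bm.set_bit(5, 6)
found, examined = bm.scan()  # examines a 3 by 3 area instead of 16 by 16
```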
In one embodiment, a low-resolution filled color bitmap is maintained. Each memory location of the low-resolution filled color bitmap corresponds to more than one pixel. In one embodiment, each memory location of the low-resolution filled color bitmap corresponds to a 4 pixel by 4 pixel square. In another embodiment, each memory location of the low-resolution filled color bitmap corresponds to a 16 pixel horizontal area. In other embodiments, each memory location of the low-resolution filled color bitmap corresponds to different numbers and configurations of pixels.
In one embodiment, a bit in the low-resolution filled color bitmap is equal to 1 whenever at least one corresponding bit in the filled color bitmap is equal to 1. If the low-resolution bitmap bit value is 0, all corresponding bits in the filled color bitmap are also 0. Thus, a region corresponding to a low-resolution filled color bitmap bit value of 0 does not need to be scanned for 1s. Thus, when at least one bit value in the low-resolution filled color bitmap is 0, the graphics controller is able to scan for 1s in the filled color bitmap more quickly.
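By way of illustration only, a scan guided by a low-resolution bitmap with 4 pixel by 4 pixel squares may be sketched as follows. The function names and the nested-list representation are hypothetical. For brevity the low-resolution bitmap is rebuilt from the full bitmap here, whereas a hardware embodiment would presumably update it as bits are set.

```python
TILE = 4  # each low-resolution bit covers a 4 pixel by 4 pixel square


def build_low_res(bitmap, tile=TILE):
    """A low-resolution bit is 1 if any bit in its square is 1."""
    height, width = len(bitmap), len(bitmap[0])
    low_res = []
    for ty in range(0, height, tile):
        row = []
        for tx in range(0, width, tile):
            square = [
                bitmap[y][x]
                for y in range(ty, min(ty + tile, height))
                for x in range(tx, min(tx + tile, width))
            ]
            row.append(int(any(square)))
        low_res.append(row)
    return low_res


def scan_with_low_res(bitmap, low_res, tile=TILE):
    """Scan only squares whose low-resolution bit is 1."""
    height, width = len(bitmap), len(bitmap[0])
    found, examined = [], 0
    for ty, row in enumerate(low_res):
        for tx, coarse_bit in enumerate(row):
            if not coarse_bit:
                continue  # every bit in this square is known to be 0
            for y in range(ty * tile, min((ty + 1) * tile, height)):
                for x in range(tx * tile, min((tx + 1) * tile, width)):
                    examined += 1
                    if bitmap[y][x]:
                        found.append((x, y))
    return found, examined


bitmap = [[0] * 8 for _ in range(8)]
bitmap[1][5] = 1  # pixel (x=5, y=1)
bitmap[2][6] = 1  # pixel (x=6, y=2)
low_res = build_low_res(bitmap)
found, examined = scan_with_low_res(bitmap, low_res)
```

In this example only one of the four squares is scanned, so 16 of the 64 bitmap locations are examined.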
In another embodiment, more than one low-resolution bitmap is maintained. In one embodiment, the memory locations of the multiple low-resolution bitmaps cover different numbers of pixels in a hierarchical manner.
In one embodiment, a running count of how many bits are "1" in the filled color bitmap is maintained. The count is incremented whenever a bit changes from 0 to 1 and decremented whenever a bit changes from 1 to 0. When the count is equal to zero, there are no 1s in the filled color bitmap and scanning for 1s in the filled color bitmap is stopped.
In one embodiment, when data is read from display memory, the contents of the filled color bitmap are pre-fetched. Pixels corresponding to a filled color bitmap memory location storing a value of 1 do not have their color value read from the display memory. Instead, such pixels have their color value read from the filled color register. As a result, the graphics system performs more quickly and with less power consumption.
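By way of illustration only, the running count may be sketched as follows. The class name CountedBitmap and its method names are hypothetical. The count changes only on an actual 0 to 1 or 1 to 0 transition, so rewriting a bit with its current value leaves the count unchanged.

```python
class CountedBitmap:
    def __init__(self, size):
        self.bits = [0] * size
        self.ones = 0  # running count of bits equal to 1

    def set(self, index, value):
        """Store 0 or 1 at `index`, adjusting the count on a transition."""
        if self.bits[index] != value:
            self.ones += 1 if value else -1
            self.bits[index] = value

    def needs_scan(self):
        """Scanning for 1s is worthwhile only while the count is non-zero."""
        return self.ones > 0


cb = CountedBitmap(8)
cb.set(0, 1)
cb.set(0, 1)  # no transition, count unchanged
cb.set(3, 1)
count_after_sets = cb.ones
cb.set(0, 0)
cb.set(3, 0)
```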
In one embodiment, the filled color bitmap has more than 1 bit for each pixel. In one embodiment, the filled color bitmap has 2 bits per pixel, and the display system has 3 filled color registers. When a value matching any of the 3 color values stored in the filled color registers is to be written to display memory, the binary code representing the appropriate filled color register is written to the filled color bitmap memory location corresponding to the pixel. One pattern in the 2-bit binary code (e.g., 00) indicates that the written color value is not stored in any filled color register and should be read from display memory.
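By way of illustration only, the 2-bit encoding with three filled color registers may be sketched as follows. The assignment of codes 01, 10 and 11 to the first, second and third registers is an assumption; the description above fixes only that one pattern (e.g., 00) means the value is read from display memory. All names are hypothetical.

```python
FROM_MEMORY = 0b00  # code meaning "read the color from display memory"
BLACK, WHITE, RED = 0x000000, 0xFFFFFF, 0xFF0000  # example register contents


def write_pixel(i, color, registers, codes, display_memory):
    """Record a register code if the color matches, else write through."""
    if color in registers:
        codes[i] = registers.index(color) + 1  # codes 0b01, 0b10, 0b11
    else:
        display_memory[i] = color
        codes[i] = FROM_MEMORY


def read_pixel(i, registers, codes, display_memory):
    """Return the register color for codes 01..11, else the stored value."""
    code = codes[i]
    return display_memory[i] if code == FROM_MEMORY else registers[code - 1]


registers = [BLACK, WHITE, RED]
codes = [FROM_MEMORY] * 4
display_memory = [0x808080] * 4

write_pixel(0, WHITE, registers, codes, display_memory)
write_pixel(1, 0x123456, registers, codes, display_memory)  # not in a register
write_pixel(2, RED, registers, codes, display_memory)
write_pixel(3, BLACK, registers, codes, display_memory)
shown = [read_pixel(i, registers, codes, display_memory) for i in range(4)]
```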
In one embodiment, block transfers of one region of graphics data to another region of graphics data are accomplished using the filled color bitmap. When the region being transferred consists only of pixels with color values stored in the filled color registers, the block transfer is executed in the filled color bitmap. The values stored in the receiving region of the filled color bitmap are set equal to the values stored in the transferring region of the filled color bitmap. Thus, display memory is not accessed or altered. Since references to the filled color bitmap are much faster than references to display memory, significant speed-up of some operations (e.g., scrolling operations applied to text windows) is achieved.
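By way of illustration only, a block transfer carried out in the filled color bitmap may be sketched as follows. The function name block_transfer, the one-dimensional layout, and the convention that a code of 0 means "color held in display memory" are assumptions. When the source region contains any such pixel, the sketch reports failure so that a conventional display memory copy (not shown) can be used instead.

```python
def block_transfer(codes, src, dst, length):
    """Copy `length` bitmap entries from `src` to `dst`.

    Returns True if the transfer was completed in the bitmap alone.
    """
    region = codes[src:src + length]  # snapshot, so overlapping regions are safe
    if any(code == 0 for code in region):
        return False  # some source pixel lives in display memory
    codes[dst:dst + length] = region
    return True


# Non-zero entries refer to filled color registers; 0 refers to display memory.
codes = [1, 2, 1, 2, 0, 0, 0, 0]
done_in_bitmap = block_transfer(codes, src=0, dst=4, length=4)

mixed = [1, 0, 1, 1, 0, 0, 0, 0]
needs_fallback = not block_transfer(mixed, src=0, dst=4, length=4)
```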
In one embodiment, a graphics system has a plurality of filled color registers corresponding to vertical regions of the display data. A single filled color bitmap serves several different partially overlapping windows with different foreground and background colors. In one embodiment, the boundary between vertical regions is determined dynamically, based on graphics command activities. In another embodiment, a memory scanning technique is used to clear bits out of the filled color bitmap and enable the boundary to be moved whenever the count of 1s in a region becomes 0.
In one embodiment, software is used to monitor and record the details of the commands executed by the graphics controller chip. The software determines which are the best colors to put into the filled color registers in order to maximize performance.
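By way of illustration only, one possible selection heuristic is sketched below: the software totals the number of pixels filled with each color and chooses the colors with the largest totals. The log format, the function name best_fill_colors, and the heuristic itself are assumptions; the description above does not specify how the best colors are determined.

```python
from collections import Counter

BLACK, WHITE, RED = 0x000000, 0xFFFFFF, 0xFF0000  # example color values


def best_fill_colors(fill_log, num_registers):
    """Pick the colors that account for the most filled pixels.

    `fill_log` is a sequence of (color, pixel_count) pairs, one per
    recorded fill command (a hypothetical log format).
    """
    pixels_per_color = Counter()
    for color, pixel_count in fill_log:
        pixels_per_color[color] += pixel_count
    return [color for color, _ in pixels_per_color.most_common(num_registers)]


fill_log = [(BLACK, 5000), (WHITE, 300), (RED, 40), (WHITE, 4000), (BLACK, 100)]
chosen = best_fill_colors(fill_log, num_registers=2)
```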