A Bit-aligned Block Transfer (BITBLT) is a general operator which provides a mechanism to move an arbitrary size rectangle of an image from one part of a display memory to another. This operation may be performed by dedicated hardware known as a BITBLT engine or blitter. A display controller (e.g., Video Graphics Adapter (VGA) or the like) with this capability may be referred to as a display controller with a BITBLT engine or BITBLT hardware accelerator.
A clipping function is a mechanism to cut an edge of an arbitrary size rectangle of an image located either in system memory or display memory before transferring the image to a destination in display memory. FIG. 4 illustrates a software solution for performing such clipping functions in a system provided with a display controller 402 with built-in BITBLT engine for performing bit block transfers.
The sequence of a bit block transfer operation is as follows. First, a block of data may be read from a portion of a source memory and a portion of a destination memory. The source memory may be display memory 403 of display controller 402 or a system memory (not shown) for host computer 401. The destination memory may be display memory 403 of display controller 402. Display controller 402 retrieves the resultant image data stored in display memory 403 and outputs a signal to generate an image on display 404.
Next, data from the portion of the source memory may be combined with data from the portion of the destination memory using the BITBLT engine within display controller 402. This combination of data, known as a Raster Op, may be any logical operation which may be used between elements of data from the source memory and destination memory. As Raster Op operations are completed, resultant data may be written into destination memory. Thus, for example, data from a source memory representing text information may be combined with a graphic image in a destination memory to produce a text over graphics image in the destination memory.
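As a rough illustration of the combining step described above, the following Python sketch applies a logical operation byte by byte to source and destination data and writes the result back to the destination. The function name `raster_op`, the operation table, and the sample byte values are assumptions made for illustration only; they do not describe the register interface of any particular BITBLT engine.

```python
# Hypothetical sketch of a Raster Op: combine source and destination bytes
# with a logical operation and store the result in the destination.
RASTER_OPS = {
    "COPY": lambda s, d: s,      # destination = source
    "AND": lambda s, d: s & d,
    "OR": lambda s, d: s | d,
    "XOR": lambda s, d: s ^ d,
}


def raster_op(source: bytes, destination: bytearray, op: str) -> None:
    """Combine each source byte with the matching destination byte in place."""
    combine = RASTER_OPS[op]
    for i, s in enumerate(source):
        destination[i] = combine(s, destination[i]) & 0xFF


text = bytes([0x0F, 0xF0, 0x00])          # stands in for text data in source memory
graphics = bytearray([0x3C, 0x3C, 0x3C])  # stands in for an image in destination memory
raster_op(text, graphics, "OR")           # text over graphics
print(graphics.hex())                     # 3ffc3c
```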
The sequence of a clipping operation is as follows. First, a block of data from the source memory is "cut" in X and Y directions, where X and Y represent coordinate axes. Next, the remaining results are written into the destination memory.
In a Windows™ application, when data from a source memory is to be copied into destination memory and a clip rectangle has been defined, the cross-section (i.e., intersection) of the destination rectangle and clip rectangle is the only area into which source data may be copied. Anything outside of this cross-section region should not be written to the destination memory. Such a situation is illustrated in FIG. 1. In FIG. 1, the cross-section of the BITBLT destination rectangle and the clip rectangle is called the clip cross-section rectangle. In the example illustrated in FIG. 1, it is assumed that the source data is located in display memory 403, although alternately, source data may be located in a system memory of host computer 401.
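The clip cross-section rectangle described above can be sketched in X-Y coordinates as a plain rectangle intersection. The tuple layout (left, top, right, bottom), the exclusive right and bottom edges, and the sample coordinates are assumptions chosen for this illustration.

```python
def clip_cross_section(blt, clip):
    """Return the intersection of two rectangles, or None if they do not overlap.

    Rectangles are hypothetical (left, top, right, bottom) tuples in pixels,
    with the right and bottom edges exclusive.
    """
    left = max(blt[0], clip[0])
    top = max(blt[1], clip[1])
    right = min(blt[2], clip[2])
    bottom = min(blt[3], clip[3])
    if left >= right or top >= bottom:
        return None  # no overlap, so nothing is written to the destination
    return (left, top, right, bottom)


blt_rect = (100, 50, 300, 200)    # BITBLT destination rectangle
clip_rect = (150, 80, 400, 400)   # clip rectangle
cross = clip_cross_section(blt_rect, clip_rect)
print(cross)                      # (150, 80, 300, 200)
```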
In order to cut a destination rectangle block in the X and Y directions (illustrated in FIG. 1 as X-offset and Y-offset) the following algorithm may be employed.
X-Offset = (DSACLP - DSAB) mod DSP    (EQ 1)
Y-Offset = (DSACLP - DSAB) div DSP    (EQ 2)
WCS = WBLT - (X-Offset)    (EQ 3)
HCS = HBLT - (Y-Offset)    (EQ 4)
Where:
DSAB = Destination Start Address of BITBLT Rectangle
DSACLP = Destination Start Address of Clip Rectangle
DSP = Destination Pitch
WCS = Width of Cross-Section Rectangle
WBLT = Width of BITBLT Rectangle
HCS = Height of Cross-Section Rectangle
HBLT = Height of BITBLT Rectangle
NSBL = Number of Source Bytes to be expanded per BITBLT Line
OSSA = Old Source Start Address
NSSA = New Source Start Address
Parameter values for the example of FIG. 2: DSAB=A6480, DSACLP=A670A, DSP=280, WBLT=16 pixels, HBLT=20 lines.
BITBLT register values for the clip cross-section rectangle of FIG. 2: Destination Start Address (DSAB)=A670A, Width (WBLT)=6 pixels (6 bytes at 8 bpp), Height (HBLT)=19 lines.
Parameter values for the example of FIG. 3: DSA=6400, DSACLP=7F8A, WBLT=40 Bytes, HBLT=40 lines, DSP=280, SSP=280.
After calculating WCS and HCS and having DSACLP, one may program the BITBLT engine with the calculated width, height, and new destination start address (DSAB) to write only cross-section rectangle data into the destination memory. The calculations of Equations 1-4 may be performed by software 406 in FIG. 4. HBLT and WBLT may be calculated in X-Y coordinate conversion and BITBLT parameter generation software 405 illustrated in FIG. 4. Writing cross-section rectangle data into a destination area in display memory 403 may be performed by a BITBLT engine.
The following example illustrates the above-described procedure in more detail. For the purposes of illustration, assume a color depth of eight bits per pixel (bpp), a typical color depth for VGA displays. Parameters, in hexadecimal, for the variables recited above may be as follows:
DSAB=A6480
DSACLP=AC4F1
SSP=280
DSP=280
WBLT=200
HBLT=100
Applying these values to Equations 1-4 above yields Equations 5-8 below. Again, the values shown are in hexadecimal.
X-Offset = (AC4F1 - A6480) mod 280 = 171    (EQ 5)
Y-Offset = (AC4F1 - A6480) div 280 = 26    (EQ 6)
WCS = 200 - (171) = 8F    (EQ 7)
HCS = 100 - (26) = DA    (EQ 8)
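The arithmetic of Equations 1-4 can be checked with a short Python sketch using the hexadecimal values that appear in Equations 5-8. The function name `clip_parameters` is hypothetical, and the sketch assumes, as the equations do, that the clip rectangle starts inside the BITBLT rectangle and extends at least to its right and bottom edges.

```python
def clip_parameters(dsab, dsaclp, dsp, wblt, hblt):
    """Apply Equations 1-4 and return (x_offset, y_offset, wcs, hcs)."""
    y_offset, x_offset = divmod(dsaclp - dsab, dsp)  # EQ 2 (div) and EQ 1 (mod)
    wcs = wblt - x_offset                            # EQ 3
    hcs = hblt - y_offset                            # EQ 4
    return x_offset, y_offset, wcs, hcs


x_off, y_off, wcs, hcs = clip_parameters(
    dsab=0xA6480, dsaclp=0xAC4F1, dsp=0x280, wblt=0x200, hblt=0x100
)
print(hex(x_off), hex(y_off), hex(wcs), hex(hcs))  # 0x171 0x26 0x8f 0xda
print(wcs, hcs)                                    # 143 218
```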
Now, using a BITBLT engine, the cross-section rectangle may be written into destination memory using the following parameters. The clip rectangle in destination memory has a start address (DSACLP) of AC4F1, and the cross-section rectangle has a width WCS of 8F (hex), or 143 bytes, and a height HCS of DA (hex), or 218 lines (i.e., scan lines). A new source memory start address (NSSA) may be calculated from the original source memory start address (OSSA) as follows:
NSSA = OSSA - (DSACLP - DSAB)    (EQ 9)
NSSA = OSSA - (AC4F1 - A6480)    (EQ 9a)
After calculating WCS and HCS and having a DSACLP, one can program the BITBLT engine with calculated width WCS, height HCS, new source start address (NSSA), and new destination start address (DSAB) to write only into the calculated cross-section rectangle area.
In general, the procedure for clip cross-section rectangle calculation may be performed by software without a significant impact on overall system performance. However, in some instances, a software solution may be insufficient and may degrade overall system performance.
In performing clipping functions, there may be two cases in which hardware may accelerate system performance. The first case is where the source data is a monochrome bit map (of a schematic or the like) which is to be copied into destination memory. The second case is where the source data is a pattern used to fill a portion of destination memory.
In a monochrome bit map, the start address of a clip cross-section rectangle may be calculated in software, for example as part of a graphics card driver. The start address of the cross-section rectangle may have a corresponding aligned data byte in the source memory which may be read and expanded through the BITBLT engine with a color expansion function.
In other words, in a monochrome bit map, each bit of data (0 or 1) may represent a single pixel of an image. Once transferred through the BITBLT engine, color expansion features may add pixel depth (e.g., more bits per pixel or bpp) to provide a color image. However, unlike other data formats (for example, 8 bpp pixel depth), the border of a clipped monochrome bit map image may not fall on an even byte boundary. The beginning of a clipped monochrome bit map may fall, for example, on bit N of a given byte. A mechanism or software must be provided to read the starting bit and correct for such an offset.
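The color expansion step can be pictured with the following Python sketch, which turns one byte of monochrome data into eight 8 bpp pixels. The function name, the foreground and background color values, and the convention that bit 7 holds the left-most pixel are assumptions for illustration; the bit ordering is chosen to match the description of bit 7 as the byte boundary.

```python
def expand_mono_byte(byte, fg, bg):
    """Expand one 1 bpp byte to eight 8 bpp pixels (bit 7 = left-most pixel)."""
    return bytes(fg if (byte >> bit) & 1 else bg for bit in range(7, -1, -1))


# Hypothetical colors: 0x0F for set bits, 0x00 for cleared bits.
pixels = expand_mono_byte(0b10100000, fg=0x0F, bg=0x00)
print(pixels.hex())  # 0f000f0000000000
```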
The start address of the cross-section rectangle may correspond to the Nth bit in a calculated source byte, where N=0, 1, 2, ..., 7, as illustrated in FIG. 2. In the example illustrated in FIG. 2, each double word of source data may have to be shifted (7-N) times to the left. In a software solution, a loop of shift operations over the entire source data may be very time consuming and degrade the overall performance of the system. For example, if N=0, a total of seven shift operations may have to be performed to bring the starting bit of source data to the byte boundary for a monochrome bit map.
In the example of FIG. 2, the start address of the cross-section rectangle may correspond to the fifth bit (i.e., N=5) of unexpanded source data byte three, which has to be expanded and written into a portion of display memory 403. For the purposes of illustration, the example of FIG. 2 uses the parameter values DSAB=A6480, DSACLP=A670A, DSP=280, WBLT=16 pixels, and HBLT=20 lines listed above following Equations 1-4. All values are given in hexadecimal unless otherwise noted; the values stated in pixels and lines are decimal.
Plugging these values into Equations 1-4 yields:
X-Offset = (A670A - A6480) mod 280 = 10 pixels    (EQ 10)
Y-Offset = (A670A - A6480) div 280 = 1 line    (EQ 11)
WCS = 16 - 10 = 6 pixels    (EQ 12)
HCS = 20 - 1 = 19 lines    (EQ 13)
In the example of FIG. 2, color depth may be 8 bits per pixel (bpp). X-Offset at 10 pixels, for a monochrome bit mode (1 bpp) may be equivalent to 1 byte and 2 bits or 10 bits total.
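A new source address may be calculated according to the following Equations 14-16.
If (WBLT mod 8 > 0) then NSBL = WBLT div 8 + 1    (EQ 14)
If (WBLT mod 8 = 0) then NSBL = WBLT div 8    (EQ 15)
NSSA = OSSA + (NSBL * Y-Offset) + (X-Offset div 8)    (EQ 16)
Where NSBL is the number of source bytes to be expanded per BITBLT line, OSSA is the old source start address, and NSSA is the new source start address.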
Applying the values from the example of FIG. 2 to Equations 14-16 yields Equations 17-19 below. Again, all values shown are in hexadecimal unless stated otherwise.
NSBL = WBLT div 8 = 10 div 8 = 2    (EQ 17)
NSSA = OSSA + (2 * 1) + (A div 8)    (EQ 18)
NSSA = OSSA + 2 + 1 = OSSA + 3 bytes    (EQ 19)
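Equations 14-16 can be sketched in Python as follows, using the FIG. 2 values of a 16 pixel wide BITBLT rectangle, an X-Offset of 10 pixels, and a Y-Offset of 1 line. The function name and the choice of OSSA=0 are hypothetical. The returned starting bit is an added convenience that assumes bit 7 is the left-most pixel of each source byte, which agrees with the N=5 starting bit described for FIG. 2.

```python
def new_source_start(ossa, wblt_pixels, x_offset, y_offset):
    """Apply Equations 14-16 to a 1 bpp source; return (nsbl, nssa, start_bit)."""
    nsbl = (wblt_pixels + 7) // 8                  # EQ 14/15: bytes per line, rounded up
    nssa = ossa + nsbl * y_offset + x_offset // 8  # EQ 16
    start_bit = 7 - (x_offset % 8)                 # assumed bit N inside the byte at NSSA
    return nsbl, nssa, start_bit


nsbl, nssa, start_bit = new_source_start(ossa=0, wblt_pixels=16, x_offset=10, y_offset=1)
print(nsbl, nssa, start_bit)  # 2 3 5
```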
Now the problem is that the start address of the clip cross-section rectangle does not lie on a byte boundary, but rather starts from bit five of unexpanded source byte three. The software solution to this problem is to perform a Shift Left operation, bit-wise, on every double word of unexpanded source data two times (i.e., since bit five is two positions away from bit 7, the byte boundary) and then use the BITBLT engine to expand the source data from the NSSA memory location and write the resultant data into the memory location of the clip cross-section rectangle in display memory 403 with the following BITBLT register values: Destination Start Address (DSAB)=A670A, Width (WBLT)=6 pixels (6 bytes at 8 bpp), and Height (HBLT)=19 lines.
Shifting bit-wise twice in the above example (up to a maximum of seven times in a worst case) may create a substantial penalty in overall system performance when using a software solution.
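The software shift described above might look like the following Python sketch, which shifts one line of monochrome source data left by (7-N) bits so that bit N of the first byte lands on the byte boundary. The function name and the sample data are hypothetical, and the sketch treats a whole line as one large integer for brevity, whereas a driver would more likely loop over the source one double word at a time, which is where the cost arises.

```python
def shift_line_left(line: bytes, start_bit: int) -> bytes:
    """Shift one line of 1 bpp source data left by (7 - start_bit) bits."""
    shift = 7 - start_bit
    value = int.from_bytes(line, "big") << shift
    mask = (1 << (8 * len(line))) - 1  # discard bits pushed past the left edge
    return (value & mask).to_bytes(len(line), "big")


line = bytes([0b00110101, 0b11000000])
shifted = shift_line_left(line, start_bit=5)  # N=5, so shift twice
print(shifted.hex())                          # d700
```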
Another application area where difficulties may arise using a BITBLT engine is a pattern copy operation. As illustrated in FIG. 3, in a pattern copy operation, source data may comprise a pattern 301 located in an off-screen portion of display memory 403 representing a pattern image 300. Since the start address of clip rectangle 302 may be the memory location of any pixel within one of the 8 by 8 pattern tiles covering the BITBLT rectangle, the left-most tiles of the clip cross-section rectangle may be offset by some number of pixels with respect to the original pattern.
The result would be a rectangle painted with the same pattern, except that the first column of pattern tiles may be shifted to the left by the same number of pixels as the Pattern X-Offset value. In the example of FIG. 3, Pattern X-Offset has a value of 2 pixels. In addition, if the top of the rectangle has been offset by some number of pixel lines, then the Pattern Y-Offset may define the start line of the first (top) row of pattern tiles. In the example of FIG. 3, the Pattern Y-Offset has a value of 3 lines.
For the example of FIG. 3 we assume the following values, which are in hexadecimal unless otherwise noted: DSA=6400, DSACLP=7F8A, WBLT=40 Bytes, HBLT=40 lines, DSP=280, and SSP=280.
Applying these values, X-Offset and Y-Offset may be calculated using Equations 1-2 as illustrated in Equations 20-21. Again, all values shown are in hexadecimal unless otherwise noted.
X-Offset = (7F8A - 6400) mod 280 = A = 10 Bytes    (EQ 20)
Y-Offset = (7F8A - 6400) div 280 = B = 11 lines    (EQ 21)
Values for Pattern X-Offset and Pattern Y-Offset may be calculated using Equations 22-23 as follows.
Pattern X-Offset = X-Offset mod 8 = 2 Bytes    (EQ 22)
Pattern Y-Offset = Y-Offset mod 8 = 3 lines    (EQ 23)
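The pattern offset arithmetic of Equations 20-23 can be reproduced with a brief Python sketch. The function name `pattern_offsets` and the `tile` parameter are hypothetical; the addresses and pitch are the hexadecimal values that appear in Equations 20-21.

```python
def pattern_offsets(dsa, dsaclp, dsp, tile=8):
    """Return (x_offset, y_offset, pattern_x_offset, pattern_y_offset)."""
    y_offset, x_offset = divmod(dsaclp - dsa, dsp)  # EQ 21 (div) and EQ 20 (mod)
    return x_offset, y_offset, x_offset % tile, y_offset % tile  # EQ 22, EQ 23


offsets = pattern_offsets(dsa=0x6400, dsaclp=0x7F8A, dsp=0x280)
print(offsets)  # (10, 11, 2, 3)
```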
In a software solution, the source pattern may be adjusted to incorporate the Pattern X-Offset and Pattern Y-Offset, and the clip rectangle 302 may then be filled using the BITBLT engine with the new adjusted source pattern. To adjust the source pattern with Pattern X-Offset and Pattern Y-Offset, it may be necessary to fill out four adjacent pattern tiles in an off-screen portion of display memory with the source pattern. Then, a new source start address of the 8 by 8 pattern may be calculated, taking into account the Pattern X-Offset and Pattern Y-Offset values. Readjusting the source pattern may require at least two extra BITBLT operations to fill out the four adjacent pattern tiles. These extra BITBLT operations may degrade overall system performance.
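The pattern adjustment described above can be mimicked in Python by laying out four adjacent copies of the tile and reading an 8 by 8 window that starts at the Pattern X-Offset and Pattern Y-Offset. This is only a host-side sketch of the idea: the function name and the list-of-rows tile representation are assumptions, and an actual driver would perform the copies with BITBLT operations in off-screen display memory.

```python
def adjusted_pattern(pattern, x_off, y_off):
    """Return the tile seen through a window at (x_off, y_off) of a 2 by 2 tiling."""
    size = len(pattern)
    tiled = [row + row for row in pattern] * 2  # four adjacent copies of the tile
    return [tiled[y_off + r][x_off:x_off + size] for r in range(size)]


# Hypothetical 8 by 8 tile whose pixel values encode their own position.
tile = [[r * 8 + c for c in range(8)] for r in range(8)]
adjusted = adjusted_pattern(tile, x_off=2, y_off=3)
print(adjusted[0][0])  # 26, i.e. the pixel at row 3, column 2 of the original tile
```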
One partial solution to the above-described difficulties is to enable the clipping function in hardware rather than software, as illustrated in FIG. 5. The incorporation of a BITBLT engine into display controller 502, and the sequence of its operation, may make a hardware clipping function relatively easy to implement. Display controller 502 may incorporate a BITBLT engine and a hardware module to perform the functions of software modules 405 and 406 in FIG. 4. Source data may be received from host processor 501 or from display memory 503, processed by the BITBLT engine and stored in display memory 503. Display controller 502 may retrieve resultant image data from display memory 503 and output a signal to display 504 to generate the resultant image.
However, a hardware clipping function may create a serious penalty in gate count in control logic. For a display controller (e.g., Video Graphics Adapter or the like) as in any electronic design, it may be desirable to minimize additional gates required to perform various functions.
Referring to FIG. 5, the conventional hardware solution for the clipping function is to store X and Y coordinates of the clip rectangle (i.e., beginning and end coordinates) inside registers of display controller 502 through system I/O write cycles from host CPU 501 and let hardware control logic within the display controller 502, embedded inside the BITBLT engine, take care of the entire clipping function. This hardware solution performs all clipping functions through hardware, regardless of whether the clipping is color expansion of a monochrome bit map, pattern copy, or any other regular BITBLT.
In order for the hardware to perform all of the necessary operations to support the clipping function as in the conventional software method, a conversion module must be provided within display controller 502 to convert all the X and Y coordinates of the clip rectangle and BITBLT destination rectangle to linear addresses, as performed by software module 405 in FIG. 4. The X and Y coordinate information may be provided to display controller 502 by host processor 501, as illustrated in FIG. 5.
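The X-Y to linear address conversion mentioned above amounts to one multiplication and one or two additions per coordinate pair, as the following Python sketch suggests. The function name, the base address of A0000 (hex), and the pixel coordinates are assumptions; they were chosen only because, with a pitch of 280 (hex) at 8 bpp, they reproduce the DSAB and DSACLP addresses of the first example.

```python
def linear_address(base, x, y, pitch, bytes_per_pixel=1):
    """Convert an X-Y pixel coordinate to a linear byte address."""
    return base + y * pitch + x * bytes_per_pixel


# Hypothetical base address and coordinates; pitch of 0x280 bytes at 8 bpp.
dsab = linear_address(0xA0000, x=128, y=40, pitch=0x280)
dsaclp = linear_address(0xA0000, x=497, y=78, pitch=0x280)
print(hex(dsab), hex(dsaclp))  # 0xa6480 0xac4f1
```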
In addition, an arithmetic logic and control unit must be provided to perform all calculations and decisions based on Equations 1-4, 14-16 and 22-23 set forth above, and as performed by software module 406 in FIG. 4. The results of the hardware module may be passed to the BITBLT engine to execute the corresponding clipping function.
For even a 640 by 480 pixel resolution image, the X-Y coordinate conversion may require a 10 bit multiplier plus an 18 bit adder. The size of the multiplier and adder for much higher resolutions (e.g., 1024×768, 2048×1024, or the like) increases accordingly. Thus, to build such features into hardware may require an inordinate number of logic gates, adding to the cost of the VGA. Of course, such a hardware solution may overcome some of the difficulties described above in connection with the software implementation of the clipping function. In the aggregate, however, the relative increase in performance is far outweighed by the increase in gate count. Thus, the prior art hardware solutions for clipping functions may be inadequate.
Thus, it remains a requirement in the art to provide a clipping function which will not seriously degrade overall system performance, regardless of mode, while minimizing overall gate count in a display controller.