.sctn. 1.1 Field of the Invention
The present invention concerns read, modify and write operations which may be used, for example, when blending images. More specifically, the present invention may concern blending a first image, such as one or more characters, with a second image.
.sctn. 1.2 Related Art
.sctn. 1.2.1 Compositing Images
Images, such as bitmap images, JPEG images, and GIF images, for example, may be combined into a single display frame to be rendered on a display device. Separate images may be rendered on separate, distinct sections of the display frame. Often, however, rendering more than one image on a common section of the display frame is desired. In such instances, one or more foreground images may be rendered over a background image. Potential sources of foreground and background images are introduced in .sctn. 1.2.1.1 below. Then, various known methods of compositing images are introduced in .sctn..sctn. 1.2.1.2 through 1.2.1.4 below.
.sctn. 1.2.1.1 Image Sources and Communicating Images from their Sources
In a personal computing system, the background (or second or remote) image may be stored in a frame buffer on a video driver card, while a foreground (or first or local) image may be available closer to the means for performing the modification, such as a central processing unit for example. In most Pentium-based computers and in the Apple Power Macintosh computer, a peripheral component interconnect (or "PCI") local bus is used to transfer data between hardware devices, adapters, and other bus backplanes. An AGP bus may be similarly used. The PCI bus may be coupled with the host CPU and main memory through a bridge device that controls the data transfers between the CPU, cache, and main memory.
.sctn. 1.2.1.2 Sprite Operations and their Perceived Limitations
"Sprites" may be thought of as foreground images to be composited over a background image or background images. A sprite may be thought of as a picture with an irregular shape, possibly with transparent holes, that can be moved on the screen and which has a depth. As shown in FIG. 1, sprites 110 may be rendered on a background image 120 to produce a composite image 130 in an off-screen buffer storage area 140. The contents of the off-screen buffer storage area 140 may then be combined as a frame 150 rendered on a display. (See, e.g., the electronic article Herman Rodent, "Animation in Win32," Microsoft Developer Network, at techart.chm::html/msdn_anim32.htm (Feb. 1, 1994).)
.sctn. 1.2.1.3 BitBlt Operations and their Perceived Limitations
As a further example of compositing a foreground image on a background image, referring to FIG. 2, character information 210, which may be a bitmap, may be combined with a background image 220 stored in a display frame buffer storage area 220 to generate a composite image 240. A block transfer (or "BLT") function may be used to modify and transfer bitmaps. The BitBlt function, supported by the Windows NT.RTM. operating system from Microsoft Corporation of Redmond, Wash., may be used to transfer bits from a rectangle on a source device to a rectangle, having the same dimensions, on a destination device. The bitmap character information 210 includes individual character bitmaps 212, also referred to as glyphs. Notice that each glyph 212 has a foreground color 214 and a background color 216. Notice, further, that in the composite image, the feature 222 of the background image 220 is obscured by both the foregrounds 214 and the backgrounds 216 of the glyphs 212. Unfortunately, in many instances, it may be desired to display only the foreground (that is, only the character information) 214 of the glyphs 212.
There are some techniques for performing transparency operations with bitmaps. (See, e.g., the article, Ron Gery, "Bitmaps with Transparency," Microsoft Developer's Network, techart.chm::/html/msdn_transblt.htm (Jun. 1, 1992).) In this regard, there are "alpha-blt" operations which permit a first image and a second image to be blended together based on a blend coefficient (also referred to as "alpha"). Unfortunately, if the second image is stored away from where these operations take place (e.g., the CPU), which is often the case, it must be read (e.g., by the CPU), which may be a relatively slow operation. Further, bitblt and alpha-blt operations are performed on a pixel basis, which is not appropriate for newly developed high resolution rendering systems which operate on a pixel sub-component level. (See, e.g., U.S. patent application Ser. No. 09/240,653, filed on Jan. 29, 1999 and incorporated herein by reference.)
.sctn. 1.2.1.4 Combining Images Using a Blend Coefficient
Blending is a way to combine two samples using their color components, as weighted by their blend coefficient (or alpha) values. Alpha blending allows colors (or materials or textures) on a surface to be blended, with transparency, onto another surface. For example, blending may combine a pixel's (or pixel group's, or bitmap's) color with that of a pixel stored in a video frame buffer storage area at a corresponding location. The blending depends on the alpha value of the fragment and that of the corresponding currently stored pixel. For example, the DXCompositeOver function uses the alpha value of a source (or first) sample to combine colors of the source (or first) sample with that of a destination (or second) sample. More specifically, with the DXCompositeOver function, the source (or first) sample is scaled by alpha, the destination (or second) sample is scaled by the inverse of alpha, and the two values are added. (See, e.g., the article, "Compositing Helper Functions," Microsoft Developers Network, dxmedia.chm::/dxmedia/help/dxt/reference/helpers/composit_helpers.htm, p. 3.) On the other hand, with the DXCompositeUnder function, the destination (or second) sample is scaled by alpha, the source (or first) sample is scaled by the inverse of alpha, and the two values are added. Id. at pp. 3-4.
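For illustration only, the two weightings described above may be sketched as follows. This sketch is not the DXCompositeOver or DXCompositeUnder implementation. The function names are hypothetical, alpha is assumed to be normalized to the range 0.0 to 1.0, and the "inverse of alpha" is assumed to mean (1 - alpha).

```python
def composite_over(src, dst, alpha):
    """Scale the source by alpha and the destination by (1 - alpha), then add.

    src and dst are tuples of color components; alpha is assumed to lie
    in the range 0.0 to 1.0 (an assumption made for this sketch).
    """
    return tuple(s * alpha + d * (1.0 - alpha) for s, d in zip(src, dst))


def composite_under(src, dst, alpha):
    """Scale the destination by alpha and the source by (1 - alpha), then add."""
    return tuple(d * alpha + s * (1.0 - alpha) for s, d in zip(src, dst))


# Hypothetical samples: a pure red source over a pure blue destination.
red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)
over_result = composite_over(red, blue, 0.25)    # (0.25, 0.0, 0.75)
under_result = composite_under(red, blue, 0.25)  # (0.75, 0.0, 0.25)
```

With an alpha of 0.25, the "over" case keeps one quarter of the source and three quarters of the destination, while the "under" case swaps the two weights.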
In some embodiments, the alpha information may be represented by a whole number from 0 to 255, where a 255 source alpha indicates that the source pixel is to overwrite the destination pixel (that is, the source pixel is 100 percent opaque) and a 0 source alpha indicates that the destination pixel is to be left unchanged (that is, the source pixel is 100 percent transparent). A value between (and not including) 0 and 255 means that the destination pixel should be read and combined with the source pixel to generate a new pixel. The new pixel is then to be written to the destination image to replace the original destination pixel.
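The three cases above may be sketched, for a single 8-bit color channel, as follows. This is a minimal sketch under stated assumptions: the function name, the read_dst callable, and the round-to-nearest integer division are illustrative choices and are not taken from any particular implementation.

```python
def blend_channel(src, alpha, read_dst):
    """Return the new 8-bit channel value, or None if no write is needed.

    alpha is a whole number from 0 to 255. read_dst is a hypothetical
    zero-argument callable that fetches the destination value; it is
    called only when the source and destination must actually be combined.
    """
    if alpha == 255:
        return src            # fully opaque: overwrite, no destination read
    if alpha == 0:
        return None           # fully transparent: destination left unchanged
    dst = read_dst()          # intermediate alpha: destination must be read
    # Rounding to the nearest integer is an assumption of this sketch.
    return (src * alpha + dst * (255 - alpha) + 127) // 255


read_log = []                 # records each destination read


def counting_read():
    read_log.append("read")
    return 100                # hypothetical destination channel value


results = [blend_channel(200, a, counting_read) for a in (255, 0, 128)]
```

Only the intermediate alpha value triggers a read of the destination, so read_log holds a single entry after the three calls.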
For example, FIG. 3 illustrates a blending operation in which characters 312 have a foreground portion 314 with an alpha equal to 255 and a background portion 316 with an alpha equal to 0. Thus, when the glyph 310 is blended with the background image 330, the backgrounds (i.e., non-character pixels) 316 of the glyphs 312 do not obscure the feature 322 of the background image. That is, only the foreground portions (i.e., character pixels) 314 of the glyphs 312 overwrite the background image 330 in the resulting image 340.
Before two images can be blended, pixel(s) from each of the images must be read. Once the pixel(s) from each of the images are combined to generate a modified pixel, the modified pixel is written to a destination storage. Section 1.2.1.4.1 below describes a conventional method of reading pixel(s) from two images to be blended, modifying them, and writing them to a destination storage area.
.sctn. 1.2.1.4.1 Piecewise Read-Modify-Write and its Perceived Shortcomings
FIG. 4 is a high level flow diagram of a method 400 for blending a first (or local) image (stored (temporally) close to the CPU which will perform the blending operation) with a second (or remote) image (stored (temporally) further from the CPU). In this method 400, it is assumed that the second (or remote) image is stored in a video frame buffer. First, as shown in act 410, a pixel (or pixels, in a "vector basis" blend operation) is read from the second (or remote) image. Then, in act 420, a blend operation is performed based on the read pixel(s) of the second (or remote) image, a corresponding pixel of the first (or local) image (e.g., a character), a foreground color of the first (or local) pixel, and a blend coefficient (alpha) to generate a modified pixel (or modified pixels). In act 430, the modified pixel(s) is then written back (e.g., written back to the video frame buffer storing the second (remote) image). Referring to conditional branch point 440, if there are more pixels, the method 400 continues back at act 410. If, on the other hand, there are no more pixels, the method 400 is left via RETURN node 450.
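The acts of the method 400 may be sketched as the following loop. The FrameBuffer class is a hypothetical stand-in for the remote video frame buffer that merely counts accesses, pixels are simplified to a single 8-bit channel, and the rounding used in the blend is an assumption of the sketch.

```python
class FrameBuffer:
    """Hypothetical stand-in for a remote frame buffer that counts accesses."""

    def __init__(self, pixels):
        self.pixels = list(pixels)
        self.reads = 0
        self.writes = 0

    def read(self, index):
        self.reads += 1
        return self.pixels[index]

    def write(self, index, value):
        self.writes += 1
        self.pixels[index] = value


def blend_piecewise(frame_buffer, alphas, foreground):
    """Read, blend and write back one pixel at a time (acts 410 through 440)."""
    for index, alpha in enumerate(alphas):       # branch point 440: more pixels?
        dst = frame_buffer.read(index)           # act 410: read the remote pixel
        new = (foreground * alpha + dst * (255 - alpha) + 127) // 255  # act 420
        frame_buffer.write(index, new)           # act 430: write the result back


fb = FrameBuffer([100, 100, 100, 100])
blend_piecewise(fb, [0, 255, 128, 0], foreground=200)
```

In this sketch every pixel costs one read from the remote buffer, including pixels whose alpha is 0 or 255 and whose result does not depend on the value read.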
As is the case with the foregoing example, many "remote" images (or destination surfaces) have CPU read performance characteristics that are expensive relative to the cost of a write operation. For example, if a destination surface is stored in a video frame buffer of a video adapter card in a personal computer with a PCI bus, the present inventor has determined that the CPU might be able to read the destination surface from the video frame buffer at 8 MB/s, while the CPU might be able to write information to the video frame buffer at a rate of 200 MB/s. The present inventor has also determined that the read speed may depend on the width of the read request--if the CPU reads using 8-bit operations, the throughput may be 2 MB/s, if the CPU reads using 16-bit operations, the throughput may be 4 MB/s, and if the CPU reads using 32-bit operations, the throughput may be 8 MB/s. That is, a read request for 8 bits of information, 16 bits of information, or 32 bits of information may take about the same time. This is because the PCI bus is 32 bits wide.
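The arithmetic behind these figures may be sketched as follows. The sketch assumes that 1 MB is 1,000,000 bytes and that every read request takes the same time regardless of width, so that the 8 MB/s figure for 32-bit (4-byte) reads implies about 2,000,000 requests per second. The 640 by 480, 16 bits/pixel surface is a hypothetical example, not a figure from the measurements above.

```python
# 8 MB/s at 4 bytes per request implies about 2,000,000 requests per second.
REQUESTS_PER_SECOND = 8_000_000 // 4
WRITE_BYTES_PER_SECOND = 200_000_000


def read_throughput(bytes_per_request):
    """Bytes per second when each request, of any width, takes the same time."""
    return REQUESTS_PER_SECOND * bytes_per_request


def seconds_to_read(total_bytes, bytes_per_request):
    """Time to read total_bytes using requests of the given width."""
    return total_bytes / read_throughput(bytes_per_request)


# Hypothetical surface: 640 x 480 pixels at 16 bits (2 bytes) per pixel.
surface_bytes = 640 * 480 * 2
read_16_bit = seconds_to_read(surface_bytes, 2)   # 16-bit read requests
read_32_bit = seconds_to_read(surface_bytes, 4)   # 32-bit read requests
write_time = surface_bytes / WRITE_BYTES_PER_SECOND
```

Under these assumptions, reading the whole surface takes roughly 0.15 s with 16-bit requests and roughly 0.077 s with 32-bit requests, while writing it takes roughly 0.003 s.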
Blending operations like that illustrated in FIG. 4, however, are often done on a pixel-by-pixel basis, or on a vector basis where a fixed number of pixels are processed in parallel. Typically, with pixel-by-pixel blending operations, read requests are issued when necessary at a size equal to, or smaller than, that of the destination pixel. For example, if a blend is done to a second (or remote) image having 16 bits/pixel (or "bpp"), 16-bit read requests will typically be used. In another example, if a blend is done to a second (or remote) image having 24 bpp, 8-bit read requests will typically be used.
With vector blending, read requests are typically issued at the size of the vector. For example, if a blend is done to a destination image having 16 bpp and the blending operation handles 4 pixels at once, 64-bit reads will typically be used.
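The read request sizes in the foregoing examples may be reproduced with the following sketch. The selection rule for pixel-by-pixel reads (the widest of 32, 16 or 8 bits that evenly divides the pixel size) is an assumption chosen only because it matches the 16 bpp and 24 bpp examples; the function names are hypothetical.

```python
def pixel_read_width(bpp):
    """Read width in bits for a pixel-by-pixel blend (assumed selection rule).

    Picks the widest of 32, 16 or 8 bits that evenly divides the pixel size,
    so that the request is never larger than the destination pixel.
    """
    for width in (32, 16, 8):
        if bpp % width == 0:
            return width
    raise ValueError("pixel size must be a multiple of 8 bits")


def reads_per_pixel(bpp):
    """Number of read requests needed to fetch one destination pixel."""
    return bpp // pixel_read_width(bpp)


def vector_read_width(bpp, pixels_per_vector):
    """Read width in bits when a fixed number of pixels is handled at once."""
    return bpp * pixels_per_vector


width_16bpp = pixel_read_width(16)        # 16-bit reads
width_24bpp = pixel_read_width(24)        # 8-bit reads
reads_24bpp = reads_per_pixel(24)         # three reads per 24 bpp pixel
vector_width = vector_read_width(16, 4)   # 64-bit reads
```

Under this assumed rule, a 24 bpp destination costs three 8-bit read requests per pixel, each of which may take about as long as a single 32-bit request.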
Read requests that are wider than 8 bits should be properly aligned. For example, a 32-bit read request should be done from an address that is a multiple of four (4) bytes (32 bits). The alignment requirement stems from limitations on the types of reads supported by the processor and, therefore, by the operating system.
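The alignment condition may be expressed as a simple test on the byte address, as in the following sketch. The function names and the example addresses are hypothetical.

```python
def is_aligned(address, request_bits):
    """True if a read of request_bits may start at the given byte address."""
    size_bytes = request_bits // 8
    return address % size_bytes == 0


def align_down(address, request_bits):
    """Nearest properly aligned byte address at or below the given address."""
    size_bytes = request_bits // 8
    return address - (address % size_bytes)


aligned_32 = is_aligned(0x1004, 32)     # True: 0x1004 is a multiple of 4
misaligned_32 = is_aligned(0x1006, 32)  # False: 0x1006 is not a multiple of 4
aligned_16 = is_aligned(0x1006, 16)     # True: 0x1006 is a multiple of 2
base = align_down(0x1006, 32)           # 0x1004
```

An 8-bit read request has a size of one byte, so the test is satisfied at every address, consistent with the alignment requirement applying only to wider requests.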
As can be appreciated, in the conventional method 400 for blending a first (local) image (stored (temporally) close to the CPU to perform the blending operation) with a second (remote) image (stored (temporally) further from the CPU), a "bottleneck" is caused by the read operations. Thus, better methods and apparatus are needed to blend two (2) images, particularly when one image is stored at a location separated from the machine (e.g., a CPU) performing the blend operation by a bus having relatively slow read performance. These methods and apparatus should work with new techniques for rendering images at high resolutions by operating on pixel sub-components, rather than merely operating on pixels.