FIG. 1 illustrates the system architecture for a conventional computer system, such as an IBM PS/2® computer. The exemplary computer system of FIG. 1 is for descriptive purposes only. Though the description below may refer to terms commonly used in describing particular computer systems, such as an IBM PS/2 computer, the description and concepts equally apply to other systems, including systems having architectures dissimilar to that of FIG. 1.
The exemplary computer 100 includes a central processing unit (CPU) 105, which may include a conventional microprocessor; a system random access memory (RAM) 110 for temporary storage of information; and a read only memory (ROM) 115 for permanent storage of information. A memory controller 120 is provided for controlling system RAM 110. A bus controller 125 is provided for controlling bus 130. An interrupt controller 135 is used for receiving and processing various interrupt signals.
Mass storage may be provided by a diskette 142, a CD-ROM disk 147 or a hard disk 152. The diskette 142 can be inserted into a diskette drive 141, which is, in turn, connected to bus 130 by a controller 140. Similarly, the CD-ROM disk 147 can be inserted into a CD-ROM drive 146, which is also connected by a controller 145 to bus 130. Finally, a hard disk 152 is part of a fixed disk drive 151, which is connected to bus 130 by controller 150.
Data input and output to computer system 100 is provided by a number of devices. For example, a keyboard and mouse controller 155 connects to bus 130 for controlling a keyboard input device 156 and a mouse input device 157. A DMA controller 160 is provided for performing direct memory access to system RAM 110. A visual display is generated by a video controller 165, which controls a video output display 170. As will be further described below, video controller 165 may include a graphics engine 175, a frame buffer 180, and off-screen VRAM 185. Under the control of the computer system 100, display 170 presents a two dimensional array of picture elements (pixels), which may be independently controlled to form an image. Other input and output devices, such as an audio subsystem 191, may be connected to the system through expansion slot 190.
The computer 100 is generally controlled and coordinated by operating system software, such as the OS/2® operating system, available from the International Business Machines Corporation (IBM), Boca Raton, Fla. Conventional operating systems typically control and schedule computer processes for execution, perform memory management, provide a file system, networking capabilities, and I/O services, and provide a user interface, such as a graphical user interface (GUI), among other things. User applications, such as editors and spreadsheets, rely, directly or indirectly, on these and other capabilities of the operating system.
Computer systems are increasingly using sophisticated techniques to present information to a user. Modern computers use graphics capabilities to produce various graphical items, such as lines, boxes, and circles, on a display 170, typically in color. These graphics capabilities are used, for example, by GUIs and other computer applications.
In addition to graphics, modern computers are increasingly using multimedia techniques, which store, organize, and present various forms of data, including textual data, digital audio data, digital video data, and digital music data (e.g., MIDI). For example, a computer using multimedia techniques may play back video data and audio data to produce a movie clip video sequence on display 170 with synchronized audio output from audio subsystem 191.
Graphical and video images are conventionally produced by storing data for each pixel in a corresponding location of a frame buffer 180. These data are placed in the frame buffer 180 by graphics engines 175 or, as is further discussed below, by software. A frame buffer 180 is typically, although not necessarily, constructed from special memory chips called VRAMs, which allow conventional read and write operations to be performed to memory cells of the VRAM on one port, while allowing data to be scanned out from the cells via a second, scan port. The video controller 165 typically scans the data out and uses the data to cause corresponding pixels of the display 170 to be energized in accordance with the data. The size of a frame buffer 180 depends upon the number of pixels of the display 170 and the amount of data required for each pixel.
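As a simple illustration of the last point above, the size of a frame buffer is the product of the display's pixel count and the number of bytes per pixel. The sketch below uses an assumed 640×480 display at 3 bytes per pixel; these particular values are illustrative and do not appear in the specification:

```python
def frame_buffer_bytes(width, height, bytes_per_pixel):
    """Size of a frame buffer holding one full screen of pixel data."""
    return width * height * bytes_per_pixel

# An assumed 640x480 display at 3 bytes (24 bits) per pixel:
print(frame_buffer_bytes(640, 480, 3))  # 921600 bytes, roughly 900 KB
```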
The display data may indicate whether or not a pixel should be illuminated or, if color images are involved, may indicate the desired luminance and chrominance for a pixel. Moreover, color data may be implemented according to a variety of formats, such as YUV, RGB, RBG, etc., which require many bits of data per pixel. Modern color formats, for example, may require up to three bytes, or twenty-four bits, of information per pixel.
Producing graphical and video images requires a substantial amount of system resources. Even seemingly simple graphical items, such as lines and circles, may require considerable computation to determine which pixels should be illuminated to yield a high quality graphical item. Animated video usually requires a substantial amount of storage resources and bandwidth from system bus 130. A typical image may involve tens of thousands of pixels, and each pixel may involve several bytes of data. Moreover, video typically involves displaying a sequence of images at a playback rate of approximately 15 to 45 images per second.
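The bandwidth figure above follows directly from multiplying image size by playback rate. A hedged sketch, using an assumed 160×120 image at 2 bytes per pixel and 30 images per second (illustrative values, not drawn from the specification):

```python
def playback_bandwidth(width, height, bytes_per_pixel, frames_per_second):
    """Bytes per second needed to move uncompressed image frames."""
    return width * height * bytes_per_pixel * frames_per_second

# An assumed 160x120 image, 2 bytes per pixel, played at 30 images per second:
print(playback_bandwidth(160, 120, 2, 30))  # 1152000 bytes per second
```

Even this modest example pushes over a megabyte per second across the system bus, before any scaling or color conversion is performed.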
To help alleviate the computational burdens, various graphics engines have been developed to off-load from the CPU 105 the computational burden of producing graphical items. Graphics engines are known in the art and will not be further discussed.
To help alleviate the storage and bandwidth burdens, compression and decompression techniques are often utilized. In such systems, compressed video data are retrieved into system RAM 110, where the compressed data may be decompressed by a software decompression routine. Afterwards, the decompressed data may be placed in frame buffer 180, or the decompressed data may be further operated upon by software, as described below.
Often, the image data, i.e., either the graphical or the decompressed video image data, need to be operated upon to provide a desired image. In some cases, the source image data may need to be stretched or scaled by a predefined amount. For example, an image may need to be scaled because a user has resized the image on the display 170 using the mouse 157. Scaling is conventionally performed by a software scaling routine. Referring to FIG. 2, for example, a source image 205 may be stored as a 160×120 pixel image, and the to-be-displayed, or target, image 210 may be 200×150 pixels. In this example, both the horizontal and the vertical dimensions of the source image 205 are scaled at a 5:4 ratio. That is, every 4 pixels of source image 205 in the horizontal dimension must yield 5 pixels of target image 210 in that direction, and every 4 rows of the source image 205 must yield 5 rows of target image 210. Often, this is achieved by copying certain pixels and replicating other pixels according to known scaling techniques. In the example of FIG. 2, a conventional technique would copy the first three pixels of the source image 205 and replicate a fourth pixel. In cases where an image must be scaled down, certain pixels or rows would be excluded from the copying.
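The copy-and-replicate scaling just described can be sketched as follows. This is a minimal illustration, not the specification's implementation; the function name is hypothetical, and real routines operate on packed pixel data rather than Python lists:

```python
def scale_up_5_to_4(row):
    """Scale a row of pixels at a 5:4 ratio: copy each pixel, and
    replicate every fourth pixel so 4 source pixels yield 5 target pixels."""
    out = []
    for i, pixel in enumerate(row):
        out.append(pixel)
        if i % 4 == 3:          # every fourth source pixel...
            out.append(pixel)   # ...is written twice
    return out

source_row = list(range(160))         # one 160-pixel row, as in FIG. 2
target_row = scale_up_5_to_4(source_row)
print(len(target_row))                # 200
```

The same routine, applied to the list of rows instead of the pixels within a row, takes the 120 source rows to the 150 target rows.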
In other cases, the source image 205 may need to be color converted. For example, color conversion may be necessary because the color format of the source image 205 may be unsupported by the display 170 (FIG. 1). For instance, the source image 205 may be stored in a RGB 5-5-5 format, and the target image 210 may need to be in RGB 24 (R-G-B) format. Color conversion is typically performed by a color conversion routine.
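One common way to perform the RGB 5-5-5 to RGB 24 expansion mentioned above is sketched below. The bit layout assumed here (five bits per channel packed into a 16-bit word, red in the high bits, top bit unused) and the replicate-the-high-bits widening step are conventional choices, not details taken from the specification:

```python
def rgb555_to_rgb24(pixel):
    """Expand a 16-bit RGB 5-5-5 pixel into an (R, G, B) tuple of bytes.
    Each 5-bit channel is shifted left 3 bits, and its top 3 bits are
    replicated into the low bits so full-scale values map to 255."""
    r5 = (pixel >> 10) & 0x1F
    g5 = (pixel >> 5) & 0x1F
    b5 = pixel & 0x1F
    widen = lambda c: (c << 3) | (c >> 2)   # 5 bits -> 8 bits
    return (widen(r5), widen(g5), widen(b5))

print(rgb555_to_rgb24(0x7FFF))  # (255, 255, 255), i.e. full white
```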
In still other cases, the source image 205 may need to be clipped. For example, referring to FIG. 2, target image 210 is partially obscured or overlaid by image 215. The shaded region 220 is clipped from target image 210. Thus, when the source image 205 is transferred to the frame buffer 180, the image must be clipped so that source data are not written into the frame buffer locations corresponding to the shaded region 220.
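The clipping behavior above amounts to suppressing writes for the obscured pixels. A minimal sketch, using an assumed per-pixel mask over a one-dimensional buffer (real frame buffers are two-dimensional, and the helper name is hypothetical):

```python
def clipped_blt(frame_buffer, source, mask):
    """Copy source pixels into the frame buffer, skipping positions
    where the mask marks the pixel as clipped (obscured)."""
    for i, (pixel, visible) in enumerate(zip(source, mask)):
        if visible:
            frame_buffer[i] = pixel

fb = [0] * 8
clipped_blt(fb, [9] * 8, [1, 1, 1, 0, 0, 1, 1, 1])
print(fb)  # [9, 9, 9, 0, 0, 9, 9, 9] -- masked positions untouched
```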
In each case, conventional techniques typically invoke a routine to perform a corresponding color conversion, scaling, or clipping operation on the source image 205, or possibly on an intermediate image, as described below. As is known in the art, these conventional routines are often implemented with computer instructions that cause the CPU 105 (FIG. 1) to loop through the source or intermediate data and perform a corresponding operation on that data. For example, a conventional color conversion routine would use nested looping instructions to loop through the rows and columns of a two dimensional array of source image data for source image 205. For each pixel's data, the routine may index a color conversion look-up table with the pixel's data, and the table provides the color converted data for that pixel.
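The nested-loop, table-driven conversion just described can be sketched as follows. The table contents here are an arbitrary stand-in; an actual routine would index a table built for the specific source and target color formats:

```python
def color_convert(image, lut):
    """Conventional approach: loop over the rows and columns of a
    two-dimensional source image, replacing each pixel's data with
    the corresponding entry from a color conversion look-up table."""
    for row in image:                    # outer loop: rows
        for x in range(len(row)):        # inner loop: columns
            row[x] = lut[row[x]]
    return image

# Hypothetical 4-entry table mapping source values to converted values.
lut = {0: 0x000000, 1: 0x0000FF, 2: 0x00FF00, 3: 0xFF0000}
image = [[0, 1], [2, 3]]
print(color_convert(image, lut))  # [[0, 255], [65280, 16711680]]
```

Note that the CPU visits every pixel of the source data; it is exactly this per-pixel looping whose cost is discussed below.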
If multiple operations need to be performed on a source image 205, conventional techniques invoke a corresponding routine for each operation in a sequence. For example, if a source image 205 needs to be color converted, scaled, and clipped, conventional techniques would likely invoke a color conversion routine to color convert the source image 205. The color converted image would then be placed in an intermediate data buffer of RAM 110. Then, a scaling routine may be invoked to scale the intermediate, color converted image data, and the scaled and color converted data would be placed in another intermediate data buffer of RAM 110. Lastly, a clipping routine would likely be invoked to clip the scaled and color converted data from the intermediate buffer of RAM 110 to the frame buffer 180, typically according to a mask that would indicate which pixels of color converted and scaled data should be transferred.
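The multi-pass structure described above can be sketched schematically. The per-pixel operations here are trivial stand-ins (the point is the sequence of intermediate buffers, not the pixel math), and all names are hypothetical:

```python
def convert(src):
    """Stand-in for a color conversion pass."""
    return [p + 100 for p in src]

def scale(src):
    """Stand-in for a scaling pass (a simple 2:1 upscale by replication)."""
    return [p for p in src for _ in (0, 1)]

def clip_to_frame_buffer(src, fb, mask):
    """Stand-in for the final clipped transfer to the frame buffer."""
    for i, p in enumerate(src):
        if mask[i]:
            fb[i] = p

# Each pass reads one buffer and writes a new intermediate buffer in RAM.
source = [1, 2, 3]
converted = convert(source)       # intermediate buffer in RAM 110
scaled = scale(converted)         # second intermediate buffer in RAM 110
fb = [0] * 6
clip_to_frame_buffer(scaled, fb, [1] * 6)
print(fb)  # [101, 101, 102, 102, 103, 103]
```

Every intermediate buffer is both written and then re-read in full, which is the source of the data-movement cost discussed next.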
After the data were thus operated upon, the data would be "BLTed" to a frame buffer or the like. The terms "blitter" and "BLT" are generally known in the art to refer to block transfer operations of image data, usually to a frame buffer memory, but also to other target locations, such as a buffer of off-screen VRAM.
As is readily seen, each individual color conversion, scaling, or clipping operation involves moving a substantial amount of data to and from RAM 110, as each routine reads and writes large intermediate data buffers. Consequently, system performance is hindered by the substantial amount of intermediate data movement between the CPU 105 and RAM 110.
Moreover, the instruction looping performed by the conventional routines degrades the performance of modern CPUs 105. This performance degradation is usually attributed to, among other things, instruction looping's detrimental impact on register usage in modern complex instruction set CPUs.
In addition, the prior art conventionally implemented the above-described color conversion, scaling, and clipping functionality as part of the software modules that performed rapid screen updates, for example, as part of video decompressor modules. This approach, however, has certain drawbacks. For instance, the software that performed the color conversion, scaling, and clipping had to be replicated in each of the various modules that performed rapid screen updates, increasing development and maintenance costs. Moreover, the color space conversion routines in each module supported only a limited number of target space conversions. The addition of new color space formats entailed updating numerous modules, some of which may have been developed by other software developers. In addition, the image scaling capabilities for each video compression format were limited to whatever was implemented by the corresponding video decompressor module. Furthermore, the prior art provided no convenient interface to enable applications, such as interactive games, to perform rapid screen updates directly. That is, such applications would need to include all the complexity of scaling, color conversion, and clipping if they were to perform rapid screen updates directly.
Given that modern applications may require images to be presented at a rate on the order of forty-five images per second or more, those skilled in the art will readily appreciate the advantage of performing the above-described operations rapidly.
Moreover, given the further drawbacks described above, which arise from incorporating the above-described functionality into the various modules that perform rapid screen updates, those skilled in the art will readily appreciate the advantage of integrating the various features into an interface library, so that software, such as interactive games and the like, may invoke the functionality at a single source. Such integration allows easier upgrades to color conversion, scaling, and clipping functionality, thereby reducing development and maintenance costs.
Accordingly, there is a need in the art for a method and apparatus that improves the performance of imaging operations.
Likewise, there is a need in the art to provide an improved method and apparatus to allow easier upgrades to imaging functionality and to reduce development and maintenance costs of the software.
An advantage of the present invention is that it provides a method and apparatus that efficiently performs image operations.