1. Technical Field
This invention relates to rasterization, and more particularly to rasterization by parallel processors for use in printing and other batch rasterization applications.
2. Description of the Prior Art
The transition from typewriter-like printers (a fixed set of preformed characters that could only be positioned at specific places on a page) to all-points-addressable ("APA") printing devices, in which a page consists of millions of tiny dots (also known as "picture elements", or "pels" for short) has paved the way to computer printing of complex documents. The page-description languages that emerged to facilitate the process of specifying the content of complex pages require extensive processing in order to determine the darkness or color of each of the millions of pels that constitute the page's content. High-speed printers have advanced so rapidly as to outstrip the processing power available to determine page content, thus creating a processing bottleneck. The present invention addresses this bottleneck.
The printing process begins with an application program, such as a document formatter, which generates a specification of a page's content in a page-description language ("PDL"). Page content expressed in a PDL is called a "datastream". Examples of PDL commands which could be included in a datastream are:
`Draw a circle of radius three inches (8 cm) centered five inches (13 cm) below the page's top margin and four inches (10 cm) from the page's left margin, and fill the circle with the color magenta.`
`Write the words "Now is the time for all good" horizontally, beginning 3.75 inches (9.5 cm) below the page's top margin and 1 inch (2.5 cm) from the page's left margin, using the font "Times Roman" in a size of 12 printer points.`
Once the datastream has been generated, for it to be printed or displayed it must be converted into a collection of several million numbers, each number representing the color of one picture element (pel) on the page. This collection of numbers is called the "pagemap", and the conversion process is referred to as "rasterization".
The determination of which objects (or parts of objects) are visible on the final pagemap or display is usually made differently for two-dimensional ("2-D") objects than for three-dimensional ("3-D") ones. 3-D objects have the third dimension of depth, and their visibility is determined by depth information which is an integral part of an object's description. Since the objects contain the depth information, the order in which they are rasterized and applied to the pagemap is not important.
In the case of 2-D objects, the visibility of an object is typically determined by a combination of its location in the datastream and a "mixing mode". In the "overpaint" mixing mode, for example, a "later" object always hides an "earlier" one. More generally, in mixing modes the new value of a pel which is being colored by a new object is always a predetermined function of the old value and the pel color provided by the new object. For this to be done correctly, however, different objects that affect the same pel must do so in the same order as their order of appearance in the datastream. The use of mixing modes to determine visibility adds some characteristics of a third "depth" dimension to the rasterization of two-dimensional objects, and such rasterization is therefore often referred to as "2½-D rasterization".
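The order dependence of mixing modes can be illustrated with a minimal sketch. It is not taken from any particular rasterizer: the one-dimensional row of pels, the `(start, end, color)` object format, and the function names `overpaint`, `mix_or` and `apply_objects` are all assumptions made for illustration.

```python
def overpaint(old, new):
    """Overpaint mixing mode: the later object's color replaces the old value."""
    return new


def mix_or(old, new):
    """OR mixing mode: old and new pel values are merged bitwise."""
    return old | new


def apply_objects(width, objects, mix):
    """Apply hypothetical 1-D objects (start, end, color) to a row of pels.

    Objects are applied in the order given; `end` is exclusive.
    """
    row = [0] * width
    for start, end, color in objects:
        for x in range(start, end):
            row[x] = mix(row[x], color)
    return row


# Two overlapping objects, listed in datastream order.
objects = [(0, 4, 1), (2, 6, 2)]
swapped = list(reversed(objects))

overpaint_in_order = apply_objects(6, objects, overpaint)
overpaint_swapped = apply_objects(6, swapped, overpaint)
or_in_order = apply_objects(6, objects, mix_or)
or_swapped = apply_objects(6, swapped, mix_or)

print(overpaint_in_order)  # [1, 1, 2, 2, 2, 2]
print(overpaint_swapped)   # [1, 1, 1, 1, 2, 2]
print(or_in_order == or_swapped)  # True
```

With overpaint, swapping the two objects changes the pels in the overlap, whereas with OR the result is the same in either order.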
The adoption of all-points-addressable (APA) printing in conjunction with "advanced function" PDLs has brought about a dramatic increase in the amount of processing required for the rasterization of a page. A rasterizer may now be required to process fonts whose characters are described by curves representing their outlines. The characters must be scaled to arbitrary sizes and rotated at arbitrary angles. The rasterizer may also be required to scale (expand, contract or stretch) graphical objects, rotate them, and convert them to bitmaps, as well as to scale and rotate already-bitmapped images. Furthermore, masks may be used to limit ("clip") the scope of an object, and various mixing modes may be used to specify the result of placing overlapping objects. Finally, these PDLs also have the features of general-purpose programming languages, such as conditional branching, which require processing akin to that performed by general-purpose digital computers such as mainframes and personal computers. More recently, the introduction of color in printers has further increased the processing demands on rasterizers.
The simplest way to implement a rasterizer is with a single processor. However, as the required processing rate increases, the cost of the processor and of associated components such as memory increases dramatically, eventually making cost-effective high-speed printers an impossibility. Preparing pages for high-speed printers requires faster and more powerful processing than is available from single moderate-cost microprocessors. Since, beyond a certain speed, the cost of a processor grows faster than its processing speed (i.e., five slow processors are less expensive than a single one that is five times faster), high-speed single-processor rasterizers are not competitive.
To keep the cost of the rasterizer manageable, as well as for other reasons such as incremental growth capability, it is therefore highly desirable to have several inexpensive processors collaborate rather than use a single, very expensive processor.
Significant problems and challenges must be addressed when designing a multiprocessor rasterizer. The design should attain a significant degree of parallelism, in order to achieve a high overall processing rate. The design should minimize overhead and duplication of effort by the processors, since these increase the total amount of work done for a given page, and hence reduce the effective increase in performance. The design should minimize the sensitivity of processor utilization to page content. And the design should require less memory than a single-processor rasterizer of equal performance, in order to make it competitive.
Beyond these goals, two essential requirements must be met when the datastream is divided into segments that are processed by different processors. First, the datastream environment (cursor position, scale factor, current font, etc.) must be available at the beginning of each segment so that its processing can begin immediately and independently of other segments. Second, the impressions of the segments on the pagemap must not overlap, because if they do the order in which the segments are processed and their impressions are applied may affect the final appearance.
The prior art contains several approaches toward multi-processor rasterization. In "pipelined rasterizers" the rasterization is broken into stages, and those are "pipelined" so that a different processor works on each stage, passing its results to the processor working on the next stage along the pipeline. These rasterizers satisfy both the first and second requirements, since every processor sees the entire datastream in the correct order, and only one processor updates the pagemap. The pipelined design is also quite efficient in terms of memory requirements. However, the level of parallelism is limited by the number of possible stages to be pipelined, which is no more than 2 or 3. Furthermore, the relative processing load on the different pipelined processors is sensitive to the page content. As a result, pipelined rasterizers are usually no more than twice as fast as single-processor rasterizers (i.e., their speedup factor is usually smaller than 2).
In "page-parallel" rasterizers each processor works on a different page. With this design the first requirement is usually easy to satisfy, since sufficient information is usually available at the beginning of the page. The second requirement is always satisfied since pages (by definition) do not overlap, and a high degree of parallelism is achievable. However, very large amounts of memory are required and the rasterization time of a single page is not reduced. Also, processor utilization drops significantly whenever the page-rasterization time is highly variable (such as when simple and complex pages are mixed together), unless much more memory is used. Since pages must be printed in a specific order, when a page-parallel rasterizer is faced with a difficult page followed by several easy pages, the printing engine (the printer) along with all but one of the processors must lie idle, waiting for the processor working on the difficult page to complete it.
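The stall caused by a difficult page can be made concrete with a small scheduling sketch. The model is an assumption made purely for illustration: each processor holds one finished page until the printer starts printing it, pages are handed out in order to whichever processor frees up first, and the function name `page_parallel_schedule` and the sample times are hypothetical.

```python
def page_parallel_schedule(raster_times, n_procs, print_time):
    """Return the time at which each page starts printing.

    Assumed model: pages are assigned in order to the earliest-free
    processor, each processor buffers exactly one finished page, and the
    printer prints pages strictly in page order.
    """
    free_at = [0.0] * n_procs   # when each processor can start a new page
    printer_free = 0.0          # when the printer can accept the next page
    print_start = []
    for raster_time in raster_times:
        proc = min(range(n_procs), key=lambda i: free_at[i])
        done = free_at[proc] + raster_time
        start = max(done, printer_free)   # wait for all earlier pages
        printer_free = start + print_time
        free_at[proc] = start             # buffer is released once printing starts
        print_start.append(start)
    return print_start


# One difficult page followed by three easy ones, on two processors.
mixed = page_parallel_schedule([10, 1, 1, 1], n_procs=2, print_time=1)
# Four easy pages for comparison.
easy = page_parallel_schedule([1, 1, 1, 1], n_procs=2, print_time=1)

print(mixed)  # [10.0, 11.0, 12.0, 13.0]
print(easy)   # [1.0, 2.0, 3.0, 4.0]
```

In the mixed case the second processor finishes its easy page at time 1 but cannot release it until time 11, and the printer prints nothing before time 10.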
In rasterizers which use "functional parallelism", blocks of different types (e.g., image, text, graphics) are rasterized independently and the results are merged sequentially into the pagemap. This can only offer moderate parallelism due to the limited number of types of blocks. Further, the effective parallelism achieved in a given page is very sensitive to the relative processing loads for the different types and hence to the content of the page. And finally, the last part of the rasterization, in which the blocks are merged into the pagemap, is not parallelized.
Efforts to create rasterizers which exploit "geographic parallelism" have not succeeded without limiting the mixing mode to "OR", which causes superposed images to be merged together without regard for which would overlap the other. Geographical parallelism is based on the intuition that it makes sense to build different regions of a page in parallel. However, the perceived need to satisfy the second requirement at the outset of the rasterization process has prevented the realization of this idea for other mixing modes.
"3-D rasterizers with Z-buffers" are exemplified by the "Superdisplays" proposed by Pavicic, in which "object processors" prepare rasterized "objects" and send them to "image processors" (smart memory). Each rasterized object from a given location is fed to a designated image processor for that location. The object includes an intensity "I" and a "z" value (depth) for each pel. Whenever an image processor receives a new object having (x,y,z,I) values, it compares the z value with the one it is currently holding. If the new value of z is larger than the old one (i.e. if the new object is deeper), the image processor discards the new entry; otherwise it replaces the old value with the new one. It is possible to keep around several values to accommodate objects which only partly color pels. The Z-buffer design is aimed at displays rather than printers. In display rasterizers, typically, lists of independent 3-D objects undergo incremental changes, such as the addition or deletion of objects, and changes to old objects. Z-buffers are not very efficient for batch 2-D rasterizations of entire displays (pagemaps), since their image processors update pels individually, thus not taking advantage of the fact that with 2-D objects and an overpaint mixing mode (also known as order-implied depth), the last object covers all previous ones. Even if the mixing mode is not overpaint, the same operation is performed on all affected pels and incremental update is inefficient. Most APA printer rasterizers, by contrast, use batch rasterization, since successive pages are not mere updates of previously printed pages.
Also, Z-buffers rely heavily on the notion of overpainting of a deep object by a shallow one, and do not lend themselves to more complicated operations in which the resulting intensity and color are an arbitrary function of the past value and the new one. To support such operations, a Z-buffer would require a possibly infinite list of past (z,I) values for every pel, there being no concept of completing the rasterization for a given depth range. Finally, Z-buffer technology has no concept of finding state-independent objects or of resource preparation (discussed below).
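The depth-comparison rule described above for an image processor can be sketched as follows. This is a minimal illustration under stated assumptions, not Pavicic's design: the class name `ZBufferPel`, the `render` helper, and the use of a dictionary keyed by (x, y) are all hypothetical.

```python
import math


class ZBufferPel:
    """One pel of a hypothetical Z-buffer: keeps only the shallowest entry."""

    def __init__(self):
        self.z = math.inf     # nothing received yet: infinitely deep
        self.intensity = 0

    def receive(self, z, intensity):
        if z > self.z:
            return            # new entry is deeper: discard it
        self.z = z            # otherwise replace the old value
        self.intensity = intensity


def render(entries):
    """Feed (x, y, z, I) entries to per-pel Z-buffers and return intensities."""
    pels = {}
    for x, y, z, intensity in entries:
        pels.setdefault((x, y), ZBufferPel()).receive(z, intensity)
    return {xy: pel.intensity for xy, pel in pels.items()}


entries = [(0, 0, 5, 10), (0, 0, 2, 20), (1, 0, 7, 30)]
forward = render(entries)
backward = render(list(reversed(entries)))

print(forward)              # {(0, 0): 20, (1, 0): 30}
print(forward == backward)  # True
```

Because each entry carries its own depth, the result does not depend on arrival order (for distinct depths), but every pel is tested individually and only a single (z, I) pair is retained.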
The "Raster Processing Machine" (RPM) described by A. Ben-Dor and Brian Jones, Versatec, in "New Graphics Controller for Electrostatic Plotting", IEEE CG&A, pp. 16-25, January 1986, was developed for electrostatic plotters which do not have a full pagemap. Incoming graphical objects are converted sequentially (not in parallel) into an internal format, which produces a bottleneck limiting the rasterizer's performance. The converted objects are then sorted by geographic location into bins corresponding to the size of the raster buffers of the machine (bands of pelmap). Once this process is completed, the bins are processed, possibly in parallel, and the results are placed into the appropriate partial pelmaps. The partial pelmaps are then combined without regard for the sequencing of objects in the datastream. If two objects are superposed on the page, they are blended together and printed.
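The sorting of objects into band-sized bins can be sketched as below. This is only an illustration of geographic binning in general, not the RPM's internal format: the `(name, top, bottom)` object representation, the function name `bin_by_band`, and the sample dimensions are assumptions.

```python
def bin_by_band(objects, band_height, page_height):
    """Sort hypothetical objects into horizontal bands of the page.

    Each object is (name, top, bottom) with 0 <= top < bottom, where
    `bottom` is exclusive. An object that spans several bands is placed
    in the bin of every band it touches.
    """
    n_bands = -(-page_height // band_height)   # ceiling division
    bins = [[] for _ in range(n_bands)]
    for name, top, bottom in objects:
        first = top // band_height
        last = min((bottom - 1) // band_height, n_bands - 1)
        for band in range(first, last + 1):
            bins[band].append(name)
    return bins


objects = [("circle", 50, 250), ("text", 10, 30), ("image", 220, 300)]
bins = bin_by_band(objects, band_height=100, page_height=300)

print(bins)  # [['circle', 'text'], ['circle'], ['circle', 'image']]
```

Each bin can then be rasterized into its own partial pelmap independently of the others, which is what makes parallel processing of the bins possible.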