CCITT (the International Telegraph and Telephone Consultative Committee) defines a raster image stream as a series of YH raster lines, each containing XW picture elements called pels. The stream begins at the upper left corner of the image and is scanned left to right, top to bottom. Each raster line is represented as a series of alternating run lengths corresponding to consecutive white and black pels.
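By way of illustration only, the run length representation of a single raster line can be sketched as follows. The function names `line_to_runs` and `runs_to_line` are hypothetical, and the sketch assumes the facsimile convention that every line begins with a white run, which has length zero when the first pel is black.

```python
from typing import List

WHITE, BLACK = 0, 1

def line_to_runs(pels: List[int]) -> List[int]:
    """Convert one raster line (0 = white, 1 = black) into alternating
    run lengths. The first run is always white and may have length 0."""
    runs = []
    color = WHITE   # assumed convention: a line starts with a white run
    count = 0
    for p in pels:
        if p == color:
            count += 1
        else:
            runs.append(count)  # close the current run
            color = p
            count = 1
    runs.append(count)          # close the final run
    return runs

def runs_to_line(runs: List[int]) -> List[int]:
    """Inverse operation: expand alternating run lengths back into pels."""
    pels = []
    color = WHITE
    for r in runs:
        pels.extend([color] * r)
        color ^= 1              # alternate white/black
    return pels

line = [0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
runs = line_to_runs(line)       # three white, two black, four white, one black
```

The variable length code assignment itself is not shown; only the alternating run structure described above is illustrated.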
As a standard for facsimile transmissions, CCITT specifies what are known as variable run length coding formats, into which such input streams should be encoded. More accurately, the CCITT Group 3 and Group 4 facsimile specifications define two coding schemes. One CCITT coding scheme is a "one-dimensional" method which converts input pel run lengths into variable length codes statistically chosen so that the most common run lengths are represented with the fewest bits. This is known as the Modified Huffman scheme. The basis for this approach is that documents, preferably defined as black printed on a white background, contain short black pel runs and many long white pel runs. The CCITT also defines a "two-dimensional" encoding method, in which the spacing of pixel runs on the previous scan line is used as a reference to encode the current scan line. This method attempts to benefit from the fact that most printed characters have significant vertical correlation between adjacent scan lines. The CCITT coding schemes are described in the CCITT Fascicle VII.3, Recommendations T.4 and T.6, pages 21-57 (1989), herein incorporated by reference.
Further, the CCITT encoding methods, and the Group 4 facsimile specifications in particular, have been widely applied to document image processing applications in which mostly black on white documents are optically scanned, encoded, stored in optical or magnetic media, retrieved, decoded, and then output for printing or display purposes. For example, the modern facsimile machine essentially performs most of these steps. The special properties of the CCITT bi-level image encoding method often result in data compression ratios exceeding 15:1, so that A4 sized documents scanned at 200 dots per inch (i.e., 1784 bits per line by 2200 lines or 490 kB) often require less than 30 kB once encoded. This enables modern facsimile machines to transmit most pages in under 60 seconds.
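The storage figures quoted above can be checked with a short calculation. This is a sketch only; it assumes one bit per pel, 1 kB taken as 1000 bytes, and an encoded size of exactly 30 kB as an illustrative value.

```python
# Dimensions quoted in the text for an A4 page scanned at 200 dots per inch.
WIDTH_PELS = 1784
HEIGHT_LINES = 2200

# Bi-level image: one bit per pel, eight pels per byte.
raw_bytes = WIDTH_PELS * HEIGHT_LINES // 8

# Illustrative encoded size (assumption: exactly 30 kB, 1 kB = 1000 bytes).
encoded_bytes = 30_000

compression_ratio = raw_bytes / encoded_bytes

print(f"raw size: {raw_bytes / 1000:.0f} kB")
print(f"compression ratio: {compression_ratio:.1f}:1")
```

Under these assumptions the raw page occupies 490,600 bytes, and an encoded size of 30 kB corresponds to a ratio somewhat above 16:1, consistent with the "exceeding 15:1" figure.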
In addition to the basic CCITT encoding and decoding functions, document image processing applications often require an image to be clipped, scaled, enhanced, flipped orthogonally, and adjusted to correct for small skew angles introduced during the scanning process. Skew correction improves the quality of displayed images, since such images lose resolution going from a 200 dpi (dots per inch) scan to a typical 100 dpi screen. Skew correction may also improve the quality and speed of optical character recognition software that is often used to attempt to "read" scanned document fields.
Current state of the art VLSI CCITT Compression Expansion Processor (CEP) circuits such as the OAK OT95C71 have been designed primarily to support facsimile applications. These circuits follow the above described CCITT coding schemes. One disadvantage of such circuits is that they do not provide the clipping, scaling or rotation (CSR) functionality required by document image processing applications. To date, the CSR function has been performed separately in a processing unit that is forced to operate upon the fully decoded CEP physical output raster stream, even though much of this data stream will eventually be eliminated from the clipped and scaled output data stream.
For instance, graphical user interfaces (GUI) such as Windows 3.0, Macintosh, and X Windows all represent a desk top as consisting of a stack of images (pages or "windows") randomly overlapping each other. Usually there is one top or active "window" at a time. If the dimensions or position of any window are modified by moving or resizing operations, rectangular, not necessarily adjacent, pieces of one or more underlying windows can become uncovered, requiring a redraw operation known as "repainting". Repainting attempts to minimize system bus bandwidth by restoring only uncovered pixels using pixel block transfer (PIXBLT) or regeneration techniques.
For low resolution screens, the number of off screen bytes needed to save or reload each individual image is moderate and the PIXBLT technique is widely used during repaint. But high resolution screens (greater than 1200 pixels or rows) require a larger amount of off screen memory and a potentially large amount of bus bandwidth in order to use the PIXBLT technique. A 1664 by 1200 screen is roughly 6.5 times the size of a 640 by 480 VGA screen, and a 1784 by 2400 screen is nearly 14 times this size. These screens represent typical document image processing resolutions needed to achieve full screen real size images of A4 sized documents.
For this reason it is desirable to repaint uncovered pixels using the regeneration technique. In that case, the encoded image is already stored in memory, so no additional memory is required for "off screen" copies. There is usually less overall memory bandwidth required to transfer subject image data. For example, to repaint 50% of a 1664 by 1200 image requires 832 by 1200 or 125 kB of temporary storage versus 30 kB for the compressed image. This extra memory is eliminated, and so is the extra bus bandwidth requirement to move the additional 95 kB. These numbers become more favorable as the display resolution increases or the repaint area increases. At VGA screen levels, the entire 640 by 480 screen occupies 38 kB and the off screen method works just fine.
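The screen size and repaint figures given above follow from simple arithmetic, sketched here. The helper name `bilevel_bytes` is hypothetical, and the sketch assumes one bit per pixel, 1 kB taken as 1000 bytes, and the 30 kB compressed size quoted in the text.

```python
def bilevel_bytes(width: int, height: int) -> int:
    """Bytes needed for an uncompressed image at one bit per pixel."""
    return width * height // 8

VGA_PIXELS = 640 * 480

# Screen sizes relative to a 640 by 480 VGA screen.
ratio_1664 = (1664 * 1200) / VGA_PIXELS
ratio_1784 = (1784 * 2400) / VGA_PIXELS

# Repainting 50% of a 1664 by 1200 image from an off screen copy.
off_screen = bilevel_bytes(832, 1200)
compressed = 30_000                  # compressed image size quoted in the text
saved = off_screen - compressed      # memory and transfer avoided by regeneration

# Entire VGA screen, for comparison.
vga_full = bilevel_bytes(640, 480)

print(ratio_1664, ratio_1784)        # relative screen sizes
print(off_screen, saved, vga_full)   # sizes in bytes
```

With these inputs the off screen copy is 124,800 bytes, the difference is 94,800 bytes, and the full VGA screen is 38,400 bytes, matching the rounded 125 kB, 95 kB and 38 kB figures.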
However, the uncovered pixels in a repaint operation are not necessarily limited to one region of the screen, and while each region can in itself be constructed of overlapping rectangular components, each region is not necessarily a rectangle. Accordingly, there is a need to represent multiple rectangular repaint regions and to decode a file representing the original input 1:1 image pixel screen into multiple scaled output streams, each with its own clipping and window parameters. In this context, "clipping" refers to the process of discarding input stream pixels, and "windowing" refers to the process of selecting input stream pixels. Both "clipping" and "windowing" are mutually exclusive versions of the same basic selection process. For purposes of the following description, these terms are used interchangeably.
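As a minimal sketch of the selection process just described, a clipping window can be applied directly to the run lengths of one raster line, without first expanding the line into individual pixels. The function name `clip_runs` and the half-open window convention `[x0, x1)` are assumptions made for illustration, as is the convention that each line begins with a white run that may have length zero.

```python
from typing import List

def clip_runs(runs: List[int], x0: int, x1: int) -> List[int]:
    """Select the pixels x0 <= x < x1 from a line given as alternating
    white/black run lengths, returning run lengths for the window.
    The output also starts with a white run (possibly of length 0)."""
    out = [0]        # out[0] is the leading white run of the window
    out_color = 0    # color of the run currently held in out[-1]
    pos = 0          # absolute position of the start of the current input run
    color = 0        # color of the current input run (0 = white, 1 = black)
    for r in runs:
        lo = max(pos, x0)
        hi = min(pos + r, x1)
        if hi > lo:                      # this run overlaps the window
            if color == out_color:
                out[-1] += hi - lo       # extend the current output run
            else:
                out.append(hi - lo)      # start a new output run
                out_color = color
        pos += r
        color ^= 1
    return out

# Line of ten pixels: 3 white, 2 black, 4 white, 1 black.
window = clip_runs([3, 2, 4, 1], 4, 8)   # pixels 4..7: one black, three white
```

Several repaint rectangles on the same line can be served by calling the same routine once per window, each with its own `x0` and `x1`.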
Within each repaint rectangle, a completely predictable pixel pattern must be generated so that resulting images can be pasted into repainted regions with pixel perfect alignment. This requires a scaling algorithm capable of precise repeatability. Also, subword sized pixel streams appearing at each paint rectangle's right and left borders need to be merged with on-screen pixels, which is a problem for the DMA Bus Controller.
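The border merging problem can be illustrated with a masked read-modify-write of a single memory word. This is a sketch under stated assumptions, not a description of any particular DMA controller: the 16-bit word width, the convention that bit 0 is the leftmost pixel (most significant bit), and the names `edge_mask` and `merge_word` are all hypothetical.

```python
WORD_BITS = 16  # assumed word width

def edge_mask(first_bit: int, last_bit: int, word_bits: int = WORD_BITS) -> int:
    """Mask with ones at pixel positions first_bit..last_bit inclusive,
    where position 0 is the leftmost pixel (most significant bit)."""
    width = last_bit - first_bit + 1
    return ((1 << width) - 1) << (word_bits - 1 - last_bit)

def merge_word(screen_word: int, new_word: int, mask: int) -> int:
    """Replace only the masked pixels of screen_word with those of new_word."""
    return (screen_word & ~mask) | (new_word & mask)

# Left border: the repaint rectangle starts at pixel 4 of this word, so the
# first four on-screen pixels must be preserved.
left_mask = edge_mask(4, 15)
merged = merge_word(0xAAAA, 0xFFFF, left_mask)
```

Words that lie wholly inside the rectangle need no mask and can be written directly; only the first and last word of each line segment require the merge.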
Scale factors are represented herein as a ratio of two integers, denoted as N/M. This representation produces precisely N output pixels for every M input pixels. The problem which exists relates to how to handle randomly sized input pixel runs, or runs that do not start or end on M unit boundaries. Randomly selected clipping windows have this latter boundary matching problem. Note also that the X and Y dimension scale factors are not necessarily identical, although it may be assumed that they will not differ by more than some reasonable amount, typically 4:3 or 2:1.
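One way to obtain repeatable N/M scaling of run lengths is to map every run boundary through the same integer function of its absolute position in the input image, so that the result does not depend on where a clipping window happens to start. The following is a minimal sketch of that idea only, not necessarily the approach taken herein; the names `scale_boundary` and `scale_runs`, the `origin` parameter, and the choice of floor rounding are assumptions.

```python
from typing import List

def scale_boundary(x: int, n: int, m: int) -> int:
    """Map an absolute input position to an output position at scale n/m."""
    return (x * n) // m

def scale_runs(runs: List[int], n: int, m: int, origin: int = 0) -> List[int]:
    """Scale alternating run lengths by n/m. 'origin' is the absolute input
    position of the first run, so clipped segments scale consistently."""
    out = []
    pos = origin
    prev = scale_boundary(pos, n, m)
    for r in runs:
        pos += r
        cur = scale_boundary(pos, n, m)
        out.append(cur - prev)   # may be 0 if the run vanishes when reduced
        prev = cur
    return out

full = scale_runs([3, 2, 4, 1], 1, 2)            # whole 10-pixel line at 1/2
part = scale_runs([0, 2, 4, 1], 1, 2, origin=3)  # same line clipped at x = 3
```

Because boundaries are computed from absolute positions, the clipped segment reproduces exactly the corresponding portion of the full scaled line, even though x = 3 is not on an M unit boundary. Zero length output runs indicate input runs that were lost in reduction.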
The CCITT scheme is completely lossless, so that every pixel appearing in the original image is represented in the decoded 1:1 image stream. The scaling process is lossy, however, and can result in the loss of entire black or white pixel runs during image reduction. Various image enhancement schemes will be needed to minimize this loss effect. Similarly, edge definition becomes exaggerated and jagged when an image is enlarged, and this effect also needs to be minimized.
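The loss of a short run during reduction can be seen in a small example. The sketch below compares plain point sampling with a simple black-preserving reduction that ORs each group of input pixels; the latter is shown only as one illustrative enhancement, and both function names are hypothetical.

```python
from typing import List

def reduce_by_sampling(pels: List[int], m: int) -> List[int]:
    """Reduce by 1/m, keeping only the first pixel of every group of m."""
    return pels[::m]

def reduce_by_or(pels: List[int], m: int) -> List[int]:
    """Reduce by 1/m; an output pixel is black (1) if any pixel in its
    group of m input pixels is black."""
    return [1 if any(pels[i:i + m]) else 0 for i in range(0, len(pels), m)]

# Twelve pixels with a single black pixel at position 5 (e.g. a thin line).
line = [0] * 5 + [1] + [0] * 6

sampled = reduce_by_sampling(line, 4)  # the black run disappears
ored = reduce_by_or(line, 4)           # the black run survives as one pixel
```

Preserving black in this way thickens fine detail, so it trades one artifact for another; it is not a complete answer to the loss effect described above.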