1. Field of the Invention
This invention relates to facsimile image compression and more particularly to a method for highly compressing imagery of textual information for purposes of expedient information storage and transmission in text processing systems.
2. Description of the Prior Art
Video imagery is probably the most ubiquitous form of information transfer in our society. Television transmissions and their related pictures constitute one form of video imagery. In office systems another form of video imagery has become prevalent whereby the picture of drawings or pages of text can be transmitted from point to point by densely scanning the document and resolving it into hundreds of video lines per inch. These lines are transmitted as patterns of off and on bits corresponding to the black and white picture elements (PELS) in the original document. At the other end of the transmission, by correspondingly creating black and white PELS from the bits, the original image can be reproduced at a remote location with practically perfect fidelity. The limiting factor on fidelity relates to the resolution with which the original document was scanned and reconstituted into a series of off and on bits related to black and white patterns. In the same manner that such imagery reconstituted into bit patterns can be transmitted, it can be similarly stored on magnetic memory media as an alternative to a paper depository. Although a document can be reconstituted with ink on paper using a bit pattern, the representation as a PEL by PEL bit pattern has certain drawbacks. For good resolution, an 8 inch by 11 inch document resolved at 200 PELS per inch requires roughly three and one-half million bits of storage. Hence, simple PEL by PEL document imagery capture becomes expensive in terms of transmission bandwidth and magnetic media storage requirements.
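The storage figure quoted above follows from simple arithmetic, which can be checked with a short sketch. The function name `raw_bits` is illustrative only, and the calculation assumes one bit per PEL for a black and white scan.

```python
# Illustrative arithmetic only: page size and resolution are the figures
# quoted in the text; one bit per PEL is assumed (black/white imagery).
WIDTH_IN = 8
HEIGHT_IN = 11
PELS_PER_INCH = 200


def raw_bits(width_in, height_in, pels_per_inch):
    """Bits needed for a PEL by PEL capture at one bit per PEL."""
    return (width_in * pels_per_inch) * (height_in * pels_per_inch)


bits = raw_bits(WIDTH_IN, HEIGHT_IN, PELS_PER_INCH)
print(bits)       # 3520000 bits, roughly three and one-half million
print(bits // 8)  # 440000 bytes
```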
A second method of maintaining document imagery while economizing on the number of bytes that normally would be required to store the document on a PEL by PEL basis involves utilizing run length encoding. This implies that before the image of the document is stored or transmitted, runs of like-bits are removed from the image and replaced by a number that indicates that a run of bits of a certain length was originally present at this location. When the imagery is reconstituted, the numerical count of like-bits is replaced by a string of bits that was originally present. This is a very effective method of removing "white space". Its efficiency begins to deteriorate as more complex images occur on the page and the average length of the runs of like-bits decreases. Depending on the method of run length encoding used, and the complexity of the image upon which the run length encoding is being attempted, a point of diminishing returns is reached whereupon attempting to use run length encoding actually requires more bits than the simple PEL by PEL binary representation of the document. Although all schemes of run length encoding involve replacing a run of like-bits by their number, there are several variations on the scheme to increase efficiency and allow for more dense documents to be considered before the point of diminishing returns occurs. In all cases, the efficiency of run length encoding is better for sparse imagery such as often found in graphics and diagrams.
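A minimal sketch of one-dimensional run length encoding on a single scan line may make the trade-off concrete. It is an illustration under stated assumptions, not any particular facsimile standard: the function names are hypothetical, each line is assumed to start with a (possibly empty) white run, and every run count is assumed to cost a fixed 8 bits, whereas practical schemes use variable-length codes.

```python
def rle_encode(line):
    """Return run lengths for a list of 0/1 PELs.

    The first run counts white (0) PELs and may be zero; runs then
    alternate between black and white.
    """
    runs = []
    current = 0
    count = 0
    for pel in line:
        if pel == current:
            count += 1
        else:
            runs.append(count)
            current = pel
            count = 1
    runs.append(count)
    return runs


def rle_decode(runs):
    """Rebuild the PEL list from alternating white/black run lengths."""
    line = []
    value = 0
    for n in runs:
        line.extend([value] * n)
        value = 1 - value
    return line


def encoded_bits(runs, bits_per_count=8):
    """Assumed cost model: a fixed number of bits per run count."""
    return len(runs) * bits_per_count


# Sparse line: mostly white space with one short black run.
sparse = [0] * 100 + [1] * 4 + [0] * 96
# Dense line: PELs alternate, so every run has length one.
dense = [0, 1] * 100

print(encoded_bits(rle_encode(sparse)), len(sparse))  # 24 200
print(encoded_bits(rle_encode(dense)), len(dense))    # 1600 200
```

Under this assumed cost model the sparse line shrinks from 200 bits to 24, while the dense line grows from 200 bits to 1600, which is the point of diminishing returns described above.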
Another approach to compression of facsimile imagery utilizes two-dimensional run length encoding. In this case a more complex but efficient form of run length encoding is used whereby the most typical run lengths are very effectively represented utilizing a "memory line". This facilitates very economical representation of the most common run lengths, which are highly correlated with "edge effects" evidenced by vertical repetitiveness of run lengths and the high frequency of recurring run lengths that are minor perturbations (plus or minus one or two PELS) from their vertical predecessor. The result is nominally a factor of two (2) improvement over what simple run length encoding yields. However, two-dimensional run length encoding also encounters diminishing returns for certain highly non-coherent images, such as poor quality reproductions of textual documents. In such cases the encoded document can require more bits than simple, straightforward PEL to bit capture. Also, relatively low compression rates are encountered with dense text documents.
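The "memory line" idea can be illustrated with a deliberately simplified sketch. This is not the coding used by any facsimile standard: the function names, the `("V", delta)` and `("P", position)` code shapes, and the pairing of transitions by index are all assumptions made for illustration. Each black/white transition in the current line is coded as a small vertical offset from the matching transition in the previous line when it lies within two PELS of it, and as an explicit position otherwise.

```python
def transitions(line):
    """Positions where the PEL value changes, assuming white before the line."""
    out = []
    prev = 0
    for i, pel in enumerate(line):
        if pel != prev:
            out.append(i)
            prev = pel
    return out


def encode_against_memory(current, memory, max_delta=2):
    """Code each transition relative to the memory line where possible."""
    cur = transitions(current)
    ref = transitions(memory)
    codes = []
    for k, pos in enumerate(cur):
        if k < len(ref) and abs(pos - ref[k]) <= max_delta:
            codes.append(("V", pos - ref[k]))  # small vertical offset
        else:
            codes.append(("P", pos))           # fall back to explicit position
    return codes


def decode_against_memory(codes, memory, width):
    """Rebuild the current line from its codes and the memory line."""
    ref = transitions(memory)
    marks = {ref[k] + v if kind == "V" else v
             for k, (kind, v) in enumerate(codes)}
    line = []
    value = 0
    for i in range(width):
        if i in marks:
            value = 1 - value  # toggle colour at each transition
        line.append(value)
    return line


memory = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
current = [0, 0, 1, 1, 1, 1, 1, 1, 0, 0]
codes = encode_against_memory(current, memory)
print(codes)  # [('V', -1), ('V', 1)]
```

When successive lines differ only by small edge shifts, every transition falls into the small-offset case, which is what allows the most common run lengths to be represented with very few bits.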
A third method of image compression utilizes so-called "lossy" algorithms which do not necessarily resolve an image down to its PEL level but rather capture the graphic substance of a document and transmit those shapes, maintaining understandability or cognitive substance but not necessarily absolute fidelity. For textual purposes, the communicated document conveys the same information as the original but may not look absolutely the same as its paper predecessor. An example of this would be the transmission or storage of a document that has been OCR scanned and interpreted. The electronic representation of this document in theory contains the same character information. However, at the PEL level, there may be some discrepancy between the respective images since the cognition has been at a higher level than PEL by PEL. Short of Optical Character Recognition (OCR), there are a number of complex symbol compression techniques which compress the document by finding repetitive shapes, cataloging them, assigning each an ID number, and then creating the electronic copy of the document as a combination of the original complex images and their ID numbers wherever they are repeated within the document. For example, in the case of a textual document, a set of symbols may be resolved that corresponds in some order to the alphabetic characters A through Z; these become the complex symbols against which the repetitive images in the document are resolved and which are denoted by IDs. Those character images that do not match against previously encountered complex symbols are added to the repertoire of templates, assigned their respective IDs, and become candidates in the match process to resolve successive images encountered in the document. For textual documents, such an approach offers a high compression rate in comparison to run length encoding based techniques.
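The complex symbol matching scheme described above can be sketched as follows. This is a minimal illustration, not a description of any specific prior art system: the function names, the bitmap representation (tuples of "0"/"1" row strings), and the mismatch threshold are assumptions. The sketch also assumes the character images have already been segmented from the page, which is the difficult step in practice.

```python
def pel_distance(a, b):
    """Count differing PELs between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def same_shape(a, b):
    """True when both bitmaps have identical row counts and row widths."""
    return [len(r) for r in a] == [len(r) for r in b]


def compress_symbols(glyphs, max_mismatch=1):
    """Resolve each glyph against a growing repertoire of templates.

    Returns the template list and, for each glyph, the ID of the
    template that stands in for it.
    """
    templates = []
    ids = []
    for g in glyphs:
        for tid, t in enumerate(templates):
            if same_shape(t, g) and pel_distance(t, g) <= max_mismatch:
                ids.append(tid)  # close enough: reuse the existing ID
                break
        else:
            templates.append(g)  # no match: add a new template
            ids.append(len(templates) - 1)
    return templates, ids


def reconstruct(templates, ids):
    """Rebuild the glyph sequence from templates and IDs (lossy)."""
    return [templates[i] for i in ids]


A = ("010", "101", "111", "101")
A_NOISY = ("010", "101", "111", "100")  # one PEL differs from A
B = ("110", "101", "110", "111")

templates, ids = compress_symbols([A, B, A_NOISY, A])
print(len(templates), ids)  # 2 [0, 1, 0, 0]
```

The reconstruction substitutes the stored template for every matched image, so the noisy glyph comes back as the clean template. The result conveys the same characters but is not identical to the original at the PEL level, which is the sense in which the approach is lossy.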
However, the performance of such techniques is predicated on being able to resolve images at the character level, which implies that the system is sensitive or intelligent enough to be able to segment characters from within a word subfield. It has been shown that the ability to reliably delineate character shapes from a word subfield is one of the key weaknesses in reliable optical character recognition. Hence, although the lossy complex symbol match approach in theory would work quite well, the inability to consistently delineate character shapes within a word implicitly limits the performance, reliability and utility of such algorithms in addressing compaction of textual documents. For general graphics, a high repetitiveness of shapes is normally not present and hence the underlying requisite for the complex symbol match facsimile compaction of text is missing.
The present state of the art in facsimile document compression, especially with respect to images of textual documents, requires enormous storage or bandwidth for archiving or transmission, respectively. The ability to compress a broad range of imagery of textual origin with varying print quality and sharpness and maintain a compression rate similar to that of coded information is an area that has not been addressed in the prior art.