1. Technical Field
The invention disclosed broadly relates to data processing systems and more particularly relates to improvements in the storage, manipulation, and retrieval of digitized images.
2. Background Art
A digitized image requires a large amount of storage. For example, a simple binary coded 8 1/2 × 11 inch typewritten page requires 3.74 million bits of storage when scanned at 200 lines per inch in both dimensions. In the usual terminology of computer storage, this is 467.5 kilobytes. This compares with approximately 3 kilobytes of character data required to store a typical typewritten page of text.
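The storage figures above can be checked with a short calculation. The following is only an illustrative sketch: the function name `page_storage_bits` is hypothetical, and it assumes one bit per picture element and a kilobyte of 1000 bytes, which is the convention that reproduces the 467.5 figure.

```python
def page_storage_bits(width_in: float, height_in: float, lines_per_inch: int) -> int:
    """Bits needed for a binary (one bit per picture element) scan of a page.

    Hypothetical helper for illustration; not part of the disclosed system.
    """
    columns = round(width_in * lines_per_inch)   # picture elements per scan line
    rows = round(height_in * lines_per_inch)     # scan lines per page
    return columns * rows


bits = page_storage_bits(8.5, 11.0, 200)   # 1700 x 2200 picture elements
kilobytes = bits / 8 / 1000                # assumes 1 kilobyte = 1000 bytes
ratio_to_text = kilobytes / 3              # versus roughly 3 kilobytes of character data

print(bits, kilobytes, round(ratio_to_text))
```

Under these assumptions the scanned page occupies roughly 156 times the storage of the same page held as character data.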
To reduce the storage and transmission requirements in holding and moving these pictures through a system, encoding is usually applied. Coders in which the data is reduced while allowing a perfect reproduction of the information are referred to as information preserving encoders.
Compression in the spatial domain is usually a run length type of encoding, which maps the sequence of picture elements a(1), a(2), a(3), . . . into a sequence of pairs (c(i), l(i)), where c denotes the color (black or white for a binary image) and l denotes the length of the run. For this method to be useful, more than one bit must be reserved to store l. Suppose, for example, that two bits are reserved and that standard binary counting is used to describe run lengths, so that the codes 00 through 11 describe runs of one to four picture elements. Each run then requires three bits of storage: one for the color and two for the length. For example, the sequence 1111000110011001, which requires 16 bits, would be represented by the pairs (1,11), (0,10), (1,01), (0,01), (1,01), (0,01), (1,00), or 111010101001101001100, which requires 21 bits. As the example shows, if there are a large number of transitions, the approach does not compress the data but rather expands it. In well-behaved sequences, by contrast, the encoding reduces the data, e.g., 1111111111110000 reduces to (1,11), (1,11), (1,11), (0,11), or 111111111011, which requires 12 bits.
The problem is acute in valuable documents, e.g., checks, stock certificates, and negotiable instruments of all types. These documents are designed to be difficult to copy or reproduce, for the obvious reason of decreasing the probability of fraud.
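The run length example above can be sketched in a few lines. This is a minimal illustration rather than the encoder of any particular system: the function names `rle_encode` and `rle_decode` are hypothetical, picture elements are represented as a string of '0' and '1' characters, and the length field is assumed to hold the run length minus one, which is the convention that matches the pairs shown in the example.

```python
def rle_encode(bits: str, length_bits: int = 2) -> str:
    """Encode a '0'/'1' string as fixed-width (color, run length - 1) codes."""
    max_run = 1 << length_bits          # two length bits cover runs of 1 to 4
    out = []
    i = 0
    while i < len(bits):
        color = bits[i]
        run = 1
        # Extend the run while the color repeats and the length field has room.
        while i + run < len(bits) and bits[i + run] == color and run < max_run:
            run += 1
        out.append(color + format(run - 1, f"0{length_bits}b"))
        i += run
    return "".join(out)


def rle_decode(code: str, length_bits: int = 2) -> str:
    """Invert rle_encode: expand each (color, run length - 1) code."""
    step = 1 + length_bits
    out = []
    for k in range(0, len(code), step):
        color = code[k]
        run = int(code[k + 1:k + step], 2) + 1
        out.append(color * run)
    return "".join(out)


busy = "1111000110011001"     # many transitions: 16 bits in, 21 bits out
smooth = "1111111111110000"   # few transitions: 16 bits in, 12 bits out
for sequence in (busy, smooth):
    code = rle_encode(sequence)
    print(sequence, "->", code, f"({len(sequence)} -> {len(code)} bits)")
```

Because every run costs a fixed three bits in this sketch, the output grows with the number of transitions rather than with the number of picture elements, which is why densely patterned documents expand instead of compress.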
The ability to decrease the storage and data transmission requirements for images of complex documents is the subject of this patent application. While negotiable instruments are the examples used, the concept is not limited to these types of documents.
In some applications it is not necessary to have information preserving encoders. In these applications it is only necessary to have a "good enough" picture or image to obtain the relevant information. In these situations, the source documents may be kept in a remote low cost storage facility in case legal issues arise, or the document may be required as a reference for some period of time but not be critical enough to require perfect reproduction. A personal check for a small amount is an example. The bank may want a readable image in its archives for two or three billing cycles, but safekeeping is the responsibility of the person who wrote the check, and reference to a readable picture in the archive is sufficient to resolve any minor balancing concerns.
This invention addresses processing of these documents when preserving all the information is not necessary. The invention addresses large volume efforts where it is desirable to process documents without operator intervention and it is necessary to adjust the processing automatically as a function of a key characteristic of the document or control card.
Various characteristics of the document can be used, including, but not limited to: color, mark sense characters, MICR, OCR, bar code and text recognition.