1. Field of the Invention
The present invention relates generally to methods for compressing digital data and, more particularly, to methods for compressing digital data representing a document image into a code bit stream.
2. Background Information
Data compression techniques are widely used in facsimile data transmission systems and optical disc storage systems. In general, data compression techniques are used when documents have been converted into images and the images have been digitized for either storage or transmission. The document image is then represented by a bit stream.
Data compression techniques have been studied extensively and are widely used today because image data obtained from both documents and illustrations is highly redundant.
One well-known method of compressing image data is run length encoding. This process involves optically scanning an image, digitizing the resulting analog output waveform, and expressing the digitized image data in terms of the length of each sequence of like bits. Each bit represents a pixel of the image, and its logic level represents either black or white. The length of each sequence of all white or all black pixels is the run length.
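The run length representation described above can be illustrated with a minimal sketch. The function names `run_lengths` and `expand_runs` are hypothetical, and the convention that 0 denotes white and 1 denotes black is an assumption made only for this example.

```python
from itertools import groupby


def run_lengths(bits):
    """Return (pixel_value, run_length) pairs for a sequence of 0/1 pixels.

    Assumption for this sketch: 0 = white, 1 = black.
    """
    return [(value, sum(1 for _ in group)) for value, group in groupby(bits)]


def expand_runs(runs):
    """Rebuild the original pixel sequence from (pixel_value, run_length) pairs."""
    pixels = []
    for value, length in runs:
        pixels.extend([value] * length)
    return pixels


scan_line = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0]
runs = run_lengths(scan_line)
# runs == [(0, 4), (1, 2), (0, 3), (1, 4), (0, 1)]
```

Because white and black runs alternate along a scan line, a practical encoder would typically store only the lengths and the color of the first run; the pairs are kept here for readability.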
Of the various run length encoding methods, the modified Huffman encoding method is the best known. In this method, the code word having the shortest bit length is assigned to the statistically most frequent run length. More sophisticated run length forecasting methods are also available. In these systems, the run length is obtained and then converted by using a table of codes so that the statistically most frequent run length uses the smallest identifier.
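The principle of giving the shortest code to the most frequent run length can be sketched as follows. This is a generic Huffman construction built from observed frequencies, not the fixed code tables used by the modified Huffman scheme in facsimile standards; the function name `build_code_table` and the sample data are hypothetical.

```python
import heapq
from collections import Counter


def build_code_table(lengths):
    """Map each distinct run length to a prefix-free bit string.

    More frequent run lengths receive shorter bit strings.
    """
    freq = Counter(lengths)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (total count, tie-breaker, {run_length: partial code}).
    # The unique tie-breaker keeps Python from ever comparing the dicts.
    heap = [(count, i, {symbol: ""}) for i, (symbol, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        count_a, _, table_a = heapq.heappop(heap)
        count_b, _, table_b = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in table_a.items()}
        merged.update({s: "1" + code for s, code in table_b.items()})
        heapq.heappush(heap, (count_a + count_b, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical sample: run length 2 occurs most often, so it gets the shortest code.
table = build_code_table([2, 2, 2, 2, 2, 3, 3, 7])
encoded = "".join(table[n] for n in [2, 3, 2, 7])
```

The resulting bit stream must be decoded one bit at a time, since the boundary of each code word is known only after the preceding code word has been recognized.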
The above-named methods do not lend themselves readily to computer processing because the resulting bit streams must be processed serially, which is inherently slow.
Multistage block encoding is a method suitable for computer processing. This technique is described in an article entitled "New Method for Compressing Data for Facsimile Signals," by Iinuma, Usubuchi, and Ishiguro, Data for Research for Transmissions Methods, the Japanese Transmissions Society, CS73-36, 1973-07. In this method, the data are divided into a series of data blocks of equal length. A block that is all white is encoded as a logic "0." Every other block is encoded as a logic "1." The bits inside each block assigned a logic "1" are then divided in the same manner. The division process continues until a unit block made up of a given number of bits is reached. Thereafter, codes are assigned to the blocks and the unit blocks are transmitted.
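The division process described above can be sketched recursively. This is an illustration under stated assumptions and not the procedure of the cited article: 0 denotes white, each block is split into `fanout` equal sub-blocks, the line length equals `unit_size` multiplied by a power of `fanout`, a non-white unit block is sent as a "1" followed by its raw bits, and the code bits are emitted depth first. All names and default parameter values are hypothetical.

```python
def encode_blocks(bits, unit_size=4, fanout=2):
    """Encode a list of 0/1 pixels into a list of code bits.

    Assumes 0 = white and len(bits) == unit_size * fanout**k.
    """
    if not any(bits):
        return [0]                       # all-white block: a single "0"
    if len(bits) <= unit_size:
        return [1] + list(bits)          # non-white unit block: "1" + raw bits
    step = len(bits) // fanout
    code = [1]                           # non-white block: "1", then its sub-blocks
    for start in range(0, len(bits), step):
        code.extend(encode_blocks(bits[start:start + step], unit_size, fanout))
    return code


def decode_blocks(code, length, unit_size=4, fanout=2):
    """Inverse of encode_blocks for a line of the given pixel length."""
    def walk(pos, size):
        if code[pos] == 0:
            return [0] * size, pos + 1
        pos += 1
        if size <= unit_size:
            return list(code[pos:pos + size]), pos + size
        step = size // fanout
        pixels = []
        for _ in range(0, size, step):
            part, pos = walk(pos, step)
            pixels.extend(part)
        return pixels, pos

    pixels, _ = walk(0, length)
    return pixels


# A 16-pixel line with a single black pixel is reduced to 9 code bits.
line = [0] * 8 + [0, 0, 1, 0] + [0] * 4
code = encode_blocks(line)
# code == [1, 0, 1, 1, 0, 0, 1, 0, 0]
```

In this sketch a fully white line collapses to a single bit, while a line with black pixels in every unit block grows slightly because of the added flag bits.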