The present invention is directed to a data compression technique and, in particular, to the compression of data corresponding to gray scale images generated with, for example, a facsimile machine.
Although the data compression technique of the present invention is useful in a variety of applications, it will be discussed herein with particular reference to the environment of facsimile machines which generate images by scanning original images, transmitting and/or receiving image-related data over telephone lines, and outputting reproduced images.
Conventional facsimile machines use one of three compression schemes called "Modified Huffman" (MH), "Modified Read" (MR), and "Modified Modified Read" (MMR). Each scheme converts a run length (i.e. an uninterrupted sequence of pixels of a single color, either black or white) into a pre-assigned code word. The MH scheme utilizes 92 such code words for each of the black and white pixels, the first 10 of which are shown in Table 1 by way of example.
TABLE 1
______________________________________
Run Length    White Code Word    Black Code Word
______________________________________
 1            000111             010
 2            0111               11
 3            1000               10
 4            1011               011
 5            1100               0011
 6            1110               0010
 7            1111               00011
 8            10011              000101
 9            10100              000100
10            00111              0000100
______________________________________
The one-dimensional, horizontal compression carried out by MH coding is improved upon with the MR and MMR schemes which are two-dimensional by virtue of performing vertical compression. Briefly stated, MR correlates the current line of pixels being coded with the line just above it. If, for example, a run length of "0"s in the current line coincides with a run length of "0"s in the previous line, only a single bit will suffice to encode the entire run length.
Regarding the MR encoding scheme, when the fax signal is distorted by a noise pulse (e.g. due to lightning or a switching transient), errors are made in the pattern of black and white dots in the received copy. To prevent this incorrect pattern from propagating down the page, an MH coded line is sent periodically. One approach, known as MR-2, compresses every other line using MR encoding (i.e. 1 MH line and 1 MR line). Another approach, known as MR-4, uses MH encoding every fourth line (i.e. 1 MH line and 3 MR lines). In the approach known as MMR, MH encoding is used only for the first line of a page, and all the remaining lines are compressed with MR encoding. In the discussion below, "MR" refers to the MR-2 approach.
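The line scheduling just described can be sketched as follows. This is a minimal illustration only: the function name `line_modes` and the parameter `k` are hypothetical, and the sketch follows the description given here (a periodic MH line for MR-2 and MR-4, and an MH line only at the top of the page for MMR) rather than any particular implementation.

```python
def line_modes(num_lines, k):
    """Return the coding mode ('MH' or 'MR') for each scan line of a page.

    k is a hypothetical refresh interval: k=2 models MR-2, k=4 models MR-4,
    and k=None models MMR, where only the first line is MH coded.
    """
    modes = []
    for i in range(num_lines):
        # The first line has no reference line above it, so it is always MH.
        if i == 0 or (k is not None and i % k == 0):
            modes.append("MH")
        else:
            modes.append("MR")
    return modes


if __name__ == "__main__":
    print("MR-2:", line_modes(4, 2))
    print("MR-4:", line_modes(8, 4))
    print("MMR: ", line_modes(5, None))
```

A smaller `k` limits how far a transmission error can propagate down the page, at the cost of sending more of the less efficient MH lines.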
Details on MH and MR can be found in McConnell, Bodson, and Schaphorst, "FAX: Digital Facsimile Technology & Applications" Artech House, Inc., 1989, Chapter 2.
The efficiency of the three schemes is usually measured on standard images, such as "CCITT chart #1", "CCITT chart #2", and "CCITT chart #3". The page storage sizes in "Standard Resolution" (i.e. 1728 × 1100 pixels), before and after compression, are shown in Table 2.
TABLE 2
______________________________________
Page        Uncompressed (bytes)    MH (bytes)    MR (bytes)    MMR (bytes)
______________________________________
CCITT #1    237,600                 20,951        18,708        14,455
CCITT #2    237,600                 47,619        40,659        31,663
CCITT #3    237,600                 128,280       95,553        60,879
______________________________________
These compression schemes result in good compression for pages that contain lines with relatively long "run lengths", such as contained in a page of text. The compression can produce efficient coding of these run lengths. For example, a run length of 10 consecutive white pixels would be encoded by 5 bits (00111 per Table 1) in the MH encoding scheme, resulting in a compression ratio of 2.0.
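The compression ratio used above is simply the original size divided by the compressed size. The following sketch applies that definition to the figures of Table 2. The names `compression_ratio`, `ratios`, and `TABLE_2` are chosen for illustration, and the uncompressed size assumes one bit per pixel.

```python
# Standard Resolution page at one bit per pixel: 1728 x 1100 / 8 bytes.
UNCOMPRESSED_BYTES = 1728 * 1100 // 8

# Compressed sizes in bytes, copied from Table 2.
TABLE_2 = {
    "CCITT #1": {"MH": 20951, "MR": 18708, "MMR": 14455},
    "CCITT #2": {"MH": 47619, "MR": 40659, "MMR": 31663},
    "CCITT #3": {"MH": 128280, "MR": 95553, "MMR": 60879},
}


def compression_ratio(original, compressed):
    """Original size divided by compressed size (same units for both)."""
    return original / compressed


def ratios(table, original=UNCOMPRESSED_BYTES):
    """Compression ratio for every page and scheme in the table."""
    return {
        page: {scheme: compression_ratio(original, size)
               for scheme, size in sizes.items()}
        for page, sizes in table.items()
    }


if __name__ == "__main__":
    # The 10-pixel white run encoded in 5 bits, as in the text.
    print("10-pixel run:", compression_ratio(10, 5))
    for page, per_scheme in ratios(TABLE_2).items():
        print(page, {s: round(r, 2) for s, r in per_scheme.items()})
```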
Such conventional data compression schemes suffer from at least the following two drawbacks. Firstly, they do not have the capability to satisfactorily compress complex pages (e.g. "half tone") because little, if any, storage space is actually saved by the compression scheme. Secondly, the implementation of these schemes in a software based facsimile system, such as NSFAX available from National Semiconductor Corporation, is problematic because they require a high level of processor power. These drawbacks are explained in greater detail below.
Compression of gray scale images:
Gray scale images, when converted to half tone images, are characterized, in contrast to most simple text pages, by very short run lengths, typically one to four bits long.
As one can see from Table 1, if the run lengths are more than 4 bits in length, the MH encoding will save space, because each code word is shorter than the run it replaces. On the other hand, in complex gray scale images, where it is common to have areas with very short run lengths (both black and white), the MH encoding can actually increase the storage requirements. Table 3 shows a typical gray scale example:
TABLE 3
______________________________________
Original image bitmap    MH encoding                # of original bits    # of MH bits
______________________________________
0011010111               01111100011101000011110    10                    23
______________________________________
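The expansion in Table 3 can be reproduced with a small sketch that uses only the ten code words listed in Table 1. It assumes that "0" denotes a white pixel and "1" a black pixel, and it handles run lengths of 1 to 10 only. The names `run_lengths` and `mh_encode` are illustrative, and end-of-line codes and longer runs are deliberately left out.

```python
# Code words for run lengths 1..10, copied from Table 1.
WHITE_CODES = {1: "000111", 2: "0111", 3: "1000", 4: "1011", 5: "1100",
               6: "1110", 7: "1111", 8: "10011", 9: "10100", 10: "00111"}
BLACK_CODES = {1: "010", 2: "11", 3: "10", 4: "011", 5: "0011",
               6: "0010", 7: "00011", 8: "000101", 9: "000100", 10: "0000100"}


def run_lengths(bits):
    """Split a string of '0'/'1' pixels into (pixel, length) runs."""
    runs = []
    for pixel in bits:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([pixel, 1])   # start a new run
    return [(pixel, length) for pixel, length in runs]


def mh_encode(bits):
    """Encode a pixel string with the Table 1 subset of MH code words.

    Assumption: '0' is white and '1' is black. Raises KeyError for runs
    longer than 10, which this sketch does not cover.
    """
    out = []
    for pixel, length in run_lengths(bits):
        table = WHITE_CODES if pixel == "0" else BLACK_CODES
        out.append(table[length])
    return "".join(out)


if __name__ == "__main__":
    original = "0011010111"
    encoded = mh_encode(original)
    print(original, "->", encoded)
    print(len(original), "bits ->", len(encoded), "bits")
```

The six short runs in this bitmap (2, 2, 1, 1, 1 and 3 pixels) each map to a code word of 2 to 6 bits, which is why the output is longer than the input.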
The problem of obtaining a "compressed" file which is bigger than the uncompressed original is quite common for gray scale images. MR and MMR compression schemes, which usually achieve better compression than MH, do not solve this problem either, and can even produce worse results. Table 4 shows the original file size as well as the compressed sizes for three gray scale images (which are attached in the appendix).
TABLE 4
______________________________________
File        File size (bytes)    MH (bytes)    MMR (bytes)
______________________________________
"PENCIL"    237,600              822,133       608,148
"FAMILY"    237,600              463,367       516,358
"FROG1"     237,600              372,751       466,477
______________________________________
Performance:
The standard MH, MR and MMR compression algorithms operate by determining run lengths of bits so that the bit pattern can be encoded more efficiently. To do so, the basic operations carried out on the input bitmap generated from a scan of the original image, and on the output of compressed data, are done on the basis of bit boundaries. In other words, a typical algorithm will count the number of "0" bits in a row and, according to the result, will substitute a unique, pre-designated bit pattern (i.e. compression code) in the output (see the encoding in Table 1 for some examples). While this method may not be problematic when implemented in hardware, it is very time consuming when done in software. This is mainly because processors expend considerable processing power aligning operations to bit boundaries. Stated another way, processors typically transfer data in multiples of a preset number of bits, such as 8-bit bytes. However, a run length can end at any bit position, yet the processor must still transfer the data in whole bytes. For this reason, and for other reasons such as the time needed to repeatedly access the memory, the greater number of instructions required, the slow shift instructions needed to align the bits, and the slow determination of where the bits change color, efficiency is lost with a software implementation of an encoding scheme for half tone images.
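The alignment cost described above can be made concrete with a minimal sketch of a bit-level output buffer. The class `BitWriter` is hypothetical and is not taken from NSFAX or any other product. It only illustrates that every output bit costs a shift, an OR, and a boundary check before a whole byte can be stored.

```python
class BitWriter:
    """Packs variable-length code words into bytes, most significant bit first."""

    def __init__(self):
        self.buffer = bytearray()
        self.acc = 0      # bits collected so far for the current byte
        self.nbits = 0    # how many bits of acc are valid (0..7)

    def write(self, code):
        """Append a code word given as a string of '0'/'1' characters."""
        for ch in code:
            # One shift and one OR per output bit.
            self.acc = (self.acc << 1) | int(ch)
            self.nbits += 1
            if self.nbits == 8:       # byte boundary reached: store it
                self.buffer.append(self.acc)
                self.acc = 0
                self.nbits = 0

    def flush(self):
        """Pad the last partial byte with zero bits and return all bytes."""
        if self.nbits:
            self.buffer.append(self.acc << (8 - self.nbits))
            self.acc = 0
            self.nbits = 0
        return bytes(self.buffer)


if __name__ == "__main__":
    # Code words for the runs of the bitmap 0011010111 (see Table 1).
    writer = BitWriter()
    for code in ["0111", "11", "000111", "010", "000111", "10"]:
        writer.write(code)
    print(writer.flush().hex())
```

None of the six code words in this usage example ends on a byte boundary, so each one straddles or partially fills a byte, which is the situation the paragraph above identifies as costly in software.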
This performance limitation becomes especially troublesome in connection with obtaining what may be known as a "quick scan". This term refers to a technique by which a person sending a facsimile can insert a document to be scanned, and the resulting data is retained in a memory for later use as transmission is initiated. Once the scanning and retention are completed, the actual steps of sending the document (e.g. dialing, handshaking and transmission) are carried out at whatever speed is appropriate for the equipment being used. However, the user is freed from the task because the facsimile does the transmission on its own by retrieving the stored data. This is different from the earlier-developed, slower and more laborious approach of sending a facsimile with the speed of scanning a document being governed by the transmission speed which is normally considerably slower. It is quite reasonable to expect that a "quick scan" session requires no more than a few seconds even for a complex, gray scale image.
The data compression for a "quick scan" is done in real time as the original is being scanned and with no intermediate storage of the data except that the previous line must be stored for use as a reference to perform MR encoding. Acceptable speed for "quick scan" is available only with specially designed hardware which involves a longer development time for a product, requires more board space, and adds to the cost.
Table 5 provides timing for compressing some of the above-mentioned pages with MH encoding (as measured using NSFAX):
TABLE 5
______________________________________
File      Time (sec)
______________________________________
CCITT3    7.5
FAMILY    24.7
PENCIL    43.5
______________________________________
It is quite evident that the length of time required is more than one would find reasonably convenient.