The present invention relates to a method of compressing data and, more particularly, to a method of compressing data representative of a video signal to achieve more facile transmission thereof.
Data compression or compaction techniques can increase the throughput of data over a communication link. A simple way to achieve data compression is to examine data intended to be moved over a communication link and to transmit an instruction which provides a count of characters that are repeated in sequence. This technique, called "run length encoding", permits the instruction to be transmitted and received as a surrogate for the characters, without the serial characters themselves actually being transmitted or received. Another technique, called "code book compression", uses specific codes to indicate a pattern of characters and phrases, permitting one character to be sent and received as a surrogate for numerous other characters. Also known is so-called "Huffman encoding", pursuant to which tables indicating the frequency of use of characters within a language are developed. The number of bits required to send each character is based on the character's relative frequency in the language.
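The run length encoding described above can be illustrated with a minimal Python sketch. The function names rle_encode and rle_decode and the (count, character) pair format are assumptions chosen for illustration; the text does not prescribe any particular instruction format.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of a repeated character into a (count, character) pair."""
    return [(len(list(group)), char) for char, group in groupby(text)]

def rle_decode(pairs):
    """Expand (count, character) pairs back into the original string."""
    return "".join(char * count for count, char in pairs)

encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [(4, 'A'), (3, 'B'), (2, 'C'), (1, 'D')]
print(rle_decode(encoded))  # AAAABBBCCD
```

Each pair stands in for a whole run, so the saving grows with the length of the runs; data with few repeats gains little or nothing.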
A video image is usually expressed in pixels or pels. A pixel is a point (i.e., an address or location) in an orthogonal (X, Y) array, the point having a value. The value is related to, or determines, color and/or intensity. Of interest is the compression of video data contained in sixty-four 64 by 64 arrays (4096 pixels per array), where each pixel may have one of 256 values ranging from 0 to 255.
A "standard" video screen has a theoretical grid of 520×640 pixels, of which only a 512×512 pixel grid typically is used. A 512×512 pixel grid may be viewed as containing 64 (i.e., 8×8) "BLOCKS", each BLOCK comprising a 64×64 array of pixels. The 512×512 pixel grid, or 8×8 grid of pixel BLOCKS, each being a 64×64 array of pixels, is called "FRAMEGRAB".
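The division of the 512 by 512 grid into 64 BLOCKS can be sketched in Python as follows. The text does not say how BLOCKS are numbered, so the row-by-row numbering from the upper left-hand corner is an assumption, as are the function name block_of and the use of the first index as the row and the second as the column.

```python
GRID = 512                        # pixels per side of the FRAMEGRAB
BLOCK = 64                        # pixels per side of one BLOCK
BLOCKS_PER_SIDE = GRID // BLOCK   # 8 BLOCKS per side, 64 BLOCKS in all

def block_of(x, y):
    """Return (block_index, local_x, local_y) for the pixel at row x, column y.

    block_index counts BLOCKS row by row from the upper left-hand corner
    (an assumed numbering); local_x and local_y locate the pixel in its BLOCK.
    """
    block_row, local_x = divmod(x, BLOCK)
    block_col, local_y = divmod(y, BLOCK)
    return block_row * BLOCKS_PER_SIDE + block_col, local_x, local_y

print(block_of(0, 0))      # (0, 0, 0)
print(block_of(24, 212))   # (3, 24, 20)
print(block_of(511, 511))  # (63, 63, 63)
```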
As noted, each pixel has an orthogonal location, representable in orthogonal X,Y notation, and a value, P. Thus, each pixel could be identified by the notation X,Y,P, in which X is the decimal value of its X coordinate (horizontal row in the array), Y is the decimal value of its Y coordinate (vertical column in the array) and P is its decimal color/intensity value. Typically the pixel at the origin (the upper left-hand corner), that is, the "first" pixel or pixel 0, has a decimal scalar address S of "0" (i.e., S=0) and is represented in X,Y or vector notation as X,Y=0,0. Since the standard FRAMEGRAB is 512×512, within a FRAMEGRAB X and Y vary from 0-511 (i.e., 0,0 to 511,511) and P varies from 0-255.
Alternatively, each pixel location can be identified by the decimal scalar quantity S identifying its grid location (there are 262,144 such locations in a 512×512 grid), varying from 0 to 262,143, and a decimal numeric designation P of each pixel's value (0-255). The X,Y vector and S (scalar) notations for a pixel's address are convertible one into the other. For example, considering the 512 pixel by 512 pixel grid, the pixel in decimal scalar location S=12,500 is located in location X,Y=24,212, that is, in Row 24 and Column 212. This conversion may be easily achieved in the typical orthogonal pixel array by so-called residual mathematics. Specifically, the scalar decimal location S=12,500 in the grid is representable by
$$S = X \cdot \mathrm{Mod} + Y,$$
where "Mod" means modulus and is the number of pixels, here 512, in a row or column. Thus,
$$S = 12{,}500 = X \cdot 512 + Y,$$
where X is the whole number of times 512 divides into 12,500 (=24) and Y is the remainder (=212). That is,
$$S = 12{,}500 = (24)(512) + 212,$$
where X=24 and Y=212.
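The conversion between the scalar address S and the X,Y vector address amounts to an integer division with remainder. A minimal Python sketch follows; the function names to_vector and to_scalar are illustrative assumptions, not part of the text.

```python
MOD = 512  # number of pixels in a row or column of the grid

def to_vector(s, mod=MOD):
    """Convert a scalar address S into (X, Y): quotient and remainder of S / mod."""
    return divmod(s, mod)

def to_scalar(x, y, mod=MOD):
    """Convert a vector address (X, Y) back into the scalar address S."""
    return x * mod + y

print(to_vector(12500))    # (24, 212)
print(to_scalar(24, 212))  # 12500
print(to_vector(1300))     # (2, 276)
```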
An example of one type of compression is now described. Assume that in the 512×512 pixel grid, a pixel at S=1300 (X,Y=2,276 in X,Y notation) has a value of 100, pixels at S=1301 through S=1315 (X,Y=2,277 through 2,291 in X,Y notation) all have the value of 150, and the pixel at S=1316 (X,Y=2,292) has a value of 200. In transmitting non-compressed data representing the foregoing, the location S=1300 (or X=2, Y=276) and P=100 are first transmitted. This is followed by transmission of X=2, Y=277, P=150 through X=2, Y=291, P=150, which is followed by transmission of X=2, Y=292, P=200. Transmission and receipt of these data nominally require that all data be sent and received, decimally, digitally or otherwise. However, compression may be achieved.
Specifically, the value P=150 is the same for the pixels located at X,Y=2,277 through X,Y=2,291. For the pixels located at X,Y=2,278 through X,Y=2,291 (having the scalar addresses S=1302 through S=1315), the value P=150 may be transmitted as a "null" or "0". The presence of a "null" or "0" is a "code" which instructs that the affected pixel has the same value as the last non-0 pixel value, that is, that the pixels with scalar addresses from S=1302 (X,Y=2,278) through S=1315 (X,Y=2,291) have the same value P=150 as the pixel at scalar location S=1301 (X,Y=2,277). Thus, if the transmission notation is in the form "S,P", data regarding the foregoing pixels would be:
(a) Uncompressed (S,P): 1300,100; 1301,150; 1302,150 . . . , 1314,150; 1315,150; 1316,200;
(b) Compressed (S,P): 1300,100; 1301,150; 1302,0; 1303,0; . . . , 1314,0; 1315,0; 1316,200.
The presence of 0's in the fourteen locations following "1301,150", in place of bits representing "150", denotes addresses having pixels with the same value P=150 as address 1301 and decreases the amount of data which needs to be transmitted, that is, compresses the data. This is a type of the "run length encoding" mentioned earlier, in which serial, repeating characters are not sent. This technique is also referred to as "filling in".
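The "filling in" of repeated pixel values can be sketched in Python as follows. The function names compress_values and expand_values are illustrative assumptions. The text does not say how a genuine pixel value of 0 is distinguished from the repeat code, so this sketch simply rejects such input.

```python
def compress_values(pairs):
    """Replace each value P that repeats the preceding pixel's value with the code 0."""
    out, last = [], None
    for s, p in pairs:
        if p == 0:
            # Assumption: a genuine value of 0 would collide with the repeat code.
            raise ValueError("pixel value 0 cannot be represented in this sketch")
        out.append((s, 0 if p == last else p))
        last = p
    return out

def expand_values(pairs):
    """Restore the values by filling each 0 with the last non-0 value seen."""
    out, last = [], None
    for s, p in pairs:
        if p != 0:
            last = p
        out.append((s, last))
    return out

pixels = [(1300, 100)] + [(s, 150) for s in range(1301, 1316)] + [(1316, 200)]
packed = compress_values(pixels)
print(packed[:4])                       # [(1300, 100), (1301, 150), (1302, 0), (1303, 0)]
print(sum(p == 0 for _, p in packed))   # 14
print(expand_values(packed) == pixels)  # True
```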
Although locations in any grid of pixels, unlike the pixel values, do not repeat (each location is unique), a variant of the above run length encoding can be used as regards pixel locations. An S having the value 0 in the S,P notation does not mean the "0" location, but is a "code" meaning that the relevant location is "1" higher than the preceding location. Thus, the compression expressed as (b) above can be further compressed as:
(c) 1300,100; 0,150; 0,0; 0,0 . . . 0,0; 0,0; 0,200.
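The location coding that turns form (b) into form (c) can be sketched in Python as follows. The function names compress_addresses and expand_addresses are illustrative assumptions, as is the choice to treat the address of the first record as literal, so that a run may begin at any location.

```python
def compress_addresses(pairs):
    """Replace each address that is 1 higher than the preceding address with the code 0."""
    out, prev = [], None
    for s, p in pairs:
        code = 0 if prev is not None and s == prev + 1 else s
        out.append((code, p))
        prev = s
    return out

def expand_addresses(pairs):
    """Restore addresses: after the first record, a 0 means 'preceding address plus 1'."""
    out, prev = [], None
    for code, p in pairs:
        s = prev + 1 if code == 0 and prev is not None else code
        out.append((s, p))
        prev = s
    return out

# Form (b): the pixel values have already been coded, the addresses are still explicit.
form_b = [(1300, 100), (1301, 150)] + [(s, 0) for s in range(1302, 1316)] + [(1316, 200)]
form_c = compress_addresses(form_b)
print(form_c[:3], form_c[-1])              # [(1300, 100), (0, 150), (0, 0)] (0, 200)
print(expand_addresses(form_c) == form_b)  # True
```

Only the first record carries a full address; every later record in an unbroken run carries the one-symbol code instead.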
The above compression techniques may lead to compressions of the pixel value data as high as 70% and compressions of the pixel location data as high as 50%. Unfortunately, where both value and location must be transmitted, the "real" compression is the lower of the two compressions. Further, because pixel values P may be essentially not repetitive, these techniques may yield as little as 2% compression of the data representing pixel value P, in which event the real compression is about 2%.
Accordingly, one object of the present invention is the provision of techniques for compressing pixel value and address data which achieve compressions significantly higher than the techniques of the prior art, specifically, compressions of at least about 80% and higher, such as compressions of 100:1, or even 1000:1.
As noted, data compression using run length encoding may achieve maximum compressions of only 50%, but often the compression is substantially less than this maximum. Compressions achieved by "code book compression" and "Huffman encoding" may reach 70-80%, but, in reality, typically reach 60% or lower. Further, compression achieved according to the so-called "cosine law" is capable of achieving apparently higher compressions of up to 99%, but often results in, or introduces, errors which, in turn, produce "fuzzy" pictures.