1. Field of the Invention
The present invention relates generally to devices and methods for video compression, and more particularly to devices and methods for video compression of motionless images.
2. Description of the Prior Art
Video and audio signals making up a conventional television broadcast may be digitized and then compressed in accordance with standards established by the International Organization for Standardization (“ISO”) and International Electrotechnical Commission (“IEC”). One of these standards, ISO/IEC 11172, is generally identified by the popular name MPEG-1. A technologically related standard, ISO/IEC 13818, is identified by the popular name MPEG-2. The MPEG-1 and MPEG-2 standards respectively define a serial system stream, i.e. a bitstream that contains both compressed video and audio data, that is well suited for quality:
    1. video playback from digital storage media such as a hard disk, CD-ROM, or digital video disk (“DVD”); and
    2. transmission such as over a cable antenna television (“CATV”) system or high bit rate digital telephone system, e.g. a T1, ISDN Primary Rate, or ATM digital telecommunications network.
The MPEG-1 and MPEG-2 standards are hereby incorporated by reference.
The block diagram of FIG. 1 graphically illustrates a portion of the process by which video and audio signals making up a conventional television broadcast are digitized and then compressed during assembly of an MPEG-1 or MPEG-2 serial system stream. In the illustration of FIG. 1, a video camera 22, video tape player 24, video disk player 26 or some other type of video-data storage-device 28 supplies both:
    1. an audio signal, indicated in FIG. 1 by an arrow 32, to an audio encoder 34; and
    2. a video signal, indicated in FIG. 1 by an arrow 36, to a video encoder 38.
In accordance with either of the MPEG standards, the encoders 34 and 38 first digitize the respective signals 32 and 36, and then encode the digitized signals 32 and 36 respectively into an MPEG compressed audio bitstream 44 and an MPEG compressed video bitstream 42. Subsequently during the MPEG compression process, as illustrated in FIG. 1, an MPEG serial system stream 46 is assembled by concatenating packs 48 of compressed data selected respectively from the compressed video bitstream 42 and the compressed audio bitstream 44.
In this way, the MPEG serial system stream 46 incorporates the compressed video bitstream 42 that may be decompressed to present a succession of frames of video. As illustrated in FIG. 2, the compressed video bitstream 42 produced by the video encoder 38 consists of successive groups of pictures (“GOPs”) 52. Each GOP 52 includes intra (“I”) frames 54, predicted (“P”) frames 56, and bidirectional (“B”) frames 58. An I frame 54 of MPEG compressed digital video data is both encoded and decoded without direct reference to video data in other frames. Therefore, MPEG compressed video data for an I frame 54 represents an entire uncompressed frame of digital video data. An MPEG P frame 56 is both encoded and decoded with reference to a prior frame of video data, either a prior I frame 54 or a prior P frame 56. A B frame 58 of MPEG encoded digital video data is both encoded and decoded with reference both to a prior and to a successive reference frame, i.e. reference to decoded I or P frames 54 or 56. The MPEG-1 and MPEG-2 specifications define a GOP 52 to be one or more I frames 54 together with all of the P frames 56 and B frames 58 for which the one or more I frames 54 are a reference. MPEG-2 operates in a manner analogous to MPEG-1 with an additional feature that the I frames 54, P frames 56, and B frames 58 of the MPEG-1 GOP 52 could be fields of the I frames 54, P frames 56, and B frames 58, thus permitting field-to-field motion compensation in addition to frame-to-frame motion compensation.
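By way of illustration only, the reference relationships among I, P, and B frames described above may be sketched in Python as follows. The function name `frame_references` and the representation of a GOP as a string of frame types in display order are assumptions made for this example and do not appear in the MPEG standards.

```python
def frame_references(gop):
    """For each frame of a GOP given in display order (e.g. "IBBPBBP"),
    return the indices of the frames it is predicted from.

    Illustrative assumptions: I frames reference nothing, P frames reference
    the nearest prior I or P frame, and B frames reference the nearest prior
    and the nearest successive I or P frame.
    """
    anchors = [i for i, kind in enumerate(gop) if kind in "IP"]
    refs = []
    for i, kind in enumerate(gop):
        prior = [a for a in anchors if a < i]
        later = [a for a in anchors if a > i]
        if kind == "I":
            refs.append(())
        elif kind == "P":
            refs.append(tuple(prior[-1:]))
        elif kind == "B":
            refs.append(tuple(prior[-1:] + later[:1]))
        else:
            raise ValueError("unknown frame type: %r" % kind)
    return refs


if __name__ == "__main__":
    for index, refs in enumerate(frame_references("IBBPBBP")):
        print(index, "IBBPBBP"[index], refs)
```

In this sketch every P and B frame depends, directly or indirectly, on the I frame that opens the GOP.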
Regardless of whether an I frame 54, a P frame 56, or a B frame 58 is being compressed, in performing MPEG compression each successive frame 62 of uncompressed digital video data is divided into slices 64 representing, for example, sixteen (16) immediately vertically-adjacent, non-interlaced television scan lines 66. An MPEG-1 slice 64 can be defined to specify an entire frame of decompressed video. However, an MPEG-2 slice 64 can be defined to specify video that has a maximum height of one slice 64, i.e. sixteen (16) immediately vertically-adjacent, non-interlaced television scan lines 66, and which spans the frame's width. MPEG compression further divides each slice 64 into macroblocks 68, each of which stores data for a matrix of picture elements (“pels”) 72 of digital video data, e.g. a 16×16 matrix of pels 72.
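The division of a frame into slices and macroblocks may be sketched as follows. This is a minimal example assuming one slice per row of macroblocks and frame dimensions that are exact multiples of sixteen; the function name `macroblock_grid` and the 352×240 frame size used in the usage lines are illustrative choices only.

```python
MACROBLOCK_SIZE = 16  # pels per macroblock side, per the 16x16 example above


def macroblock_grid(width, height):
    """Return a list of slices; each slice is a list of (x, y) pel
    coordinates of the top-left corner of one macroblock.

    Assumption for this sketch: each slice spans the full frame width and is
    exactly one macroblock (16 scan lines) high.
    """
    if width % MACROBLOCK_SIZE or height % MACROBLOCK_SIZE:
        raise ValueError("sketch assumes dimensions are multiples of 16")
    return [
        [(x, y) for x in range(0, width, MACROBLOCK_SIZE)]
        for y in range(0, height, MACROBLOCK_SIZE)
    ]


if __name__ == "__main__":
    slices = macroblock_grid(352, 240)
    print(len(slices), "slices of", len(slices[0]), "macroblocks each")
```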
MPEG compression processes the digital video data for each macroblock 68 in a YCbCr color space. The Y component of this color space represents the brightness, i.e. luminance, at each pel 72 in the macroblock 68. The Cb and Cr components of the color space represent subsampled color differences, i.e. chrominance, for 2×2 groups of immediately adjacent pels 72 within the macroblock 68. Thus, each macroblock 68 consists of six (6) 8×8 blocks of digital video data that in the illustration of FIG. 1 are enclosed within a dashed line 74. The six (6) 8×8 blocks of digital video data making up each macroblock 68 include:
    1. four (4) 8×8 luminance blocks 76 that contain brightness data for each of the 16×16 pels 72 of the macroblock 68; and
    2. two (2) 8×8 chrominance blocks 78 that respectively contain subsampled Cb and Cr color difference data also for the pels 72 of the macroblock 68.
In compressing all the macroblocks 68 of each I frame 54 and certain macroblocks 68 of P frames 56 and B frames 58, MPEG digital video compression separately compresses data of the luminance blocks 76 and of the chrominance blocks 78, and then combines the separately compressed blocks 76 and 78 into the compressed video bitstream 42.
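The composition of a macroblock from four luminance blocks and two chrominance blocks may be sketched as follows. The function names are hypothetical, and averaging each 2×2 group of pels is merely one assumed way of subsampling the chrominance; the text above does not specify the subsampling filter.

```python
def split_luma(y_plane):
    """Split a 16x16 luminance matrix into four 8x8 blocks, ordered
    top-left, top-right, bottom-left, bottom-right."""
    return [
        [row[c:c + 8] for row in y_plane[r:r + 8]]
        for r in (0, 8)
        for c in (0, 8)
    ]


def subsample_chroma(plane):
    """Reduce a 16x16 chrominance matrix to 8x8 by averaging each 2x2 group
    of adjacent pels (an assumed filter, chosen for simplicity)."""
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, 16, 2)
        ]
        for r in range(0, 16, 2)
    ]


def macroblock_blocks(y_plane, cb_plane, cr_plane):
    """Return the six 8x8 blocks of one macroblock: four Y, then Cb, then Cr."""
    return split_luma(y_plane) + [
        subsample_chroma(cb_plane),
        subsample_chroma(cr_plane),
    ]
```

Under these assumptions a macroblock carries 4 × 64 luminance samples but only 2 × 64 chrominance samples, i.e. half as much data as the three full-resolution 16×16 components would require.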
Mathematically, the four (4) luminance blocks 76 and two (2) chrominance blocks 78 of each macroblock 68 respectively constitute 8×8 matrices. Referring now to FIG. 3, compressing each macroblock 68 includes independently computing an 8×8 Discrete Cosine Transform (“DCT”) 82 for each of the six (6) 8×8 blocks 76 and 78 making up the macroblock 68. The six (6) 8×8 DCTs 82, only one of which is depicted in FIG. 3, respectively map the data of the six (6) blocks 76 and 78 into sixty-four (64) frequency coefficients. Each frequency coefficient in the DCT 82 represents a weighting factor that is applied to a corresponding basis cosine curve. The sixty-four (64) basis cosine curves vary in frequency. Low cosine frequencies encode coarse luminance or chrominance structure in the macroblock 68. High cosine frequencies encode detail luminance or chrominance features in the macroblock 68. Adding together the basis cosine curves weighted by the sixty-four (64) DCT coefficients reproduces exactly the 8×8 matrix of an encoded block 76 or 78.
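The 8×8 DCT and its inverse may be sketched directly from their definitions as follows. This is an unoptimized illustration using the orthonormal two-dimensional DCT; the names `BASIS`, `dct2`, and `idct2` are chosen for this example, and practical encoders use fast factorizations instead of the direct sums shown here.

```python
import math

N = 8

# BASIS[k][n] is the k-th orthonormal cosine basis curve sampled at position n.
BASIS = [
    [
        (math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
        * math.cos((2 * n + 1) * k * math.pi / (2 * N))
        for n in range(N)
    ]
    for k in range(N)
]


def dct2(block):
    """Map an 8x8 block of samples to 64 frequency coefficients."""
    return [
        [
            sum(block[x][y] * BASIS[u][x] * BASIS[v][y]
                for x in range(N) for y in range(N))
            for v in range(N)
        ]
        for u in range(N)
    ]


def idct2(coeffs):
    """Sum the basis curves weighted by the coefficients to rebuild the block."""
    return [
        [
            sum(coeffs[u][v] * BASIS[u][x] * BASIS[v][y]
                for u in range(N) for v in range(N))
            for y in range(N)
        ]
        for x in range(N)
    ]
```

For a block in which all sixty-four samples are equal, only the lowest-frequency coefficient is non-zero, and applying `idct2` to the output of `dct2` reproduces the original block to within floating-point rounding.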
By themselves, the coefficients of the DCT 82 for a block 76 or 78 provide no compression. However, because video data for most macroblocks 68 lack detail luminance or chrominance features, most high-frequency coefficients for the DCTs 82 are typically zero (0) or near zero (0). To further increase the number of zero coefficients in each DCT 82, MPEG encoding divides each coefficient by a quantization value which generally increases with the frequency of the basis cosine curve for which the coefficient is a weight. Dividing the coefficients of the DCT 82 by their corresponding MPEG quantization values reduces image detail. Large numeric values for quantization reduce detail more, but also provide greater data compression for reasons described in greater detail below.
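The quantization step may be sketched as follows. The table of quantization values produced by `make_steps` is a hypothetical one that merely grows with frequency; it is not the default matrix of either MPEG standard, and the sketch omits other details of MPEG quantization such as the quantizer scale factor.

```python
def make_steps(base=8, slope=4):
    """Hypothetical quantization values that increase with frequency index
    (u + v). Illustrative only; not an MPEG default matrix."""
    return [[base + slope * (u + v) for v in range(8)] for u in range(8)]


def quantize(coeffs, steps):
    """Divide each DCT coefficient by its quantization value and round."""
    return [
        [int(round(coeffs[u][v] / steps[u][v])) for v in range(8)]
        for u in range(8)
    ]


def dequantize(levels, steps):
    """Approximate the original coefficients from the quantized levels."""
    return [
        [levels[u][v] * steps[u][v] for v in range(8)]
        for u in range(8)
    ]
```

With these assumed values, a high-frequency coefficient of 20 divided by a quantization value of 64 rounds to zero, whereas a low-frequency coefficient of 31 divided by 12 rounds to 3 and is restored as 36. The rounding both creates zeros and discards detail.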
After quantizing the DCT 82, the quantized frequency coefficients are processed in a zigzag order as indicated by arrows 84a-84i in FIG. 3. Applying a zigzag order to the quantized frequency coefficients tends to produce long sequences of DCT frequency coefficients having zero (0) value. Run-length encoding, indicated by an arrow 86 in FIG. 3, is then applied to the zigzag order of the quantized DCT coefficients. For those quantized DCT coefficients that differ from the immediately preceding and succeeding DCT coefficients along the zigzag path, run-length encoding specifies a run-length of zero (0), i.e. a single occurrence of the quantized DCT coefficient. Long sequences of zero (0) coefficients along the zigzag path depicted in FIG. 3 are efficiently encoded using a lesser amount of data. MPEG run-length encoding represents each such sequence of consecutive identical-valued quantized frequency coefficients by a token 88, depicted in FIG. 3, which specifies how many consecutive quantized frequency coefficients have the same value together with the numerical value for that set of quantized frequency coefficients.
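Zigzag ordering and run-length encoding may be sketched as follows. This example uses the (run, level) formulation in which each token gives the number of zero coefficients skipped followed by the next non-zero value, which differs slightly in wording from the description above. The `"EOB"` end-of-block marker and the function names are assumptions of this sketch, and no special handling of the lowest-frequency coefficient is attempted.

```python
def zigzag_order(n=8):
    """Return (row, column) positions of an n x n matrix in zigzag order:
    anti-diagonals are visited in turn, alternating direction."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (
            rc[0] + rc[1],
            rc[0] if (rc[0] + rc[1]) % 2 else rc[1],
        ),
    )


def run_length(levels):
    """Encode quantized coefficients as (zero_run, value) tokens.

    Trailing zeros are not encoded individually; a single "EOB"
    (end of block) marker stands for all of them.
    """
    tokens = []
    run = 0
    for r, c in zigzag_order(len(levels)):
        value = levels[r][c]
        if value == 0:
            run += 1
        else:
            tokens.append((run, value))
            run = 0
    tokens.append("EOB")
    return tokens
```

A block whose only non-zero quantized coefficients lie near the top-left corner is thereby reduced from sixty-four values to a handful of tokens.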
The tokens 88 extracted from the sequence of quantized frequency coefficients are then further compressed through Huffman coding, indicated by an arrow 92 in FIG. 3. Huffman coding converts each token 88 into a variable length code (“VLC”) 94. MPEG assigns values that are only 2-3 binary digits (“bits”) long for the VLCs 94 representing the most common tokens 88. Conversely, MPEG video compression assigns values that are up to 28 bits long for the VLCs 94 representing rare tokens 88. The Huffman coded VLCs 94 thus determined are then appropriately merged to form the compressed video bitstream 42 depicted in FIG. 1.
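The principle that common tokens receive short codes and rare tokens receive long codes may be sketched with a generic Huffman construction as follows. MPEG itself uses fixed, predefined VLC tables; this example instead builds a code from hypothetical token counts, and the function name `huffman_code` and the counts in the usage lines are illustrative assumptions.

```python
import heapq
from itertools import count


def huffman_code(freqs):
    """Build a prefix-free binary code (symbol -> bit string) from a
    mapping of symbol -> occurrence count."""
    if len(freqs) == 1:
        return {symbol: "0" for symbol in freqs}
    tie = count()  # unique tie-breaker so heap entries never compare dicts
    heap = [(f, next(tie), {symbol: ""}) for symbol, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, low = heapq.heappop(heap)
        f2, _, high = heapq.heappop(heap)
        # Merge the two rarest subtrees, prefixing one bit to every code.
        merged = {s: "0" + code for s, code in low.items()}
        merged.update({s: "1" + code for s, code in high.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]


if __name__ == "__main__":
    # Hypothetical counts of run-length tokens, for illustration only.
    counts = {"EOB": 50, (0, 1): 30, (1, 1): 12, (0, 2): 6, (5, 3): 2}
    for token, bits in sorted(huffman_code(counts).items(), key=lambda kv: len(kv[1])):
        print(token, bits)
```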
While decoding the compressed video bitstream 42 assembled as described above reproduces frames of motion video that are generally visually acceptable, reproduced frames of still images, particularly still images containing text, are in many instances, if not most, visually unacceptable. As described above, the process depicted in FIG. 3 of separately computing the DCTs 82 for the luminance blocks 76 and the chrominance blocks 78, quantizing the DCT coefficients, zigzag ordering of quantized DCT coefficients, run-length encoding, and finally Huffman coding generally removes a significant amount of high frequency data from MPEG compressed I frames 54. Decoding of I frames 54 from which high frequency data has been removed produces an image having less detail, e.g. sharp corners and abrupt transitions from one color or intensity to another, than appeared in the uncompressed frame of video data. However, MPEG compression does not completely discard this high frequency data, i.e. image detail. MPEG compression attempts to encode this high frequency data into successive P frames 56 and B frames 58 that use the I frame 54 as a reference, either directly or indirectly. Consequently, after decoding the lesser detail in each I frame 54 of a still image, decoding subsequent P frames 56 and B frames 58 increases, over time, the detail present in the video images until the next I frame 54 is decoded.
For the preceding reasons, image detail in frames 62 decoded from the conventional MPEG compressed video bitstream 42 that reproduce a still image, particularly a still image containing text, tends to be lower at the beginning of each GOP 52 when an I frame 54 is decoded, to increase during decoding of successive P frames 56 and B frames 58 in the GOP 52, and then to decrease again upon decoding the next I frame 54. Thus, decoding the MPEG compressed video bitstream 42 of a still image frequently produces a video image that appears to pulse visually, usually at a frequency that is identical to the frequency at which GOPs 52 occur in the compressed video bitstream 42, e.g. twice per second. This visual pulsing of a decompressed MPEG compressed video bitstream 42 of a still image in many instances makes such images commercially unacceptable.
In addition to the conventional MPEG compressed video bitstream 42, there also exists another technique for compressing the video signal of a conventional television broadcast, frequently identified as motion JPEG. The compressed video bitstream 42 for motion JPEG includes only I frames 54, and therefore omits both P frames 56 and B frames 58. Consequently, images decoded from a motion JPEG compressed video bitstream having a quality equivalent to that of MPEG compressed video require a larger amount of data. Alternatively, images decoded from a motion JPEG compressed video bitstream that has an amount of data equivalent to MPEG compressed video possess a lesser quality than decoded MPEG-1 images.