The present invention is directed to a method for improving efficiency of data encoding and decoding. Specifically, the present invention is directed to a method for enhancing an entropy encoder to reduce the worst case inefficiency of such an encoder.
As people seek to communicate more and more frequently in the digital domain, that is transferring digital data between a source and a receiver, a limiting factor is the availability of channel resources. Channels typically are provided with a predetermined bandwidth based on the physical characteristics of the channel itself. To therefore increase the overall information throughput along the channel, it is beneficial to provide capabilities by which information can be transmitted through the channel utilizing the least amount of bandwidth possible.
FIG. 1 illustrates an example of a prior art configuration for transmitting data from a source to a receiver which keeps this goal in mind. In particular, source information, which ultimately could be in analog form and has been digitized by some analog to digital converter (not shown) is provided to a compression device 101. The compression device operates on the digital bit stream to reduce the number of bits needed to represent the information received from the source. Such information can take the form of a bit stream that is representative of some event, that is, for example, letters in text or pixels in an image. The compressed data is then provided to an encryption device 102 which can "scramble" the data to make it unusable by anyone who is not aware of both the encryption technique and the encryption key. This safeguards the data from being taken off of the channel by some third party and used without permission of the source. The encrypted data is provided to an error correction code (ECC) generator. This ECC generator provides additional information based on the encrypted data so as to permit the receiver to reconstruct data even where some limited number of errors occur in the course of the transmission of the data from the source to the receiver. The data is then provided to the channel 104. At a receiving end the data is then passed to a device referred to as "EEC-decoder" 105 which uses the transmitted error correction code information to present a corrected data bit stream to the decryption device 106. The decryption device then deciphers the received data based on some encryption key to essentially provide to the decompression device 107 the same data bit stream which was provided from the compression device 101 to the encryption device 102. The decompression device thus takes what amounts to be the compressed data bit stream and expands it for use by the receiver (not shown).
FIGS. 2A and 2B are block diagram representations of portions of the compression device and the decompression device of FIG. 1, respectively. In FIG. 2A the compression device may include an encoder modeler 201 and a coder 202. In this context the modeler and the coder will both receive the digital representation of events. The modeler will provide probabilities with respect to the events, for example, the probability that the letter "u" will follow the letter "q", to the coder. The coder takes this probability and based on the nature of the coder tries to produce a bit stream with the fewest bits possible.
FIG. 2B relates to what transpires at the receiver in the decompression module. In this instance, the coded bit stream is provided to the decoder 204. Again, the modeler provides probability information to the decoder so that the decoder can take the coded bit stream and provide a stream of decoded events to the receiver as well as feeding the events back to the modeler. Such modelers are well known in the art.
Sometimes it may be desirable to use different codes in the same application and to have the output from the different coders appear in the same coded bit stream. One way to insure decodability is to have the encoder arrange the bits and the coded bit stream so that the decoder, by reading bits sequentially, will automatically extract those bits it needs, exactly when it needs them. This mechanism is called decoder-synchronized encoding. This technique has been described in literature, for example, "Bi-Level Image Coding with Melcode-Comparison of Block-Type Code and Arithmetic-Type Code", Ono et al., proceedings IEEE Global Telecommunications Conference, November 1989.
The prior art specifically describes interleaving codes referred to as run-length codes. These codes are based on the notion of producing a bit stream that represents how many occurrences in a row there are of more probable events. One example of such a run length code is referred to as a Golomb code described in "Run-Length Encodings", Golomb, IEEE Trans. Inform Theory It-12, (July 1966), 399-401.
As an example, run-length codes can be based on a two-symbol alphabet (M,L) where M refers to a more probable event and L refers to a less probable event. The Golomb code is designed to send a bit stream that is representative of the length of a run of the more probable symbol, that is the number of consecutive times the more probable symbol has appeared. For purposes of this explanation, assume that the run corresponds to "n" consecutive occurrence of the more probable symbol, M. Then in accordance with the Golomb code, n=uR+b where 0.ltoreq.b&lt;R and R is considered to be the Golomb code parameter. R is typically expressed as R=2.sup.P +Q. For ease of discussion it is assumed for the moment that Q=0. Also as an example, let's assume that R=8, that is the coder is looking for blocks of eight consecutive occurrences of the more probable event. Also assume for this example that the run length n=19. In that circumstance, solving the equation n=uR+b yields that 19=2.times.8+3, so u=2 and b=3. The coder then outputs u in unary form, that is "u" number of is followed by 0, for instance, here since u=2 then the output of u in unary is 110. To complete the code for n=19, b is output in P-bit binary. Here since the block size is 2.sup.P =8, P=3, so b is expressed in 3-bit binary form, or as 011. The code then for n=19 is 110011. This represents the following string of events MMMMMMMMMMMMMMMMMMML. Thus, 20 events (19M+1L) are represented in six bits.
It is known that if Q does not equal 0 then if 0.ltoreq.b&lt;2.sup.P -q, b is then output is a P-bit binary number while if 2P-Q.ltoreq.b&lt;2.sup.P then b+2.sup.P -Q is output as a (P+1) -bit binary number. However, for ease of discussion, while the present invention may be applied to such circumstances, the embodiment described herein will presume that Q=0.
The Ono paper describes a method for interleaving Golomb codes that have different R parameters. Block Melcode uses decoder-synchronized encoding that is accomplished by encoder buffering. If a run of n occurrences of the more probable symbol (M) is being coded using Golomb parameter R, a block of R consecutive occurrences of M will add a one to the unary part of the expression for n. This "1" can be queued for output immediately. After some number of these blocks of M, a block will occur containing b Ms followed by a "less probable symbol" L and the run will end. To be properly decodable a block's encoding must appear in the output stream when the decoder is ready to decode the first symbol in the block. The output of blocks coded with other parameters will usually appear between blocks coded with any given parameter. Ordering and buffering the output blocks is the essence of the block Melcode. To decode a symbol that was coded using parameter R the decoder extracts either a 1 indicating the occurrence of R consecutive Ms, or a 0 followed by P or P+1 bits which indicate (after decoding the value of b from those bits) the occurrence of b consecutive Ms followed by an L. Until all the symbols from the block have been decoded, the decoder will not have to extract any more bits from the coded data stream for symbols coded with parameter R. At the end of encoding, unfinished blocks can be coded as though they were filled with Ms. This merely leaves the decoder with unneeded symbols. A similar mechanism can be used to flush the output queue of the encoder if too many blocks are queued for output. In this case, the decoder can recognize the situation by tracking the encoder's queue size and can discard the unneeded symbols.
To this point only the encoder operation has been described, however the decoder works in a similar fashion. First since this is a decoder-synchronized encoding technique the decoder knows which bits in the bit stream apply to a given block Melcode. Thus, the modeler and decoder combination, such as in FIG. 2 can take a block Melcode associated with a given R parameter and recreate the string of events. For example, if the decoder detects "11101" for a block Melcode with R parameter equal to 2 then this corresponds to MMMMMMML.
The block Melcode is known to be somewhat inefficient in its compression performance. Even where the parameter R is selected based on the probability of the more probable event, there are inefficiencies introduced by the block Melcode. FIG. 3 provides a graphical representation of the percentage of inefficiency versus the probability of the more probable symbol with Golomb codes with various R parameters. It can be seen that a spike of as high as 4.2% inefficiency occurs in the area where the probability of M is approximately 0.618 where probabilities are expressed in the range of from 0 to 1. It would be advantageous to provide a technique by which this inefficiency of the block Melcode could be improved thereby enhancing the compression performance of a coder employing such block Melcodes.