The arithmetic coder was developed by Rissanen and first published in an article entitled "Generalized Kraft Inequality Arithmetic Coding", IBM Journal of Research and Development, Volume 20, No. 3, May 1976. The Arithmetic Coding procedure introduced by Rissanen permits the compression of multi-alphabet data, i.e. data each of whose symbols may be found within a multi-symbol alphabet. By "compression" of a source data string is meant reducing the amount of data associated with the source data string, without reducing its information content. Thus, by compressing the source data string, output data may be constructed having a lower data content than the original source data, whilst still permitting the source data in its entirety to be reconstructed.
Arithmetic coding procedures normally represent the output data string as a binary fraction within the unit interval (0,1). As is explained in "An Introduction to Arithmetic Coding" by Langdon, Jr. in the IBM Journal of Research and Development, Volume 28, No. 2, March 1984, arithmetic coding is related to the process of sub-dividing the unit interval. This sub-division is achieved by marking along the unit interval code points Cn for each symbol within the source alphabet, each code point being equal to the sum of the probabilities of occurrence of the preceding symbols. The width or size An of the sub-interval to the right of each code point represents the probability of occurrence of the source data string up to the corresponding symbol (cf. FIG. 1 of that publication).
Consider, for example, a source data string whose alphabet comprises symbols a0 to am, having probabilities of occurrence equal to p(0) to p(m), respectively. If the source data string is a0a5a3 . . . then the first symbol a0 will be encoded within the sub-interval (0,p(0)). This represents a first subinterval within the original unit interval whose width A1 is equal to p(0) corresponding simply to the probability of occurrence of symbol a0. In order to encode the second symbol a5 of the source data string, its probability of occurrence conditional on the probability of symbol a0 occurring must be determined. Furthermore, the cumulative probability S(5) associated with the second symbol a5 must be calculated. Thus, the sub-interval corresponding to the second symbol a5 is a second sub-interval within the first sub-interval corresponding to a0. Mathematically, the width A2 of the second sub-interval is equal to p(0)*p(5), i.e. the product of the probabilities of occurrence of both symbols a0 and a5. The starting point of the second sub-interval within the unit interval depends on the width A1 of the first sub-interval and the cumulative probability S(5) associated with the second symbol a5, being equal to their product A1*S(5).
Thus, as each symbol of the source data string is successively encoded within the unit interval, a succession of sub-intervals is generated each of which may be specified in terms of a specific code point and width. The code point for the current sub-interval corresponds to the start of the current sub-interval within the previous interval or sub-interval. As explained above, its offset from the start of the previous sub-interval is equal to the width of the previous sub-interval multiplied by the cumulative probability associated with the current symbol. Thus, the code point associated with the nth sub-interval will be equal to the code point associated with the (n-1)th sub-interval plus the width of the (n-1)th sub-interval multiplied by the cumulative probability of the current symbol, i.e. $C_n = C_{n-1} + A_{n-1} S(i)$. The width of the new sub-interval itself will be equal to the product of the probabilities of all symbols (including the current one) so far encoded, i.e. p(0)*p(5)*p(3) . . . for the above source data string. The data corresponding to the width An and code point Cn of the nth sub-interval thus encode the first n symbols in the source data string. Arithmetic coders therefore require two memory registers, usually called the A and C registers, for storing these data.
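The recursion described above may be illustrated by the following minimal sketch, which uses exact rational arithmetic and assumes fixed, known symbol probabilities. The function name `encode_interval`, the four-symbol alphabet and its probabilities are hypothetical and chosen only for illustration; no renormalisation or bit output is attempted, so this is not a practical coder.

```python
from fractions import Fraction

def encode_interval(symbols, probs):
    """Return (C, A): code point and width of the final sub-interval.

    symbols: sequence of symbol indices i
    probs:   probs[i] is the (assumed fixed) probability p(i)
    """
    # S(i): sum of the probabilities of the symbols preceding symbol i
    cum = []
    total = Fraction(0)
    for p in probs:
        cum.append(total)
        total += p

    C, A = Fraction(0), Fraction(1)  # start from the unit interval
    for i in symbols:
        C = C + A * cum[i]  # new code point: previous code point + previous width * S(i)
        A = A * probs[i]    # new width: previous width * p(i)
    return C, A

# Hypothetical four-symbol alphabet with probabilities 1/2, 1/4, 1/8, 1/8
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
C, A = encode_interval([0, 2, 1], probs)
print(C, A)  # 13/32 1/64
```

In this sketch the two variables `C` and `A` play the role of the C and A registers; the final width 1/64 is the product of the three symbol probabilities, as stated in the text.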
Although arithmetic coders produce optimum compression, corresponding to the entropy of the source data string, when based on the exact probabilities of occurrence of the symbols constituting the data string, in fact prior implementations of arithmetic coding procedures have tended to introduce approximations on account of the difficulty in determining the exact probabilities. Such approximations reduce the efficiency of the arithmetic coding operation and result in an output data string being generated which has more symbols than the theoretical minimum, or entropy. Moreover, further approximations have been introduced in order to eliminate the multiplication operation, which is required for determining the width of each successive sub-interval.
Thus, for example, U.S. Pat. No. 4,286,256 (Langdon, Jr. et al.) discloses a method and means for arithmetic coding using a reduced number of operations. Langdon simplifies the multiplication operation by truncating one of the inner products corresponding to the width of the sub-interval prior to encoding the current code-point. However, Langdon's method is suitable only for binary sources (i.e. alphabets containing only two symbols) wherein it is possible to encode each symbol of the source data string either as a more probable or less probable event. This procedure is unsuitable for multi-alphabet codes.
U.S. Pat. No. 4,652,856 (Mohiuddin et al.) discloses a multiplication-free multi-alphabet arithmetic code in which each sub-interval is stored in floating point format, as explained above, such that the mantissa stored within the A register is a binary fraction greater than 0.1. In accordance with the approximation proposed by Mohiuddin, a variable criterion is adopted which either truncates the mantissa of the sub-interval to exactly 0.1 (binary) or, alternatively, rounds it up to 1. Such an approximation still achieves the desired compression, but at a loss of efficiency. In other words, more bits than the minimum are required for representing the compressed data string. The inefficiency associated with Mohiuddin's procedure depends on the nature of the source data being compressed.
Our co-pending Israel Patent Application No. 86993 discloses an improved method of generating a compressed representation of a source data string each symbol of which is taken from a finite set of m+1 symbols, a0 to am. The method is based on an arithmetic coding procedure wherein the source data string is recursively generated as successive sub-intervals within a predetermined interval. The width of each sub-interval is theoretically equal to the width of the previous sub-interval multiplied by the probability of the current symbol. The improvement derives from approximating the width of the previous sub-interval so that the multiplication can be achieved by a single shift and add operation using a suitable shift register.
In the above-mentioned patent application, a detailed worked example is given of the proposed method for encoding a source data string having 7 symbols taken from a 5 symbol alphabet. It was assumed, for ease of explanation, that the probabilities of occurrence for each symbol were known and invariable. In fact, the method is equally well suited to the more general situation wherein the probability of occurrence for each symbol varies for different occurrences of the same symbol along the source data string. Nevertheless, Israel Patent Application No. 86993 is not concerned with the derivation of probabilities but only with the specific implementation of an improved arithmetic coder.
In fact, it is well known that the probability of a symbol in a source data string depends on the context in which the symbol appears. By "context" is meant the pattern of symbols in the immediate vicinity of the symbol in question. Thus, the context of a particular symbol in a one-dimensional string may include one or more symbols on either or both sides of the particular symbol. Thus, for example, if the source data string represents pixel information in an image processing system, wherein each pixel can be either 0 or 1, depending on whether it is black or white in colour, respectively, then clearly in a dark section of the image all of the pixels are 0, whilst in a bright area of the image all of the pixels are 1. Thus, a particular pixel within a bright area will be surrounded on all four sides by pixels having a value of 1. If, when determining the probability of a particular pixel, we consider the context of the two preceding pixels, then it is clear that the context of a pixel in the middle of a bright section of the image will be equal to 11X (X being the particular pixel).
Rephrasing what has been stated above, it is clear that the probability of a particular symbol depends upon its context. Thus, in the example given above, if it is known that the context of a symbol is 11X, then it is much more likely that the symbol in question is 1 than if the context were 00X. Furthermore, if the context of the symbol were not limited merely to two symbols but were increased to a greater number of symbols, then the probability of a particular symbol as a function of its context could be determined even more accurately. Thus, if the context comprises 10 symbols, all of which are equal to 1, then the probability that the symbol in question is equal to 0 is much smaller than if the context were only two symbols, both equal to 1.
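By way of illustration only, context-dependent probabilities of the kind discussed above might be estimated by counting how often each symbol follows each context. The sketch below assumes a binary source and a context consisting of the preceding `order` symbols; the function name `context_probabilities`, the sample row of pixels and the add-one smoothing are assumptions introduced for this example and are not taken from any of the cited specifications.

```python
from collections import defaultdict

def context_probabilities(bits, order=2):
    """Estimate P(next bit = 1 | preceding `order` bits) from counts."""
    counts = defaultdict(lambda: [0, 0])  # context -> [count of 0s, count of 1s]
    for k in range(order, len(bits)):
        ctx = tuple(bits[k - order:k])
        counts[ctx][bits[k]] += 1
    # Add-one smoothing (an assumption) avoids probabilities of exactly 0 or 1
    return {ctx: (c[1] + 1) / (c[0] + c[1] + 2) for ctx, c in counts.items()}

# Hypothetical row of pixels: a dark run followed by a bright run
row = [0] * 8 + [1] * 8
probs = context_probabilities(row, order=2)
for ctx in sorted(probs):
    print(ctx, probs[ctx])
```

For this sample row the estimate for context 11X is far higher than for context 00X, in line with the reasoning above.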
Determining the probability of a symbol as a function of its context is well known in the art and may be employed in any of the prior art patent specifications referred to above. It is also known that the information content of a symbol is given by:

$$ i = -\log_2 p \qquad (1) $$
where p is equal to the probability of occurrence of the symbol.
It may thus be shown that when transmitting binary data, wherein each symbol is either 0 or 1, the average number of bits appearing in the compressed data string for each symbol appearing in the source data string is equal to the expected value of i, i.e.

$$ \text{Average No. of Bits} = -\{\, p \log_2 p + (1-p) \log_2 (1-p) \,\} \qquad (2) $$
From the above equation, it follows readily that, for a binary alphabet, when the probability of occurrence of a symbol is equal to 0.5, the average number of bits required to compress the symbol is equal to 1. In other words, compression is not possible. For an alphabet having n symbols, compression is impossible when the probability of each symbol is equal to 1/n.
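Equations (1) and (2) can be checked numerically with the short sketch below. The function names are hypothetical; `entropy` is the standard generalisation of equation (2) to an alphabet of more than two symbols and is included only to illustrate the 1/n case.

```python
import math

def information_content(p):
    """Equation (1): information content, in bits, of a symbol of probability p."""
    return -math.log2(p)

def binary_entropy(p):
    """Equation (2): average number of bits per symbol for a binary source."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def entropy(probs):
    """Average number of bits per symbol for a multi-symbol alphabet."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(binary_entropy(0.5))   # 1.0: one bit per symbol, no compression possible
print(binary_entropy(0.99))  # well below 1 bit per symbol
print(entropy([0.25] * 4))   # 2.0: equals log2(4), no compression possible
```

A highly skewed probability, such as 0.99, yields an average well below one bit per symbol, which is the situation in which compression is effective.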
Consider again the example described above with regard to the compression of image data. In the simple binary case corresponding to dark and bright areas of the image, the average number of bits required to compress each pixel whose context is either . . . 000 . . . or . . . 111 . . . will be significantly less than 1, thereby resulting in efficient compression of the source data. However, the probability that a particular pixel at the border between a dark and bright section of the image takes either value is equal to 0.5 since, based on the context including the current symbol and a predetermined number of pixels in the dark area, it would be assumed that the current pixel is 0, whilst based on a context including the current symbol and the same number of pixels in the bright area, it would be equally well expected that the current symbol is 1. Thus, when transmitting image data, a high price must be paid in terms of information in order unambiguously to transmit the data corresponding to those pixels at the border between substantially dark and substantially bright areas of the image.