This invention relates to detecting single-bit errors occurring in an arithmetic data compression code string C(s), embodied as a number in the semi-open coding range [0,1), the code string C(s) being computed according to an arithmetically recursive function. More particularly, the invention relates to detection of single-bit errors produced by an arithmetic encoder, introduced into the code channel, or produced by an arithmetic decoder. Detection of such single-bit errors is enabled by n-scaling of the recursive function to produce a compressed binary representation C'(s) in the semi-open coding range [0,1) and testing C'(s) by a modulo-n function for a non-zero residue.
Arithmetic compression codes are established in the prior art and may be understood for the binary source case by reference to Langdon and Rissanen, "A Simple General Binary Source Code," IEEE Transactions on Information Theory, Volume IT-28, No. 5, September 1982. The following issued United States patents trace the development of arithmetic data compression coding, and are incorporated herein by reference: U.S. Pat. No. 4,122,440, of Langdon, Jr. et al.; U.S. Pat. No. 4,286,256, of Langdon, Jr. et al.; U.S. Pat. No. 4,295,125, of Langdon, Jr.; and U.S. Pat. No. 4,467,317, of Langdon, Jr. et al. These references generally relate to the theory of arithmetic compression coding and offer a number of encoder embodiments useful for operating on unencoded data streams drawn from binary alphabets.
As taught in U.S. Pat. No. 4,467,317, high-speed arithmetic data compression coding is a sequential process that recursively adds augends to the significant end of a so-far generated code string C(s) in response to an unencoded data string s. As stated in the referenced patent, the coding process depends upon coding parameters provided by a statistical model of the string. (As in the past, the inventors here are concerned only with the process of coding, the modeling process being well understood.) Arithmetic data compression coding generates an arithmetic code stream C(s) which is a number contained in the semi-open coding range [0,1). In the general case, the source string s = m(1), m(2), . . . , m(i), . . . consists of m-ary symbols. The next symbol m(i) in s to be encoded has a joint probability that depends upon the probability of the portion of s that has preceded m(i). It should be evident that, as m(i) occurs further toward the end of the string s, its joint probability declines. Operatively, the arithmetic coding process reflects this attenuation of the joint probability of m(i) by successively subdividing the available coding range. A subinterval of the range corresponding to the portion of the string s preceding m(i) that has already been encoded is defined by a lower bound, C(s), positioned in the range, and a value A(s) that sizes the subinterval. Thus, the subinterval corresponding to the portion of the coding range available to encode m(i) is expressed as [C(s), C(s)+A(s)).
During the cycle of the recursive function in which the next symbol m(i) is encoded, the subinterval magnitude A(s) is subdivided into as many parts as there are source symbols, with each part's magnitude corresponding to the conditional probability of the source symbol represented. The augend, the value added to C(s) to encode the next symbol m(i), is the sum of the conditional probability magnitudes of the symbols preceding m(i) in the source alphabet. This, of course, implies the imposition of some prearranged order on the source alphabet, and in the binary source coding construct of U.S. Pat. No. 4,467,317, this arbitrary order is: LPS (least-probable symbol), MPS (most-probable symbol).
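The subdivision step just described can be illustrated with a minimal sketch in exact rational arithmetic. It is an idealized illustration, not the approximated coder of the referenced patents: the function name `encode_step`, the three-symbol alphabet, and its probabilities are assumptions made only for this example.

```python
from fractions import Fraction

def encode_step(C, A, ordered_probs, symbol):
    """One idealized coding step (hypothetical helper, not from the patents).

    ordered_probs is a list of (symbol, conditional probability) pairs in
    the prearranged alphabet order. The augend is the sum of the parts of
    A belonging to the symbols that precede `symbol` in that order.
    """
    augend = Fraction(0)
    for sym, p in ordered_probs:
        part = A * p              # this symbol's share of the interval A
        if sym == symbol:
            return C + augend, part   # new lower bound, new interval size
        augend += part            # accumulate the parts of earlier symbols
    raise ValueError(f"symbol {symbol!r} is not in the alphabet")

# Assumed example alphabet, ordered a, b, c.
probs = [("a", Fraction(1, 2)), ("b", Fraction(1, 4)), ("c", Fraction(1, 4))]
C, A = encode_step(Fraction(0), Fraction(1), probs, "b")
C2, A2 = encode_step(C, A, probs, "c")
```

Encoding "b" moves the lower bound past the part reserved for "a", and the interval shrinks to the part reserved for "b"; encoding "c" next repeats the same subdivision inside that smaller interval.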
Subdivision of A(s) for the purpose of encoding the next symbol m(i) requires partitioning the present interval size A(s) into as many parts as there are symbols in the source alphabet. In the art, the magnitudes of the subdivisions are approximated according to a control parameter k provided by the source model. For the binary source alphabet, two sizes result from subdividing A(s), with one being assigned to the MPS and the other to the LPS. The prior art subdivision operation results in the following specific magnitudes:

$$\text{size 1: } A(s)\cdot\left(1-2^{-k}\right) \tag{1a}$$

$$\text{size 2: } A(s)\cdot 2^{-k} \tag{1b}$$
resulting in the following subinterval coding:

If LPS:
$$[\,C(s),\; C(s)+\text{size 1}\,) \tag{2a}$$

If MPS:
$$[\,C(s)+\text{size 1},\; C(s)+\text{size 2}\,) \tag{2b}$$
As can be seen from inspection of the MPS subinterval structure, an augend, size 1, is added to C(s) to form a new lower bound.
Equations (2a) and (2b) imply that the action of an arithmetic encoder is sequential in the sense that a coding step is undertaken in response to each symbol in the string s. Further, the procedure is recursive in the sense that the resulting code string value C(s) and the current subinterval magnitude A(s) are modified at each step to values determined by their respective values at the end of the previous step.
The prior art arithmetic data compression coding recursions for the binary alphabet are as follows:

For each MPS:
$$C(s\cdot\mathrm{MPS}) = C(s) + 2^{-k} \tag{3a}$$
$$A(s\cdot\mathrm{MPS}) = A(s) - 2^{-k} \tag{3b}$$

For each LPS:
$$C(s\cdot\mathrm{LPS}) = C(s) \tag{3c}$$
$$A(s\cdot\mathrm{LPS}) = 2^{-k} \tag{3d}$$
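The recursions (3a)-(3d) can be sketched as a short encoder loop. The sketch below rests on several assumptions that are not stated in the text: a fixed k for every symbol (in practice the model supplies k per step), exact rational arithmetic in place of shift-and-add hardware, and a normalization rule (the variable `unit`) that rescales 2^-k so the interval never collapses to zero. The name `encode` and the "M"/"L" symbol labels are likewise illustrative.

```python
from fractions import Fraction

def encode(symbols, k):
    """Apply recursions (3a)-(3d) to a string of 'M' (MPS) and 'L' (LPS).

    Returns (C, A): the lower bound and size of the final subinterval.
    `unit` is an assumed normalization scale, halved until A > unit / 2,
    so that 2^-k is always measured relative to the current interval.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    C = Fraction(0)
    A = Fraction(1)
    unit = Fraction(1)
    for sym in symbols:
        augend = unit * Fraction(1, 2 ** k)   # 2^-k at the current scale
        if sym == "M":
            C += augend                       # (3a)
            A -= augend                       # (3b)
        elif sym == "L":
            A = augend                        # (3d); C unchanged per (3c)
        else:
            raise ValueError(f"unknown symbol {sym!r}")
        while A <= unit / 2:                  # assumed renormalization
            unit /= 2
    return C, A

C, A = encode("MML", k=2)
```

With k = 2, each MPS adds one quarter of the current scale to C, and the LPS leaves C unchanged while narrowing the interval, so any number in [C, C+A) identifies the string "MML".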
In this prior art, decoding involves recursive examination of the magnitude of the most significant part of the code string C(s) to determine whether the current augend exceeds the remaining numerical value of the code string. Each such examination involves a trial subtraction in which the augend is tentatively subtracted from the code string. If the trial result is negative, the subtraction is nullified and the current symbol is decoded as LPS; otherwise, the subtraction is allowed to stand and the symbol is decoded as MPS.
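The trial subtraction can be sketched as follows. As an illustration only, the sketch assumes a fixed k, exact rational arithmetic, a known symbol count, and a normalization scale `unit` that the decoder tracks in step with an encoder using the same rule; none of these details is taken from the referenced patents.

```python
from fractions import Fraction

def decode(code, count, k):
    """Recover `count` symbols from a code value by trial subtraction.

    `remaining` is the part of the code value not yet accounted for by
    augends already subtracted. `unit` is an assumed normalization scale
    that mirrors the encoder's.
    """
    remaining = Fraction(code)
    A = Fraction(1)
    unit = Fraction(1)
    out = []
    for _ in range(count):
        augend = unit * Fraction(1, 2 ** k)
        trial = remaining - augend        # tentative subtraction
        if trial < 0:
            out.append("L")               # nullify: keep `remaining` as is
            A = augend
        else:
            remaining = trial             # let the subtraction stand
            out.append("M")
            A -= augend
        while A <= unit / 2:              # assumed renormalization
            unit /= 2
    return "".join(out)

decoded = decode(Fraction(1, 2), count=3, k=2)
```

Here the code value 1/2 survives two subtractions of 1/4 (two MPS symbols), after which the third trial subtraction goes negative and yields an LPS.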
The algorithms for compressing and decompressing the string s involve the execution of binary numeric methods to produce C(s) and to recover the decoded string s. As is known, the procedures employed to encode and decode according to the recursions (3a)-(3d) result in high-entropy code strings, which provide no evidence from which to deduce whether the string was subject to random errors. Since it is expected that, in the usual operational environments, an encoded data stream will be subjected to error sources which will corrupt its information content, it is essential to provide a means for protecting the integrity of the information. One means of detecting potential errors within an arithmetic encoder, an arithmetic decoder, and a channel over which an arithmetic code string is transmitted is suggested by "AN" arithmetic error coding. "AN" arithmetic error coding is a technique in which a first integer, N, is encoded by multiplying it by a constant second integer, A, to increase its redundancy and thereby obtain error protection. Inspection of the first integer's AN representation, by dividing it by the second integer, will reveal errors in the first integer when the division leaves a non-zero remainder. Note that the second integer, A, must be odd and not equal to 1 in order to provide single error detection. To avoid confusion with the earlier use of the symbol "A" in arithmetic compression coding, we will hereafter use the symbol "n" to represent the integer multiplication factor normally called "A" in "AN" arithmetic error coding. We will continue to use the established conventions in these fields for positioning the binary point when describing a code stream as a binary number: (1) for arithmetic compression coding, the binary point is left of the most-significant bit of the code stream; (2) for arithmetic error coding, the binary point is right of the least-significant bit of the code stream.
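The detection property of "AN" coding can be checked with a small sketch on ordinary integers. The helper names and the sample values (n = 3, N = 45, a 12-bit word) are assumptions chosen for illustration. A single-bit error changes the word by plus or minus a power of two, and no power of two is divisible by an odd n greater than 1, so the residue modulo n becomes non-zero.

```python
def an_encode(value, n):
    """Encode an integer by multiplying it by the constant n."""
    return n * value

def has_error(word, n):
    """A non-zero residue modulo n signals a corrupted word."""
    return word % n != 0

def flip_bit(word, j):
    """Model a single-bit error in bit position j."""
    return word ^ (1 << j)

def detects_all_single_bit_flips(word, n, width):
    """True if every single-bit error within `width` bits is detected."""
    return all(has_error(flip_bit(word, j), n) for j in range(width))

word = an_encode(45, 3)                               # 135, residue 0
odd_ok = detects_all_single_bit_flips(word, 3, 12)
even_ok = detects_all_single_bit_flips(an_encode(5, 4), 4, 12)
```

With the odd factor 3 every single-bit error is caught, whereas with the even factor 4 some errors escape: flipping bit 2 of 20 gives 16, which is still a multiple of 4.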