In recent years, arithmetic coding has become another extremely practical lossless compression algorithm. A core idea of arithmetic coding is that all symbols that may appear in source data are mapped to an integer set, and an appearance probability is assigned to each symbol (the appearance probabilities of all symbols are required to sum to 1). Each symbol occupies a half-open sub-interval of the interval [0, 1) according to its appearance probability: the length of the sub-interval equals the probability value, and the sub-intervals do not overlap. A string that needs to be encoded is first mapped to an integer sequence according to a mapping table. The integer sequence is then converted, symbol by symbol and according to the appearance probabilities of the symbols, into a progressively narrower real-number sub-interval of [0, 1). The interval used to encode each symbol is the interval obtained by encoding the previous symbol, and the proportions that the symbols occupy within the interval remain the same at every step. A real number in the final interval is used as the code value and is saved in a computer in binary form. During decoding, the binary code value is restored to the corresponding integer sequence by the inverse conversion, and the integer sequence is then mapped back to the original string.
For example, for the integer set {0, 1, 2, 3} with appearance probability distribution {0.2, 0.5, 0.2, 0.1}, the symbols 0, 1, 2, and 3 occupy the sub-intervals [0, 0.2), [0.2, 0.7), [0.7, 0.9), and [0.9, 1) respectively. In this case, for data whose input sequence is <210013>, the encoding intervals are sequentially [0.7, 0.9), [0.74, 0.84), [0.74, 0.76), [0.74, 0.744), [0.7408, 0.7428), and [0.7426, 0.7428). The code value interval corresponding to the data is therefore [0.7426, 0.7428) (the encoding interval obtained after the last symbol), and the code value of the data is any value in [0.7426, 0.7428).
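The interval narrowing described above can be illustrated with a minimal Python sketch. It is not an implementation taken from any existing codec: the function names `build_intervals`, `encode`, and `decode` are hypothetical, exact rational arithmetic (`fractions.Fraction`) is used in place of the finite-precision binary arithmetic a practical coder would need, and the decoder is assumed to know the number of encoded symbols in advance.

```python
from fractions import Fraction


def build_intervals(probabilities):
    """Assign each symbol index a half-open sub-interval [low, high) of [0, 1)."""
    intervals = []
    low = Fraction(0)
    for p in probabilities:
        intervals.append((low, low + p))
        low += p
    if low != 1:
        raise ValueError("appearance probabilities must sum to 1")
    return intervals


def encode(symbols, probabilities):
    """Return the encoding interval obtained after each symbol."""
    intervals = build_intervals(probabilities)
    low, high = Fraction(0), Fraction(1)
    history = []
    for s in symbols:
        width = high - low
        sym_low, sym_high = intervals[s]
        # Narrow the current interval to the symbol's share of it.
        low, high = low + width * sym_low, low + width * sym_high
        history.append((low, high))
    return history


def decode(code_value, length, probabilities):
    """Recover `length` symbols from a code value (length is assumed known)."""
    intervals = build_intervals(probabilities)
    low, high = Fraction(0), Fraction(1)
    symbols = []
    for _ in range(length):
        width = high - low
        # Position of the code value relative to the current interval.
        scaled = (code_value - low) / width
        for s, (sym_low, sym_high) in enumerate(intervals):
            if sym_low <= scaled < sym_high:
                symbols.append(s)
                low, high = low + width * sym_low, low + width * sym_high
                break
    return symbols


# Probability distribution and input sequence from the example above.
probs = [Fraction("0.2"), Fraction("0.5"), Fraction("0.2"), Fraction("0.1")]
steps = encode([2, 1, 0, 0, 1, 3], probs)
for low, high in steps:
    print(f"[{float(low)}, {float(high)})")

# Any value inside the final interval decodes to the same sequence.
print(decode(Fraction("0.7427"), 6, probs))
```

Under these assumptions, the last interval in `steps` is [0.7426, 0.7428), and decoding the value 0.7427 with a known length of 6 recovers the sequence 2, 1, 0, 0, 1, 3.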
In existing arithmetic coding, to-be-encoded data is compressed directly, without considering whether the compression yields a gain, and the code value obtained after the arithmetic coding is then saved. However, for some data the code value requires a large quantity of bits, so that in the prior art the storage space occupied by the data is increased rather than reduced.