The invention concerns a simple text compression method that uses "negative" information about the text, described in terms of antidictionaries. In contrast to other methods, whose main tool is a dictionary, i.e. a particular set of words occurring as factors in the text, our method takes advantage of words that do not occur as factors in the text, i.e. that are forbidden. Such sets of words are called here antidictionaries.
More particularly, our invention concerns a data encoding process and a data decoding process.
In the encoding process, data are converted from a decoded state into an encoded state in which:
the data, both in the encoded state and in the decoded state, are in the form of a stream of binary information,
the decoded string of data is processed from left to right, its current prefix and the next binary information being considered,
a list of binary patterns is registered, in each of which the last binary information and the corresponding prefix are distinguished,
a comparison is made between the current prefix of the decoded string and the prefixes of the registered patterns.
The list of patterns is a set of patterns that do not occur in the data; when the current prefix of the decoded string matches the prefix of a registered pattern, the next binary information of the decoded string is omitted from the decoded stream to make the encoded stream.
In the decoding process, data are converted from an encoded state into a decoded state in which:
the data, both in the encoded state and in the decoded state, are in the form of a stream of binary information,
the encoded string of data is processed from left to right, its current prefix being considered,
a list of binary patterns is registered, in each of which the last binary information and the corresponding prefix are distinguished,
a comparison is made between the current prefix of the decoded string and the prefixes of the registered patterns.
The list of patterns is a set of patterns that do not occur in the data; when the current prefix of the decoded string matches the prefix of a registered pattern, a binary information opposite to the next binary information of the matching registered pattern is inserted into the encoded stream to make the decoded stream.
In preferred embodiments:
the list of registered patterns is finite,
patterns are binary words,
an algorithm is used to compute the list of registered patterns,
during the encoding process, the data stream is read a first time to construct the list of registered patterns and a second time to convert said data stream,
an encoder sends a message z in the form (x, y, σ(n)) to a decoder, where x is a description of the list of registered patterns, y is the encoded data stream and σ(n) is the usual binary code of the length n of the data stream.
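The encoding and decoding processes described above can be illustrated by the following minimal Python sketch. It is not the implementation of the invention: the function names (antidictionary, forced_bit, encode, decode), the brute-force construction of the pattern list and the bound max_len on pattern length are assumptions made for illustration. The sketch reads "the current prefix matches the prefix of a registered pattern" as meaning that the pattern, without its last binary information, ends the text read so far.

```python
def antidictionary(text, max_len):
    """Brute-force list of registered patterns (hypothetical helper).

    Returns the binary words of length <= max_len that do not occur in
    `text` but whose longest proper prefix and suffix both occur.
    """
    factors = {text[i:j]
               for i in range(len(text) + 1)
               for j in range(i, min(i + max_len, len(text)) + 1)}
    return {u + b
            for u in factors if len(u) < max_len
            for b in "01"
            if u + b not in factors and (u + b)[1:] in factors}


def forced_bit(prefix, patterns):
    """Return the bit imposed after `prefix`, or None if none is imposed.

    If a pattern minus its last bit ends `prefix`, the next bit of the
    text must be the opposite of that last bit, since the full pattern
    never occurs in the text.
    """
    for w in patterns:
        if prefix.endswith(w[:-1]):
            return "1" if w[-1] == "0" else "0"
    return None


def encode(text, patterns):
    """Omit every bit that the pattern list already determines."""
    return "".join(bit for i, bit in enumerate(text)
                   if forced_bit(text[:i], patterns) is None)


def decode(encoded, patterns, n):
    """Rebuild the n-bit text, inserting the determined bits."""
    remaining = iter(encoded)
    text = ""
    while len(text) < n:
        bit = forced_bit(text, patterns)
        text += bit if bit is not None else next(remaining)
    return text


# Two passes over the data: first build the patterns, then convert.
text = "0000"
patterns = antidictionary(text, 2)
y = encode(text, patterns)
# Message z = (x, y, sigma(n)); here x is simply the sorted pattern list.
z = (sorted(patterns), y, format(len(text), "b"))
assert decode(y, patterns, len(text)) == text
```

The decoder needs the length n because the pattern list alone does not tell it where the text stops; this is why σ(n) is part of the message. The sketch recomputes the match at every position for clarity, not efficiency.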