In digital television systems, the available frequency bandwidth of a conventional transmission channel is limited; in order to transmit a large amount of digital data therethrough, it is therefore necessary to compress or reduce the volume of the data through the use of various data compression techniques.
Although many different classes of data compression techniques are known in the art, one of the most useful is the class of dictionary-based universal compression techniques. Among these, the most widely used is a standard coding algorithm, e.g., V.42bis established by the CCITT (International Telegraph and Telephone Consultative Committee), which was developed as a practical technique for compressing data based on the so-called Ziv-Lempel coding algorithm.
In the CCITT V.42bis, an encoding and a decoding apparatus are provided, each having a fixed, finite amount of memory as a codeword storing circuit. This memory, also referred to as a "dictionary", is adapted to contain a finite number of codewords corresponding to strings of characters. Each string has a unique codeword associated therewith. Dictionaries in the encoding apparatus and the decoding apparatus may be initialized at the beginning to contain identical information.
In a conventional data compression or encoding apparatus which encodes an input stream of characters, e.g., letters of the alphabet, by employing a conventional string matching technique, the input stream is inputted thereto on a character-by-character basis, and a longest matched string of characters, i.e., a maximum length string, is matched with one of a plurality of codewords or pointers within a dictionary thereof.
In other words, the conventional string matching technique includes a step of parsing an input stream of characters into parsed strings, wherein each parsed string is a longest matched string of characters.
Specifically, a string, i.e., a sequence of characters, is formed from a first character and, if the string matches a string having a codeword in the dictionary, then a next character is read and appended to the string, and this step is repeated.
If no codeword in the dictionary matches the string, the last character appended to the string is removed to generate a longest matched string of characters, wherein the shortened string represents the longest matched string and the removed character represents an unmatched character.
Then, the conventional encoding apparatus detects a codeword corresponding to the longest matched string within the dictionary and then encodes the codeword to thereby provide an encoded codeword.
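The parsing steps described above may be illustrated by the following minimal sketch. It is not the V.42bis procedure itself: the dictionary is assumed to be a fixed mapping from strings to integer codewords that already contains every single character of the input, and the function name `parse_longest_matches` and the sample dictionary are hypothetical names chosen for illustration only.

```python
def parse_longest_matches(stream, dictionary):
    """Parse `stream` into longest matched strings with their codewords.

    Assumption: `dictionary` maps strings to codewords and contains every
    single character that can occur in `stream`.
    """
    parsed = []
    pos = 0
    while pos < len(stream):
        # Form a string from a first character.
        string = stream[pos]
        pos += 1
        if string not in dictionary:
            raise ValueError(f"character {string!r} is not in the dictionary")
        # While the string matches, read and append the next character.
        while string in dictionary and pos < len(stream):
            string += stream[pos]
            pos += 1
        if string not in dictionary:
            # Remove the last appended (unmatched) character; it becomes
            # the first character of the next string.
            string = string[:-1]
            pos -= 1
        parsed.append((string, dictionary[string]))
    return parsed


# Hypothetical dictionary contents, for illustration only.
sample_dictionary = {"a": 0, "b": 1, "c": 2, "ab": 3, "abc": 4, "ca": 5}
result = parse_longest_matches("abcab", sample_dictionary)
# result is [("abc", 4), ("ab", 3)]
```

In this sketch the fourth character "a" is appended to "abc", fails to match, and is removed again, so that it starts the next parsed string.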
It should be noted that the conventional encoding apparatus may include a codeword updating circuit. The codeword updating circuit deletes codewords corresponding to insignificant strings from the dictionary, so as to provide the dictionary with room for storing codewords corresponding to frequently occurring strings of characters, wherein the insignificant strings are selected among infrequently used strings.
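The operation of such a codeword updating circuit may be sketched as follows. The selection criterion is not specified above, so this sketch assumes, purely for illustration, that a usage count is kept per string and that only multi-character strings which are not a prefix of another stored string are eligible for deletion, so that the remaining entries stay reachable by character-by-character matching. The names `delete_insignificant` and `use_counts` are hypothetical.

```python
def delete_insignificant(dictionary, use_counts, slots_needed):
    """Delete up to `slots_needed` infrequently used strings.

    Assumptions (not taken from any standard): single characters are never
    deleted, and a string is eligible only if no other stored string
    extends it.
    """
    def is_leaf(s):
        return not any(t != s and t.startswith(s) for t in dictionary)

    candidates = [s for s in dictionary if len(s) > 1 and is_leaf(s)]
    # Least frequently used strings come first.
    candidates.sort(key=lambda s: use_counts.get(s, 0))
    removed = candidates[:slots_needed]
    for s in removed:
        del dictionary[s]
    return removed


# Hypothetical dictionary and usage counts, for illustration only.
example_dictionary = {"a": 0, "b": 1, "c": 2, "ab": 3, "abc": 4, "ca": 5}
example_counts = {"ab": 7, "abc": 1, "ca": 4}
removed = delete_insignificant(example_dictionary, example_counts, 1)
# removed is ["abc"]; its slot is now free for a more frequent string
```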
Lengths of consecutive strings generated by the conventional encoding apparatus are independent of each other. A compression rate for a string can be characterized by a corresponding character length of the string. Hence, a great deviation or difference in compression rates frequently occurs between consecutive strings in the conventional encoding apparatus.
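The relationship between string length and compression rate may be illustrated by the following sketch. It assumes, for illustration only, 8-bit characters and fixed 12-bit codewords, neither of which is stated above; the function names and the sample lengths are likewise hypothetical.

```python
def per_string_ratios(lengths, char_bits=8, codeword_bits=12):
    """Input bits divided by output bits for each parsed string.

    Assumption: every string is replaced by one fixed-size codeword.
    """
    return [n * char_bits / codeword_bits for n in lengths]


def consecutive_deviations(ratios):
    """Absolute difference in ratio between each pair of consecutive strings."""
    return [abs(b - a) for a, b in zip(ratios, ratios[1:])]


# Hypothetical lengths of four consecutive parsed strings.
ratios = per_string_ratios([3, 6, 3, 9])
deviations = consecutive_deviations(ratios)
# ratios is [2.0, 4.0, 2.0, 6.0] and deviations is [2.0, 2.0, 4.0]
```

Under these assumptions a three-character string is compressed by a factor of 2 while a nine-character string is compressed by a factor of 6, so the rate at which encoded codewords are produced per input character varies widely from one string to the next.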
Therefore, as deviations of compression rates for consecutive strings accumulate, the so-called overflow effect of encoded codewords becomes severe at an output end of the encoder in the conventional encoding apparatus, thereby deteriorating the transmission efficiency thereof.