1. Field of the Invention
This invention relates to data compression. The invention is particularly, though not exclusively, applicable in the field of image data compression.
2. Description of the Prior Art
It has been proposed to provide an image data processing system in which image data is decorrelated into sub-band components, quantised and then entropy coded. The quantisation provides some degree of data compression with some loss of information content. The subsequent entropy coding effects a further degree of data compression with no loss of information content.
One known entropy encoding technique is so-called run-length coding. A typical run-length coder looks for sequences of successive zeros within a data stream and assigns a code word to substitute for each sequence of zeros within the data stream. When the data stream is subsequently read, the run-length codes can be expanded to recreate the original data stream.
A variation on this arrangement is the type of entropy encoding proposed in the standard being devised by the Joint Photographic Experts Group (JPEG) and currently under review by the International Standards Organisation. The run-length coding scheme proposed by the JPEG standard operates on an input stream of samples of twelve or more bits resolution. A sequence of successive zero value samples terminated by a non-zero value sample is treated as an "event". Each event is coded according to the number of samples in the run and the value of the terminating sample. Thus, for example, a run of input samples with values 0000004 is assigned a run-length code of 7 and a terminating value of 4. A run of samples with values 00-2 is assigned a run-length code of 3 and a terminating value of -2. A single sample of value 6 in an input stream (i.e. a sample of value 6 not following one or more zero value samples) is assigned a run-length code of 1 and a terminating value of 6.
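The event convention described above can be illustrated with a minimal Python sketch. It is not part of the JPEG proposal itself; the function name run_length_events is hypothetical, and the run-length is counted as in the examples in the text (the zeros plus the terminating sample). Any zeros left at the end of the input without a terminating non-zero sample are simply counted and returned separately.

```python
def run_length_events(samples):
    """Split a sample sequence into (run_length, terminating_value) events.

    run_length counts the zero samples plus the terminating non-zero
    sample, so a lone non-zero sample has a run-length of 1.
    Returns (events, trailing_zeros), where trailing_zeros is the number
    of zero samples at the end that no non-zero sample terminates.
    """
    events = []
    run = 0
    for sample in samples:
        run += 1
        if sample != 0:
            events.append((run, sample))
            run = 0
    return events, run


if __name__ == "__main__":
    print(run_length_events([0, 0, 0, 0, 0, 0, 4]))
    print(run_length_events([0, 0, -2]))
    print(run_length_events([6]))
```

With the three example inputs from the text, the events are (7, 4), (3, -2) and (1, 6) respectively, each with no trailing zeros.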
The possible terminating values are notionally divided into groups for the purpose of the subsequent stage of entropy coding. Each group contains a symmetrical set of positive and negative terminating values, the number of values in each group being equal to a respective power of two. The assignment of terminating values to groups is illustrated in the table of FIG. 1. The first group contains the terminating values -1 and +1. The second group contains the terminating values -3,-2,2,3. The third group contains the values -7,-6,-5,-4,4,5,6,7, and so on. As indicated in the left hand column of the table, each group is identified by a 4-bit group code with a value between 1 and 15 though only the groups with codes 1 to 11 are shown in the table.
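Because each group holds a power-of-two number of values arranged symmetrically about zero, the group code of a terminating value equals the number of bits needed to write its magnitude in binary. The following Python sketch illustrates this relationship; the function name group_code is hypothetical and the code is an illustration of the grouping described above, not a reproduction of FIG. 1.

```python
def group_code(value):
    """Return the group code for a non-zero terminating value.

    Group 1 holds -1 and +1, group 2 holds -3, -2, 2, 3,
    group 3 holds -7..-4 and 4..7, and so on, so the group code
    is the bit length of the magnitude of the value.
    """
    if value == 0:
        raise ValueError("a terminating value is non-zero by definition")
    return abs(value).bit_length()


def group_size(code):
    """Number of terminating values in the group with the given code."""
    return 2 ** code


if __name__ == "__main__":
    for v in (-1, 1, -3, 2, -7, 4, 2047):
        print(v, group_code(v), group_size(group_code(v)))
```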
Each run-length coded event is assigned a 4-bit group code depending on the terminating value in accordance with the table of FIG. 1. Thus, a run of any length terminated by a value of ±1 will be assigned the group code 1 (i.e. 0001 in binary). A run of any length terminated by a value of ±2 or ±3 will be assigned a group code of 2 (i.e. 0010) and so on. The data is then subjected to Huffman encoding which is a form of commaless encoding whereby events are mapped to a set of codes having the property that no valid code is a prefix of a longer code. The Huffman codes are assigned according to input code popularity, the most common events being mapped to the shortest Huffman codes. A Huffman code is available for each possible combination of run-length code and group code. Thus, a given Huffman code identifies the run-length of the associated event and also the group to which the terminating value is allocated, but not the particular terminating value within that group. The final entropy encoded output for each event is obtained by appending to each Huffman code a "PCM code" consisting of enough additional bits to identify uniquely the particular terminating value within the group for that event. It will be noted from the table of FIG. 1, that, due to the way in which terminating values are allocated to groups, the group code indicates the number of additional bits needed for the PCM code to enable a unique bit sequence to be assigned to each terminating value in the group. For example, the group with code 3 contains eight terminating values, three bits being required to give a unique identification to each of the eight terminating values.
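The relationship between the group code and the length of the PCM code can be sketched in Python as follows. The text above does not state which bit pattern is given to which terminating value within a group, so the mapping used here is an assumption: positive values are written directly in n bits, and negative values are written as the value plus 2^n - 1, which is one common convention. The function names pcm_code and pcm_decode are hypothetical.

```python
def pcm_code(value):
    """Return (group_code, pcm_bits) for a non-zero terminating value.

    The group code n is the bit length of the magnitude, and the PCM
    code is exactly n bits long. ASSUMPTION: positive values map to
    themselves and negative values map to value + 2**n - 1.
    """
    if value == 0:
        raise ValueError("a terminating value is non-zero by definition")
    n = abs(value).bit_length()
    offset = value if value > 0 else value + (1 << n) - 1
    return n, format(offset, "0{}b".format(n))


def pcm_decode(n, bits):
    """Recover the terminating value from a group code and its PCM bits."""
    offset = int(bits, 2)
    # Under the assumed mapping, a leading 1 bit marks a positive value.
    if offset >= (1 << (n - 1)):
        return offset
    return offset - (1 << n) + 1


if __name__ == "__main__":
    for v in (-7, -4, 4, 7, -2, 1):
        n, bits = pcm_code(v)
        print(v, n, bits, pcm_decode(n, bits))
```

Under this assumed mapping, the eight values of the group with code 3 receive the eight distinct 3-bit patterns 000 to 111, so the group code and the PCM code together identify the terminating value uniquely.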
It can be seen that this technique of notionally assigning terminating values to groups substantially reduces the number of Huffman codes required. Events of a given run-length with large terminating values are allocated the same Huffman code, unique identification of the particular event being provided by the subsequent PCM code of length given by the group code. The total number of Huffman codes required is thus greatly reduced as compared with the number of codes that would be required for individual coding of each run-length and terminating value combination.
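The scale of this reduction can be illustrated with a rough count. The figures below are illustrative assumptions only: 16 distinct run-length codes and terminating values of magnitude up to 2047 (group code 11), with special codes ignored. The function name code_counts is hypothetical.

```python
def code_counts(num_run_lengths, max_magnitude):
    """Compare the number of Huffman codes with and without grouping.

    individual: one code per (run-length, terminating value) pair.
    grouped:    one code per (run-length, group code) pair.
    """
    num_values = 2 * max_magnitude           # non-zero values, both signs
    num_groups = max_magnitude.bit_length()  # highest group code needed
    individual = num_run_lengths * num_values
    grouped = num_run_lengths * num_groups
    return individual, grouped


if __name__ == "__main__":
    individual, grouped = code_counts(16, 2047)
    print(individual, grouped)
```

Under these assumptions, individual coding would need 16 x 4094 = 65504 Huffman codes, whereas coding by group needs 16 x 11 = 176.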
As indicated above, the JPEG coding scheme operates on samples of twelve or more bits resolution, each sample being processed according to its overall value. This "word-wise" approach can complicate implementation of the scheme and, in particular, makes the scheme difficult to implement in an ASIC (Application Specific Integrated Circuit) design. Further, while the JPEG scheme provides a Huffman code for a maximum run-length of 16 zeros, Huffman codes are otherwise available only for runs terminated by a non-zero value. Thus, the scheme does not provide a means of coding runs of zeros terminated by a zero where the total run length is less than 16. This can cause problems in applications which demand a fixed block structure, such that coding of data is performed on blocks of input data samples of a fixed maximum size, as is the case in digital video tape recorders (DVTRs). If the data block ends with a run of zero value samples with a run-length of less than 16, then the JPEG scheme does not provide a means of coding this run and so the data block cannot be coded precisely. In addition, the JPEG scheme provides Huffman codes for very unlikely events, such as, for example, a run of 15 zeros terminated by a value of 2047.