Deflate compression (defined by the DEFLATE Compressed Data Format Specification, RFC 1951) is the compression method underlying ZLIB (defined by the ZLIB Compressed Data Format Specification, RFC 1950) and GZIP (defined by the GZIP File Format Specification, RFC 1952), data compression formats that are currently in wide use in computers. In deflate compression, data is compressed using LZ77 encoding. LZ77 encoding searches the data for a repeating portion of a character string and compresses the data by replacing the repeated character string with the position and length of its earlier occurrence. For example, when the character string “IBM is IBM” is LZ77-encoded, the second “IBM” is a repeating portion and is therefore compressed. Specifically, it is replaced with a code such as “7,3”, signifying that “a run of 3 characters repeats, starting 7 characters earlier”. The longer the repeating portion is, the higher the compression rate is.
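The “IBM is IBM” example can be reproduced with a minimal sketch of LZ77 encoding. The function names, the token representation (a literal character or a (distance, length) pair), and the brute-force search are assumptions made for illustration; they are not the method of any particular deflate implementation. The 32 Kbyte window and the match lengths of 3 to 258 follow RFC 1951.

```python
def lz77_encode(data, window=32 * 1024, min_len=3, max_len=258):
    """Naive LZ77: emit literal characters or (distance, length) pairs."""
    out, i = [], 0
    while i < len(data):
        best_dist, best_len = 0, 0
        # Compare against every earlier position inside the window.
        for j in range(max(0, i - window), i):
            k = 0
            while (k < max_len and i + k < len(data)
                   and data[j + k] == data[i + k]):
                k += 1
            # ">=" lets a closer position win when lengths are equal.
            if k > 0 and k >= best_len:
                best_dist, best_len = i - j, k
        if best_len >= min_len:
            out.append((best_dist, best_len))
            i += best_len
        else:
            out.append(data[i])
            i += 1
    return out


def lz77_decode(tokens):
    """Rebuild the text by copying `length` characters from `dist` back."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            dist, length = token
            for _ in range(length):
                out.append(out[-dist])
        else:
            out.append(token)
    return "".join(out)


print(lz77_encode("IBM is IBM"))
# ['I', 'B', 'M', ' ', 'i', 's', ' ', (7, 3)]
```

The inner loop compares the current position against every earlier position in the window, which is the comparison cost that motivates the hashing and hardware approaches discussed in this section.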
Under the deflate specification, the search for a repeating portion of a character string covers up to 32 Kbytes of data preceding the character string, so a significant number of character string comparison operations are needed to find a repeating portion.
Consequently, the search takes significant time when performed in software. Software generally uses hashing in an attempt to reduce the search time. However, with hashing, when many character strings have the same hash value, some of them may be discarded. Thus, given the constraints on processing time and buffer size, it is difficult to search all character strings exhaustively.
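The effect of discarding candidates can be illustrated with a small sketch. It assumes a hypothetical table keyed by the first three characters at each position, with a bounded number of remembered positions per key (`max_chain`); the names and the bounding policy are illustrative and are not taken from any specific compressor.

```python
from collections import defaultdict, deque


def build_table(data, upto, max_chain, min_len=3):
    """Index positions < upto by their first min_len characters.

    Each bucket keeps at most max_chain positions; older ones are dropped.
    """
    table = defaultdict(lambda: deque(maxlen=max_chain))
    for p in range(upto):
        if p + min_len <= len(data):
            table[data[p:p + min_len]].append(p)
    return table


def find_match(data, pos, table, min_len=3):
    """Return (distance, length) of the best remembered match at pos."""
    best = (0, 0)
    for cand in table.get(data[pos:pos + min_len], ()):
        k = 0
        while pos + k < len(data) and data[cand + k] == data[pos + k]:
            k += 1
        if k >= best[1]:  # on ties, the more recent candidate wins
            best = (pos - cand, k)
    return best


data = "abcXYZ.abcQ.abcR.abcXYZ"
pos = 17  # start of the final "abcXYZ"

full = find_match(data, pos, build_table(data, pos, max_chain=8))
bounded = find_match(data, pos, build_table(data, pos, max_chain=2))
print(full)     # (17, 6): the whole "abcXYZ" at position 0 is found
print(bounded)  # (5, 3): position 0 was discarded, only "abc" matches
```

With the bound set to 2, the oldest position sharing the key is dropped, so the 6-character repeat is missed and only a 3-character match is reported.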
Accordingly, the known art provides methods that use hardware to search all character strings exhaustively. These methods also allow a character string to be found rapidly.
In Japanese Unexamined Patent Application Publication No. 7-114577, when a character string (BABCABB . . . ) that is stored in content addressable memory cell rows in sequence is searched for a search character string (ABCA), comparison with the first character (A) is performed in all of the cell rows, and comparison with the next character (B) is performed only in respective cell rows at addresses (2) and (5) adjacent to the last matching cell rows. Similarly, comparison with the next character (C) is performed only in respective cell rows at addresses (3) and (6), and comparison with the last character (A) is performed only in a cell row at address (4). In this manner, a search operation is completed in a short period of time.
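The narrowing search described above can be simulated in software as follows. This is only a sequential sketch: in a content addressable memory the comparisons within one step happen in parallel across the cell rows. The function name is hypothetical, and zero-based addresses are assumed because they agree with the addresses (2), (5), (3), (6), and (4) quoted for the stored string BABCABB.

```python
def cam_search(stored, pattern):
    """Simulate a search that compares only rows adjacent to prior matches.

    Returns (compared, ends): the addresses compared at each step and the
    addresses at which a full match of `pattern` ends.
    Assumes `pattern` is non-empty.
    """
    # Step 1: the first character is compared in every cell row.
    candidates = list(range(len(stored)))
    active = [a for a in candidates if stored[a] == pattern[0]]
    compared = [candidates]
    for ch in pattern[1:]:
        # Later steps: compare only the rows right after the last matches.
        candidates = [a + 1 for a in active if a + 1 < len(stored)]
        active = [a for a in candidates if stored[a] == ch]
        compared.append(candidates)
    return compared, active


compared, ends = cam_search("BABCABB", "ABCA")
print(compared)  # [[0, 1, 2, 3, 4, 5, 6], [2, 5], [3, 6], [4]]
print(ends)      # [4]
```

The match ends at address 4, so it starts at address 4 - (4 - 1) = 1, the first “A” in the stored string.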
In Japanese Unexamined Patent Application Publication No. 8-147986, a first switching means is provided on a match line, between a power supply and the side of a switching element opposite its ground terminal. Power consumption is reduced by shortening the period in which a through current flows; this is done by turning on the first switching means during a period in which the content addressable memory is turning the switching element on or off in response to a comparison result, or during part of a preparation period before the comparison.
In Japanese Unexamined Patent Application Publication No. 8-242176, the results of comparing search characters in a write buffer with character data stored in cell rows in a content addressable memory (CAM) are stored in first and second latches in sequence. When an input signal is low, each signal generation circuit outputs the AND result between the output of the corresponding first latch and the output of a corresponding preceding third latch to a priority encoder and an OR circuit via the corresponding third latch. When the input signal is high, the signal generation circuit outputs the AND result between the output of the first latch and the output of the corresponding preceding second latch to the priority encoder and the OR circuit via the third latch. First and second priority encoders output the respective OR results of input signals. A signal output from the OR circuit is input to each of the signal generation circuits via a corresponding fourth latch and another OR circuit. In this manner, the length of a path through which a signal needs to pass in a clock cycle is reduced by half, thereby improving the processing speed.
The probability is high that a repeating portion of a certain character string will occur in a position relatively close to the character string. Thus, in a method such as deflate compression, the compression rate is improved by assigning a shorter bit length to a repeating portion residing in a position closer to the character string.
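One concrete source of this effect in deflate is the distance coding of RFC 1951, in which each distance symbol is followed by a number of extra bits that grows with the distance. The sketch below computes that count; the function name is an assumption, and the Huffman code length of the distance symbol itself (which depends on the block type and the data) is deliberately left out.

```python
def distance_extra_bits(distance):
    """Number of extra bits RFC 1951 appends to the distance symbol."""
    if not 1 <= distance <= 32768:
        raise ValueError("deflate distances range from 1 to 32768")
    if distance <= 4:
        return 0  # distances 1 to 4 each have their own symbol
    # From distance 5 on, each pair of symbols covers a range twice as wide.
    return (distance - 1).bit_length() - 2


for d in (1, 4, 5, 8, 9, 16, 17, 32768):
    print(d, distance_extra_bits(d))
# 1 0
# 4 0
# 5 1
# 8 1
# 9 2
# 16 2
# 17 3
# 32768 13
```

A match found 4 characters back therefore needs no extra bits, while a match near the far end of the 32 Kbyte window needs 13.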
Accordingly, in a memory to which data is cyclically written, when a repeating portion of a certain character string is stored in a plurality of positions, one of those positions must be selected. In such a case, the compression rate is improved by selecting the repeating portion closest to the character string, taking the order of writing into account, rather than by using a priority encoder as in the aforementioned patent references, which selects a repeating portion according to a fixed priority over memory positions. More generally, when data is stored in a plurality of positions, it is advantageous to select the position to which the data was most recently written.
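The difference between the two selection policies can be sketched as follows. The sketch assumes a cyclic memory of `size` addresses in which `write_ptr` is the next address to be written, so address `write_ptr - 1` (modulo `size`) holds the newest data. Both function names are hypothetical, and the fixed-priority policy is modeled as "lowest address wins", which is one common priority encoder convention rather than something the cited references are known to specify.

```python
def fixed_priority_match(match_addrs):
    """Model of a priority encoder: the lowest matching address wins."""
    return min(match_addrs)


def most_recent_match(match_addrs, write_ptr, size):
    """Pick the matching address that was written most recently."""
    def age(addr):
        # Number of writes since addr was written (0 for the newest).
        return (write_ptr - 1 - addr) % size
    return min(match_addrs, key=age)


size, write_ptr = 8, 2   # addresses 1, 0, 7, 6, ... from newest to oldest
matches = [3, 6]         # the repeating portion is stored at both addresses

print(fixed_priority_match(matches))                # 3
print(most_recent_match(matches, write_ptr, size))  # 6
```

Here address 6 lies 4 positions behind the write pointer and address 3 lies 7 positions behind it, so selecting by write order yields the shorter distance even though address 3 has the higher fixed priority.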