1. Field of the Invention
The present invention relates to a data search circuit for data compression, and more particularly to a search circuit for data compression which searches for repeat data existing in original data composed of text data in order to reduce the length of the original data.
2. Related Art
Multimedia has advanced in recent years, and a variety of systems therefore need to handle massive amounts of various data such as image data, speech data, document data, and programs. When massive amounts of data are stored in or transferred to memory devices, compressing the data is very effective in terms of cost and processing speed, so data compression techniques are becoming more important. In particular, lossless compression has a wide range of applications, because the restored data exactly matches the original data before compression and there is no loss of information.
A compression algorithm (hereinafter referred to simply as LZ1) proposed as a form of lossless compression in 1977 by Lempel and Ziv is a useful compression method in which a high compression rate is obtained for a variety of data, as compared with well-known entropy codes such as Huffman codes. LZ1 is a method in which repeat data included in original data is searched for and replaced with another code (a pointer indicating the position of the same data that has already appeared, and the data length of the repeat data) to remove redundancy and compress the data. However, the repeat data is variable-length data whose length is unpredictable, so there was the problem that the search process is complicated and the compressing process, including the search for this repeat data, is difficult to perform at high speed with hardware.
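The replacement of repeat data by a pointer and a length can be sketched in software as follows. This is a minimal greedy illustration only, not the circuit described in this specification: the function names `lz1_compress` and `lz1_decompress`, the token layout, and the `window` and `min_len` parameters are assumptions introduced for the example.

```python
def lz1_compress(data, window=4096, min_len=2):
    """Greedy sketch: emit ('lit', ch) or ('ptr', distance, length) tokens."""
    out = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Search the data that has already appeared for the longest repeat.
        for j in range(max(0, i - window), i):
            k = 0
            while i + k < len(data) and data[j + k] == data[i + k]:
                k += 1
            if k > best_len:
                best_len, best_dist = k, i - j
        if best_len >= min_len:
            out.append(("ptr", best_dist, best_len))  # pointer + data length
            i += best_len
        else:
            out.append(("lit", data[i]))              # no useful repeat found
            i += 1
    return out


def lz1_decompress(tokens):
    """Rebuild the original data by copying from the already restored part."""
    buf = []
    for token in tokens:
        if token[0] == "lit":
            buf.append(token[1])
        else:
            _, dist, length = token
            for _ in range(length):
                buf.append(buf[-dist])
    return "".join(buf)
```

The inner loops show why the search is costly when done sequentially: every earlier position is a candidate start, and the match length is not known in advance.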
For this reason, Published Unexamined Patent Application No. 5-233212 proposes a device for compressing data in which the search for repeat data (and the compression of data) is performed by a content addressable memory (hereinafter referred to as "CAM") serving as the search hardware. In an ordinary memory, if the address of a storage position is specified, the data stored in the storage area corresponding to that address is output, but in the CAM, if data is input, the address of the storage area where that data is stored is output. More particularly, the CAM includes a comparison circuit for each memory portion for storing data having a predetermined bit length (for example, 8-bit data), and each comparison circuit compares the data stored in the memory portion with the input data and outputs a signal corresponding to the comparison result. A pair of a memory portion and a comparison circuit will hereinafter be referred to as a CAM cell row.
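The difference between an ordinary memory and a CAM can be illustrated with a small software model. This is a sketch under stated assumptions: the class name `Cam` and its methods are hypothetical, and the comparison that the hardware performs in all CAM cell rows at once is modeled here by a simple loop.

```python
class Cam:
    """Toy model of a CAM: one stored unit of data per CAM cell row."""

    def __init__(self, size):
        self.cells = [None] * size

    def write(self, address, value):
        # Ordinary memory behavior: an address selects the storage area.
        self.cells[address] = value

    def read(self, address):
        return self.cells[address]

    def search(self, value):
        # CAM behavior: every cell row compares its stored data with the
        # input data and reports one comparison result (1 = match).
        return [1 if cell == value else 0 for cell in self.cells]


cam = Cam(6)
for address, ch in enumerate("ABABBC"):
    cam.write(address, ch)
match_lines = cam.search("B")  # one comparison result per CAM cell row
```

With "ABABBC" stored at addresses 0 to 5, searching for "B" marks addresses 1, 3, and 4 as matching.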
In the device shown in the above-described Published Unexamined Patent Application No. 5-233212, with respect to original data such as text data, unit data with a predetermined bit length (for example, character data of one character) is taken out from the head of the original data in sequence and input to each CAM cell row as a searched character, and comparison is performed in each CAM cell row every time. Also, based on a signal output from the comparison circuit of each CAM cell row, it is judged whether a repeat character string exists or not. The judgment of the existence of a repeat character string based on a signal output from the comparison circuit of each CAM cell row is performed with a circuit shown in FIG. 16.
In FIG. 16, search data is held in a write buffer and input to all of the CAM cell rows at the same time. The circuit for judging whether a repeat character string exists includes latches ML0 to MLN for holding the result of a single-character comparison, signal generation circuits 150.sub.0 to 150.sub.N, latches PS0 to PSN for holding the result of the character string search, and OR circuits 158 and 160. The CAM cell rows are connected to match lines MATCH0 to MATCHN, respectively. Each CAM cell row performs one comparison for each search character that is input, and drives its corresponding match line to a high level when the comparison result is "match" and to a low level when the comparison result is "non-match." The levels of the match lines are first held in the latches ML0 to MLN and then output to the signal generation circuits 150.sub.0 to 150.sub.N.
Although FIG. 16 shows the internal constitution of the signal generation circuit 150.sub.1 only, the signal generation circuits 150.sub.0 to 150.sub.N all have the same constitution. In the signal generation circuit 150.sub.1, if a feedback signal ORFB output from the OR circuit 160 is at a low level, the signal output from the latch ML1 (a signal equivalent to the search result for a single character) is output to the latch PS1, and if the feedback signal ORFB is at a high level, a signal equivalent to the logical product of the signal output from the latch PS0 of the preceding stage and the signal from the latch ML1 (a signal equivalent to the search result for a character string having a length of two characters or more) is output to the latch PS1. The signal held in the latch PS1 is output at the next cycle to a first priority encoder 162 and to the signal generation circuit 150.sub.2 of the next stage.
The feedback signal ORFB that is output from the OR circuit 160 is the logical sum of the signals output from the AND circuits 154.sub.0 to 154.sub.N of the signal generation circuits 150.sub.0 to 150.sub.N, the signal output from each AND circuit 154 representing the logical product of the output signal of the latch ML and the output signal of the latch PS of the preceding stage. When the signals from the latches ML and PS are input to each signal generation circuit 150 and each AND circuit 154 outputs a signal, the signal from the AND circuit 154 propagates through a signal path passing through the OR circuits 158 and 160 and returning to each signal generation circuit, within a period equivalent to one cycle of a clock signal, and is input as the feedback signal ORFB to each signal generation circuit.
The feedback signal ORFB indicates whether a repeat character string with a length of two characters or more has been found anywhere in the entire search device. If the feedback signal ORFB is at a high level, the signal generation circuits 150.sub.0 to 150.sub.N output a signal representing whether there exists a character string in which the character searched this time is appended to a repeat character string found in the comparisons performed up to the last time. If the feedback signal ORFB is at a low level, the signal generation circuits 150.sub.0 to 150.sub.N output to the latch PS the signal that was input from the latch ML, in order to search for a repeat character string starting with the character searched this time.
In the first priority encoder 162, a signal equivalent to the logical sum of the signals input from the latches PS0 to PSN is output as a match signal MSIG1 to a second priority encoder 164, and the address of the latch PS that output a high level signal is output as a match address MADR1 to the second priority encoder 164. When searches for repeat character strings are performed with a plurality of CAMs in parallel, similar signals from the first priority encoders of the other CAMs are input to the second priority encoder 164. The second priority encoder 164 then outputs to a controller (not shown) a match signal MSIG and a match address MADR for the plurality of CAMs according to the input signals. Also, when searches for repeat character strings are performed with a plurality of CAMs in parallel, signals from the signal generation circuits of the other CAMs (not shown) are likewise input to the OR circuit 160.
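The two-level encoding can be sketched as follows. The description above does not state which latch takes priority when several latches PS are at a high level, nor how the second priority encoder 164 forms the match address MADR, so this sketch assumes that the lowest address wins and that the block index is combined with MADR1 as `block * block_size + MADR1`. The function names are hypothetical.

```python
def first_priority_encoder(ps):
    """Return (MSIG1, MADR1) for one CAM from its latch outputs PS0..PSN.

    Assumption: the lowest address that is at a high level has priority.
    """
    for addr, bit in enumerate(ps):
        if bit:
            return 1, addr
    return 0, None  # MSIG1 = 0: no latch PS holds a high level


def second_priority_encoder(block_results, block_size):
    """Combine the (MSIG1, MADR1) pairs of several CAMs into (MSIG, MADR).

    Assumption: the lowest-numbered CAM has priority, and the global
    address is block * block_size + MADR1.
    """
    for block, (msig1, madr1) in enumerate(block_results):
        if msig1:
            return 1, block * block_size + madr1
    return 0, None
```

For example, `first_priority_encoder([0, 0, 0, 0, 1, 0])` yields a match signal of 1 and a match address of 4 under these assumptions.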
In the circuit of FIG. 16, let the current cycle be expressed by m, the comparison signal output from the CAM cell row of address n by ML(n,m), a logical product by "×," a logical sum by "+," and the maximum value of the address n by N; let the address -1 be treated as the address N; let the number of CAM blocks be 1; and let a match of the comparison result be expressed by "1" and a non-match by "0." The feedback signal ORFB, the signal PS held in the latch PS, and the match signal MSIG are then given by the following equation (1):

$$ORFB = \{ML(0,m) \times PS(N,m)\} + \{ML(1,m) \times PS(0,m)\} + \cdots + \{ML(N,m) \times PS(N-1,m)\}$$

when ORFB = 1: $$PS(n,m+1) = ML(n,m) \times PS(n-1,m)$$

when ORFB = 0: $$PS(n,m+1) = ML(n,m)$$

$$MSIG = PS(0,m) + PS(1,m) + \cdots + PS(N,m)$$

Equation (1)
For reference, if a match of the comparison result is expressed by "0" and a non-match of the comparison result is expressed by "1," the feedback signal ORFB, the signal PS, and the match signal MSIG will be given by the following equation (2):

$$ORFB = \{ML(0,m) + PS(N,m)\} \times \{ML(1,m) + PS(0,m)\} \times \cdots \times \{ML(N,m) + PS(N-1,m)\}$$

when ORFB = 1: $$PS(n,m+1) = ML(n,m) + PS(n-1,m)$$

when ORFB = 0: $$PS(n,m+1) = ML(n,m)$$

$$MSIG = PS(0,m) \times PS(1,m) \times \cdots \times PS(N,m)$$

Equation (2)
As an example, assume that in the circuit of FIG. 16 a character data string "ABABBC" has already been stored in the CAM cell rows of addresses 0 to 5 in ascending order. For a case where character data are input as search data in the order "ABBBC . . . ," the level held in each latch and the transition of the level of each signal are shown in FIG. 17, and the timing diagram at that time is shown in FIG. 18.
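The behavior described by Equation (1) for this example can be traced with a short software model. This is a sketch only: the function name `search_trace` is hypothetical, the latches PS are assumed to hold "0" before the first character is input, and the single-cycle latching of the match lines in the latches ML is not modeled separately.

```python
def search_trace(stored, query):
    """Apply Equation (1) once per search character.

    Returns one tuple (character, ORFB, MSIG, next PS levels) per cycle m.
    """
    size = len(stored)   # N + 1 CAM cell rows
    ps = [0] * size      # latches PS0..PSN, assumed cleared at the start
    trace = []
    for ch in query:
        # ML(n,m): comparison result of the CAM cell row at address n.
        ml = [1 if c == ch else 0 for c in stored]
        # AND circuit 154 of each stage: ML(n,m) x PS(n-1,m).
        # ps[-1] is the last element, so address -1 is treated as address N.
        and_terms = [ml[n] & ps[n - 1] for n in range(size)]
        orfb = int(any(and_terms))   # logical sum over all stages
        msig = int(any(ps))          # MSIG = PS(0,m) + ... + PS(N,m)
        ps = and_terms if orfb else ml
        trace.append((ch, orfb, msig, ps))
    return trace


for ch, orfb, msig, ps in search_trace("ABABBC", "ABBBC"):
    print(ch, "ORFB =", orfb, "MSIG =", msig, "PS ->", ps)
```

Under these assumptions, ORFB takes the values 0, 1, 1, 0, 1 for the inputs "A," "B," "B," "B," "C." After the third character only the latch PS4 is at a high level, which corresponds to the repeat character string "ABB" ending at address 4; the fourth character cannot extend that string, so ORFB falls to a low level and a new search starts from that character.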
However, in the above-described device the feedback signal ORFB is generated from the signals output from a large number of signal generation circuits 150, one provided for each CAM cell row, and the generated feedback signal ORFB is input to each of the signal generation circuits 150. It is therefore necessary to provide a large number of signal lines for guiding the signals output from the signal generation circuits 150 to the OR circuits 158 and 160, and also a large number of signal lines for guiding the feedback signal ORFB output from the OR circuit 160 to the large number of signal generation circuits. As a result, the portion occupied by these signal lines takes up a large area (about 10%) of the entire circuit. Therefore, there was the problem that the scale of the circuit is large and the size of the device is difficult to reduce.
Note that the portion which generates the match signal MSIG, equivalent to the logical sum of the signals input from the latches PS0 to PSN to the priority encoder, also takes up a relatively large area of the entire circuit.
Also, in the above-described device, since the physical length of the signal path leading from the signal generation circuits through the OR circuits 158 and 160 to the signal generation circuits is long, a signal takes a relatively long time to pass through that signal path ("delay" of FIG. 18), and that signal path is the critical path of the search device of FIG. 16. It is necessary, in the signal generation circuits of the search device of FIG. 16, that a signal propagate through the signal path returning from the signal generation circuits through the OR circuits 158 and 160 to the signal generation circuits, within a period equivalent to one cycle of a clock signal, and be input as a feedback signal ORFB to each signal generation circuit. Therefore, the operational speed (cycle of clock signal) of the entire circuit is determined according to the time required for generating the feedback signal ORFB, so there was the problem that the processing speed is low.