The present invention relates to a semiconductor integrated circuit such as a logic LSI, a microcomputer or the like. More particularly, the present invention relates to a novel functional apparatus (or module) which improves the speed of symbol string search processing, a single-chip microcomputer which includes such an apparatus (or module), and a system which uses such a microcomputer.
As the amount of document data handled by information processing equipment increases, the demand for high-speed data search has also increased. In a full text search, in which text data is searched by use of not an index but an arbitrarily set keyword, it is important to perform at a high speed a so-called string search or character string search, that is, a search for a keyword existing in the text data.
A high-speed algorithm for searching text data for a plurality of keywords has hitherto been known and is realized by software on a general purpose processor. However, it is difficult to ensure a practical speed in the search of a large-scale database. Recently, high-speed techniques which use special purpose hardware have been proposed in order to obtain a sufficient search speed. One example of such techniques is disclosed in JP-A-64-42784 entitled "CHARACTER STRING COMPARE APPARATUS AND HIERARCHICAL CHARACTER STRING COMPARE SYSTEM IN THE APPARATUS".
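As an illustration of a software search of text data for a plurality of keywords, the following is a minimal sketch of the Aho-Corasick automaton, one well-known algorithm of this kind. The text above does not name the algorithm it refers to, so the choice of Aho-Corasick, the function names `build_automaton` and `search`, and the sample keywords are assumptions made for illustration only.

```python
from collections import deque


def build_automaton(keywords):
    """Build goto, failure and output tables (illustrative sketch)."""
    goto = [{}]   # goto[s] maps a character to the next state; state 0 is the root
    fail = [0]    # fail[s] is the state to fall back to on a mismatch
    out = [[]]    # out[s] lists the keywords recognized on entering state s
    for kw in keywords:
        s = 0
        for ch in kw:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(kw)
    # Breadth-first pass: compute failure links level by level.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit keywords that end at the fallback state.
            out[t] = out[t] + out[fail[t]]
            queue.append(t)
    return goto, fail, out


def search(text, keywords):
    """Return (start_index, keyword) pairs found in one pass over text."""
    goto, fail, out = build_automaton(keywords)
    s = 0
    hits = []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for kw in out[s]:
            hits.append((i - len(kw) + 1, kw))
    return hits


if __name__ == "__main__":
    print(search("ushers", ["he", "she", "his", "hers"]))
```

Each text character is read once regardless of the number of keywords, which is why an approach of this kind is considered fast in software. The tables still grow with the total length of the keywords.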
The prior art disclosed in JP-A-64-42784 concerns a special purpose LSI for use in a character string search. The LSI includes a memory area for registering keywords and a logic circuit area for performing a search by comparing the keywords with text data character by character. A plurality of keywords are registered in the memory area and the text data is searched for the keywords. The number of keywords and the keyword length which can be set simultaneously are restricted by the size of the memory area. The above prior art teaches a method and means by which many keywords can be registered while saving the memory area.
More specifically, each keyword is symbolized by hierarchically dividing it into short character strings. In the case where a plurality of analogous compound words are set as keywords, that is, in the case where the same character string pattern appears in a plurality of keywords, the memory area can be utilized efficiently since a divided character string pattern can be used in common. Accordingly, the number of keywords which can be registered increases. However, since hierarchical symbolization is used, a procedure for matching against the original keyword becomes necessary. This increases the processing time. Also, since the overall search processing is performed by the character string search LSI, there is a problem that the upper limit of the circuit scale which can be formed in an LSI configuration restricts the function and the number of keywords capable of being processed, and hence only a character string search within the restricted range is possible.
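The memory saving obtained by sharing divided character string patterns, and the extra step needed to recover the original keyword, can be sketched as follows. This is a single-level simplification under stated assumptions and not the scheme of JP-A-64-42784 itself: the fixed segment length, the names `register` and `expand`, and the sample keywords are all hypothetical.

```python
def register(keywords, seg_len=4):
    """Divide each keyword into segments and store each distinct segment once.

    seg_len is an assumed fixed segment length, used only for illustration.
    """
    table = {}     # segment string -> symbol id (shared by all keywords)
    encoded = {}   # keyword -> tuple of symbol ids
    for kw in keywords:
        ids = []
        for i in range(0, len(kw), seg_len):
            seg = kw[i:i + seg_len]
            # Reuse the id if this segment was already registered.
            ids.append(table.setdefault(seg, len(table)))
        encoded[kw] = tuple(ids)
    return table, encoded


def expand(ids, table):
    """Extra step: rebuild the original keyword from its symbol ids."""
    by_id = {sym: seg for seg, sym in table.items()}
    return "".join(by_id[i] for i in ids)


if __name__ == "__main__":
    kws = ["database", "datapath", "metadata"]
    table, encoded = register(kws)
    raw_chars = sum(len(k) for k in kws)
    stored_chars = sum(len(seg) for seg in table)
    print(encoded)
    print(raw_chars, stored_chars)
```

In this example the segment "data" is stored once although it occurs in all three keywords, so fewer characters are held in the table than in the raw keyword list. The `expand` step corresponds to the additional matching procedure that costs processing time.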
As mentioned above, the prior art has the problem that the search function and the number of keywords which can be searched for simultaneously are restricted by the scale of the hardware. In a character string search, a so-called approximate search function is frequently required, that is, a function of searching text data even for character strings which do not exactly match a desired keyword. Therefore, the circuit scale tends to increase further. However, if the circuit scale becomes so large that the circuit must be configured as an LSI on a plurality of chips, the merit of the LSI configuration is diminished since the high-speed capability cannot be fully exploited. Further, the use of special purpose hardware requires a host CPU for controlling the hardware. This also increases the number of chips.
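One simple form of approximate search can be sketched as follows, assuming that "not exactly matching" means that a limited number of characters may differ from the keyword. This interpretation, the function name `approx_find` and the parameter `max_mismatches` are assumptions for illustration. Other forms, such as tolerance of inserted or deleted characters, are not covered by this sketch.

```python
def approx_find(text, keyword, max_mismatches=1):
    """Return (start_index, matched_text) for every window of the text that
    differs from the keyword in at most max_mismatches character positions."""
    n = len(keyword)
    hits = []
    for start in range(len(text) - n + 1):
        window = text[start:start + n]
        # Count the positions at which the window and the keyword differ.
        mismatches = sum(1 for a, b in zip(window, keyword) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, window))
    return hits


if __name__ == "__main__":
    print(approx_find("color and colour", "colar", max_mismatches=1))
```

Even this simple form compares every window against every keyword position, which suggests why adding such a function to comparison hardware tends to enlarge the circuit.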