1. Field of the Invention
This invention generally relates to data retrieval techniques directed to searching for a text or a symbol string, and to a method and an apparatus for retrieving voice and graphics, and more particularly to a method and an apparatus for data retrieval suitable for high-speed symbol string search processing.
2. Description of the Prior Art
Typically, this invention may be applied to the field of text searching, and the prior art in this field will be described.
With the recent trend in office automation, the storing of document information as a data base has rapidly spread, and the size of such data bases tends to increase. Accordingly, speeding up data base processing of document information is a significant task. One important type of processing is text search processing, which retrieves a specified character string, called a pattern, from character string data called a text. Fast execution of the text search is, therefore, urgently desired.
In the past, various types of text search methods and apparatus therefor have been proposed. For example, "Hardware Systems for Text Information Retrieval," written by L. A. Hollaar, ACM SIGIR 6th Conf., 1983, describes two such methods. In a cellular array method, the characters of a pattern are stored one by one in an array of registers, and the pattern is detected by inputting the characters of a text to the registers one by one, starting from the leading character. In a finite state automaton method, the characters of a text are supplied one by one, starting from the leading character, to a finite state automaton, and the pattern is detected by referring to a state transition table. Both of these prior art methods supply the text character by character starting from the leading character, so that for a text length of n characters, all the n characters have to be inputted, resulting in an obstacle which in principle prevents faster processing.
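By way of illustration only, the finite state automaton method may be sketched in software as follows. This is a minimal sketch and not the hardware described in the cited reference: the function names build_transition_table and automaton_search are hypothetical, the alphabet is assumed to consist of the characters actually occurring in the text and the pattern, and a single pattern is assumed. The sketch shows that each of the n text characters causes exactly one reference to the state transition table.

```python
def build_transition_table(pattern, alphabet):
    """Build a state transition table for a single pattern.

    State k means "the last k characters read equal the first k
    characters of the pattern". (Hypothetical helper for illustration.)
    """
    m = len(pattern)
    table = [dict() for _ in range(m + 1)]
    for state in range(m + 1):
        for ch in alphabet:
            # Next state: length of the longest pattern prefix that is
            # a suffix of the characters matched so far plus ch.
            candidate = pattern[:state] + ch
            k = min(m, len(candidate))
            while k > 0 and pattern[:k] != candidate[-k:]:
                k -= 1
            table[state][ch] = k
    return table


def automaton_search(text, pattern):
    """Return the start positions of every occurrence of pattern in text."""
    if not pattern:
        return []
    # Assumption: the alphabet is taken from the inputs themselves.
    alphabet = set(text) | set(pattern)
    table = build_transition_table(pattern, alphabet)
    m = len(pattern)
    state = 0
    matches = []
    # The text is supplied one character at a time from the leading
    # character; every character is read exactly once.
    for i, ch in enumerate(text):
        state = table[state][ch]
        if state == m:
            matches.append(i - m + 1)
    return matches
```

For example, automaton_search("abracadabra", "abra") reports positions 0 and 7 after reading all eleven characters of the text.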
On the other hand, known software-based approaches to text searching include a KMP method described in "Fast Pattern Matching in Strings", by D. E. Knuth et al, SIAM J. Comput., Vol. 6, pp 323-350, 1977, and a BM method described in "A Fast String Searching Algorithm", by R. S. Boyer et al, CACM, Vol. 20, pp 762-772, 1977. In these approaches, characters of the text and of the pattern are fetched one by one for comparison, and various kinds of processing are carried out in accordance with the comparison results. Disadvantageously, these software approaches do not lend themselves to high-speed processing, and they are also unsuitable for implementation in hardware.
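By way of illustration only, the character-by-character comparison of the KMP method may be sketched as follows. This is a minimal sketch written from the general description of the method, not a reproduction of the cited paper; the function names failure_function and kmp_search are hypothetical. The sketch shows that the processing at each step branches on the result of comparing one text character with one pattern character.

```python
def failure_function(pattern):
    """For each position i, the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the start positions of every occurrence of pattern in text."""
    if not pattern:
        return []
    fail = failure_function(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        # On a mismatch, fall back within the pattern; the text
        # position i never moves backward.
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # continue, allowing overlapping matches
    return matches
```

For example, kmp_search("abababa", "aba") reports the overlapping occurrences at positions 0, 2 and 4.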
As described above, the prior art methods and approaches share an algorithmic problem in that all the n characters equivalent to the text length have to be inputted one by one, sequentially, to detect the pattern.