1. Field of the Invention
The present invention relates to an information retrieval apparatus that looks up target documents among large quantities of document data recorded on an information recording medium, without key words being assigned, by successively comparing target character strings against the character strings in coded documents.
2. Description of the Prior Art
In recent years, large quantities of coded documents have come into use in offices and households owing to the spread of word processors and personal computers. To organize these documents and use them effectively, large-scale databases and high-speed information retrieval apparatuses are being researched and developed.
Many conventional information retrieval apparatuses assign key words, used as retrieval indexes, when the coded documents are recorded, and find a target document at retrieval time by searching on those key words.
However, an information retrieval apparatus that uses key words requires tremendous labor to assign them, and the key words also increase the amount of data associated with each document. In addition, at retrieval time it is very difficult to select proper key words that cause no retrieval omissions.
On the other hand, information retrieval apparatuses have also been developed that retrieve coded documents without key words, using what is called a full text searching system. This system finds all documents that include the retrieval character strings by successive comparison between the retrieval character strings specified by a user and the document data. For example, successive comparison between the retrieval character string "disk" and the character strings within the coded documents retrieves a document including "This optical disk offers high capacity."
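The successive comparison described above can be illustrated with a minimal sketch. It is a plain character-by-character comparison written only for illustration; the function names and the sample documents other than the "disk" sentence are assumptions and do not come from the apparatus described here.

```python
def contains_string(document, target):
    """Return True if target occurs in document, found by successive comparison."""
    n, m = len(document), len(target)
    if m == 0:
        return True
    # Slide the target over the document one character position at a time.
    for start in range(n - m + 1):
        matched = 0
        while matched < m and document[start + matched] == target[matched]:
            matched += 1
        if matched == m:
            return True
    return False


def full_text_search(documents, target):
    """Return the names of all documents whose text includes target."""
    return [name for name, text in documents.items()
            if contains_string(text, target)]


# Hypothetical document set; only the first sentence is taken from the text.
documents = {
    "optical.txt": "This optical disk offers high capacity.",
    "memo.txt": "Meeting moved to Friday.",
}
hits = full_text_search(documents, "disk")
print(hits)  # ['optical.txt']
```

No key words are attached to either document; the match is found purely from the document text.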
FIG. 15 is a block diagram of a conventional information retrieval apparatus that uses a full text searching system. In FIG. 15, 1 is a host computer that controls an information retrieval apparatus 30 in accordance with the retrieving conditions specified by a user; 30 is the information retrieval apparatus, which records data on and reproduces data from the information recording medium 3 and also retrieves the retrieval character strings from the reproducing data; 31 is a microcontroller that controls the whole information retrieval apparatus 30 by means of firmware stored therein; 5 is a host interface circuit that controls the transfer of device commands, recording data, reproducing data, command status and so on to and from the host computer 1 through a host interface 100 such as SCSI (Small Computer System Interface); 32 is a string retrieval circuit that detects the retrieval character string set by the microcontroller 31, the construction of which is disclosed in, for example, Japanese Patent Laid-Open No. 3-268063; 10 is a recording and reproducing circuit that adds error correcting codes to the recording data and modulates the result to produce a recording signal 101, and that also demodulates the reproducing signal 102 read from the information recording medium 3 and carries out error correction processing; 11 is a drive unit that records signals on and reproduces signals from the information recording medium 3 loaded therein; and 33 is a memory circuit that is connected to the microcontroller 31, the host interface circuit 5, the string retrieval circuit 32 and the recording and reproducing circuit 10 through the data bus 13 and temporarily holds the data used by the information retrieval apparatus 30.
The memory circuit 33 includes a transfer data memory 14 for holding the recording data and the reproducing data transferred to and from the host computer 1, and a retrieval data memory 15 for holding the data subjected to retrieval.
The operation of the conventional information retrieval apparatus constructed in this manner will be described hereinafter. The description assumes that many document files managed by an existing file system such as, for example, MS-DOS or UNIX are already recorded on the information recording medium 3.
When a user specifies the document to be retrieved, the host computer 1 reads from the information recording medium 3 a directory file that manages the document files, and finds the recording position and the file size of the document file within the information recording medium 3 from the file management information. Then, the host computer 1 transmits to the host interface circuit 5 a device command, called a SEARCH command, that sets the recording position and size of the document file recorded in the file management information together with the retrieval character strings specified by the user.
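The host-side preparation of the SEARCH command can be sketched as follows. The text states only that the command carries the recording position, the size and the retrieval character strings; the field names, the block-number representation of the recording position and the directory layout below are assumptions made for illustration, not the actual command format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    """Hypothetical file management information for one document file."""
    name: str
    start_block: int   # assumed representation of the recording position
    size_bytes: int


@dataclass(frozen=True)
class SearchCommand:
    """Hypothetical SEARCH command contents sent to the host interface circuit."""
    start_block: int
    size_bytes: int
    targets: tuple     # retrieval character strings specified by the user


def build_search_command(directory, file_name, targets):
    """Look up file_name in the directory and build the SEARCH command for it."""
    for entry in directory:
        if entry.name == file_name:
            return SearchCommand(entry.start_block, entry.size_bytes,
                                 tuple(targets))
    raise FileNotFoundError(file_name)


# Hypothetical directory file contents.
directory = [
    DirectoryEntry("report.txt", start_block=120, size_bytes=4096),
    DirectoryEntry("notes.txt", start_block=128, size_bytes=1500),
]
command = build_search_command(directory, "notes.txt", ["disk"])
print(command)
```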
When the microcontroller 31 receives a device command 103 from the host interface circuit 5, it transmits a recording and reproducing control signal 104, which includes the recording position and size of the document file, so that the recording and reproducing circuit 10 starts reproducing the data to be retrieved. The recording and reproducing circuit 10 reproduces a signal from the specified area of the information recording medium 3, executes demodulation processing and error correction processing, and thereafter stores the reproducing data in the retrieval data memory 15 through the data bus 13.
When the microcontroller 31 detects completion of the data reproducing operation from the state of the recording and reproducing busy signal 105, it transmits a retrieving control signal 106, which includes the retrieval character strings, so that the string retrieval circuit 32 starts the retrieving operation. The string retrieval circuit 32 then reads the reproducing data from the retrieval data memory 15 and compares it against the specified retrieval character strings for a match. When the string retrieval circuit 32 has completed the retrieving operation over the whole of the reproducing data, it transmits to the microcontroller 31 a retrieving status 107 showing whether or not a match with the retrieval character string was detected.
When the microcontroller 31 detects completion of the retrieving operation, it sets a command status 108 corresponding to the retrieving status 107 in the host interface circuit 5, and the command execution is completed when that status is transferred to the host computer 1.
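The sequence handled inside the apparatus (reproduce the specified area into the retrieval data memory, then run the string retrieval, then report a status) can be modeled in software roughly as follows. This is only a sketch: the block size, the byte-array model of the medium and the status values "MATCH" and "NO_MATCH" are assumptions, and modulation, demodulation and error correction are omitted.

```python
BLOCK_SIZE = 512  # assumed block size; not given in the text


def reproduce(medium, start_block, size_bytes):
    """Model of the recording and reproducing circuit: read the specified area."""
    offset = start_block * BLOCK_SIZE
    return bytes(medium[offset:offset + size_bytes])


def string_retrieval(retrieval_data_memory, targets):
    """Model of the string retrieval circuit: True if any target string matches."""
    return any(target in retrieval_data_memory for target in targets)


def execute_search(medium, start_block, size_bytes, targets):
    """Model of the microcontroller sequencing one SEARCH command."""
    # Step 1: reproduce the document file into the retrieval data memory.
    retrieval_data_memory = reproduce(medium, start_block, size_bytes)
    # Step 2: start the retrieving operation once reproduction has completed.
    matched = string_retrieval(retrieval_data_memory, targets)
    # Step 3: return a command status corresponding to the retrieving status.
    return "MATCH" if matched else "NO_MATCH"


# Hypothetical medium of four blocks with one document stored at block 2.
text = b"This optical disk offers high capacity."
medium = bytearray(4 * BLOCK_SIZE)
medium[2 * BLOCK_SIZE:2 * BLOCK_SIZE + len(text)] = text

status_hit = execute_search(medium, 2, len(text), [b"disk"])
status_miss = execute_search(medium, 2, len(text), [b"tape"])
print(status_hit, status_miss)  # MATCH NO_MATCH
```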
Through the retrieving operation on a document file described above, the information retrieval apparatus 30 can determine whether or not the retrieval character string is included in the document file specified by the user. The user can therefore look up, without using key words, a document file that includes the retrieval character string from among the many document files recorded on the information recording medium 3.
However, the retrieving time with this method is simply proportional to the number and file size of the document files to be retrieved. Thus, there is a problem in that the retrieving time is not shortened even when a document file is retrieved under the same retrieving conditions as were used in the past.
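The proportionality can be made concrete with a small cost model. The sketch below counts the bytes that must be reproduced and compared, taking that count as a stand-in for retrieving time; the file names and sizes are hypothetical. Because nothing is remembered between searches, repeating the same search scans exactly the same amount of data.

```python
def search_all(files, target):
    """Search every file for target; return (hits, total bytes scanned)."""
    hits = []
    bytes_scanned = 0
    for name, data in files.items():
        # The whole file is reproduced and compared, so its full size is counted.
        bytes_scanned += len(data)
        if target in data:
            hits.append(name)
    return hits, bytes_scanned


# Hypothetical files of 1000 and 2000 bytes.
files = {
    "a.txt": b"disk" + b"x" * 996,
    "b.txt": b"y" * 2000,
}
first = search_all(files, b"disk")
second = search_all(files, b"disk")  # same retrieving conditions as before
print(first)   # (['a.txt'], 3000)
print(second)  # (['a.txt'], 3000)
```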