(1) Field of the Invention
The present invention relates to an index managing (controlling) unit, an index updating method, an index managing method, a computer-readable recording medium retaining an index updating program, and a computer-readable recording medium holding an index managing program, all of which are suited to managing and updating an index file used in a retrieval system that searches a large volume of data using an item as a key and extracts related information.
An index file stores a large amount of record information in association with key information. In particular, an index file of the inverted file type allows record information to be searched at high speed on the basis of the key information, that is, the items that organize the index, and can be used for full-text retrieval.
(2) Description of the Related Art
A common retrieval system can conduct information retrieval on, for example, a group of document files. In more detail, given a group of document files identified by document numbers, the retrieval system uses a word as a key and outputs, as the retrieval result, the group of document numbers associated with that key.
In such retrieval processing, to shorten the time from the input of a retrieval key to the output of a retrieval result, information about the group of document files is managed by holding, on a storage area, an index file of the inverted file type, which is a collection of retrieval results corresponding to keys determined in advance.
In brief, when a key is inputted to the retrieval system, a retrieval result can be outputted merely by opening the record information corresponding to that key in the above-mentioned index file.
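The lookup described above can be illustrated with a minimal sketch. It assumes whitespace-separated words as keys and an in-memory dictionary in place of a stored index file; the names build_inverted_index and retrieve, and the sample documents, are hypothetical and are not taken from this specification.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word (key) to the sorted document numbers that contain it."""
    index = defaultdict(set)
    for doc_number, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_number)
    return {word: sorted(numbers) for word, numbers in index.items()}


def retrieve(index, key):
    """Return the precomputed retrieval result for a key (empty if unknown)."""
    return index.get(key.lower(), [])


# Hypothetical sample data: document number -> document text.
documents = {
    1: "index file update",
    2: "inverted index",
    3: "full text retrieval",
}
index = build_inverted_index(documents)
result = retrieve(index, "index")
```

Here retrieve does no scanning of the documents themselves; the work is done once when the index is built, which is the reason the retrieval result can be produced by a single lookup.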
Meanwhile, for area allocation on the storage area of an index file in a prior retrieval system, an area block of a given size is initially allocated to hold the record information corresponding to each piece of key information, and the record information is stored therein, whereas record information exceeding the given size is recorded across a plurality of area blocks.
More specifically, an area block of a given length is allocated on the storage area for each key as an initial setting, and the record information corresponding to each key is placed in that area block. If the allocated area block is too small for the record information to be stored, a further area block for storing the records corresponding to that key is allocated at a location remote from the initially allocated area, and a chain is established between these area blocks, thereby securing the area for storing the record information corresponding to the key information.
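The prior allocation scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the storage area is modeled as a Python list of blocks, block capacity is counted in records rather than bytes, and the class BlockStore and its methods are hypothetical names, not part of any described system.

```python
class BlockStore:
    """Fixed-size area blocks per key, chained together on overflow."""

    def __init__(self, block_size):
        self.block_size = block_size  # capacity of one block, in records (assumption)
        self.blocks = []              # the storage area: a list of blocks
        self.head = {}                # key -> id of the initially allocated block
        self.tail = {}                # key -> id of the last block in the chain

    def _allocate(self):
        # A new block is always placed at the end of the storage area.
        self.blocks.append({"data": [], "next": None})
        return len(self.blocks) - 1

    def register_key(self, key):
        # Initial allocation: every key receives one block up front.
        if key not in self.head:
            block_id = self._allocate()
            self.head[key] = self.tail[key] = block_id

    def append(self, key, record):
        self.register_key(key)
        block = self.blocks[self.tail[key]]
        if len(block["data"]) >= self.block_size:
            # Block is full: allocate another one elsewhere and chain to it.
            new_id = self._allocate()
            block["next"] = new_id
            self.tail[key] = new_id
            block = self.blocks[new_id]
        block["data"].append(record)

    def read(self, key):
        # Follow the chain from the initial block to collect all records.
        records = []
        block_id = self.head.get(key)
        while block_id is not None:
            block = self.blocks[block_id]
            records.extend(block["data"])
            block_id = block["next"]
        return records


store = BlockStore(block_size=2)
store.register_key("a")
store.register_key("b")
for doc_number in (1, 2, 3):
    store.append("a", doc_number)
```

In this run the third record for key "a" does not fit in its initial block, so a new block is allocated after the block belonging to key "b" and linked from the first, which mirrors the remote location and the chain described above.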
However, with the prior technique for area allocation on the storage area of the index file, when an index file of the inverted file type for use in full-text retrieval is employed, the size of the record information to be stored for each piece of key information differs significantly from key to key. Under such conditions, if an area block of a given size is initially allocated in units of K bytes, a problem arises in that, because most keys require a considerably smaller area than this area block, excess area is left unused in storing the record information, which interferes with the effective utilization of the storage area.
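The scale of this problem can be illustrated with a small calculation. The record sizes and the 1024-byte block size below are invented purely for illustration, and wasted_space is a hypothetical helper; the sketch assumes that every key receives at least one block and that space is always allocated in whole blocks.

```python
def wasted_space(record_sizes, block_size):
    """Return (allocated, used, wasted) bytes under whole-block allocation."""
    allocated = 0
    used = 0
    for size in record_sizes.values():
        # Ceiling division, with at least one initial block per key.
        blocks = max(1, -(-size // block_size))
        allocated += blocks * block_size
        used += size
    return allocated, used, allocated - used


# Hypothetical record sizes in bytes: one frequent key and several rare ones.
sizes = {"the": 5000, "index": 300, "inverted": 40, "zymurgy": 4}
allocated, used, wasted = wasted_space(sizes, block_size=1024)
```

With these invented figures, 8192 bytes are allocated for 5344 bytes of record information, and most of the 2848 unused bytes belong to the three keys whose records fill only a small fraction of their initial block.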
On the other hand, in the prior retrieval system, if an area of constant size is allocated whenever an area is extended, a mismatch arises because the amount of growth to be expected varies from key to key. A particularly serious problem is that an excessive area is allocated to a key for which only a small increase is expected.