Conventionally, there is a search technique that uses a character component table, which indicates the correlation between each character and the documents that include the character, and a condensed text file, which correlates each document with condensed text generated by removing bound-form words from the document. According to this search technique, the character component table is referred to; the documents that correspond to the characters included in a search keyword are identified; and, based on the result of referring to the character component table, the documents that include the search keyword are identified from the condensed text in the condensed text file (see, e.g., Japanese Patent No. 2986865).
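The two-stage lookup described above may be illustrated by the following minimal sketch. It is not taken from the cited patent. The names `BOUND_FORMS`, `condense`, `build_tables`, and `search`, the whitespace-based word splitting, and the sample word list are all assumptions introduced only for illustration.

```python
from collections import defaultdict

# Hypothetical set of bound-form words; the cited patent defines its own.
BOUND_FORMS = {"the", "a", "of"}


def condense(text):
    """Return condensed text: the document with bound-form words removed."""
    return " ".join(w for w in text.split() if w.lower() not in BOUND_FORMS)


def build_tables(documents):
    """Build a character component table and a condensed text file.

    documents: dict mapping a document id to its full text.
    """
    char_table = defaultdict(set)  # character -> ids of documents containing it
    condensed = {}                 # document id -> condensed text
    for doc_id, text in documents.items():
        for ch in text:
            char_table[ch].add(doc_id)
        condensed[doc_id] = condense(text)
    return char_table, condensed


def search(keyword, char_table, condensed):
    """Return the sorted ids of documents whose condensed text has the keyword."""
    # Stage 1: narrow the candidates to documents containing every character
    # of the keyword, using only the character component table.
    candidates = None
    for ch in keyword:
        ids = char_table.get(ch, set())
        candidates = set(ids) if candidates is None else candidates & ids
    if not candidates:
        return []
    # Stage 2: confirm the keyword against the condensed text of each candidate.
    return sorted(d for d in candidates if keyword in condensed[d])


documents = {1: "the index of a search engine", 2: "a compressed file"}
char_table, condensed = build_tables(documents)
result = search("search", char_table, condensed)
```

In this sketch the first stage only rules documents out, because a document may contain every character of the keyword without containing the keyword itself. The second stage therefore checks the condensed text of each remaining candidate.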
A search technique is also disclosed in which, when a real-time process request is accepted, the process request from a user can be accepted on the assumption that the acceptance is completed immediately; and, even while an index file is being generated in the real-time process, both a search of the index file and a search of the real-time process data are executed and their results are compared (see, e.g., Japanese Patent No. 3024544).
However, according to the conventional techniques, the character component table is generated for 64,000 kinds of character codes, each of which is a 16-bit character code, for content made up of a huge number (for example, 10,000) of document files. Separately, the document files are compressed to reduce the time required to read them. Even though the same document files are used, the compression process and the character component table generation process have nothing in common and must therefore be executed separately from each other. As a result, a problem arises in that the processing time increases.